[Users] IB topology config and polling state

Hal Rosenstock hal.rosenstock at gmail.com
Wed Oct 7 13:59:21 PDT 2015


Doesn't the HP enclosure have some built-in management?

On Wed, Oct 7, 2015 at 4:38 PM, German Anders <ganders at despegar.com> wrote:

> I think so, there's not much information out there regarding the
> configuration of the HP BLc IB switches..
>
> I ran the following commands from one of the blades:
>
> # *ibhosts*
> Ca    : 0x0011750000791fec ports 2 "Intel Infiniband HCA ubuntu"
>
> # *ibswitches*
> # *ibnetdiscover*
> #
> # Topology file: generated on Wed Oct  7 16:29:17 2015
> #
> # Initiated from node 0011750000791fec port 0011750000791fec
>
> vendid=0x1175
> devid=0x7322
> sysimgguid=0x11750000791fec
> caguid=0x11750000791fec
> Ca    2 "H-0011750000791fec"        # "Intel Infiniband HCA ubuntu"
>
> I don't know exactly whether I should somehow configure the internal
> port GUIDs to be mapped to the SM running on SWIB01, or whether I should
> tell opensm on the blade to start on the HP IB SW GUID... really messed
> up here :(
>
>
>
>
> *German* <ganders at despegar.com>
>
> 2015-10-07 17:27 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>
>> It's a physical connectivity problem in the enclosure. Are you sure
>> things are correct there?
>>
>> On Wed, Oct 7, 2015 at 4:26 PM, German Anders <ganders at despegar.com>
>> wrote:
>>
>>> I've tried everything... also starting an opensm on one of the blades:
>>>
>>> root at ubuntu:/etc/infiniband# /etc/init.d/opensm start
>>>
>>> *Starting opensm on 0x0011750000791fec: Starting opensm on
>>> 0x0011750000791fed:*
>>> root at ubuntu:/etc/infiniband#
>>> root at ubuntu:/etc/infiniband# ps -ef | grep opensm
>>> root      4427     1  0 09:44 ?        00:00:01 /usr/sbin/opensm -g
>>> 0x0011750000791fec -f /var/log/opensm.0x0011750000791fec.log
>>> root      4429     1  0 09:44 ?        00:00:01 /usr/sbin/opensm -g
>>> 0x0011750000791fed -f /var/log/opensm.0x0011750000791fed.log
>>> root      7409  1853  0 16:21 pts/0    00:00:00 grep --color=auto opensm
>>> root at ubuntu:/etc/infiniband#
>>> root at ubuntu:/etc/infiniband# tail -f
>>> /var/log/opensm.0x0011750000791fec.log
>>> Oct 07 09:44:10 642680 [D4F60740] 0x03 -> OpenSM 3.3.15
>>> Oct 07 09:44:10 642750 [D4F60740] 0x80 -> OpenSM 3.3.15
>>> Oct 07 09:44:10 646683 [D4F60740] 0x02 -> osm_vendor_init: 1000 pending
>>> umads specified
>>> Oct 07 09:44:10 646931 [D4F60740] 0x80 -> Entering DISCOVERING state
>>> Oct 07 09:44:10 647189 [D4F60740] 0x02 -> osm_vendor_bind: Mgmt class
>>> 0x81 binding to port GUID 0x11750000791fec
>>> Oct 07 09:44:10 649663 [D4F60740] 0x02 -> osm_vendor_bind: Mgmt class
>>> 0x03 binding to port GUID 0x11750000791fec
>>> Oct 07 09:44:10 649723 [D4F60740] 0x02 -> osm_vendor_bind: Mgmt class
>>> 0x04 binding to port GUID 0x11750000791fec
>>> Oct 07 09:44:10 649781 [D4F60740] 0x02 -> osm_vendor_bind: Mgmt class
>>> 0x21 binding to port GUID 0x11750000791fec
>>> Oct 07 09:44:10 649855 [D4F60740] 0x02 -> osm_opensm_bind: Setting IS_SM
>>> on port 0x0011750000791fec
>>> Oct 07 09:44:10 651518 [B2CDA700] 0x80 -> SM port is down
>>>
>>>
>>> root at ubuntu:/etc/infiniband# ibstat
>>>
>>> CA 'qib0'
>>>     CA type: InfiniPath_QMH7342
>>>     Number of ports: 2
>>>     Firmware version:
>>>     Hardware version: 2
>>>     Node GUID: 0x0011750000791fec
>>>     System image GUID: 0x0011750000791fec
>>>     Port 1:
>>>         State: Down
>>>         Physical state: Polling
>>>         Rate: 40
>>>         Base lid: 4660
>>>         LMC: 0
>>>         SM lid: 4660
>>>         Capability mask: 0x0761086a
>>>         Port GUID: 0x0011750000791fec
>>>         Link layer: InfiniBand
>>>     Port 2:
>>>         State: Down
>>>         Physical state: Polling
>>>         Rate: 40
>>>         Base lid: 4660
>>>         LMC: 0
>>>         SM lid: 4660
>>>         Capability mask: 0x0761086a
>>>         Port GUID: 0x0011750000791fed
>>>         Link layer: InfiniBand
>>>
>>>
>>>
>>>
>>>
>>> *German* <ganders at despegar.com>
>>>
>>> 2015-10-07 17:24 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>
>>>> Yes, somehow the enclosure is not internally connected properly;
>>>> otherwise you'd see more topology off the other switch ports. I think
>>>> it has more than 1 switch.
>>>>
>>>> On Wed, Oct 7, 2015 at 3:56 PM, German Anders <ganders at despegar.com>
>>>> wrote:
>>>>
>>>>> Find below some of the ibnetdiscover output from one of the working
>>>>> hosts:
>>>>>
>>>>> # ibnetdiscover
>>>>> #
>>>>> # Topology file: generated on Wed Oct  7 15:51:55 2015
>>>>> #
>>>>> # Initiated from node e41d2d0300163650 port e41d2d0300163651
>>>>>
>>>>> vendid=0x2c9
>>>>> devid=0xbd36
>>>>> sysimgguid=0x2c902004b0918
>>>>> switchguid=0x2c902004b0918(2c902004b0918)
>>>>> Switch    32 "S-0002c902004b0918"        # "Infiniscale-IV Mellanox
>>>>> Technologies" base port 0 *lid 29* lmc 0
>>>>> [1]    "S-e41d2d030031e9c1"[9]        # "MF0;GWIB01:SX6036G/U1" lid 24
>>>>> 4xQDR
>>>>>
>>>>> vendid=0x2c9
>>>>> devid=0xc738
>>>>> sysimgguid=0xe41d2d030031e9c0
>>>>> switchguid=0xe41d2d030031e9c1(e41d2d030031e9c1)
>>>>> Switch    37 "S-e41d2d030031e9c1"        # "MF0;GWIB01:SX6036G/U1"
>>>>> enhanced port 0 lid 24 lmc 0
>>>>> [9]    "S-0002c902004b0918"[1]        # "Infiniscale-IV Mellanox
>>>>> Technologies" lid 29 4xQDR
>>>>> [33]    "S-e41d2d030031eb41"[33]        # "MF0;GWIB02:SX6036G/U1" lid
>>>>> 23 4xFDR10
>>>>> [34]    "S-f45214030073f500"[16]        # "MF0;SWIB02:SX6018/U1" lid 1
>>>>> 4xFDR10
>>>>> [35]    "S-e41d2d030031eb41"[35]        # "MF0;GWIB02:SX6036G/U1" lid
>>>>> 23 4xFDR10
>>>>> [36]    "S-e41d2d0300097630"[18]        # "MF0;SWIB01:SX6018/U1" lid 2
>>>>> 4xFDR10
>>>>> [37]    "H-e41d2d030031e9c2"[1](e41d2d030031e9c2)         #
>>>>> "MF0;GWIB01:SX60XX/GW" lid 25 4xFDR
>>>>>
>>>>> (...)
>>>>>
>>>>> Clearly the connectivity problem is from the HP IB SW to the blades...
>>>>>
>>>>>
>>>>> *German*
>>>>>
>>>>> 2015-10-07 16:26 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>>
>>>>>> for port #1:
>>>>>>
>>>>>> # ibportstate -L 29 1 query
>>>>>> Switch PortInfo:
>>>>>> # Port info: Lid 29 port 1
>>>>>> LinkState:.......................Active
>>>>>> PhysLinkState:...................LinkUp
>>>>>> Lid:.............................75
>>>>>> SMLid:...........................2328
>>>>>> LMC:.............................0
>>>>>> LinkWidthSupported:..............1X or 4X
>>>>>> LinkWidthEnabled:................1X or 4X
>>>>>> LinkWidthActive:.................4X
>>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>>> Peer PortInfo:
>>>>>> # Port info: Lid 29 DR path slid 4; dlid 65535; 0,1 port 9
>>>>>> LinkState:.......................Active
>>>>>> PhysLinkState:...................LinkUp
>>>>>> Lid:.............................0
>>>>>> SMLid:...........................0
>>>>>> LMC:.............................0
>>>>>> LinkWidthSupported:..............1X or 4X
>>>>>> LinkWidthEnabled:................1X or 4X
>>>>>> LinkWidthActive:.................4X
>>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>>> LinkSpeedExtSupported:...........14.0625 Gbps
>>>>>> LinkSpeedExtEnabled:.............14.0625 Gbps
>>>>>> LinkSpeedExtActive:..............No Extended Speed
>>>>>> # MLNX ext Port info: Lid 29 DR path slid 4; dlid 65535; 0,1 port 9
>>>>>> StateChangeEnable:...............0x00
>>>>>> LinkSpeedSupported:..............0x01
>>>>>> LinkSpeedEnabled:................0x01
>>>>>> LinkSpeedActive:.................0x00
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>> 2015-10-07 16:24 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>>>
>>>>>>> Yeah, is there any command that I can run in order to change the
>>>>>>> port state on the remote switch? I mean, everything looks good, but
>>>>>>> on the HP blades I'm still getting:
>>>>>>>
>>>>>>>
>>>>>>> # ibstat
>>>>>>> CA 'qib0'
>>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>>     Number of ports: 2
>>>>>>>     Firmware version:
>>>>>>>     Hardware version: 2
>>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>>     Port 1:
>>>>>>>         State: *Down*
>>>>>>>         Physical state: *Polling*
>>>>>>>         Rate: 40
>>>>>>>         Base lid: 4660
>>>>>>>         LMC: 0
>>>>>>>         SM lid: 4660
>>>>>>>         Capability mask: 0x0761086a
>>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>>         Link layer: InfiniBand
>>>>>>>     Port 2:
>>>>>>>         State: *Down*
>>>>>>>         Physical state: *Polling*
>>>>>>>         Rate: 40
>>>>>>>         Base lid: 4660
>>>>>>>         LMC: 0
>>>>>>>         SM lid: 4660
>>>>>>>         Capability mask: 0x0761086a
>>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>>         Link layer: InfiniBand
>>>>>>>
>>>>>>>
>>>>>>> Also, on working hosts I only see devices from the local network,
>>>>>>> but I don't see any of the blades' HCA connections.
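To the question above: ibportstate from infiniband-diags can change a remote switch port's state as well as query it. A hedged sketch, reusing the LID 29 / port 1 addressing from this thread; note that this only helps once the physical link is able to train, and it cannot rescue a peer that is stuck in Polling:

```shell
# Query the switch port first (LID-routed, as elsewhere in this thread)
ibportstate -L 29 1 query

# Bounce the port to force link (re)training
ibportstate -L 29 1 reset

# Re-enable a port that was administratively disabled
ibportstate -L 29 1 enable
```

These ops act on the switch port, so they must be run from a host whose own link to the fabric is already up.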
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *German* <ganders at despegar.com>
>>>>>>>
>>>>>>> 2015-10-07 16:21 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>
>>>>>>> :
>>>>>>>
>>>>>>>> The screen shot looks good :-) SM brought the link up to active.
>>>>>>>>
>>>>>>>> Note that the ibportstate command you gave was for switch port 0 of
>>>>>>>> the Mellanox IS-4 switch in the QLogic HP BLc 4X QDR IB Switch.
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 3:06 PM, German Anders <ganders at despegar.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Yes, find attached a screenshot of the port information (port 9),
>>>>>>>>> the one that makes the ISL to the QLogic HP BLc 4X QDR IB Switch.
>>>>>>>>> Also, from one of the hosts connected to one of the SX6018F
>>>>>>>>> switches I can see the 'remote' HP IB SW:
>>>>>>>>>
>>>>>>>>> # *ibnodes*
>>>>>>>>>
>>>>>>>>> (...)
>>>>>>>>> Switch    : 0x0002c902004b0918 ports 32 "Infiniscale-IV Mellanox
>>>>>>>>> Technologies" base port 0 *lid 29 *lmc 0
>>>>>>>>> Switch    : 0xe41d2d030031e9c1 ports 37 "MF0;GWIB01:SX6036G/U1"
>>>>>>>>> enhanced port 0 lid 24 lmc 0
>>>>>>>>> (...)
>>>>>>>>>
>>>>>>>>> # *ibportstate -L 29 query*
>>>>>>>>> Switch PortInfo:
>>>>>>>>> # Port info: Lid 29 port 0
>>>>>>>>> LinkState:.......................Active
>>>>>>>>> PhysLinkState:...................LinkUp
>>>>>>>>> Lid:.............................29
>>>>>>>>> SMLid:...........................2
>>>>>>>>> LMC:.............................0
>>>>>>>>> LinkWidthSupported:..............1X or 4X
>>>>>>>>> LinkWidthEnabled:................1X or 4X
>>>>>>>>> LinkWidthActive:.................4X
>>>>>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>>>>>> Mkey:............................<not displayed>
>>>>>>>>> MkeyLeasePeriod:.................0
>>>>>>>>> ProtectBits:.....................0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *German* <ganders at despegar.com>
>>>>>>>>>
>>>>>>>>> 2015-10-07 16:00 GMT-03:00 Hal Rosenstock <
>>>>>>>>> hal.rosenstock at gmail.com>:
>>>>>>>>>
>>>>>>>>>> One more thing, hopefully, before playing with the low-level phy
>>>>>>>>>> settings:
>>>>>>>>>>
>>>>>>>>>> Are you using known good cables? Do you have FDR cables on the
>>>>>>>>>> FDR <-> FDR links? Cable lengths can matter as well.
>>>>>>>>>>
>>>>>>>>>> On Wed, Oct 7, 2015 at 12:57 PM, Hal Rosenstock <
>>>>>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Were the ports mapped to the phy profile shut down when you
>>>>>>>>>>> changed this?
>>>>>>>>>>>
>>>>>>>>>>> LLR is a proprietary Mellanox mechanism.
>>>>>>>>>>>
>>>>>>>>>>> You might want 2 different profiles: one for the interfaces
>>>>>>>>>>> connected to other gateway interfaces (which are FDR (and FDR-10)
>>>>>>>>>>> capable), and the other for the interfaces connecting to QDR (the
>>>>>>>>>>> older equipment in your network). By configuring the Switch-X
>>>>>>>>>>> interfaces to the appropriate possible speeds and disabling the
>>>>>>>>>>> proprietary mechanisms there, the link should not only come up but
>>>>>>>>>>> should also do so faster than if FDR/FDR10 are enabled.
>>>>>>>>>>>
>>>>>>>>>>> I suspect that, due to the Switch-X configuration, the links to
>>>>>>>>>>> the switch(es) in the HP enclosures do not negotiate properly (as
>>>>>>>>>>> shown by Down rather than LinkUp).
>>>>>>>>>>>
>>>>>>>>>>> Once you get all your links to INIT, negotiation has occurred
>>>>>>>>>>> and then it's time for SM to bring links to active.
>>>>>>>>>>>
>>>>>>>>>>> Since you have down links, the SM can't do anything about those.
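The Down -> Polling -> LinkUp/INIT -> ACTIVE progression described above can be watched directly through sysfs on the blade. A small sketch, assuming the qib0 device name from the ibstat output in this thread:

```shell
# phys_state leaves "Polling" for "LinkUp" once a peer port is
# electrically present and negotiation succeeds; state then goes
# Down -> INIT, and only the SM moves it on to ACTIVE.
for p in 1 2; do
    echo "qib0 port $p:"
    cat /sys/class/infiniband/qib0/ports/$p/phys_state
    cat /sys/class/infiniband/qib0/ports/$p/state
done
```

Watching these two files while reseating or swapping a cable quickly shows whether a change had any electrical effect at all.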
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 12:44 PM, German Anders <
>>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Has anyone had any experience with the HP BLc 4X QDR IB Switch?
>>>>>>>>>>>> I know that this kind of switch does not come with an embedded
>>>>>>>>>>>> SM, but I don't know how to access any management at all on this
>>>>>>>>>>>> particular switch, I mean for example to set up speed or anything
>>>>>>>>>>>> like that. Is it possible to access it through the chassis?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *German* <ganders at despegar.com>
>>>>>>>>>>>>
>>>>>>>>>>>> 2015-10-07 13:19 GMT-03:00 German Anders <ganders at despegar.com>
>>>>>>>>>>>> :
>>>>>>>>>>>>
>>>>>>>>>>>>> I think so, but when trying to configure the phy-profile on the
>>>>>>>>>>>>> interface in order to negotiate at QDR, it failed to map the
>>>>>>>>>>>>> profile:
>>>>>>>>>>>>>
>>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>>>>>>>>> high-speed-ber
>>>>>>>>>>>>>
>>>>>>>>>>>>>   Profile: high-speed-ber
>>>>>>>>>>>>>   --------
>>>>>>>>>>>>>   llr support ib-speed
>>>>>>>>>>>>>   SDR: disable
>>>>>>>>>>>>>   DDR: disable
>>>>>>>>>>>>>   QDR: disable
>>>>>>>>>>>>>   FDR10: enable-request
>>>>>>>>>>>>>   FDR: enable-request
>>>>>>>>>>>>>
>>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>>>>>>>>> hp-encl-isl
>>>>>>>>>>>>>
>>>>>>>>>>>>>   Profile: hp-encl-isl
>>>>>>>>>>>>>   --------
>>>>>>>>>>>>>   llr support ib-speed
>>>>>>>>>>>>>   SDR: disable
>>>>>>>>>>>>>   DDR: disable
>>>>>>>>>>>>>   QDR: enable
>>>>>>>>>>>>>   FDR10: enable-request
>>>>>>>>>>>>>   FDR: enable-request
>>>>>>>>>>>>>
>>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) #
>>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # interface ib 1/9
>>>>>>>>>>>>> phy-profile map hp-encl-isl
>>>>>>>>>>>>> *% Cannot map profile hp-encl-isl to port:  1/9*
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *German* <ganders at despegar.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2015-10-07 13:17 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The driver ‘qib’ is loading fine, as can be seen from the
>>>>>>>>>>>>>> ibstat output. ib_ipath is for an older card.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem is that the link is not coming up to INIT. As Hal
>>>>>>>>>>>>>> said, the link should transition to “link up” without the SM's
>>>>>>>>>>>>>> involvement.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think you are on to something with the fact that it seems
>>>>>>>>>>>>>> like your switch ports are not configured to do QDR.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ira
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *From:* German Anders [mailto:ganders at despegar.com]
>>>>>>>>>>>>>> *Sent:* Wednesday, October 07, 2015 9:05 AM
>>>>>>>>>>>>>> *To:* Weiny, Ira
>>>>>>>>>>>>>> *Cc:* Hal Rosenstock; users at lists.openfabrics.org
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, I have that file:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've also installed libipathverbs:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # apt-get install libipathverbs-dev
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But when I try to load the ib_ipath module I get the following
>>>>>>>>>>>>>> error message:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # modprobe ib_ipath
>>>>>>>>>>>>>> modprobe: ERROR: could not insert 'ib_ipath': Device or
>>>>>>>>>>>>>> resource busy
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *German*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2015-10-07 12:54 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are a few issues for routing in that diagram but the
>>>>>>>>>>>>>> links should come up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I assume there is some backplane between the blade servers
>>>>>>>>>>>>>> and the switch in that chassis?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Have you gotten libipathverbs installed?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In libipathverbs there is a SerDes tuning script.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/01org/libipathverbs/blob/master/truescale-serdes.cmds
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Does your libipathverbs include that file? If not, try the
>>>>>>>>>>>>>> latest from GitHub.
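For completeness, building libipathverbs from the GitHub tree linked above is roughly as follows. This is a sketch: the repository is autotools-based, so the exact bootstrap step and install paths may vary by version.

```shell
git clone https://github.com/01org/libipathverbs.git
cd libipathverbs
./autogen.sh        # or: autoreconf -fi, depending on the checkout
./configure --prefix=/usr
make
sudo make install   # should place truescale-serdes.cmds under /usr/sbin
```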
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ira
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>>>>>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *German
>>>>>>>>>>>>>> Anders
>>>>>>>>>>>>>> *Sent:* Wednesday, October 07, 2015 8:41 AM
>>>>>>>>>>>>>> *To:* Hal Rosenstock
>>>>>>>>>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>>>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Hal,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the reply. I've attached a PDF with the topology
>>>>>>>>>>>>>> diagram; I don't know if this is the best way to go or if
>>>>>>>>>>>>>> there's another way to connect and set up the IB network, so
>>>>>>>>>>>>>> tips and suggestions will be very appreciated. Also, the
>>>>>>>>>>>>>> mezzanine cards are already installed on the blade hosts:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # lspci
>>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>> 41:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA
>>>>>>>>>>>>>> (rev 02)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *German*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2015-10-07 11:47 GMT-03:00 Hal Rosenstock <
>>>>>>>>>>>>>> hal.rosenstock at gmail.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi again German,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Looks like you made some progress from yesterday as the qib
>>>>>>>>>>>>>> ports are now Polling rather than Disabled.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But since they are Down, do you have them cabled to a switch?
>>>>>>>>>>>>>> That should bring the links up, and the port state will be
>>>>>>>>>>>>>> Init. That is the "starting" point.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You will also then need to be running SM to bring the ports
>>>>>>>>>>>>>> up to Active.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Hal
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:37 AM, German Anders <
>>>>>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't know if this is the right mailing list for this kind
>>>>>>>>>>>>>> of topic, but I'm really new to IB. I've just installed two
>>>>>>>>>>>>>> SX6036G gateways connected to each other through two ISL ports,
>>>>>>>>>>>>>> and then configured proxy-arp between both nodes (the SM is
>>>>>>>>>>>>>> disabled on both GWs):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show proxy-arp ha
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Load balancing algorithm: ib-base-ip
>>>>>>>>>>>>>> Number of Proxy-Arp interfaces: 1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Proxy-ARP VIP
>>>>>>>>>>>>>> =============
>>>>>>>>>>>>>> Pra-group name: proxy-ha-group
>>>>>>>>>>>>>> HA VIP address: 10.xx.xx.xx/xx
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Active nodes:
>>>>>>>>>>>>>> ID                   State                IP
>>>>>>>>>>>>>> --------------------------------------------------------------
>>>>>>>>>>>>>> GWIB01               master               10.xx.xx.xx1
>>>>>>>>>>>>>> GWIB02               standby              10.xx.xx.xx2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Then I set up two SX6018F switches (*SWIB01* and *SWIB02*),
>>>>>>>>>>>>>> one connected to GWIB01 and the other connected to GWIB02. The
>>>>>>>>>>>>>> SM is configured locally on both the SWIB01 & SWIB02 switches.
>>>>>>>>>>>>>> So far so good; after this config I set up a commodity server
>>>>>>>>>>>>>> with a MLNX IB ADPT FDR connected to the SWIB01 & SWIB02
>>>>>>>>>>>>>> switches, configured the drivers, etc., and got it up & running
>>>>>>>>>>>>>> fine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Finally, I set up an HP Enclosure with an internal IB SW (then
>>>>>>>>>>>>>> connected port 1 of the internal SW to GWIB01 - the link is up
>>>>>>>>>>>>>> but the LLR status is inactive), installed one of the blades,
>>>>>>>>>>>>>> and I see the following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ibstat
>>>>>>>>>>>>>> CA 'qib0'
>>>>>>>>>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>>>>>>>>>     Number of ports: 2
>>>>>>>>>>>>>>     Firmware version:
>>>>>>>>>>>>>>     Hardware version: 2
>>>>>>>>>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>>>>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>>>>>>>>>     Port 1:
>>>>>>>>>>>>>>         State: Down
>>>>>>>>>>>>>>         Physical state: Polling
>>>>>>>>>>>>>>         Rate: 40
>>>>>>>>>>>>>>         Base lid: 4660
>>>>>>>>>>>>>>         LMC: 0
>>>>>>>>>>>>>>         SM lid: 4660
>>>>>>>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>>>>>>>>>         Link layer: InfiniBand
>>>>>>>>>>>>>>     Port 2:
>>>>>>>>>>>>>>         State: Down
>>>>>>>>>>>>>>         Physical state: Polling
>>>>>>>>>>>>>>         Rate: 40
>>>>>>>>>>>>>>         Base lid: 4660
>>>>>>>>>>>>>>         LMC: 0
>>>>>>>>>>>>>>         SM lid: 4660
>>>>>>>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>>>>>>>>>         Link layer: InfiniBand
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So I was wondering whether maybe the SM is not being recognized
>>>>>>>>>>>>>> on the blade system and that's why it's not getting past the
>>>>>>>>>>>>>> Polling state; is that possible? Or maybe it's not possible to
>>>>>>>>>>>>>> connect an ISL between the GW and the HP internal SW so that
>>>>>>>>>>>>>> the SM is available, or maybe the inactive LLR is causing this;
>>>>>>>>>>>>>> any ideas? I thought about connecting the ISL of the HP IB SW
>>>>>>>>>>>>>> to SWIB01 or SWIB02 instead of the GWs, but I don't have any
>>>>>>>>>>>>>> available ports.
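One way to check whether a host can see an SM at all (once a port has reached INIT or ACTIVE) is sminfo from infiniband-diags. A sketch; the `-P 2` port selection is illustrative:

```shell
# Prints the master SM's LID, GUID, activity count, priority and state
# as seen from this host; it fails if no SM is reachable on the fabric.
sminfo

# Query via the second HCA port instead of the default
sminfo -P 2
```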
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *German*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Users mailing list
>>>>>>>>>>>>>> Users at lists.openfabrics.org
>>>>>>>>>>>>>> http://lists.openfabrics.org/mailman/listinfo/users
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>