[Users] IB topology config and polling state

German Anders ganders at despegar.com
Wed Oct 7 12:06:04 PDT 2015


Yes, find attached a screenshot of the port information for port 9, the one
that makes the ISL to the QLogic HP BLc 4X QDR IB Switch. Also, from one of
the hosts connected to one of the SX6018F switches, I can see the 'remote' HP
IB switch:

# ibnodes

(...)
Switch    : 0x0002c902004b0918 ports 32 "Infiniscale-IV Mellanox
Technologies" base port 0 lid 29 lmc 0
Switch    : 0xe41d2d030031e9c1 ports 37 "MF0;GWIB01:SX6036G/U1" enhanced
port 0 lid 24 lmc 0
(...)

# ibportstate -L 29 query
Switch PortInfo:
# Port info: Lid 29 port 0
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................29
SMLid:...........................2
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
Mkey:............................<not displayed>
MkeyLeasePeriod:.................0
ProtectBits:.....................0
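
For reference, the gateway side of that ISL can be queried the same way; per
the ibnodes output above, GWIB01 is lid 24 and the ISL is on port 9, so
something like this should show what width and speed that particular port
negotiated:

# ibportstate -L 24 9 query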





*German* <ganders at despegar.com>

2015-10-07 16:00 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:

> One more thing hopefully before playing with the low level phy settings:
>
> Are you using known good cables ? Do you have FDR cables on the FDR <->
> FDR links ? Cable lengths can matter as well.
>
> On Wed, Oct 7, 2015 at 12:57 PM, Hal Rosenstock <hal.rosenstock at gmail.com>
> wrote:
>
>> Were the ports mapped to the phy profile shut down when you changed this?
>>
>> LLR is a proprietary Mellanox mechanism.
>>
>> You might want 2 different profiles: one for the interfaces connected to
>> other gateway interfaces (which are FDR (and FDR-10) capable) and the other
>> for the interfaces connecting to QDR (the older equipment in your network).
>> By configuring the Switch-X interfaces to the appropriate possible speeds
>> and disabling the proprietary mechanisms there, the links should not only
>> come up, they should also come up faster than if FDR/FDR10 were enabled.
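>>
>> Something along these lines should do it once the profile is restricted to
>> QDR and below (a sketch - exact MLNX-OS syntax may vary by release, and
>> hp-encl-isl is just the profile name from your earlier output), with the
>> port administratively down while the mapping is changed:
>>
>> GWIB01 (config) # interface ib 1/9 shutdown
>> GWIB01 (config) # interface ib 1/9 phy-profile map hp-encl-isl
>> GWIB01 (config) # no interface ib 1/9 shutdown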
>>
>> I suspect that, due to the Switch-X configuration, the links to the
>> switch(es) in the HP enclosures do not negotiate properly (as shown by Down
>> rather than LinkUp).
>>
>> Once you get all your links to INIT, negotiation has occurred, and then
>> it's time for the SM to bring the links to Active.
>>
>> Since you have down links, the SM can't do anything about those.
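>>
>> A quick way to see which links are stuck is iblinkinfo (from
>> infiniband-diags), run from a host whose own port is already Active; it
>> walks the fabric and prints the state, width and speed of every switch port:
>>
>> # iblinkinfo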
>>
>>
>> On Wed, Oct 7, 2015 at 12:44 PM, German Anders <ganders at despegar.com>
>> wrote:
>>
>>> Has anyone had any experience with the HP BLc 4X QDR IB Switch? I know
>>> that this kind of switch does not come with an embedded SM, but I don't
>>> know how to access any management at all on this particular switch, for
>>> example to set the speed or anything like that. Is it possible to access
>>> it through the chassis?
>>>
>>>
>>> *German* <ganders at despegar.com>
>>>
>>> 2015-10-07 13:19 GMT-03:00 German Anders <ganders at despegar.com>:
>>>
>>>> I think so, but when trying to configure the phy-profile on the
>>>> interface in order to negotiate at QDR, it failed to map the profile:
>>>>
>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>> high-speed-ber
>>>>
>>>>   Profile: high-speed-ber
>>>>   --------
>>>>   llr support ib-speed
>>>>   SDR: disable
>>>>   DDR: disable
>>>>   QDR: disable
>>>>   FDR10: enable-request
>>>>   FDR: enable-request
>>>>
>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile hp-encl-isl
>>>>
>>>>   Profile: hp-encl-isl
>>>>   --------
>>>>   llr support ib-speed
>>>>   SDR: disable
>>>>   DDR: disable
>>>>   QDR: enable
>>>>   FDR10: enable-request
>>>>   FDR: enable-request
>>>>
>>>> GWIB01 [proxy-ha-group: master] (config) #
>>>> GWIB01 [proxy-ha-group: master] (config) # interface ib 1/9 phy-profile
>>>> map hp-encl-isl
>>>> % Cannot map profile hp-encl-isl to port:  1/9
>>>>
>>>>
>>>> *German* <ganders at despegar.com>
>>>>
>>>> 2015-10-07 13:17 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>
>>>>> The driver 'qib' is loading fine, as can be seen from the ibstat
>>>>> output. ib_ipath is the driver for an older card.
>>>>>
>>>>>
>>>>>
>>>>> The problem is that the link is not coming up to Init. As Hal said, the
>>>>> link should transition to "LinkUp" without the SM's involvement.
>>>>>
>>>>>
>>>>>
>>>>> I think you are on to something: it seems like your switch ports are
>>>>> not configured to do QDR.
>>>>>
>>>>>
>>>>>
>>>>> Ira
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From:* German Anders [mailto:ganders at despegar.com]
>>>>> *Sent:* Wednesday, October 07, 2015 9:05 AM
>>>>> *To:* Weiny, Ira
>>>>> *Cc:* Hal Rosenstock; users at lists.openfabrics.org
>>>>>
>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>
>>>>>
>>>>>
>>>>> Yes, I have that file:
>>>>>
>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>
>>>>> I've also installed libipathverbs:
>>>>>
>>>>> # apt-get install libipathverbs-dev
>>>>>
>>>>> But when I try to load the ib_ipath module, I get the following error
>>>>> message:
>>>>>
>>>>> # modprobe ib_ipath
>>>>> modprobe: ERROR: could not insert 'ib_ipath': Device or resource busy
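>>>>>
>>>>> (For what it's worth, something like this should show which kernel driver
>>>>> is actually bound to the HCA - 41:00.0 being the slot from the lspci
>>>>> output further down the thread:)
>>>>>
>>>>> # lspci -k -s 41:00.0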
>>>>>
>>>>>
>>>>>
>>>>> *German*
>>>>>
>>>>>
>>>>>
>>>>> 2015-10-07 12:54 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>
>>>>> There are a few routing issues in that diagram, but the links should
>>>>> come up.
>>>>>
>>>>>
>>>>>
>>>>> I assume there is some backplane between the blade servers and the
>>>>> switch in that chassis?
>>>>>
>>>>>
>>>>>
>>>>> Have you gotten libipathverbs installed?
>>>>>
>>>>>
>>>>>
>>>>> In libipathverbs there is a SerDes tuning script.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> https://github.com/01org/libipathverbs/blob/master/truescale-serdes.cmds
>>>>>
>>>>>
>>>>>
>>>>> Does your libipathverbs include that file? If not, try the latest from
>>>>> GitHub.
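>>>>>
>>>>> On a Debian/Ubuntu system, something like the following should tell you
>>>>> which installed package (if any) ships that script:
>>>>>
>>>>> # dpkg -S truescale-serdes.cmds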
>>>>>
>>>>>
>>>>>
>>>>> Ira
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *German Anders
>>>>> *Sent:* Wednesday, October 07, 2015 8:41 AM
>>>>> *To:* Hal Rosenstock
>>>>> *Cc:* users at lists.openfabrics.org
>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>
>>>>>
>>>>>
>>>>> Hi Hal,
>>>>>
>>>>> Thanks for the reply. I've attached a PDF with the topology diagram. I
>>>>> don't know if this is the best way to go or if there's another way to
>>>>> connect and set up the IB network; tips and suggestions would be very much
>>>>> appreciated. The mezzanine cards are already installed on the blade
>>>>> hosts:
>>>>>
>>>>> # lspci
>>>>> (...)
>>>>> 41:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
>>>>>
>>>>>
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>> *German*
>>>>>
>>>>>
>>>>>
>>>>> 2015-10-07 11:47 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>
>>>>> Hi again German,
>>>>>
>>>>>
>>>>>
>>>>> Looks like you made some progress from yesterday as the qib ports are
>>>>> now Polling rather than Disabled.
>>>>>
>>>>>
>>>>>
>>>>> But since they are Down, do you have them cabled to a switch? That
>>>>> should bring the links up, and the port state will be Init. That is the
>>>>> "starting" point.
>>>>>
>>>>>
>>>>>
>>>>> You will also then need to be running an SM to bring the ports up to
>>>>> Active.
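>>>>>
>>>>> Once a port reaches Init, you can check that a subnet manager is actually
>>>>> reachable from it with, for example:
>>>>>
>>>>> # sminfo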
>>>>>
>>>>>
>>>>>
>>>>> -- Hal
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Oct 7, 2015 at 10:37 AM, German Anders <ganders at despegar.com>
>>>>> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I don't know if this is the right mailing list for this kind of topic,
>>>>> but I'm really new to IB. I've just installed two SX6036G gateways
>>>>> connected to each other through two ISL ports, and then configured
>>>>> proxy-arp between both nodes (the SM is disabled on both gateways):
>>>>>
>>>>> GWIB01 [proxy-ha-group: master] (config) # show proxy-arp ha
>>>>>
>>>>> Load balancing algorithm: ib-base-ip
>>>>> Number of Proxy-Arp interfaces: 1
>>>>>
>>>>> Proxy-ARP VIP
>>>>> =============
>>>>> Pra-group name: proxy-ha-group
>>>>> HA VIP address: 10.xx.xx.xx/xx
>>>>>
>>>>> Active nodes:
>>>>> ID                   State                IP
>>>>> --------------------------------------------------------------
>>>>> GWIB01               master               10.xx.xx.xx1
>>>>> GWIB02               standby              10.xx.xx.xx2
>>>>>
>>>>> Then I set up two SX6018F switches (*SWIB01* and *SWIB02*), one
>>>>> connected to GWIB01 and the other connected to GWIB02. The SM is configured
>>>>> locally on both the SWIB01 and SWIB02 switches. So far so good. After this
>>>>> configuration I connected a commodity server with a Mellanox FDR IB adapter
>>>>> to the SWIB01 and SWIB02 switches, configured the drivers, etc., and got it
>>>>> up and running fine.
>>>>>
>>>>> Finally I set up an HP enclosure with an internal IB switch (connecting
>>>>> port 1 of the internal switch to GWIB01 - the link is up but the LLR status
>>>>> is inactive), installed one of the blades, and I see the following:
>>>>>
>>>>> # ibstat
>>>>> CA 'qib0'
>>>>>     CA type: InfiniPath_QMH7342
>>>>>     Number of ports: 2
>>>>>     Firmware version:
>>>>>     Hardware version: 2
>>>>>     Node GUID: 0x0011750000791fec
>>>>>     System image GUID: 0x0011750000791fec
>>>>>     Port 1:
>>>>>         State: Down
>>>>>         Physical state: Polling
>>>>>         Rate: 40
>>>>>         Base lid: 4660
>>>>>         LMC: 0
>>>>>         SM lid: 4660
>>>>>         Capability mask: 0x0761086a
>>>>>         Port GUID: 0x0011750000791fec
>>>>>         Link layer: InfiniBand
>>>>>     Port 2:
>>>>>         State: Down
>>>>>         Physical state: Polling
>>>>>         Rate: 40
>>>>>         Base lid: 4660
>>>>>         LMC: 0
>>>>>         SM lid: 4660
>>>>>         Capability mask: 0x0761086a
>>>>>         Port GUID: 0x0011750000791fed
>>>>>         Link layer: InfiniBand
>>>>>
>>>>> So I was wondering if maybe the SM is not being recognized on the blade
>>>>> system and that's why it is not getting past the Polling state - is that
>>>>> possible? Or maybe it is not possible to run an ISL between the GW and the
>>>>> HP internal switch so that the SM is available, or maybe the inactive LLR
>>>>> is causing this. Any ideas? I thought about connecting the ISL of the HP IB
>>>>> switch to SWIB01 or SWIB02 instead of the GWs, but I don't have any
>>>>> available ports.
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>> *German*
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Users mailing list
>>>>> Users at lists.openfabrics.org
>>>>> http://lists.openfabrics.org/mailman/listinfo/users
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GWIB01 port 9 information.png
Type: image/png
Size: 266295 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/users/attachments/20151007/80d33920/attachment.png>

