[Users] IB topology config and polling state

German Anders ganders at despegar.com
Wed Oct 7 12:24:43 PDT 2015


Yeah, is there any command that I can run in order to change the port state
on the remote switch? I mean, everything looks good, but on the HP blades
I'm still getting:

# ibstat
CA 'qib0'
    CA type: InfiniPath_QMH7342
    Number of ports: 2
    Firmware version:
    Hardware version: 2
    Node GUID: 0x0011750000791fec
    System image GUID: 0x0011750000791fec
    Port 1:
        State: *Down*
        Physical state: *Polling*
        Rate: 40
        Base lid: 4660
        LMC: 0
        SM lid: 4660
        Capability mask: 0x0761086a
        Port GUID: 0x0011750000791fec
        Link layer: InfiniBand
    Port 2:
        State: *Down*
        Physical state: *Polling*
        Rate: 40
        Base lid: 4660
        LMC: 0
        SM lid: 4660
        Capability mask: 0x0761086a
        Port GUID: 0x0011750000791fed
        Link layer: InfiniBand
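
For anyone wanting to check many blades at once, here is a minimal sketch (not a tested tool) that parses ibstat text like the output above and attaches the rule of thumb given further down in this thread: a port whose physical state is not LinkUp has not finished negotiation, so the SM cannot help yet. The names parse_ibstat and diagnose are hypothetical, the sample is trimmed from the output above, and the hint wording is my own paraphrase.

```python
import re

# Sample trimmed from the ibstat output above; the asterisks are mail emphasis.
SAMPLE = """\
CA 'qib0'
    CA type: InfiniPath_QMH7342
    Number of ports: 2
    Port 1:
        State: *Down*
        Physical state: *Polling*
        Rate: 40
        Port GUID: 0x0011750000791fec
    Port 2:
        State: *Down*
        Physical state: *Polling*
        Rate: 40
        Port GUID: 0x0011750000791fed
"""


def parse_ibstat(text):
    """Parse ibstat text into one dict per port (keys: ca, port, ibstat fields)."""
    ports, ca, current = [], None, None
    for raw in text.splitlines():
        line = raw.replace("*", "").strip()  # drop emphasis markers
        m = re.match(r"CA '(.+)'$", line)
        if m:
            ca, current = m.group(1), None  # new adapter, no port selected yet
            continue
        m = re.match(r"Port (\d+):$", line)
        if m:
            current = {"ca": ca, "port": int(m.group(1))}
            ports.append(current)
            continue
        if current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return ports


def diagnose(state, phys):
    """Return a hint for a port (assumption: paraphrases advice from this thread)."""
    if state == "Active":
        return "link is up and the SM has activated it"
    if phys == "LinkUp":
        return "physical link is up; waiting for an SM to bring it to Active"
    return "physical link not up; the SM cannot act until negotiation succeeds"


if __name__ == "__main__":
    for p in parse_ibstat(SAMPLE):
        hint = diagnose(p.get("State"), p.get("Physical state"))
        print(f"{p['ca']} port {p['port']}: {p.get('State')}/{p.get('Physical state')} -> {hint}")
```

On a real host the same function could be fed the text captured from ibstat instead of SAMPLE.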


Also, on the working hosts I only see devices from the local network; I don't
see any of the blades' HCA connections.



*German* <ganders at despegar.com>

2015-10-07 16:21 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:

> The screen shot looks good :-) SM brought the link up to active.
>
> Note that the ibportstate command you gave was for switch port 0 of the
> Mellanox IS-4 switch in the QLogic HP BLc 4X QDR IB Switch.
>
> On Wed, Oct 7, 2015 at 3:06 PM, German Anders <ganders at despegar.com>
> wrote:
>
>> Yes, find attached a screenshot of the port information (# 9), the one
>> that makes the ISL to the QLogic HP BLc 4X QDR IB Switch. Also, from one of
>> the hosts that are connected to one of the SX6018F I can see the 'remote'
>> HP IB SW:
>>
>> # *ibnodes*
>>
>> (...)
>> Switch    : 0x0002c902004b0918 ports 32 "Infiniscale-IV Mellanox
>> Technologies" base port 0 *lid 29 *lmc 0
>> Switch    : 0xe41d2d030031e9c1 ports 37 "MF0;GWIB01:SX6036G/U1" enhanced
>> port 0 lid 24 lmc 0
>> (...)
>>
>> # *ibportstate -L 29 query*
>> Switch PortInfo:
>> # Port info: Lid 29 port 0
>> LinkState:.......................Active
>> PhysLinkState:...................LinkUp
>> Lid:.............................29
>> SMLid:...........................2
>> LMC:.............................0
>> LinkWidthSupported:..............1X or 4X
>> LinkWidthEnabled:................1X or 4X
>> LinkWidthActive:.................4X
>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>> LinkSpeedActive:.................10.0 Gbps
>> Mkey:............................<not displayed>
>> MkeyLeasePeriod:.................0
>> ProtectBits:.....................0
>>
>>
>>
>>
>>
>> *German* <ganders at despegar.com>
>>
>> 2015-10-07 16:00 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>
>>> One more thing hopefully before playing with the low level phy settings:
>>>
>>> Are you using known good cables ? Do you have FDR cables on the FDR <->
>>> FDR links ? Cable lengths can matter as well.
>>>
>>> On Wed, Oct 7, 2015 at 12:57 PM, Hal Rosenstock <
>>> hal.rosenstock at gmail.com> wrote:
>>>
>>>> Were the ports mapped to the phy profile shutdown when you changed this
>>>> ?
>>>>
>>>> LLR is a proprietary Mellanox mechanism.
>>>>
>>>> You might want 2 different profiles: one for the interfaces connected
>>>> to other gateway interfaces (which are FDR (and FDR-10) capable) and the
>>>> other for the interfaces connecting to QDR (the older equipment in your
>>>> network). By configuring the Switch-X interfaces to the appropriate
>>>> possible speeds and disabling the proprietary mechanisms there, the link
>>>> should not only come up but also do so faster than if FDR/FDR10
>>>> are enabled.
>>>>
>>>> I suspect that, due to the Switch-X configuration, the links to
>>>> the switch(es) in the HP enclosures do not negotiate properly (as shown by
>>>> Down rather than LinkUp).
>>>>
>>>> Once you get all your links to INIT, negotiation has occurred and then
>>>> it's time for SM to bring links to active.
>>>>
>>>> Since you have down links, the SM can't do anything about those.
>>>>
>>>>
>>>> On Wed, Oct 7, 2015 at 12:44 PM, German Anders <ganders at despegar.com>
>>>> wrote:
>>>>
>>>>> Has anyone had any experience with the HP BLc 4X QDR IB Switch? I know that
>>>>> this kind of SW does not come with an embedded SM, but I don't know how to
>>>>> access any mgmt at all on this particular switch, for example to
>>>>> set up the speed or anything like that. Is it possible to access it through the
>>>>> chassis?
>>>>>
>>>>>
>>>>> *German* <ganders at despegar.com>
>>>>>
>>>>> 2015-10-07 13:19 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>>
>>>>>> I think so, but when trying to configure the phy-profile on the
>>>>>> interface in order to negotiate at QDR, it failed to map the profile:
>>>>>>
>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>> high-speed-ber
>>>>>>
>>>>>>   Profile: high-speed-ber
>>>>>>   --------
>>>>>>   llr support ib-speed
>>>>>>   SDR: disable
>>>>>>   DDR: disable
>>>>>>   QDR: disable
>>>>>>   FDR10: enable-request
>>>>>>   FDR: enable-request
>>>>>>
>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>> hp-encl-isl
>>>>>>
>>>>>>   Profile: hp-encl-isl
>>>>>>   --------
>>>>>>   llr support ib-speed
>>>>>>   SDR: disable
>>>>>>   DDR: disable
>>>>>>   QDR: enable
>>>>>>   FDR10: enable-request
>>>>>>   FDR: enable-request
>>>>>>
>>>>>> GWIB01 [proxy-ha-group: master] (config) #
>>>>>> GWIB01 [proxy-ha-group: master] (config) # interface ib 1/9
>>>>>> phy-profile map hp-encl-isl
>>>>>> *% Cannot map profile hp-encl-isl to port:  1/9*
>>>>>>
>>>>>>
>>>>>> *German* <ganders at despegar.com>
>>>>>>
>>>>>> 2015-10-07 13:17 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>
>>>>>>> The driver ‘qib’ is loading fine.  As can be seen by the ibstat
>>>>>>> output.  The ib_ipath is an older card.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The problem is the link is not coming up to init.  Like Hal said, the
>>>>>>> link should transition to “link up” without the SM's involvement.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think you are on to something with the fact that it seems like
>>>>>>> your switch ports are not configured to do QDR.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Ira
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* German Anders [mailto:ganders at despegar.com]
>>>>>>> *Sent:* Wednesday, October 07, 2015 9:05 AM
>>>>>>> *To:* Weiny, Ira
>>>>>>> *Cc:* Hal Rosenstock; users at lists.openfabrics.org
>>>>>>>
>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yes I've that file:
>>>>>>>
>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>
>>>>>>> Also I've done the install of libipathverbs:
>>>>>>>
>>>>>>> # apt-get install libipathverbs-dev
>>>>>>>
>>>>>>> But when I try to load the ib_ipath module I get the following
>>>>>>> error msg:
>>>>>>>
>>>>>>> # modprobe ib_ipath
>>>>>>> modprobe: ERROR: could not insert 'ib_ipath': Device or resource busy
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *German*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2015-10-07 12:54 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>
>>>>>>> There are a few issues for routing in that diagram but the links
>>>>>>> should come up.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I assume there is some backplane between the blade servers and the
>>>>>>> switch in that chassis?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Have you gotten libipathverbs installed?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> In ipathverbs there is a serdes tuning script.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> https://github.com/01org/libipathverbs/blob/master/truescale-serdes.cmds
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Does your libipathverbs include that file?  If not try the latest
>>>>>>> from github.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Ira
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *German Anders
>>>>>>> *Sent:* Wednesday, October 07, 2015 8:41 AM
>>>>>>> *To:* Hal Rosenstock
>>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Hal,
>>>>>>>
>>>>>>> Thanks for the reply. I've attached a PDF with the topology diagram. I
>>>>>>> don't know if this is the best way to go or if there's another way to
>>>>>>> connect and set up the IB network; tips and suggestions would be very
>>>>>>> appreciated. Also, the mezzanine cards are already installed on the blade
>>>>>>> hosts:
>>>>>>>
>>>>>>> # lspci
>>>>>>> (...)
>>>>>>> 41:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>>
>>>>>>> *German*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2015-10-07 11:47 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com
>>>>>>> >:
>>>>>>>
>>>>>>> Hi again German,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Looks like you made some progress from yesterday as the qib ports
>>>>>>> are now Polling rather than Disabled.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> But since they are Down, do you have them cabled to a switch ? That
>>>>>>> should bring the links up and the port state will be Init. That is the
>>>>>>> "starting" point.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> You will also then need to be running SM to bring the ports up to
>>>>>>> Active.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- Hal
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 7, 2015 at 10:37 AM, German Anders <ganders at despegar.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I don't know if this is the right mailing list for this kind of topic, but
>>>>>>> I'm really new to IB and I've just installed two SX6036G gateways connected
>>>>>>> to each other through two ISL ports. Then I configured a proxy-arp
>>>>>>> between both nodes (SM is disabled on both GWs):
>>>>>>>
>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show proxy-arp ha
>>>>>>>
>>>>>>> Load balancing algorithm: ib-base-ip
>>>>>>> Number of Proxy-Arp interfaces: 1
>>>>>>>
>>>>>>> Proxy-ARP VIP
>>>>>>> =============
>>>>>>> Pra-group name: proxy-ha-group
>>>>>>> HA VIP address: 10.xx.xx.xx/xx
>>>>>>>
>>>>>>> Active nodes:
>>>>>>> ID                   State                IP
>>>>>>> --------------------------------------------------------------
>>>>>>> GWIB01               master               10.xx.xx.xx1
>>>>>>> GWIB02               standby              10.xx.xx.xx2
>>>>>>>
>>>>>>> Then I set up two SX6018F switches (*SWIB01* and *SWIB02*), one
>>>>>>> connected to GWIB01 and the other connected to GWIB02. The SM is configured
>>>>>>> locally on both the SWIB01 & SWIB02 switches. So far so good. After this config
>>>>>>> I connected a commodity server with a MLNX IB ADPT FDR to the SWIB01 & SWIB02
>>>>>>> switches, configured the drivers, etc., and got it up & running fine.
>>>>>>>
>>>>>>> Finally I've set up an HP Enclosure with an internal IB SW (then
>>>>>>> connected port 1 of the internal SW to GWIB01 - link is up but LLR status is
>>>>>>> inactive), installed one of the blades, and I see the following:
>>>>>>>
>>>>>>> # ibstat
>>>>>>> CA 'qib0'
>>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>>     Number of ports: 2
>>>>>>>     Firmware version:
>>>>>>>     Hardware version: 2
>>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>>     Port 1:
>>>>>>>         State: Down
>>>>>>>         Physical state: Polling
>>>>>>>         Rate: 40
>>>>>>>         Base lid: 4660
>>>>>>>         LMC: 0
>>>>>>>         SM lid: 4660
>>>>>>>         Capability mask: 0x0761086a
>>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>>         Link layer: InfiniBand
>>>>>>>     Port 2:
>>>>>>>         State: Down
>>>>>>>         Physical state: Polling
>>>>>>>         Rate: 40
>>>>>>>         Base lid: 4660
>>>>>>>         LMC: 0
>>>>>>>         SM lid: 4660
>>>>>>>         Capability mask: 0x0761086a
>>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>>         Link layer: InfiniBand
>>>>>>>
>>>>>>> So I was wondering if maybe the SM is not being recognized on the
>>>>>>> blade system and that's why it is not getting past the Polling state. Is that
>>>>>>> possible? Or maybe it is not possible to connect an ISL between the GW and the
>>>>>>> HP internal SW so that the SM is available, or maybe the inactive LLR is
>>>>>>> causing this. Any ideas? I thought about connecting the ISL
>>>>>>> of the HP IB SW to SWIB01 or SWIB02 instead of the GWs, but I don't
>>>>>>> have any available ports.
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>>
>>>>>>> *German*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Users mailing list
>>>>>>> Users at lists.openfabrics.org
>>>>>>> http://lists.openfabrics.org/mailman/listinfo/users