[Users] IB topology config and polling state

German Anders ganders at despegar.com
Tue Oct 13 10:09:15 PDT 2015


OK, some good news! I've found a way to get LinkUp on the blade port...
but not for long :S. It seems that after making the following changes the
ports eventually come up, but then a couple of minutes later they go down
again:

- Move the Encl IB SW to Bay7
- Move the Blade Mezz cards to slot 3

Then, for example:

Blade IB SW - LID: 29
Blade 4 -> IB-SW-PORT 28

1)

# ibportstate -L 29 28
Switch PortInfo:
# Port info: Lid 29 port 28
LinkState:.......................Down
PhysLinkState:...................Polling
Lid:.............................75
SMLid:...........................2328
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................2.5 Gbps


2)

# ibportstate -L 29 28 disable
# ibportstate -L 29 28 speed 4
# ibportstate -L 29 28 espeed 4
# ibportstate -L 29 28 smlid 2
# ibportstate -L 29 28 enable
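
Side note: if I'm reading the ibportstate man page right, 'speed 4' sets
only the 10.0 Gbps bit of the LinkSpeedEnabled bitmask (1 = 2.5 Gbps,
2 = 5.0 Gbps, 4 = 10.0 Gbps), and 'smlid 2' points the port at the SM's
LID. To double-check that the settings took effect:

# ibportstate -L 29 28 | grep -E 'LinkSpeed|SMLid'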

3)

# ibportstate -L 29 28
Switch PortInfo:
# Port info: Lid 29 port 28
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................75
SMLid:...........................2328
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
Peer PortInfo:
# Port info: Lid 29 DR path slid 4; dlid 65535; 0,28 port 1
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................30
SMLid:...........................2
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............10.0 Gbps (IBA extension)
LinkSpeedEnabled:................10.0 Gbps (IBA extension)
LinkSpeedActive:.................10.0 Gbps
Mkey:............................<not displayed>
MkeyLeasePeriod:.................0
ProtectBits:.....................0


# ibstat
CA 'qib0'
    CA type: InfiniPath_QMH7342
    Number of ports: 2
    Firmware version:
    Hardware version: 2
    Node GUID: 0x0011750000791fec
    System image GUID: 0x0011750000791fec
    Port 1:
        State: *Active*
        Physical state: *LinkUp*
        Rate: 40
        Base lid: 30
        LMC: 0
        SM lid: 2
        Capability mask: 0x0761086a
        Port GUID: 0x0011750000791fec
        Link layer: InfiniBand
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 40
        Base lid: 65535
        LMC: 0
        SM lid: 65535
        Capability mask: 0x0761086a
        Port GUID: 0x0011750000791fed
        Link layer: InfiniBand


A couple of minutes later...

# ibportstate -L 29 28
Switch PortInfo:
# Port info: Lid 29 port 28
LinkState:.......................Down
PhysLinkState:...................Polling
Lid:.............................75
SMLid:...........................2328
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................2.5 Gbps
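
To catch the exact moment it drops, something like this should work for
watching the port state (untested one-liner, adjust the interval as needed):

# while true; do date; ibportstate -L 29 28 | grep -E 'LinkState|PhysLinkState'; sleep 30; done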


*German* <ganders at despegar.com>

2015-10-13 12:02 GMT-03:00 German Anders <ganders at despegar.com>:

> Can anyone who's using the HP BLc 4X QDR IB Switch tell me which
> interconnect bay the switch is configured in? I have it in bay 3 & mezz 1;
> I'll try moving it to bay 7 & mezz 3 and see if at least something changes.
> Any useful info will be really appreciated.
>
> Cheers,
>
>
> *German* <ganders at despegar.com>
>
> 2015-10-13 10:51 GMT-03:00 German Anders <ganders at despegar.com>:
>
>> Yes, but the switch does not come with an external interface... just the
>> i2c port :S
>>
>>
>>
>> *German* <ganders at despegar.com>
>>
>> 2015-10-13 10:49 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>
>>> I don't really know, but I was wondering whether your hardware
>>> supports this.
>>>
>>> HPOA_2 shows that the internal interface to the OA is absent, as is the
>>> external Ethernet interface.
>>>
>>> On Tue, Oct 13, 2015 at 9:31 AM, German Anders <ganders at despegar.com>
>>> wrote:
>>>
>>>> When trying to assign the IP address to the interconnect bay, it seems
>>>> that it does not 'apply' the change, and the IP address is not displayed
>>>> as 'configured', so there's no way to get in from outside. Ideas?
>>>>
>>>> *German*
>>>>
>>>> 2015-10-13 10:15 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>
>>>>> What about the 3 critical errors ? What are they ?
>>>>>
>>>>> On Tue, Oct 13, 2015 at 9:13 AM, German Anders <ganders at despegar.com>
>>>>> wrote:
>>>>>
>>>>>> I've tried that; in fact I tried to set the IP address like this:
>>>>>>
>>>>>> SET EBIPA INTERCONNECT XX.XX.XX.XX XXX.XXX.XXX.XXX 3
>>>>>> SET EBIPA INTERCONNECT GATEWAY XX.XX.XX.XX 3
>>>>>> SET EBIPA INTERCONNECT DOMAIN "xxxxxx.net" 3
>>>>>> ADD EBIPA INTERCONNECT DNS 10.xx.xx.xx 3
>>>>>> ADD EBIPA INTERCONNECT DNS 10.xx.xx.xx 3
>>>>>> SET EBIPA INTERCONNECT NTP PRIMARY NONE 3
>>>>>> SET EBIPA INTERCONNECT NTP SECONDARY NONE 3
>>>>>> ENABLE EBIPA INTERCONNECT 3
>>>>>>
>>>>>> SAVE EBIPA
>>>>>>
>>>>>> But I'm not getting any IP response, and I've tried many different IP
>>>>>> addresses with no luck... if I assign that IP to one of the blades it
>>>>>> works fine, but not to the interconnect bay :( Any other ideas?
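>>>>>>
>>>>>> If it helps: after the SAVE it may be worth checking what the OA thinks
>>>>>> is configured. Assuming the command exists on this OA firmware version,
>>>>>> this should list the per-bay EBIPA settings, including whether EBIPA is
>>>>>> enabled for interconnect bay 3:
>>>>>>
>>>>>> SHOW EBIPA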
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>
>>>>>> *German* <ganders at despegar.com>
>>>>>>
>>>>>> 2015-10-13 10:01 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>
>>>>>>> Looks like there are 3 critical errors in system status. Did you
>>>>>>> look at these ?
>>>>>>>
>>>>>>> I don't know if you've seen this but there is some info on
>>>>>>> configuring the management IPs in
>>>>>>> http://h10032.www1.hp.com/ctg/Manual/c00814176.pdf
>>>>>>>
>>>>>>> Have you looked at/tried the command line interface ?
>>>>>>>
>>>>>>> On Tue, Oct 13, 2015 at 8:28 AM, German Anders <ganders at despegar.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Hi Hal,
>>>>>>>>
>>>>>>>> It does not allow me to set up an IP address on the internal switch,
>>>>>>>> so I can't access it from outside except with the tools I mentioned
>>>>>>>> before; it also doesn't allow me to access it through a serial
>>>>>>>> connection from inside the enclosure. I've attached some screenshots
>>>>>>>> of the connectivity.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>> 2015-10-13 9:13 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>
>>>>>>>> :
>>>>>>>>
>>>>>>>>> Hi German,
>>>>>>>>>
>>>>>>>>> Are the cards in the correct bays and slots ?
>>>>>>>>>
>>>>>>>>> Do you have the HP Onboard Administrator tool ? What does it say
>>>>>>>>> about internal connectivity ?
>>>>>>>>>
>>>>>>>>> -- Hal
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Oct 13, 2015 at 7:44 AM, German Anders <
>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ira,
>>>>>>>>>>
>>>>>>>>>> I have some HP documentation but it's quite short, and it doesn't
>>>>>>>>>> describe any 'config' or 'impl' steps in order to get the internal
>>>>>>>>>> switch up and running. The version of the switch that came with the
>>>>>>>>>> enclosure does not have any management module at all, so it depends
>>>>>>>>>> on external management. During the weekend I found a way to upgrade
>>>>>>>>>> the firmware of the HP switch with the following command
>>>>>>>>>> (mlxburn -d lid-0x001D -fw fw-IS4.mlx), then I ran
>>>>>>>>>> (flint -d /dev/mst/SW_MT48438_0x2c902004b0918_lid-0x001D dc
>>>>>>>>>> /home/ceph/HPIBSW.INI) and found the following inside that file:
>>>>>>>>>>
>>>>>>>>>> [PS_INFO]
>>>>>>>>>> Name = 489184-B21
>>>>>>>>>> Description = HP BLc 4X QDR IB Switch
>>>>>>>>>>
>>>>>>>>>> [ADAPTER]
>>>>>>>>>> PSID = HP_0100000009
>>>>>>>>>>
>>>>>>>>>> (...)
>>>>>>>>>>
>>>>>>>>>> [IB_TO_HW_MAP]
>>>>>>>>>> PORT1=14
>>>>>>>>>> PORT2=15
>>>>>>>>>> PORT3=16
>>>>>>>>>> PORT4=17
>>>>>>>>>> PORT5=18
>>>>>>>>>> PORT6=12
>>>>>>>>>> PORT7=11
>>>>>>>>>> PORT8=10
>>>>>>>>>> PORT9=9
>>>>>>>>>> PORT10=8
>>>>>>>>>> PORT11=7
>>>>>>>>>> PORT12=6
>>>>>>>>>> PORT13=5
>>>>>>>>>> PORT14=4
>>>>>>>>>> PORT15=3
>>>>>>>>>> PORT16=2
>>>>>>>>>> PORT17=20
>>>>>>>>>> PORT18=22
>>>>>>>>>> PORT19=24
>>>>>>>>>>
>>>>>>>>>> PORT20=26
>>>>>>>>>> PORT21=28
>>>>>>>>>> PORT22=30
>>>>>>>>>> PORT23=35
>>>>>>>>>> PORT24=33
>>>>>>>>>> PORT25=21
>>>>>>>>>> PORT26=23
>>>>>>>>>> PORT27=25
>>>>>>>>>> PORT28=27
>>>>>>>>>> PORT29=29
>>>>>>>>>> PORT30=36
>>>>>>>>>> PORT31=34
>>>>>>>>>> PORT32=32
>>>>>>>>>> PORT33=1
>>>>>>>>>> PORT34=13
>>>>>>>>>> PORT35=19
>>>>>>>>>> PORT36=31
>>>>>>>>>>
>>>>>>>>>> [unused_ports]
>>>>>>>>>> hw_port1_not_in_use=1
>>>>>>>>>> hw_port13_not_in_use=1
>>>>>>>>>> hw_port19_not_in_use=1
>>>>>>>>>> hw_port31_not_in_use=1
>>>>>>>>>>
>>>>>>>>>> (...)
>>>>>>>>>>
>>>>>>>>>> I don't know if maybe there's some issue with the port mapping;
>>>>>>>>>> has anyone used this kind of switch?
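>>>>>>>>>>
>>>>>>>>>> In case it's useful, a quick way to dump that IB-to-HW mapping from
>>>>>>>>>> the INI file pulled with flint (just a one-liner over the PORTn=m
>>>>>>>>>> format shown above):
>>>>>>>>>>
>>>>>>>>>> # awk -F= '/^PORT[0-9]+=/ {print "IB " $1 " -> HW port " $2}' /home/ceph/HPIBSW.INI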
>>>>>>>>>>
>>>>>>>>>> The summary of the problem is correct: the connectivity between
>>>>>>>>>> the IB network (MLNX switches/gw) and the HP IB switch is working,
>>>>>>>>>> since I was able to upgrade the firmware of the switch and get
>>>>>>>>>> information about it. But the connection between the mezzanine
>>>>>>>>>> cards of the blades and the internal IB switch of the enclosure is
>>>>>>>>>> not working at all. Note that if I go to the OA administration of
>>>>>>>>>> the enclosure I can see the 'green' port mapping of each of the
>>>>>>>>>> blades and the interconnect switch, so I'm guessing that it should
>>>>>>>>>> be working.
>>>>>>>>>>
>>>>>>>>>> Regarding the questions:
>>>>>>>>>>
>>>>>>>>>> 1)      What type of switch is in the HP chassis?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *QLogic HP BLc 4X QDR IB Switch*
>>>>>>>>>>
>>>>>>>>>> *PSID = HP_0100000009*
>>>>>>>>>>
>>>>>>>>>> *Image type:   FS2*
>>>>>>>>>>
>>>>>>>>>> *FW ver:         7.4.3000*
>>>>>>>>>>
>>>>>>>>>> *Device ID:     48438*
>>>>>>>>>> *GUID:            0002c902004b0918*
>>>>>>>>>>
>>>>>>>>>> 2)      Do you have console access or http access to that switch?
>>>>>>>>>>
>>>>>>>>>> *No, since it doesn't have any management module mezzanine card
>>>>>>>>>> inside the switch; it only comes with an i2c port. But I can access
>>>>>>>>>> it through the mlxburn and flint tools from a host that's connected
>>>>>>>>>> to the IB network (outside the enclosure).*
>>>>>>>>>>
>>>>>>>>>> 3)      Does that switch have an SM in it?
>>>>>>>>>>
>>>>>>>>>> *No*
>>>>>>>>>>
>>>>>>>>>> 4)      What version of the kernel are you running with the qib
>>>>>>>>>> cards?
>>>>>>>>>>
>>>>>>>>>> a.       I assume you are using the qib driver in that kernel.
>>>>>>>>>>
>>>>>>>>>> *Ubuntu 14.04.3 LTS - kernel 3.18.20-031820-generic*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> At some point Hal spoke of “LLR being a Mellanox thing”. Was that
>>>>>>>>>> to solve the problem of connecting the “HP switch” to the Mellanox switch?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *No; LLR is only supported between MLNX devices. The ISLs are up
>>>>>>>>>> and working, since it's possible for me to query the switch.*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I would like it if you could verify that the
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Is being run?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *When trying to run the command:*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *# /usr/sbin/truescale-serdes.cmds*
>>>>>>>>>> */usr/sbin/truescale-serdes.cmds: 100: /usr/sbin/truescale-serdes.cmds:
>>>>>>>>>> Syntax error: "(" unexpected (expecting "}")*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Also what version of libipathverbs do you have?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *# rpm -qa | grep libipathverbs*
>>>>>>>>>> *libipathverbs-1.3-1.x86_64*
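>>>>>>>>>>
>>>>>>>>>> Since this box is Ubuntu, dpkg is probably the more reliable check
>>>>>>>>>> here (the rpm database may be empty or stale on a Debian-based
>>>>>>>>>> system):
>>>>>>>>>>
>>>>>>>>>> # dpkg -l | grep ipathverbs
>>>>>>>>>>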
>>>>>>>>>> Thanks in advance,
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *German*
>>>>>>>>>> 2015-10-13 2:14 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>>>
>>>>>>>>>>> German,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Do you have any documentation on the HP blade system?  And the
>>>>>>>>>>> switch which is in that system?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have to admit I have not followed everything in this thread
>>>>>>>>>>> regarding your configuration but it seems like you have some mellanox
>>>>>>>>>>> switches connected into an HP chassis which has both a switch and blades
>>>>>>>>>>> with qib (Truescale) cards.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The connection from the mellanox switch to the “HP chassis
>>>>>>>>>>> switch” is linkup (active) but the connections to the individual qib HCAs
>>>>>>>>>>> are not even linkup.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Is that a correct summary of the problem?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> If so here are some questions:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 1)      What type of switch is in the HP chassis?
>>>>>>>>>>>
>>>>>>>>>>> 2)      Do you have console access or http access to that
>>>>>>>>>>> switch?
>>>>>>>>>>>
>>>>>>>>>>> 3)      Does that switch have an SM in it?
>>>>>>>>>>>
>>>>>>>>>>> 4)      What version of the kernel are you running with the qib
>>>>>>>>>>> cards?
>>>>>>>>>>>
>>>>>>>>>>> a.       I assume you are using the qib driver in that kernel.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> At some point Hal spoke of “LLR being a Mellanox thing”. Was
>>>>>>>>>>> that to solve the problem of connecting the “HP switch” to the Mellanox
>>>>>>>>>>> switch?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I would like it if you could verify that the
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Is being run?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Also what version of libipathverbs do you have?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Ira
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *Weiny, Ira
>>>>>>>>>>> *Sent:* Wednesday, October 07, 2015 1:31 PM
>>>>>>>>>>> *To:* Hal Rosenstock; German Anders
>>>>>>>>>>>
>>>>>>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Agree with Hal here.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I’m not familiar with those blades/switches.  I’ll ask around.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Ira
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *From:* Hal Rosenstock [mailto:hal.rosenstock at gmail.com
>>>>>>>>>>> <hal.rosenstock at gmail.com>]
>>>>>>>>>>> *Sent:* Wednesday, October 07, 2015 1:26 PM
>>>>>>>>>>> *To:* German Anders
>>>>>>>>>>> *Cc:* Weiny, Ira; users at lists.openfabrics.org
>>>>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> That's the gateway to the switch in the enclosure. It's the
>>>>>>>>>>> internal connectivity in the blade enclosure that's (physically) broken.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 4:24 PM, German Anders <
>>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> cabled
>>>>>>>>>>>
>>>>>>>>>>> the blade side is:
>>>>>>>>>>>
>>>>>>>>>>> vendid=0x2c9
>>>>>>>>>>> devid=0xbd36
>>>>>>>>>>> sysimgguid=0x2c902004b0918
>>>>>>>>>>> switchguid=0x2c902004b0918(2c902004b0918)
>>>>>>>>>>> Switch    32 "S-0002c902004b0918"        # "Infiniscale-IV
>>>>>>>>>>> Mellanox Technologies" base port 0 *lid 29* lmc 0
>>>>>>>>>>> [1]    "S-e41d2d030031e9c1"[9]        # "MF0;GWIB01:SX6036G/U1"
>>>>>>>>>>> lid 24 4xQDR
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2015-10-07 17:21 GMT-03:00 Hal Rosenstock <
>>>>>>>>>>> hal.rosenstock at gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>> What are those HCAs cabled to or is it internal to the blade
>>>>>>>>>>> enclosure ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 3:24 PM, German Anders <
>>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Yeah, is there any command that I can run in order to change the
>>>>>>>>>>> port state on the remote switch? I mean, everything looks good but
>>>>>>>>>>> on the HP blades I'm still getting:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> # ibstat
>>>>>>>>>>> CA 'qib0'
>>>>>>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>>>>>>     Number of ports: 2
>>>>>>>>>>>     Firmware version:
>>>>>>>>>>>     Hardware version: 2
>>>>>>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>>>>>>     Port 1:
>>>>>>>>>>>         State: *Down*
>>>>>>>>>>>         Physical state: *Polling*
>>>>>>>>>>>         Rate: 40
>>>>>>>>>>>         Base lid: 4660
>>>>>>>>>>>         LMC: 0
>>>>>>>>>>>         SM lid: 4660
>>>>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>>>>>>         Link layer: InfiniBand
>>>>>>>>>>>     Port 2:
>>>>>>>>>>>         State: *Down*
>>>>>>>>>>>         Physical state: *Polling*
>>>>>>>>>>>         Rate: 40
>>>>>>>>>>>         Base lid: 4660
>>>>>>>>>>>         LMC: 0
>>>>>>>>>>>         SM lid: 4660
>>>>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>>>>>>         Link layer: InfiniBand
>>>>>>>>>>>
>>>>>>>>>>> Also, on working hosts I only see devices from the local network,
>>>>>>>>>>> but I don't see any of the blades' HCA connections.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2015-10-07 16:21 GMT-03:00 Hal Rosenstock <
>>>>>>>>>>> hal.rosenstock at gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>> The screen shot looks good :-) SM brought the link up to active.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Note that the ibportstate command you gave was for switch port 0
>>>>>>>>>>> of the Mellanox IS-4 switch in the QLogic HP BLc 4X QDR IB Switch.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 3:06 PM, German Anders <
>>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Yes, attached is a screenshot of the port information (# 9), the
>>>>>>>>>>> one that makes the ISL to the QLogic HP BLc 4X QDR IB Switch. Also,
>>>>>>>>>>> from one of the hosts connected to one of the SX6018F switches I
>>>>>>>>>>> can see the 'remote' HP IB SW:
>>>>>>>>>>>
>>>>>>>>>>> # *ibnodes*
>>>>>>>>>>>
>>>>>>>>>>> (...)
>>>>>>>>>>> Switch    : 0x0002c902004b0918 ports 32 "Infiniscale-IV Mellanox
>>>>>>>>>>> Technologies" base port 0 *lid 29* lmc 0
>>>>>>>>>>> Switch    : 0xe41d2d030031e9c1 ports 37 "MF0;GWIB01:SX6036G/U1"
>>>>>>>>>>> enhanced port 0 lid 24 lmc 0
>>>>>>>>>>> (...)
>>>>>>>>>>>
>>>>>>>>>>> # *ibportstate -L 29 query*
>>>>>>>>>>> Switch PortInfo:
>>>>>>>>>>> # Port info: Lid 29 port 0
>>>>>>>>>>> LinkState:.......................Active
>>>>>>>>>>> PhysLinkState:...................LinkUp
>>>>>>>>>>> Lid:.............................29
>>>>>>>>>>> SMLid:...........................2
>>>>>>>>>>> LMC:.............................0
>>>>>>>>>>> LinkWidthSupported:..............1X or 4X
>>>>>>>>>>> LinkWidthEnabled:................1X or 4X
>>>>>>>>>>> LinkWidthActive:.................4X
>>>>>>>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0
>>>>>>>>>>> Gbps
>>>>>>>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0
>>>>>>>>>>> Gbps
>>>>>>>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>>>>>>>> Mkey:............................<not displayed>
>>>>>>>>>>> MkeyLeasePeriod:.................0
>>>>>>>>>>> ProtectBits:.....................0
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2015-10-07 16:00 GMT-03:00 Hal Rosenstock <
>>>>>>>>>>> hal.rosenstock at gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>> One more thing hopefully before playing with the low level phy
>>>>>>>>>>> settings:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Are you using known good cables ? Do you have FDR cables on the
>>>>>>>>>>> FDR <-> FDR links ? Cable lengths can matter as well.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 12:57 PM, Hal Rosenstock <
>>>>>>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Were the ports mapped to the phy profile shutdown when you
>>>>>>>>>>> changed this ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> LLR is a proprietary Mellanox mechanism.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> You might want 2 different profiles: one for the interfaces
>>>>>>>>>>> connected to other gateway interfaces (which are FDR (and FDR-10)
>>>>>>>>>>> capable) and the other for the interfaces connecting to QDR (the
>>>>>>>>>>> older equipment in your network). By configuring the Switch-X
>>>>>>>>>>> interfaces to the appropriate possible speeds and disabling the
>>>>>>>>>>> proprietary mechanisms there, the link should not only come up but
>>>>>>>>>>> will also do so faster than if FDR/FDR10 are enabled.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I suspect that, due to the Switch-X configuration, the links to
>>>>>>>>>>> the switch(es) in the HP enclosures do not negotiate properly (as
>>>>>>>>>>> shown by Down rather than LinkUp).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Once you get all your links to INIT, negotiation has occurred
>>>>>>>>>>> and then it's time for SM to bring links to active.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Since you have down links, the SM can't do anything about those.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 12:44 PM, German Anders <
>>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Has anyone had any experience with the HP BLc 4X QDR IB Switch? I
>>>>>>>>>>> know that this kind of switch does not come with an embedded SM,
>>>>>>>>>>> but I don't know how to access any management at all on this
>>>>>>>>>>> particular switch, for example to set the speed or anything like
>>>>>>>>>>> that. Is it possible to access it through the chassis?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2015-10-07 13:19 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>>>>>>>>
>>>>>>>>>>> I think so, but when trying to configure the phy-profile on the
>>>>>>>>>>> interface in order to negotiate QDR, it failed to map the profile:
>>>>>>>>>>>
>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>>>>>>> high-speed-ber
>>>>>>>>>>>
>>>>>>>>>>>   Profile: high-speed-ber
>>>>>>>>>>>   --------
>>>>>>>>>>>   llr support ib-speed
>>>>>>>>>>>   SDR: disable
>>>>>>>>>>>   DDR: disable
>>>>>>>>>>>   QDR: disable
>>>>>>>>>>>   FDR10: enable-request
>>>>>>>>>>>   FDR: enable-request
>>>>>>>>>>>
>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>>>>>>> hp-encl-isl
>>>>>>>>>>>
>>>>>>>>>>>   Profile: hp-encl-isl
>>>>>>>>>>>   --------
>>>>>>>>>>>   llr support ib-speed
>>>>>>>>>>>   SDR: disable
>>>>>>>>>>>   DDR: disable
>>>>>>>>>>>   QDR: enable
>>>>>>>>>>>   FDR10: enable-request
>>>>>>>>>>>   FDR: enable-request
>>>>>>>>>>>
>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) #
>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # interface ib 1/9
>>>>>>>>>>> phy-profile map hp-encl-isl
>>>>>>>>>>> *% Cannot map profile hp-encl-isl to port:  1/9*
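>>>>>>>>>>>
>>>>>>>>>>> My guess is that the port has to be administratively down before
>>>>>>>>>>> the profile can be mapped; something along these lines, assuming
>>>>>>>>>>> standard MLNX-OS syntax:
>>>>>>>>>>>
>>>>>>>>>>> GWIB01 (config) # interface ib 1/9 shutdown
>>>>>>>>>>> GWIB01 (config) # interface ib 1/9 phy-profile map hp-encl-isl
>>>>>>>>>>> GWIB01 (config) # no interface ib 1/9 shutdown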
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2015-10-07 13:17 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>>>>
>>>>>>>>>>> The ‘qib’ driver is loading fine, as can be seen from the ibstat
>>>>>>>>>>> output. ib_ipath is the driver for an older card.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The problem is that the link is not coming up to init. As Hal
>>>>>>>>>>> said, the link should transition to “link up” without the SM's
>>>>>>>>>>> involvement.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I think you are on to something with the fact that it seems like
>>>>>>>>>>> your switch ports are not configured to do QDR.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Ira
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *From:* German Anders [mailto:ganders at despegar.com]
>>>>>>>>>>> *Sent:* Wednesday, October 07, 2015 9:05 AM
>>>>>>>>>>> *To:* Weiny, Ira
>>>>>>>>>>> *Cc:* Hal Rosenstock; users at lists.openfabrics.org
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, I have that file:
>>>>>>>>>>>
>>>>>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>>>>>
>>>>>>>>>>> Also, I've installed libipathverbs:
>>>>>>>>>>>
>>>>>>>>>>> # apt-get install libipathverbs-dev
>>>>>>>>>>>
>>>>>>>>>>> But when I try to load the ib_ipath module I get the following
>>>>>>>>>>> error message:
>>>>>>>>>>>
>>>>>>>>>>> # modprobe ib_ipath
>>>>>>>>>>> modprobe: ERROR: could not insert 'ib_ipath': Device or resource
>>>>>>>>>>> busy
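>>>>>>>>>>>
>>>>>>>>>>> Possibly that's just because the qib driver already claims the
>>>>>>>>>>> HCA (ib_ipath is for the older generation). Checking which module
>>>>>>>>>>> is actually loaded should clarify:
>>>>>>>>>>>
>>>>>>>>>>> # lsmod | grep -E 'ib_qib|ib_ipath'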
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2015-10-07 12:54 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>>>>
>>>>>>>>>>> There are a few issues for routing in that diagram but the links
>>>>>>>>>>> should come up.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I assume there is some backplane between the blade servers and
>>>>>>>>>>> the switch in that chassis?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Have you gotten libipathverbs installed?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In libipathverbs there is a serdes tuning script.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/01org/libipathverbs/blob/master/truescale-serdes.cmds
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Does your libipathverbs include that file?  If not try the
>>>>>>>>>>> latest from github.
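>>>>>>>>>>>
>>>>>>>>>>> For example, a raw download of that same file (if you want to try
>>>>>>>>>>> it without rebuilding the whole package):
>>>>>>>>>>>
>>>>>>>>>>> # wget https://raw.githubusercontent.com/01org/libipathverbs/master/truescale-serdes.cmds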
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Ira
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *German
>>>>>>>>>>> Anders
>>>>>>>>>>> *Sent:* Wednesday, October 07, 2015 8:41 AM
>>>>>>>>>>> *To:* Hal Rosenstock
>>>>>>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi Hal,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the reply. I've attached a PDF with the topology
>>>>>>>>>>> diagram; I don't know if this is the best way to go or if there's
>>>>>>>>>>> another way to connect and set up the IB network, so tips and
>>>>>>>>>>> suggestions will be very appreciated. Also, the mezzanine cards
>>>>>>>>>>> are already installed on the blade hosts:
>>>>>>>>>>>
>>>>>>>>>>> # lspci
>>>>>>>>>>> (...)
>>>>>>>>>>> 41:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev
>>>>>>>>>>> 02)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2015-10-07 11:47 GMT-03:00 Hal Rosenstock <
>>>>>>>>>>> hal.rosenstock at gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>> Hi again German,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Looks like you made some progress from yesterday as the qib
>>>>>>>>>>> ports are now Polling rather than Disabled.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> But since they are Down, do you have them cabled to a switch ?
>>>>>>>>>>> That should bring the links up and the port state will be Init. That is the
>>>>>>>>>>> "starting" point.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> You will also then need to be running SM to bring the ports up
>>>>>>>>>>> to Active.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -- Hal
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:37 AM, German Anders <
>>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I don't know if this is the right mailing list for this kind of
>>>>>>>>>>> topic, but I'm really new to IB. I've just installed two SX6036G
>>>>>>>>>>> gateways connected to each other through two ISL ports, then
>>>>>>>>>>> configured proxy-arp between both nodes (the SM is disabled on
>>>>>>>>>>> both GWs):
>>>>>>>>>>>
>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show proxy-arp ha
>>>>>>>>>>>
>>>>>>>>>>> Load balancing algorithm: ib-base-ip
>>>>>>>>>>> Number of Proxy-Arp interfaces: 1
>>>>>>>>>>>
>>>>>>>>>>> Proxy-ARP VIP
>>>>>>>>>>> =============
>>>>>>>>>>> Pra-group name: proxy-ha-group
>>>>>>>>>>> HA VIP address: 10.xx.xx.xx/xx
>>>>>>>>>>>
>>>>>>>>>>> Active nodes:
>>>>>>>>>>> ID                   State                IP
>>>>>>>>>>> --------------------------------------------------------------
>>>>>>>>>>> GWIB01               master               10.xx.xx.xx1
>>>>>>>>>>> GWIB02               standby              10.xx.xx.xx2
>>>>>>>>>>>
>>>>>>>>>>> Then I set up two SX6018F switches (*SWIB01* and *SWIB02*), one
>>>>>>>>>>> connected to GWIB01 and the other connected to GWIB02. The SM is
>>>>>>>>>>> configured locally on both the SWIB01 & SWIB02 switches. So far so
>>>>>>>>>>> good; after this config I connected a commodity server with a MLNX
>>>>>>>>>>> FDR IB adapter to the SWIB01 & SWIB02 switches, configured the
>>>>>>>>>>> drivers, etc., and got it up & running fine.
>>>>>>>>>>>
>>>>>>>>>>> Finally, I set up an HP enclosure with an internal IB switch (then
>>>>>>>>>>> connected port 1 of the internal switch to GWIB01 - the link is up
>>>>>>>>>>> but the LLR status is inactive), installed one of the blades, and
>>>>>>>>>>> I see the following:
>>>>>>>>>>>
>>>>>>>>>>> # ibstat
>>>>>>>>>>> CA 'qib0'
>>>>>>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>>>>>>     Number of ports: 2
>>>>>>>>>>>     Firmware version:
>>>>>>>>>>>     Hardware version: 2
>>>>>>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>>>>>>     Port 1:
>>>>>>>>>>>         State: Down
>>>>>>>>>>>         Physical state: Polling
>>>>>>>>>>>         Rate: 40
>>>>>>>>>>>         Base lid: 4660
>>>>>>>>>>>         LMC: 0
>>>>>>>>>>>         SM lid: 4660
>>>>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>>>>>>         Link layer: InfiniBand
>>>>>>>>>>>     Port 2:
>>>>>>>>>>>         State: Down
>>>>>>>>>>>         Physical state: Polling
>>>>>>>>>>>         Rate: 40
>>>>>>>>>>>         Base lid: 4660
>>>>>>>>>>>         LMC: 0
>>>>>>>>>>>         SM lid: 4660
>>>>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>>>>>>         Link layer: InfiniBand
>>>>>>>>>>>
>>>>>>>>>>> So I was wondering if maybe the SM is not being recognized on the
>>>>>>>>>>> blade system and that's why it's not passing the Polling state;
>>>>>>>>>>> is that possible? Or maybe it's not possible to connect an ISL
>>>>>>>>>>> between the GW and the HP internal switch so that the SM is
>>>>>>>>>>> available, or maybe the inactive LLR is causing this. Any ideas? I
>>>>>>>>>>> thought about connecting the ISL of the HP IB SW to SWIB01 or
>>>>>>>>>>> SWIB02 instead of the GWs, but I don't have any available ports.
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Users mailing list
>>>>>>>>>>> Users at lists.openfabrics.org
>>>>>>>>>>> http://lists.openfabrics.org/mailman/listinfo/users
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>