[Users] IB topology config and polling state

Hal Rosenstock hal.rosenstock at gmail.com
Tue Oct 13 06:15:12 PDT 2015


What about the 3 critical errors ? What are they ?

On Tue, Oct 13, 2015 at 9:13 AM, German Anders <ganders at despegar.com> wrote:

> I've tried that; in fact, I tried to set the IP address like this:
>
> SET EBIPA INTERCONNECT XX.XX.XX.XX XXX.XXX.XXX.XXX 3
> SET EBIPA INTERCONNECT GATEWAY XX.XX.XX.XX 3
> SET EBIPA INTERCONNECT DOMAIN "xxxxxx.net" 3
> ADD EBIPA INTERCONNECT DNS 10.xx.xx.xx 3
> ADD EBIPA INTERCONNECT DNS 10.xx.xx.xx 3
> SET EBIPA INTERCONNECT NTP PRIMARY NONE 3
> SET EBIPA INTERCONNECT NTP SECONDARY NONE 3
> ENABLE EBIPA INTERCONNECT 3
>
> SAVE EBIPA
>
> But I'm not getting any response from that IP. I've also tried many different IP addresses with no luck. If I assign the same IP to one of the blades it works fine, but not on the interconnect bay :( Any other ideas?
>
> Cheers,
>
>
> *German* <ganders at despegar.com>
>
> 2015-10-13 10:01 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>
>> Looks like there are 3 critical errors in system status. Did you look at
>> these ?
>>
>> I don't know if you've seen this but there is some info on configuring
>> the management IPs in http://h10032.www1.hp.com/ctg/Manual/c00814176.pdf
>>
>> Have you looked at/tried the command line interface ?
>>
>> On Tue, Oct 13, 2015 at 8:28 AM, German Anders <ganders at despegar.com>
>> wrote:
>>
>>> Hi Hal,
>>>
>>> It does not allow me to set up an IP address on the internal switch, so I
>>> can't access it from outside except with the tools that I mentioned before.
>>> It also doesn't allow me to access it through a serial connection from
>>> inside the enclosure. I've attached some screenshots of the connectivity.
>>>
>>>
>>>
>>> *German*
>>>
>>> 2015-10-13 9:13 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>
>>>> Hi German,
>>>>
>>>> Are the cards in the correct bays and slots ?
>>>>
>>>> Do you have the HP Onboard Administrator tool ? What does it say about
>>>> internal connectivity ?
>>>>
>>>> -- Hal
>>>>
>>>>
>>>>
>>>> On Tue, Oct 13, 2015 at 7:44 AM, German Anders <ganders at despegar.com>
>>>> wrote:
>>>>
>>>>> Hi Ira,
>>>>>
>>>>> I have some HP documentation, but it is quite short and doesn't
>>>>> describe any configuration or implementation steps for getting the
>>>>> internal switch up and running. The version of the switch that came with
>>>>> the enclosure does not have any management module at all, so it depends
>>>>> on external management. During the weekend I found a way to upgrade the
>>>>> firmware of the HP switch with the following command
>>>>> (mlxburn -d lid-0x001D -fw fw-IS4.mlx), and then I ran
>>>>> (flint -d /dev/mst/SW_MT48438_0x2c902004b0918_lid-0x001D dc
>>>>> /home/ceph/HPIBSW.INI) and found the following inside that file:
>>>>>
>>>>> [PS_INFO]
>>>>> Name = 489184-B21
>>>>> Description = HP BLc 4X QDR IB Switch
>>>>>
>>>>> [ADAPTER]
>>>>> PSID = HP_0100000009
>>>>>
>>>>> (...)
>>>>>
>>>>> [IB_TO_HW_MAP]
>>>>> PORT1=14
>>>>> PORT2=15
>>>>> PORT3=16
>>>>> PORT4=17
>>>>> PORT5=18
>>>>> PORT6=12
>>>>> PORT7=11
>>>>> PORT8=10
>>>>> PORT9=9
>>>>> PORT10=8
>>>>> PORT11=7
>>>>> PORT12=6
>>>>> PORT13=5
>>>>> PORT14=4
>>>>> PORT15=3
>>>>> PORT16=2
>>>>> PORT17=20
>>>>> PORT18=22
>>>>> PORT19=24
>>>>> PORT20=26
>>>>> PORT21=28
>>>>> PORT22=30
>>>>> PORT23=35
>>>>> PORT24=33
>>>>> PORT25=21
>>>>> PORT26=23
>>>>> PORT27=25
>>>>> PORT28=27
>>>>> PORT29=29
>>>>> PORT30=36
>>>>> PORT31=34
>>>>> PORT32=32
>>>>> PORT33=1
>>>>> PORT34=13
>>>>> PORT35=19
>>>>> PORT36=31
>>>>>
>>>>> [unused_ports]
>>>>> hw_port1_not_in_use=1
>>>>> hw_port13_not_in_use=1
>>>>> hw_port19_not_in_use=1
>>>>> hw_port31_not_in_use=1
>>>>>
>>>>> (...)
>>>>>
>>>>> I don't know whether there's some issue with the port mapping. Has
>>>>> anyone used this kind of switch?
>>>>>
>>>>> The summary of the problem is correct. The connectivity between the IB
>>>>> network (MLNX switches/gateways) and the HP IB switch is working, since I
>>>>> was able to upgrade the firmware of the switch and get information about
>>>>> it. But the connection between the mezzanine cards of the blades and the
>>>>> enclosure's internal IB switch is not working at all. Note that in the OA
>>>>> administration page of the enclosure I can see the 'green' port mapping
>>>>> between each of the blades and the interconnect switch, so I'm guessing
>>>>> that it should be working.
>>>>>
>>>>> Regarding the questions:
>>>>>
>>>>> 1)      What type of switch is in the HP chassis?
>>>>>
>>>>>
>>>>> *QLogic HP BLc 4X QDR IB Switch*
>>>>>
>>>>> *PSID = HP_0100000009*
>>>>>
>>>>> *Image type:   FS2*
>>>>>
>>>>> *FW ver:         7.4.3000*
>>>>>
>>>>> *Device ID:     48438*
>>>>> *GUID:             0002c902004b0918*
>>>>>
>>>>> 2)      Do you have console access or http access to that switch?
>>>>>
>>>>> *No, since it didn't have any management module mezzanine card inside the
>>>>> switch; it only comes with an I2C port. But I do have access through the
>>>>> mlxburn and flint tools from a host that's connected to the IB network
>>>>> (outside the enclosure).*
>>>>>
>>>>> 3)      Does that switch have an SM in it?
>>>>>
>>>>> *No*
>>>>>
>>>>> 4)      What version of the kernel are you running with the qib cards?
>>>>>
>>>>> a.       I assume you are using the qib driver in that kernel.
>>>>>
>>>>> *Ubuntu 14.04.3 LTS - kernel 3.18.20-031820-generic*
>>>>>
>>>>>
>>>>>
>>>>> At some point Hal spoke of “LLR being a Mellanox thing”  Was that to
>>>>> solve the problem of connecting the “HP switch” to the Mellanox switch?
>>>>>
>>>>>
>>>>>
>>>>> *No. LLR is only supported between MLNX devices. The ISLs are up
>>>>> and working, since it's possible for me to query the switch.*
>>>>>
>>>>>
>>>>>
>>>>> I would like it if you could verify that the
>>>>>
>>>>>
>>>>>
>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>
>>>>>
>>>>>
>>>>> Is being run?
>>>>>
>>>>>
>>>>> *When trying to run the command:*
>>>>>
>>>>>
>>>>>
>>>>> *# /usr/sbin/truescale-serdes.cmds*
>>>>> */usr/sbin/truescale-serdes.cmds: 100: /usr/sbin/truescale-serdes.cmds:
>>>>> Syntax error: "(" unexpected (expecting "}")*
>>>>>
>>>>>
>>>>>
>>>>> Also what version of libipathverbs do you have?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *# rpm -qa | grep libipathverbs*
>>>>> *libipathverbs-1.3-1.x86_64*
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>>
>>>>> *German*
>>>>> 2015-10-13 2:14 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>
>>>>>> German,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Do you have any documentation on the HP blade system?  And the switch
>>>>>> which is in that system?
>>>>>>
>>>>>>
>>>>>>
>>>>>> I have to admit I have not followed everything in this thread
>>>>>> regarding your configuration but it seems like you have some Mellanox
>>>>>> switches connected into an HP chassis which has both a switch and blades
>>>>>> with qib (Truescale) cards.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The connection from the Mellanox switch to the “HP chassis switch” is
>>>>>> linkup (active) but the connections to the individual qib HCAs are not even
>>>>>> linkup.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Is that a correct summary of the problem?
>>>>>>
>>>>>>
>>>>>>
>>>>>> If so here are some questions:
>>>>>>
>>>>>>
>>>>>>
>>>>>> 1)      What type of switch is in the HP chassis?
>>>>>>
>>>>>> 2)      Do you have console access or http access to that switch?
>>>>>>
>>>>>> 3)      Does that switch have an SM in it?
>>>>>>
>>>>>> 4)      What version of the kernel are you running with the qib
>>>>>> cards?
>>>>>>
>>>>>> a.       I assume you are using the qib driver in that kernel.
>>>>>>
>>>>>>
>>>>>>
>>>>>> At some point Hal spoke of “LLR being a Mellanox thing”  Was that to
>>>>>> solve the problem of connecting the “HP switch” to the Mellanox switch?
>>>>>>
>>>>>>
>>>>>>
>>>>>> I would like it if you could verify that the
>>>>>>
>>>>>>
>>>>>>
>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>
>>>>>>
>>>>>>
>>>>>> Is being run?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Also what version of libipathverbs do you have?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ira
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *Weiny, Ira
>>>>>> *Sent:* Wednesday, October 07, 2015 1:31 PM
>>>>>> *To:* Hal Rosenstock; German Anders
>>>>>>
>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>
>>>>>>
>>>>>>
>>>>>> Agree with Hal here.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I’m not familiar with those blades/switches.  I’ll ask around.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ira
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Hal Rosenstock [mailto:hal.rosenstock at gmail.com
>>>>>> <hal.rosenstock at gmail.com>]
>>>>>> *Sent:* Wednesday, October 07, 2015 1:26 PM
>>>>>> *To:* German Anders
>>>>>> *Cc:* Weiny, Ira; users at lists.openfabrics.org
>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>
>>>>>>
>>>>>>
>>>>>> That's the gateway to the switch in the enclosure. It's the internal
>>>>>> connectivity in the blade enclosure that's (physically) broken.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 4:24 PM, German Anders <ganders at despegar.com>
>>>>>> wrote:
>>>>>>
>>>>>> cabled
>>>>>>
>>>>>> the blade it's:
>>>>>>
>>>>>> vendid=0x2c9
>>>>>> devid=0xbd36
>>>>>> sysimgguid=0x2c902004b0918
>>>>>> switchguid=0x2c902004b0918(2c902004b0918)
>>>>>> Switch    32 "S-0002c902004b0918"        # "Infiniscale-IV Mellanox
>>>>>> Technologies" base port 0 *lid 29* lmc 0
>>>>>> [1]    "S-e41d2d030031e9c1"[9]        # "MF0;GWIB01:SX6036G/U1" lid
>>>>>> 24 4xQDR
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-10-07 17:21 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>
>>>>>> What are those HCAs cabled to or is it internal to the blade
>>>>>> enclosure ?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 3:24 PM, German Anders <ganders at despegar.com>
>>>>>> wrote:
>>>>>>
>>>>>> Yeah, is there any command that I can run in order to change the port
>>>>>> state on the remote switch? I mean, everything looks good, but on the HP
>>>>>> blades I'm still getting:
>>>>>>
>>>>>>
>>>>>>
>>>>>> # ibstat
>>>>>> CA 'qib0'
>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>     Number of ports: 2
>>>>>>     Firmware version:
>>>>>>     Hardware version: 2
>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>     Port 1:
>>>>>>         State: *Down*
>>>>>>         Physical state: *Polling*
>>>>>>         Rate: 40
>>>>>>         Base lid: 4660
>>>>>>         LMC: 0
>>>>>>         SM lid: 4660
>>>>>>         Capability mask: 0x0761086a
>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>         Link layer: InfiniBand
>>>>>>     Port 2:
>>>>>>         State: *Down*
>>>>>>         Physical state: *Polling*
>>>>>>         Rate: 40
>>>>>>         Base lid: 4660
>>>>>>         LMC: 0
>>>>>>         SM lid: 4660
>>>>>>         Capability mask: 0x0761086a
>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>         Link layer: InfiniBand
>>>>>>
>>>>>> Also, on working hosts I only see devices from the local network; I
>>>>>> don't see any of the blades' HCA connections.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-10-07 16:21 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>
>>>>>> The screen shot looks good :-) SM brought the link up to active.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Note that the ibportstate command you gave was for switch port 0 of
>>>>>> the Mellanox IS-4 switch in the QLogic HP BLc 4X QDR IB Switch.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 3:06 PM, German Anders <ganders at despegar.com>
>>>>>> wrote:
>>>>>>
>>>>>> Yes, please find attached a screenshot of the port information for port 9,
>>>>>> the one that forms the ISL to the QLogic HP BLc 4X QDR IB Switch. Also,
>>>>>> from one of the hosts connected to one of the SX6018F switches I can see
>>>>>> the 'remote' HP IB switch:
>>>>>>
>>>>>> # *ibnodes*
>>>>>>
>>>>>> (...)
>>>>>> Switch    : 0x0002c902004b0918 ports 32 "Infiniscale-IV Mellanox
>>>>>> Technologies" base port 0 *lid 29* lmc 0
>>>>>> Switch    : 0xe41d2d030031e9c1 ports 37 "MF0;GWIB01:SX6036G/U1"
>>>>>> enhanced port 0 lid 24 lmc 0
>>>>>> (...)
>>>>>>
>>>>>> # *ibportstate -L 29 query*
>>>>>> Switch PortInfo:
>>>>>> # Port info: Lid 29 port 0
>>>>>> LinkState:.......................Active
>>>>>> PhysLinkState:...................LinkUp
>>>>>> Lid:.............................29
>>>>>> SMLid:...........................2
>>>>>> LMC:.............................0
>>>>>> LinkWidthSupported:..............1X or 4X
>>>>>> LinkWidthEnabled:................1X or 4X
>>>>>> LinkWidthActive:.................4X
>>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>>> Mkey:............................<not displayed>
>>>>>> MkeyLeasePeriod:.................0
>>>>>> ProtectBits:.....................0
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-10-07 16:00 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>
>>>>>> One more thing hopefully before playing with the low level phy
>>>>>> settings:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Are you using known good cables ? Do you have FDR cables on the FDR
>>>>>> <-> FDR links ? Cable lengths can matter as well.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 12:57 PM, Hal Rosenstock <
>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>
>>>>>> Were the ports mapped to the phy profile shutdown when you changed
>>>>>> this ?
>>>>>>
>>>>>>
>>>>>>
>>>>>> LLR is a proprietary Mellanox mechanism.
>>>>>>
>>>>>>
>>>>>>
>>>>>> You might want 2 different profiles: one for the interfaces connected
>>>>>> to other gateway interfaces (which are FDR (and FDR-10) capable) and the
>>>>>> other for the interfaces connecting to QDR (the older equipment in your
>>>>>> network). By configuring the Switch-X interfaces to the appropriate
>>>>>> possible speeds and disabling the proprietary mechanisms there, the link
>>>>>> should not only come up but also do so faster than if FDR/FDR10
>>>>>> are enabled.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I suspect that, due to the Switch-X configuration, the links to
>>>>>> the switch(es) in the HP enclosures do not negotiate properly (as shown by
>>>>>> Down rather than LinkUp).
>>>>>>
>>>>>>
>>>>>>
>>>>>> Once you get all your links to INIT, negotiation has occurred and
>>>>>> then it's time for SM to bring links to active.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Since you have down links, the SM can't do anything about those.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 12:44 PM, German Anders <ganders at despegar.com>
>>>>>> wrote:
>>>>>>
>>>>>> Does anyone have any experience with the HP BLc 4X QDR IB Switch? I know
>>>>>> that this kind of switch does not come with an embedded SM, but I don't
>>>>>> know how to access any management at all on this particular switch, for
>>>>>> example to set the speed or anything like that. Is it possible to access
>>>>>> it through the chassis?
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-10-07 13:19 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>>>
>>>>>> I think so, but when I tried to configure the phy-profile on the
>>>>>> interface in order to negotiate at QDR, it failed to map the profile:
>>>>>>
>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>> high-speed-ber
>>>>>>
>>>>>>   Profile: high-speed-ber
>>>>>>   --------
>>>>>>   llr support ib-speed
>>>>>>   SDR: disable
>>>>>>   DDR: disable
>>>>>>   QDR: disable
>>>>>>   FDR10: enable-request
>>>>>>   FDR: enable-request
>>>>>>
>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>> hp-encl-isl
>>>>>>
>>>>>>   Profile: hp-encl-isl
>>>>>>   --------
>>>>>>   llr support ib-speed
>>>>>>   SDR: disable
>>>>>>   DDR: disable
>>>>>>   QDR: enable
>>>>>>   FDR10: enable-request
>>>>>>   FDR: enable-request
>>>>>>
>>>>>> GWIB01 [proxy-ha-group: master] (config) #
>>>>>> GWIB01 [proxy-ha-group: master] (config) # interface ib 1/9
>>>>>> phy-profile map hp-encl-isl
>>>>>> *% Cannot map profile hp-encl-isl to port:  1/9*
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-10-07 13:17 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>
>>>>>> The driver ‘qib’ is loading fine, as can be seen from the ibstat
>>>>>> output.  The ib_ipath driver is for an older card.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The problem is the link is not coming up to init.  Like Hal said, the
>>>>>> link should transition to “link up” without the SM's involvement.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I think you are on to something with the fact that it seems like your
>>>>>> switch ports are not configured to do QDR.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ira
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* German Anders [mailto:ganders at despegar.com]
>>>>>> *Sent:* Wednesday, October 07, 2015 9:05 AM
>>>>>> *To:* Weiny, Ira
>>>>>> *Cc:* Hal Rosenstock; users at lists.openfabrics.org
>>>>>>
>>>>>>
>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yes, I have that file:
>>>>>>
>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>
>>>>>> Also, I've installed libipathverbs:
>>>>>>
>>>>>> # apt-get install libipathverbs-dev
>>>>>>
>>>>>> But when I try to load the ib_ipath module, I get the following
>>>>>> error message:
>>>>>>
>>>>>> # modprobe ib_ipath
>>>>>> modprobe: ERROR: could not insert 'ib_ipath': Device or resource busy
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-10-07 12:54 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>
>>>>>> There are a few issues for routing in that diagram but the links
>>>>>> should come up.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I assume there is some backplane between the blade servers and the
>>>>>> switch in that chassis?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Have you gotten libipathverbs installed?
>>>>>>
>>>>>>
>>>>>>
>>>>>> In ipathverbs there is a serdes tuning script.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> https://github.com/01org/libipathverbs/blob/master/truescale-serdes.cmds
>>>>>>
>>>>>>
>>>>>>
>>>>>> Does your libipathverbs include that file?  If not try the latest
>>>>>> from github.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ira
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *German Anders
>>>>>> *Sent:* Wednesday, October 07, 2015 8:41 AM
>>>>>> *To:* Hal Rosenstock
>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Hal,
>>>>>>
>>>>>> Thanks for the reply. I've attached a PDF with the topology diagram. I
>>>>>> don't know if this is the best way to go or if there's another way to
>>>>>> connect and set up the IB network; tips and suggestions would be very
>>>>>> much appreciated. Also, the mezzanine cards are already installed on the
>>>>>> blade hosts:
>>>>>>
>>>>>> # lspci
>>>>>> (...)
>>>>>> 41:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-10-07 11:47 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>
>>>>>> Hi again German,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Looks like you made some progress from yesterday as the qib ports are
>>>>>> now Polling rather than Disabled.
>>>>>>
>>>>>>
>>>>>>
>>>>>> But since they are Down, do you have them cabled to a switch ? That
>>>>>> should bring the links up and the port state will be Init. That is the
>>>>>> "starting" point.
>>>>>>
>>>>>>
>>>>>>
>>>>>> You will also then need to be running SM to bring the ports up to
>>>>>> Active.
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- Hal
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 10:37 AM, German Anders <ganders at despegar.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I don't know if this is the right mailing list for this kind of topic,
>>>>>> but I'm really new to IB. I've just installed two SX6036G gateways
>>>>>> connected to each other through two ISL ports, and then configured
>>>>>> proxy-arp between both nodes (the SM is disabled on both gateways):
>>>>>>
>>>>>> GWIB01 [proxy-ha-group: master] (config) # show proxy-arp ha
>>>>>>
>>>>>> Load balancing algorithm: ib-base-ip
>>>>>> Number of Proxy-Arp interfaces: 1
>>>>>>
>>>>>> Proxy-ARP VIP
>>>>>> =============
>>>>>> Pra-group name: proxy-ha-group
>>>>>> HA VIP address: 10.xx.xx.xx/xx
>>>>>>
>>>>>> Active nodes:
>>>>>> ID                   State                IP
>>>>>> --------------------------------------------------------------
>>>>>> GWIB01               master               10.xx.xx.xx1
>>>>>> GWIB02               standby              10.xx.xx.xx2
>>>>>>
>>>>>> Then I set up two SX6018F switches (*SWIB01* and *SWIB02*), one
>>>>>> connected to GWIB01 and the other connected to GWIB02. The SM is configured
>>>>>> locally on both the SWIB01 and SWIB02 switches. So far so good. After this,
>>>>>> I connected a commodity server with a MLNX FDR IB adapter to the SWIB01 and
>>>>>> SWIB02 switches, configured the drivers, etc., and got it up and running
>>>>>> fine.
>>>>>>
>>>>>> Finally, I set up an HP enclosure with an internal IB switch (and
>>>>>> connected port 1 of the internal switch to GWIB01; the link is up but the
>>>>>> LLR status is inactive) and installed one of the blades, where I see the
>>>>>> following:
>>>>>>
>>>>>> # ibstat
>>>>>> CA 'qib0'
>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>     Number of ports: 2
>>>>>>     Firmware version:
>>>>>>     Hardware version: 2
>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>     Port 1:
>>>>>>         State: Down
>>>>>>         Physical state: Polling
>>>>>>         Rate: 40
>>>>>>         Base lid: 4660
>>>>>>         LMC: 0
>>>>>>         SM lid: 4660
>>>>>>         Capability mask: 0x0761086a
>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>         Link layer: InfiniBand
>>>>>>     Port 2:
>>>>>>         State: Down
>>>>>>         Physical state: Polling
>>>>>>         Rate: 40
>>>>>>         Base lid: 4660
>>>>>>         LMC: 0
>>>>>>         SM lid: 4660
>>>>>>         Capability mask: 0x0761086a
>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>         Link layer: InfiniBand
>>>>>>
>>>>>> So I was wondering whether the SM is not being recognized on the
>>>>>> blade system and that's why the ports are not getting past the Polling
>>>>>> state. Is that possible? Or maybe it is not possible to connect an ISL
>>>>>> between the gateway and the HP internal switch so that the SM is
>>>>>> available, or maybe the inactive LLR is causing this. Any ideas? I thought
>>>>>> about connecting the ISL of the HP IB switch to SWIB01 or SWIB02 instead
>>>>>> of the gateways, but I don't have any available ports.
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>
>>>>>> *German*
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users at lists.openfabrics.org
>>>>>> http://lists.openfabrics.org/mailman/listinfo/users
>
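The quoted thread describes the order in which a link comes up: the physical
state has to reach LinkUp on its own (cabling, backplane, speed negotiation),
and only then can an SM move the port to Active. As a rough sketch of that
triage, the Python below reads `ibstat` text output in the layout pasted
earlier in the thread and attaches a hint to each port. The function names
`port_hint` and `classify_ports` are hypothetical, and the parsing assumes the
"State:" / "Physical state:" lines look exactly like the ones quoted above; it
is an illustration, not a tool from any of the packages mentioned.

```python
import re


def port_hint(state, phys_state):
    """Map a (State, Physical state) pair to a rough next step."""
    if phys_state != "LinkUp":
        # Polling, Disabled, etc.: the link never trained.
        return ("link not trained: check cabling/backplane and speed "
                "settings; an SM cannot help yet")
    if state != "Active":
        # Physical link is up, but the port has not been activated.
        return ("link trained: a subnet manager still has to bring "
                "the port to Active")
    return "ok"


def classify_ports(ibstat_text):
    """Return {(ca_name, port_number): (state, phys_state, hint)}."""
    results = {}
    ca = port = state = None
    for raw in ibstat_text.splitlines():
        # Tolerate text pasted from the list archive: quote marks and
        # the '*' emphasis around values are stripped.
        line = raw.lstrip("> ").replace("*", "").strip()
        m = re.match(r"CA '(.+)'", line)
        if m:
            ca, port = m.group(1), None
            continue
        m = re.match(r"Port (\d+):", line)
        if m:
            port, state = int(m.group(1)), None
            continue
        if port is None:
            continue  # still in the per-CA header lines
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "State":
            state = value
        elif key == "Physical state":
            results[(ca, port)] = (state, value, port_hint(state, value))
    return results


if __name__ == "__main__":
    sample = """\
CA 'qib0'
    Number of ports: 2
    Port 1:
        State: Down
        Physical state: Polling
    Port 2:
        State: Down
        Physical state: Polling
"""
    for (ca_name, number), info in sorted(classify_ports(sample).items()):
        print(ca_name, number, *info)
```

On the blades in this thread both ports report Down/Polling, so the sketch
would point at the physical side (enclosure backplane, bay/slot placement,
speed settings) rather than at the SM, which matches the advice given above.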