[Users] IB topology config and polling state

Hal Rosenstock hal.rosenstock at gmail.com
Tue Oct 13 06:49:33 PDT 2015


I don't really know but I was wondering about whether your hardware
supports this.

HPOA_2 shows that the internal interface to the OA is absent, as is the
external Ethernet interface.

On Tue, Oct 13, 2015 at 9:31 AM, German Anders <ganders at despegar.com> wrote:

> When trying to assign the IP address to the interconnect bay, it seems that
> the change is not applied, and the IP address is not displayed as
> 'configured', so there's no way to get in from outside. Any ideas?
>
> *German*
>
> 2015-10-13 10:15 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>
>> What about the 3 critical errors ? What are they ?
>>
>> On Tue, Oct 13, 2015 at 9:13 AM, German Anders <ganders at despegar.com>
>> wrote:
>>
>>> I've tried that, and in fact I tried to set the IP address like this:
>>>
>>> SET EBIPA INTERCONNECT XX.XX.XX.XX XXX.XXX.XXX.XXX 3
>>> SET EBIPA INTERCONNECT GATEWAY XX.XX.XX.XX 3
>>> SET EBIPA INTERCONNECT DOMAIN "xxxxxx.net" 3
>>> ADD EBIPA INTERCONNECT DNS 10.xx.xx.xx 3
>>> ADD EBIPA INTERCONNECT DNS 10.xx.xx.xx 3
>>> SET EBIPA INTERCONNECT NTP PRIMARY NONE 3
>>> SET EBIPA INTERCONNECT NTP SECONDARY NONE 3
>>> ENABLE EBIPA INTERCONNECT 3
>>>
>>> SAVE EBIPA
>>>
>>> But I'm not getting any response from that IP. I've also tried several
>>> different IP addresses with no luck. If I assign the same IP to one of the
>>> blades it works fine, but not on the interconnect bay :( Any other ideas?
>>>
>>> Cheers,
>>>
>>>
>>> *German* <ganders at despegar.com>
>>>
>>> 2015-10-13 10:01 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>
>>>> Looks like there are 3 critical errors in system status. Did you look
>>>> at these ?
>>>>
>>>> I don't know if you've seen this but there is some info on configuring
>>>> the management IPs in
>>>> http://h10032.www1.hp.com/ctg/Manual/c00814176.pdf
>>>>
>>>> Have you looked at/tried the command line interface ?
>>>>
>>>> On Tue, Oct 13, 2015 at 8:28 AM, German Anders <ganders at despegar.com>
>>>> wrote:
>>>>
>>>>> Hi Hal,
>>>>>
>>>>> It does not allow me to set an IP address on the internal switch, so I can't
>>>>> access it from outside except with the tools I mentioned before. It also
>>>>> doesn't allow me to access it through a serial connection from inside the
>>>>> enclosure. I've attached some screenshots of the connectivity.
>>>>>
>>>>>
>>>>>
>>>>> *German*
>>>>>
>>>>> 2015-10-13 9:13 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>
>>>>>> Hi German,
>>>>>>
>>>>>> Are the cards in the correct bays and slots ?
>>>>>>
>>>>>> Do you have the HP Onboard Administrator tool ? What does it say
>>>>>> about internal connectivity ?
>>>>>>
>>>>>> -- Hal
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 13, 2015 at 7:44 AM, German Anders <ganders at despegar.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Ira,
>>>>>>>
>>>>>>> I have some HP documentation, but it is quite short, and it doesn't
>>>>>>> describe any configuration or implementation steps for getting the internal
>>>>>>> switch up and running. The version of the switch that came with the enclosure
>>>>>>> does not have any management module at all, so it depends on external
>>>>>>> management. During the weekend I found a way to upgrade the firmware of the
>>>>>>> HP switch with the following command
>>>>>>> (mlxburn -d lid-0x001D -fw fw-IS4.mlx), and then I ran
>>>>>>> (flint -d /dev/mst/SW_MT48438_0x2c902004b0918_lid-0x001D dc
>>>>>>> /home/ceph/HPIBSW.INI) and found the following inside that file:
>>>>>>>
>>>>>>> [PS_INFO]
>>>>>>> Name = 489184-B21
>>>>>>> Description = HP BLc 4X QDR IB Switch
>>>>>>>
>>>>>>> [ADAPTER]
>>>>>>> PSID = HP_0100000009
>>>>>>>
>>>>>>> (...)
>>>>>>>
>>>>>>> [IB_TO_HW_MAP]
>>>>>>> PORT1=14
>>>>>>> PORT2=15
>>>>>>> PORT3=16
>>>>>>> PORT4=17
>>>>>>> PORT5=18
>>>>>>> PORT6=12
>>>>>>> PORT7=11
>>>>>>> PORT8=10
>>>>>>> PORT9=9
>>>>>>> PORT10=8
>>>>>>> PORT11=7
>>>>>>> PORT12=6
>>>>>>> PORT13=5
>>>>>>> PORT14=4
>>>>>>> PORT15=3
>>>>>>> PORT16=2
>>>>>>> PORT17=20
>>>>>>> PORT18=22
>>>>>>> PORT19=24
>>>>>>> PORT20=26
>>>>>>> PORT21=28
>>>>>>> PORT22=30
>>>>>>> PORT23=35
>>>>>>> PORT24=33
>>>>>>> PORT25=21
>>>>>>> PORT26=23
>>>>>>> PORT27=25
>>>>>>> PORT28=27
>>>>>>> PORT29=29
>>>>>>> PORT30=36
>>>>>>> PORT31=34
>>>>>>> PORT32=32
>>>>>>> PORT33=1
>>>>>>> PORT34=13
>>>>>>> PORT35=19
>>>>>>> PORT36=31
>>>>>>>
>>>>>>> [unused_ports]
>>>>>>> hw_port1_not_in_use=1
>>>>>>> hw_port13_not_in_use=1
>>>>>>> hw_port19_not_in_use=1
>>>>>>> hw_port31_not_in_use=1
>>>>>>>
>>>>>>> (...)
>>>>>>>
>>>>>>> I don't know if maybe there's some issue with the port mapping,
>>>>>>> anyone had used this kind of switch?
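
The port map quoted above can at least be checked for internal consistency. The following is a minimal Python sketch and not part of the original exchange: the names `check_port_map` and `build_sample_ini` are hypothetical, and the sample data is rebuilt from the [IB_TO_HW_MAP] and [unused_ports] values shown above. It only verifies that the map is one-to-one and reports which IB ports land on hardware ports flagged as not in use. It cannot tell whether the map matches the physical wiring of the enclosure.

```python
import configparser
import re

# Hardware port for IB ports 1..36, copied from the [IB_TO_HW_MAP] dump above.
HW_PORTS = [14, 15, 16, 17, 18, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2,
            20, 22, 24, 26, 28, 30, 35, 33, 21, 23, 25, 27, 29, 36, 34, 32,
            1, 13, 19, 31]
# Hardware ports flagged in the [unused_ports] section above.
UNUSED_HW = (1, 13, 19, 31)


def build_sample_ini():
    """Rebuild the two INI sections quoted in the thread (hypothetical helper)."""
    lines = ["[IB_TO_HW_MAP]"]
    lines += [f"PORT{ib}={hw}" for ib, hw in enumerate(HW_PORTS, start=1)]
    lines += ["", "[unused_ports]"]
    lines += [f"hw_port{hw}_not_in_use=1" for hw in UNUSED_HW]
    return "\n".join(lines)


def check_port_map(ini_text):
    """Return (ib_to_hw, ib_ports_on_unused_hw, problems) for an INI dump."""
    cfg = configparser.ConfigParser()
    cfg.read_string(ini_text)
    # configparser lower-cases option names, so "PORT1" is read back as "port1".
    ib_to_hw = {int(key[len("port"):]): int(value)
                for key, value in cfg["IB_TO_HW_MAP"].items()}
    unused_hw = set()
    for key, value in cfg["unused_ports"].items():
        match = re.fullmatch(r"hw_port(\d+)_not_in_use", key)
        if match and value.strip() == "1":
            unused_hw.add(int(match.group(1)))
    problems = []
    hw_ports = sorted(ib_to_hw.values())
    if len(set(hw_ports)) != len(hw_ports):
        problems.append("two IB ports share one hardware port")
    if hw_ports != list(range(1, len(hw_ports) + 1)):
        problems.append("hardware ports are not exactly 1..N")
    on_unused = sorted(ib for ib, hw in ib_to_hw.items() if hw in unused_hw)
    return ib_to_hw, on_unused, problems


if __name__ == "__main__":
    ib_to_hw, on_unused, problems = check_port_map(build_sample_ini())
    print("IB ports mapped to unused hardware ports:", on_unused)
    print("problems:", problems or "none")
```

For the values quoted above, the map is one-to-one, and IB ports 33 to 36 are the ones that land on the four unused hardware ports. That would be consistent with the 32 ports that ibnodes reports for this switch, but it says nothing about why the blade-facing links stay down.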
>>>>>>>
>>>>>>> The summary of the problem is correct. The connectivity between the
>>>>>>> IB network (MLNX switches/gateways) and the HP IB switch is working, since I
>>>>>>> was able to upgrade the firmware of the switch and get information about it.
>>>>>>> But the connection between the mezzanine cards of the blades and the
>>>>>>> enclosure's internal IB switch is not working at all. Note that if I go to
>>>>>>> the OA administration of the enclosure I can see the 'green' port mapping
>>>>>>> between each of the blades and the interconnect switch, so I'm guessing that
>>>>>>> it should be working.
>>>>>>>
>>>>>>> Regarding the questions:
>>>>>>>
>>>>>>> 1)      What type of switch is in the HP chassis?
>>>>>>>
>>>>>>>
>>>>>>> *QLogic HP BLc 4X QDR IB Switch*
>>>>>>>
>>>>>>> *PSID = HP_0100000009*
>>>>>>>
>>>>>>> *Image type:   FS2*
>>>>>>>
>>>>>>> *FW ver:         7.4.3000*
>>>>>>>
>>>>>>> *Device ID:     48438*
>>>>>>> *GUID:            0002c902004b0918*
>>>>>>>
>>>>>>> 2)      Do you have console access or http access to that switch?
>>>>>>>
>>>>>>> *No, since it didn't have any management module mezzanine card inside the
>>>>>>> switch; it only comes with an i2c port. But I can access it through the
>>>>>>> mlxburn and flint tools from a host that's connected to the IB network
>>>>>>> (outside the enclosure).*
>>>>>>>
>>>>>>> 3)      Does that switch have an SM in it?
>>>>>>>
>>>>>>> *No*
>>>>>>>
>>>>>>> 4)      What version of the kernel are you running with the qib
>>>>>>> cards?
>>>>>>>
>>>>>>> a.       I assume you are using the qib driver in that kernel.
>>>>>>>
>>>>>>> *Ubuntu 14.04.3 LTS - kernel 3.18.20-031820-generic*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> At some point Hal spoke of “LLR being a Mellanox thing”  Was that to
>>>>>>> solve the problem of connecting the “HP switch” to the Mellanox switch?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *No. LLR is only supported between MLNX devices. The ISLs are
>>>>>>> up and working, since it's possible for me to query the switch.*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I would like it if you could verify that the
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Is being run?
>>>>>>>
>>>>>>>
>>>>>>> *When trying to run the command:*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *# /usr/sbin/truescale-serdes.cmds*
>>>>>>> */usr/sbin/truescale-serdes.cmds: 100: /usr/sbin/truescale-serdes.cmds:
>>>>>>> Syntax error: "(" unexpected (expecting "}")*
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Also what version of libipathverbs do you have?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *# rpm -qa | grep libipathverbs*
>>>>>>> *libipathverbs-1.3-1.x86_64*
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *German*
>>>>>>> 2015-10-13 2:14 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>
>>>>>>>> German,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Do you have any documentation on the HP blade system?  And the
>>>>>>>> switch which is in that system?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I have to admit I have not followed everything in this thread
>>>>>>>> regarding your configuration but it seems like you have some mellanox
>>>>>>>> switches connected into an HP chassis which has both a switch and blades
>>>>>>>> with qib (Truescale) cards.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The connection from the mellanox switch to the “HP chassis switch”
>>>>>>>> is linkup (active) but the connections to the individual qib HCAs are not
>>>>>>>> even linkup.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Is that a correct summary of the problem?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> If so here are some questions:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 1)      What type of switch is in the HP chassis?
>>>>>>>>
>>>>>>>> 2)      Do you have console access or http access to that switch?
>>>>>>>>
>>>>>>>> 3)      Does that switch have an SM in it?
>>>>>>>>
>>>>>>>> 4)      What version of the kernel are you running with the qib
>>>>>>>> cards?
>>>>>>>>
>>>>>>>> a.       I assume you are using the qib driver in that kernel.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> At some point Hal spoke of “LLR being a Mellanox thing”  Was that
>>>>>>>> to solve the problem of connecting the “HP switch” to the Mellanox switch?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I would like it if you could verify that the
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Is being run?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Also what version of libipathverbs do you have?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ira
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *Weiny, Ira
>>>>>>>> *Sent:* Wednesday, October 07, 2015 1:31 PM
>>>>>>>> *To:* Hal Rosenstock; German Anders
>>>>>>>>
>>>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Agree with Hal here.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I’m not familiar with those blades/switches.  I’ll ask around.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ira
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From:* Hal Rosenstock [mailto:hal.rosenstock at gmail.com]
>>>>>>>> *Sent:* Wednesday, October 07, 2015 1:26 PM
>>>>>>>> *To:* German Anders
>>>>>>>> *Cc:* Weiny, Ira; users at lists.openfabrics.org
>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> That's the gateway to the switch in the enclosure. It's the
>>>>>>>> internal connectivity in the blade enclosure that's (physically) broken.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 4:24 PM, German Anders <ganders at despegar.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> cabled
>>>>>>>>
>>>>>>>> the blade it's:
>>>>>>>>
>>>>>>>> vendid=0x2c9
>>>>>>>> devid=0xbd36
>>>>>>>> sysimgguid=0x2c902004b0918
>>>>>>>> switchguid=0x2c902004b0918(2c902004b0918)
>>>>>>>> Switch    32 "S-0002c902004b0918"        # "Infiniscale-IV Mellanox
>>>>>>>> Technologies" base port 0 *lid 29* lmc 0
>>>>>>>> [1]    "S-e41d2d030031e9c1"[9]        # "MF0;GWIB01:SX6036G/U1" lid
>>>>>>>> 24 4xQDR
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-10-07 17:21 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>>>
>>>>>>>> What are those HCAs cabled to or is it internal to the blade
>>>>>>>> enclosure ?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 3:24 PM, German Anders <ganders at despegar.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Yeah, is there any command that I can run in order to change the
>>>>>>>> port state on the remote switch? I mean, everything looks good, but on the
>>>>>>>> HP blades I'm still getting:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> # ibstat
>>>>>>>> CA 'qib0'
>>>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>>>     Number of ports: 2
>>>>>>>>     Firmware version:
>>>>>>>>     Hardware version: 2
>>>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>>>     Port 1:
>>>>>>>>         State: *Down*
>>>>>>>>         Physical state: *Polling*
>>>>>>>>         Rate: 40
>>>>>>>>         Base lid: 4660
>>>>>>>>         LMC: 0
>>>>>>>>         SM lid: 4660
>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>>>         Link layer: InfiniBand
>>>>>>>>     Port 2:
>>>>>>>>         State: *Down*
>>>>>>>>         Physical state: *Polling*
>>>>>>>>         Rate: 40
>>>>>>>>         Base lid: 4660
>>>>>>>>         LMC: 0
>>>>>>>>         SM lid: 4660
>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>>>         Link layer: InfiniBand
>>>>>>>>
>>>>>>>> Also, on working hosts I only see devices from the local network;
>>>>>>>> I don't see any of the blades' HCA connections.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-10-07 16:21 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>>>
>>>>>>>> The screen shot looks good :-) SM brought the link up to active.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Note that the ibportstate command you gave was for switch port 0 of
>>>>>>>> the Mellanox IS-4 switch in the QLogic HP BLc 4X QDR IB Switch.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 3:06 PM, German Anders <ganders at despegar.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Yes, find attached a screenshot of the port information (# 9), the
>>>>>>>> one that forms the ISL to the QLogic HP BLc 4X QDR IB Switch. Also, from one
>>>>>>>> of the hosts that are connected to one of the SX6018F switches I can see the
>>>>>>>> 'remote' HP IB switch:
>>>>>>>>
>>>>>>>> # *ibnodes*
>>>>>>>>
>>>>>>>> (...)
>>>>>>>> Switch    : 0x0002c902004b0918 ports 32 "Infiniscale-IV Mellanox
>>>>>>>> Technologies" base port 0 *lid 29* lmc 0
>>>>>>>> Switch    : 0xe41d2d030031e9c1 ports 37 "MF0;GWIB01:SX6036G/U1"
>>>>>>>> enhanced port 0 lid 24 lmc 0
>>>>>>>> (...)
>>>>>>>>
>>>>>>>> # *ibportstate -L 29 query*
>>>>>>>> Switch PortInfo:
>>>>>>>> # Port info: Lid 29 port 0
>>>>>>>> LinkState:.......................Active
>>>>>>>> PhysLinkState:...................LinkUp
>>>>>>>> Lid:.............................29
>>>>>>>> SMLid:...........................2
>>>>>>>> LMC:.............................0
>>>>>>>> LinkWidthSupported:..............1X or 4X
>>>>>>>> LinkWidthEnabled:................1X or 4X
>>>>>>>> LinkWidthActive:.................4X
>>>>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>>>>> Mkey:............................<not displayed>
>>>>>>>> MkeyLeasePeriod:.................0
>>>>>>>> ProtectBits:.....................0
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-10-07 16:00 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>>>
>>>>>>>> One more thing hopefully before playing with the low level phy
>>>>>>>> settings:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Are you using known good cables ? Do you have FDR cables on the FDR
>>>>>>>> <-> FDR links ? Cable lengths can matter as well.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 12:57 PM, Hal Rosenstock <
>>>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>>>
>>>>>>>> Were the ports mapped to the phy profile shutdown when you changed
>>>>>>>> this ?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> LLR is a proprietary Mellanox mechanism.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> You might want 2 different profiles: one for the interfaces
>>>>>>>> connected to other gateway interfaces (which are FDR and FDR-10 capable)
>>>>>>>> and the other for the interfaces connecting to QDR (the older equipment in
>>>>>>>> your network). By configuring the Switch-X interfaces to the appropriate
>>>>>>>> possible speeds and disabling the proprietary mechanisms there, the link
>>>>>>>> should not only come up, but it should also do so faster than if FDR/FDR10
>>>>>>>> are enabled.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I suspect that, due to the Switch-X configuration, the links to
>>>>>>>> the switch(es) in the HP enclosures do not negotiate properly (as shown by
>>>>>>>> Down rather than LinkUp).
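
The reasoning about speed profiles can be illustrated with a small Python sketch. It is a deliberately simplified model and not how the hardware actually negotiates: it assumes a link can only come up if both ends share at least one enabled speed, and it treats `enable-request` as enabled. The names `best_common_speed`, `HIGH_SPEED_BER`, `HP_ENCL_ISL` and `QDR_SWITCH` are hypothetical; the speed sets are taken from the phy-profile and ibportstate output quoted elsewhere in this thread.

```python
# Speeds ordered slowest to fastest, named as in the phy-profile listing.
SPEED_ORDER = ["SDR", "DDR", "QDR", "FDR10", "FDR"]

# Profiles as shown by "show phy-profile"; "enable-request" is assumed
# to mean the speed is enabled.
HIGH_SPEED_BER = {"FDR10", "FDR"}
HP_ENCL_ISL = {"QDR", "FDR10", "FDR"}
# The HP enclosure switch reports 2.5, 5.0 and 10.0 Gbps (SDR, DDR, QDR).
QDR_SWITCH = {"SDR", "DDR", "QDR"}


def best_common_speed(local_enabled, peer_supported):
    """Return the fastest speed enabled on both ends, or None if there is none."""
    common = set(local_enabled) & set(peer_supported)
    for speed in reversed(SPEED_ORDER):
        if speed in common:
            return speed
    return None


if __name__ == "__main__":
    profiles = (("high-speed-ber", HIGH_SPEED_BER), ("hp-encl-isl", HP_ENCL_ISL))
    for name, profile in profiles:
        print(name, "->", best_common_speed(profile, QDR_SWITCH))
```

Under this model, high-speed-ber has no speed in common with a QDR-only peer, while hp-encl-isl leaves QDR available, which is why a separate profile for the QDR-facing ports is suggested.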
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Once you get all your links to INIT, negotiation has occurred and
>>>>>>>> then it's time for SM to bring links to active.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Since you have down links, the SM can't do anything about those.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 12:44 PM, German Anders <
>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>
>>>>>>>> Has anyone had any experience with the HP BLc 4X QDR IB Switch? I know
>>>>>>>> that this kind of switch does not come with an embedded SM, but I don't know
>>>>>>>> how to access any management at all on this particular switch, for
>>>>>>>> example to set the speed or anything like that. Is it possible to access it
>>>>>>>> through the chassis?
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-10-07 13:19 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>>>>>
>>>>>>>> I think so, but when trying to configure the phy-profile on the
>>>>>>>> interface in order to negotiate at QDR, it failed to map the profile:
>>>>>>>>
>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>>>> high-speed-ber
>>>>>>>>
>>>>>>>>   Profile: high-speed-ber
>>>>>>>>   --------
>>>>>>>>   llr support ib-speed
>>>>>>>>   SDR: disable
>>>>>>>>   DDR: disable
>>>>>>>>   QDR: disable
>>>>>>>>   FDR10: enable-request
>>>>>>>>   FDR: enable-request
>>>>>>>>
>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>>>> hp-encl-isl
>>>>>>>>
>>>>>>>>   Profile: hp-encl-isl
>>>>>>>>   --------
>>>>>>>>   llr support ib-speed
>>>>>>>>   SDR: disable
>>>>>>>>   DDR: disable
>>>>>>>>   QDR: enable
>>>>>>>>   FDR10: enable-request
>>>>>>>>   FDR: enable-request
>>>>>>>>
>>>>>>>> GWIB01 [proxy-ha-group: master] (config) #
>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # interface ib 1/9
>>>>>>>> phy-profile map hp-encl-isl
>>>>>>>> *% Cannot map profile hp-encl-isl to port:  1/9*
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-10-07 13:17 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>
>>>>>>>> The driver ‘qib’ is loading fine, as can be seen from the ibstat
>>>>>>>> output.  ib_ipath is for an older card.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The problem is the link is not coming up to init.  Like Hal said,
>>>>>>>> the link should transition to “link up” without the SM's involvement.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I think you are on to something with the fact that it seems like
>>>>>>>> your switch ports are not configured to do QDR.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ira
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From:* German Anders [mailto:ganders at despegar.com]
>>>>>>>> *Sent:* Wednesday, October 07, 2015 9:05 AM
>>>>>>>> *To:* Weiny, Ira
>>>>>>>> *Cc:* Hal Rosenstock; users at lists.openfabrics.org
>>>>>>>>
>>>>>>>>
>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, I have that file:
>>>>>>>>
>>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>>
>>>>>>>> I've also installed libipathverbs:
>>>>>>>>
>>>>>>>> # apt-get install libipathverbs-dev
>>>>>>>>
>>>>>>>> But when I try to load the ib_ipath module I get the following
>>>>>>>> error message:
>>>>>>>>
>>>>>>>> # modprobe ib_ipath
>>>>>>>> modprobe: ERROR: could not insert 'ib_ipath': Device or resource
>>>>>>>> busy
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-10-07 12:54 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>
>>>>>>>> There are a few issues for routing in that diagram but the links
>>>>>>>> should come up.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I assume there is some backplane between the blade servers and the
>>>>>>>> switch in that chassis?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Have you gotten libipathverbs installed?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> In ipathverbs there is a serdes tuning script.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> https://github.com/01org/libipathverbs/blob/master/truescale-serdes.cmds
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Does your libipathverbs include that file?  If not try the latest
>>>>>>>> from github.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ira
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *German Anders
>>>>>>>> *Sent:* Wednesday, October 07, 2015 8:41 AM
>>>>>>>> *To:* Hal Rosenstock
>>>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Hal,
>>>>>>>>
>>>>>>>> Thanks for the reply. I've attached a PDF with the topology diagram.
>>>>>>>> I don't know if this is the best way to go or if there's another way to
>>>>>>>> connect and set up the IB network; tips and suggestions will be very much
>>>>>>>> appreciated. Also, the mezzanine cards are already installed on the blade
>>>>>>>> hosts:
>>>>>>>>
>>>>>>>> # lspci
>>>>>>>> (...)
>>>>>>>> 41:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-10-07 11:47 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>>>
>>>>>>>> Hi again German,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Looks like you made some progress from yesterday as the qib ports
>>>>>>>> are now Polling rather than Disabled.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> But since they are Down, do you have them cabled to a switch ? That
>>>>>>>> should bring the links up and the port state will be Init. That is the
>>>>>>>> "starting" point.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> You will also then need to be running SM to bring the ports up to
>>>>>>>> Active.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Hal
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 7, 2015 at 10:37 AM, German Anders <
>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I don't know if this is the right mailing list for this kind of topic, but
>>>>>>>> I'm really new to IB and I've just installed two SX6036G gateways connected
>>>>>>>> to each other through two ISL ports. Then I configured proxy-arp
>>>>>>>> between both nodes (the SM is disabled on both gateways):
>>>>>>>>
>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show proxy-arp ha
>>>>>>>>
>>>>>>>> Load balancing algorithm: ib-base-ip
>>>>>>>> Number of Proxy-Arp interfaces: 1
>>>>>>>>
>>>>>>>> Proxy-ARP VIP
>>>>>>>> =============
>>>>>>>> Pra-group name: proxy-ha-group
>>>>>>>> HA VIP address: 10.xx.xx.xx/xx
>>>>>>>>
>>>>>>>> Active nodes:
>>>>>>>> ID                   State                IP
>>>>>>>> --------------------------------------------------------------
>>>>>>>> GWIB01               master               10.xx.xx.xx1
>>>>>>>> GWIB02               standby              10.xx.xx.xx2
>>>>>>>>
>>>>>>>> Then I set up two SX6018F switches (*SWIB01* and *SWIB02*), one
>>>>>>>> connected to GWIB01 and the other connected to GWIB02. The SM is configured
>>>>>>>> locally on both SWIB01 & SWIB02 switches. So far so good. After this config
>>>>>>>> I connected a commodity server with a MLNX IB FDR adapter to the SWIB01 &
>>>>>>>> SWIB02 switches, configured the drivers, etc., and got it up & running fine.
>>>>>>>>
>>>>>>>> Finally I set up an HP enclosure with an internal IB switch (then
>>>>>>>> connected port 1 of the internal switch to GWIB01; the link is up but LLR
>>>>>>>> status is inactive), installed one of the blades, and I see the following:
>>>>>>>>
>>>>>>>> # ibstat
>>>>>>>> CA 'qib0'
>>>>>>>>     CA type: InfiniPath_QMH7342
>>>>>>>>     Number of ports: 2
>>>>>>>>     Firmware version:
>>>>>>>>     Hardware version: 2
>>>>>>>>     Node GUID: 0x0011750000791fec
>>>>>>>>     System image GUID: 0x0011750000791fec
>>>>>>>>     Port 1:
>>>>>>>>         State: Down
>>>>>>>>         Physical state: Polling
>>>>>>>>         Rate: 40
>>>>>>>>         Base lid: 4660
>>>>>>>>         LMC: 0
>>>>>>>>         SM lid: 4660
>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>         Port GUID: 0x0011750000791fec
>>>>>>>>         Link layer: InfiniBand
>>>>>>>>     Port 2:
>>>>>>>>         State: Down
>>>>>>>>         Physical state: Polling
>>>>>>>>         Rate: 40
>>>>>>>>         Base lid: 4660
>>>>>>>>         LMC: 0
>>>>>>>>         SM lid: 4660
>>>>>>>>         Capability mask: 0x0761086a
>>>>>>>>         Port GUID: 0x0011750000791fed
>>>>>>>>         Link layer: InfiniBand
>>>>>>>>
>>>>>>>> So I was wondering if maybe the SM is not being recognized on the
>>>>>>>> blade system and that's why the port is not getting past the Polling state.
>>>>>>>> Is that possible? Or maybe it is not possible to connect an ISL between the
>>>>>>>> GW and the HP internal switch so that the SM is available, or maybe the
>>>>>>>> inactive LLR is causing this. Any ideas? I thought about connecting the ISL
>>>>>>>> of the HP IB switch to SWIB01 or SWIB02 instead of the GWs, but I don't
>>>>>>>> have any available ports.
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>>
>>>>>>>> *German*
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Users mailing list
>>>>>>>> Users at lists.openfabrics.org
>>>>>>>> http://lists.openfabrics.org/mailman/listinfo/users
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
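
The state progression discussed in this thread (the physical link has to come up first, without the SM, and only then can the SM bring the port to Active) can be turned into a quick check on ibstat output. The following is a minimal Python sketch under stated assumptions: the names `parse_ibstat` and `next_step` are hypothetical, the sample is a shortened copy of the ibstat output quoted in the thread, and the parser only handles the fields shown there.

```python
import re

# Shortened copy of the ibstat output quoted in the thread.
SAMPLE = """\
CA 'qib0'
    CA type: InfiniPath_QMH7342
    Number of ports: 2
    Port 1:
        State: Down
        Physical state: Polling
        Port GUID: 0x0011750000791fec
    Port 2:
        State: Down
        Physical state: Polling
        Port GUID: 0x0011750000791fed
"""


def parse_ibstat(text):
    """Return a list of (ca, port, state, physical_state) tuples."""
    ca = port = state = None
    ports = []
    for raw in text.splitlines():
        line = raw.strip()
        ca_match = re.match(r"CA '([^']+)'", line)
        port_match = re.match(r"Port (\d+):", line)
        if ca_match:
            ca = ca_match.group(1)
        elif port_match:
            port = int(port_match.group(1))
        elif line.startswith("State:"):
            state = line.split(":", 1)[1].strip()
        elif line.startswith("Physical state:"):
            # The physical state line closes the pair for the current port.
            ports.append((ca, port, state, line.split(":", 1)[1].strip()))
    return ports


def next_step(state, physical_state):
    """Map a port's states to the next thing to look at, as argued in the thread."""
    if physical_state != "LinkUp":
        return "physical link is not up: check cabling/backplane and speed settings first"
    if state != "Active":
        return "link is up: a running SM now has to bring the port to Active"
    return "port is Active"


if __name__ == "__main__":
    for ca, port, state, phys in parse_ibstat(SAMPLE):
        print(f"{ca} port {port}: {state}/{phys} -> {next_step(state, phys)}")
```

For the blades in this thread both ports are Down/Polling, so the sketch points at the physical link rather than at the SM, which matches the advice given above.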