[Users] IB topology config and polling state
German Anders
ganders at despegar.com
Wed Oct 7 13:38:02 PDT 2015
I think so; there isn't much information out there regarding the
configuration of the HP BLc IB switches.
I ran the following commands from one of the blades:
# *ibhosts*
Ca : 0x0011750000791fec ports 2 "Intel Infiniband HCA ubuntu"
# *ibswitches *
# *ibnetdiscover*
#
# Topology file: generated on Wed Oct 7 16:29:17 2015
#
# Initiated from node 0011750000791fec port 0011750000791fec
vendid=0x1175
devid=0x7322
sysimgguid=0x11750000791fec
caguid=0x11750000791fec
Ca 2 "H-0011750000791fec" # "Intel Infiniband HCA ubuntu"
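To double-check what a discovery run actually found, the node records can be counted. This is only a rough sketch: count_ib_nodes is a made-up helper name, and it assumes the output format shown above, where every node record starts with "Switch" or "Ca".

```shell
# Hypothetical helper: count node records in ibnetdiscover-style output
# read from stdin. Assumes node records begin with "Switch" or "Ca",
# as in the topology dumps quoted in this thread.
# Usage: ibnetdiscover | count_ib_nodes
count_ib_nodes() {
    awk '
        $1 == "Switch" { sw++ }   # one record per switch
        $1 == "Ca"     { ca++ }   # one record per channel adapter
        END { printf "switches=%d cas=%d\n", sw + 0, ca + 0 }
    '
}
```

Fed the blade output above, this should report zero switches and one CA, meaning the HCA only sees itself.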
I don't know whether I should somehow configure the internal GUID ports
to be mapped to the SM running on SWIB01, or whether I should tell
opensm on the blade to start on the HP IB SW GUID... really messed up here
:(
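As noted further down the thread, the SM can only act once the physical link is up, so a port stuck in Polling is a link-level problem rather than an SM one. A small sketch for spotting such ports from ibstat output follows; ports_not_linkup is a made-up helper name, and it assumes the ibstat layout shown in the quoted messages below.

```shell
# Hypothetical helper: list ports whose physical state is not LinkUp,
# reading ibstat-style output from stdin. Assumes the layout quoted in
# this thread ("CA '<name>'", "Port <n>:", "Physical state: <state>").
# Usage: ibstat | ports_not_linkup
ports_not_linkup() {
    awk '
        # "CA <name>" header line (the "CA type:" line has three fields)
        $1 == "CA" && NF == 2 { ca = $2 }
        # "Port <n>:" header line ("Port GUID:" does not match the pattern)
        $1 == "Port" && $2 ~ /^[0-9]+:$/ { port = $2; sub(/:$/, "", port) }
        # Report any physical state other than LinkUp
        $1 == "Physical" && $2 == "state:" && $3 != "LinkUp" {
            print ca " port " port ": " $3
        }
    '
}
```

On the blade this would list both qib0 ports as Polling; until those report LinkUp, starting opensm on either port GUID is not expected to change anything.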
*German* <ganders at despegar.com>
2015-10-07 17:27 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
> It's a physical connectivity problem in the enclosure. Are you sure things
> are correct there ?
>
> On Wed, Oct 7, 2015 at 4:26 PM, German Anders <ganders at despegar.com>
> wrote:
>
>> I've tried everything, including starting opensm on one of the blades:
>>
>> root at ubuntu:/etc/infiniband# /etc/init.d/opensm start
>>
>> *Starting opensm on 0x0011750000791fec: Starting opensm on
>> 0x0011750000791fed:*
>> root at ubuntu:/etc/infiniband#
>> root at ubuntu:/etc/infiniband# ps -ef | grep opensm
>> root 4427 1 0 09:44 ? 00:00:01 /usr/sbin/opensm -g
>> 0x0011750000791fec -f /var/log/opensm.0x0011750000791fec.log
>> root 4429 1 0 09:44 ? 00:00:01 /usr/sbin/opensm -g
>> 0x0011750000791fed -f /var/log/opensm.0x0011750000791fed.log
>> root 7409 1853 0 16:21 pts/0 00:00:00 grep --color=auto opensm
>> root at ubuntu:/etc/infiniband#
>> root at ubuntu:/etc/infiniband# tail -f
>> /var/log/opensm.0x0011750000791fec.log
>> Oct 07 09:44:10 642680 [D4F60740] 0x03 -> OpenSM 3.3.15
>> Oct 07 09:44:10 642750 [D4F60740] 0x80 -> OpenSM 3.3.15
>> Oct 07 09:44:10 646683 [D4F60740] 0x02 -> osm_vendor_init: 1000 pending
>> umads specified
>> Oct 07 09:44:10 646931 [D4F60740] 0x80 -> Entering DISCOVERING state
>> Oct 07 09:44:10 647189 [D4F60740] 0x02 -> osm_vendor_bind: Mgmt class
>> 0x81 binding to port GUID 0x11750000791fec
>> Oct 07 09:44:10 649663 [D4F60740] 0x02 -> osm_vendor_bind: Mgmt class
>> 0x03 binding to port GUID 0x11750000791fec
>> Oct 07 09:44:10 649723 [D4F60740] 0x02 -> osm_vendor_bind: Mgmt class
>> 0x04 binding to port GUID 0x11750000791fec
>> Oct 07 09:44:10 649781 [D4F60740] 0x02 -> osm_vendor_bind: Mgmt class
>> 0x21 binding to port GUID 0x11750000791fec
>> Oct 07 09:44:10 649855 [D4F60740] 0x02 -> osm_opensm_bind: Setting IS_SM
>> on port 0x0011750000791fec
>> Oct 07 09:44:10 651518 [B2CDA700] 0x80 -> SM port is down
>>
>>
>> root at ubuntu:/etc/infiniband# ibstat
>>
>> CA 'qib0'
>> CA type: InfiniPath_QMH7342
>> Number of ports: 2
>> Firmware version:
>> Hardware version: 2
>> Node GUID: 0x0011750000791fec
>> System image GUID: 0x0011750000791fec
>> Port 1:
>> State: Down
>> Physical state: Polling
>> Rate: 40
>> Base lid: 4660
>> LMC: 0
>> SM lid: 4660
>> Capability mask: 0x0761086a
>> Port GUID: 0x0011750000791fec
>> Link layer: InfiniBand
>> Port 2:
>> State: Down
>> Physical state: Polling
>> Rate: 40
>> Base lid: 4660
>> LMC: 0
>> SM lid: 4660
>> Capability mask: 0x0761086a
>> Port GUID: 0x0011750000791fed
>> Link layer: InfiniBand
>>
>>
>>
>>
>>
>> *German* <ganders at despegar.com>
>>
>> 2015-10-07 17:24 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>
>>> Yes, somehow the enclosure is not internally connected properly
>>> otherwise you'd see more topology off the other switch ports. I think it
>>> has more than 1 switch.
>>>
>>> On Wed, Oct 7, 2015 at 3:56 PM, German Anders <ganders at despegar.com>
>>> wrote:
>>>
>>>> Here is some of the ibnetdiscover output from one of the working
>>>> hosts:
>>>>
>>>> # ibnetdiscover
>>>> #
>>>> # Topology file: generated on Wed Oct 7 15:51:55 2015
>>>> #
>>>> # Initiated from node e41d2d0300163650 port e41d2d0300163651
>>>>
>>>> vendid=0x2c9
>>>> devid=0xbd36
>>>> sysimgguid=0x2c902004b0918
>>>> switchguid=0x2c902004b0918(2c902004b0918)
>>>> Switch 32 "S-0002c902004b0918" # "Infiniscale-IV Mellanox
>>>> Technologies" base port 0 *lid 29* lmc 0
>>>> [1] "S-e41d2d030031e9c1"[9] # "MF0;GWIB01:SX6036G/U1" lid 24
>>>> 4xQDR
>>>>
>>>> vendid=0x2c9
>>>> devid=0xc738
>>>> sysimgguid=0xe41d2d030031e9c0
>>>> switchguid=0xe41d2d030031e9c1(e41d2d030031e9c1)
>>>> Switch 37 "S-e41d2d030031e9c1" # "MF0;GWIB01:SX6036G/U1"
>>>> enhanced port 0 lid 24 lmc 0
>>>> [9] "S-0002c902004b0918"[1] # "Infiniscale-IV Mellanox
>>>> Technologies" lid 29 4xQDR
>>>> [33] "S-e41d2d030031eb41"[33] # "MF0;GWIB02:SX6036G/U1" lid
>>>> 23 4xFDR10
>>>> [34] "S-f45214030073f500"[16] # "MF0;SWIB02:SX6018/U1" lid 1
>>>> 4xFDR10
>>>> [35] "S-e41d2d030031eb41"[35] # "MF0;GWIB02:SX6036G/U1" lid
>>>> 23 4xFDR10
>>>> [36] "S-e41d2d0300097630"[18] # "MF0;SWIB01:SX6018/U1" lid 2
>>>> 4xFDR10
>>>> [37] "H-e41d2d030031e9c2"[1](e41d2d030031e9c2) #
>>>> "MF0;GWIB01:SX60XX/GW" lid 25 4xFDR
>>>>
>>>> (...)
>>>>
>>>> Clearly the connectivity problem is from the HP IB SW to the blades...
>>>>
>>>>
>>>> *German*
>>>>
>>>> 2015-10-07 16:26 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>
>>>>> for port #1:
>>>>>
>>>>> # ibportstate -L 29 1 query
>>>>> Switch PortInfo:
>>>>> # Port info: Lid 29 port 1
>>>>> LinkState:.......................Active
>>>>> PhysLinkState:...................LinkUp
>>>>> Lid:.............................75
>>>>> SMLid:...........................2328
>>>>> LMC:.............................0
>>>>> LinkWidthSupported:..............1X or 4X
>>>>> LinkWidthEnabled:................1X or 4X
>>>>> LinkWidthActive:.................4X
>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>> Peer PortInfo:
>>>>> # Port info: Lid 29 DR path slid 4; dlid 65535; 0,1 port 9
>>>>> LinkState:.......................Active
>>>>> PhysLinkState:...................LinkUp
>>>>> Lid:.............................0
>>>>> SMLid:...........................0
>>>>> LMC:.............................0
>>>>> LinkWidthSupported:..............1X or 4X
>>>>> LinkWidthEnabled:................1X or 4X
>>>>> LinkWidthActive:.................4X
>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>> LinkSpeedExtSupported:...........14.0625 Gbps
>>>>> LinkSpeedExtEnabled:.............14.0625 Gbps
>>>>> LinkSpeedExtActive:..............No Extended Speed
>>>>> # MLNX ext Port info: Lid 29 DR path slid 4; dlid 65535; 0,1 port 9
>>>>> StateChangeEnable:...............0x00
>>>>> LinkSpeedSupported:..............0x01
>>>>> LinkSpeedEnabled:................0x01
>>>>> LinkSpeedActive:.................0x00
>>>>>
>>>>>
>>>>> *German*
>>>>>
>>>>> 2015-10-07 16:24 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>>
>>>>>> Yeah, is there any command that I can run to change the port
>>>>>> state on the remote switch? Everything looks good, but on the HP
>>>>>> blades I'm still getting:
>>>>>>
>>>>>>
>>>>>> # ibstat
>>>>>> CA 'qib0'
>>>>>> CA type: InfiniPath_QMH7342
>>>>>> Number of ports: 2
>>>>>> Firmware version:
>>>>>> Hardware version: 2
>>>>>> Node GUID: 0x0011750000791fec
>>>>>> System image GUID: 0x0011750000791fec
>>>>>> Port 1:
>>>>>> State: *Down*
>>>>>> Physical state: *Polling*
>>>>>> Rate: 40
>>>>>> Base lid: 4660
>>>>>> LMC: 0
>>>>>> SM lid: 4660
>>>>>> Capability mask: 0x0761086a
>>>>>> Port GUID: 0x0011750000791fec
>>>>>> Link layer: InfiniBand
>>>>>> Port 2:
>>>>>> State: *Down*
>>>>>> Physical state: *Polling*
>>>>>> Rate: 40
>>>>>> Base lid: 4660
>>>>>> LMC: 0
>>>>>> SM lid: 4660
>>>>>> Capability mask: 0x0761086a
>>>>>> Port GUID: 0x0011750000791fed
>>>>>> Link layer: InfiniBand
>>>>>>
>>>>>>
>>>>>> Also, on the working hosts I only see devices from the local network;
>>>>>> I don't see any of the blades' HCA connections.
>>>>>>
>>>>>>
>>>>>>
>>>>>> *German* <ganders at despegar.com>
>>>>>>
>>>>>> 2015-10-07 16:21 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com>:
>>>>>>
>>>>>>> The screen shot looks good :-) SM brought the link up to active.
>>>>>>>
>>>>>>> Note that the ibportstate command you gave was for switch port 0 of
>>>>>>> the Mellanox IS-4 switch in the QLogic HP BLc 4X QDR IB Switch.
>>>>>>>
>>>>>>> On Wed, Oct 7, 2015 at 3:06 PM, German Anders <ganders at despegar.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yes, find attached a screenshot of the port information (# 9), the
>>>>>>>> one that makes the ISL to the QLogic HP BLc 4X QDR IB Switch. Also, from one
>>>>>>>> of the hosts that are connected to one of the SX6018F I can see the
>>>>>>>> 'remote' HP IB SW:
>>>>>>>>
>>>>>>>> # *ibnodes*
>>>>>>>>
>>>>>>>> (...)
>>>>>>>> Switch : 0x0002c902004b0918 ports 32 "Infiniscale-IV Mellanox
>>>>>>>> Technologies" base port 0 *lid 29 *lmc 0
>>>>>>>> Switch : 0xe41d2d030031e9c1 ports 37 "MF0;GWIB01:SX6036G/U1"
>>>>>>>> enhanced port 0 lid 24 lmc 0
>>>>>>>> (...)
>>>>>>>>
>>>>>>>> # *ibportstate -L 29 query*
>>>>>>>> Switch PortInfo:
>>>>>>>> # Port info: Lid 29 port 0
>>>>>>>> LinkState:.......................Active
>>>>>>>> PhysLinkState:...................LinkUp
>>>>>>>> Lid:.............................29
>>>>>>>> SMLid:...........................2
>>>>>>>> LMC:.............................0
>>>>>>>> LinkWidthSupported:..............1X or 4X
>>>>>>>> LinkWidthEnabled:................1X or 4X
>>>>>>>> LinkWidthActive:.................4X
>>>>>>>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>>>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>>>>>> LinkSpeedActive:.................10.0 Gbps
>>>>>>>> Mkey:............................<not displayed>
>>>>>>>> MkeyLeasePeriod:.................0
>>>>>>>> ProtectBits:.....................0
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *German* <ganders at despegar.com>
>>>>>>>>
>>>>>>>> 2015-10-07 16:00 GMT-03:00 Hal Rosenstock <hal.rosenstock at gmail.com
>>>>>>>> >:
>>>>>>>>
>>>>>>>>> One more thing hopefully before playing with the low level phy
>>>>>>>>> settings:
>>>>>>>>>
>>>>>>>>> Are you using known good cables ? Do you have FDR cables on the
>>>>>>>>> FDR <-> FDR links ? Cable lengths can matter as well.
>>>>>>>>>
>>>>>>>>> On Wed, Oct 7, 2015 at 12:57 PM, Hal Rosenstock <
>>>>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Were the ports mapped to the phy profile shutdown when you
>>>>>>>>>> changed this ?
>>>>>>>>>>
>>>>>>>>>> LLR is a proprietary Mellanox mechanism.
>>>>>>>>>>
>>>>>>>>>> You might want 2 different profiles: one for the interfaces
>>>>>>>>>> connected to other gateway interfaces (which are FDR and FDR-10 capable)
>>>>>>>>>> and the other for the interfaces connecting to QDR (the older equipment in
>>>>>>>>>> your network). By configuring the Switch-X interfaces to the appropriate
>>>>>>>>>> possible speeds and disabling the proprietary mechanisms there, the link
>>>>>>>>>> should not only come up but also do so faster than if FDR/FDR10
>>>>>>>>>> are enabled.
>>>>>>>>>>
>>>>>>>>>> I suspect that, due to the Switch-X configuration, the links
>>>>>>>>>> to the switch(es) in the HP enclosures do not negotiate properly (as shown
>>>>>>>>>> by Down rather than LinkUp).
>>>>>>>>>>
>>>>>>>>>> Once you get all your links to INIT, negotiation has occurred and
>>>>>>>>>> then it's time for SM to bring links to active.
>>>>>>>>>>
>>>>>>>>>> Since you have down links, the SM can't do anything about those.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Oct 7, 2015 at 12:44 PM, German Anders <
>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Has anyone had any experience with the HP BLc 4X QDR IB Switch? I know
>>>>>>>>>>> that this kind of SW does not come with an embedded SM, but I don't know
>>>>>>>>>>> how to access any mgmt at all on this particular switch, for
>>>>>>>>>>> example to set the speed or anything like that. Is it possible to access it through
>>>>>>>>>>> the chassis?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *German* <ganders at despegar.com>
>>>>>>>>>>>
>>>>>>>>>>> 2015-10-07 13:19 GMT-03:00 German Anders <ganders at despegar.com>:
>>>>>>>>>>>
>>>>>>>>>>>> I think so, but when I tried to configure the phy-profile on
>>>>>>>>>>>> the interface in order to negotiate QDR, it failed to map the profile:
>>>>>>>>>>>>
>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>>>>>>>> high-speed-ber
>>>>>>>>>>>>
>>>>>>>>>>>> Profile: high-speed-ber
>>>>>>>>>>>> --------
>>>>>>>>>>>> llr support ib-speed
>>>>>>>>>>>> SDR: disable
>>>>>>>>>>>> DDR: disable
>>>>>>>>>>>> QDR: disable
>>>>>>>>>>>> FDR10: enable-request
>>>>>>>>>>>> FDR: enable-request
>>>>>>>>>>>>
>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show phy-profile
>>>>>>>>>>>> hp-encl-isl
>>>>>>>>>>>>
>>>>>>>>>>>> Profile: hp-encl-isl
>>>>>>>>>>>> --------
>>>>>>>>>>>> llr support ib-speed
>>>>>>>>>>>> SDR: disable
>>>>>>>>>>>> DDR: disable
>>>>>>>>>>>> QDR: enable
>>>>>>>>>>>> FDR10: enable-request
>>>>>>>>>>>> FDR: enable-request
>>>>>>>>>>>>
>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) #
>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # interface ib 1/9
>>>>>>>>>>>> phy-profile map hp-encl-isl
>>>>>>>>>>>> *% Cannot map profile hp-encl-isl to port: 1/9*
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *German* <ganders at despegar.com>
>>>>>>>>>>>>
>>>>>>>>>>>> 2015-10-07 13:17 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> The driver ‘qib’ is loading fine, as can be seen from the
>>>>>>>>>>>>> ibstat output. The ib_ipath driver is for an older card.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem is that the link is not coming up to Init. Like Hal
>>>>>>>>>>>>> said, the link should transition to “link up” without the SM's involvement.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think you are on to something with the fact that it seems
>>>>>>>>>>>>> like your switch ports are not configured to do QDR.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ira
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *From:* German Anders [mailto:ganders at despegar.com]
>>>>>>>>>>>>> *Sent:* Wednesday, October 07, 2015 9:05 AM
>>>>>>>>>>>>> *To:* Weiny, Ira
>>>>>>>>>>>>> *Cc:* Hal Rosenstock; users at lists.openfabrics.org
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, I have that file:
>>>>>>>>>>>>>
>>>>>>>>>>>>> /usr/sbin/truescale-serdes.cmds
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I've installed libipathverbs:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # apt-get install libipathverbs-dev
>>>>>>>>>>>>>
>>>>>>>>>>>>> But when I try to load the ib_ipath module I get the
>>>>>>>>>>>>> following error msg:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # modprobe ib_ipath
>>>>>>>>>>>>> modprobe: ERROR: could not insert 'ib_ipath': Device or
>>>>>>>>>>>>> resource busy
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *German*
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2015-10-07 12:54 GMT-03:00 Weiny, Ira <ira.weiny at intel.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>> There are a few issues for routing in that diagram but the
>>>>>>>>>>>>> links should come up.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I assume there is some backplane between the blade servers and
>>>>>>>>>>>>> the switch in that chassis?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Have you gotten libipathverbs installed?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> In ipathverbs there is a serdes tuning script.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/01org/libipathverbs/blob/master/truescale-serdes.cmds
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does your libipathverbs include that file? If not try the
>>>>>>>>>>>>> latest from github.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ira
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *From:* users-bounces at lists.openfabrics.org [mailto:
>>>>>>>>>>>>> users-bounces at lists.openfabrics.org] *On Behalf Of *German
>>>>>>>>>>>>> Anders
>>>>>>>>>>>>> *Sent:* Wednesday, October 07, 2015 8:41 AM
>>>>>>>>>>>>> *To:* Hal Rosenstock
>>>>>>>>>>>>> *Cc:* users at lists.openfabrics.org
>>>>>>>>>>>>> *Subject:* Re: [Users] IB topology config and polling state
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Hal,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the reply. I've attached a PDF with the topology
>>>>>>>>>>>>> diagram. I don't know if this is the best way to go or if there's another
>>>>>>>>>>>>> way to connect and set up the IB network; tips and suggestions would be very
>>>>>>>>>>>>> much appreciated. Also, the mezzanine cards are already installed on the blade
>>>>>>>>>>>>> hosts:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # lspci
>>>>>>>>>>>>> (...)
>>>>>>>>>>>>> 41:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA
>>>>>>>>>>>>> (rev 02)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *German*
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2015-10-07 11:47 GMT-03:00 Hal Rosenstock <
>>>>>>>>>>>>> hal.rosenstock at gmail.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi again German,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Looks like you made some progress from yesterday as the qib
>>>>>>>>>>>>> ports are now Polling rather than Disabled.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> But since they are Down, do you have them cabled to a switch ?
>>>>>>>>>>>>> That should bring the links up and the port state will be Init. That is the
>>>>>>>>>>>>> "starting" point.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> You will also then need to be running SM to bring the ports up
>>>>>>>>>>>>> to Active.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Hal
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:37 AM, German Anders <
>>>>>>>>>>>>> ganders at despegar.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don't know if this is the right mailing list for this kind of
>>>>>>>>>>>>> topic, but I'm really new to IB and I've just installed two SX6036G gateways
>>>>>>>>>>>>> connected to each other through two ISL ports. Then I configured
>>>>>>>>>>>>> proxy-arp between both nodes (SM is disabled on both GWs):
>>>>>>>>>>>>>
>>>>>>>>>>>>> GWIB01 [proxy-ha-group: master] (config) # show proxy-arp ha
>>>>>>>>>>>>>
>>>>>>>>>>>>> Load balancing algorithm: ib-base-ip
>>>>>>>>>>>>> Number of Proxy-Arp interfaces: 1
>>>>>>>>>>>>>
>>>>>>>>>>>>> Proxy-ARP VIP
>>>>>>>>>>>>> =============
>>>>>>>>>>>>> Pra-group name: proxy-ha-group
>>>>>>>>>>>>> HA VIP address: 10.xx.xx.xx/xx
>>>>>>>>>>>>>
>>>>>>>>>>>>> Active nodes:
>>>>>>>>>>>>> ID State IP
>>>>>>>>>>>>> --------------------------------------------------------------
>>>>>>>>>>>>> GWIB01 master 10.xx.xx.xx1
>>>>>>>>>>>>> GWIB02 standby 10.xx.xx.xx2
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then I set up two SX6018F switches (*SWIB01* and *SWIB02*),
>>>>>>>>>>>>> one connected to GWIB01 and the other connected to GWIB02. The SM is
>>>>>>>>>>>>> configured locally on both SWIB01 & SWIB02 switches. So far so good. After
>>>>>>>>>>>>> this config I connected a commodity server with a MLNX IB ADPT FDR to the
>>>>>>>>>>>>> SWIB01 & SWIB02 switches, configured the drivers, etc., and got it up &
>>>>>>>>>>>>> running fine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Finally I've set up an HP Enclosure with an internal IB SW (then
>>>>>>>>>>>>> connected port 1 of the internal SW to GWIB01 - link is up but LLR status is
>>>>>>>>>>>>> inactive), installed one of the blades, and I see the following:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # ibstat
>>>>>>>>>>>>> CA 'qib0'
>>>>>>>>>>>>> CA type: InfiniPath_QMH7342
>>>>>>>>>>>>> Number of ports: 2
>>>>>>>>>>>>> Firmware version:
>>>>>>>>>>>>> Hardware version: 2
>>>>>>>>>>>>> Node GUID: 0x0011750000791fec
>>>>>>>>>>>>> System image GUID: 0x0011750000791fec
>>>>>>>>>>>>> Port 1:
>>>>>>>>>>>>> State: Down
>>>>>>>>>>>>> Physical state: Polling
>>>>>>>>>>>>> Rate: 40
>>>>>>>>>>>>> Base lid: 4660
>>>>>>>>>>>>> LMC: 0
>>>>>>>>>>>>> SM lid: 4660
>>>>>>>>>>>>> Capability mask: 0x0761086a
>>>>>>>>>>>>> Port GUID: 0x0011750000791fec
>>>>>>>>>>>>> Link layer: InfiniBand
>>>>>>>>>>>>> Port 2:
>>>>>>>>>>>>> State: Down
>>>>>>>>>>>>> Physical state: Polling
>>>>>>>>>>>>> Rate: 40
>>>>>>>>>>>>> Base lid: 4660
>>>>>>>>>>>>> LMC: 0
>>>>>>>>>>>>> SM lid: 4660
>>>>>>>>>>>>> Capability mask: 0x0761086a
>>>>>>>>>>>>> Port GUID: 0x0011750000791fed
>>>>>>>>>>>>> Link layer: InfiniBand
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I was wondering if maybe the SM is not being recognized on
>>>>>>>>>>>>> the blade system and that's why the ports are not getting past the Polling state. Is that
>>>>>>>>>>>>> possible? Or maybe it's not possible to connect an ISL between the GW and the
>>>>>>>>>>>>> HP internal SW so that the SM is available, or maybe the inactive LLR is
>>>>>>>>>>>>> causing this. Any ideas? I thought about connecting the
>>>>>>>>>>>>> ISL of the HP IB SW to SWIB01 or SWIB02 instead of the GWs, but I don't
>>>>>>>>>>>>> have any available ports.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *German*
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>