[Users] Weird IPoIB issue

Robert LeBlanc robert_leblanc at byu.edu
Tue Oct 29 12:56:11 PDT 2013


Both ports show up in the "saquery MCMR" results with a JoinState of 0x1.

How can I dump the parameters of a non-managed switch so that I can confirm
that multicast is not turned off on the Dell chassis IB switches?
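
If I understand the diags correctly, even an unmanaged switch can be queried
in-band by LID, so something like the following (just a sketch; LID 51 here is
the Infiniscale-IV switch the blades hang off of in the ibtracert output below,
substitute whatever LIDs "ibswitches" reports for the Dell chassis switches)
ought to show whether multicast is actually programmed on them:

[root at desxi003 ~]# ibswitches              # list the switch LIDs/GUIDs on the fabric
[root at desxi003 ~]# smpquery switchinfo 51  # SwitchInfo attribute of that switch
[root at desxi003 ~]# ibroute -M 51           # dump its multicast forwarding table;
                                           # MLID 0xc000 should show the member ports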


Robert LeBlanc
OIT Infrastructure & Virtualization Engineer
Brigham Young University


On Mon, Oct 28, 2013 at 5:04 PM, Coulter, Susan K <skc at lanl.gov> wrote:

>
>  /sys/class/net should give you the details on your devices, like this:
>
>  -bash-4.1# cd /sys/class/net
> -bash-4.1# ls -l
> total 0
> lrwxrwxrwx 1 root root 0 Oct 23 12:59 eth0 ->
> ../../devices/pci0000:00/0000:00:02.0/0000:04:00.0/net/eth0
> lrwxrwxrwx 1 root root 0 Oct 23 12:59 eth1 ->
> ../../devices/pci0000:00/0000:00:02.0/0000:04:00.1/net/eth1
> lrwxrwxrwx 1 root root 0 Oct 23 15:42 ib0 ->
> ../../devices/pci0000:40/0000:40:0c.0/0000:47:00.0/net/ib0
> lrwxrwxrwx 1 root root 0 Oct 23 15:42 ib1 ->
> ../../devices/pci0000:40/0000:40:0c.0/0000:47:00.0/net/ib1
> lrwxrwxrwx 1 root root 0 Oct 23 15:42 ib2 ->
> ../../devices/pci0000:c0/0000:c0:0c.0/0000:c7:00.0/net/ib2
> lrwxrwxrwx 1 root root 0 Oct 23 15:42 ib3 ->
> ../../devices/pci0000:c0/0000:c0:0c.0/0000:c7:00.0/net/ib3
>
>  Then use "lspci | grep Mell"  to get the pci device numbers.
>
>  47:00.0 Network controller: Mellanox Technologies MT26428 [ConnectX VPI
> PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
> c7:00.0 Network controller: Mellanox Technologies MT26428 [ConnectX VPI
> PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
>
>  In this example, ib0 and ib1 reference the device at 47:00.0,
> and ib2 and ib3 reference the device at c7:00.0.
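>
>  If it helps, the same mapping can be pulled in one shot from sysfs
> (a sketch using the standard /sys/class/net layout):
>
>  -bash-4.1# for i in /sys/class/net/ib*; do echo "$i -> $(readlink -f $i/device)"; done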
>
>  That said, if you only have one card, this is probably not the problem.
> Additionally, since the ARP requests are being seen going out ib0, your
> emulation appears to be working.
>
>  If those ARP requests are not being seen on the other end, it sounds like
> a problem with the MGIDs.
> Maybe the port you are trying to reach is not in the IPoIB multicast
> group?
>
>  You can look at all the multicast member records with "saquery MCMR".
> Or - you can grep for mcmr_rcv_join_mgrp references in your SM logs …
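>
>  For example (just a sketch; adjust the log path to wherever your SM writes,
> and substitute the port GID/GUID that ibstat reports for the port in question):
>
>  -bash-4.1# saquery MCMR | grep -i -B3 -A8 <port GID>
> -bash-4.1# grep mcmr_rcv_join_mgrp /var/log/opensm.log | grep -i <port GUID>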
>
>  HTH
>
>
>
>  On Oct 28, 2013, at 1:08 PM, Robert LeBlanc <robert_leblanc at byu.edu>
> wrote:
>
>  I can ibping between both hosts just fine.
>
>  [root at desxi003 ~]# ibping 0x37
> Pong from desxi004.(none) (Lid 55): time 0.111 ms
> Pong from desxi004.(none) (Lid 55): time 0.189 ms
> Pong from desxi004.(none) (Lid 55): time 0.189 ms
> Pong from desxi004.(none) (Lid 55): time 0.179 ms
> ^C
> --- desxi004.(none) (Lid 55) ibping statistics ---
> 4 packets transmitted, 4 received, 0% packet loss, time 3086 ms
> rtt min/avg/max = 0.111/0.167/0.189 ms
>
>  [root at desxi004 ~]# ibping 0x2d
> Pong from desxi003.(none) (Lid 45): time 0.156 ms
> Pong from desxi003.(none) (Lid 45): time 0.175 ms
> Pong from desxi003.(none) (Lid 45): time 0.176 ms
> ^C
> --- desxi003.(none) (Lid 45) ibping statistics ---
> 3 packets transmitted, 3 received, 0% packet loss, time 2302 ms
> rtt min/avg/max = 0.156/0.169/0.176 ms
>
>  When I do an ordinary IP ping (as opposed to ibping) to the IPoIB address,
> tcpdump only shows the outgoing ARP request.
>
>  [root at desxi003 ~]# tcpdump -i ib0
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on ib0, link-type LINUX_SLL (Linux cooked), capture size 65535
> bytes
> 19:00:08.950320 ARP, Request who-has 192.168.9.4 tell 192.168.9.3, length
> 56
> 19:00:09.950320 ARP, Request who-has 192.168.9.4 tell 192.168.9.3, length
> 56
> 19:00:10.950307 ARP, Request who-has 192.168.9.4 tell 192.168.9.3, length
> 56
>
>  Running tcpdump on the rack servers, I don't see the ARP requests arriving
> there, which I should.
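>
>  For what it's worth, this is roughly how I'm watching on the far side (the
> hostname is just a placeholder for one of the rack servers; -n skips name
> resolution and the filter limits the capture to ARP):
>
>  [root at rackserver ~]# tcpdump -n -i ib0 arp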
>
>  From what I've read, ib0 should be mapped to the first port and ib1
> should be mapped to the second port. We have one IB card with two ports.
> The modprobe is the default installed with the Mellanox drivers.
>
>  [root at desxi003 etc]# cat modprobe.d/ib_ipoib.conf
> # install ib_ipoib modprobe --ignore-install ib_ipoib &&
> /sbin/ib_ipoib_sysctl load
> # remove ib_ipoib /sbin/ib_ipoib_sysctl unload ; modprobe -r
> --ignore-remove ib_ipoib
> alias ib0 ib_ipoib
> alias ib1 ib_ipoib
>
>  Can you give me some pointers on digging into the device layer to make
> sure IPoIB is connected correctly? Would I look in /sys or /proc for that?
>
>  Dell has not been able to replicate the problem in their environment; they
> only support Red Hat and won't work with my CentOS live CD. These blades
> don't have internal hard drives, which makes it hard to install any OS. I
> don't know if I can engage Mellanox since they build the switch
> hardware and driver stack we are using.
>
>  I really appreciate all the help you guys have given thus far; I'm
> learning a lot as this progresses. I'm reading through
> https://tools.ietf.org/html/rfc4391 trying to understand IPoIB from top
> to bottom.
>
>  Thanks,
>
>
>  Robert LeBlanc
> OIT Infrastructure & Virtualization Engineer
> Brigham Young University
>
>
> On Mon, Oct 28, 2013 at 12:53 PM, Coulter, Susan K <skc at lanl.gov> wrote:
>
>>
>>  If you are not seeing any packets leave the ib0 interface, it sounds
>> like the emulation layer is not connected to the right device.
>>
>>  If the ib_ipoib kernel module is loaded, and a simple native IB test
>> (like ib_read_bw) works between those blades, you need to dig into the device
>> layer and ensure ipoib is "connected" to the right device.
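>>
>>  One quick sanity check (a sketch; per RFC 4391 the 20-byte IPoIB hardware
>> address is 4 bytes of flags/QPN followed by the 16-byte port GID, so its tail
>> should match one of your local port GUIDs):
>>
>>  -bash-4.1# cat /sys/class/net/ib0/address   # last 8 bytes = port GUID of the bound port
>> -bash-4.1# ibstat -p                         # port GUIDs of the local HCA ports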
>>
>>  Do you have more than 1 IB card?
>> What does your modprobe config look like for ipoib?
>>
>>
>>   On Oct 28, 2013, at 12:38 PM, Robert LeBlanc <robert_leblanc at byu.edu>
>>   wrote:
>>
>>  These ESX hosts (2 blade servers and 2 rack servers) are booted into a
>> CentOS 6.2 Live CD that I built. Right now everything I'm trying to get
>> working is CentOS 6.2. All of our other hosts are running ESXi and have
>> IPoIB interfaces, but none of them are configured and I'm not trying to get
>> those working right now.
>>
>>  Ideally, we would like our ESX hosts to communicate with each other for
>> vMotion and protected VM traffic as well as with our Commvault backup
>> servers (Windows) over IPoIB (or Oracle's PVI which is very similar).
>>
>>
>>  Robert LeBlanc
>> OIT Infrastructure & Virtualization Engineer
>> Brigham Young University
>>
>>
>> On Mon, Oct 28, 2013 at 12:33 PM, Hal Rosenstock <
>> hal.rosenstock at gmail.com> wrote:
>>
>>> Are those ESXi IPoIB interfaces ? Do some of these work and others not ?
>>> Are there normal Linux IPoIB interfaces ? Do they work ?
>>>
>>>
>>> On Mon, Oct 28, 2013 at 2:24 PM, Robert LeBlanc <robert_leblanc at byu.edu>wrote:
>>>
>>>> Yes, I cannot ping them over the IPoIB interface. It is a very simple
>>>> network setup.
>>>>
>>>>  desxi003
>>>>  8: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast
>>>> state UP qlen 256
>>>>     link/infiniband
>>>> 80:20:00:54:fe:80:00:00:00:00:00:00:f0:4d:a2:90:97:78:e7:d1 brd
>>>> 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>>>>     inet 192.168.9.3/24 brd 192.168.9.255 scope global ib0
>>>>     inet6 fe80::f24d:a290:9778:e7d1/64 scope link
>>>>        valid_lft forever preferred_lft forever
>>>>
>>>>  desxi004
>>>>  8: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast
>>>> state UP qlen 256
>>>>     link/infiniband
>>>> 80:20:00:54:fe:80:00:00:00:00:00:00:f0:4d:a2:90:97:78:e7:15 brd
>>>> 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>>>>     inet 192.168.9.4/24 brd 192.168.9.255 scope global ib0
>>>>     inet6 fe80::f24d:a290:9778:e715/64 scope link
>>>>        valid_lft forever preferred_lft forever
>>>>
>>>>
>>>>
>>>>  Robert LeBlanc
>>>> OIT Infrastructure & Virtualization Engineer
>>>> Brigham Young University
>>>>
>>>>
>>>>  On Mon, Oct 28, 2013 at 12:22 PM, Hal Rosenstock <
>>>> hal.rosenstock at gmail.com> wrote:
>>>>
>>>>> So these 2 hosts have trouble talking IPoIB to each other ?
>>>>>
>>>>>
>>>>> On Mon, Oct 28, 2013 at 2:16 PM, Robert LeBlanc <
>>>>> robert_leblanc at byu.edu> wrote:
>>>>>
>>>>>> I was just wondering about that. It seems reasonable that broadcast
>>>>>> traffic would go over multicast while channels would effectively be created
>>>>>> for node-to-node communication; otherwise the entire multicast group would
>>>>>> be limited to 10 Gbps (in this instance), which doesn't scale very well.
>>>>>>
>>>>>>  The things I've read about IPoIB performance tuning seem pretty
>>>>>> vague, and the changes most people recommend seem to be already in place on
>>>>>> the systems I'm using. Some people suggested using a newer version of
>>>>>> Ubuntu, but ultimately I have very little control over VMware. Once I can
>>>>>> get the Linux machines to communicate over IPoIB between the racks and
>>>>>> blades, I'm going to turn my attention to performance optimization. It
>>>>>> doesn't seem to make much sense to spend time there when it is not working
>>>>>> at all for most machines.
>>>>>>
>>>>>>  I've done ibtracert between the two nodes, is that what you mean by
>>>>>> walking the route?
>>>>>>
>>>>>>  [root at desxi003 ~]# ibtracert -m 0xc000 0x2d 0x37
>>>>>> From ca 0xf04da2909778e7d0 port 1 lid 45-45 "localhost HCA-1"
>>>>>> [1] -> switch 0x2c90200448ec8[17] lid 51 "Infiniscale-IV Mellanox
>>>>>> Technologies"
>>>>>> [18] -> ca 0xf04da2909778e714[1] lid 55 "localhost HCA-1"
>>>>>> To ca 0xf04da2909778e714 port 1 lid 55-55 "localhost HCA-1"
>>>>>>
>>>>>>  [root at desxi004 ~]# ibtracert -m 0xc000 0x37 0x2d
>>>>>> From ca 0xf04da2909778e714 port 1 lid 55-55 "localhost HCA-1"
>>>>>> [1] -> switch 0x2c90200448ec8[18] lid 51 "Infiniscale-IV Mellanox
>>>>>> Technologies"
>>>>>> [17] -> ca 0xf04da2909778e7d0[1] lid 45 "localhost HCA-1"
>>>>>> To ca 0xf04da2909778e7d0 port 1 lid 45-45 "localhost HCA-1"
>>>>>>
>>>>>>  As you can see, the route is on the same switch, the blades are
>>>>>> right next to each other.
>>>>>>
>>>>>>
>>>>>>  Robert LeBlanc
>>>>>> OIT Infrastructure & Virtualization Engineer
>>>>>> Brigham Young University
>>>>>>
>>>>>>
>>>>>>  On Mon, Oct 28, 2013 at 12:05 PM, Hal Rosenstock <
>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>
>>>>>>>  Which mystery is explained ? The 10 Gbps is a multicast only limit
>>>>>>> and does not apply to unicast. The BW limitation you're seeing is due to
>>>>>>> other factors. There's been much written about IPoIB performance.
>>>>>>>
>>>>>>> If all the MC members are joined and routed, then the IPoIB
>>>>>>> connectivity issue is some other issue. Are you sure this is the case ? Did
>>>>>>> you walk the route between 2 nodes where you have a connectivity issue ?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 28, 2013 at 1:58 PM, Robert LeBlanc <
>>>>>>> robert_leblanc at byu.edu> wrote:
>>>>>>>
>>>>>>>> Well, that explains one mystery; now I need to figure out why it
>>>>>>>> seems the Dell blades are not passing the traffic.
>>>>>>>>
>>>>>>>>
>>>>>>>>  Robert LeBlanc
>>>>>>>> OIT Infrastructure & Virtualization Engineer
>>>>>>>> Brigham Young University
>>>>>>>>
>>>>>>>>
>>>>>>>>  On Mon, Oct 28, 2013 at 11:51 AM, Hal Rosenstock <
>>>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>>>
>>>>>>>>>  Yes, that's the IPoIB IPv4 broadcast group for the default
>>>>>>>>> (0xffff) partition. 0x80 part of mtu and rate just means "is equal to". mtu
>>>>>>>>> 0x04 is 2K (2048) and rate 0x3 is 10 Gb/sec. These are indeed the defaults.
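>>>>>>>>>
>>>>>>>>> Spelled out (per the MCMemberRecord encoding, the top two bits of each
>>>>>>>>> field are a selector and the low six bits the value):
>>>>>>>>>
>>>>>>>>>   Mtu  0x84 = 10 000100b -> selector 2 ("exactly"), value 4 = 2048 bytes
>>>>>>>>>   Rate 0x83 = 10 000011b -> selector 2 ("exactly"), value 3 = 10 Gb/sec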
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 28, 2013 at 1:45 PM, Robert LeBlanc <
>>>>>>>>> robert_leblanc at byu.edu> wrote:
>>>>>>>>>
>>>>>>>>>> The info for that MGID is:
>>>>>>>>>> MCMemberRecord group dump:
>>>>>>>>>>                 MGID....................ff12:401b:ffff::ffff:ffff
>>>>>>>>>>                 Mlid....................0xC000
>>>>>>>>>>                 Mtu.....................0x84
>>>>>>>>>>                 pkey....................0xFFFF
>>>>>>>>>>                 Rate....................0x83
>>>>>>>>>>                 SL......................0x0
>>>>>>>>>>
>>>>>>>>>>  I don't understand the MTU and Rate values (0x84 = 132 and 0x83 = 131
>>>>>>>>>> decimal). When I run iperf between the two hosts over IPoIB in connected
>>>>>>>>>> mode with an MTU of 65520, even with multiple threads the aggregate is
>>>>>>>>>> still only 10 Gbps.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  Robert LeBlanc
>>>>>>>>>> OIT Infrastructure & Virtualization Engineer
>>>>>>>>>> Brigham Young University
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  On Mon, Oct 28, 2013 at 11:40 AM, Hal Rosenstock <
>>>>>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>  saquery -g should show what MGID is mapped to MLID 0xc000 and
>>>>>>>>>>> the group parameters.
>>>>>>>>>>>
>>>>>>>>>>>  When you say 10 Gbps max, is that multicast or unicast ? That
>>>>>>>>>>> limit is only on the multicast.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 28, 2013 at 1:28 PM, Robert LeBlanc <
>>>>>>>>>>> robert_leblanc at byu.edu> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Well, that can explain why I'm only able to get 10 Gbps max
>>>>>>>>>>>> from the two hosts that are working.
>>>>>>>>>>>>
>>>>>>>>>>>>  I have tried updn and dnup and they didn't help either. I
>>>>>>>>>>>> think the only thing that will help is Automatic Path Migration, as it tries
>>>>>>>>>>>> very hard to route the alternative LIDs through different system GUIDs. I
>>>>>>>>>>>> suspect it would require re-LIDing everything, which would mean an outage.
>>>>>>>>>>>> I'm still trying to get answers from Oracle about whether that is even a
>>>>>>>>>>>> possibility. I've tried seeding some of the algorithms with information like
>>>>>>>>>>>> root nodes, etc., but none of them worked better.
>>>>>>>>>>>>
>>>>>>>>>>>>  The MLID 0xc000 exists and I can see all the nodes joined to
>>>>>>>>>>>> the group using saquery. I've checked the route using ibtracert specifying
>>>>>>>>>>>> the MLID. The only thing I'm not sure how to check is the group parameters.
>>>>>>>>>>>> What tool would I use for that?
>>>>>>>>>>>>
>>>>>>>>>>>>  Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Robert LeBlanc
>>>>>>>>>>>> OIT Infrastructure & Virtualization Engineer
>>>>>>>>>>>> Brigham Young University
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  On Mon, Oct 28, 2013 at 11:16 AM, Hal Rosenstock <
>>>>>>>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>  Xsigo's SM is not "straight" OpenSM. They have some
>>>>>>>>>>>>> proprietary enhancements, and it may be based on an old vintage of OpenSM. You
>>>>>>>>>>>>> will likely need to work with them/Oracle now on issues.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Lack of a partitions file does mean the default partition and
>>>>>>>>>>>>> default rate (10 Gbps), so from what I saw all ports had sufficient rate to
>>>>>>>>>>>>> join the MC group.
>>>>>>>>>>>>>
>>>>>>>>>>>>> There are certain topology requirements for running various
>>>>>>>>>>>>> routing algorithms. Did you try updn or dnup ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> The key is determining whether the IPoIB broadcast group is
>>>>>>>>>>>>> setup correctly. What MLID is the group built on (usually 0xc000) ? What
>>>>>>>>>>>>> are the group parameters (rate, MTU) ? Are all members that are running
>>>>>>>>>>>>> IPoIB joined ? Is the group routed to all such members ? There are
>>>>>>>>>>>>> infiniband-diags for all of this.
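>>>>>>>>>>>>>
>>>>>>>>>>>>> Roughly one diag per question (a sketch; substitute your own LIDs):
>>>>>>>>>>>>>
>>>>>>>>>>>>>   saquery -g                           # MGID-to-MLID mapping plus group rate/MTU
>>>>>>>>>>>>>   saquery MCMR                         # which ports have joined
>>>>>>>>>>>>>   ibtracert -m 0xc000 <slid> <dlid>    # whether the group is routed between two members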
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 28, 2013 at 12:19 PM, Robert LeBlanc <
>>>>>>>>>>>>> robert_leblanc at byu.edu> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> OpenSM (the SM runs on Xsigo so they manage it) is using
>>>>>>>>>>>>>> minhop. I've loaded the ibnetdiscover output into ibsim and run all the
>>>>>>>>>>>>>> different routing algorithms against it with and without scatter ports.
>>>>>>>>>>>>>> Minhop had 50% of our hosts running all paths through a single IS5030
>>>>>>>>>>>>>> switch (at least the LIDs we need which represent Ethernet and Fibre
>>>>>>>>>>>>>> Channel cards the hosts should communicate with). Ftree, dor, and dfsssp
>>>>>>>>>>>>>> fell back to minhop; the others routed even more paths through the same
>>>>>>>>>>>>>> IS5030, in some cases increasing the share of hosts with a single point of
>>>>>>>>>>>>>> failure to 75%.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  As far as I can tell there is no partitions.conf file so I
>>>>>>>>>>>>>> assume we are using the default partition. There is an opensm.opts file,
>>>>>>>>>>>>>> but it only specifies logging information.
>>>>>>>>>>>>>>  # SA database file name
>>>>>>>>>>>>>> sa_db_file /var/log/opensm-sa.dump
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  # If TRUE causes OpenSM to dump SA database at the end of
>>>>>>>>>>>>>> # every light sweep, regardless of the verbosity level
>>>>>>>>>>>>>> sa_db_dump TRUE
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  # The directory to hold the file OpenSM dumps
>>>>>>>>>>>>>> dump_files_dir /var/log/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  The SM node is:
>>>>>>>>>>>>>>  xsigoa:/opt/xsigo/xsigos/current/ofed/etc# ibaddr
>>>>>>>>>>>>>> GID fe80::13:9702:100:979 LID start 0x1 end 0x1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  We do have Switch-X in two of the Dell m1000e chassis, but
>>>>>>>>>>>>>> the cards, on ports 17-32, are FDR10 (the switch may be straight FDR, but I'm
>>>>>>>>>>>>>> not 100% sure). The IS5030s that the Switch-X switches connect to are QDR, the
>>>>>>>>>>>>>> switches in the Xsigo directors are QDR, and the Ethernet and Fibre Channel
>>>>>>>>>>>>>> cards are DDR. The DDR cards will not be running IPoIB (at least to my
>>>>>>>>>>>>>> knowledge they don't have the ability); only the hosts should be using
>>>>>>>>>>>>>> IPoIB. I hope that clears up some of your questions. If you have more, I
>>>>>>>>>>>>>> will try to answer them.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Robert LeBlanc
>>>>>>>>>>>>>> OIT Infrastructure & Virtualization Engineer
>>>>>>>>>>>>>> Brigham Young University
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  On Mon, Oct 28, 2013 at 9:57 AM, Hal Rosenstock <
>>>>>>>>>>>>>> hal.rosenstock at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  What routing algorithm is configured in OpenSM ? What does
>>>>>>>>>>>>>>> your partitions.conf file look like ? Which node is your OpenSM ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, I only see QDR and DDR links even though you have
>>>>>>>>>>>>>>> Switch-X, so I assume all FDR ports are connected to slower (QDR) devices. I
>>>>>>>>>>>>>>> don't see any FDR-10 ports either, but maybe they're also connected to QDR
>>>>>>>>>>>>>>> ports and so show up as QDR in the topology.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There are DDR CAs in the Xsigo box, but I'm not sure whether or not
>>>>>>>>>>>>>>> they run IPoIB.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- Hal
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  On Sun, Oct 27, 2013 at 9:46 PM, Robert LeBlanc <
>>>>>>>>>>>>>>> robert_leblanc at byu.edu> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  Since you guys are amazingly helpful, I thought I would
>>>>>>>>>>>>>>>> pick your brains on a new problem.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  We have two Xsigo directors cross connected to four
>>>>>>>>>>>>>>>> Mellanox IS5030 switches. Connected to those we have four Dell m1000e
>>>>>>>>>>>>>>>> chassis each with two IB switches (two chassis have QDR and two have
>>>>>>>>>>>>>>>> FDR10). We have 9 dual-port rack servers connected to the IS5030 switches.
>>>>>>>>>>>>>>>> For testing purposes we have an additional Dell m1000e QDR chassis
>>>>>>>>>>>>>>>> connected to one Xsigo director and two dual-port FDR10 rack servers
>>>>>>>>>>>>>>>> connected to the other Xsigo director.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  I can get IPoIB to work between the two test rack servers
>>>>>>>>>>>>>>>> connected to the one Xsigo director, but I cannot get IPoIB to work
>>>>>>>>>>>>>>>> between any blades, either right next to each other or to the working rack
>>>>>>>>>>>>>>>> servers. I'm using the exact same live CentOS ISO on all four servers. I've
>>>>>>>>>>>>>>>> checked opensm and the blades have joined the multicast group 0xc000
>>>>>>>>>>>>>>>> properly. tcpdump basically says that traffic is not leaving the blades,
>>>>>>>>>>>>>>>> and it also shows no traffic entering the blades from the rack servers. An
>>>>>>>>>>>>>>>> ibtracert using the 0xc000 MLID shows that routing exists between the hosts.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  I've read about MulticastFDBTop=0xBFFF but I don't know
>>>>>>>>>>>>>>>> how to set it and I doubt it would have been set by default.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  Anyone have some ideas on troubleshooting steps to try? I
>>>>>>>>>>>>>>>> think Google is tired of me asking questions about it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  Robert LeBlanc
>>>>>>>>>>>>>>>> OIT Infrastructure & Virtualization Engineer
>>>>>>>>>>>>>>>> Brigham Young University
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>>
>
>  ====================================
>
>  Susan Coulter
> HPC-3 Network/Infrastructure
> 505-667-8425
> Increase the Peace...
> An eye for an eye leaves the whole world blind
> ====================================
>
>