[Users] Weird IPoIB issue

Robert LeBlanc robert_leblanc at byu.edu
Sun Oct 27 18:46:18 PDT 2013


Since you guys are amazingly helpful, I thought I would pick your brains on
a new problem.

We have two Xsigo directors cross-connected to four Mellanox IS5030
switches. Connected to those we have four Dell m1000e chassis each with two
IB switches (two chassis have QDR and two have FDR10). We have 9 dual-port
rack servers connected to the IS5030 switches. For testing purposes we have
an additional Dell m1000e QDR chassis connected to one Xsigo director and
two dual-port FDR10 rack servers connected to the other Xsigo director.

I can get IPoIB to work between the two test rack servers connected to the
one Xsigo director. But I cannot get IPoIB to work from any of the blades,
either to the blade right next to it or to the working rack servers. I'm using
the same exact live CentOS ISO on all four servers. I've checked opensm and
the blades have joined the multicast group 0xc000 properly. tcpdump
basically says that traffic is not leaving the blades. tcpdump also shows
no traffic entering the blades from the rack servers. An ibtracert using
MLID 0xc000 shows that a route exists between the hosts.
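
One way to cross-check the tcpdump observation is to watch the kernel's
per-interface packet counters while a ping is running. The following is only
a minimal sketch: it assumes the IPoIB interface is named ib0 and that the
standard Linux sysfs counters under /sys/class/net/<iface>/statistics are
present. The function names are made up for illustration.

```python
from pathlib import Path

# Counter files the kernel exposes for every network interface.
COUNTER_NAMES = ("tx_packets", "rx_packets", "tx_dropped", "rx_dropped")


def read_counters(iface="ib0", sysfs_root="/sys/class/net"):
    """Return the interface's packet counters from sysfs as a dict of ints.

    The interface name ib0 is an assumption; pass the real name if it differs.
    """
    stats_dir = Path(sysfs_root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text()) for name in COUNTER_NAMES}


def counter_delta(before, after):
    """Return how much each counter changed between two snapshots."""
    return {name: after[name] - before[name] for name in before}
```

Taking a snapshot with read_counters(), pinging the other host for a few
seconds, taking a second snapshot and passing both to counter_delta() should
show whether tx_packets moves at all on a blade. If it does not, the packets
are probably being dropped before they reach the HCA.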

I've read about MulticastFDBTop=0xBFFF, but I don't know how to set it, and I
doubt it would have been set by default.
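
To show why that value stands out next to the 0xc000 MLID, here is a small
sketch of the InfiniBand LID ranges: 0xBFFF is the last unicast LID and
0xC000 is the first multicast LID. The mlid_within_top() helper rests on an
assumption I have not verified, namely that MulticastFDBTop acts as the
highest MLID a switch will forward. Both function names are hypothetical.

```python
def classify_lid(lid):
    """Classify a 16-bit InfiniBand LID by the range it falls in."""
    if not 0 <= lid <= 0xFFFF:
        raise ValueError("LID must fit in 16 bits")
    if lid == 0x0000:
        return "reserved"
    if lid <= 0xBFFF:
        return "unicast"
    if lid <= 0xFFFE:
        return "multicast"
    return "permissive"  # 0xFFFF


def mlid_within_top(mlid, fdb_top):
    """ASSUMPTION: treats MulticastFDBTop as an inclusive upper bound on MLIDs."""
    return classify_lid(mlid) == "multicast" and mlid <= fdb_top
```

Under that assumption, mlid_within_top(0xC000, 0xBFFF) is False, which would
mean no multicast LID at all falls inside the forwarding range, including the
0xc000 group the blades have joined.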

Anyone have some ideas on troubleshooting steps to try? I think Google is
tired of me asking questions about it.

Thanks,

Robert LeBlanc
OIT Infrastructure & Virtualization Engineer
Brigham Young University