[Users] IPoIB on CentOS 6.5

Weiny, Ira ira.weiny at intel.com
Thu Mar 19 14:25:45 PDT 2015


> 
> Hi,
> that's good to hear that this issue has been given high priority.
> Our Redhat case is 01368360.
> 
> Our problem with IPoIB is slightly different from what Peter explained.
> I did not notice any islands being formed.
> After an opensm failover, none of the clients can use IPoIB any more, and
> unloading the ib_ipoib module is also not possible.

This seems pretty serious.  Any idea why?
 
> 
> What I noticed is that ARP requests are not answered after a failover.
> If a node still has a valid ARP cache entry for another IB node, it can still ping it.
> After clearing the cache, the client does not get any ARP replies for that
> node.
> 
> Hope that Redhat fixes this issue soon.
> 

Has the failover completed?  Did the SM Lid get properly reassigned? 
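
One quick way to check the SM LID on a client is the per-port sm_lid file in sysfs. Below is a rough Python sketch, not a tested tool: it assumes the usual /sys/class/infiniband/<device>/ports/<port>/sm_lid layout with a hex value such as 0x1, and the function names and the expected_sm_lid argument are made up for illustration.

```python
from pathlib import Path
from typing import Dict, List


def read_sm_lids(sysfs_root: str = "/sys/class/infiniband") -> Dict[str, int]:
    """Return {"<device>/<port>": sm_lid} for every port found under sysfs_root.

    Assumes each sm_lid file holds a single hex value such as "0x1".
    """
    sm_lids: Dict[str, int] = {}
    for sm_lid_file in sorted(Path(sysfs_root).glob("*/ports/*/sm_lid")):
        # Path looks like <root>/<device>/ports/<port>/sm_lid
        device, port = sm_lid_file.parts[-4], sm_lid_file.parts[-2]
        sm_lids[f"{device}/{port}"] = int(sm_lid_file.read_text().strip(), 16)
    return sm_lids


def stale_ports(sm_lids: Dict[str, int], expected_sm_lid: int) -> List[str]:
    """List the ports whose recorded SM LID differs from the expected master SM LID."""
    return sorted(port for port, lid in sm_lids.items() if lid != expected_sm_lid)


# Hypothetical usage on a client, where 1 stands in for the LID of the new master SM:
#   print(stale_ports(read_sm_lids(), expected_sm_lid=1))
```

Any port listed by stale_ports() is still pointing at the old SM, which would suggest the failover has not been seen by that node.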

For a node which is failing arp are the mcast groups joined?

Besides the opensm log and the saquery tool, IPoIB has some debugfs entries which can help here.

[root at phcppriv12 oib_utils]# cat /sys/kernel/debug/ipoib/ib0_mcg 
GID: ff12:401b:ffff:0:0:0:0:1
  created: 6034115225
  queuelen:         0
  complete:       yes
  send_only:       no

...
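
To check many nodes it may be easier to parse that file than to read it by eye. A minimal Python sketch follows, assuming the file keeps the "GID:" line followed by indented "key: value" lines shown above. The helper names are mine, and treating any group with "complete" other than "yes" as suspect is only a first guess.

```python
from typing import Dict, List


def parse_mcg(text: str) -> List[Dict[str, str]]:
    """Split the ib0_mcg debugfs text into one dict per multicast group."""
    groups: List[Dict[str, str]] = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or something we do not recognize
        key, value = key.strip(), value.strip()
        if key == "GID":
            groups.append({"GID": value})  # a GID line starts a new group
        elif groups:
            groups[-1][key] = value  # attribute of the most recent group
    return groups


def incomplete_joins(groups: List[Dict[str, str]]) -> List[str]:
    """Return the GIDs of groups whose join is not reported as complete."""
    return [g["GID"] for g in groups if g.get("complete") != "yes"]


def check_file(path: str = "/sys/kernel/debug/ipoib/ib0_mcg") -> List[str]:
    """Read the debugfs file (needs root and a mounted debugfs) and report incomplete joins."""
    with open(path) as mcg_file:
        return incomplete_joins(parse_mcg(mcg_file.read()))
```

Running check_file() on a node that fails ARP after the failover, and comparing with a node that still works, should show whether the joins are the difference.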


This sounds like an issue with Client Reregister/mcast join after the failover.
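
Since ARP on IPoIB is sent to the broadcast multicast group, that is the first join I would look for. Here is a small Python sketch of which GIDs to expect, assuming the RFC 4391 IPv4 mapping, link-local scope (the ff12 prefix seen above), and that the full-membership bit is set in the P_Key as the Linux driver does. The function names are illustrative only.

```python
import ipaddress

IPOIB_IPV4_SIGNATURE = 0x401B  # IPv4 signature used in IPoIB multicast GIDs


def _format_gid(value: int) -> str:
    """Format a 128-bit GID as eight unpadded hex groups, like the debugfs output above."""
    groups = [(value >> shift) & 0xFFFF for shift in range(112, -1, -16)]
    return ":".join(f"{g:x}" for g in groups)


def _mgid_prefix(pkey: int, scope: int) -> int:
    """Build the upper 48 bits: ff1<scope>, the IPv4 signature, then the P_Key."""
    full_pkey = (pkey | 0x8000) & 0xFFFF  # assumption: full-membership bit is set
    return ((0xFF10 | scope) << 112) | (IPOIB_IPV4_SIGNATURE << 96) | (full_pkey << 80)


def ipoib_ipv4_mgid(group: str, pkey: int = 0xFFFF, scope: int = 0x2) -> str:
    """Map an IPv4 multicast address to its IPoIB MGID (low 28 bits of the address)."""
    addr = int(ipaddress.IPv4Address(group))
    return _format_gid(_mgid_prefix(pkey, scope) | (addr & 0x0FFFFFFF))


def ipoib_broadcast_mgid(pkey: int = 0xFFFF, scope: int = 0x2) -> str:
    """Return the broadcast MGID, whose low 32 bits are all ones."""
    return _format_gid(_mgid_prefix(pkey, scope) | 0xFFFFFFFF)


def normalize_gid(gid: str) -> str:
    """Bring a zero-padded or compressed GID string into the same unpadded form."""
    return _format_gid(int(ipaddress.IPv6Address(gid)))
```

With the default P_Key, ipoib_ipv4_mgid("224.0.0.1") gives the GID shown in the debugfs output above. If the GID from ipoib_broadcast_mgid() is missing from ib0_mcg or not marked complete on a node, that would fit the ARP symptom.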

If possible, it would be nice if those experiencing IPoIB issues like this could try Doug's latest patch series, which I believe fixes various mcast join issues with IPoIB.

https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23114.html

Ira


> 
> 
> best regards
> M.Soysal
> 
> 
> 
> On 19.03.2015 17:17, Foraker, Jim wrote:
> > Peter,
> >       Thanks.  I've told our RedHat folks that the IPoIB issue is a
> > high priority for us.  Our bug for the qib kernel RDMA issue is
> > 1188417, which was closed as a duplicate of
> > https://bugzilla.redhat.com/show_bug.cgi?id=1171803.
> >
> >       Jim
> >
> > On 3/19/15, 2:04 AM, "Peter Kjellström" <cap at nsc.liu.se> wrote:
> >
> >> On Wed, 18 Mar 2015 21:08:17 +0000
> >> "Foraker, Jim" <foraker1 at llnl.gov> wrote:
> >>
> >>>       Does "known broken"
> >> By "known broken" I meant
> >> 1) several sites including ours had to back off to older or patched
> >> versions to get sanity for IPoIB, and
> >> 2) we raised a case with Redhat and they've been working on a fix. Our
> >> support case nr for this is 01321081 and I suspect also bz1159925.
> >>
> >> work on linux-rdma:
> >>   [PATCH V3 FIX For-3.19 0/3] IB/ipoib: Fix multicast join flow
> >>
> >> https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg22511.html
> >>
> >>> mean Mehmet's case where IPoIB dies after an opensm failover, or
> >>> broken in other ways?
> >> And the failure mode is essentially that islands of connectivity form
> >> as the SM is restarted (a secondary symptom is that the ib_ipoib
> >> module cannot be unloaded once broken / after sm restart).
> >>
> >> Here's a step-by-step way that shows the problem on one of our systems
> >> (written by a colleague):
> >>
> >> --- begin example
> >>
> >> IPoIB does not handle subnet manager restarts.
> >>
> >> I will show this using an example from yesterday:
> >>
> >>   n[464-472] ran CentOS 6.5
> >>   n[564-572] ran CentOS 6.6
> >>
> >> The IPoIB interface ib0 was down on all nodes, and we had just
> >> restarted OpenSM.
> >>
> >> Step 1: Bring up IPoIB on 7 nodes running 6.5 and 7 nodes running 6.6:
> >>
> >>     # pdsh -w "n[564-570],n[464-470]" ifup ib0
> >>
> >> Step 2: Verify connectivity
> >>
> >>   All nodes can ping all other nodes:
> >>
> >>     # pdsh -w "n[564-570],n[464-470]" coping -o -e "ni[564-570],ni[464-470]"|pshbak -c
> >>     ----------------
> >>     n[464-470,564-570]
> >>     ----------------
> >>     2014-12-11 15:03:54  ni[464-470,564-570]  initially up
> >>
> >> Step 3: Restart OpenSM
> >>
> >> Step 4: Verify connectivity again:
> >>
> >>   Still OK:
> >>     # pdsh -w "n[564-570],n[464-470]" coping -o -e "ni[564-570],ni[464-470]"|pshbak -c
> >>     ----------------
> >>     n[464-470,564-570]
> >>     ----------------
> >>     2014-12-11 15:07:01  ni[464-470,564-570]  initially up
> >>
> >> Step 5: Start IPoIB on 4 additional nodes (two 6.5 and two 6.6):
> >>
> >>     # pdsh -w "n[571-572,471,472]" ifup ib0
> >>
> >> Step 6: Verify connectivity:
> >>
> >>   Broken:
> >>   * 6.6 nodes started in Step 1 can still ping all nodes from Step 1,
> >>     but not the nodes started in Step 5.
> >>   * 6.5 nodes started in Step 1 can ping everything.
> >>   * Nodes from Step 5 can ping each other, but only the 6.5 nodes from
> >>     Step 1, not the 6.6.
> >>
> >>     [root at trio yum.repos.d]# pdsh -w "n[564-572],n[464-472]" coping -o -e "ni[564-572],ni[464-472]"|sort|pshbak -c
> >>     ----------------
> >>     n[564-570]
> >>     ----------------
> >>     2014-12-11 15:08:24  ni[464-470,564-570]  initially up
> >>     2014-12-11 15:08:24  ni[471-472,571-572]  initially DOWN
> >>     ----------------
> >>     n[464-470]
> >>     ----------------
> >>     2014-12-11 15:08:24  ni[464-472,564-572]  initially up
> >>     ----------------
> >>     n[471-472,571-572]
> >>     ----------------
> >>     2014-12-11 15:08:24  ni[464-472,571-572]  initially up
> >>     2014-12-11 15:08:24  ni[564-570]  initially DOWN
> >>
> >> ---- end example
> >>
> >>>   The only issue we've seen
> >>> with IPoIB in RHEL 6.6 has been a bug with QIB hardware and
> >>> kernel-based RDMA (Lustre, SRP). Is there a RHEL bugzilla bug open
> >>> on the issue(s)?
> >> For bug id and possible bz see beginning of my e-mail.
> >>
> >> Is there a bz for the QIB bug you mentioned? (we've seen this too and
> >> switched to ofed-3.12-1 on that system that required lnet on qib).
> >>
> >> /Peter
> >>
> >>>       Jim
> >>>
> >>>
> >>> On 3/18/15, 10:30 AM, "Peter Kjellström" <cap at nsc.liu.se> wrote:
> >>>
> >>>> On Tue, 17 Mar 2015 15:54:02 +0100
> >>>> Mehmet Soysal <mehmet.soysal at kit.edu> wrote:
> >>>>
> >>>>> Hi,
> >>>>> did you solve the problem?
> >>>>> We have had a similar issue since an upgrade to RHEL 6.5 or higher.
> >>>>>
> >>>>> On our nodes IPoIB no longer works after an opensm failover
> >>>>> occurs.
> >>>> Actually IPoIB is known broken in rhel6 (6.5 zstream 431-x, x > 37
> >>>> and for 6.6 all released -504). Redhat knows this and is working on
> >>>> a fix (there may be a candidate fix kernel to request). Meanwhile
> >>>> we've rebuilt latest -504 with the ipoib from 6.5 (which works fine
> >>>> for us).
> >>>>
> >>>> If you're interested in our -504 pkgs with old/working ipoib
> >>>> contact me offlist.
> >>>>
> >>>> Since last week you also get the additional complication of the
> >>>> verbs CVE to take into account when picking a working setup...
> >>>>
> >>>> /Peter K
> >>>> _______________________________________________
> >>>> Users mailing list
> >>>> Users at lists.openfabrics.org
> >>>> http://lists.openfabrics.org/mailman/listinfo/users
> >>>>
> >>>
> >>
> >
> 
> --
> ----------------------------------------------------------------------------
> Mehmet Soysal
> Scientific Computing and Services (SCS)
> 
> Karlsruher Institut für Technologie (KIT) Steinbuch Centre for Computing (SCC)
> Zirkel 2, Gebäude 20.21, Raum 206
> D-76131 Karlsruhe
> Tel. : +49 721 608-46347
> Fax  : +49 721 32550
> Email: Mehmet.Soysal at kit.edu
> WWW : http://www.scc.kit.edu
> 
> KIT - Universität des Landes Baden-Württemberg und nationales
> Forschungszentrum in der Helmholtz-Gemeinschaft
> 


