[Users] IPoIB on CentOS 6.5

Peter Kjellström cap at nsc.liu.se
Thu Mar 19 02:04:40 PDT 2015


On Wed, 18 Mar 2015 21:08:17 +0000
"Foraker, Jim" <foraker1 at llnl.gov> wrote:

>      Does "known broken"

By "known broken" I meant
 1) several sites including ours had to back off to an older or patched
 version to get sane IPoIB behaviour
And
 2) We opened a case with Red Hat about this and they've been working on
 a fix. Our support case nr for this is 01321081 and I suspect it is also
 tracked as bz1159925.

 Related work on linux-rdma:
  [PATCH V3 FIX For-3.19 0/3] IB/ipoib: Fix multicast join flow
  https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg22511.html

> mean Mehmet's case where IPoIB dies after an
> opensm failover, or broken in other ways?

And the failure mode is essentially that islands of connectivity form
when the SM is restarted (a secondary symptom is that the ib_ipoib module
cannot be unloaded once broken / after an SM restart).

Here's a step-by-step walkthrough that shows the problem on one of our
systems (written by a colleague):

--- begin example

IPoIB does not handle subnet manager restarts.

I will show this using an example from yesterday:

  n[464-472] ran CentOS 6.5
  n[564-572] ran CentOS 6.6

The IPoIB interface ib0 was down on all nodes, and we had just restarted
OpenSM.

Step 1: Bring up IPoIB on 7 nodes running 6.5 and 7 nodes running 6.6:

    # pdsh -w "n[564-570],n[464-470]" ifup ib0

Step 2: Verify connectivity

  All nodes can ping all other nodes:
  
    # pdsh -w "n[564-570],n[464-470]" coping -o -e "ni[564-570],ni[464-470]"|pshbak -c
    ----------------
    n[464-470,564-570]
    ----------------
    2014-12-11 15:03:54  ni[464-470,564-570]  initially up

Step 3: Restart OpenSM

Step 4: Verify connectivity again:

  Still OK:
    # pdsh -w "n[564-570],n[464-470]" coping -o -e "ni[564-570],ni[464-470]"|pshbak -c
    ----------------
    n[464-470,564-570]
    ----------------
    2014-12-11 15:07:01  ni[464-470,564-570]  initially up

Step 5: Start IPoIB on 4 additional nodes (two 6.5 and two 6.6):

    # pdsh -w "n[571-572,471,472]" ifup ib0

Step 6: Verify connectivity:

  Broken:
  * 6.6 nodes started in Step 1 can still ping all nodes from Step 1,
    but not the nodes started in Step 5.
  * 6.5 nodes started in Step 1 can ping everything.
  * Nodes from Step 5 can ping each other and the 6.5 nodes from
    Step 1, but not the 6.6 nodes from Step 1.

    [root at trio yum.repos.d]# pdsh -w "n[564-572],n[464-472]" coping -o -e "ni[564-572],ni[464-472]"|sort|pshbak -c  
    ----------------
    n[564-570]
    ----------------
    2014-12-11 15:08:24  ni[464-470,564-570]  initially up
    2014-12-11 15:08:24  ni[471-472,571-572]  initially DOWN
    ----------------
    n[464-470]
    ----------------
    2014-12-11 15:08:24  ni[464-472,564-572]  initially up
    ----------------
    n[471-472,571-572]
    ----------------
    2014-12-11 15:08:24  ni[464-472,571-572]  initially up
    2014-12-11 15:08:24  ni[564-570]  initially DOWN

---- end example
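
The per-group view in Step 6 could probably be reproduced without the
site-local coping tool. Below is a minimal sketch, assuming plain ping is
an acceptable stand-in; the helper names probe and summarize_reach are
hypothetical and not part of pdsh or any tool mentioned above. Sources
are grouped by the exact list of targets they cannot reach, so each
island shows up as one output line.

```shell
#!/bin/sh
# Sketch only: "probe" and "summarize_reach" are hypothetical helper names,
# and plain ping stands in for the site-local coping tool used above.

# probe SRC TARGET...: ping each target once, print "SRC TARGET up|down".
probe() {
    src=$1
    shift
    for target in "$@"; do
        if ping -c 1 -W 1 "$target" >/dev/null 2>&1; then
            echo "$src $target up"
        else
            echo "$src $target down"
        fi
    done
}

# summarize_reach: read "SRC TARGET up|down" lines on stdin and print one
# line per distinct list of unreachable targets, with the sources sharing it.
summarize_reach() {
    awk '
        !($1 in seen) { seen[$1] = 1; order[++n] = $1 }
        $3 == "down"  { down[$1] = down[$1] " " $2 }
        END {
            for (i = 1; i <= n; i++) {
                s = order[i]
                key = (s in down) ? down[s] : " (none)"
                if (!(key in group)) keys[++m] = key
                group[key] = group[key] " " s
            }
            for (j = 1; j <= m; j++)
                printf "cannot reach:%s | sources:%s\n", keys[j], group[keys[j]]
        }'
}
```

Running probe on every node with the same target list in the same order
(for instance gathered through pdsh) and piping the combined lines into
summarize_reach should give one line per island; sources listed under
"(none)" reach every target.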

>  The only issue we've seen
> with IPoIB in RHEL 6.6 has been a bug with QIB hardware and
> kernel-based RDMA (Lustre, SRP). Is there a RHEL bugzilla bug open on
> the issue(s)?

For the case id and the possible bz, see the beginning of my e-mail.

Is there a bz for the QIB bug you mentioned? (We've seen this too
and switched to ofed-3.12-1 on the system that required lnet on qib.)

/Peter 
 
>      Jim
> 
> 
> On 3/18/15, 10:30 AM, "Peter Kjellström" <cap at nsc.liu.se> wrote:
> 
> >On Tue, 17 Mar 2015 15:54:02 +0100
> >Mehmet Soysal <mehmet.soysal at kit.edu> wrote:
> >
> >> Hi,
> >> did you solved the problem ?
> >> We have a similar issue since a upgrade to RHEL 6.5 or higher.
> >> 
> >> On our nodes ipoib is not working any longer after a opensm fail
> >> over occurs.
> >
> >Actually IPoIB is known broken in rhel6 (6.5 zstream 431-x, x > 37
> >and for 6.6 all released -504). Redhat knows this and is working on
> >a fix (there may be a candidate fix kernel to request). Meanwhile
> >we've rebuilt latest -504 with the ipoib from 6.5 (which works fine
> >for us).
> >
> >If you're interested in our -504 pkgs with old/working ipoib contact
> >me offlist.
> >
> >Since last week you also get the additional complication of the verbs
> >CVE to take into account when picking a working setup...
> >
> >/Peter K



