[Users] IPoIB on CentOS 6.5

Mehmet Soysal mehmet.soysal at kit.edu
Thu Mar 19 09:31:53 PDT 2015


Hi,
it's good to hear that this issue has been given high priority.
Our Red Hat case is 01368360.

Our problem with IPoIB is slightly different from what Peter described.
I did not notice any islands being formed.
After an opensm failover, none of the clients can use IPoIB any more,
and unloading the ib_ipoib module is also not possible.

What I noticed is that ARP requests are not answered after a failover.
If a node still has a valid ARP cache entry for another IB node, it can
still ping that node.
After the cache is cleared, the client no longer gets any ARP replies for
that node.
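
A minimal sketch of how this ARP check could be repeated on a client,
assuming the IPoIB interface is ib0. The peer address 192.0.2.10 is only a
documentation placeholder, and the variable names and the DRY_RUN switch are
hypothetical helpers for illustration. By default the script only prints the
commands it would run; set DRY_RUN=0 (as root) to execute them.

```shell
#!/bin/sh
# Sketch: check whether a peer still answers ARP over IPoIB after an SM failover.
# IFACE and PEER are placeholders, not values taken from this thread.
IFACE="${IFACE:-ib0}"
PEER="${PEER:-192.0.2.10}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_arp() {
    # 1. Show the neighbour (ARP) cache for the IPoIB interface.
    run ip neigh show dev "$IFACE"
    # 2. Ping the peer while a cached entry may still be valid.
    run ping -c 1 -W 2 -I "$IFACE" "$PEER"
    # 3. Drop the cached entry for the peer (requires root).
    run ip neigh del "$PEER" dev "$IFACE"
    # 4. Ping again; this forces a new ARP request for the peer.
    run ping -c 1 -W 2 -I "$IFACE" "$PEER"
    # 5. Show the cache again. An entry for the peer left in INCOMPLETE
    #    or FAILED state means the ARP request was not answered.
    run ip neigh show dev "$IFACE"
}

check_arp
```

If the second ping fails while the first one succeeded, that matches the
behaviour described above: the cached entry was the only thing keeping the
connection alive.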

I hope Red Hat fixes this issue soon.



best regards
M.Soysal



On 19.03.2015 17:17, Foraker, Jim wrote:
> Peter,
>       Thanks.  I’ve told our RedHat folks that the IPoIB issue is a high
> priority for us.  Our bug for the qib kernel RDMA issue is 1188417, which
> was closed as a duplicate of
> https://bugzilla.redhat.com/show_bug.cgi?id=1171803.
>
>       Jim
>
> On 3/19/15, 2:04 AM, "Peter Kjellström" <cap at nsc.liu.se> wrote:
>
>> On Wed, 18 Mar 2015 21:08:17 +0000
>> "Foraker, Jim" <foraker1 at llnl.gov> wrote:
>>
>>>       Does "known broken"
>> By "known broken" I meant
>> 1) several sites including ours had to back off to older or patched
>> version to get sanity for IPoIB
>> And
>> 2) We opened a case with Red Hat for this and they've been working on a
>> fix. Our support case number is 01321081, and I suspect the matching
>> bugzilla is bz1159925.
>>
>> work on linux-rdma:
>>   [PATCH V3 FIX For-3.19 0/3] IB/ipoib: Fix multicast join flow
>>   https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg22511.html
>>
>>> mean Mehmet's case where IPoIB dies after an
>>> opensm failover, or broken in other ways?
>> And the failure mode is essentially that islands of connectivity form
>> as the SM is restarted (a secondary symptom is that the ib_ipoib module
>> cannot be unloaded once broken / after sm restart).
>>
>> Here's a step-by-step way that shows the problem on one of our systems
>> (written by a colleague):
>>
>> --- begin example
>>
>> IPoIB does not handle subnet manager restarts.
>>
>> I will show this using an example from yesterday:
>>
>>   n[464-472] ran CentOS 6.5
>>   n[564-572] ran CentOS 6.6
>>
>> The IPoIB interface ib0 was down on all nodes, and we had just restarted
>> OpenSM.
>>
>> Step 1: Bring up IPoIB on 7 nodes running 6.5 and 7 nodes running 6.6:
>>
>>     # pdsh -w "n[564-570],n[464-470]" ifup ib0
>>
>> Step 2: Verify connectivity
>>
>>   All nodes can ping all other nodes:
>>   
>>     # pdsh -w "n[564-570],n[464-470]" coping -o -e
>> "ni[564-570],ni[464-470]"|pshbak -c
>>     ----------------
>>     n[464-470,564-570]
>>     ----------------
>>     2014-12-11 15:03:54  ni[464-470,564-570]  initially up
>>
>> Step 3: Restart OpenSM
>>
>> Step 4: Verify connectivity again:
>>
>>   Still OK:
>>     # pdsh -w "n[564-570],n[464-470]" coping -o -e
>> "ni[564-570],ni[464-470]"|pshbak -c
>>     ----------------
>>     n[464-470,564-570]
>>     ----------------
>>     2014-12-11 15:07:01  ni[464-470,564-570]  initially up
>>
>> Step 5: Start IPoIB on 4 additional nodes (two 6.5 and two 6.6):
>>
>>     # pdsh -w "n[571-572,471,472]" ifup ib0
>>
>> Step 6: Verify connectivity:
>>
>>   Broken:
>>   * 6.6 nodes started in Step 1 can still ping all nodes from Step 1,
>>     but not the nodes started in Step 5.
>>   * 6.5 nodes started in Step 1 can ping everything.
>>   * Nodes from Step 5 can ping each other, but only the 6.5 nodes from
>>     Step 1, not the 6.6.
>>
>>     [root at trio yum.repos.d]# pdsh -w "n[564-572],n[464-472]" coping -o -e
>> "ni[564-572],ni[464-472]"|sort|pshbak -c
>>     ----------------
>>     n[564-570]
>>     ----------------
>>     2014-12-11 15:08:24  ni[464-470,564-570]  initially up
>>     2014-12-11 15:08:24  ni[471-472,571-572]  initially DOWN
>>     ----------------
>>     n[464-470]
>>     ----------------
>>     2014-12-11 15:08:24  ni[464-472,564-572]  initially up
>>     ----------------
>>     n[471-472,571-572]
>>     ----------------
>>     2014-12-11 15:08:24  ni[464-472,571-572]  initially up
>>     2014-12-11 15:08:24  ni[564-570]  initially DOWN
>>
>> ---- end example
>>
>>>   The only issue we've seen
>>> with IPoIB in RHEL 6.6 has been a bug with QIB hardware and
>>> kernel-based RDMA (Lustre, SRP). Is there a RHEL bugzilla bug open on
>>> the issue(s)?
>> For bug id and possible bz see beginning of my e-mail.
>>
>> Is there a bz for the QIB bug you mentioned? (we've seen this too
>> and switched to ofed-3.12-1 on that system that required lnet on qib).
>>
>> /Peter
>>
>>>       Jim
>>>
>>>
>>> On 3/18/15, 10:30 AM, "Peter Kjellström" <cap at nsc.liu.se> wrote:
>>>
>>>> On Tue, 17 Mar 2015 15:54:02 +0100
>>>> Mehmet Soysal <mehmet.soysal at kit.edu> wrote:
>>>>
>>>>> Hi,
>>>>> did you solve the problem?
>>>>> We have had a similar issue since an upgrade to RHEL 6.5 or higher.
>>>>>
>>>>> On our nodes IPoIB no longer works after an opensm failover
>>>>> occurs.
>>>> Actually IPoIB is known broken in rhel6 (6.5 zstream 431-x, x > 37
>>>> and for 6.6 all released -504). Redhat knows this and is working on
>>>> a fix (there may be a candidate fix kernel to request). Meanwhile
>>>> we've rebuilt latest -504 with the ipoib from 6.5 (which works fine
>>>> for us).
>>>>
>>>> If you're interested in our -504 pkgs with old/working ipoib contact
>>>> me offlist.
>>>>
>>>> Since last week you also get the additional complication of the verbs
>>>> CVE to take into account when picking a working setup...
>>>>
>>>> /Peter K
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at lists.openfabrics.org
>>>> http://lists.openfabrics.org/mailman/listinfo/users
>>>>
>>>
>>
>

-- 
----------------------------------------------------------------------------
Mehmet Soysal
Scientific Computing and Services (SCS)

Karlsruher Institut für Technologie (KIT)
Steinbuch Centre for Computing (SCC)
Zirkel 2, Gebäude 20.21, Raum 206
D-76131 Karlsruhe
Tel. : +49 721 608-46347
Fax  : +49 721 32550
Email: Mehmet.Soysal at kit.edu
WWW : http://www.scc.kit.edu

KIT - Universität des Landes Baden-Württemberg und
nationales Forschungszentrum in der Helmholtz-Gemeinschaft



