[ewg] bug 1918 - openmpi broken due to rdma-cm changes

Sean Hefty sean.hefty at intel.com
Fri Feb 5 10:42:44 PST 2010


>Is the issue 6f8372b6 ("RDMA/cm: fix loopback address support")?  This
>just went in for 2.6.33, which is still at -rc6, so if we can quickly
>reach a consensus, there is still time to get a fix in for 2.6.33.

That should be the patch in question.  I'm not sure about reaching consensus. :)
If the other changes to the rdma_cm aren't closely tied to that change, we may
be able to back that one patch out until we can get whatever other fix may be
needed.

In my view, openmpi has a bug in that it can pass a loopback address to a remote
peer and expect it to be used to establish a connection.  Steve seems to agree
with this.
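
To illustrate the point about not handing a loopback address to a remote peer,
here is a minimal sketch of the kind of check an application could run before
advertising an address. The helper names addr_is_loopback and
ipv4_str_is_loopback are hypothetical; they are not taken from openmpi or
librdmacm, and only standard socket headers are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of openmpi or librdmacm): returns true if
 * sa is an IPv4 address in 127.0.0.0/8 or the IPv6 loopback address ::1. */
static bool addr_is_loopback(const struct sockaddr *sa)
{
    assert(sa != NULL);

    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)sa;
        /* The top octet of the host-order address is 127 for loopback. */
        return (ntohl(in4->sin_addr.s_addr) >> 24) == 127;
    }
    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
        return IN6_IS_ADDR_LOOPBACK(&in6->sin6_addr);
    }
    return false;
}

/* Hypothetical convenience wrapper for dotted-quad strings.
 * Returns false when the text cannot be parsed as an IPv4 address. */
static bool ipv4_str_is_loopback(const char *text)
{
    struct sockaddr_in in4;

    memset(&in4, 0, sizeof(in4));
    in4.sin_family = AF_INET;
    if (inet_pton(AF_INET, text, &in4.sin_addr) != 1)
        return false;
    return addr_is_loopback((const struct sockaddr *)&in4);
}
```

An application could skip any local address for which addr_is_loopback returns
true when building the list it sends to remote peers.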

My original intent was to allow the use of the loopback address with the
rdma_cm, i.e. 127.0.0.1 would mean 'this host' rather than 'software loopback'.
I just had Arlin run a quick test with OFED 1.4 over IB: it allows binding to
127.0.0.1, but never forms connections.  That is, ucmatose -b 127.0.0.1
succeeds in listening, but ucmatose -s 127.0.0.1 fails to connect because of a
route error.  (Hmm... I'm still confused about what openmpi is doing then.)

Even if an application were to use non-loopback IP addresses, there's no
guarantee of forming a connection if those addresses map to an iWARP device.
So, even if the rdma_cm were to fail binding to 127.0.0.1 unless some RDMA
device (software or hardware - not sure why we care) is capable of supporting
it, an application would still need to deal with failures from
rdma_resolve_addr.

Indicating loopback through a device capability flag seems like the right
approach, and the rdma_cm can use this to fail rdma_bind_addr/rdma_resolve_addr
calls.  That's probably not a trivial patch, however.
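
As a rough model of the capability-flag idea, the sketch below shows how a
loopback bind or resolve could be rejected when no device advertises loopback
support. Everything here is hypothetical: SKETCH_DEV_CAP_LOOPBACK, struct
sketch_device and sketch_check_loopback are illustrative names, not existing
rdma_cm or verbs definitions, and the choice of -EADDRNOTAVAIL as the error is
an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability bit; no such flag is confirmed to exist. */
#define SKETCH_DEV_CAP_LOOPBACK (1u << 0)

/* Hypothetical stand-in for an RDMA device and its capability mask. */
struct sketch_device {
    const char *name;
    unsigned int caps;
};

/* Returns 0 if the bind/resolve may proceed, or -EADDRNOTAVAIL if the
 * address is loopback and no device in devs[0..n) advertises support. */
static int sketch_check_loopback(const struct sketch_device *devs, size_t n,
                                 bool addr_is_loopback)
{
    assert(devs != NULL || n == 0);

    if (!addr_is_loopback)
        return 0;   /* non-loopback addresses are not affected here */

    for (size_t i = 0; i < n; i++) {
        if (devs[i].caps & SKETCH_DEV_CAP_LOOPBACK)
            return 0;
    }
    return -EADDRNOTAVAIL;
}
```

In this model the same check would sit in front of both the bind and the
resolve paths, so an application sees the failure early instead of at connect
time.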

- Sean



