[ewg] bug 1918 - openmpi broken due to rdma-cm changes

Jeff Squyres jsquyres at cisco.com
Fri Feb 5 09:57:55 PST 2010


On Feb 5, 2010, at 11:16 AM, Steve Wise wrote:

> > Note that it is highly unlikely that we will release open mpi 1.4.2 in
> > time for ofed 1.5.1.
> 
> Jeff, there is no way to handle high priority bug fixes in the current
> released stream?

We have 1.4.2 cooking, but it's not ready yet.  

I'll take it back to the OMPI community to see if they want to do a high-priority release, but I'm not excited about it (see below).

> > Also note that trying to bind rdma cm to all interface ip addresses
> > was the way that we were advised by openfabrics to figure out which
> > devices are rdma-capable.
> >
> > As such, it is highly desirable to get the fix transparently in rdmacm
> > and preserve the old semantic. More specifically, it seems undesirable
> > to change this semantic in a minor ofed point release.
> 
> I agree that we should probably not allow 127.0.0.1 binds in ofed-1.5.1
> at all because it regresses OpenMPI.  Even with IB systems, if the bind
> to 127.0.0.1 succeeds, then OpenMPI assumes 127.0.0.1 is bound to that
> rdma interface and advertises this address to its peer as an address
> to which that peer can rdma connect!  This will break IB clusters too,
> not just T3/iWARP clusters.  While I think OpenMPI needs to skip
> 127.0.0.1 in its logic, I think we should probably defer allowing
> 127.0.0.1 binds until ofed-1.6.

I agree that Open MPI should not advertise 127.0.0.1 to peers.  However, the logic that we were advised to use was to attempt an RDMA CM bind on each IP address.  If the bind succeeds, then the address belongs to an RDMA-capable device and is therefore advertisable.  The rationale was that 127.0.0.1 (really, any loopback address) is *not* an RDMA device, and therefore the RDMA CM bind should *never* succeed on it.  Hence, it wasn't necessary to add an "is this a loopback address?" check to the logic.

I guess I don't understand why that rationale is now incorrect -- 127.0.0.1 is still not an RDMA-capable device, right?
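
A minimal sketch of the probing logic described above, written in Python for brevity (Open MPI itself is C). The function name probe_rdma_addresses, the bind_succeeds callback, and the sample addresses are all hypothetical; the callback merely stands in for an RDMA CM bind attempt and does not call librdmacm. It shows how an unchanged probe starts returning 127.0.0.1 once loopback binds are allowed to succeed.

```python
from typing import Callable, Iterable, List


def probe_rdma_addresses(addrs: Iterable[str],
                         bind_succeeds: Callable[[str], bool]) -> List[str]:
    """Treat every address whose bind attempt succeeds as RDMA-capable.

    bind_succeeds is a hypothetical stand-in for an RDMA CM bind attempt.
    """
    return [addr for addr in addrs if bind_succeeds(addr)]


# Hypothetical addresses configured on one host.
local_addrs = ["127.0.0.1", "10.0.0.5", "192.168.1.7"]
# Assumption for illustration: only this address sits on an RDMA device.
rdma_addrs = {"10.0.0.5"}


def old_bind(addr: str) -> bool:
    # Old semantic: a bind succeeds only on an RDMA device's address.
    return addr in rdma_addrs


def new_bind(addr: str) -> bool:
    # Changed semantic: a bind on a loopback address also succeeds.
    return addr in rdma_addrs or addr.startswith("127.")


old_result = probe_rdma_addresses(local_addrs, old_bind)
new_result = probe_rdma_addresses(local_addrs, new_bind)
```

Under these assumptions, old_result contains only 10.0.0.5, while new_result also contains 127.0.0.1, which would then be advertised to peers.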

> But Jeff, note that if someone uses the upstream kernel and OpenMPI, it's
> busted...
> 
> So I recommend:
> 
> 1) Don't allow 127.0.0.1 binds in ofed-1.5.1
> 
> 2) Fix OpenMPI ASAP to never advertise 127.0.0.1 as a valid rdma-cm
> connect address (get it in ofed-1.5.2 or ofed-1.6).

We can add this logic (because I understand that some upstream kernels now allow binding to loopback addresses), but I'm still confused (in principle) as to why it should be necessary.
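
The extra check could look roughly like the sketch below, again in Python rather than Open MPI's C. The names is_loopback and filter_advertisable and the bind_succeeds callback are hypothetical, and the callback is only a stand-in for an RDMA CM bind attempt. The loopback test uses the standard ipaddress module, which covers all of 127.0.0.0/8 and the IPv6 address ::1.

```python
import ipaddress
from typing import Callable, Iterable, List


def is_loopback(addr: str) -> bool:
    """Return True for any IPv4 or IPv6 loopback address."""
    return ipaddress.ip_address(addr).is_loopback


def filter_advertisable(addrs: Iterable[str],
                        bind_succeeds: Callable[[str], bool]) -> List[str]:
    """Skip loopback addresses first, then keep addresses that bind.

    bind_succeeds is a hypothetical stand-in for an RDMA CM bind attempt.
    """
    return [addr for addr in addrs
            if not is_loopback(addr) and bind_succeeds(addr)]


def permissive_bind(addr: str) -> bool:
    # Worst case for illustration: every bind succeeds, loopback included.
    return True


# Hypothetical addresses configured on one host.
candidates = ["127.0.0.1", "127.0.1.1", "::1", "10.0.0.5"]
advertised = filter_advertisable(candidates, permissive_bind)
```

Checking for loopback before attempting the bind means the result does not depend on whether the kernel permits loopback binds.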

Can you clarify which kernel versions allow binding to loopback addresses with RDMA CM?

-- 
Jeff Squyres <jsquyres at cisco.com>
Cisco.com - http://www.cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/



