[OMPI devel] [ofa-general] OpenMPI and RDMA-CM

Jeff Squyres jsquyres at cisco.com
Wed May 9 04:51:07 PDT 2007


On May 9, 2007, at 1:37 AM, Or Gerlitz wrote:

> Doing a bit of zoom out from the "how to make ofed's udapl work for  
> ompi" thread, my thinking is that the ompi udapl btl enablement is  
> actually only the first step, where for production/longterm/etc you  
> want to have an rdmacm btl.

I think this is a bit of a misunderstanding.  The "BTL" in Open MPI
is a byte transfer layer; it is a point-to-point abstraction for
moving bytes between two processes.  BTL components (read: plugins)
are typically distinguished by the underlying protocols used.  For
example, we have an RC verbs-based BTL and we have a separate
uDAPL-based BTL.  Andrew is also working on a research-quality UD
verbs-based BTL.

Hence, how a particular BTL component makes connections between  
process peers is really a side-effect of moving bytes around, and not  
the focus of the BTL.  So having a "rdmacm" BTL doesn't really make  
sense.  If both the RC and UD verbs-based BTLs someday use the RDMA  
CM for connections, we might abstract the connection management out  
to a common piece of code between the two.  But that's a different  
issue.  If we end up having a mixed BTL someday that uses both RC and  
UD, then the need for the common code may go away.  But that's in the  
future.
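
To make the distinction above concrete, here is a minimal sketch in
Python.  Every name in it (Btl, rdma_cm_connect, other_connect) is
hypothetical and chosen only for illustration; none of it is Open
MPI's actual BTL interface.  It assumes a BTL is identified by its
transport protocol, and that connection setup is a pluggable detail
that several BTLs could share.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical names throughout; this is not Open MPI's real API.

# A connector maps (local_rank, peer_rank) to a connection handle.
Connector = Callable[[int, int], str]


def rdma_cm_connect(local: int, peer: int) -> str:
    """Stand-in for a connection established through the RDMA CM."""
    return f"rdmacm:{local}->{peer}"


def other_connect(local: int, peer: int) -> str:
    """Stand-in for any other connection-setup mechanism."""
    return f"other:{local}->{peer}"


@dataclass
class Btl:
    name: str            # identifies the transport protocol
    connect: Connector   # how peers get connected is a swappable detail
    local_rank: int = 0
    _conns: Dict[int, str] = field(default_factory=dict, init=False)

    def send(self, peer: int, payload: bytes) -> Tuple[str, int]:
        # Connect lazily: the connection is a side effect of moving bytes.
        if peer not in self._conns:
            self._conns[peer] = self.connect(self.local_rank, peer)
        return self._conns[peer], len(payload)


# Two different BTLs can share one piece of connection code ...
rc = Btl("rc-verbs", rdma_cm_connect)
ud = Btl("ud-verbs", rdma_cm_connect)
# ... while a third BTL uses something else entirely.
udapl = Btl("udapl", other_connect)
```

In this sketch the RDMA CM shows up only as a connector shared by two
BTLs, never as a BTL of its own, which is the shape of the "common
piece of code" idea described above.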

> Reasoning here is made of many arguments, among them the quickest I
> can make are:
>
> A) it seems that ompi would want to use not only RC but also UD
> multicast and unicast, which are not covered by udapl
>
> B) there's actually no real justification to maintain two APIs
> (namely udapl vs libibverbs/librdmacm), so down the road, only one
> of them would survive (udapl is implemented ***over***
> libibverbs/librdmacm, so if the latter dies, so does udapl).
> Specifically, I hear here and there that the OFED stack is now on
> its way to being deployed all over the place, specifically in
> commercial Unix OSs (which want modern! code that supports
> IPoIB-CM, RDS, SRP, iSER, etc., you name it), so eventually the
> rdmacm btl can also be used over Solaris et al.

I think that's not quite the point.

1. A piece of history: the uDAPL BTL was originally developed by a  
grad student just as an excuse to learn the BTL interface and OMPI  
internals.  We already had an RC verbs-based BTL at the time.

2. When Sun joined Open MPI, they took over the development and  
maintenance of the uDAPL BTL because uDAPL is the only high  
performance stack on Solaris.

3. It's fine that Sun will someday support the same verbs interface
that OFED does.  But *today*, they don't.  So for their current
customers, they need to support uDAPL.  As such, we have done
little/no testing of uDAPL on OFED since Sun took over the uDAPL
BTL -- all testing since that point has been on Solaris uDAPL.  All
of our Linux/OFED efforts have been on the verbs interface.

4. The Open MPI focus on uDAPL over OFED at the moment is simply to
jump-start iWARP testing.  Both NetEffect and Chelsio have chimed in
to say that they will do the RDMA CM work for Open MPI, but uDAPL
can serve as a temporary workaround that is usable [effectively]
immediately while they get up to speed on the Open MPI code base and
do the RDMA CM work.

-- 
Jeff Squyres
Cisco Systems



