[OMPI devel] OMPI over OFA udapl (was Re: [ofa-general] OpenMPI and RDMA-CM)

Donald Kerr Don.Kerr at Sun.COM
Tue May 8 13:21:18 PDT 2007



Steve Wise wrote:

>On Tue, 2007-05-08 at 13:57 -0400, Andrew Friedley wrote:
>  
>
>>Steve Wise wrote:
>>    
>>
>>>>Well I've tried OMPI on ofed-1.2 udapl today and it doesn't work.  I'm
>>>>debugging now.
>>>>
>>>>        
>>>>
>>>Here's part of the problem (from ompi/btl/udapl/btl_udapl.c):
>>>
>>>    /* TODO - big bad evil hack! */
>>>    /* uDAPL doesn't ever seem to keep track of ports with addresses.  This
>>>       becomes a problem when we use dat_ep_query() to obtain a remote address
>>>       on an endpoint.  In this case, both the DAT_PORT_QUAL and the sin_port
>>>       field in the DAT_SOCK_ADDR are 0, regardless of the actual port. This is
>>>       a problem when we have more than one uDAPL process per IA - these
>>>       processes will have exactly the same address, as the port is all
>>>       we have to differentiate who is who.  Thus, our uDAPL EP -> BTL EP
>>>       matching algorithm will break down.
>>>
>>>       So, we insert the port we used for our PSP into the DAT_SOCK_ADDR for
>>>       this IA.  uDAPL then conveniently propagates this to where we need it.
>>>     */
>>>    ((struct sockaddr_in*)attr.ia_address_ptr)->sin_port = htons(port);
>>>    ((struct sockaddr_in*)&btl->udapl_addr.addr)->sin_port = htons(port);
>>>
>>>The OMPI code stuffs the port chosen by udapl for a listening endpoint
>>>into the ia address memory (which is owned by the udapl layer btw).
>>>There's a slight problem with that:  The OFA udapl openib_cma code binds
>>>cm_id's to this ia_address regularly.  When an hca is opened, a cm_id is
>>>bound to this address to obtain the local hca port number and gid that
>>>is being used.  In addition, a cm_id is bound to this address each time
>>>an endpoint is created (either at ep_create time or ep_connect time).
>>>So that ia_address field is used by the dapl cm to create local
>>>cm_ids...  Since the port was always zero, the rdma-cma would choose a
>>>unique port for each cm_id bound to that address.
>>>
>>>But once OMPI sets the port field to non-zero, the rdma_cma fails all
>>>the subsequent rdma_bind_addr() calls since the port is already in use.
>>>
>>>Perhaps this hack really is a workaround for a DAPL bug where somebody's
>>>DAPL wasn't tracking port numbers correctly?
>>>      
>>>
>>Yep. My memory is dim, but I think that was OFED's DAPL, or it was in 
>>the generic part of DAPL that all implementations seem to share.
>>
>>As hinted by the comment (I wrote it by the way), I think the best 
>>solution would be if dat_ep_query() returned the port number correctly.
>>Most of uDAPL seems to just pass around pointers to internal data
>>structures (which I'm not sure is the best idea in the world), so it 
>>didn't seem like a trivial fix to me at the time.  I remember 
>>considering reporting this as a bug, but I didn't because the uDAPL 
>>standard didn't seem to enforce any requirements on passing the port 
>>number around with the address, so it technically wasn't wrong.
>>
>>Was the OFED uDAPL code switched from something else to RDMA CM at some 
>>point?  I'm almost certain I was running fine on OFED's uDAPL at one 
>>point (in fact, a lot of the uDAPL BTL development I did was using the 
>>OFED stack).
>>    
>>
>
>Yes, the OFA uDAPL was changed from using the ib-cm to the rdma-cm a
>while back.  Perhaps you ran on the ib-cm version?  And, the rdma-cma
>started using port numbers and enforcing uniqueness even more recently I
>think.
>
>Perhaps Don Kerr has some insight on how the Sun uDAPL behaves?  Should
>OMPI still need this hack?
>  
>
From what I recall (and Andrew can set me straight if I get this wrong),
this hack was included because we were not able to pull the remote port
from dat_ep_query(). If dat_ep_query() supplies that data, then we could
probably do away with the hack.
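
As a rough illustration of the bind behavior Steve describes, here is a
minimal sketch using plain TCP sockets on loopback rather than the
rdma_cm API. It is only an analogy: bind_loopback() is a hypothetical
helper, not code from OMPI or uDAPL, and I am assuming rdma_bind_addr()
treats port 0 and an already-used port the way bind() does, per Steve's
description.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Binds a new TCP socket to 127.0.0.1:port (port in host byte order).
 * On success, stores the descriptor in *fd_out and returns the port that
 * was actually assigned (host byte order). On failure, returns -1. */
static int bind_loopback(unsigned short port, int *fd_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);   /* 0 lets the stack pick a free port */

    /* A non-zero port that another socket already holds makes bind() fail. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *fd_out = fd;
    return ntohs(addr.sin_port);
}
```

Calling bind_loopback(0, &fd) twice should yield two different ports,
while passing the first returned port back in should fail (typically
with EADDRINUSE) as long as the first socket is still open. That is the
same shape as the failure once a non-zero port is written into the
shared ia_address.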

I have not heard back from the developer at Sun who implemented uDAPL
for Solaris. My guess is that it was also based on the older ib-cm, but
I will confirm. A while back I submitted a bug against Solaris uDAPL
asking it to provide the port via dat_ep_query(), and it looks like it
has been fixed. I just have not tested the fix because we weren't using
it.

-DON

>
>Steve.
>
>  
>


