[OMPI devel] OMPI over OFA udapl (was Re: [ofa-general] OpenMPI and RDMA-CM)

Steve Wise swise at opengridcomputing.com
Tue May 8 12:52:59 PDT 2007


On Tue, 2007-05-08 at 13:57 -0400, Andrew Friedley wrote:
> Steve Wise wrote:
> >> Well I've tried OMPI on ofed-1.2 udapl today and it doesn't work.  I'm
> >> debugging now.
> >>
> > 
> > Here's part of the problem (from ompi/btl/udapl/btl_udapl.c):
> > 
> >     /* TODO - big bad evil hack! */
> >     /* uDAPL doesn't ever seem to keep track of ports with addresses.  This
> >        becomes a problem when we use dat_ep_query() to obtain a remote address
> >        on an endpoint.  In this case, both the DAT_PORT_QUAL and the sin_port
> >        field in the DAT_SOCK_ADDR are 0, regardless of the actual port. This is
> >        a problem when we have more than one uDAPL process per IA - these
> >        processes will have exactly the same address, as the port is all
> >        we have to differentiate who is who.  Thus, our uDAPL EP -> BTL EP
> >        matching algorithm will break down.
> > 
> >        So, we insert the port we used for our PSP into the DAT_SOCK_ADDR for
> >        this IA.  uDAPL then conveniently propagates this to where we need it.
> >      */
> >     ((struct sockaddr_in*)attr.ia_address_ptr)->sin_port = htons(port);
> >     ((struct sockaddr_in*)&btl->udapl_addr.addr)->sin_port = htons(port);
> > 
> > The OMPI code stuffs the port chosen by udapl for a listening endpoint
> > into the ia address memory (which is owned by the udapl layer btw).
> > There's a slight problem with that:  The OFA udapl openib_cma code binds
> > cm_id's to this ia_address regularly.  When an hca is opened, a cm_id is
> > bound to this address to obtain the local hca port number and gid that
> > is being used.  In addition, a cm_id is bound to this address each time
> > an endpoint is created (either at ep_create time or ep_connect time).
> > So that ia_address field is used by the dapl cm to create local
> > cm_ids...  Since the port was always zero, the rdma-cma would choose a
> > unique port for each cm_id bound to that address.
> > 
> > But once OMPI sets the port field to non-zero, the rdma_cma fails all
> > the subsequent rdma_bind_addr() calls since the port is already in use.
> > 
> > Perhaps this hack really is a workaround for a DAPL bug where somebody's
> > dapl wasn't tracking port numbers correctly?
> 
> Yep. My memory is dim, but I think that was OFED's DAPL, or it was in 
> the generic part of DAPL that all implementations seem to share.
> 
> As hinted by the comment (I wrote it by the way), I think the best 
> solution would be if dat_ep_query() returned the port number correctly.
> Most of uDAPL seems to just pass around pointers to internal data 
> structures (which I'm not sure is the best idea in the world), so it 
> didn't seem like a trivial fix to me at the time.  I remember 
> considering reporting this as a bug, but I didn't because the uDAPL 
> standard didn't seem to enforce any requirements on passing the port 
> number around with the address, so it technically wasn't wrong.
> 
> Was the OFED uDAPL code switched from something else to RDMA CM at some 
> point?  I'm almost certain I was running fine on OFED's uDAPL at one 
> point (in fact, a lot of the uDAPL BTL development I did was using the 
> OFED stack).

Yes, the OFA uDAPL was changed from using the ib-cm to the rdma-cm a
while back.  Perhaps you ran on the ib-cm version?  And, the rdma-cma
started using port numbers and enforcing uniqueness even more recently I
think.
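
For anyone who wants to see the bind behaviour in isolation, here is a
rough sketch using plain TCP sockets on loopback rather than the rdma-cma
itself, so treat it only as an analogy.  I am assuming ordinary socket
bind() semantics (port 0 means "pick a free port", a fixed port that is
already bound gives EADDRINUSE) are close enough to what rdma_bind_addr()
does with the ia_address.  The helper names bind_loopback() and
demo_port_conflict() are made up for this example and are not part of
OMPI or dapl.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a new TCP socket to 127.0.0.1:port, where
 * port is in host byte order and 0 means "let the stack pick a port".
 * On success, store the socket in *fd_out and return the port that was
 * actually bound.  On failure, return -1 with errno set. */
int bind_loopback(unsigned short port, int *fd_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }

    *fd_out = fd;
    return ntohs(addr.sin_port);
}

/* Hypothetical demo: two binds with port 0 get distinct ports, while a
 * third bind that reuses the first port fails with EADDRINUSE.
 * Returns 0 if that is what happened, 1 otherwise. */
int demo_port_conflict(void)
{
    int fd1 = -1, fd2 = -1, fd3 = -1;
    int p1 = bind_loopback(0, &fd1);
    int p2 = bind_loopback(0, &fd2);
    int ok = (p1 > 0 && p2 > 0 && p1 != p2);

    if (ok) {
        /* Same address, port now non-zero and already in use. */
        int p3 = bind_loopback((unsigned short)p1, &fd3);
        ok = (p3 == -1 && errno == EADDRINUSE);
    }

    if (fd1 >= 0) close(fd1);
    if (fd2 >= 0) close(fd2);
    if (fd3 >= 0) close(fd3);
    return ok ? 0 : 1;
}
```

Calling demo_port_conflict() from a small main() should return 0 on a
normal Linux box.  The third bind is the analogue of what happens once the
PSP port has been written into the shared ia_address: every later bind to
that address asks for the same, already taken, port.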

Perhaps Don Kerr has some insight on how the Sun uDAPL behaves?  Should
OMPI still need this hack?


Steve.



