[ofa-general] Re: [OMPI devel] OMPI over ofed udapl - bugs opened

Andrew Friedley afriedle at open-mpi.org
Wed May 9 16:15:51 PDT 2007



Steve Wise wrote:
> There have been a series of discussions on the ofa general list about
> this issue, and the conclusion to date is that it cannot be resolved in
> the rdma-cm or iwarp-cm code of the linux rdma stack.  Mainly because
> sending an RDMA message involves the ULP's work queue and completion
> queue, so the CM cannot do this under the covers in a manner that
> doesn't affect the application.  Thus, the applications must deal with
> this.

Why can't uDAPL deal with this?  As a uDAPL user, I really don't care 
what API uDAPL is using under the hood to move data from one place to 
another, nor the quirks of that API.  The whole point of uDAPL is to 
form a network-agnostic abstraction layer.  AFAIK, the uDAPL spec 
doesn't enforce any such requirement on RDMA communication either.  In 
my opinion, exposing such behavior above uDAPL is incorrect and is part 
of why uDAPL has seen limited adoption -- every uDAPL implementation
behaves differently, making it extremely difficult to write an
application that works on all of them.  Sorry if
this sounds harsh, but this comes from many hours of banging my head on 
the wall due to working around these sorts of problems :)

> 
> Here is a possible solution: 
> 
> I assume in OMPI that connections are only initiated when the mpi
> application does a send operation.   Given that, the udapl btl must
> ensure that if a given rank accepts a connection, it cannot send
> anything until the rank at the other end of the connection sends first.
> Since the other side initiated the connection, it will have pending data
> to send...
> 
> I haven't looked into how painful this will be to implement.
> 
> Thoughts?
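
For concreteness, the rule proposed above (the accepting side holds its
sends until the initiating side has sent first) could be sketched roughly
as below.  This is only an illustration in Python, not Open MPI or uDAPL
code: the Endpoint class, its fields, and the "wire" list standing in for
posted sends are all hypothetical names made up for the sketch.

```python
from collections import deque


class Endpoint:
    """Hypothetical per-connection state; not an Open MPI or uDAPL type."""

    def __init__(self, initiated):
        # True if this side initiated the connection.
        self.initiated = initiated
        # The initiator may send right away; the acceptor must wait
        # until it has received something from the initiator.
        self.may_send = initiated
        # Sends the acceptor has deferred while waiting.
        self.pending = deque()
        # Stand-in for messages actually posted to the network.
        self.wire = []

    def send(self, msg):
        if self.may_send:
            self.wire.append(msg)
        else:
            # Acceptor has not heard from the initiator yet: defer.
            self.pending.append(msg)

    def on_receive(self, msg):
        # The first message from the peer unlocks sending and flushes
        # any deferred sends in their original order.
        if not self.may_send:
            self.may_send = True
            while self.pending:
                self.wire.append(self.pending.popleft())
```

An acceptor built with Endpoint(initiated=False) queues everything passed
to send() until on_receive() is called once; an initiator posts
immediately.  Every upper layer would need to carry this kind of
bookkeeping.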

Following on what I wrote above, I think Open MPI is the wrong place to 
be dealing with this.  There are enough of these hacks as it is; I'm not 
interested in seeing more get added.

Andrew


