[dat-discussions] [openib-general] [RFC] DAT 2.0 immediate data proposal

Caitlin Bestler caitlinb at broadcom.com
Thu Feb 9 11:40:26 PST 2006


openib-general-bounces at openib.org wrote:
> Roland Dreier wrote:
> 
>> 
>> Hmm.  Can you put a number on how much better RDMA write with
>> immediate is on current HCA hardware?  How does using the underlying
>> OpenIB verbs ability to post a list of work requests compare (ie
>> posting an RDMA write followed by a send in one verbs call)?
>> Maybe "post multiple" is a better direction for DAT.
>> 
>> 
> With post multiple, unlike immediate data, you don't have the
> ability to distinguish between a normal receive and a rdma
> write completion indication on the other end. This is the
> uniqueness of the service that cannot be provided by the post
> multiple. Yes, post multiple would be a nice option for DAT
> it is just a different service. It would also be required to
> conform to the semantics rules of the bundled operations so
> you could not do any optimization tricks under the covers
> with an IB rdma_write_immediate operation.
> 

A post_multiple also requires defining a single "DTO" data
structure. If the post multiple is atomic (meaning either all
of the operations are posted or none are) then it requires an
intermediate data structure to have been created. If it is not
atomic, there really isn't a reason for it not to be just a
utility function layered above DAT.
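
To illustrate the non-atomic case, here is a minimal sketch of such a
utility function. It is not the DAT API: example_dto, example_post_fn,
example_post_multiple and example_stub_post are all hypothetical names
invented for this illustration, and the single-post call is passed in
as a callback so that nothing here assumes a real provider signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one data transfer operation (DTO).
 * These are NOT the DAT_* types; the names are made up for illustration. */
typedef struct example_dto {
    int kind;          /* assumed encoding: 0 = RDMA write, 1 = send */
    const void *buf;
    size_t len;
} example_dto;

/* Hypothetical single-post callback: returns 0 on success. */
typedef int (*example_post_fn)(void *endpoint, const example_dto *dto);

/* Non-atomic "post multiple": post each DTO in order and stop at the
 * first failure. Returns the number of DTOs successfully posted; if
 * err is non-NULL it receives 0 or the failing callback's return code. */
size_t example_post_multiple(void *endpoint, example_post_fn post,
                             const example_dto *dtos, size_t count, int *err)
{
    size_t i;

    assert(post != NULL);
    if (err)
        *err = 0;
    for (i = 0; i < count; i++) {
        int rc = post(endpoint, &dtos[i]);
        if (rc != 0) {
            if (err)
                *err = rc;
            break;
        }
    }
    return i;
}

/* Stub callback for trying the helper out: counts posts in the int that
 * `endpoint` points to, and fails (returns -1) for a negative kind. */
int example_stub_post(void *endpoint, const example_dto *dto)
{
    if (dto->kind < 0)
        return -1;
    ++*(int *)endpoint;
    return 0;
}
```

Because the loop can stop partway through, the caller has to handle a
partial post. That is exactly the property that separates this from an
atomic post multiple, which would need the intermediate structure.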

What I'm not seeing with immediate data is an urgent need
for the application to use the same 32-bit value both as
an immediate and as a 4-byte message, urgent enough to
require an entire additional API just to support it.  Why
can't the application just add a bool to the send message?
Or encode the 32 bits so that they come from disjoint
domains?
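
A rough sketch of both alternatives, assuming nothing about any real
application's message layout. Every name below (example_msg,
EXAMPLE_WRITE_DONE_BIT, example_encode and so on) is hypothetical, and
the choice of the top bit as the discriminator is just one possible
way to split the domain.

```c
#include <assert.h>
#include <stdint.h>

/* Option 1: carry an explicit flag next to the 32-bit value in the
 * send message (hypothetical layout). */
typedef struct example_msg {
    uint8_t  follows_write;  /* 1 = this send announces a completed RDMA write */
    uint32_t value;
} example_msg;

/* Option 2: split the 32-bit space into two disjoint domains.
 * Assumption: the top bit is reserved as the discriminator, so the
 * application payload is limited to 31 bits. */
#define EXAMPLE_WRITE_DONE_BIT 0x80000000u

/* Encode a payload, marking whether it announces a completed write. */
uint32_t example_encode(uint32_t payload, int follows_write)
{
    /* The payload must not already use the reserved bit. */
    assert((payload & EXAMPLE_WRITE_DONE_BIT) == 0);
    return follows_write ? (payload | EXAMPLE_WRITE_DONE_BIT) : payload;
}

/* Nonzero if the received value came from the "write done" domain. */
int example_is_write_done(uint32_t wire)
{
    return (wire & EXAMPLE_WRITE_DONE_BIT) != 0;
}

/* Recover the 31-bit payload from either domain. */
uint32_t example_payload(uint32_t wire)
{
    return wire & ~EXAMPLE_WRITE_DONE_BIT;
}
```

Option 2 costs one bit of payload; option 1 costs a slightly larger
message. Either way the receiver can tell the two cases apart from the
message contents alone.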

There seems to be agreement that a consolidated write-and-send
call would enable the application to get the benefits of
rdma write with immediate whenever the application could
distinguish the two.

I cannot see why doing this would not be almost free for
virtually all applications, and trivial for the remainder. Adding
and documenting an extra call to deal with such an
extreme corner case that is being presented only in
the abstract is just not justified. This extra capability
has to have enough functionality for enough applications
to justify keeping it on the books, writing test cases
for it, etc.

We already made a similar decision in having a 128-bit
IA Address. That means we cannot support a host that
interfaces both to the Internet with IPv6 and to an
InfiniBand network that not only has global GIDs, but has
also assigned a global subnetwork a network ID that is
already in use as a valid public IPv6 network.

The complexity of dealing with an IA Address that was
128+1 bits was simply not justified to deal with
an extreme corner case that could very easily be
avoided (there is no shortage of "site local" network
IDs in the IPv6/GID format, so using a global network
prefix that was disjoint from the official IPv6 
hierarchy would be just plain silly).

So far I haven't seen any explanation as to why an
application has a need to encode this 33rd bit of
their message in this terribly transport-specific
manner. Is there some severe performance penalty
to slightly restructuring the send message so that
it is no longer ambiguous with the immediate data?



