[dat-discussions] [openib-general] [RFC] DAT 2.0 immediate data proposal

Thu Feb 9 12:50:40 PST 2006

openib-general-bounces at openib.org wrote:
>>>> Hmm.  Can you put a number on how much better RDMA write with
>>>> immediate is on current HCA hardware?  How does using the
>>>> underlying OpenIB verbs ability to post a list of work requests
>>>> compare (ie posting an RDMA write followed by a send in one verbs
>>>> call)? Maybe "post multiple" is a better direction for DAT.
>>>> 
>>>> 
>>> With post multiple, unlike immediate data, you don't have the
>>> ability to distinguish between a normal receive and an RDMA write
>>> completion indication on the other end. This is the uniqueness of
>>> the service that cannot be provided by the post multiple. Yes, post
>>> multiple would be a nice option for DAT, but it is just a different
>>> service. It would also be required to conform to the semantics
>>> rules of the bundled operations so you could not do any
>>> optimization tricks under the covers with an IB
>>> rdma_write_immediate operation. 
>>> 
>> 
>> A post_multiple also requires defining a single "DTO" data structure.
>> If the post multiple is atomic (meaning all make it or none do) then
>> it requires an intermediate data structure to have been created. If
>> it is not atomic there really isn't a reason for it not to just be a
>> utility function layered above DAT.
> 
> That is a very good point.  And since the emulated immediate
> data service can't make the atomic guarantee it is the killer
> argument for just making the service plain - a potentially more
> efficient write/send. 
> 
>> 
>> What I'm not seeing with the immediate is this urgent need by the
>> application to be able to use the same 32-bit value for both an
>> immediate and a 4 byte message that requires an entire additional API
>> just to support it.  Why can't the application just add a bool to
>> the send message? Or encode the 32-bits so that they come from
>> disjoint domains? 
> 
> Some applications can do as you suggest.  Some applications
> can make good use of unambiguous indications where the buffer
> size, content, or arrival timing is not constrained.  Some
> don't need write notification at all.  What's your point?
> 
>> 
>> There seems to be agreement that a consolidated write-and-send call
>> would enable the application to get the benefits of rdma write with
>> immediate whenever the application could distinguish the two.
> 
> Well, I think there is agreement that *some* applications can
> use write-and-send in a beneficial way.  But then again,
> nothing prevents them from doing that now.  They do not need
> an additional API.  But again, I don't have an issue with
> defining a helper function.  I do have an issue with defining
> an API and semantic that says the target side needs to be
> coded in a way to always deal with both "true" immediate data
> and emulation.  Just define a write/send helper API and the
> UPL can be coded in a consistent manner if that is a
> beneficial service.  If a true unambiguous indication service
> is more beneficial or required, it can use the extension and
> accept the extra complexity.  To demand extra complexity in
> applications that obviously don't need the true immediate
> data semantic is just wrong in my opinion.
> 
>> 
>> I cannot see why doing this is not almost free for virtually all
>> applications, and trivial for the remainder. Adding and documenting
>> an extra call to deal with such an extreme corner case that is being
>> presented only in the abstract is just not justified. This extra
>> capability has to have enough functionality for enough applications
>> to justify keeping it on the books, writing test cases for it, etc.
> 
> All we're asking is that a write/send combined API not be
> called immediate data unless it fits the semantics of
> immediate data.  I am puzzled at the resistance this is
> getting.  There is a standards body specification for
> immediate data.  If it is not followed, don't call it
> immediate data.  It's that simple.  For those transports that
> can provide the service, the UPL may be able to gain access to it
> through an extension. 
> 

I have no objection to calling this
"dat_ep_post_rdma_write_with_notifier"
and labelling the 32-bit data as a "notifier tag".

Even on iWARP transports small send data can be in-lined,
avoiding the need for buffers to be registered. A special
API where the length of the "send buffer" is known in 
advance makes this even easier.

What I still fail to see is a rationale that works down
from the application layer on why an application would
need still one more page in their cookbook. Creating an
entire new method to enable a strange method of signalling
one bit of information to the other end doesn't seem like
much of a payoff to me.
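
To make the "disjoint domains" alternative from earlier in the thread
concrete, here is a minimal sketch of how an application could reserve
one bit of its own 32-bit value to tell a write notification apart from
an ordinary small message. It does not use any DAT or verbs call; the
names TAG_WRITE_NOTIFY_BIT, tag_encode, tag_is_write_notify and
tag_value are hypothetical and exist only for illustration, as is the
choice of the top bit as the marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed convention (not from any spec): the top bit marks a value
 * that announces a completed RDMA write; the low 31 bits carry the
 * application's own payload. */
#define TAG_WRITE_NOTIFY_BIT UINT32_C(0x80000000)
#define TAG_VALUE_MASK       UINT32_C(0x7FFFFFFF)

/* Build the 32-bit word that the sender places in its small send. */
static uint32_t tag_encode(uint32_t value, bool is_write_notify)
{
    /* The payload must fit in the 31 bits left after the marker. */
    assert((value & ~TAG_VALUE_MASK) == 0);
    return is_write_notify ? (value | TAG_WRITE_NOTIFY_BIT) : value;
}

/* Receiver side: does this word announce a completed write? */
static bool tag_is_write_notify(uint32_t tag)
{
    return (tag & TAG_WRITE_NOTIFY_BIT) != 0;
}

/* Receiver side: recover the application payload. */
static uint32_t tag_value(uint32_t tag)
{
    return tag & TAG_VALUE_MASK;
}
```

The cost of this approach is one bit of the value space, and it only
works when the receiver can inspect the message body on every receive,
which is the constraint the other side of the thread objects to.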