[dat-discussions] [openib-general] [RFC] DAT 2.0 immediate data proposal
Caitlin Bestler
caitlinb at broadcom.com
Thu Feb 9 12:50:40 PST 2006
openib-general-bounces at openib.org wrote:
>>>> Hmm. Can you put a number on how much better RDMA write with
>>>> immediate is on current HCA hardware? How does using the
>>>> underlying OpenIB verbs ability to post a list of work requests
>>>> compare (ie posting an RDMA write followed by a send in one verbs
>>>> call)? Maybe "post multiple" is a better direction for DAT.
>>>>
>>>>
>>> With post multiple, unlike immediate data, you don't have the
>>> ability to distinguish between a normal receive and a rdma write
>>> completion indication on the other end. This is the uniqueness of
>>> the service that cannot be provided by the post multiple. Yes, post
>>> multiple would be a nice option for DAT; it is just a different
>>> service. It would also be required to conform to the semantics
>>> rules of the bundled operations so you could not do any
>>> optimization tricks under the covers with an IB
>>> rdma_write_immediate operation.
>>>
>>
>> A post_multiple also requires defining a single "DTO" data structure.
>> If the post multiple is atomic (meaning all make it or none do) then
>> it requires an intermediate data structure to have been created. If
>> it is not atomic there really isn't a reason for it to not just be a
>> utility function layered above DAT.
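To make the non-atomic case concrete, here is a minimal sketch of a "post multiple" helper layered on top of a single-post call. Everything in it is hypothetical and for illustration only: `work_req`, `post_one_fn`, `post_multiple`, and the toy `mock_ep` endpoint are not DAT or OpenIB names. The helper posts requests in order and stops at the first failure, so earlier requests stay posted, which is exactly why it offers no all-or-nothing guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work request; a stand-in for a real DTO descriptor. */
typedef struct {
    int    opcode;   /* illustrative: 0 = RDMA write, 1 = send */
    size_t length;   /* payload length in bytes */
} work_req;

/* Hypothetical single-post entry point; returns 0 on success. */
typedef int (*post_one_fn)(void *ep, const work_req *wr);

/*
 * Non-atomic "post multiple": post the requests in order and stop at
 * the first failure. Returns how many were posted; *status receives
 * the failing return code, or 0 if every request was accepted.
 */
static size_t post_multiple(void *ep, post_one_fn post_one,
                            const work_req *wrs, size_t count, int *status)
{
    size_t posted = 0;

    assert(post_one != NULL && status != NULL);
    *status = 0;
    while (posted < count) {
        int rc = post_one(ep, &wrs[posted]);
        if (rc != 0) {
            *status = rc;   /* earlier requests remain posted */
            break;
        }
        posted++;
    }
    return posted;
}

/* Toy endpoint with a bounded queue, used only to exercise the helper. */
typedef struct {
    size_t capacity;
    size_t used;
} mock_ep;

static int mock_post(void *ep, const work_req *wr)
{
    mock_ep *m = ep;

    (void)wr;
    if (m->used >= m->capacity)
        return -1;          /* queue full */
    m->used++;
    return 0;
}
```

With a two-slot `mock_ep` and three requests, the helper posts two and reports the failure of the third; the caller has to decide what to do with the partial result.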
>
> That is a very good point. And since the emulated immediate
> data service can't make the atomic guarantee it is the killer
> argument for just making the service plain - a potentially more
> efficient write/send.
>
>>
>> What I'm not seeing with the immediate is this urgent need by the
>> application to be able to use the same 32-bit value for both an
>> immediate and a 4 byte message that requires an entire additional API
>> just to support it. Why can't the application just add a bool to
>> the send message? Or encode the 32-bits so that they come from
>> disjoint domains?
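One way to read the "disjoint domains" suggestion is to reserve a single bit of the 32-bit word as a marker. The sketch below assumes the application can live with 31 bits of payload; the names `NOTIFY_FLAG`, `encode_notify`, `encode_message`, `is_notify`, and `payload_of` are made up for illustration and are not part of any DAT or verbs API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the top bit marks a write notification, the low 31 bits
 * carry the value. Any other disjoint split would work equally well. */
#define NOTIFY_FLAG  UINT32_C(0x80000000)
#define PAYLOAD_MASK UINT32_C(0x7FFFFFFF)

/* Encode a write-notification tag; the tag must fit in 31 bits. */
static uint32_t encode_notify(uint32_t tag)
{
    assert((tag & NOTIFY_FLAG) == 0);
    return tag | NOTIFY_FLAG;
}

/* Encode an ordinary 4-byte message; the value must fit in 31 bits. */
static uint32_t encode_message(uint32_t value)
{
    assert((value & NOTIFY_FLAG) == 0);
    return value;
}

/* Receiver side: was this word sent as a write notification? */
static bool is_notify(uint32_t word)
{
    return (word & NOTIFY_FLAG) != 0;
}

/* Receiver side: recover the 31-bit value in either case. */
static uint32_t payload_of(uint32_t word)
{
    return word & PAYLOAD_MASK;
}
```

The cost is one bit of range. An application that truly needs all 32 bits for both kinds of message cannot use this trick, which is the case the objections in this thread are about.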
>
> Some applications can do as you suggest. Some applications
> can make good use of unambiguous indications where the buffer
> size, content, or arrival timing is not constrained. Some
> don't need write notification at all. What's your point?
>
>>
>> There seems to be agreement that a consolidated write-and-send call
>> would enable the application to get the benefits of rdma write with
>> immediate whenever the application could distinguish the two.
>
> Well, I think there is agreement that *some* applications can
> use write-and-send in a beneficial way. But then again,
> nothing prevents them from doing that now. They do not need
> an additional API. But again, I don't have an issue with
> defining a helper function. I do have an issue with defining
> an API and semantic that says the target side needs to be
> coded in a way to always deal with both "true" immediate data
> and emulation. Just define a write/send helper API and the
> UPL can be coded in a consistent manner if that is a
> beneficial service. If a true unambiguous indication service
> is more beneficial or required, it can use the extension and
> accept the extra complexity. To demand extra complexity in
> applications that obviously don't need the true immediate
> data semantic is just wrong in my opinion.
>
>>
>> I cannot see why doing this is not almost free for virtually all
>> applications, and trivial for the remainder. Adding and documenting
>> an extra call to deal with such an extreme corner case that is being
>> presented only in the abstract is just not justified. This extra
>> capability has to have enough functionality for enough applications
>> to justify keeping it on the books, writing test cases for it, etc.
>
> All we're asking is that a write/send combined API not be
> called immediate data unless it fits the semantics of
> immediate data. I am puzzled at the resistance this is
> getting. There is a standards body specification for
> immediate data. If it is not followed, don't call it
> immediate data. It's that simple. For those transports that
> can provide the service, the UPL may be able to gain access to it
> through an extension.
>
I have no objection to calling this
"dat_ep_post_rdma_write_with_notifier"
and labelling the 32-bit data as a "notifier tag".

Even on iWARP transports, small send data can be inlined,
avoiding the need for buffers to be registered. A special
API where the length of the "send buffer" is known in
advance makes this even easier.

What I still fail to see is a rationale that works down
from the application layer for why an application would
need yet another page in its cookbook. Creating an
entirely new call to enable an unusual way of signalling
one bit of information to the other end doesn't seem like
much of a payoff to me.
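
As a rough illustration of the emulated form discussed above, the sketch below builds the two work requests a "write with notifier" call might expand to on a transport without native immediate data: an RDMA write followed by a 4-byte inline send carrying the tag. All names (`wr_desc`, `wr_op`, `build_write_with_notifier`) are hypothetical, no real DAT or verbs structures are used, and sending the tag in network byte order is an assumption. It also assumes a reliable connection on which the send is delivered only after the preceding write has been placed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum wr_op { WR_RDMA_WRITE, WR_SEND_INLINE };

/* Hypothetical work request descriptor, for illustration only. */
typedef struct {
    enum wr_op    op;
    const void   *local_buf;      /* source of the RDMA write; NULL for the send */
    size_t        length;         /* bytes to write, or inline bytes to send */
    uint64_t      remote_addr;    /* target address for the RDMA write */
    unsigned char inline_data[4]; /* notifier tag, network byte order */
} wr_desc;

/*
 * Fill out[0] with the RDMA write and out[1] with a fixed-length inline
 * send carrying the 32-bit notifier tag. Because the tag length is
 * known in advance, no registered send buffer is needed.
 * Returns the number of descriptors produced (always 2).
 */
static size_t build_write_with_notifier(wr_desc out[2], const void *buf,
                                        size_t len, uint64_t remote_addr,
                                        uint32_t tag)
{
    assert(out != NULL);
    memset(out, 0, 2 * sizeof out[0]);

    out[0].op = WR_RDMA_WRITE;
    out[0].local_buf = buf;
    out[0].length = len;
    out[0].remote_addr = remote_addr;

    out[1].op = WR_SEND_INLINE;
    out[1].length = sizeof out[1].inline_data;
    out[1].inline_data[0] = (unsigned char)(tag >> 24);
    out[1].inline_data[1] = (unsigned char)(tag >> 16);
    out[1].inline_data[2] = (unsigned char)(tag >> 8);
    out[1].inline_data[3] = (unsigned char)tag;
    return 2;
}
```

On the target side the tag arrives as an ordinary 4-byte receive, so the receiver cannot tell it apart from a regular send unless the application encodes that distinction itself.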