[dat-discussions] [openib-general] [RFC] DAT 2.0 immediate data proposal

Larsen, Roy K roy.k.larsen at intel.com
Tue Feb 7 14:04:10 PST 2006


>>> What is proposed is a definition of
>>> 'dat_ep_post_rdma_write_with_immediate'
>>> that can be implemented over iWARP using the sequence of messages
>>> that were intended to support the same purpose (i.e., letting the
>>> other side know that an RDMA Write transfer has been fully
>>> received).
>>
>> No, iWARP *CAN NOT* implement write immediate data any better
>> than IB can implement send with invalidate.  Immediate data
>> *MUST* be indicated to the ULP unambiguously.  Imposing an
>> algorithm on the application to infer immediate data arrival
>> is a hack, pure and simple. An application is free to perform a
>> write/send if that is the semantic they want.  Why does iWARP
>> get transport unique APIs but not IB?  I find this attempt to
>> bastardize the IB semantic of immediate data a little curious.
>>
>
>The transports aren't getting anything. Features are there for
>applications, especially when the feature can be defined in a
>way that makes sense without explaining transport mechanics.
>

APIs exist to gain access to transport services, so of course it is all
about the transport.  Presumably the transport services were defined
because they seemed useful, but a transport service exists in a standard
somewhere before it is defined in DAPL.  I believe that the IB immediate
data service and semantic is useful and should be supported too.

>Completing a transaction, complete with supplying a transaction
>response and releasing the advertised STag associated with the
>transaction is something that makes sense in the application
>domain and conforms to normal DAT ordering rules.
>

I don't disagree.  And unambiguous immediate data indications fall into
that same category which is why I'm puzzled there is so much resistance.

>"Provide information about an RDMA Write to a receive operation"
>also meets that definition -- as long as it conforms to the
>existing ordering rules. Shifting to an 8 byte message over
>iWARP to allow for the write length *and* immediate 'tag'
>is certainly doable. We could even consider having the
>DAT Provider supply the 'buffer' silently in the DTO itself.
>

If you make the receive indication unambiguous as to the fact it's
associated with a write immediate, you've got my full support, even if
immediate data is delivered differently by different transports.  If
not, it is nothing more than a write/send that the application can do
itself.
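
For concreteness, the 8-byte message suggested in the quoted text (write length plus immediate 'tag') might be laid out roughly as follows. This is only a sketch: the field order, the big-endian encoding, and every name in it (write_notify, write_notify_pack, write_notify_unpack) are assumptions for illustration, not part of any DAT or iWARP definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire layout: 4-byte write length followed by a 4-byte
 * immediate tag, both big-endian. Not taken from any DAT or iWARP spec. */
enum { WRITE_NOTIFY_SIZE = 8 };

struct write_notify {
    uint32_t write_len; /* bytes placed by the preceding RDMA Write */
    uint32_t tag;       /* application-chosen immediate value */
};

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Load a 32-bit big-endian value. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Serialize the notification into the 8-byte send payload. */
static void write_notify_pack(const struct write_notify *n,
                              uint8_t out[WRITE_NOTIFY_SIZE])
{
    assert(n != NULL && out != NULL);
    put_be32(out, n->write_len);
    put_be32(out + 4, n->tag);
}

/* Recover the notification from a received 8-byte payload. */
static struct write_notify write_notify_unpack(const uint8_t in[WRITE_NOTIFY_SIZE])
{
    struct write_notify n;
    assert(in != NULL);
    n.write_len = get_be32(in);
    n.tag = get_be32(in + 4);
    return n;
}
```

Note that nothing in this payload marks it as belonging to a write immediate; the receiver only knows that if the completion says so.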

>With that definition the consumer would get a receive completion
>that told them that their peer's RDMA Write had been successfully
>placed, how long it is (the length) and which one (a tag).
>
>I think that is of value. iWARP can implement it as two work
>requests and maintain the overall semantics.

If completion of the service is ambiguous, I strongly disagree.  The
application can do this with write/send now and with more flexibility.
True immediate indications are unambiguous and don't rely on the
contents of a receive buffer or its completion timing.  An application
must be able to perform "normal" send/receives of any size and content
simultaneously with RDMA write with immediate and without regard to when
they arrive.  The semantic proposed would put a constraint on how an
application could use the send/receive facility.  If an application can
live with such a constraint, it is free to use write/send now.  Those
that can't or would perform much better with a legitimate
write/immediate should be given access to the facility.
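
To illustrate the ambiguity, here is a toy model of the two receive-side checks. It is a sketch under stated assumptions: the recv_completion structure and both helper functions are hypothetical, not the DAT event format, and the emulated check assumes the receiver infers a write immediate from an 8-byte payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a receive completion; the field names are
 * illustrative and do not come from the DAT specification. */
struct recv_completion {
    bool has_immediate;  /* set by the transport for a true write immediate */
    uint32_t immediate;  /* valid only when has_immediate is true */
    size_t len;          /* bytes placed in the receive buffer */
    const uint8_t *buf;  /* receive buffer contents */
};

/* Native semantic: the completion itself says what it is, so the
 * receive buffer contents and size are irrelevant to the decision. */
static bool is_write_immediate_native(const struct recv_completion *c)
{
    assert(c != NULL);
    return c->has_immediate;
}

/* Emulated semantic (write followed by an 8-byte send): the receiver can
 * only infer the event from the payload size, so an ordinary 8-byte send
 * is classified the same way. */
static bool is_write_immediate_emulated(const struct recv_completion *c)
{
    assert(c != NULL);
    return c->len == 8;
}
```

Feeding an ordinary 8-byte application send through is_write_immediate_emulated returns true, which is exactly the constraint on "normal" send/receive traffic described above. The native check never looks at the buffer, so it has no such false positive.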

>
>Are you arguing that iWARP should NOT provide this service
>until it can do it in a single work request?

I'm arguing that an iWARP provider NOT support this service until it can
deliver immediate data indications unambiguously.

>It seems to
>me that allowing an extra work request and completion is
>a fairly simple accommodation as opposed to using an alternate
>algorithm in the main transaction processing of the application.
>
>If we enable the application to query how a remote write
>with immediate will complete outside of the transaction loop
>then we can allow the application to have *no* overhead inside
>the main transaction loop, and *identical* logic on the sending
>side.

I would contend that placing constraints on what and when an application
can send "normal" data just to use write immediate is far, far worse. And
all just to basically save one extra function call.

>
>And IB *could* implement send with invalidate by simply agreeing
>on how the RKey to be invalidated is communicated between the
>IB providers (perhaps as an immediate).

I'm afraid I don't follow.  If you're talking about providers setting up
their own private EPs to communicate, perhaps that's a solution for
iWARP providers to supply unambiguous immediate data indications....

>
>But more to the point, I don't see how the more flexible
>definition of write with immediate negatively impacts the
>IB implementation of the feature. IB providers do not need
>to allow for the extra work requests. They are not being
>asked to place the immediate data into the receive buffer,
>or to do any extra work at all.
 
This is not about extra work requests or the initiating API.  It's about
a very poor/non-existent indication semantic and the neutering of a
legitimate one.  Surely you would not allow the semantics of write with
invalidate to be relaxed or changed to support an emulation, right?

Look, if an API provides a semantic that allows ambiguous services and
leaves it as an exercise to the application to figure out the service
has been rendered, it is a hack.  I'm surprised that even has to be
argued.  Transports that support true write with immediate do not make
the immediate indication ambiguous.  The application is free to use the
receive queue for any other combination of receive operations.  A
legitimate write immediate service does not put usage constraints on the
receive queue.  I could not be convinced otherwise, so if those
proposing such a constrained semantic feel the same, I'll consider this
thread dead.


