[dat-discussions] [openib-general] [RFC] DAT2.0immediatedataproposal

Thu Feb 9 12:38:57 PST 2006

Why supply both the Immediate Data and the STag that was used for the
RDMA Write? The Immediate Data already tells the receiver which operation
the RDMA Write was issued in response to, once it has completed locally.

The STag would make sense if STag invalidation were also put in the mix.

But for MPI, RMR contexts have a long lifecycle, so it is not clear which
apps would be interested in combining Invalidation with RDMA Write with
Immediate Data.

Arkady Kanevsky                       email: arkady at netapp.com
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.        Fax: 781-895-1195
Waltham, MA 02451                   central phone: 781-768-5300

> -----Original Message-----
> From: Caitlin Bestler [mailto:caitlinb at broadcom.com] 
> Sent: Tuesday, February 07, 2006 3:03 PM
> To: Larsen, Roy K; dat-discussions at yahoogroups.com; Arlin Davis; Hefty, Sean
> Cc: openib-general at openib.org
> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT2.0immediatedataproposal
> 
> openib-general-bounces at openib.org wrote:
> > Caitlin Bestler wrote:
> >> 
> >> Arlin Davis wrote:
> >>> Sean Hefty wrote:
> >>> 
> >>>>> The requirement is to provide an API that supports RDMA writes 
> >>>>> with immediate data.  A send that follows an RDMA write is not 
> >>>>> immediate data, and the API should not be constructed around 
> >>>>> trying to make it so.
> >>>>> 
> >>>>> 
> >>>> 
> >>>> To be clear, I believe that write with immediate should be part of
> >>>> the normal APIs, rather than an extension, but should be designed
> >>>> around those devices that provide it natively.
> >>>> 
> >>>> 
> >>> I totally agree. A standard RDMA write with immediate API can be 
> >>> very useful to RDMA applications based on the requirements (native
> >>> support) set forth in my earlier email. It is analogous to the new
> >>> dat_ep_post_send_with_invalidate() call; a call that supports a 
> >>> native iWARP transport operation but provides no provisions to help
> >>> other transports emulate. So, other transports simply return
> >>> NOT_SUPPORTED and add it natively in the future if it makes sense.
> >>> 
> >>> -arlin
> >> 
> >> What is proposed is a definition of
> >> 'dat_ep_post_rdma_write_with_immediate'
> >> that can be implemented over iWARP using the sequence of messages
> >> that were intended to support the same purpose (i.e., letting the
> >> other side know that an RDMA Write transfer has been fully received).
> > 
> > No, iWARP *CAN NOT* implement write immediate data any better than IB
> > can implement send with invalidate.  Immediate data
> > *MUST* be indicated to the ULP unambiguously.  Imposing an algorithm
> > on the application to infer immediate data arrival is a hack, pure and
> > simple. An application is free to perform a write/send if that is the
> > semantic they want.  Why does iWARP get transport unique APIs but not
> > IB?  I find this attempt to bastardize the IB semantic of immediate
> > data a little curious.
> > 
> 
> The transports aren't getting anything. Features are there 
> for applications, especially when the feature can be defined 
> in a way that makes sense without explaining transport mechanics.
> 
> Completing a transaction, complete with supplying a 
> transaction response and releasing the advertised STag 
> associated with the transaction is something that makes sense 
> in the application domain and conforms to normal DAT ordering rules.
> 
> "Provide information about an RDMA Write to a receive operation"
> also meets that definition -- as long as it conforms to the 
> existing ordering rules. Shifting to an 8 byte message over 
> iWARP to allow for the write length *and* immediate 'tag'
> is certainly doable. We could even consider having the DAT 
> Provider supply the 'buffer' silently in the DTO itself.
> 
> With that definition the consumer would get a receive 
> completion that told them that their peer's RDMA Write had 
> been successfully placed, how long it is (the length) and 
> which one (a tag).
> 
> I think that is of value. iWARP can implement it as two work 
> requests and maintain the overall semantics.
> 
> Are you arguing that iWARP should NOT provide this service 
> until it can do it in a single work request? It seems to me 
> that allowing an extra work request and completion is a 
> fairly simple accommodation as opposed to using an alternate 
> algorithm in the main transaction processing of the application.
> 
> If we enable the application to query, outside of the transaction
> loop, how a remote write with immediate will complete, then
> we can allow the application to have *no* overhead inside the 
> main transaction loop, and *identical* logic on the sending side.
> 
> And IB *could* implement send with invalidate by simply 
> agreeing on how the RKey to be invalidated is communicated 
> between the IB providers (perhaps as an immediate).
> 
> But more to the point, I don't see how the more flexible 
> definition of write with immediate negatively impacts the IB 
> implementation of the feature. IB providers do not need to 
> allow for the extra work requests. They are not being asked 
> to place the immediate data into the receive buffer, or to do 
> any extra work at all.
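For reference, the 8-byte iWARP message discussed in the quoted text (write
length plus immediate 'tag') might be laid out roughly as below. This is only
a sketch under stated assumptions: the struct name, the function names, the
field order and the big-endian encoding are all hypothetical and are not taken
from the DAT 2.0 proposal or from any iWARP specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire layout (an assumption, not from any spec):
 * bytes 0-3 = RDMA Write length, bytes 4-7 = immediate tag,
 * both in big-endian (network) byte order. */
#define WRITE_NOTIFY_LEN 8

struct write_notify {
    uint32_t write_len;  /* how long the peer's RDMA Write was */
    uint32_t imm_tag;    /* which one (the immediate 'tag')     */
};

/* Store a 32-bit value big-endian, independent of host byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Load a big-endian 32-bit value. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Sender side: build the payload of the Send that follows the RDMA Write. */
void write_notify_pack(const struct write_notify *n,
                       uint8_t buf[WRITE_NOTIFY_LEN])
{
    assert(n != NULL && buf != NULL);
    put_be32(buf, n->write_len);
    put_be32(buf + 4, n->imm_tag);
}

/* Receiver side: decode the payload of the receive completion.
 * Returns 0 on success, -1 if the message is not exactly 8 bytes. */
int write_notify_unpack(const uint8_t *buf, size_t len,
                        struct write_notify *out)
{
    if (buf == NULL || out == NULL || len != WRITE_NOTIFY_LEN)
        return -1;
    out->write_len = get_be32(buf);
    out->imm_tag   = get_be32(buf + 4);
    return 0;
}
```

On a transport with native write-with-immediate, none of this packing would
be needed; the sketch only illustrates what a provider emulating the
operation with a second work request might carry in that message.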