[ofiwg] completion flags as actually defined by OFI
Jason Gunthorpe
jgunthorpe at obsidianresearch.com
Tue Apr 14 14:46:14 PDT 2015
On Tue, Apr 14, 2015 at 08:18:11PM +0000, Hefty, Sean wrote:
> > > I believe the providers support this guarantee.
> >
> > Including the clean shutdown case?
>
> For FI_TRANSMIT_COMPLETE:
>
> - The sockets and psm providers do not generate a completion until the remote side has processed the request and acknowledged the data.
> - Cisco needs to confirm the usnic provider behavior, but it's UD anyway. I believe it adheres to the description given for completions on unreliable endpoints.
> - Verbs does not generate a completion until the data has been
>   acked by the remote side, unless I'm remembering it wrong.
The definition I gave basically said the peer will see the completion
no matter what happens to the network after the local send completion
is delivered.
IB verbs has a tiny improbable (?) race with CM MADs on
shutdown. libfabric could probably cover this with some work.
The HCAs were never designed with this restriction in mind and
probably have tiny improbable races for various error cases too (like
we generate an ack, but then time out and error the QP; does the app
see the completion?)
For iWarp, I agree with Bernard: the completion is issued when the
buffer has been shuffled to the local LLP, not when the LLP indicates
it has delivered it to the peer.
My suggestion for your man page (and resulting behavior) is:
*FI_COMPLETION*
: Indicates that a completion entry should be generated for data
transfer operations.
*FI_INJECT_COMPLETE*
: Indicates that a completion should be generated when the
source buffer(s) may be reused. FI_INJECT_COMPLETE guarantees that
the buffers will not be read from again and the application may
reclaim them.
Any of local failure, fabric failure, or peer local failure can
prevent the delivery of the peer's completion.
[ Mandatory, all must support this ]
*FI_DELIVERY_COMPLETE*
: For reliable:
Indicates that a completion should be generated when the work
request is delivered to the peer. FI_DELIVERY_COMPLETE guarantees
that the delivery of the peer's completion is no longer dependent on
the fabric or any local resources.
A peer local failure can prevent the delivery of the peer's
completion.
For unreliable:
Indicates that a completion should be generated when the work
request is delivered to the fabric and is no longer dependent on any
local resources. No peer completion is guaranteed.
A fabric failure, or peer local failure can prevent the delivery of
the peer's completion.
[IB does this 99% today, presumably sockets/etc are 100%, iWarp does
not support this]
*FI_COMMIT_COMPLETE*
: Indicates that a completion should not be generated until the
completion has been delivered to the peer, consumed by the
application and acknowledged to be complete.
[this needs more language, what api does the application use to
signal it completed the work?]
I've chosen language that talks specifically about the peer
completion - since this is what a high level app writer cares about.
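To make the proposed levels easier to compare side by side, here is a
minimal C sketch of the failure classes that can still prevent the peer's
completion once the local completion has been reported. All names
(enum failure, enum level, peer_completion_risks, and so on) are
hypothetical and are not libfabric identifiers. The entry for the commit
level is only my reading of the text above: the local completion is
generated after the peer has consumed and acknowledged the work, so nothing
remains that could block the peer's completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical failure classes, taken from the wording of the proposal. */
enum failure {
    FAIL_LOCAL  = 1u << 0, /* failure of local (sender-side) resources */
    FAIL_FABRIC = 1u << 1, /* failure of the fabric between the peers */
    FAIL_PEER   = 1u << 2  /* local failure on the peer */
};

/* Hypothetical stand-ins for the three proposed completion levels. */
enum level { LVL_INJECT, LVL_DELIVERY, LVL_COMMIT, LVL_COUNT };

/*
 * Return the set of failures that can still prevent delivery of the
 * peer's completion after the local completion has been generated.
 */
static unsigned peer_completion_risks(enum level lvl, bool reliable)
{
    assert((unsigned)lvl < (unsigned)LVL_COUNT);

    switch (lvl) {
    case LVL_INJECT:
        /* Only buffer reuse is promised; anything can still go wrong. */
        return FAIL_LOCAL | FAIL_FABRIC | FAIL_PEER;
    case LVL_DELIVERY:
        /* Reliable: delivered to the peer. Unreliable: handed to the fabric. */
        return reliable ? FAIL_PEER : (FAIL_FABRIC | FAIL_PEER);
    case LVL_COMMIT:
    default:
        /* Assumption: the peer already consumed and acknowledged the work. */
        return 0;
    }
}

/* True if none of the given failures can block the peer's completion. */
static bool peer_completion_survives(enum level lvl, bool reliable,
                                     unsigned failures)
{
    return (peer_completion_risks(lvl, reliable) & failures) == 0;
}
```

For example, peer_completion_survives(LVL_DELIVERY, true, FAIL_FABRIC)
is true, while the same call with reliable set to false is false, which is
the difference between the reliable and unreliable wording above.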
Jason