[ofiwg] completion flags as actually defined by OFI

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Tue Apr 14 14:46:14 PDT 2015


On Tue, Apr 14, 2015 at 08:18:11PM +0000, Hefty, Sean wrote:
> > > I believe the providers support this guarantee.
> > 
> > Including the clean shutdown case?
> 
> For FI_TRANSMIT_COMPLETE:
> 
> - The sockets and psm providers do not generate a completion until the remote side has processed the request and acknowledged the data.
> - Cisco needs to confirm the usnic provider behavior, but it's UD anyway.  I believe it adheres to the description given for completions on unreliable endpoints.

> - Verbs does not generate a completion until the data has been
>   acked by the remote side, unless I'm remembering it wrong.

The definition I gave basically said the peer will see the completion
no matter what happens to the network after the local send completion
is delivered.

IB verbs has a tiny improbable (?) race with CM MADs on
shutdown. libfabric could probably cover this with some work.

The HCAs were never designed with this restriction in mind and
probably have tiny improbable races for various error cases too (e.g.
we generate an ack, but then time out and error the QP: does the app
see the completion?)

For iWarp, I agree with Bernard: the completion is issued when the
buffer has been shuffled to the local LLP, not when the LLP indicates
it has delivered it to the peer.

My suggestion for your man page (and resulting behavior) is:

*FI_COMPLETION*
: Indicates that a completion entry should be generated for data
  transfer operations.

*FI_INJECT_COMPLETE*
: Indicates that a completion should be generated when the
  source buffer(s) may be reused.  FI_INJECT_COMPLETE guarantees that
  the buffers will not be read from again and the application may
  reclaim them.

  Any of a local failure, a fabric failure, or a peer local failure
  can prevent the delivery of the peer's completion.

  [ Mandatory, all must support this ]

*FI_DELIVERY_COMPLETE*
: For reliable:
  
  Indicates that a completion should be generated when the work
  request is delivered to the peer. FI_DELIVERY_COMPLETE guarantees
  that the delivery of the peer's completion is no longer dependent on
  the fabric or any local resources.

  A peer local failure can prevent the delivery of the peer's
  completion.

  For unreliable:

  Indicates that a completion should be generated when the work
  request is delivered to the fabric and is no longer dependent on any
  local resources. No peer completion is guaranteed.

  A fabric failure or a peer local failure can prevent the delivery
  of the peer's completion.

  [IB does this 99% today, presumably sockets/etc are 100%, iWarp does
   not support this]

*FI_COMMIT_COMPLETE*
: Indicates that a completion should not be generated until the
  completion has been delivered to the peer, consumed by the
  application and acknowledged to be complete.
  [this needs more language, what api does the application use to
   signal it completed the work?]
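
To make the failure cases above easier to compare, here is a rough
sketch that tabulates, for each proposed level, which kinds of failure
can still prevent the peer's completion once the local completion has
been generated. Every name in it (the sk_ / SK_ identifiers) is made up
for illustration; these are not libfabric definitions or flag values.
FI_COMMIT_COMPLETE is left out because its wording is still open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for this sketch only; not the libfabric flags. */
enum sk_level {
    SK_INJECT_COMPLETE,              /* source buffers may be reused     */
    SK_DELIVERY_COMPLETE_RELIABLE,   /* work request delivered to peer   */
    SK_DELIVERY_COMPLETE_UNRELIABLE, /* work request delivered to fabric */
    SK_LEVEL_COUNT
};

/* Failure classes named in the proposed man page text, as bit flags. */
enum sk_failure {
    SK_FAIL_LOCAL      = 1 << 0,
    SK_FAIL_FABRIC     = 1 << 1,
    SK_FAIL_PEER_LOCAL = 1 << 2
};

/* For each level: failures that can still prevent the peer's completion
 * after the local completion has been generated. */
static const unsigned sk_blockers[SK_LEVEL_COUNT] = {
    [SK_INJECT_COMPLETE] =
        SK_FAIL_LOCAL | SK_FAIL_FABRIC | SK_FAIL_PEER_LOCAL,
    [SK_DELIVERY_COMPLETE_RELIABLE] =
        SK_FAIL_PEER_LOCAL,
    [SK_DELIVERY_COMPLETE_UNRELIABLE] =
        SK_FAIL_FABRIC | SK_FAIL_PEER_LOCAL
};

/* True if the given failure class can still prevent the peer's
 * completion at this level. */
bool sk_can_block_peer_completion(enum sk_level level,
                                  enum sk_failure failure)
{
    assert(level >= 0 && level < SK_LEVEL_COUNT);
    return (sk_blockers[level] & (unsigned)failure) != 0;
}
```

Read this way, moving from inject to delivery complete on a reliable
endpoint removes the local and fabric failures from the list and leaves
only the peer local failure.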

I've chosen language that talks specifically about the peer
completion, since this is what a high-level app writer cares about.

Jason


