[ofiwg] definition of a completion in OFI

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Wed Mar 4 16:38:48 PST 2015

On Thu, Mar 05, 2015 at 12:15:43AM +0000, Hefty, Sean wrote:
> > What I read in your message is you have a connection oriented test,
> > but no support for end of stream in the library, so I wouldn't ever
> > expect that to work properly without races.
> Okay, this is another option -- change the test.  :)
> The test is polling for all completions, calling shutdown, then
> close.  It is not waiting for a shutdown event to be generated.

If you have a thing called 'shutdown' or 'close' in your API, then it
should be broadly similar to the TCP version, otherwise it isn't
shutdown. Call it 'destroy' or something.

This means there has to be an end of stream mark.
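
To illustrate what an end of stream mark looks like in TCP terms, here is a
minimal Python sketch over a loopback connection. The helper names
(tcp_pair, read_until_eof, demo_end_of_stream) are made up for this
illustration and are not part of any OFI API. The writer calls
shutdown(SHUT_WR), and the reader sees the mark as a zero-length read that
arrives only after all preceding data.

```python
import socket

def tcp_pair():
    """Return a connected (client, server) pair of TCP sockets on loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    return client, server

def read_until_eof(sock):
    """Read until recv() returns b"", which is TCP's end of stream mark."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:                     # zero-length read: peer shut down its write side
            break
        chunks.append(data)
    return b"".join(chunks)

def demo_end_of_stream():
    client, server = tcp_pair()
    client.sendall(b"last message")
    client.shutdown(socket.SHUT_WR)      # end of stream, ordered after the data
    received = read_until_eof(server)
    server.close()
    client.close()
    return received
```

The point of the sketch is the ordering: the reader cannot observe the end
of stream before it has consumed every byte sent ahead of it.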

> However, an app using an unconnected endpoint should have some
> reasonable expectation of things working.  Shutdown isn't usable
> here.

It makes sense to me that unconnected endpoints should try very hard
to put a packet on the wire after generating a completion. I.e., in your
sockets land you can still early complete and copy to a hidden buffer,
but the buffer and socket must linger after close until packets hit
the wire. This is essentially what TCP sockets do.
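
As a rough sketch of that lingering behaviour, the toy model below uses a
hypothetical ToyEndpoint class (not libfabric code, and not how the sockets
provider is actually written). send() copies the data into a hidden queue
and completes early; close() keeps the queue alive until every copied
packet has been pushed onto a list standing in for the wire.

```python
from collections import deque

class ToyEndpoint:
    """Toy unconnected endpoint with early completion and lingering close."""

    def __init__(self, wire):
        self.wire = wire            # a list standing in for the fabric
        self._pending = deque()     # hidden buffer of copied sends
        self.completions = []       # completions reported to the app
        self.closed = False

    def send(self, buf):
        self._pending.append(bytes(buf))    # copy, so the app may reuse buf
        self.completions.append(len(buf))   # early completion, nothing on the wire yet

    def progress(self, max_packets=1):
        """Move up to max_packets from the hidden buffer onto the wire."""
        for _ in range(min(max_packets, len(self._pending))):
            self.wire.append(self._pending.popleft())

    def close(self):
        """Refuse new work, but linger until the hidden buffer is drained."""
        self.closed = True
        while self._pending:
            self.progress()
```

A real provider would bound the linger with a timeout and do the draining
asynchronously; the sketch only shows that completed sends are not dropped
by close.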

> > Completion processing itself really has nothing to do with clean
> > shutdown, and completions should only be expected to say the
> > provider is done with the resource.
> Then would you agree with applying my previous definition (slightly
> modified) for fi_shutdown to fi_close?
> fi_close() - For reliable endpoints, blocks until all requests have
> been responded to by their respective destination(s), or the
> requests have timed out.  For unreliable endpoints, indicates that
> all requests have successfully been transmitted into the fabric.

Really, if there is no such thing as an 'end of stream mark' then
there is little sense in trying to talk about close or shutdown - you
can't create anything resembling those conditions from sockets.

I think what you have is a unilateral destruction of the socket:
 - UD cases try very hard to put completed sends on the wire
 - Stop generating new recv completions, and retire all
   unused WQEs
 - No guarantees of data transfer and no synchronous end of stream for

For RC the other side perceives this as an asynchronous notification
that the QP is destroyed - either because it got the CM MAD, or
because it got an error for a data packet. This is not an end of
stream mark because it is not synchronous with the data transfer.

> > Not supporting clean shutdown is fine, it just means apps have to
> > handle shutdown in-band, like IB already requires. Send an in-band
> > shutdown message and when the far side echoes an ACK then things can
> > be torn down. [And of course this is far more difficult with the new
> > message ordering modes]
> In-band shutdown seems like it would suffer from the same problem.
> The sender of the ACK posts the send, then tears things down before
> the ACK is sent.

The tear down should be done using CM MADs, which provide the last bit
of synchronization to cover off that case.

But the in-band sequence is required first so the CM MADs don't race
with the QP generated packets.
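
The in-band half of that sequence can be sketched as follows, in Python over
a local socket pair. The SHUTDOWN and SHUTDOWN-ACK message names and the
newline framing are assumptions made for this illustration only. The final
CM MAD exchange is not modelled; here the ordinary socket close stands in
for it.

```python
import socket

SHUTDOWN = b"SHUTDOWN\n"        # hypothetical in-band shutdown marker
ACK = b"SHUTDOWN-ACK\n"         # hypothetical in-band acknowledgement

def inband_shutdown_demo():
    a, b = socket.socketpair()
    a_file = a.makefile("rb")
    b_file = b.makefile("rb")
    trace = []

    # Side A: send its data, then the in-band shutdown marker.
    a.sendall(b"data-1\n")
    a.sendall(b"data-2\n")
    a.sendall(SHUTDOWN)
    trace.append("A: sent shutdown marker")

    # Side B: consume everything ahead of the marker, then echo an ACK.
    received = []
    for line in b_file:
        if line == SHUTDOWN:
            break
        received.append(line.rstrip(b"\n"))
    trace.append("B: saw shutdown marker")
    b.sendall(ACK)

    # Side A: tear down only after the ACK has arrived.
    if a_file.readline() != ACK:
        raise RuntimeError("expected shutdown ACK")
    trace.append("A: saw ACK, safe to tear down")

    for f in (a_file, b_file, a, b):
        f.close()
    return received, trace
```

Side B still has to make sure its ACK actually leaves before it tears down,
which is the case the CM MADs are meant to cover.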

This is very tricky and should be hidden and automatic for people who
need it.

