[ofiwg] definition of a completion in OFI
jgunthorpe at obsidianresearch.com
Wed Mar 4 14:13:23 PST 2015
On Wed, Mar 04, 2015 at 09:46:54PM +0000, Hefty, Sean wrote:
> I'm not sure if all implementations can adhere to this definition.
> I looked in the iWarp and IB specs, but I couldn't find any specific
> definition of what it means for an app to retrieve a completion.
completion means your first definition, the memory will no longer be
For IB this means the data transfer at the HCA is complete, but it
doesn't say very much about the remote application.
For instance, it doesn't mean a receive completion has been, or ever
will be generated. So from that perspective, it is not any different
than your existing sockets case.
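
To make the sockets comparison concrete, here is a minimal sketch. It assumes a local socket pair stands in for the connection, and the explicit "ack" reply is an illustrative application-level convention, not something from this thread. send() returning plays the role of the local completion: the kernel has taken the bytes, but the remote application has not necessarily read them.

```python
import socket

# A connected pair of stream sockets stands in for the two endpoints.
local, remote = socket.socketpair()

msg = b"payload"
sent = local.send(msg)  # returns once the kernel has accepted the bytes
# The local buffer may now be reused, but the remote application has
# not called recv() yet and might never do so.
local_complete = (sent == len(msg))

# Only an explicit application-level reply tells the sender that the
# remote application actually consumed the data.
received = remote.recv(len(msg))
remote.send(b"ack")
ack = local.recv(3)

local.close()
remote.close()
```

The local completion and the remote consumption are separate events; the second is only observable through traffic coming back the other way.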
> A second solution is to add the behavior defined above to another
> call or event, such as fi_shutdown. For example:
> fi_shutdown() - For reliable endpoints, blocks until all operations
> and their associated data have been acked by the destination. For
> unreliable endpoints, indicates that all requests have successfully
> been transmitted into the fabric.
Essentially what you are talking about here is adding a synchronous
'end of stream mark' like TCP - the mark pushes through all data and
becomes visible as a guaranteed event on the far side.
Realistically, most apps that can close a connection need something
like this. I have fought with races in IB CM land in exactly this
area, and it is very difficult to resolve.
Typically shutdown has to be unidirectional, i.e. I stop my send side
but can continue to read until I see the other end's end of stream
mark. Otherwise typical app patterns are difficult to realize.
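
The TCP-style pattern described here can be sketched with ordinary stream sockets. This is only an analogy under stated assumptions: a local socket pair stands in for the connection, and the message contents are made up. It says nothing about how fi_shutdown() would be implemented. Each side closes its send direction with shutdown(SHUT_WR) and keeps reading until it sees the other side's end-of-stream mark (recv() returning an empty result).

```python
import socket

a, b = socket.socketpair()

# Side A: send everything, then close only the send direction.
a.sendall(b"last request")
a.shutdown(socket.SHUT_WR)  # the end-of-stream mark follows the data

# Side B: read until the mark arrives (recv returns b"" at end of stream).
chunks = []
while True:
    data = b.recv(4096)
    if not data:
        break
    chunks.append(data)
request = b"".join(chunks)

# B's send side is still open, so it can reply after seeing A's mark.
b.sendall(b"final reply")
b.shutdown(socket.SHUT_WR)

# A can still read until it sees B's end-of-stream mark.
reply = b""
while True:
    data = a.recv(4096)
    if not data:
        break
    reply += data

a.close()
b.close()
```

Because the mark is ordered behind the data, B knows it has seen everything A sent before it decides to finish its own side.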
> In this case, calling fi_close() without fi_shutdown() will abruptly
> close the endpoint. There may need to be other constraints for
> fi_shutdown(), such as the app must ensure that all requests have
Close should not be different from bidirectional shutdown:
- Data sent prior to close is delivered if the far side is
willing to receive.
- The far side sees an error when it attempts to send.
And in the RDMA case this probably means close would have to be
blocking, and then followed with a sendq and recvq flush and resource
You can have a fi_drop_senq or something to forcibly dump the sendq
prior to calling close to get more like the current abrupt semantics.
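
As a rough illustration of the close semantics sketched above, here is a toy in-memory model. Every name in it (ToyEndpoint, drop_sendq, connect) is hypothetical and invented for this example; none of it is a libfabric API, and the blocking flush is modelled as a simple synchronous loop.

```python
from collections import deque


class ToyEndpoint:
    """Hypothetical model of one side of a connection; not a libfabric API."""

    def __init__(self):
        self.sendq = deque()  # sends posted but not yet delivered
        self.received = []    # data delivered to this side
        self.peer = None
        self.closed = False

    def send(self, data):
        if self.closed:
            raise RuntimeError("send on a closed endpoint")
        if self.peer.closed:
            # The far side sees an error when it attempts to send.
            raise ConnectionError("peer has closed")
        self.sendq.append(data)

    def drop_sendq(self):
        """Forcibly discard queued sends; returns how many were dropped."""
        dropped = len(self.sendq)
        self.sendq.clear()
        return dropped

    def close(self):
        # Graceful close: deliver everything queued before close,
        # provided the peer is still willing to receive.
        while self.sendq:
            data = self.sendq.popleft()
            if not self.peer.closed:
                self.peer.received.append(data)
        self.closed = True


def connect():
    """Create two toy endpoints wired to each other."""
    a, b = ToyEndpoint(), ToyEndpoint()
    a.peer, b.peer = b, a
    return a, b


# Graceful close: queued data reaches the peer.
a, b = connect()
a.send("m1")
a.send("m2")
a.close()

# Abrupt variant: dump the send queue first, then close.
c, d = connect()
c.send("m1")
dropped = c.drop_sendq()
c.close()
```

In the first case b.received holds both messages and a later b.send() raises ConnectionError; in the second case d.received stays empty, which is closer to the current abrupt behaviour.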