[ofiwg] definition of a completion in OFI

Wed Mar 4 15:19:39 PST 2015

> In the case of IB, the completion behavior is exactly as you describe
> below as a matter of specification for both reliable and unreliable
> services.

Can you clarify which behavior?  I specified more than one.  :)

For bonus credit, can you point to the IB spec location where this is clarified?

> To me, if the provider accepts the data for transmission, it should not
> generate a completion until it has in fact completed the operation.  For a
> reliable service, that means it gets all its acks back, for unreliable it
> means that all data has been put on the wire.

Well, that's the problem.  Define "complete".  It's reasonable for someone to think that the completion of an RDMA write means that the data has been written into the remote memory.
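
To make the ambiguity concrete, here is a rough sketch in C of the different points at which a provider could claim an operation is "complete".  All of the names below (completion_level, op_state, may_signal_completion) are hypothetical and made up for illustration; none of them are OFI or IB identifiers, and the split into four levels is just my reading of the cases raised in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical completion levels, weakest to strongest.
 * These are illustration-only names, not part of any real API. */
enum completion_level {
    COMPLETE_ACCEPTED,       /* provider has accepted the data for transmission */
    COMPLETE_TRANSMITTED,    /* all data has been put on the wire */
    COMPLETE_ACKED,          /* reliable service: all acks have come back */
    COMPLETE_REMOTE_VISIBLE  /* data has been written into the remote memory */
};

/* Hypothetical per-operation progress flags tracked by a provider. */
struct op_state {
    bool accepted;
    bool on_wire;
    bool acked;
    bool remote_visible;
};

/* Returns true if a completion may be generated for 'op' under the
 * given definition of "complete". */
bool may_signal_completion(const struct op_state *op,
                           enum completion_level level)
{
    assert(op != NULL);

    switch (level) {
    case COMPLETE_ACCEPTED:
        return op->accepted;
    case COMPLETE_TRANSMITTED:
        return op->on_wire;
    case COMPLETE_ACKED:
        return op->acked;
    case COMPLETE_REMOTE_VISIBLE:
        return op->remote_visible;
    }
    return false; /* unknown level: never complete */
}
```

Under this sketch, an operation whose data is on the wire but not yet acked would be complete under the first two definitions and not under the last two, which is exactly the gap between what a provider may report and what an application may assume.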

Anyway, it sounds like you're in agreement with my 'improved' definition of a completion.

> I believe that the behavior you are describing below is characteristic of
> sockets, and I imagine that a sockets application expects this kind of
> shortcut and can deal with it.   But in our case sockets is simply another
> provider that lives below the OFI API.  I suggest that we require the
> behavior that is exposed to the application to be consistent (i.e. no
> completion until the data is actually transmitted) and therefore force the
> sockets provider to deal with the fact that it somehow has to actually
> transfer the data before it signals a completion.

It's really a characteristic of the implementation.  The sockets provider is meeting the behavior defined for a completion.  The question is whether that definition should change.  The stronger we make the definition, the more requirements we place on the implementation.  I want to ensure that the proposal is doable with existing HW.

- Sean