[ofiwg] definition of a completion in OFI

Paul Grun grun at cray.com
Thu Mar 5 10:37:22 PST 2015

-----Original Message-----
From: Hefty, Sean [mailto:sean.hefty at intel.com] 
Sent: Wednesday, March 04, 2015 3:20 PM
To: Paul Grun; ofiwg at lists.openfabrics.org
Subject: RE: definition of a completion in OFI

> In the case of IB, the completion behavior is exactly as you describe 
> below as a matter of specification for both reliable and unreliable 
> services.

Can you clarify which behavior?  I specified more than one.  :)

[PG] I was being lazy.  You described a behavior for reliable services and one for unreliable services.  Both are correct, depending on the service selected.

In IB terms:
For reliable services, a WQE is not considered complete until all necessary responses have arrived (compliance statement C9-60).
For unreliable services, the requester shall consider a Message Send or RDMA WRITE complete when the last byte has been committed to the wire, or an unrecoverable error occurs (compliance statement C9-180).
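
As a rough illustration of the two requester-side rules above, here is a minimal sketch of the completion decision. It is a toy model, not IB verbs or OFI code: the names Service, WorkRequest and requester_complete are hypothetical, and treating an unrecoverable error as completing a reliable request too is a simplifying assumption that is not taken from the compliance statements quoted above.

```python
from dataclasses import dataclass
from enum import Enum


class Service(Enum):
    RELIABLE = "reliable"
    UNRELIABLE = "unreliable"


@dataclass
class WorkRequest:
    """Hypothetical requester-side state for one posted operation."""
    service: Service
    bytes_total: int            # size of the message to send
    bytes_on_wire: int = 0      # bytes committed to the wire so far
    acks_expected: int = 0      # responses required (reliable service only)
    acks_received: int = 0      # responses that have arrived so far
    fatal_error: bool = False   # an unrecoverable error was reported


def requester_complete(wr: WorkRequest) -> bool:
    """Return True once the requester may report a completion."""
    # Assumption: an unrecoverable error ends the request for either service.
    if wr.fatal_error:
        return True
    all_sent = wr.bytes_on_wire >= wr.bytes_total
    if wr.service is Service.RELIABLE:
        # Reliable: not complete until every necessary response has arrived.
        return all_sent and wr.acks_received >= wr.acks_expected
    # Unreliable: complete when the last byte is committed to the wire.
    return all_sent
```

With this model, an unreliable send is complete as soon as bytes_on_wire reaches bytes_total, while a reliable one with the same byte counts stays incomplete until acks_received catches up with acks_expected.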

For bonus credit, can you point to the IB spec location where this is clarified?
[PG] See above.  How shall I collect my bonus credit?

> To me, if the provider accepts the data for transmission, it should 
> not generate a completion until it has in fact completed the 
> operation.  For a reliable service, that means it gets all its acks 
> back, for unreliable it means that all data has been put on the wire.

Well, that's the problem.  Define "complete".  It's reasonable for someone to think that the completion of an RDMA write means that the data has been written into the remote memory.

[PG] I don't think that's a reasonable definition of completion, but the point is arguable.   The definitions that IB uses for the requester side are the ones given above.  The responder side may be a different story.  I think we should adopt the same definitions as used by IB.

Anyway, it sounds like you're in agreement with my 'improved' definition of a completion.

> I believe that the behavior you are describing below is characteristic 
> of sockets, and I imagine that a sockets application expects this kind of
> shortcut and can deal with it.   But in our case sockets is simply another
> provider that lives below the OFI API.  I suggest that we require the 
> behavior that is exposed to the application to be consistent (i.e. no 
> completion until the data is actually transmitted) and therefore force 
> the sockets provider to deal with the fact that it somehow has to 
> actually transfer the data before it signals a completion.

It's really a characteristic of the implementation.  The sockets provider is meeting the behavior defined for a completion.  The question is whether that definition should change.  The stronger we make the definition, the more requirements get placed on the implementation.  I want to ensure that the proposal is doable with existing HW.

[PG] I believe that for OFI, a completion as presented to the application above us should always follow the definitions given above, which are service dependent.  If that means that the sockets provider has to do special stuff under the covers, so be it.

- Sean 
