[ofiwg] definition of a completion in OFI

Underwood, Keith D keith.d.underwood at intel.com
Wed Mar 4 20:46:34 PST 2015


The problem is not with the definition of completion at the initiator, but with the definition (or implementation) of the target.  And, in your example, I suspect the target doesn't really "hang"; it is more like it "hangs until the TCP connection times out".  If you don't like the length of the time-out, I'm pretty sure the provider could set an option for how long it will wait ;-)

Local completion is a Good Thing(TM).  In fact, MPI defines it that way for MPI_Send and MPI_Isend.  Interfaces in MPI that don't have a local completion option are trying to add one.  There should be an option in OFI to ask for local completion and mean local completion.  Local completion should not require remote completion.  Why would it?  MPI doesn't ask for that semantic.

There should also be an option for remote completion.  After all, there are times when that is what you want.  
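
To make the distinction concrete, here is a minimal toy model of the two options. It is only a sketch: ToyEndpoint, CompletionLevel, and on_ack are hypothetical names invented for illustration, not anything in the OFI API, and the "provider" is assumed to copy the payload into internal buffering on every send.

```python
from enum import Enum


class CompletionLevel(Enum):
    LOCAL = "local"    # buffer may be reused; data may still be in flight
    REMOTE = "remote"  # destination has acknowledged the data


class ToyEndpoint:
    """Hypothetical model of a sending endpoint; not the OFI API."""

    def __init__(self):
        self.in_flight = []    # (payload copy, requested level), not yet acked
        self.completions = []  # completions visible to the application

    def send(self, buf, level):
        # Assumed internal buffering: the provider copies the data.
        payload = bytes(buf)
        self.in_flight.append((payload, level))
        if level is CompletionLevel.LOCAL:
            # The application's buffer is reusable now, so complete now.
            self.completions.append((payload, level))

    def on_ack(self):
        """The target acknowledged the oldest in-flight send."""
        payload, level = self.in_flight.pop(0)
        if level is CompletionLevel.REMOTE:
            # A remote completion is only reported once the ack arrives.
            self.completions.append((payload, level))
```

In this model a LOCAL send completes before anything has reached the target, which is exactly the window in which the initiator can exit and the data can be lost. A REMOTE send closes that window at the cost of waiting for the ack.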

All of the "clean shutdown" options feel like band-aids.  What are you going to do when App1 segfaults?  What about when it is sent SIGKILL?

And, btw, on many large-scale systems, SIGTERM is the way all apps go down. And if the app tries to get fancy with atexit() stuff?  SIGKILL is on its way.  At large scale, you cannot get the CM in the loop for all of those either.

Now, imagine that hundreds of thousands of app processes go down with SIGTERM and they were all talking to some I/O servers...  You don't want those I/O servers waiting for all of those clients to go away nicely.  And you aren't sending the I/O server a SIGTERM.  So, the I/O server has to have some form of "give up, it isn't getting here" associated with those in-flight operations.
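
One way to picture the "give up" behavior is a deadline attached to each outstanding receive. The sketch below is purely illustrative: PendingReceives and its methods are hypothetical names, time is passed in explicitly as a number of seconds, and nothing here corresponds to an actual provider or OFI call.

```python
class PendingReceives:
    """Hypothetical server-side table of receives still waiting for data."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.deadlines = {}  # client id -> time after which we give up

    def post(self, client, now):
        # Record when this client's in-flight operation should be abandoned.
        self.deadlines[client] = now + self.timeout_s

    def data_arrived(self, client):
        # The data showed up, so there is nothing left to time out.
        self.deadlines.pop(client, None)

    def expire(self, now):
        """Drop and return every client whose deadline has passed."""
        dead = sorted(c for c, t in self.deadlines.items() if t <= now)
        for c in dead:
            del self.deadlines[c]
        return dead
```

The server would call expire() periodically and release whatever resources it was holding for the returned clients, without needing any cooperation from the processes that went away.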

Keith

> -----Original Message-----
> From: ofiwg-bounces at lists.openfabrics.org [mailto:ofiwg-
> bounces at lists.openfabrics.org] On Behalf Of Hefty, Sean
> Sent: Wednesday, March 04, 2015 4:47 PM
> To: ofiwg at lists.openfabrics.org
> Subject: [ofiwg] definition of a completion in OFI
> 
> I'm seeing a problem running fabtests over the sockets provider that is
> exposing an issue in what it means for an operation to be complete.  As
> defined, a completion means "that the application's buffers may be re-used".
> This seems like a minimal definition that would work with any
> implementation, but it leads to this issue:
> 
> App 1 issues a send to app 2.
> Provider 1 queues the send, making use of internal buffering.
> Provider 1 generates a completion.
> App 1 exits.
> Data from app 1 is discarded or lost.
> 
> The result is app 2 hangs waiting for data that never shows up.  (This
> becomes a 2-armies problem.)
> 
> I see a couple of solutions for this.  One is to provide stronger requirements
> on when a completion can be generated, such as:
> 
> Completion: For reliable requests, indicates that the operation and its
> associated data have been acknowledged by the destination.  For unreliable
> requests, indicates that the request has successfully been transmitted into
> the fabric.
> 
> I'm not sure if all implementations can adhere to this definition.  I looked in
> the iWarp and IB specs, but I couldn't find any specific definition of what it
> means for an app to retrieve a completion.
> 
> A second solution is to add the behavior defined above to another call or
> event, such as fi_shutdown.  For example:
> 
> fi_shutdown() - For reliable endpoints, blocks until all operations and their
> associated data have been acked by the destination. For unreliable
> endpoints, indicates that all requests have successfully been transmitted into
> the fabric.
> 
> In this case, calling fi_close() without fi_shutdown() will abruptly close the
> endpoint.  There may need to be other constraints for fi_shutdown(), such as
> the app must ensure that all requests have completed.
> 
> Thoughts?
> 
> - Sean


