[ofiwg] send/recv "credits"

Wed Sep 24 12:36:19 PDT 2014

> Providing a mechanism for the app to efficiently query available credits is
> perhaps the easiest, along with the "max credits" a send/recv can consume.
> Reserving max_credits is much more palatable than dividing by max_credits,
> and that would work for us.  This could even be collapsed into something
> like "fi_ep_ok_to_send(num_sends)" which returns true if the EP has space
> to post num_sends maximally sized sends.

How would this usage model differ significantly from the data transfer call simply returning FI_EAGAIN?
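
To make the comparison concrete, here is a minimal sketch in C of the two usage models against a mock endpoint. Nothing in it is libfabric API: struct mock_ep, mock_send(), mock_ok_to_send(), and MOCK_EAGAIN are hypothetical names invented for illustration, and the sketch assumes a single outbound queue whose free space is counted in credits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the provider's "try again" error code. */
#define MOCK_EAGAIN 11

/* Hypothetical endpoint: a single outbound queue measured in credits. */
struct mock_ep {
    size_t credits;      /* credits currently free in the send queue */
    size_t max_credits;  /* worst-case credits one send can consume */
};

/* Model A: post the send and let the call itself report a full queue. */
static int mock_send(struct mock_ep *ep, size_t cost)
{
    assert(ep != NULL && cost <= ep->max_credits);
    if (cost > ep->credits)
        return -MOCK_EAGAIN;   /* caller retries after reaping completions */
    ep->credits -= cost;
    return 0;
}

/* Model B: ask up front whether num_sends maximally sized sends would fit. */
static bool mock_ok_to_send(const struct mock_ep *ep, size_t num_sends)
{
    assert(ep != NULL && ep->max_credits > 0);
    return num_sends <= ep->credits / ep->max_credits;
}
```

Under these assumptions, the two models only diverge when the app wants an all-or-nothing answer for a batch of sends before posting the first one; for a single send, the query and the error return carry the same information.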

> In your opinion, would the apps that want to do their own credit management
> be content to let libfabric maintain the credit calculations, but make them
> queryable.  Thus, the app could always do "if (fi_ep_get_send_credits() >=
> min_credits) ..."  min_credits would be an attribute returned when ep is
> created.  I think this is my current favorite approach.
> fi_ep_get_send_credits() would likely just be an inline that returns a
> value from the ep_fid.
> 
> I agree having post operations return the number of credits consumed and
> completions report credits returned is also viable, just seems a bit more
> complex both for app and lib.
> 
> Providers like sockets could set credits to 1, min_credits to 1, and just
> never reduce current credits so that the credit test would always succeed.

My concern is that these methods may restrict the underlying implementation by assuming that there is a single outbound queue.  Based on the data ordering constraints specified by the application, multiple command queues could be in use.  Examples: RMA and message transfers may be separated.  Message transfers may be separated based on transfer size.  Transfers may be separated based on the target address.  Etc.  DAPL has code similar to this to handle performance limitations on Xeon Phi, and it's easier to have a layered provider within libfabric expand the capabilities of a lower-level provider than to have all apps implement this functionality or re-code to another interface.
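
A rough sketch of the problem, in C. All names here (struct mock_mq_ep, mock_pick_queue(), the three-queue split, and the 4096-byte threshold) are hypothetical and only illustrate one way a provider might separate transfers; no real provider is claimed to work this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical split of one endpoint into several command queues. */
enum mock_queue {
    MOCK_Q_RMA,
    MOCK_Q_MSG_SMALL,
    MOCK_Q_MSG_LARGE,
    MOCK_Q_COUNT
};

/* Arbitrary size cut-off chosen for this sketch only. */
#define MOCK_LARGE_MSG_THRESHOLD 4096

struct mock_mq_ep {
    size_t credits[MOCK_Q_COUNT];  /* free credits per command queue */
};

/* Choose the queue a transfer would land on, by type and size. */
static enum mock_queue mock_pick_queue(int is_rma, size_t len)
{
    if (is_rma)
        return MOCK_Q_RMA;
    return len >= MOCK_LARGE_MSG_THRESHOLD ? MOCK_Q_MSG_LARGE
                                           : MOCK_Q_MSG_SMALL;
}

/* What a single endpoint-wide credit query would have to report. */
static size_t mock_total_credits(const struct mock_mq_ep *ep)
{
    size_t total = 0;
    assert(ep != NULL);
    for (int q = 0; q < MOCK_Q_COUNT; q++)
        total += ep->credits[q];
    return total;
}

/* What the app actually needs to know for one specific transfer. */
static size_t mock_credits_for(const struct mock_mq_ep *ep,
                               int is_rma, size_t len)
{
    assert(ep != NULL);
    return ep->credits[mock_pick_queue(is_rma, len)];
}
```

With the small-message queue full and the other two queues idle, mock_total_credits() still reports plenty of credits, yet a small send has none available. A single per-endpoint credit value cannot express that, so any query would need to take the transfer's attributes into account or report a conservative value.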

The challenge is to define something simple enough that an app can use it for common cases, but also be able to support more complex provider implementations.

I should note that rsockets uses application credits to manage its data flow, in order to prevent deadlocks when updating buffering, so this is a feature that I need myself.

- Sean