[ofiwg] send/recv "credits"

Hefty, Sean sean.hefty at intel.com
Wed Sep 24 10:35:46 PDT 2014

Copying ofiwg

> Perhaps this is an ongoing discussion, but I don't see any documentation or
> issues about the max number of pending (non-completed) send or receive
> operations, or how to track them.  I see the issue on "flow control (#69)"
> - I am guessing that is not this topic, but rather wire flow control?

See https://github.com/ofiwg/libfabric/issues/10 for a related discussion that ties back into this.

Also see https://github.com/ofiwg/libfabric/issues/31.

> Anyhow, one of the differences between our hardware and IB hardware is the
> way our send and receive queue entries work.  My understanding is that each
> IB queue entry can take an SGL, where each of ours simply takes an address
> and a length.
> This difference percolates up into the API to mean that for IB, the max
> number of pending sends is independent of the number of SGEs in each
> operation, so it makes sense for verbs to report the number of queue
> entries available, and credit accounting is that each send/receive consumes
> one entry.
> For usnic, the number of queue entries consumed is variable, based on the
> number of SGEs in the send.  For us to implement the verbs model on top of
> our HW required us to define a max SGE/send, and divide the number of HW
> queue entries by that max, and report that reduced number as the number of
> queue entries available.  This has the effect of artificially lowering both
> our reported max SGE per operation and also the reported queue depth.
> When implementing our own API for the hardware, we define each QP
> (endpoint) to have a fixed number of credits (HW queue entries) associated,
> and allow each send/recv operation to consume a variable (but well defined)
> number of credits, which makes much more efficient use of the queue
> entries.
> So, I'm hoping that libfabric will use a credit model that supports this
> "variable # of credits per operation" approach, or something equivalent.
> What's the current thinking on this?

I agree that something is needed, but I'm not sure what exactly.  Associating an endpoint with a specific number of queue entries or credits appears to be needed, but is also limiting.  Currently the data transfer APIs are allowed to return FI_EBUSY or FI_EAGAIN (or something like that) to indicate that a request cannot be queued by the provider.  But I agree that an app should have access to some sort of credit count.

In my last response to Doug on issue 10 above, I suggested defining a value such as min_credit_count.  An app is guaranteed to be able to initiate this many transfers, but may be able to issue more.  For the case of usnic, there would be a couple of options.  One would be to report a max SGL size of 1, with min_credit_count equal to the queue depth.  A second would be to report a max SGL size of N, with min_credit_count equal to queue depth / N.  The latter option could support the app queuing more requests, with the provider returning FI_EAGAIN when the underlying queue was full.

This seems reasonably simple for the app to use, but the latter option could leave some entries unused for apps that manage their own credits, since the provider must be prepared to handle the worst case (i.e. every transfer using the max SGL).

- Sean
