[ofiwg] send/recv "credits"

Wed Oct 8 16:25:49 PDT 2014

It would be helpful to understand better the application usage model here.  Specifically, why does the app need to know if its next post operation will return EAGAIN or not?
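For reference, the model I would assume by default is that the app simply posts, and backs off to process completions when a post returns EAGAIN. A minimal sketch of that pattern follows. The mock_txq type and the mock_post/mock_reap calls are hypothetical stand-ins for a provider queue, not libfabric interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a provider transmit queue; not a libfabric type. */
struct mock_txq {
    size_t depth;   /* total slots in the queue */
    size_t in_use;  /* slots held by outstanding operations */
};

/* Hypothetical post call: 0 on success, -EAGAIN when the queue is full. */
static int mock_post(struct mock_txq *q)
{
    if (q->in_use == q->depth)
        return -EAGAIN;
    q->in_use++;
    return 0;
}

/* Hypothetical progress call: retires up to max outstanding operations
 * and returns how many were retired. */
static size_t mock_reap(struct mock_txq *q, size_t max)
{
    size_t n = q->in_use < max ? q->in_use : max;
    q->in_use -= n;
    return n;
}

/* Post count operations without asking ahead of time whether a post will
 * succeed. On -EAGAIN, reap one completion and retry. Returns the number
 * of times the app had to back off. */
static size_t post_all(struct mock_txq *q, size_t count)
{
    size_t backoffs = 0;

    assert(q->depth > 0);
    while (count > 0) {
        if (mock_post(q) == -EAGAIN) {
            mock_reap(q, 1);
            backoffs++;
            continue;
        }
        count--;
    }
    return backoffs;
}
```

If the app needs more than this, that is the usage model I would like to understand.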

Exposing attributes to the app is a relatively low burden on the providers.  Exposing an API that could be called at any point has the potential to negatively impact performance.  For example, it could force serialization between the posting of operations and the processing of their completions, with a significant impact when dealing with requests that will not generate a completion.

Based on various application requirements, the libfabric API has evolved to contain the following objects.

Endpoint - an endpoint is associated with a transport level address
Transmit context - i.e. send queue, only used by advanced apps
Receive context - i.e. receive queue, only used by advanced apps

(Transmit and receive contexts are exposed using the struct fid_ep.)
Ultimately, there will be a many-to-many relationship between contexts and endpoints.  The semantic that transmit and receive contexts are fixed-size queues is something that I was hoping to move away from.  (Though, to be fair, I was also hoping to hide the transmit and receive contexts completely.)

Any credit-based solution also needs to incorporate 'injected' (aka inline) data.  Injected buffers are not necessarily referenced using an IOV, but may instead require copying the data directly into the transmit queue.  The size of the consumed queue space may also depend on the operation, not the IOV.  Some of the complex atomic operations that have been defined could easily consume 2-3 times more space in a transmit queue than a simpler operation.
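To make the accounting problem concrete, here is a rough sketch of a credit scheme where the cost of an operation depends on its type and on the injected length rather than on the IOV count. The operation classes, the 64-byte slot size, and the per-operation costs are assumptions picked for illustration; none of them come from a real provider.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation classes; costs below are illustrative only. */
enum op_kind { OP_SEND, OP_INJECT, OP_ATOMIC_COMPLEX };

#define SLOT_BYTES 64u  /* assumed size of one transmit queue slot */

/* Queue slots consumed by one operation. */
static size_t op_cost(enum op_kind kind, size_t inject_len)
{
    switch (kind) {
    case OP_INJECT:
        /* one descriptor slot plus the slots needed to hold the copied data */
        return 1 + (inject_len + SLOT_BYTES - 1) / SLOT_BYTES;
    case OP_ATOMIC_COMPLEX:
        /* assumed: a complex atomic takes 3x the space of a simple send */
        return 3;
    case OP_SEND:
    default:
        return 1;
    }
}

struct credits {
    size_t avail;  /* free transmit queue slots */
};

/* Returns 1 and debits the credits if the operation fits, 0 otherwise. */
static int credits_take(struct credits *c, enum op_kind kind, size_t inject_len)
{
    size_t need;

    assert(c != NULL);
    need = op_cost(kind, inject_len);
    if (need > c->avail)
        return 0;
    c->avail -= need;
    return 1;
}
```

Even in this toy form, a single "number of posts remaining" value is not well defined: with 4 slots free, the answer is 4 simple sends, or 1 complex atomic, or 1 send plus a 100-byte inject.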

And we haven't even touched on immediate data, which apps have requested be larger than the current 32 bits.

- Sean