[ofiwg] the send credit thread that will not die

Robert D. Russell rdr at iol.unh.edu
Mon Nov 10 12:37:03 PST 2014


During last week's meeting a method of determining send/recv credits
was discussed that seemed to have some support, and I will
attempt to summarize that method here.

The idea was based on the observation that many applications use
only a small, fixed number of different work request types.  If
there were a simple way to determine the number of credits each of
those types requires, the application could cache those numbers once
as part of its initialization.  During operation the app could then
track its credit usage to avoid the EAGAIN error when trying
to post a work request.  (To maximize fast path performance,
many apps probably already construct their work requests,
sges, etc. once as part of initialization, so this may not
be much of an added burden.)

The proposal was to have a query function in which a prototype
of the work request (i.e., a representative work request as
it would be built by the app) could be given to the provider
which would return a number to the app that would be the number of
credits required by that work request in an ibv_post_send
and/or ibv_post_recv (depending on the opcode).
The work request should indicate the opcode, all relevant flags
and sge list items, etc., with a "nominal" length in each sge.
The app would save the resulting credit number along with each
work request type, so that when the app wanted to use a work request
of that type, it would know the number of credits it needs in order
to successfully post it without getting an EAGAIN error back.
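
To make the proposal above concrete, here is a rough sketch of what
the init-time caching might look like.  Every name in it
(query_wr_credits, proto_wr, wr_type, cache_wr_credits) is
hypothetical; no such call exists in libibverbs or libfabric today.
The sketch uses a stand-in struct rather than the real
struct ibv_send_wr so that it compiles on its own, and the credit
formula inside query_wr_credits is invented purely for illustration,
since the real formula would be provider-specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a real work request and its sge list. */
enum wr_opcode { WR_SEND, WR_RDMA_WRITE, WR_RECV };

struct proto_sge {
    uint32_t length;              /* "nominal" length only */
};

struct proto_wr {
    enum wr_opcode opcode;
    unsigned flags;
    const struct proto_sge *sg_list;
    int num_sge;
};

/*
 * Hypothetical query function.  In the proposal this lives inside the
 * provider, which hides the real formula.  The formula used here
 * (one credit plus one per two sges) is made up for the example.
 */
static int query_wr_credits(const struct proto_wr *wr)
{
    assert(wr->num_sge >= 0);
    return 1 + wr->num_sge / 2;
}

/* One entry per work request type the app uses. */
struct wr_type {
    struct proto_wr wr;           /* prototype, built once at init */
    int credits;                  /* cached answer from the provider */
};

/* Called once at initialization, outside the fast path. */
static void cache_wr_credits(struct wr_type *types, size_t n)
{
    for (size_t i = 0; i < n; i++)
        types[i].credits = query_wr_credits(&types[i].wr);
}
```

After cache_wr_credits runs, the app never needs to call the query
function again for those work request types.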

The big advantage to this method is that the formula to calculate
this credit number is "hidden" within the provider, so that the app
does not have to perform a complex calculation on a vector.
It assumes that everything the provider needs to know in order
to perform that calculation can be derived from the work request,
and that the provider probably has to perform this calculation
in any case when it processes a post send or receive in order
to protect itself from queue overflow and to return EAGAIN.
Therefore, the impact to the provider of adding the new query function
should be minimal.  Furthermore, the user is not required to call this
query function in the fast path, so the impact to the user of
tracking credits in the fast path is just simple scalar additions,
subtractions and comparisons.
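
For illustration, the fast-path bookkeeping described above might
reduce to something like the following.  The names (credit_pool,
credits_try_take, credits_give_back) are hypothetical, and the sketch
assumes the app already knows the total number of credits available
on the queue, which this summary does not address.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-queue credit counter kept by the app. */
struct credit_pool {
    int available;                /* credits currently free */
};

/*
 * Call before posting a work request, passing the credit count cached
 * for its type.  Returns false when the post would not fit, so the app
 * can hold the request back instead of getting EAGAIN.
 */
static bool credits_try_take(struct credit_pool *p, int needed)
{
    if (needed > p->available)
        return false;             /* one comparison */
    p->available -= needed;       /* one subtraction */
    return true;
}

/* Call when the completion for a posted work request is reaped. */
static void credits_give_back(struct credit_pool *p, int n)
{
    assert(n >= 0);
    p->available += n;            /* one addition */
}
```

The app would call credits_try_take with the cached number just
before each post, and credits_give_back with the same number when the
corresponding completion is reaped.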

Hope this summary is accurate.
Bob Russell


