[ofiwg] send/recv "credits"

Reese Faucette (rfaucett) rfaucett at cisco.com
Wed Oct 8 07:57:01 PDT 2014


Sure - multithreaded apps sharing a resource always have to take more care than single-threaded apps.  This API does let an app do its own credit accounting: make one early call to get credits to initialize a local credit counter, then add/subtract based on cached costs.  With that, there's no need to hold a lock over the whole "check ... send" block; only the counter update has to be atomic:
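	/* take the credits up front; only this step needs to be atomic */
	if (atomically_check_and_consume_credits(cost) == 0) {
		prepare_for_send();
		if (some_problem)
			return_credits(cost);	/* send abandoned, give the credits back */
		else
			do_send();	/* credits come back when the send completes */
	}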
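
For illustration only, here is a minimal sketch of how the two helpers above might be built on C11 atomics. The `struct credit_pool` type, `credit_pool_init()`, and the extra pool argument are assumptions made for this example and are not part of the proposal or of any libfabric API; the pool would be seeded once from something like the proposed fi_get_send_credits(ep).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical app-side credit pool; not part of any libfabric API. */
struct credit_pool {
	atomic_int avail;	/* credits the app believes are free */
};

/* Seed the pool once, e.g. from one early "get credits" call. */
static void credit_pool_init(struct credit_pool *p, int initial)
{
	atomic_init(&p->avail, initial);
}

/* Returns 0 and consumes `cost` credits if enough are free, else -1. */
static int atomically_check_and_consume_credits(struct credit_pool *p, int cost)
{
	assert(cost >= 0);
	int cur = atomic_load(&p->avail);
	while (cur >= cost) {
		/* On failure `cur` is refreshed with the current value; retry. */
		if (atomic_compare_exchange_weak(&p->avail, &cur, cur - cost))
			return 0;
	}
	return -1;
}

/* Give credits back after an abandoned send or a send completion. */
static void return_credits(struct credit_pool *p, int cost)
{
	assert(cost >= 0);
	atomic_fetch_add(&p->avail, cost);
}
```

The compare-and-swap loop makes "check" and "consume" a single step, so two threads can never both claim the same credits, and no lock is held while the send is being prepared.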

-reese

> -----Original Message-----
> From: Atchley, Scott [mailto:atchleyes at ornl.gov]
> Sent: Wednesday, October 08, 2014 5:35 AM
> To: Reese Faucette (rfaucett)
> Cc: Hefty, Sean; Jason Gunthorpe; ofiwg at lists.openfabrics.org
> Subject: Re: [ofiwg] send/recv "credits"
> 
> Reese,
> 
> I assume that I have to hold a lock between a get credits call and the
> associated send() or recv() to ensure another thread does not "steal" my
> credits...
> 
> Scott
> 
> On Oct 8, 2014, at 2:04 AM, Reese Faucette (rfaucett) <rfaucett at cisco.com>
> wrote:
> 
> > Following up on discussion in today's call, I propose two new classes of
> > calls, one to check available credits and one to check the cost of operations.
> > I am going to keep calling them "credits" instead of "bytes" for now, but
> > these are arbitrary units of "something", you get some number of them on
> > an EP and each operation you do consumes some number of them, returned
> > when the operation completed.
> >
> > int fi_get_send_credits(ep);  // current number of send credits on EP
> > int fi_get_recv_credits(ep);  // current number of recv credits on EP
> >
> > // how many credits would this send require?
> > int fi_sendv_cost(ep, struct iovec *iov, int iovcnt);
> >
> > // how many credits would this recv require?
> > int fi_recvv_cost(ep, struct iovec *iov, int iovcnt);
> >
> > (a non-SGL send/recv could be defined to use same # of credits as
> > corresponding iovcnt=1 operation, but we could also add
> > fi_send_cost(ep, buf, len);)
> >
> > Then, user has a lot of flexibility in how to use these, though generally I
> > would expect the app to call fi_xxx_cost() a few times at startup for
> > different classes of sends and cache the results.
> >
> > One way would be to do a sendv_cost or recv_cost on maximal size send
> > app will do and always test:
> >   If (fi_get_send_credits(ep) > max_credits_per_send) { compose and send; }
> >
> > If an app does not care about shrinking send queue depth, it is free to say:
> >  my_credits = fi_get_send_credits(ep) / max_credits_per_send; and then
> > always increment/decrement my_credits by 1 on
> > consumption/completion, independent of provider.
> >
> > Often, sends go down different silos based on various criteria in an
> > application anyhow, so the app could check the cost of a send for each silo
> > and select its own send routine with constants for the credit values,
> > avoiding the memory references.
> >
> > And, of course, an app could ignore all of these calls and rely on EAGAIN.
> > Also, not all EPs necessarily support this mode of operation, for example
> > EPs with multiple underlying HW resources (queues) where any given
> > send/recv gets late-bound to one of the hidden resources.
> >
> > This mode of operation would be requested via attribute
> > (FI_SEND_CREDITS / FI_RECV_CREDITS ?) at endpoint open.  Open question
> > of whether this implicitly turns off provider double-checking the credits on
> > each send (I *think* I'd like it to), whether that mode is a separate flag
> > (FI_NO_CREDIT_CHECK?) (I'm ok with that also), or whether it's not really
> > worth it to get the provider to forgo the check.
> >
> > -r
> >
> >> -----Original Message-----
> >> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> >> Sent: Monday, September 29, 2014 12:01 PM
> >> To: Jason Gunthorpe
> >> Cc: Reese Faucette (rfaucett); Sur, Sayantan; Doug Ledford; Jeff
> >> Squyres (jsquyres); ofiwg at lists.openfabrics.org
> >> Subject: RE: [ofiwg] send/recv "credits"
> >>
> >>>> Independent from EAGAIN, Does the op_size / iov_size / op_alignment
> >>>> proposal work for apps that want to track send queue usage separate
> >>>> from the provider's tracking?
> >>>
> >>> I didn't follow it too closely, sorry.  How does an app adapt a
> >>> provider that is telling it to use sge entries to work with a wire
> >>> protocol that is defined in terms of wqes?
> >>
> >> The size of the transmit queue is reported in bytes.  An app does
> >> this check to determine if it can queue an entry into the transmit
> >> queue.  (An app can simplify this check in certain cases.)
> >>
> >> 	needed = ((op_size + iov_size * nsge) + op_alignment - 1) & ~(op_alignment - 1)
> >>
> >> For providers that support WQEs, needed = op_alignment.
> >> For providers that support SGEs, needed = iov_size * nsge.
> >>
> >> This should also support providers where the size of the queue is
> >> fixed, but the number of entries is not.
> >>
> >>> The remote CQ doesn't overflow because every SQE and RQE is still
> >>> guaranteed by the app to have an available CQE before it is posted.
> >>> So you are guaranteed to hit RQ exhaustion before you hit CQ
> >>> exhaustion.
> >>
> >> Libfabric supports, but does not assume a 1:1 mapping between a
> >> posted receive buffer and a CQE.  This allows for more efficient use
> >> of receive buffering, but does require a more advanced form of flow
> >> control than current hardware supports.
> >>
> >> I don't want the API to assume that even a CQ has a fixed number of
> >> entries.
> >> An app should be able to determine the minimum number of entries any
> >> queue may support, without restricting all providers or applications
> >> to using that same model.
> >>
> >> - Sean
> > _______________________________________________
> > ofiwg mailing list
> > ofiwg at lists.openfabrics.org
> > http://lists.openfabrics.org/mailman/listinfo/ofiwg
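
For reference, the byte-based transmit-queue check from Sean's quoted message can be sketched in C as below. The names op_size, iov_size, op_alignment and nsge come from that message; `struct txq_attr`, the function names, and `free_bytes` are hypothetical and exist only for this example. The sketch also assumes op_alignment is a nonzero power of two, which the mask arithmetic in the quoted formula requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical holder for the provider-reported values named in the thread. */
struct txq_attr {
	size_t op_size;		/* fixed per-operation overhead, in bytes */
	size_t iov_size;	/* bytes consumed per scatter-gather entry */
	size_t op_alignment;	/* entry size is rounded up to this (power of two) */
};

/* Bytes of transmit queue an operation with `nsge` SGEs would occupy. */
static size_t txq_bytes_needed(const struct txq_attr *a, size_t nsge)
{
	/* The mask trick below is only valid for a nonzero power of two. */
	assert(a->op_alignment != 0 &&
	       (a->op_alignment & (a->op_alignment - 1)) == 0);
	size_t raw = a->op_size + a->iov_size * nsge;
	return (raw + a->op_alignment - 1) & ~(a->op_alignment - 1);
}

/* Nonzero if the operation fits in `free_bytes` of remaining queue space. */
static int txq_can_post(const struct txq_attr *a, size_t nsge, size_t free_bytes)
{
	return txq_bytes_needed(a, nsge) <= free_bytes;
}
```

With op_size = 0 and op_alignment = 1 the result reduces to iov_size * nsge, the SGE-style case in the quoted message. When op_size + iov_size * nsge never exceeds op_alignment, every operation costs exactly op_alignment, the WQE-style case.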