[ofiwg] wait sets

Thu Jan 28 20:59:04 PST 2016

[inline]

> On Jan 27, 2016, at 3:51 PM, Hefty, Sean <sean.hefty at intel.com> wrote:
> 
>> An attempt to describe the desired application behavior is:
>> 
>>    1. Wait for one or more events to occur
>>    2. Get a list of queues that are ready for action
>>    3. Process each queue until empty
> 
> I propose a couple of changes to the above sequence:
> 
>    1. If it is okay to wait
>    1.1.   Wait for one or more events to occur
>    2. Get list of queues ready for action
>    3. Process each queue
> 
> We then define step 1.  Ba-da-bing, ba-da-boom, and we're done.

Yes, this seems to help some.
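
As a rough sketch of the revised sequence (wait only if it is okay to, get the ready list, process each queue), something like the following might work. Everything here is a mock: struct mock_queue, ok_to_wait(), ready_queues(), progress_once() and fake_block() are hypothetical names invented for illustration, not libfabric calls, and the blocking wait is a caller-supplied callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CQ/EQ: just a count of pending events. */
struct mock_queue {
    int pending;    /* events not yet read by the app */
    int processed;  /* events the app has consumed */
};

/* Step 1: it is okay to wait only if no queue has pending events. */
static bool ok_to_wait(const struct mock_queue *q, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (q[i].pending > 0)
            return false;
    return true;
}

/* Step 2: collect the indices of the ready queues; returns how many. */
static size_t ready_queues(const struct mock_queue *q, size_t n, size_t *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (q[i].pending > 0)
            out[count++] = i;
    return count;
}

/* Step 3: drain one queue. */
static void process_queue(struct mock_queue *q)
{
    while (q->pending > 0) {
        q->pending--;
        q->processed++;
    }
}

/* Hypothetical blocking wait (step 1.1): pretends two events arrive
 * on the first queue while the caller is blocked. */
static void fake_block(struct mock_queue *q, size_t n)
{
    if (n > 0)
        q[0].pending += 2;
}

/* One pass of the loop. Returns true if the pass actually blocked. */
static bool progress_once(struct mock_queue *q, size_t n, size_t *scratch,
                          void (*block)(struct mock_queue *, size_t))
{
    bool waited = false;

    assert(q != NULL && scratch != NULL);
    if (ok_to_wait(q, n)) {
        block(q, n);
        waited = true;
    }
    size_t ready = ready_queues(q, n, scratch);
    for (size_t i = 0; i < ready; i++)
        process_queue(&q[scratch[i]]);
    return waited;
}
```

The point of the extra step is that a pass with events already queued skips the blocking call entirely and goes straight to processing.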

>> b.)  libfabric only calls - for multiple queues
>> 
>>    1. fi_wait
>>    2. fi_poll
>>    3. fi_cq_read/fi_eq_read/fi_cntr_read

I take it that the use of read (and not sread) in b3 above implies that the wait object would be reset at the start of fi_wait(), after checking the "okay to wait" conditions immediately below?

> Proposal - It is okay to wait when all associated:
>           CQs are empty +
>           EQs are empty +
>           Counters have not incremented since last wait call
> 
> There are other proposals I can think of, but they are more complex, such as introducing thresholds.  Counters present a challenge.  fi_wait can encapsulate these checks.  (I'm always at a loss whether to capitalize the first word of a sentence when it refers to a name that is spelled all in lowercase.)

(I always dodge by rearranging the sentence in small ways.  In the sentence above, for example, I might have stuck "Then" at the start, or rearranged it to be passive: "These checks can be encapsulated in fi_wait.")
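
The proposed "okay to wait" check (CQs empty, EQs empty, counters unchanged since the last wait call) might look roughly like this. It is only a sketch over mock data: struct mock_cntr, struct mock_waitset and ok_to_wait() are hypothetical, and recording each counter's value on every call is an assumption about what "since last wait call" would mean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter: current value plus the value seen at the last wait. */
struct mock_cntr {
    uint64_t value;
    uint64_t seen_at_wait;
};

/* Hypothetical wait set: the queues and counters associated with it. */
struct mock_waitset {
    const int *cq_depth;    /* pending completions per CQ */
    size_t ncq;
    const int *eq_depth;    /* pending events per EQ */
    size_t neq;
    struct mock_cntr *cntr;
    size_t ncntr;
};

/* Returns true when it is okay to wait: every CQ and EQ is empty and no
 * counter has moved since the previous call. Each call records the
 * current counter values as the new baseline (an assumption). */
static bool ok_to_wait(struct mock_waitset *ws)
{
    bool ok = true;

    assert(ws != NULL);
    for (size_t i = 0; i < ws->ncq; i++)
        if (ws->cq_depth[i] > 0)
            ok = false;
    for (size_t i = 0; i < ws->neq; i++)
        if (ws->eq_depth[i] > 0)
            ok = false;
    for (size_t i = 0; i < ws->ncntr; i++) {
        if (ws->cntr[i].value != ws->cntr[i].seen_at_wait)
            ok = false;
        ws->cntr[i].seen_at_wait = ws->cntr[i].value;
    }
    return ok;
}
```

The counter case is the awkward one: unlike a CQ or EQ there is nothing to drain, so the check has to carry state from one wait call to the next.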

>> c.)  OS + libfabric calls - for one or multiple queues
>> This modifies the above sequence to:
>> 
>>    1. poll/select
>>    2. fi_poll
>>    3. fi_cq_read/fi_eq_read/fi_cntr_read
> 
> And we're back to the problem child.  For each fd that the app references, it should perform a check to see if it is okay to wait on that fd.  This suggests introducing new interfaces for that purpose.
> 
> fi_cq_trywait / fi_eq_trywait / fi_cntr_trywait / fi_trywait
> 
> Note that the app only needs to call trywait (names are hard) if it is directly using the fd from that object.  E.g. if it is polling on the fd from a wait set, and the wait set is associated with 4 CQs, the app only needs to call fi_trywait, not fi_cq_trywait.
> 
> The trywait calls can be static inline wrappers around existing calls (fi_cq_sread, fi_eq_sread, fi_cntr_wait, fi_wait).  Trywait would return 0 when it's safe to wait.

If we end up with such an fi_trywait() call, it seems like you would want it to take an array of fids to avoid requiring the app to always put its own loop around the calls.

Where does the wait object get reset in sequence c?  In trywait (a further argument for trywait taking an array), or in an sread that must come after all the reads in c3?
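
To make the array idea concrete, here is a minimal sketch of a trywait that takes a list of objects and does the reset itself. It shows one possible answer to the question above, not a statement of how it would actually work: struct mock_fid, mock_trywait_one() and mock_trywait() are hypothetical, the -1 return for "not safe" is an assumption (only the 0 for "safe to wait" comes from the proposal), and "armed" simply stands in for whatever resetting the wait object involves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle; it mirrors the idea of a fid but is not the
 * libfabric definition. */
struct mock_fid {
    int pending;   /* events that have not been read yet */
    int armed;     /* 1 once the wait object has been reset for waiting */
};

/* Hypothetical per-object check: 0 means safe to wait, -1 means there
 * is something to read first. The reset happens only when it is safe. */
static int mock_trywait_one(struct mock_fid *fid)
{
    if (fid->pending > 0)
        return -1;
    fid->armed = 1;
    return 0;
}

/* Array form: the loop lives here so that the app does not need its own.
 * Returns 0 only if every object is safe to wait on. */
static int mock_trywait(struct mock_fid **fids, size_t count)
{
    int ret = 0;

    assert(fids != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        if (mock_trywait_one(fids[i]) != 0)
            ret = -1;
    return ret;
}
```

With this shape the app would call mock_trywait() once on all the fids it is about to hand to poll/select, and go back to reading if it returns non-zero.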

-Dave