[openib-general][PATCH][kdapl]: evd upcall policy implementation

Guy German guyg at voltaire.com
Thu Aug 18 08:39:12 PDT 2005


Hi Caitlin,
Caitlin Bestler <mailto:caitlin.bestler at gmail.com> wrote:
> Some clarifications are needed here.
> 
> First the Consumer is responsible for draining the
> EVD after re-enabling it, or at least for remembering
> that there may be undrained notified events.

Can you please explain what you mean by "re-enabling"
the EVD? Do you mean calling dat_evd_modify_upcall
and changing the upcall policy from disabled back to
enabled?

> 
> That is "you-have-been-notified" is a sticky boolean
> attribute that the Consumer is supposed to set to TRUE
> when the upcall is made and only clear when the EVD
> has been drained *after* re-enabling.
> 
> Second, the EVD is first and foremost an event
> *serializer*. It is presumed to have a finite number of
> resources for making upcalls (at most one for the typical
> case where SINGLE is enabled). The next upcall per
> resource CANNOT occur until after the current upcall
> has completed.
> 
> Whether this should be solved in the DAT Provider is
> a question of what the verb-layer provider is allowed
> to do. If the verb layer provider can in fact generate
> multiple concurrent upcalls for the same CQ then the
> EVD itself must guard against re-entrancy.
> 
> A more likely implementation is that upcalls triggered
> by post_se, CM events and CQs could theoretically
> occur at the same instant -- but that none of these
> paths can be re-entrant by themselves.
> 
> Once the potential re-entrancy from the verb layer
> is known, then an optimal strategy can be selected.
> For example, if the only potential re-entrancy comes
> when the upcall interrupts a post_se call then some
> simple critical regions can avoid all problems without
> general purpose spinlocks or semaphores.
> 
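To make the re-entrancy point concrete, here is a minimal user-space sketch of a guard for the single-upcall case, assuming the verb layer really can deliver concurrent upcalls for the same EVD. It uses a C11 atomic_flag so that a second upcall is refused while one is still running. The names evd_sketch, evd_try_upcall and reentrant_probe are hypothetical and are not part of kDAPL or the OpenIB API; a kernel implementation would use kernel primitives instead.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an EVD; not the kDAPL structure. */
struct evd_sketch {
    atomic_flag in_upcall;   /* set while an upcall is running */
    int upcalls_run;         /* upcalls that actually ran */
    int nested_rejected;     /* nested attempts that were refused */
};

typedef void (*upcall_fn)(struct evd_sketch *evd);

/*
 * Run fn as the upcall unless one is already in progress.
 * Returns false when the upcall was refused; in that case the
 * event would simply stay queued for the running upcall to find.
 */
bool evd_try_upcall(struct evd_sketch *evd, upcall_fn fn)
{
    if (atomic_flag_test_and_set(&evd->in_upcall))
        return false;               /* current upcall not finished */
    evd->upcalls_run++;             /* only touched while flag is held */
    if (fn != NULL)
        fn(evd);
    atomic_flag_clear(&evd->in_upcall);
    return true;
}

/* Example upcall body that tries to re-enter the same EVD. */
void reentrant_probe(struct evd_sketch *evd)
{
    if (!evd_try_upcall(evd, NULL))
        evd->nested_rejected++;
}
```

If, as suggested above, the only re-entrancy turns out to come from one specific path, a narrower critical region could replace this general guard.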
> On 8/16/05, James Lentini <jlentini at netapp.com> wrote:
>> 
>> 
>> On Tue, 16 Aug 2005, Guy German wrote:
>> 
>>>>>>>>> Also, the pending_event_queue is only used for kDAPL generated
>>>>>>>>> software events. This queue can be empty when there are
>>>>>>>>> events on the CQ, so you would need to expand your
>>>>>>>>> check to cover that.
>>>>>>> 
>>>>>>> Actually, even though I agreed before, I tend to disagree now.
>>>>>>> The consumer will still get the DTO events as soon as the CQ
>>>>>>> upcall is triggered (enabled), so the only problem is with the
>>>>>>> pending events list.
>>>>>> 
>>>>>> Why is it an error for the consumer to modify the upcall policy
>>>>>> when there are pending events?
>>>>>> 
>>>>>> dat_evd_modify_upcall should behave just like the IBTA spec's
>>>>>> Request Completion Notification verb in this respect. If there
>>>>>> were events on the EVD before the upcall is enabled, no upcall
>>>>>> needs to be generated. A correct consumer can easily work around
>>>>>> this by enabling the upcall and polling the EVD one final time
>>>>>> to ensure it is empty.
>>>>> 
>>>>> There can be more than one event, and the consumer would need to
>>>>> dequeue many times. While the consumer is doing this extra
>>>>> dequeuing he might also get an upcall, because his policy is
>>>>> now enabled. I can't think of a design that can handle such a
>>>>> case, and if there is one it is demanding and complicated from
>>>>> the consumer's side.
>>>> 
>>>> Isn't it the same position all event code written to the OpenIB
>>>> API is in?
>>> 
>>> I don't quite know what you are referring to, but if you are
>>> referring to the case of cq in IB - It's totally different: you
>>> only enable the cq once, so you will only get one upcall, and the
>>> rest of the events you will need to dequeue.
>> 
>> The consumer should only receive one upcall at a time if the upcall
>> policy is DAT_UPCALL_SINGLE_INSTANCE. If the dequeues are performed
>> in an upcall, the logic needed in an OpenIB consumer and kDAPL
>> consumer is essentially the same. 
>> 
>> The difference is that the OpenIB consumer needs to re-enable the CQ
>> upcall and poll to make sure no events were missed.
>> 
>>>> I agree with you that this programming model is difficult to use,
>>>> but I don't think it is impossible.
>>> 
>>> I think it is a bad idea to dequeue events and at the same time
>>> receive upcalls from the same queue. It is racy, and has bad
>>> performance. I don't see *any* reason to do it.
>> 
>> The current kDAPL implementation does create a situation in which an
>> upcall and poll occur simultaneously if the upcall is disabled, the
>> consumer enables the upcall, and then the consumer does a poll. In
>> this scenario an upcall can occur while the consumer is polling. I
>> was pointing out that this same race exists in the OpenIB verbs API
>> (and the IBTA verbs). 
>> 
>> Again, I agree that we can eliminate the additional poll after
>> enabling the upcall in kDAPL. We just need to do it in a way that is
>> not hardware specific. I believe we can use the same technique we
>> did in the DTO upcall. 
>> 
>> james
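For reference, the "enable the upcall, then poll one final time" pattern discussed in this thread can be sketched as below. This is a single-threaded user-space simulation under stated assumptions, not kDAPL or OpenIB code: mock_cq, mock_cq_push, mock_cq_poll, mock_cq_arm and drain_then_arm are all hypothetical names, and the mock assumes a one-shot notification that fires only for events arriving after the queue is armed.

```c
#include <stddef.h>

#define MOCK_CQ_DEPTH 16

/* Hypothetical event queue with a one-shot notification. */
struct mock_cq {
    int events[MOCK_CQ_DEPTH];
    size_t head;     /* next slot to dequeue */
    size_t tail;     /* next free slot */
    int armed;       /* 1 if the next push should trigger an upcall */
    int upcalls;     /* number of upcalls triggered so far */
};

/* Enqueue an event; fire the one-shot upcall if the queue is armed. */
int mock_cq_push(struct mock_cq *cq, int ev)
{
    if (cq->tail - cq->head == MOCK_CQ_DEPTH)
        return 0;                    /* queue full */
    cq->events[cq->tail % MOCK_CQ_DEPTH] = ev;
    cq->tail++;
    if (cq->armed) {
        cq->armed = 0;               /* notification is one-shot */
        cq->upcalls++;
    }
    return 1;
}

/* Dequeue one event; returns 0 when the queue is empty. */
int mock_cq_poll(struct mock_cq *cq, int *ev)
{
    if (cq->head == cq->tail)
        return 0;
    *ev = cq->events[cq->head % MOCK_CQ_DEPTH];
    cq->head++;
    return 1;
}

void mock_cq_arm(struct mock_cq *cq)
{
    cq->armed = 1;
}

/*
 * Drain the queue, arm it, then poll once more. Events that were
 * queued before arming generate no upcall, so the extra poll is
 * what catches them. Returns the number of events consumed.
 */
int drain_then_arm(struct mock_cq *cq)
{
    int ev, handled = 0;

    for (;;) {
        while (mock_cq_poll(cq, &ev))
            handled++;               /* a real consumer would process ev */
        mock_cq_arm(cq);
        if (!mock_cq_poll(cq, &ev))
            return handled;          /* empty after arming: safe to wait */
        handled++;                   /* something slipped in; drain again */
    }
}
```

In a real consumer the upcall could fire while this final poll is running, which is the race discussed above; the sketch does not model that concurrency.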