[openib-general][PATCH][kdapl]: evd upcall policy implementation

Tue Aug 16 03:21:20 PDT 2005

James Lentini <mailto:jlentini at netapp.com> wrote:
>>>>> You changed the order in which the CQ upcall is enabled and the
>>>>> kDAPL upcall is made.
>>> kDAPL doesn't dequeue the CQ events, so this shouldn't be an issue.
>> 
>> I think we agreed on that issue, on another thread, that I
> shall leave it:
>>>  call kDAPL upcall
>>>  enable CQ upcall
>> right ?
> 
> Given that the ib_req_notify_cq() verb conforms to the IBTA spec's
> Request Completion Notification verb, we will have to find another
> solution. 

I don't think its a good idea to enable the CQ upcall before the kdapl 
upcall. You will have a bigger problem than a race in *kdapltest* 
(over hardware that doesn't yet exist :) - like a consumer getting an 
unexpected upcall or just CQ interrupts, while in disable mode.

Any way, one suggestion, I can think of, to solve it, is this:

if (state == DAPL_EVD_STATE_OPEN) {
		dapl_evd_upcall_trigger(evd);
		if ((evd->upcall_policy != DAT_UPCALL_TEARDOWN) &&
		    (evd->upcall_policy != DAT_UPCALL_DISABLE)) {
			status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
			if (status) 
				(void)dapl_evd_post_async_error_event(
					evd->common.owner_ia->async_error_evd,
					DAT_ASYNC_ERROR_PROVIDER_INTERNAL_ERROR,
					evd->common.owner_ia);
			else
				dapl_evd_upcall_trigger(evd);
		}
}

Note that dapl_evd_upcall_trigger makes sure that only if there *are* 
completions in the cq (and policy is enabled) the consumer would 
get another upcall.

Is this acceptable ?

(It is possible to check the cq, before calling dapl_evd_upcall_trigger, 
for performance reasons, but applications that are interested 
in performance, wouldn't go down that flow, anyway...)

>>> Currently the IB upcall is initially enabled, but there are checks
>>> on the upcall path to determine if the EVD upcall is enabled. See
>>> dapl_evd.c line 827: 
>>> 
>>>         if (state == DAPL_EVD_STATE_OPEN &&
>>>             evd->upcall_policy != DAT_UPCALL_DISABLE) {
>>> 
>>> which you've replaced with
>>> 
>>>         if (state == DAPL_EVD_STATE_OPEN) {
>> 
>> Yes, because I believe the correct place to check it is inside
>> dapl_evd_upcall_trigger, after the lock and before the actual upcall.
> 
> I agree that checking it in dapl_evd_upcall_trigger() is better.
> 
> I still don't understand why you wouldn't setup the completion
> notification based on the kDAPL consumer's upcall_policy in
> dapl_evd_internal_create(). 
> 

Ok. I will add that check.

>>>> You've only decrease the window in which that scenario could
>>>> occur, not eliminated it. If a DTO completion occurred after you
>>>> count the number of pending events but before you enable the CQ
>>>> callback, a completion will be missed.
>>>> 
>>>> gg:
>>>> I don't think so. That is what the spin_lock_irqsave is for.
>>> 
>>> What if there are multiple processors in the system?
>> 
>> AFAIK, spin_lock_irqsave does the trick. am I wrong ?
> 
> spin_lock_irqsave disables local interrupts.

The irq part disables local interrupts and the spin_lock part
protects resources from multiple processors, no ?

> In any event...[see question below]
> 
>>>> Also, the pending_event_queue is only used for kDAPL generated
>>>> software events. This queue can be empty when there are events on
>>>> the CQ, so your would need to be expanded your check to cover that.
>> 
>> Actually, even though, I agreed before, I tend to disagree now.
>> The consumer will still get the DTO events as soon as the CQ
>> upcall is triggered (enabled), so only problem is with the pending
>> events list.
> 
> Why is it an error for the consumer to modify the upcall policy when
> there are pending events? 
> 
> dat_evd_modify_upcall should behave just like the IBTA spec's Request
> Completion Notification verb in this respect. If there were events on
> the EVD before the upcall is enabled, no upcall needs to be generated.
> A correct consumer can easily work around this by enabling the upcall
> and polling the EVD one final time to ensure it is empty.

There can be more than one event, and the consumer would need to 
dequeue many times. While the consumer would do his extra dequeue-ing 
he might also get an upcall, because his policy is now enabled.
I can't think of a design that can handle such a case, and if there is one it
is demanding and complicated, from the consumers side.

After enabling the upcall policy, the consumer, in my opinion, should only
expect upcalls and not do any dequeu's.

My suggestion is an optimization (lowering the context switching) and race solving. 
I don't think it's a good idea to make it more complicated, then this simple solution. 
Indeed it is not spec compliant, even though I did send a request to dat-collabrotive 
to make it part of the spec, on March:
http://groups.yahoo.com/group/dat-discussions/message/3312
buy did not get any reply from them.

BTW, the race we are talking about actually happens, and I can say from 
a practical point of view, that this implementation (along with the one existing
in ISER side) solves this problem.

Guy