[openib-general][PATCH][kdapl]: evd upcall policy implementation

Tue Aug 16 07:53:50 PDT 2005

On Tue, 16 Aug 2005, Guy German wrote:

> James Lentini <mailto:jlentini at netapp.com> wrote:
> >>>>> You changed the order in which the CQ upcall is enabled and the
> >>>>> kDAPL upcall is made.
> >>> kDAPL doesn't dequeue the CQ events, so this shouldn't be an issue.
> >> 
> >> I think we agreed on that issue, on another thread, that I
> > shall leave it:
> >>>  call kDAPL upcall
> >>>  enable CQ upcall
> >> right ?
> > 
> > Given that the ib_req_notify_cq() verb conforms to the IBTA spec's
> > Request Completion Notification verb, we will have to find another
> > solution. 
> 
> I don't think its a good idea to enable the CQ upcall before the kdapl 
> upcall. You will have a bigger problem than a race in *kdapltest* 
> (over hardware that doesn't yet exist :) - like a consumer getting an 
> unexpected upcall or just CQ interrupts, while in disable mode.
> 
> Any way, one suggestion, I can think of, to solve it, is this:
> 
> if (state == DAPL_EVD_STATE_OPEN) {
> 		dapl_evd_upcall_trigger(evd);
> 		if ((evd->upcall_policy != DAT_UPCALL_TEARDOWN) &&
> 		    (evd->upcall_policy != DAT_UPCALL_DISABLE)) {
> 			status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
> 			if (status) 
> 				(void)dapl_evd_post_async_error_event(
> 					evd->common.owner_ia->async_error_evd,
> 					DAT_ASYNC_ERROR_PROVIDER_INTERNAL_ERROR,
> 					evd->common.owner_ia);
> 			else
> 				dapl_evd_upcall_trigger(evd);
> 		}
> }
> 
> Note that dapl_evd_upcall_trigger makes sure that only if there *are* 
> completions in the cq (and policy is enabled) the consumer would 
> get another upcall.
> 
> Is this acceptable ?

Yes.

> (It is possible to check the cq, before calling dapl_evd_upcall_trigger, 

we could use ib_peek_cq

> for performance reasons, but applications that are interested 
> in performance, wouldn't go down that flow, anyway...)

You're assuming that applications interested in performance would not 
use upcalls?

> >>>> You've only decrease the window in which that scenario could
> >>>> occur, not eliminated it. If a DTO completion occurred after you
> >>>> count the number of pending events but before you enable the CQ
> >>>> callback, a completion will be missed.
> >>>> 
> >>>> gg:
> >>>> I don't think so. That is what the spin_lock_irqsave is for.
> >>> 
> >>> What if there are multiple processors in the system?
> >> 
> >> AFAIK, spin_lock_irqsave does the trick. am I wrong ?
> > 
> > spin_lock_irqsave disables local interrupts.
> 
> The irq part disables local interrupts and the spin_lock part
> protects resources from multiple processors, no ?

Yes. I just wasn't sure if the other processors were using the spin 
lock. 

> >>>> Also, the pending_event_queue is only used for kDAPL generated
> >>>> software events. This queue can be empty when there are events on
> >>>> the CQ, so your would need to be expanded your check to cover that.
> >> 
> >> Actually, even though, I agreed before, I tend to disagree now.
> >> The consumer will still get the DTO events as soon as the CQ
> >> upcall is triggered (enabled), so only problem is with the pending
> >> events list.
> > 
> > Why is it an error for the consumer to modify the upcall policy when
> > there are pending events? 
> > 
> > dat_evd_modify_upcall should behave just like the IBTA spec's Request
> > Completion Notification verb in this respect. If there were events on
> > the EVD before the upcall is enabled, no upcall needs to be generated.
> > A correct consumer can easily work around this by enabling the upcall
> > and polling the EVD one final time to ensure it is empty.
> 
> There can be more than one event, and the consumer would need to 
> dequeue many times. While the consumer would do his extra dequeue-ing 
> he might also get an upcall, because his policy is now enabled.
> I can't think of a design that can handle such a case, and if there is one it
> is demanding and complicated, from the consumers side.

Isn't it the same position all event code written to the OpenIB API is 
in? 

I agree with you that this programming model is difficult to use, 
but I don't think it is impossible.

> After enabling the upcall policy, the consumer, in my opinion, 
> should only expect upcalls and not do any dequeu's.

We can change the behavior as you suggest, but we shouldn't use 
features specific to current Mellanox hardware. Using proprietary 
features will make the code hard to maintain. How about using the same 
technique you've proposed for the DTO callback (ie enabling the CQ 
upcall and then calling dapl_evd_upcall_trigger)?

> My suggestion is an optimization (lowering the context switching) 
> and race solving. I don't think it's a good idea to make it more 
> complicated, then this simple solution. Indeed it is not spec 
> compliant, even though I did send a request to dat-collabrotive to 
> make it part of the spec, on March: 
> http://groups.yahoo.com/group/dat-discussions/message/3312 buy did 
> not get any reply from them.
> 
> BTW, the race we are talking about actually happens, and I can say 
> from a practical point of view, that this implementation (along with 
> the one existing in ISER side) solves this problem.