[openib-general][PATCH][kdapl]: evd upcall policy implementation

Fri Aug 12 11:26:18 PDT 2005

On Fri, 12 Aug 2005, Guy German wrote:

> > -	/* Only process events if there is an enabled callback function. */
> > -	if ((evd->upcall.upcall_func == (DAT_UPCALL_FUNC) NULL) ||
> > -	    (evd->upcall_policy == DAT_UPCALL_DISABLE)) {
> > +	
> > +	/* The function is not re-entrant (change when implementing DAT_UPCALL_MANY)*/
> 
> Why is this function not re-entrant? For reference, here is how I 
> would define re-entrant:
> 
> http://en.wikipedia.org/wiki/Reentrant
> http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?re-entrant
> 
> gg:
> The function can not be entered twice at the same time, because 
> when the upcall is at the hands of the consumer he can disable 
> the upcall policy and if it is entered twice, there is a chance 
> the consumer will get another upcall after disabling 
> the upcall policy.

I don't see the flow of control that would result in the scenario you 
describe. This piece of code

	spin_lock_irqsave (&evd->common.lock, flags);
        if (evd->is_triggered) {
                spin_unlock_irqrestore (&evd->common.lock, flags);
                return;
        }
        evd->is_triggered = 1;
        spin_unlock_irqrestore (&evd->common.lock, flags);

ensures that only one thread can be making upcalls at a time. 

> 
> 
> > +	if (evd->is_triggered)
> >  		return;
> > -	}
> 
> Why check the value here? Is it only for the efficiency of not taking 
> the spin lock when is_triggered is 1?
> 
> gg:
> No. you can't take the spin_lock here because this can cause 
> a dead lockin the case the function calls itself from 
> dat_evd_dequeue, on a uni-proccessor machines.

Can you elaborate on this? Do you mean that the thread that performs 
the upcall in dapl_evd_upcall_trigger() can be used by the consumer to 
call dapl_evd_dequeue()? If so, I don't see the flow of control that 
begins in dapl_evd_dequeue() and reaches dapl_evd_upcall_trigger().

> > @@ -820,24 +831,19 @@ static void dapl_evd_dto_callback(struct
> >  	 * This function does not dequeue from the CQ; only the consumer
> >  	 * can do that. Instead, it wakes up waiters if any exist.
> >  	 * It rearms the completion only if completions should always occur
> > -	 * (specifically if a CNO is associated with the EVD and the
> > -	 * EVD is enabled).
> >  	 */
> > -
> > -	if (state == DAPL_EVD_STATE_OPEN && 
> > -	    evd->upcall_policy != DAT_UPCALL_DISABLE) {
> > -		/*
> > -		 * Re-enable callback, *then* trigger.
> > -		 * This guarantees we won't miss any events.
> > -		 */
> > -		status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
> > -		if (0 != status) 
> > -			(void)dapl_evd_post_async_error_event(
> > -				evd->common.owner_ia->async_error_evd,
> > -				DAT_ASYNC_ERROR_PROVIDER_INTERNAL_ERROR,
> > -				evd->common.owner_ia);
> > -
> > +	
> > +	if (state == DAPL_EVD_STATE_OPEN) {
> >  		dapl_evd_upcall_trigger(evd);
> > +		if ((evd->upcall_policy != DAT_UPCALL_TEARDOWN) &&
> > +		    (evd->upcall_policy != DAT_UPCALL_DISABLE)) {
> > +			status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
> > +			if (0 != status) 
> > +				(void)dapl_evd_post_async_error_event(
> > +					evd->common.owner_ia->async_error_evd,
> > +					DAT_ASYNC_ERROR_PROVIDER_INTERNAL_ERROR,
> > +					evd->common.owner_ia);
> > +		}
> 
> 
> You changed the order in which the CQ upcall is enabled and the kDAPL 
> upcall is made. It used to be:
> 
>  enable CQ upcall
>  call kDAPL upcall
> 
> you are proposing
> 
>  call kDAPL upcall
>  enable CQ upcall
> 
> I think your proposed order contains a race condition. Specifically if 
> a work completion occurs after dapl_evd_upcall_trigger() 
> returns but before the CQ upcall is re-enabled with 
> ib_req_notify_cq(), no upcall will occur for the completion.
> 
> Do you agree?
> 
> gg:
> You need to enable the CQ upcall only if the consumer did
> not change his upcall policy, while in upcall context. In 
> the first case you will create a situation where the cq is 
> enabled, while the consumers doesn't want any upcalls.

Correct, but this can be hidden so that the consumer does not receive 
the upcall.

> In most real world application dapl_evd_upcall_trigger()
> will return with upcall policy disabled and there will be 
> no need to alarm the cq upcall - i.e the consumer would
> dequeue the rest of the events himself.
> 
> I see the race you talk about. It is relevent to kdapltest.
> Maybe we can check if there are pending events after 
> enabling CQ upcall, and if there are - call 
> dapl_evd_upcall_trigger() again. What do you think ?

The problem we are dealing with is that DAPL upcalls behave 
differently from IB upcalls. DAPL upcalls are enabled until they are 
disabled while IB upcalls are "one shots".

The approach taken in the current implementation is to always enable 
the IB upcalls and determine in the DAPL provider if the consumer's 
upcall should be invoked. 

You are proposing a shift away from that approach. If we do that, we 
need to preserve the original semantics. I'll ask the DAT 
Collaborative from some clarification on the meaning of the different 
upcall policy flags.

One final item. To be consistent with your design, CQ upcalls should 
be selectively enabled in dapl_evd_internal_create().

> >  	evd = (struct dapl_evd *)evd_handle;
> > +	dapl_dbg_log (DAPL_DBG_TYPE_API, "%s: (evd=%p) set to %d\n",
> > +			__func__, evd_handle, upcall_policy);
> 
> The idea was to make the DAPL_DBG_TYPE_API prints look like a 
> debugger stack trace. The following would be keeping with the other 
> print statements:
> 
> gg:
> I thought it would make it a bit more user friendly :) sometimes
> the consumers use those debug prints and they don't want to dwell 
> in the kdapl code too much in order to understand what they 
> are reading ...

You need a fair amount of familiarity with the code to know what a 
message that says "dapl_evd_modify_upcall: (evd=dbbe4e58) set to 2" 
means.

If you'd like to add the parameter names (e.g. evd=%p, 
upcall_policy=...) that is fine. I think the function call format is 
better, because a user familiar with the function signatures will know 
what each of the fields means. 

> >  
> > +	spin_lock_irqsave(&evd->common.lock, flags);
> > +	if ((upcall_policy != DAT_UPCALL_TEARDOWN) &&
> > +	    (upcall_policy != DAT_UPCALL_DISABLE)) {
> 
> Why not let the consumer setup the upcall when it disabled? That seems 
> like the only safe time to modify it.
> 
> gg:
> The consumer needs and can change the poilcy to disable and enable.
> The only time he is not allowed to change the policy to enable (in 
> this implementation) is when there are still pending events in the 
> queue. 

You mean when there are *no* pending events in the queue.

> This is to solve a race where the consumer dequeued all the events 
> and changed the policy to enable, but there were other event/s that 
> came just before calling dat_evd_modufy_upcall. In this case 
> dat_evd_modufy_upcall to enable would fail and the consumer would 
> keep dequeue-ing the events, without loosing his context.

You've only decrease the window in which that scenario could occur, 
not eliminated it. If a DTO completion occured after you count the 
number of pending events but before you enable the CQ callback, a 
completion will be missed. 

Also, the pending_event_queue is only used for kDAPL generated 
software events. This queue can be empty when there are events on the 
CQ, so your would need to be expanded your check to cover that.

> > +		if (evd->evd_flags & DAT_EVD_DTO_FLAG) {
> > +			status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
> 
> Why do we need to re-enable the CQ upcall?
> 
> gg:
> If the consumer returned from the evd_upcall with upcall policy 
> "disabled" the CQ upcall is not enabled. So this is the only 
> place it is done.

Ok, that fits with your new approach to the problem.