[ofw] [PATCH] complib/user: fix timer race conditions

Hefty, Sean sean.hefty at intel.com
Fri Jun 25 08:08:12 PDT 2010


> The following lines are very problematic:
> 
> 	EnterCriticalSection( &p_timer->cb_lock );
> 	(p_timer->pfn_callback)( (void*)p_timer->context );
> 	LeaveCriticalSection( &p_timer->cb_lock );
> 
> Nothing specific with timer, but if you have a code that takes a lock, and
> call someone else code than this is asking for trouble. The reason is that
> if he has done something like this (take a lock and call you back) than
> there is a deadlock. This can be avoided by documentation, but I would
> rather avoid this issues...

This locking is fine.  The cb_lock lock is specifically only used here and not in other calls.  There is no chance of deadlock coming from the use of this lock.

> One thing that I'm missing here is that start might not do anything but
> still return success. I would like to have two more return code, one for
> the case of timer not running because the time is too big, and one because
> another callback is running. This is to allow the user to know if he should
> wait for a callback or not.

If start returns success, then the timer will run.

> Another issue here is that start() might end waiting for the callback to
> return. Again this is a very problematic behavior (for a general timer)
> since two of this can cerate a deadlock. This is especially true for using
> process io threads that one has no control over. That is documentation
> should say, either don't call start when there is already something
> running, or don't call timer.start() from a system thread. This is because
> two system threads might create a deadlock.

Start and trim never wait for a callback to return.  That's why NULL is passed into DeleteTimerQueueTimer rather than INVALID_HANDLE_VALUE.  NULL is non-blocking, INVALID_HANDLE_VALUE is blocking.

> Also on start one takes the critical section and calls
> 
> 		if( p_timer->h_timer )
> 			DeleteTimerQueueTimer( p_timer->h_timer_queue, p_timer-
> >h_timer, NULL );
> If the callback has already started running but haven't yet called
> __clear_timer_handle() there is a deadlock here. As one is holding the
> spinlock waiting for the other to finish running , while the other is
> trying to capture the spinlock.

This is why there are two locks.  One lock is used to synchronize the state, the other lock to serialize the callbacks.

> So, the general rule should be don't call start on this timer from io
> threads...

Start can be called from any thread.





More information about the ofw mailing list