[ofw] patch: Fix a race in the cl_timer code that caused deadlocks in opensm

Hefty, Sean sean.hefty at intel.com
Wed Jun 23 12:56:50 PDT 2010


> Ok, so HTML format mails suck for responding to patches...

Boy howdy... Under tools - trust center - you can receive all email as text.

> >+
> > 	p_timer->timeout_time = 0;
> >-	p_timer->thread_id = GetCurrentThreadId();
> >+	p_timer->thread_id = thread_id;
> >
> > 	(p_timer->pfn_callback)( (void*)p_timer->context );
> >
> >-	p_timer->thread_id = 0;

I think we need to keep this to avoid potential false positives in start.

> >+	cl_spinlock_acquire( &p_timer->spinlock );
> > 	p_timer->timeout_time = 0;
> 
> The only case where you could end up with multiple callbacks executing is
> if you call cl_timer_start from the callback.  It might be easier to flag
> that you're in a callback, store the new timeout time, and delay calling
> KeSetTimer until the callback unwinds.  You could add a flag here to
> indicate that you're in the timer callback, that you check in
> cl_timer_start to control whether you call KeSetTimer...

I can try this.

> >+	cl_spinlock_release( &p_timer->spinlock );
> >
> >+	KeAcquireGuardedMutex( &p_timer->cb_lock );
> 
> timer callback is in a DPC, at DISPATCH_LEVEL.  KeAcquireGuardedMutex must
> be <= APC_LEVEL.

Okay - I was looking for this and couldn't tell if a timer was at dispatch or not.  Odd that this doesn't just automatically crash in testing.

> >@@ -75,6 +79,8 @@ cl_timer_init(
> >
> > 	KeInitializeTimer( &p_timer->timer );
> > 	KeInitializeDpc( &p_timer->dpc, __timer_callback, p_timer );
> >+	cl_spinlock_init( &p_timer->spinlock );
> 
> Just use a KSPINLOCK here, no?  Then you can use the in-stack queued
> spinlocks if the lock might be contentious.

yep

> 
> >+	KeInitializeGuardedMutex( &p_timer->cb_lock );
> >
> > 	return( CL_SUCCESS );
> > }
> >@@ -108,6 +114,7 @@ cl_timer_start(
> > 	IN	const uint32_t		time_ms )
> > {
> > 	LARGE_INTEGER	due_time;
> >+	uint64_t		timeout_time;
> >
> > 	CL_ASSERT( p_timer );
> > 	CL_ASSERT( p_timer->pfn_callback );
> >@@ -115,11 +122,13 @@ cl_timer_start(
> >
> > 	/* Due time is in 100 ns increments.  Negative for relative time. */
> > 	due_time.QuadPart = -(int64_t)(((uint64_t)time_ms) * 10000);
> >+	timeout_time = cl_get_time_stamp() + (((uint64_t)time_ms) * 1000);
> >
> >-	/* Store the timeout time in the timer object. */
> >-	p_timer->timeout_time = cl_get_time_stamp() + (((uint64_t)time_ms) *
> 1000);
> >-
> >+	cl_spinlock_acquire( &p_timer->spinlock );
> >+	p_timer->timeout_time = timeout_time;
> 
> Check the 'in callback' flag here to conditionally call KeSetTimer.
> 
> > 	KeSetTimer( &p_timer->timer, due_time, &p_timer->dpc );
> >+	cl_spinlock_release( &p_timer->spinlock );
> >+
> > 	return( CL_SUCCESS );
> > }
> >
> >@@ -129,19 +138,25 @@ cl_timer_trim(
> > 	IN	cl_timer_t* const	p_timer,
> > 	IN	const uint32_t		time_ms )
> > {
> >+	LARGE_INTEGER	due_time;
> > 	uint64_t		timeout_time;
> >
> > 	CL_ASSERT( p_timer );
> > 	CL_ASSERT( p_timer->pfn_callback );
> >
> >-	/* Calculate the timeout time in the timer object. */
> >+	/* Due time is in 100 ns increments.  Negative for relative time. */
> >+	due_time.QuadPart = -(int64_t)(((uint64_t)time_ms) * 10000);
> 
> You can delay calculating this until the check that the new timeout time is
> before the current timeout time.

yep




More information about the ofw mailing list