[ofw] patch: Fix a race in the cl_timer code that caused deadlocks in opensm

Hefty, Sean sean.hefty at intel.com
Wed Jun 23 16:44:23 PDT 2010


> The only case where you could end up with multiple callbacks executing is
> if you call cl_timer_start from the callback.  It might be easier to flag
> that you're in a callback, store the new timeout time, and delay calling
> KeSetTimer until the callback unwinds.  You could add a flag here to
> indicate that you're in the timer callback, that you check in
> cl_timer_start to control whether you call KeSetTimer...

Trying to do this is inherently racy.  When a timer expires, the callback starts running, but it may not yet have reached the point where it sets cb_running=TRUE.  A call to cl_timer_start during that window will see the flag still clear and will set the timer anyway.  Dealing with this ends up in goofy, complex locking and state tracking that just isn't worth it.
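
To make the window concrete, here is a rough single-threaded replay of that interleaving.  It is only a sketch: sim_timer_t and the sim_* functions are hypothetical stand-ins invented for illustration, not the complib code, and the kernel_arms counter merely stands in for a call to KeSetTimer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the timer state; this is NOT complib's cl_timer_t. */
typedef struct {
    bool cb_running;   /* the "in callback" flag from the quoted suggestion */
    bool deferred;     /* a start request parked until the callback unwinds */
    int  kernel_arms;  /* how often the simulated kernel timer was armed */
} sim_timer_t;

/* Flag-checking start: defer if the callback is known to be running,
 * otherwise arm the kernel timer directly (stands in for KeSetTimer). */
void sim_timer_start(sim_timer_t *t)
{
    if (t->cb_running)
        t->deferred = true;
    else
        t->kernel_arms++;
}

/* The callback prologue, split into the two steps the race falls between. */
void sim_callback_enter(sim_timer_t *t)    { (void)t; /* dispatched, flag not set yet */ }
void sim_callback_set_flag(sim_timer_t *t) { t->cb_running = true; }

/* Replays the bad ordering and returns how many times the timer was armed
 * while the callback was already in flight. */
int sim_race_window(void)
{
    sim_timer_t t = { false, false, 0 };

    sim_callback_enter(&t);     /* timer expired, callback begins */
    sim_timer_start(&t);        /* other thread: flag still FALSE, so it arms */
    sim_callback_set_flag(&t);  /* too late to affect the start above */

    assert(!t.deferred);        /* the deferral path was never taken */
    return t.kernel_arms;
}
```

In this replay sim_race_window() arms the timer once even though the callback is already running, which is exactly the case the flag was supposed to prevent.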

It should really be the responsibility of the timer callback routine to reset the timer once, just before returning, if it needs to be reset, rather than setting it repeatedly in a loop.  If the kernel complib timer abstraction were replaced with native calls, this is what would be needed anyway.
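
Something along these lines is what I have in mind.  This is a minimal sketch against a made-up one-shot timer: demo_timer_t, demo_timer_start, demo_timer_fire and the poller structure are all hypothetical names for illustration, not the complib or native kernel API.  The callback only records the earliest timeout requested while it works through its loop, and arms the timer a single time on the way out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-shot timer; illustrative names, not a real API. */
typedef struct {
    void (*callback)(void *context);
    void *context;
    bool armed;
    uint32_t timeout_ms;
    int arm_count;              /* number of times the timer was armed */
} demo_timer_t;

void demo_timer_start(demo_timer_t *t, uint32_t timeout_ms)
{
    t->timeout_ms = timeout_ms;
    t->armed = true;
    t->arm_count++;
}

/* Simulates one expiry: a one-shot timer disarms, then runs its callback. */
void demo_timer_fire(demo_timer_t *t)
{
    assert(t->callback != NULL);
    if (!t->armed)
        return;
    t->armed = false;
    t->callback(t->context);
}

/* Hypothetical owner of the timer, with up to four pending timeout requests. */
typedef struct {
    demo_timer_t timer;
    uint32_t requests[4];
    int n_requests;
} demo_poller_t;

void demo_poll_cb(void *context)
{
    demo_poller_t *p = context;
    bool need_rearm = false;
    uint32_t next_ms = 0;

    /* Walk the pending requests; only remember the earliest timeout
     * instead of starting the timer on every iteration. */
    for (int i = 0; i < p->n_requests; i++) {
        if (!need_rearm || p->requests[i] < next_ms)
            next_ms = p->requests[i];
        need_rearm = true;
    }
    p->n_requests = 0;

    /* Re-arm exactly once, as the last thing before returning. */
    if (need_rearm)
        demo_timer_start(&p->timer, next_ms);
}
```

With three pending requests of 500, 100 and 300 ms, one expiry leaves the timer armed once more at 100 ms; a later expiry with nothing pending leaves it disarmed.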

- Sean


