[ofw] patch: Fix a race in the cl_timer code that caused deadlocks in opensm

Hefty, Sean sean.hefty at intel.com
Thu Jun 24 16:18:15 PDT 2010


> 1) On opensm, there is a call to timer start from the timer callback as
> well as from the timer code => multiple calls to timer start.
> 2) It seems to me that the implementation that we are using is very complex
> since callbacks can be called by multiple threads. If we limit the
> callbacks to a single thread, life is much easier, and as far as opensm is
> concerned, there is no performance penalty, since opensm uses a single lock
> when working with the timer.
> 
> So, I believe that creating the code from scratch (with our thread) will
> make our lives much simpler.

I don't know that we need to go that far.  We just need to fix the issue at hand based on actual usage.  For example, cl_timer_stop looks like it's only used internally.  The crux of the problem is that we're trying to determine if we're calling back into the timer code from the callback.  I don't see any reason why we should even care.
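
To illustrate why the check should be unnecessary, here is a rough sketch. It is not the complib code: every sketch_* name is hypothetical, and sketch_timer_expire() is called by hand as a stand-in for whatever thread fires the timer. The assumption is that the timer lock only protects the timer's own state and is dropped before the callback runs. Under that assumption a start call made from inside the callback takes the lock like any other caller, so the code never has to ask whether it is being re-entered.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical minimal timer for illustration; NOT the complib cl_timer API. */
typedef void (*sketch_cb_t)(void *context);

typedef struct sketch_timer {
    pthread_mutex_t lock;    /* protects armed and timeout_ms only */
    int             armed;   /* nonzero while a timeout is pending */
    unsigned        timeout_ms;
    sketch_cb_t     cb;
    void           *context;
} sketch_timer_t;

void sketch_timer_init(sketch_timer_t *t, sketch_cb_t cb, void *context)
{
    assert(t != NULL && cb != NULL);
    pthread_mutex_init(&t->lock, NULL);
    t->armed = 0;
    t->timeout_ms = 0;
    t->cb = cb;
    t->context = context;
}

/* Callable from any thread, including from inside the callback. */
void sketch_timer_start(sketch_timer_t *t, unsigned timeout_ms)
{
    pthread_mutex_lock(&t->lock);
    t->armed = 1;
    t->timeout_ms = timeout_ms;
    pthread_mutex_unlock(&t->lock);
}

/* Stand-in for the expiry path. Returns 1 if the callback ran, else 0. */
int sketch_timer_expire(sketch_timer_t *t)
{
    int fire;

    pthread_mutex_lock(&t->lock);
    fire = t->armed;
    t->armed = 0;            /* disarm before calling out */
    pthread_mutex_unlock(&t->lock);

    if (fire)
        t->cb(t->context);   /* lock is NOT held here, so the callback may re-arm */
    return fire;
}

/* Example user: a callback that re-arms its own timer a fixed number of times. */
typedef struct sketch_rearm {
    sketch_timer_t *timer;
    int             remaining;  /* re-arms still allowed */
    int             fired;      /* number of callback invocations */
} sketch_rearm_t;

void sketch_rearm_cb(void *context)
{
    sketch_rearm_t *r = context;

    r->fired++;
    if (r->remaining > 0) {
        r->remaining--;
        sketch_timer_start(r->timer, 10);  /* start from within the callback */
    }
}
```

With a non-recursive lock held across the callback, the sketch_timer_start() call inside sketch_rearm_cb() would block on itself, which is the kind of deadlock the re-entrancy check tries to work around. Releasing the lock first removes the need for the check at all.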


