[ofw] patch: Fix a race in the cl_timer code that caused deadlocks in opensm
Fab Tillier
ftillier at microsoft.com
Thu Jun 24 11:50:18 PDT 2010
Hefty, Sean wrote on Thu, 24 Jun 2010 at 11:46:53
>> One thing to keep in mind is that with the lock to serialize the
>> callback threads, you may block threadpool threads that need to run
>> something else, so the deadlock might be a side effect of serializing
>> the callbacks with a lock.
>
> Okay - the user space implementation of cl_timer uses some system
> thread pool, which defaults to a maximum of 500 threads. Deadlock may
> still be happening at a higher level, but it doesn't seem likely that
> it would come from the cl_timer implementation.
Just because the default max is 500 doesn't mean that you aren't tripping things up with a heuristic that says you really should only have two...
In any case, I think I have a fix that serializes the callbacks without holding locks; see my other mail for details.
-Fab
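
For illustration only, and not the patch referred to in the other mail: one common way to serialize callbacks without holding a lock while they run is a pending counter. The thread that raises the counter from zero drains all outstanding expirations, and every other thread returns immediately instead of blocking a thread pool thread. The names serial_timer_t, serial_timer_init, serial_timer_fire and count_cb are hypothetical and are not part of the cl_timer API.

```c
#include <stdatomic.h>

typedef void (*timer_cb_t)(void *ctx);

/* Hypothetical timer wrapper; not the complib cl_timer structure. */
typedef struct {
    atomic_int pending;   /* expirations that have not finished being delivered */
    timer_cb_t cb;        /* user callback */
    void *ctx;            /* user context passed to the callback */
} serial_timer_t;

static void serial_timer_init(serial_timer_t *t, timer_cb_t cb, void *ctx)
{
    atomic_init(&t->pending, 0);
    t->cb = cb;
    t->ctx = ctx;
}

/* Called from any thread pool thread when the timer expires. */
static void serial_timer_fire(serial_timer_t *t)
{
    /* If the counter was already non-zero, another thread is draining
     * and will run the callback on our behalf, so return without waiting. */
    if (atomic_fetch_add(&t->pending, 1) != 0)
        return;

    /* We raised the counter from zero, so we are the only drainer.
     * Run the callback once per recorded expiration. */
    do {
        t->cb(t->ctx);
    } while (atomic_fetch_sub(&t->pending, 1) != 1);
}

/* Example callback: counts how many times it was invoked. */
static void count_cb(void *ctx)
{
    int *count = ctx;
    (*count)++;
}
```

The callbacks never overlap, because only the thread that saw the counter at zero executes them. The trade-off is that one pool thread may run several callbacks back to back while the others go back to the pool.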