[ofw] patch: Fix a race in the cl_timer code that caused deadlocks in opensm

Wed Jun 23 01:06:56 PDT 2010

The short answer to your question is this:

> app calls cl_timer_start or cl_timer_trim timer_callback #1 is invoked
> app calls cl_timer_start again (possible from callback #1)
> timer_callback #2 is invoked - this changes the thread_id

There are two cases that you mention:
1) app calls timer_start again.
2) Start is being called from the callback.

For option #1 there is no problem since we will wait for the timer callback to return. There will be no more than one callback.

For the second option, we have no option to wait for the callback to end since it will never end. So we start the timer and a second callback will be used. In that case the first callback will not touch the fields of the timer so we are safe.

That said, I agree that there still might be a race that I don't see, and there is a big chance that a change to the code will someday break something...

So, here is the long answer:

We need to decide what our demands from a timer are. The minimum we need to answer IMHO is this:

1) Do we want to allow simultaneous callbacks? For one timer? For more than one timer?
2) Do we want to allow timer start to fall (because of no resources?)? If yes, design is simpler.
3) Do we want start to be blocking sometimes (as it is today).
4) Do we want to allow start in the callback? Stop in the callback?

Without answering all of this it is very hard to say if something is a feature or a bug.

One simple set of answers to this question is this:
Use a single thread for all timers, start does not fail, start not blocking, start/ can be called at the callback, stop cannot called at the callback. This is what Linux have implemented, and we can do a similar thing.

Obviously there are other answers that bring to other designs...
[IMHO multiple callbacks means that start will have to allocate it's memory...]

Please note that kernel code asks for other questions (all can be asked for passive and dispatch as well ...).

Thanks
Tzachi

> -----Original Message-----
> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> Sent: Tuesday, June 22, 2010 10:27 PM
> To: Tzachi Dar; ofw at lists.openfabrics.org
> Cc: Yevgeny Kliteynik
> Subject: RE: [ofw] patch: Fix a race in the cl_timer code that caused
> deadlocks in opensm
> 
> > The idea here is that after start, no one can call any other field
> until the call back is finished so
> > I fealt that there is no need for protection.
> 
> Here's the call trace that I'm thinking of:
> 
> app calls cl_timer_start or cl_timer_trim
> timer_callback #1 is invoked
> app calls cl_timer_start again (possible from callback #1)
> timer_callback #2 is invoked - this changes the thread_id
> 
> at this point, the thread_id could refer to either timer_callback #1 or
> timer_callback #2.  We have no way of knowing.
> 
> I think we need a larger fix to the timer code to ensure that all
> accesses to the cl_timer_t structure is fully synchronized.  There's no
> way that the user of the timer can provide this synchronization.
> Secondly, the timer_callback should be fully serialized.  This is
> easily done by holding a mutex around the callback itself, with the
> mutex only used for this purpose.
> 
> Tracking the thread_id only seems useful if the timer_callbacks are
> serialized.
> 
> 
> > Although there can be 2 callbacks at the same time, the only way that
> this could happen is this:
> 
> In theory, I don't see that there's any limit to the number of
> callbacks that can be running at once.
> 
> > This is the model that opensm is using the timer.
> 
> Can you explain why opensm is using the timer?  Is it simply to kick
> off sweeps?
> 
> I'm more concerned about the mad code, which makes heavy use of timers
> to retry and time out mads, and we want fixes to both the user space
> and kernel code.