[ofw] patch: Fix a race in the cl_timer code that caused deadlocks in opensm
Tzachi Dar
tzachid at mellanox.co.il
Tue Jun 22 12:07:57 PDT 2010
The idea here is that after start, no one can call any other field until the call back is finished so I fealt that there is no need for protection.
Although there can be 2 callbacks at the same time, the only way that this could happen is this:
>From a callback, the caller calls start. In that case stop doesn't wait for the callback to end.
After that the timer expires and a new callback starts to run. If it will call stop, we need the thread_id.
Please note that in this case we have the thread_id correct since we have already copied it from the new callback.
This is the model that opensm is using the timer.
Thanks
Tzachi
> -----Original Message-----
> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> Sent: Tuesday, June 22, 2010 8:00 PM
> To: Tzachi Dar; ofw at lists.openfabrics.org
> Cc: Yevgeny Kliteynik
> Subject: RE: [ofw] patch: Fix a race in the cl_timer code that caused
> deadlocks in opensm
>
> > static void CALLBACK
> >
> > __timer_callback(
> >
> > IN cl_timer_t* const p_timer,
> >
> > IN BOOLEAN timer_signalled )
> >
> > {
> >
> > /* timer_signalled is always TRUE, and has no value.
> */
> >
> > +
> >
> > + DWORD thread_id = GetCurrentThreadId();
> >
> > CL_ASSERT( timer_signalled );
> >
> > +
> >
> > UNUSED_PARAM( timer_signalled );
> >
> >
> >
> > + CL_ASSERT(thread_id!=0);
> >
> > +
> >
> > p_timer->timeout_time = 0;
> >
> > - p_timer->thread_id = GetCurrentThreadId();
> >
> > + p_timer->thread_id = thread_id;
> >
>
> This is touching cl_timer_t fields from a separate thread not under the
> user's control. I'd feel happier if locking were added here as well.
> Although if this callback can be invoked multiple times simultaneously,
> then what does saving the thread_id buy us?
More information about the ofw
mailing list