[openib-general] [PATCH fixed] was Re: [PATCH] ipoib_multicast/ipoib_mcast_send race

Krishna Kumar2 krkumar2 at in.ibm.com
Sun Dec 11 22:37:58 PST 2005


Hi Michael,

> Exactly, you got it. If you look at mcast_send you'll see that it 
creates new
> queries, creates broad cast group and adds entries to the list.

Correct, but even with the lock, it would create those (once, and that is 
true for
whether lock is held or not). The only thing stopping creation of those is 
setting
the bit (but only for a. the non-race part or b. the race part where the 
stop_thread
executes before the mcast_send(), but not the race part where the 
mcast_send()
wins over the stop_thread), but holding the lock for the setting/testing 
of that bit
will not stop creation of those entries in the b. case.

> This cancels the query and waits on mcast "done" completion.
> completion is called and "done" is set.
> Meanwhile, ipoib_mcast_send arrives and starts a new query,
> re-initializing "done".

Isn't all that managed by clearing/testing the bit ? Because holding the 
lock doesn't solve
it. To give an example :

stop_thread()
{
        lock();
        clear();
        unlock();
        ...
        wait_for_completion(mcast);
}

mcast_send()
{
        lock();
        test();
        results_in_creation_of_entries_done_etc();;
        unlock();
}

In this case, if mcast_send() gets the lock first and proceeds while the 
stop_thread is spinning
on the lock, the entries are created and then the stop_thread() clears the 
bit and at this point
in time, no more entries can be ever created. Now if the lock were 
removed, the behavior
is identical - the mcast_send() would test the bit, and get the lock() 
while the stop_thread()
clears the bit (without a lock) and waits for completion, while *no more* 
mcast_sends() would
ever continue beyond this time.

thanks,

- KK

"Michael S. Tsirkin" <mst at mellanox.co.il> wrote on 12/12/2005 11:42:13 AM:

> Quoting Krishna Kumar2 <krkumar2 at in.ibm.com>:
> > Subject: Re: [openib-general] [PATCH fixed] was Re: [PATCH]?
> ipoib_multicast/ipoib_mcast_send race
> > 
> > 
> > Hi Micheal, 
> > 
> > But the lock doesn't help that case. The only difference with having 
the
> > 
> > lock is that in case of a race, the mcast_send() will complete 
*before* 
> > the flag is set, while without the lock the mcast_send() could be in 
the
> > 
> > *middle* of execution when the flag is set.
> 
> Exactly, you got it. If you look at mcast_send you'll see that it 
creates new
> queries, creates broad cast group and adds entries to the list.
> 
> So here's why the lock helps:
> 
> > > > > > Fix the following race scenario:
> > > > > > device is up.
> > > > > > port event or set mcast list triggers ipoib_mcast_stop_thread, 

> > > > > > This cancels the query and waits on mcast "done" completion.
> > > > > > completion is called and "done" is set.
> > > > > > Meanwhile, ipoib_mcast_send arrives and starts a new query,
> > > > > > re-initializing "done".
> 
> Clear now?
> 
> -- 
> MST
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20051212/6cc6f612/attachment.html>


More information about the general mailing list