[ofa-general] Re: [PATCH RFC] IB/ipoib: fix to_ipoib_neigh access race

Roland Dreier rdreier at cisco.com
Tue May 22 13:43:58 PDT 2007


 > hard_start_xmit dereferences to_ipoib_neigh when only tx_lock is taken.  This
 > would only be safe if all calls that modify *to_ipoib_neigh take tx_lock too.
 > Currently this is not always true for ipoib_neigh_free and path_rec_completion,
 > which results in memory corruption.  Fix this race, making sure
 > path_rec_completion and ipoib_neigh_free are always called under
 > tx_lock.
 > 
 > Signed-off-by: Michael S. Tsirkin <mst at dev.mellanox.co.il>
 > 
 > ---
 > 
 > I'm looking at
 > https://bugs.openfabrics.org/show_bug.cgi?id=604
 > and I think this could explain the crashes.
 > In any case, Roland, is there a race or am I imagining things?
 > 
 > NB: The patch is untested (I'm not at the lab now).

Yes, it does seem that there is a problem here.  However, I think the
first part of this needs to be handled another way -- for example:

 > -		path_free(dev, path);
 >  		spin_lock_irq(&priv->tx_lock);
 >  		spin_lock(&priv->lock);
 > +		path_free(dev, path);

path_free already takes priv->lock internally, and also calls
ipoib_put_ah(), which may end up in ipoib_free_ah(), which also might
take priv->lock.  So calling path_free with priv->lock already held,
as this hunk does, would deadlock.

It's not immediately obvious what the right fix is...

 - R.


