[ofa-general] [PATCH] ipoib: racing uses of ipoib_neigh in IPoIB-CM

akepner at sgi.com akepner at sgi.com
Thu Jun 11 09:33:22 PDT 2009


On Wed, Jun 10, 2009 at 10:52:29PM +0300, Yossi Etigin wrote:
> akepner at sgi.com wrote:
> > @@ -810,9 +810,6 @@ void ipoib_cm_handle_tx_wc(struct net_device *dev, struct ib_wc *wc)
> >  			list_del(&neigh->list);
> >  			if (neigh->ah)
> >  				ipoib_put_ah(neigh->ah);
> > -			ipoib_neigh_free(dev, neigh);
> > -
> > -			tx->neigh = NULL;
> >  		}
> >  
> >  		if (test_and_clear_bit(IPOIB_FLAG_INITIALIZED, &tx->flags)) {
> 
> Can't calling ipoib_put_ah() and keeping the neighbour alive cause neigh->ah
> pointer become invalid? What about doing list_del(&neigh->list) twice?
> 

Yeah, looks like there's a hole there. By changing the test from:

	if (neigh) {
		neigh->cm = NULL;

to:

	if (neigh && neigh->cm) {
		neigh->cm = NULL;

it can be closed.

>  ......
> 
> The piece of code that does '*to_ipoib_neigh(neigh->neighbour) = NULL' and 
> releases skb's will appear 3 times in ipoib. I think it can be put in a 
> separate function.

OK.

> 
> 
> > @@ -1343,7 +1351,23 @@ static void ipoib_cm_tx_reap(struct work_struct *work)
> >  	spin_lock_irqsave(&priv->lock, flags);
> >  
> >  	while (!list_empty(&priv->cm.reap_list)) {
> > +		struct ipoib_neigh *neigh;
> > +		struct sk_buff *skb;
> >  		p = list_entry(priv->cm.reap_list.next, typeof(*p), list);
> > +		neigh = p->neigh;
> > +		if (neigh) {
> > +			*to_ipoib_neigh(neigh->neighbour) = NULL;
> > +			ipoib_dbg(priv, "%s: releasing ref to %pI6 "
> > +				  "(refcnt: %d)\n", __func__, neigh->dgid.raw,
> > +				  atomic_read(&neigh->neighbour->refcnt));
> > +			neigh_release(neigh->neighbour);
> > +			while ((skb = __skb_dequeue(&neigh->queue))) {
> > +				++dev->stats.tx_dropped;
> > +				dev_kfree_skb_any(skb);
> > +			}
> > +			kfree(neigh);
> > +		}
> > +		p->neigh = NULL;
> >  		list_del(&p->list);
> >  		spin_unlock_irqrestore(&priv->lock, flags);
> >  		netif_tx_unlock_bh(dev);
> 
> Couldn't this race with ipoib_neigh_cleanup()?
> 

Generally the fact that we hold a reference should prevent 
ipoib_neigh_cleanup() from being invoked. 

But, alas, there are a couple of cases where the neighbour can 
be have the neigh_cleanup() method invoked even when there are 
references held (e.g., when the interface goes down). 

So, yeah, there seems to be a hole there. A smaller hole, but 
still....

-- 
Arthur




More information about the general mailing list