[ofa-general] IPoIB kernel Oops -- race condition

Jack Morgenstein jackm at dev.mellanox.co.il
Sun Jun 28 21:01:10 PDT 2009


On Sunday 28 June 2009 23:04, Yossi Etigin wrote:
> Jack Morgenstein wrote:
> >  in ipoib_mcast_leave():
> >          *** NEED TO WAIT HERE BEFORE CONTINUING (so that BUSY is cleared (mcast->mc is in error),
> >          *** or BUSY flag is set and mcast->mc is a valid, non-NULL pointer ****
> >          if (test_and_clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
> >                ib_sa_free_multicast(mcast->mc);
> >  
> > - Jack
> 
> How about making the leave/free mcast operation take place on the ipoib_workqueue, on which
> the join operation takes place? this way we can avoid this race, and more potential races
> of this kind.
> 
> --Yossi
> 
> 
It may be possible.  The only place where the leave does not take place on the ipoib_workqueue
is ipoib_stop (where, of course, the crash occurs).

Maybe we can transfer some of the stop operation to the workqueue?

static int ipoib_stop(struct net_device *dev)
{
	struct ipoib_dev_priv *priv = netdev_priv(dev);

	ipoib_dbg(priv, "stopping interface\n");

	clear_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
	napi_disable(&priv->napi);

	netif_stop_queue(dev);
************* Transfer to work queue starting here ****************
	ipoib_ib_dev_down(dev, 0);
	ipoib_ib_dev_stop(dev, 0);

	if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) {
		struct ipoib_dev_priv *cpriv;

		/* Bring down any child interfaces too */
		mutex_lock(&priv->vlan_mutex);
		list_for_each_entry(cpriv, &priv->child_intfs, list) {
			int flags;

			flags = cpriv->dev->flags;
			if (!(flags & IFF_UP))
				continue;

			dev_change_flags(cpriv->dev, flags & ~IFF_UP);
		}
		mutex_unlock(&priv->vlan_mutex);
	}
****************** transfer to here *******
	return 0;
}

Or, maybe change the order of things as well, as below:

static int ipoib_stop(struct net_device *dev)
{
	struct ipoib_dev_priv *priv = netdev_priv(dev);

	ipoib_dbg(priv, "stopping interface\n");

	clear_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
	napi_disable(&priv->napi);

	netif_stop_queue(dev);
	if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) {
		struct ipoib_dev_priv *cpriv;

		/* Bring down any child interfaces too */
		mutex_lock(&priv->vlan_mutex);
		list_for_each_entry(cpriv, &priv->child_intfs, list) {
			int flags;

			flags = cpriv->dev->flags;
			if (!(flags & IFF_UP))
				continue;

			dev_change_flags(cpriv->dev, flags & ~IFF_UP);
		}
		mutex_unlock(&priv->vlan_mutex);
	}
************* Transfer to work queue starting here ****************
	ipoib_ib_dev_down(dev, 0);
	ipoib_ib_dev_stop(dev, 0);
****************** transfer to here *******
	return 0;
}

Yossi, Roland, what do you think?



More information about the general mailing list