[ofa-general] IPoIB kernel Oops -- race condition
Jack Morgenstein
jackm at dev.mellanox.co.il
Sun Jun 28 23:26:55 PDT 2009
On Monday 29 June 2009 07:14, Jack Morgenstein wrote:
> >
> On second thought, maybe it would be simpler to just create an ipoib_stop_task(),
> and do everything ipoib_stop() does in that workqueue task. leave would thus always
> be executed in the workqueue.
>
> Thoughts?
>
Looked at this again. There is a problem with this approach -- If I do ipoib_stop in
the workqueue, I need to do ipoib_open in the workqueue as well. In ipoib_open, I will
lose all the error returns if I take this approach:
int ipoib_open(struct net_device *dev)
{
struct ipoib_dev_priv *priv = netdev_priv(dev);
ipoib_dbg(priv, "bringing up interface\n");
napi_enable(&priv->napi);
set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags);
if (ipoib_pkey_dev_delay_open(dev))
return 0;
if (ipoib_ib_dev_open(dev)) {
napi_disable(&priv->napi);
return -EINVAL;
}
if (ipoib_ib_dev_up(dev)) {
ipoib_ib_dev_stop(dev, 1);
napi_disable(&priv->napi);
return -EINVAL;
}
if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) {
struct ipoib_dev_priv *cpriv;
/* Bring up any child interfaces too */
mutex_lock(&priv->vlan_mutex);
list_for_each_entry(cpriv, &priv->child_intfs, list) {
int flags;
flags = cpriv->dev->flags;
if (flags & IFF_UP)
continue;
dev_change_flags(cpriv->dev, flags | IFF_UP);
}
mutex_unlock(&priv->vlan_mutex);
}
netif_start_queue(dev);
return 0;
}
I'm very worried that if we do things asymmetrically we will introduce instability.
Regarding doing only the multicast leave/free in the workqueue -- I do not see how to do
this cleanly (i.e., everything as is, but simply scheduling a leave-free task).
Yossi, Roland, Eli?
-Jack
More information about the general
mailing list