[ofa-general] Re: IPoIB kernel Oops -- possible race condition identified.
Jack Morgenstein
jackm at dev.mellanox.co.il
Tue Jan 27 01:07:02 PST 2009
On Monday 26 January 2009 19:00, Yossi Etigin wrote:
> There's a patch of mine in OFED that's probably exposing a bug in ipoib.
> The bug is that priv->broadcast can be NULL-ified and join_task does not
> protect the check with the spinlock.
> The patch may expose the bug because it uses rtnl_lock().
> However, in 2.6.28 kernel there's another version of this patch which does not
> take rtnl_lock, so the problem still exists but is probably much harder to reproduce.
>
I'm using code IDENTICAL to the 2.6.28 code, except for ipoib_warn and ipoib_dbg_mcast
message formatting. The 2.6.28 code is below. I've indicated the hole where priv->broadcast
may be set to NULL by another kernel thread. Below, at the end of this post, I have included
a suggested patch which fixes the problem.
- Jack
====================
2.6.28 snippet of ipoib_mcast_join_task:
while (1) {identical
struct ipoib_mcast *mcast = NULL;
spin_lock_irq(&priv->lock);
list_for_each_entry(mcast, &priv->multicast_list, list) {
if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)
&& !test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)
&& !test_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) {
/* Found the next unjoined group */
break;
}
}
spin_unlock_irq(&priv->lock);
if (&mcast->list == &priv->multicast_list) {
/* All done */
break;
}
ipoib_mcast_join(dev, mcast, 1);
return;
}
==>***** priv->broadcast MAY BE NULLED OUT HERE! **************<======
priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu));
if (!ipoib_cm_admin_enabled(dev)) {
rtnl_lock();
dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu));
rtnl_unlock();
}
=================================
The following patch can fix the problem:
ipoib: fix unprotected use of priv->broadcast in ipoib_mcast_join_task.
There is a race whereby the ipoib broadcast pointer may be set to NULL by flush while the join
task is being started. This protects the broadcast pointer access via a spinlock. If the
pointer is indeed NULL, we set the mcast_mtu value to the current admin_mtu value -- since
it does not matter anyway, the I/F is going down.
Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2009-01-27 10:48:07.399491000 +0200
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2009-01-26 18:17:07.000000000 +0200
@@ -581,7 +593,12 @@ void ipoib_mcast_join_task(struct work_s
return;
}
- priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu));
+ spin_lock_irq(&priv->lock);
+ if (priv->broadcast)
+ priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu));
+ else
+ priv->mcast_mtu = priv->admin_mtu;
+ spin_unlock_irq(&priv->lock);
if (!ipoib_cm_admin_enabled(dev)) {
rtnl_lock();
More information about the general
mailing list