[ofa-general] [PATCH] ipoib: do not join broadcast group if interface is brought down
Vladimir Sokolovsky
vlad at dev.mellanox.co.il
Wed Nov 26 00:53:28 PST 2008
Yossi Etigin wrote:
> Because ipoib_workqueue is not flushed when ipoib interface is brought down,
> ipoib_mcast_join() may trigger a join to the broadcast group after priv->broadcast
> was set to NULL (during cleanup). This will cause ipoib to be joined to the
> broadcast group when interface is down.
> As a side effect, this breaks the optimization of setting qkey only when joining
> the broadcast group.
>
> Signed-off-by: Yossi Etigin <yosefe at voltaire.com>
>
> --
>
> Fix bugzilla 1370.
>
> Index: b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> ===================================================================
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-11-19 21:33:54.000000000 +0200
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-11-19 21:40:12.000000000 +0200
> @@ -565,7 +565,8 @@ void ipoib_mcast_join_task(struct work_s
> ipoib_warn(priv, "ib_query_port failed\n");
> }
>
> - if (!priv->broadcast) {
> + rtnl_lock();
> + if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags) && !priv->broadcast) {
> struct ipoib_mcast *broadcast;
>
> broadcast = ipoib_mcast_alloc(dev, 1);
> @@ -576,6 +577,7 @@ void ipoib_mcast_join_task(struct work_s
> queue_delayed_work(ipoib_workqueue,
> &priv->mcast_join_task, HZ);
> mutex_unlock(&mcast_mutex);
> + rtnl_unlock();
> return;
> }
>
> @@ -587,6 +589,7 @@ void ipoib_mcast_join_task(struct work_s
> __ipoib_mcast_add(dev, priv->broadcast);
> spin_unlock_irq(&priv->lock);
> }
> + rtnl_unlock();
>
> if (!test_bit(IPOIB_MCAST_FLAG_ATTACHED, &priv->broadcast->flags)) {
> if (!test_bit(IPOIB_MCAST_FLAG_BUSY, &priv->broadcast->flags))
Hi Yossi,
I got the following kernel oops on SLES 10 (2.6.16.21-0.8-smp) using the
patch above.
To reproduce, run:
rmmod ib_ipoib
Unable to handle kernel NULL pointer dereference at virtual address 00000068
printing eip:
f8c5e3c4
*pde = 7a0e8067
Oops: 0000 [#1]
SMP
last sysfs file: /class/infiniband/mthca0/node_desc
Modules linked in: ib_ipoib ib_cm ib_sa ib_uverbs ib_umad mlx4_ib
mlx4_core ib_mthca ib_mad ib_core memtrack autofs4 nfs lockd nfs_acl
sunrpc ipv6 af_packe
CPU: 0
EIP: 0060:[<f8c5e3c4>] Tainted: G U VLI
EFLAGS: 00010202 (2.6.16.21-0.8-smp #1)
EIP is at ipoib_mcast_join_task+0x134/0x24d [ib_ipoib]
eax: 00000000 ebx: f6a2c3e8 ecx: 00000000 edx: 00000000
esi: f6a2c56c edi: f6a2c12c ebp: f6a2c380 esp: f6a2bf0c
ds: 007b es: 007b ss: 0068
Process ipoib (pid: 7858, threadinfo=f6a2a000 task=f7e3c0f0)
Stack: <0>f6a2c000 00000004 00000004 00000004 00000020 02510a68 80000000
00000000
00000000 00020040 0400000f 02001200 00000501 f6a2c3e8 f6a2c3ec
f73447c0
00000292 c012d85e f8c5e290 f6a2c3e8 f73447cc f73447c0 f73447d4
c012e052
Call Trace:
[<c012d85e>] run_workqueue+0x7f/0xba
[<f8c5e290>] ipoib_mcast_join_task+0x0/0x24d [ib_ipoib]
[<c012e052>] worker_thread+0x0/0x11e
[<c012e13f>] worker_thread+0xed/0x11e
[<c011a067>] default_wake_function+0x0/0xc
[<c0130895>] kthread+0x9d/0xc9
[<c01307f8>] kthread+0x0/0xc9
[<c0102005>] kernel_thread_helper+0x5/0xb
Code: 21 63 c7 8b 75 04 81 c6 3c 01 00 00 a5 a5 a5 a5 89 5d 28 8b 04 24
89 da e8 b3 f5 ff ff b0 01 86 45 00 fb e8 62 92 5e c7 8b 55 28 <8b> 42
68 a8 08 75
Regards,
Vladimir
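
One plausible reading of the trace, offered as an assumption rather than a
confirmed diagnosis: the faulting instruction (<8b> 42 68) reads offset 0x68
from edx, which is 0, matching the reported fault address 00000068. That is
consistent with the patched condition skipping the allocation block while the
interface is down and the later test_bit() on priv->broadcast->flags then
running with priv->broadcast == NULL. The user-space sketch below only models
that control flow. The struct names, the enum, and the with_guard parameter
are hypothetical and are not part of the driver or of any proposed fix.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real ipoib structures. */
struct mcast_model {
    unsigned long flags;
};

struct priv_model {
    bool admin_up;                 /* models test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags) */
    struct mcast_model *broadcast; /* models priv->broadcast */
};

enum outcome {
    OUT_ALLOCATED,    /* the allocation block is entered */
    OUT_SKIPPED,      /* hypothetical early return, nothing dereferenced */
    OUT_NULL_DEREF,   /* priv->broadcast->flags would be read through NULL */
    OUT_FLAGS_TESTED  /* priv->broadcast is valid, flags can be tested */
};

/*
 * Models the control flow of the patched hunk. With with_guard == false the
 * function follows the patch as posted; with with_guard == true it adds an
 * assumed extra NULL check before the flags test.
 */
static enum outcome join_task_model(const struct priv_model *priv, bool with_guard)
{
    /* Patched condition: allocate only if admin-up and no broadcast yet. */
    if (priv->admin_up && priv->broadcast == NULL)
        return OUT_ALLOCATED;

    /* Assumed guard: stop before touching a NULL broadcast pointer. */
    if (with_guard && priv->broadcast == NULL)
        return OUT_SKIPPED;

    /* The flags test is reached unconditionally in the patched code. */
    if (priv->broadcast == NULL)
        return OUT_NULL_DEREF;

    return OUT_FLAGS_TESTED;
}
```

In this model, an interface that is down with no broadcast entry reaches
OUT_NULL_DEREF under the posted patch and OUT_SKIPPED once the assumed guard
is added. Whether such a guard is the right fix for the driver is left open.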