[openib-general] [PATCH] IB/ipoib: fix crash on mcast join finish
Or Gerlitz
ogerlitz at voltaire.com
Mon Jul 24 00:00:01 PDT 2006
Roland,
This crash happens every time ipoib_debug_mcast is set. The patch below fixes
it by setting mcast->ah before the debug code attempts to access it.
This is 2.6.18 material, correct? Also, since the crash makes it impossible
to debug mcast, I guess the fix needs to go into OFED 1.1 as well.
Or.
ib0: joining MGID ff12:401b:ffff:0000:0000:0000:ffff:ffff
ib0: join completion for ff12:401b:ffff:0000:0000:0000:ffff:ffff (status 0)
ib0: Created ah ffff81002cdb1c00
Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
[<ffffffff880adbf5>] :ib_ipoib:ipoib_mcast_join_finish+0x273/0x3af
PGD 0
Oops: 0000 [1] SMP
CPU 1
Modules linked in: ib_ipoib ib_sa autofs nfs lockd sunrpc sg st sd_mod sr_mod scsi_mod ib_mthca ib_mad ib_core i2c_amd8111 i2c_amd756 i2c_core e100
Pid: 3919, comm: ib_mad1 Not tainted 2.6.18-rc1 #6
RIP: 0010:[<ffffffff880adbf5>] [<ffffffff880adbf5>] :ib_ipoib:ipoib_mcast_join_finish+0x273/0x3af
RSP: 0018:ffff81003eba9ca0 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff81000ddcce80 RCX: 0000000000000012
RDX: 00000000000000ff RSI: ffff810031f63000 RDI: ffffffff880b087b
RBP: ffff8100050d61c0 R08: 0000000000000040 R09: 000000000000001b
R10: 0000000000000000 R11: 0000000000000000 R12: ffff810031f63500
R13: ffff810031f63000 R14: ffff81003eba9d58 R15: ffff81001a947000
FS: 00002aad0e23a0a0(0000) GS:ffff81003f8b8f40(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 0000000000201000 CR4: 00000000000006e0
Process ib_mad1 (pid: 3919, threadinfo ffff81003eba8000, task ffff81001e204a70)
Stack: 000000000000c000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 000101030000c000 0000ffff1b4012ff
ffffffff00000000 0000000000000000 000101030000c000 00000000000000ff
Call Trace:
[<ffffffff880ae622>] :ib_ipoib:ipoib_mcast_join_complete+0x9d/0x24c
[<ffffffff880a4aad>] :ib_sa:ib_sa_mcmember_rec_callback+0x40/0x49
[<ffffffff880a43b3>] :ib_sa:recv_handler+0x3a/0x43
[<ffffffff8802f48b>] :ib_mad:ib_mad_completion_handler+0x3ac/0x592
[<ffffffff8023fc47>] run_workqueue+0x9a/0xea
[<ffffffff8023fdfd>] worker_thread+0x108/0x13a
[<ffffffff80243244>] kthread+0xc9/0xf2
[<ffffffff8020a592>] child_rip+0x8/0x12
Code: ff 70 08 0f b6 43 0f 50 0f b6 43 0e 50 0f b6 43 0d 50 0f b6
RIP [<ffffffff880adbf5>] :ib_ipoib:ipoib_mcast_join_finish+0x273/0x3af
RSP <ffff81003eba9ca0>
CR2: 0000000000000008
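The faulting address 0x8 in the oops above is consistent with reading a member at offset 8 through a NULL pointer, which is what happens if the debug print dereferences mcast->ah before it has been assigned. The following user-space sketch illustrates only the ordering issue. The names demo_ah, demo_mcast, debug_read_av and join_finish_fixed are hypothetical stand-ins, not the real ipoib structures or functions, and the struct layout is an assumption chosen so that the second member sits one pointer width into the struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an address-handle wrapper (assumed layout). */
struct demo_ah {
	void *dev;	/* first pointer-sized member, offset 0 */
	void *ah;	/* second member, offset sizeof(void *) */
};

/* Hypothetical stand-in for a multicast group entry. */
struct demo_mcast {
	struct demo_ah *ah;	/* stays NULL until the join finishes */
};

/*
 * Mimics what a debug print would do: it reads mcast->ah->ah, so it is
 * only safe to call once mcast->ah has been assigned.
 */
static void *debug_read_av(const struct demo_mcast *mcast)
{
	assert(mcast->ah != NULL);
	return mcast->ah->ah;
}

/*
 * Ordering used by the fix: publish the handle first, then run the code
 * that reads it back through mcast->ah. Locking is omitted in this sketch.
 */
static void *join_finish_fixed(struct demo_mcast *mcast, struct demo_ah *ah)
{
	if (!ah)
		return NULL;	/* creation failed, nothing to print */

	mcast->ah = ah;
	return debug_read_av(mcast);
}
```

With the assignment placed after the debug read instead, debug_read_av() would be reached while mcast->ah is still NULL, and the read of the second member would fault one pointer width past address zero.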
----------------------------------------------------------------------
set mcast->ah before the ipoib mcast debug code attempts to access it
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
--- linux-2.6.18-rc1-orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2006-07-18 12:53:33.000000000 +0300
+++ linux-2.6.18-rc1/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2006-07-24 09:50:37.000000000 +0300
@@ -264,6 +264,9 @@ static int ipoib_mcast_join_finish(struc
 		if (!ah) {
 			ipoib_warn(priv, "ib_address_create failed\n");
 		} else {
+			spin_lock_irq(&priv->lock);
+			mcast->ah = ah;
+			spin_unlock_irq(&priv->lock);
 			ipoib_dbg_mcast(priv, "MGID " IPOIB_GID_FMT
 					" AV %p, LID 0x%04x, SL %d\n",
 					IPOIB_GID_ARG(mcast->mcmember.mgid),
@@ -271,10 +274,6 @@ static int ipoib_mcast_join_finish(struc
 					be16_to_cpu(mcast->mcmember.mlid),
 					mcast->mcmember.sl);
 		}
-
-		spin_lock_irq(&priv->lock);
-		mcast->ah = ah;
-		spin_unlock_irq(&priv->lock);
 	}
 	/* actually send any queued packets */