[ofa-general] IPoIB kernel Oops -- possible race condition identified.

Mon Jan 26 07:41:08 PST 2009

The following Oops occurred several times on an X86 host when unloading the driver:
(console command sequence:
 /etc/init.d/openibd start
 opensm &
 pkill -2 opensm
 /etc/init.d/openibd stop
)
********************************************************************
IP: [<f8e67a49>] :ib_ipoib:ipoib_mcast_join_task+0x193/0x217
*pde = 00000000
Oops: 0000 [#1] SMP
...

Pid: 22483, comm: ipoib Not tainted (2.6.27.5 #1)
EIP: 0060:[<f8e67a49>] EFLAGS: 00010286 CPU: 1
EIP is at ipoib_mcast_join_task+0x193/0x217 [ib_ipoib]
EAX: 00000000 EBX: c2060480 ECX: 0005c700 EDX: ffffffff
ESI: c20605dc EDI: c2060154 EBP: c2060480 ESP: f72aff64
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process ipoib (pid: 22483, ti=f72af000 task=f59fcdc0 task.ti=f72af000)
Stack: c2060000 00000004 00000005 00000005 00000001 02500848 00001000 00000000
       00000000 00010008 03000001 02001200 00000504 f509bbc0 c2060508 f8e678b6
       00000000 c04307a8 f509bbc0 c0430e7c f509bbcc c0430f2f 00000000 f59fcdc0
Call Trace:
 [<f8e678b6>] ipoib_mcast_join_task+0x0/0x217 [ib_ipoib]
 [<c04307a8>] run_workqueue+0x6a/0xdf
 [<c0430e7c>] worker_thread+0x0/0xbd
 [<c0430f2f>] worker_thread+0xb3/0xbd
 [<c04330a0>] autoremove_wake_function+0x0/0x2d
 [<c0432fdf>] kthread+0x38/0x5d
 [<c0432fa7>] kthread+0x0/0x5d
 [<c0404473>] kernel_thread_helper+0x7/0x10
 =======================
EIP: [<f8e67a49>] ipoib_mcast_join_task+0x193/0x217 [ib_ipoib] SS:ESP 0068:f72aff64
**********************************************************************
ipoib_mcast_join_task+0x193 corresponds to this line in ipoib_multicast.c:
	priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu));

I think the problem is the following:
  priv->broadcast is set to NULL in ipoib_mcast_dev_flush() while holding a
  spinlock.

  However, ipoib_mcast_join_task() reads priv->broadcast on the crash line given
  above without taking that spinlock.

  This looks like a race condition.
  If the flush occurs after the following test at the start of ipoib_mcast_join_task():
	if (!test_bit(IPOIB_MCAST_RUN, &priv->flags))
		return;
  then nothing later in the function protects against priv->broadcast being set to
  NULL by the flush.
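
To make the pattern concrete, here is a minimal userspace sketch of one way the
join path could be guarded. This is an illustration only, not the driver code and
not a proposed patch: the struct names, field names, and function names are
hypothetical, and a pthread mutex stands in for the driver's spinlock. The idea is
that the reader takes the same lock as the flush path and re-checks the pointer
before dereferencing it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the real ipoib types. */
struct mcast {
	int mtu;
};

struct dev_priv {
	pthread_mutex_t lock;     /* plays the role of the driver's spinlock */
	struct mcast *broadcast;  /* may be set to NULL by the flush path */
	int mcast_mtu;
};

/* Flush path: clears the pointer while holding the lock. */
static void mcast_flush(struct dev_priv *priv)
{
	pthread_mutex_lock(&priv->lock);
	priv->broadcast = NULL;
	pthread_mutex_unlock(&priv->lock);
}

/*
 * Join path: take the same lock and re-check the pointer before using it.
 * Returns 0 if the MTU was updated, -1 if the broadcast group was already
 * flushed.
 */
static int mcast_join_update_mtu(struct dev_priv *priv)
{
	int ret = -1;

	pthread_mutex_lock(&priv->lock);
	if (priv->broadcast) {
		priv->mcast_mtu = priv->broadcast->mtu;
		ret = 0;
	}
	pthread_mutex_unlock(&priv->lock);
	return ret;
}
```

With this shape, a flush that lands after the IPOIB_MCAST_RUN test makes the join
path return early instead of dereferencing NULL. Whether the real fix should be
this check or stopping the join task before the flush is a separate question.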

- Jack