[ofa-general] Re: IPoIB kernel Oops -- possible race condition identified.

Mon Jan 26 09:00:10 PST 2009

There's a patch of mine in OFED that's probably exposing a bug in ipoib.
The bug is that priv->broadcast can be set to NULL, and join_task does not
protect its check of the pointer with the spinlock.
The patch may expose the bug because it uses rtnl_lock().
However, in the 2.6.28 kernel there's another version of this patch which does not
take rtnl_lock, so the problem still exists but is probably much harder to reproduce.
Please see https://kerneltrap.org/mailarchive/openfabrics-general/2009/1/13/4705114/thread
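
To make the locking issue concrete, here is a rough user-space sketch of the
pattern I mean. It is not the ipoib code and not a proposed patch: the struct
and function names (priv_sketch, mcast_sketch, sketch_flush, sketch_update_mtu)
are made up for illustration, and a pthread mutex stands in for the driver's
spinlock. The only point is that the NULL check and the dereference happen
under the same lock the flush path takes when it clears the pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the real ipoib types. */
struct mcast_sketch {
	int mtu;
};

struct priv_sketch {
	pthread_mutex_t lock;           /* stands in for the driver's spinlock */
	struct mcast_sketch *broadcast; /* the flush path may set this to NULL */
	int mcast_mtu;
};

/* Flush path: clear the pointer while holding the lock, return the old value. */
static struct mcast_sketch *sketch_flush(struct priv_sketch *priv)
{
	struct mcast_sketch *old;

	pthread_mutex_lock(&priv->lock);
	old = priv->broadcast;
	priv->broadcast = NULL;
	pthread_mutex_unlock(&priv->lock);
	return old;
}

/*
 * Join path: check and dereference the pointer under the same lock.
 * Returns 0 if the MTU was updated, -1 if the broadcast group is gone.
 */
static int sketch_update_mtu(struct priv_sketch *priv)
{
	int ret = -1;

	assert(priv != NULL);
	pthread_mutex_lock(&priv->lock);
	if (priv->broadcast) {
		priv->mcast_mtu = priv->broadcast->mtu;
		ret = 0;
	}
	pthread_mutex_unlock(&priv->lock);
	return ret;
}
```

With this shape, a flush that runs between the early IPOIB_MCAST_RUN test and
the MTU update makes the update bail out instead of dereferencing NULL. Whether
the real fix should look like this, or should instead make the flush wait for
the join task, is a separate question.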

What OFED version are you using?

Jack Morgenstein wrote:
> The following Oops occurred several times on an X86 host when unloading the driver:
> (console command sequence:
>  /etc/init.d/openibd start
>  opensm &
>  pkill -2 opensm
>  /etc/init.d/openibd stop
> )
> ********************************************************************
> IP: [<f8e67a49>] :ib_ipoib:ipoib_mcast_join_task+0x193/0x217
> *pde = 00000000
> Oops: 0000 [#1] SMP
> ...
> 
> Pid: 22483, comm: ipoib Not tainted (2.6.27.5 #1)
> EIP: 0060:[<f8e67a49>] EFLAGS: 00010286 CPU: 1
> EIP is at ipoib_mcast_join_task+0x193/0x217 [ib_ipoib]
> EAX: 00000000 EBX: c2060480 ECX: 0005c700 EDX: ffffffff
> ESI: c20605dc EDI: c2060154 EBP: c2060480 ESP: f72aff64
>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process ipoib (pid: 22483, ti=f72af000 task=f59fcdc0 task.ti=f72af000)
> Stack: c2060000 00000004 00000005 00000005 00000001 02500848 00001000 00000000
>        00000000 00010008 03000001 02001200 00000504 f509bbc0 c2060508 f8e678b6
>        00000000 c04307a8 f509bbc0 c0430e7c f509bbcc c0430f2f 00000000 f59fcdc0
> Call Trace:
>  [<f8e678b6>] ipoib_mcast_join_task+0x0/0x217 [ib_ipoib]
>  [<c04307a8>] run_workqueue+0x6a/0xdf
>  [<c0430e7c>] worker_thread+0x0/0xbd
>  [<c0430f2f>] worker_thread+0xb3/0xbd
>  [<c04330a0>] autoremove_wake_function+0x0/0x2d
>  [<c0432fdf>] kthread+0x38/0x5d
>  [<c0432fa7>] kthread+0x0/0x5d
>  [<c0404473>] kernel_thread_helper+0x7/0x10
>  =======================
> EIP: [<f8e67a49>] ipoib_mcast_join_task+0x193/0x217 [ib_ipoib] SS:ESP 0068:f72aff64
> **********************************************************************
> ipoib_mcast_join_task +0x193 is at (in file ipoib_multicast.c):
> 	priv->mcast_mtu = IPOIB_UD_MTU(ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu));
> 
> I think the problem is the following:
>   priv->broadcast is NULLed out in procedure ipoib_mcast_dev_flush(), under the protection
>   of a spinlock.
> 
>   However, in ipoib_mcast_join_task(), there is no spinlock protection in the access to
>   priv->broadcast in the crash line given above.
> 
>   Note that there seems to be a race condition here.
>   If the flush occurs after the following test at the start of ipoib_mcast_join_task():
> 	if (!test_bit(IPOIB_MCAST_RUN, &priv->flags))
> 		return;
>   then there is no protection at all later for priv->broadcast being NULLed elsewhere.
> 
> - Jack

-- 
--Yossi