[ofa-general] [PATCH v2] ipoib: fix rtnl deadlock

Or Gerlitz ogerlitz at voltaire.com
Thu Aug 14 04:31:38 PDT 2008


Roland Dreier wrote:
>  > > I don't think moving to a different workqueue helps, does it?  Because
>  > > we just have to flush *that* workqueue somewhere too.
>
>  > Yes, but it won't have to be from ipoib_stop, it can be from a place
>  > where rtnl_lock is not held.
>
> That's kind of the direction I've been looking, except I don't think we
> need to invent a new workqueue to do this.  It seems that ipoib_stop is
> the wrong place to flush our workqueue in general.
I get this with 2.6.27-rc3 (traces below), which I assume is what this
thread is about. Can you guys fix it? Rebooting every time the device
goes down is hard to work with. I'll use the current version of the
patch for now, but would be happy to see something go upstream.

Or.
> ipoib         D ffffffff805ea3c0     0  1905      2
>  ffff880037c1ddc0 0000000000000046 0000000000000000 ffff88007e4b9230
>  ffff88007f842f80 0000000100000003 ffff88007e827910 ffff88007e8276c0
>  0000000000000001 0000000000000000 0000000000000000 00000000000000ff
> Call Trace:
>  [<ffffffff8045e4b3>] __mutex_lock_slowpath+0x69/0xa6
>  [<ffffffff8045e3a1>] mutex_lock+0x24/0x28
>  [<ffffffffa017015f>] mthca_query_port+0x1c4/0x1d6 [ib_mthca]
>  [<ffffffffa017015f>] mthca_query_port+0x1c4/0x1d6 [ib_mthca]
>  [<ffffffffa01533f7>] ipoib_mcast_join_task+0x244/0x290 [ib_ipoib]
>  [<ffffffffa01531b3>] ipoib_mcast_join_task+0x0/0x290 [ib_ipoib]
>  [<ffffffff8023fdf7>] run_workqueue+0x8f/0x114
>  [<ffffffff8023fe7c>] worker_thread+0x0/0xec
>  [<ffffffff8023ff5e>] worker_thread+0xe2/0xec
>  [<ffffffff80243003>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff80243003>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff80242a20>] kthread+0x3d/0x63
>  [<ffffffff8020c249>] child_rip+0xa/0x11
>  [<ffffffff802429e3>] kthread+0x0/0x63
>  [<ffffffff8020c23f>] child_rip+0x0/0x11
>
>
> ifconfig      D 0000000000000002     0 19218   5470
>  ffff880058d4dc18 0000000000000082 0000000000000000 ffffffff80555a40
>  ffffffff80550020 0000000000000001 ffff88007a30a000 ffff88007a309db0
>  0000000058d4dbe8 0000000000000000 00000000ffffffff 00000000000000ff
> Call Trace:
>  [<ffffffff8045e0be>] schedule_timeout+0x1e/0xad
>  [<ffffffff8045e0be>] schedule_timeout+0x1e/0xad
>  [<ffffffff8045dd7d>] wait_for_common+0xfb/0x178
>  [<ffffffff8022b6bb>] default_wake_function+0x0/0xe
>  [<ffffffff8022b6bb>] default_wake_function+0x0/0xe
>  [<ffffffff80240024>] flush_cpu_workqueue+0x62/0x6b
>  [<ffffffff8023ff68>] wq_barrier_func+0x0/0x9
>  [<ffffffff80240065>] flush_workqueue+0x38/0x4e
>  [<ffffffffa014e1cd>] ipoib_stop+0x75/0x10c [ib_ipoib]
>  [<ffffffff803e77a8>] dev_close+0x6f/0x87
>  [<ffffffff803e9b12>] dev_change_flags+0xa3/0x15b
>  [<ffffffff80428aa0>] devinet_ioctl+0x293/0x5d3
>  [<ffffffff8042a5b4>] inet_ioctl+0x8f/0xa7
>  [<ffffffff803dd413>] sock_ioctl+0x0/0x1f6
>  [<ffffffff803dd5e5>] sock_ioctl+0x1d2/0x1f6
>  [<ffffffff802925c9>] vfs_ioctl+0x29/0x6f
>  [<ffffffff80292865>] do_vfs_ioctl+0x256/0x265
>  [<ffffffff802928c5>] sys_ioctl+0x51/0x74
>  [<ffffffff8020b2fb>] system_call_fastpath+0x16/0x1b
>
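
For reference, the pattern discussed above (flushing from a place where
rtnl_lock is not held) can be sketched in userspace. This is only an
illustration under stated assumptions, not the ipoib patch: a pthread
mutex named cfg_lock stands in for rtnl_lock, a worker thread stands in
for the queued work item (which presumably needs the same lock), and
pthread_join stands in for flush_workqueue. All names here are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for rtnl_lock; this is not the kernel API. */
static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;
static int device_up = 1;
static int work_ran;

/* Work item: takes the lock, as a queued task touching device state might. */
static void *work_fn(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&cfg_lock);
    work_ran = 1;
    pthread_mutex_unlock(&cfg_lock);
    return NULL;
}

/*
 * Stop path. Returns 0 once the device is marked down and the worker
 * has finished, -1 if the worker could not be started.
 */
static int stop_then_flush(void)
{
    pthread_t worker;

    pthread_mutex_lock(&cfg_lock);
    /* Model work that is already pending when the stop path runs. */
    if (pthread_create(&worker, NULL, work_fn, NULL) != 0) {
        pthread_mutex_unlock(&cfg_lock);
        return -1;
    }
    device_up = 0;
    /*
     * Calling pthread_join() at this point, with cfg_lock still held,
     * would never return: work_fn needs cfg_lock before it can finish.
     * That is the shape of the hang in the two traces above.
     */
    pthread_mutex_unlock(&cfg_lock);

    /* "Flush" outside the lock: the worker can take cfg_lock and exit. */
    pthread_join(worker, NULL);
    assert(work_ran == 1);
    return device_up ? -1 : 0;
}
```

Calling stop_then_flush() from a main() (build with cc -pthread) returns
0 after the worker has run. Moving the join above the unlock reproduces
the hang, which is why the flush belongs outside the locked section.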




