[ofa-general] lock dependency in ib_user_mad

chas williams - CONTRACTOR chas at cmf.nrl.navy.mil
Tue Jan 8 09:33:33 PST 2008


In message <000401c8518c$aabcf280$a937170a at amr.corp.intel.com>, "Sean Hefty" writes:
>I turned on lock checking and got the following possible locking dependency.
>(Running on 2.6.24-rc3.)

I have seen a similar deadlock before; however, I couldn't get enough
information to track it down.  Earlier Roland wrote:

>This should be fine (and comes from an earlier set of changes to fix
>deadlocks): ib_umad_close() does a downgrade_write() before calling
>ib_unregister_mad_agent(), so it only holds the mutex with a read
>lock, which means that queue_packet() should be able to take another
>read lock.
>
>Unless there's something that prevents one thread from taking a read
>lock twice?  What kernel are you seeing these problems with?
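
For reference, the sequence Roland describes boils down to something like
the sketch below.  It uses the kernel rwsem/workqueue primitives, but the
names (port_mutex, mad_wq, queue_packet_work, umad_close_sketch) and the
bodies are illustrative, not the actual drivers/infiniband code; the
comment in the close path spells out one interleaving that can wedge.

	#include <linux/rwsem.h>
	#include <linux/workqueue.h>

	static DECLARE_RWSEM(port_mutex);
	static struct workqueue_struct *mad_wq;

	/* receive side: what queue_packet() does with the lock */
	static void queue_packet_work(struct work_struct *work)
	{
		down_read(&port_mutex);
		/* ... hand the received MAD to the file's queue ... */
		up_read(&port_mutex);
	}

	/* close side: the downgrade_write() pattern from ib_umad_close() */
	static void umad_close_sketch(void)
	{
		down_write(&port_mutex);
		/* ... unlink the file from the port ... */
		downgrade_write(&port_mutex);	/* keep only a read lock */

		/*
		 * ib_unregister_mad_agent() ends up in flush_workqueue(),
		 * i.e. we wait for queue_packet_work() while still holding
		 * the read lock.  Two read holders are fine on their own,
		 * but if another task queues a down_write() on port_mutex
		 * in the meantime, the work item's down_read() blocks
		 * behind that writer, the writer blocks behind our read
		 * lock, and we block in flush_workqueue().
		 */
		flush_workqueue(mad_wq);

		up_read(&port_mutex);
	}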

I don't think you are allowed to have nested locks of any sort.
From include/linux/rwsem.h:

	#ifdef CONFIG_DEBUG_LOCK_ALLOC
	/*
	 * nested locking. NOTE: rwsems are not allowed to recurse
	 * (which occurs if the same task tries to acquire the same
	 * lock instance multiple times), but multiple locks of the
	 * same lock class might be taken, if the order of the locks
	 * is always the same. This ordering rule can be expressed
	 * to lockdep via the _nested() APIs, but enumerating the
	 * subclasses that are used. (If the nesting relationship is
	 * static then another method for expressing nested locking is
	 * the explicit definition of lock class keys and the use of
	 * lockdep_set_class() at lock initialization time.
	 * See Documentation/lockdep-design.txt for more details.)
	 */
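
Concretely, the recursion that comment forbids is the same task doing
something like the following (a minimal sketch; rwsems queue their
waiters, so the second down_read() is only safe as long as no writer
has queued on the semaphore in between):

	#include <linux/rwsem.h>

	static DECLARE_RWSEM(sem);

	static void recursive_read_sketch(void)
	{
		down_read(&sem);	/* first read lock: fine */

		/* if another task calls down_write(&sem) right here ... */

		down_read(&sem);	/* ... this blocks behind the queued
					 * writer, and the writer blocks
					 * behind our first read lock:
					 * the task deadlocks against itself */

		up_read(&sem);
		up_read(&sem);
	}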

The trace below certainly looks like a nested read lock.

>Jan  7 12:23:35 mshefty-linux3 kernel: -> #0 (¥	){--..}:
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff80256d9a>] print_stack_trace+0x6a/0x80
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8025d0a2>] __lock_acquire+0x612/0x10c0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8025d377>] __lock_acquire+0x8e7/0x10c0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8025de63>] lock_acquire+0x53/0x70
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8024d7b0>] flush_workqueue+0x0/0xa0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff804d0445>] _spin_unlock_irqrestore+0x55/0x70
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8024d7f3>] flush_workqueue+0x43/0xa0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff881e37b7>] ib_unregister_mad_agent+0x297/0x460 [ib_mad]
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8827849e>] ib_umad_close+0xbe/0x100 [ib_umad]
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff802a41ab>] __fput+0x1cb/0x200
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff802a243b>] filp_close+0x4b/0xa0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8023d0e0>] put_files_struct+0x70/0xc0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8023db08>] do_exit+0x1d8/0x8d0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff802463f7>] __dequeue_signal+0x27/0x1e0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8023e270>] do_group_exit+0x30/0x90
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8024868e>] get_signal_to_deliver+0x2fe/0x4f0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8020afe5>] do_notify_resume+0xc5/0x750
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff804cfca4>] trace_hardirqs_on_thunk+0x35/0x3a
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8025c2df>] trace_hardirqs_on+0xbf/0x160
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8020bc40>] sysret_signal+0x21/0x31
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffff8020bf47>] ptregscall_common+0x67/0xb0
>Jan  7 12:23:35 mshefty-linux3 kernel:        [<ffffffffffffffff>] 0xffffffffffffffff



