[ofa-general] lockdep is unhappy with SDP
Lars Ellenberg
lars.ellenberg at linbit.com
Tue Jul 7 09:14:15 PDT 2009
[ 1947.003662] =============================================
[ 1947.006730] [ INFO: possible recursive locking detected ]
[ 1947.020862] 2.6.27.26-d02 #2
that is, a kernel built from:
git://git.openfabrics.org/ofed_1_4/linux-2.6.git
merged with upstream stable
git://git4.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.27.y.git
both as of today:
ofed_kernel 08acda8 sdp: Fix memory leak in bzcopy
linux-v2.6.27.y/master 49cbf40 Linux 2.6.27.26
The relevant code section looks identical in OFED 1.5, though.
[ 1947.020862] ---------------------------------------------
[ 1947.020862] ib_cm/6/31070 is trying to acquire lock:
[ 1947.054199] (sk_lock-27){--..}, at: [<ffffffffa02a3100>] sdp_connected_handler+0x1c0/0x2f0 [ib_sdp]
[ 1947.054199]
[ 1947.054199] but task is already holding lock:
[ 1947.054199] (sk_lock-27){--..}, at: [<ffffffffa02a330c>] sdp_cma_handler+0x3c/0x15f0 [ib_sdp]
[ 1947.054199]
[ 1947.054199] other info that might help us debug this:
[ 1947.054199] 4 locks held by ib_cm/6/31070:
[ 1947.120855] #0: (ib_cm){--..}, at: [<ffffffff80253bed>] run_workqueue+0x14d/0x250
[ 1947.120855] #1: (&(&work->work)->work){--..}, at: [<ffffffff80253bed>] run_workqueue+0x14d/0x250
[ 1947.120855] #2: (&id_priv->handler_mutex){--..}, at: [<ffffffffa02922bb>] cma_disable_callback+0x2b/0x60 [rdma_cm]
[ 1947.120855] #3: (sk_lock-27){--..}, at: [<ffffffffa02a330c>] sdp_cma_handler+0x3c/0x15f0 [ib_sdp]
[ 1947.120855]
[ 1947.120855] stack backtrace:
[ 1947.120855] Pid: 31070, comm: ib_cm/6 Not tainted 2.6.27.26-d02 #2
[ 1947.120855]
[ 1947.120855] Call Trace:
[ 1947.120855] [<ffffffff80269b0f>] validate_chain+0xaaf/0x1040
[ 1947.120855] [<ffffffff8026a2ea>] __lock_acquire+0x24a/0xa00
[ 1947.120855] [<ffffffff8026ab31>] lock_acquire+0x91/0xc0
[ 1947.250858] [<ffffffffa02a3100>] ? sdp_connected_handler+0x1c0/0x2f0 [ib_sdp]
[ 1947.250858] [<ffffffff804415e8>] lock_sock_nested+0x108/0x120
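Actually, the code calls lock_sock(parent),
which is lock_sock_nested(parent, 0), so the second sk_lock-27 is taken
with the same subclass (0) as the one already held.
Maybe that is simply wrong?
Or maybe it is in fact OK, and you only need to tell lockdep about it, like this: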
- lock_sock(parent);
+ lock_sock_nested(parent, SINGLE_DEPTH_NESTING);
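
To illustrate why the one-line change above would quiet the report, here is a minimal user-space sketch of the check. It is a toy model, not kernel code and not lockdep's actual implementation. It assumes only that lockdep complains when a task takes a lock whose class and subclass both match a lock it already holds. The names task_model, held_lock, would_warn_recursive and model_acquire are made up for this example. SINGLE_DEPTH_NESTING has the same value (1) as in the kernel's lockdep.h, and class id 27 merely mirrors the sk_lock-27 label from the log.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of lockdep's same-class check; hypothetical names, not kernel code. */
#define MAX_HELD 8
#define SINGLE_DEPTH_NESTING 1 /* same value as in the kernel's lockdep.h */

struct held_lock {
    int class_id; /* e.g. 27 for sk_lock-27 */
    int subclass; /* 0 for plain lock_sock() */
};

struct task_model {
    struct held_lock held[MAX_HELD];
    size_t depth;
};

/* True if the task already holds a lock with the same class AND subclass. */
static bool would_warn_recursive(const struct task_model *t,
                                 int class_id, int subclass)
{
    for (size_t i = 0; i < t->depth; i++) {
        if (t->held[i].class_id == class_id &&
            t->held[i].subclass == subclass)
            return true;
    }
    return false;
}

/*
 * Record an acquisition. Returns true if this model would report
 * "possible recursive locking". If the held-lock table is full, the
 * acquisition is not recorded and false is returned.
 */
static bool model_acquire(struct task_model *t, int class_id, int subclass)
{
    bool warn;

    if (t->depth >= MAX_HELD)
        return false;

    warn = would_warn_recursive(t, class_id, subclass);
    t->held[t->depth++] = (struct held_lock){ class_id, subclass };
    return warn;
}
```

In this model, taking class 27 with subclass 0 twice returns true on the second call, which corresponds to the report above. Taking the second one with SINGLE_DEPTH_NESTING returns false. Note that such an annotation only tells lockdep the nesting is intended; it does not by itself show that the child-then-parent lock order is safe against every other code path.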
[ 1947.250858] [<ffffffffa02a3100>] ? sdp_connected_handler+0x1c0/0x2f0 [ib_sdp]
[ 1947.250858] [<ffffffff8026891d>] ? trace_hardirqs_on+0xd/0x10
[ 1947.250858] [<ffffffff8026886a>] ? trace_hardirqs_on_caller+0xda/0x180
[ 1947.250858] [<ffffffffa02a3100>] sdp_connected_handler+0x1c0/0x2f0 [ib_sdp]
[ 1947.250858] [<ffffffffa02a39c1>] sdp_cma_handler+0x6f1/0x15f0 [ib_sdp]
[ 1947.250858] [<ffffffff8026886a>] ? trace_hardirqs_on_caller+0xda/0x180
[ 1947.250858] [<ffffffff804d3193>] ? mutex_lock_nested+0x1f3/0x300
[ 1947.250858] [<ffffffffa02922bb>] ? cma_disable_callback+0x2b/0x60 [rdma_cm]
[ 1947.250858] [<ffffffff8026513d>] ? trace_hardirqs_off+0xd/0x10
[ 1947.250858] [<ffffffffa02922bb>] ? cma_disable_callback+0x2b/0x60 [rdma_cm]
[ 1947.250858] [<ffffffffa0289b00>] ? cm_work_handler+0x0/0xce0 [ib_cm]
[ 1947.250858] [<ffffffffa0289b00>] ? cm_work_handler+0x0/0xce0 [ib_cm]
[ 1947.250858] [<ffffffffa02942fd>] cma_ib_handler+0xcd/0x230 [rdma_cm]
[ 1947.250858] [<ffffffff8026891d>] ? trace_hardirqs_on+0xd/0x10
[ 1947.250858] [<ffffffffa0288632>] cm_process_work+0x22/0xe0 [ib_cm]
[ 1947.250858] [<ffffffffa0289b00>] ? cm_work_handler+0x0/0xce0 [ib_cm]
[ 1947.250858] [<ffffffffa028879b>] cm_rtu_handler+0xab/0x150 [ib_cm]
[ 1947.499836] [<ffffffffa0289bfc>] cm_work_handler+0xfc/0xce0 [ib_cm]
[ 1947.507523] [<ffffffff80253bed>] ? run_workqueue+0x14d/0x250
[ 1947.507523] [<ffffffffa0289b00>] ? cm_work_handler+0x0/0xce0 [ib_cm]
[ 1947.507523] [<ffffffff80253c3e>] run_workqueue+0x19e/0x250
[ 1947.507523] [<ffffffff80253bed>] ? run_workqueue+0x14d/0x250
[ 1947.507523] [<ffffffff80254a0f>] worker_thread+0xbf/0x120
[ 1947.507523] [<ffffffff802580c0>] ? autoremove_wake_function+0x0/0x40
[ 1947.507523] [<ffffffff80254950>] ? worker_thread+0x0/0x120
[ 1947.507523] [<ffffffff80257c7d>] kthread+0x4d/0x80
[ 1947.507523] [<ffffffff8020d4d9>] child_rip+0xa/0x11
[ 1947.507523] [<ffffffff8020cae3>] ? restore_args+0x0/0x30
[ 1947.507523] [<ffffffff80257c30>] ? kthread+0x0/0x80
[ 1947.507523] [<ffffffff8020d4cf>] ? child_rip+0x0/0x11
Thanks for feedback.
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.