[ofa-general] lockdep is unhappy with SDP

Lars Ellenberg lars.ellenberg at linbit.com
Tue Jul 7 09:14:15 PDT 2009


[ 1947.003662] =============================================
[ 1947.006730] [ INFO: possible recursive locking detected ]
[ 1947.020862] 2.6.27.26-d02 #2

that version string stands for:
git://git.openfabrics.org/ofed_1_4/linux-2.6.git
merged with upstream stable
git://git4.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.27.y.git

both trees as of today:
  ofed_kernel            08acda8 sdp: Fix memory leak in bzcopy
  linux-v2.6.27.y/master 49cbf40 Linux 2.6.27.26

the relevant code section looks identical in OFED 1.5, though.


[ 1947.020862] ---------------------------------------------
[ 1947.020862] ib_cm/6/31070 is trying to acquire lock:
[ 1947.054199]  (sk_lock-27){--..}, at: [<ffffffffa02a3100>] sdp_connected_handler+0x1c0/0x2f0 [ib_sdp]
[ 1947.054199]
[ 1947.054199] but task is already holding lock:
[ 1947.054199]  (sk_lock-27){--..}, at: [<ffffffffa02a330c>] sdp_cma_handler+0x3c/0x15f0 [ib_sdp]
[ 1947.054199]
[ 1947.054199] other info that might help us debug this:
[ 1947.054199] 4 locks held by ib_cm/6/31070:
[ 1947.120855]  #0:  (ib_cm){--..}, at: [<ffffffff80253bed>] run_workqueue+0x14d/0x250
[ 1947.120855]  #1:  (&(&work->work)->work){--..}, at: [<ffffffff80253bed>] run_workqueue+0x14d/0x250
[ 1947.120855]  #2:  (&id_priv->handler_mutex){--..}, at: [<ffffffffa02922bb>] cma_disable_callback+0x2b/0x60 [rdma_cm]
[ 1947.120855]  #3:  (sk_lock-27){--..}, at: [<ffffffffa02a330c>] sdp_cma_handler+0x3c/0x15f0 [ib_sdp]
[ 1947.120855]
[ 1947.120855] stack backtrace:
[ 1947.120855] Pid: 31070, comm: ib_cm/6 Not tainted 2.6.27.26-d02 #2
[ 1947.120855]
[ 1947.120855] Call Trace:
[ 1947.120855]  [<ffffffff80269b0f>] validate_chain+0xaaf/0x1040
[ 1947.120855]  [<ffffffff8026a2ea>] __lock_acquire+0x24a/0xa00
[ 1947.120855]  [<ffffffff8026ab31>] lock_acquire+0x91/0xc0
[ 1947.250858]  [<ffffffffa02a3100>] ? sdp_connected_handler+0x1c0/0x2f0 [ib_sdp]
[ 1947.250858]  [<ffffffff804415e8>] lock_sock_nested+0x108/0x120

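actually, sdp_connected_handler uses lock_sock(parent),
which is the same as lock_sock_nested(parent, 0), i.e. lockdep subclass 0,
the same sk_lock-27 class and subclass already taken in sdp_cma_handler.
maybe the code is simply wrong?

or maybe this is in fact ok, and you only need to tell lockdep about it, like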
-	lock_sock(parent);
+	lock_sock_nested(parent, SINGLE_DEPTH_NESTING);

[ 1947.250858]  [<ffffffffa02a3100>] ? sdp_connected_handler+0x1c0/0x2f0 [ib_sdp]
[ 1947.250858]  [<ffffffff8026891d>] ? trace_hardirqs_on+0xd/0x10
[ 1947.250858]  [<ffffffff8026886a>] ? trace_hardirqs_on_caller+0xda/0x180
[ 1947.250858]  [<ffffffffa02a3100>] sdp_connected_handler+0x1c0/0x2f0 [ib_sdp]
[ 1947.250858]  [<ffffffffa02a39c1>] sdp_cma_handler+0x6f1/0x15f0 [ib_sdp]
[ 1947.250858]  [<ffffffff8026886a>] ? trace_hardirqs_on_caller+0xda/0x180
[ 1947.250858]  [<ffffffff804d3193>] ? mutex_lock_nested+0x1f3/0x300
[ 1947.250858]  [<ffffffffa02922bb>] ? cma_disable_callback+0x2b/0x60 [rdma_cm]
[ 1947.250858]  [<ffffffff8026513d>] ? trace_hardirqs_off+0xd/0x10
[ 1947.250858]  [<ffffffffa02922bb>] ? cma_disable_callback+0x2b/0x60 [rdma_cm]
[ 1947.250858]  [<ffffffffa0289b00>] ? cm_work_handler+0x0/0xce0 [ib_cm]
[ 1947.250858]  [<ffffffffa0289b00>] ? cm_work_handler+0x0/0xce0 [ib_cm]
[ 1947.250858]  [<ffffffffa02942fd>] cma_ib_handler+0xcd/0x230 [rdma_cm]
[ 1947.250858]  [<ffffffff8026891d>] ? trace_hardirqs_on+0xd/0x10
[ 1947.250858]  [<ffffffffa0288632>] cm_process_work+0x22/0xe0 [ib_cm]
[ 1947.250858]  [<ffffffffa0289b00>] ? cm_work_handler+0x0/0xce0 [ib_cm]
[ 1947.250858]  [<ffffffffa028879b>] cm_rtu_handler+0xab/0x150 [ib_cm]
[ 1947.499836]  [<ffffffffa0289bfc>] cm_work_handler+0xfc/0xce0 [ib_cm]
[ 1947.507523]  [<ffffffff80253bed>] ? run_workqueue+0x14d/0x250
[ 1947.507523]  [<ffffffffa0289b00>] ? cm_work_handler+0x0/0xce0 [ib_cm]
[ 1947.507523]  [<ffffffff80253c3e>] run_workqueue+0x19e/0x250
[ 1947.507523]  [<ffffffff80253bed>] ? run_workqueue+0x14d/0x250
[ 1947.507523]  [<ffffffff80254a0f>] worker_thread+0xbf/0x120
[ 1947.507523]  [<ffffffff802580c0>] ? autoremove_wake_function+0x0/0x40
[ 1947.507523]  [<ffffffff80254950>] ? worker_thread+0x0/0x120
[ 1947.507523]  [<ffffffff80257c7d>] kthread+0x4d/0x80
[ 1947.507523]  [<ffffffff8020d4d9>] child_rip+0xa/0x11
[ 1947.507523]  [<ffffffff8020cae3>] ? restore_args+0x0/0x30
[ 1947.507523]  [<ffffffff80257c30>] ? kthread+0x0/0x80
[ 1947.507523]  [<ffffffff8020d4cf>] ? child_rip+0x0/0x11
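
To illustrate why the lock_sock_nested(parent, SINGLE_DEPTH_NESTING) variant suggested above would silence the report, here is a small userspace sketch. It is a toy model only, not kernel code and not how lockdep is actually implemented: it assumes that a "recursive locking" report fires when a task acquires a lock whose (class, subclass) pair matches one it already holds. The names task_model, held_lock, model_acquire and model_release are hypothetical, and the class id 27 is just borrowed from the "sk_lock-27" name in the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8
#define SINGLE_DEPTH_NESTING 1 /* same value the kernel's lockdep.h uses */

/* Toy model: one entry per lock held, keyed by class and subclass,
 * not by lock instance. All names here are hypothetical. */
struct held_lock {
    int class_id; /* e.g. one id shared by every sk_lock-27 socket */
    int subclass; /* 0 for lock_sock(), caller-chosen for lock_sock_nested() */
};

struct task_model {
    struct held_lock held[MAX_HELD];
    size_t depth;
};

/* Record an acquisition. Returns true if the model would report
 * "possible recursive locking", i.e. the same (class, subclass)
 * pair is already held by this task. */
static bool model_acquire(struct task_model *t, int class_id, int subclass)
{
    bool recursive = false;

    for (size_t i = 0; i < t->depth; i++) {
        if (t->held[i].class_id == class_id &&
            t->held[i].subclass == subclass)
            recursive = true;
    }

    assert(t->depth < MAX_HELD);
    t->held[t->depth++] = (struct held_lock){ class_id, subclass };
    return recursive;
}

/* Drop the most recently acquired lock. */
static void model_release(struct task_model *t)
{
    assert(t->depth > 0);
    t->depth--;
}
```

In this model, taking class 27 with subclass 0 twice in a row (as sdp_cma_handler followed by lock_sock(parent) does) is flagged, while taking the second lock with SINGLE_DEPTH_NESTING is not. Note that this only shows what the annotation tells lockdep; whether taking the parent's lock while holding the other socket's lock is really safe is exactly the open question.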

Thanks for any feedback.


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.


