[openib-general] recursion depth exceeded in ipoib_workqueue
Jack Morgenstein
jackm at mellanox.co.il
Mon Sep 19 08:21:56 PDT 2005
environment:
HCA Port 1 of Host 1 is connected back-to-back to HCA port 1 of Host 2.
A shell script running on Host 1 loads and unloads the openib driver. On
Host 2, the openib driver is up and opensm is running.
Host 1:
    while date ; do
        /etc/init.d/openibd start
        sleep 3
        /etc/init.d/openibd stop
        sleep 1
    done
NOTES:
a. sleeps were inserted to give opensm on Host 2 time to
respond to changes
b. openibd script attached
Problem -- recursion depth exceeded in ipoib_workqueue:
/var/log/messages from Host 1
------------------------------
ib_mthca: Initializing (0000:04:00.0)
ACPI: PCI Interrupt 0000:04:00.0[A] -> GSI 29 (level, low) -> IRQ 185
run_workqueue: recursion depth exceeded: 4
Call Trace:<ffffffff80147a47>{flush_cpu_workqueue+87}
<ffffffff803f76d6>{wait_for_completion+230}
<ffffffff80131d50>{default_wake_function+0}
<ffffffff8013fc39>{lock_timer_base+41}
<ffffffff88078ba3>{:ib_ipoib:ipoib_mcast_stop_thread+99}
<ffffffff88078cdc>{:ib_ipoib:ipoib_mcast_restart_task+44}
<ffffffff80147abd>{flush_cpu_workqueue+205}
<ffffffff88078cb0>{:ib_ipoib:ipoib_mcast_restart_task+0}
<ffffffff8013fc39>{lock_timer_base+41}
<ffffffff88078ba3>{:ib_ipoib:ipoib_mcast_stop_thread+99}
<ffffffff88078cdc>{:ib_ipoib:ipoib_mcast_restart_task+44}
<ffffffff80147abd>{flush_cpu_workqueue+205}
<ffffffff88078cb0>{:ib_ipoib:ipoib_mcast_restart_task+0}
<ffffffff8013fc39>{lock_timer_base+41}
<ffffffff88078ba3>{:ib_ipoib:ipoib_mcast_stop_thread+99}
<ffffffff88078cdc>{:ib_ipoib:ipoib_mcast_restart_task+44}
<ffffffff88078cb0>{:ib_ipoib:ipoib_mcast_restart_task+0}
<ffffffff8014795e>{worker_thread+478}
<ffffffff80131d50>{default_wake_function+0}
<ffffffff8012f2d3>{__wake_up_common+67}
<ffffffff80131d50>{default_wake_function+0}
<ffffffff8014beb0>{keventd_create_kthread+0}
<ffffffff80147780>{worker_thread+0}
<ffffffff8014beb0>{keventd_create_kthread+0}
<ffffffff8014c009>{kthread+217}
<ffffffff8010e50e>{child_rip+8}
<ffffffff8014beb0>{keventd_create_kthread+0}
<ffffffff8014bf30>{kthread+0} <ffffffff8010e506>{child_rip+0}
Please Note:
-- Set Multicast List posts the restart task to the ipoib_workqueue
(ipoib_main.c:675)
-- ipoib_mcast_restart_task (ipoib_multicast.c) calls
ipoib_mcast_stop_thread(), which calls flush_workqueue(ipoib_workqueue)
-- so the restart task flushes the work queue it is running from.
-- Linux prevents the resulting deadlock by testing whether the flush is called
from the work queue's own worker thread (see linux/workqueue.c:223). If it is,
Linux runs the remaining tasks in the work queue directly from the flush call,
without waiting. This both breaks serialization of tasks in the work queue,
and can cause the recursion overflow seen above.
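
To make the sequence above concrete, here is a small userspace model of it. It is only a sketch, not the kernel code: the class and function names (ModelWorkqueue, flush_from_worker, restart_task) are made up for illustration, and the warning threshold of 3 is an assumption inferred from the "recursion depth exceeded: 4" line in the log.

```python
from collections import deque

# Assumed threshold: the log above reports the warning at depth 4.
MAX_DEPTH = 3


class ModelWorkqueue:
    """Toy single-threaded stand-in for one work queue (hypothetical names)."""

    def __init__(self):
        self.pending = deque()   # queued (name, func) work items
        self.run_depth = 0       # how many nested run() calls are active
        self.warnings = []       # depths at which the limit was exceeded
        self.log = []            # ("start" | "end", name) in execution order

    def queue_work(self, name, func):
        self.pending.append((name, func))

    def run(self):
        """Drain the queue, as the worker thread would."""
        self.run_depth += 1
        if self.run_depth > MAX_DEPTH:
            self.warnings.append(self.run_depth)
        while self.pending:
            name, func = self.pending.popleft()
            self.log.append(("start", name))
            func(self)
            self.log.append(("end", name))
        self.run_depth -= 1

    def flush_from_worker(self):
        """Flush called from the worker itself: waiting would deadlock,
        so the remaining items are run in place (nested)."""
        self.run()


def restart_task(wq):
    # Models a restart task that flushes the queue it is running from.
    wq.flush_from_worker()


wq = ModelWorkqueue()
for i in range(3):
    wq.queue_work(f"restart{i}", restart_task)
wq.run()

print(wq.warnings)
print(wq.log)
```

With three restart tasks queued, each one starts inside the flush of the previous one, so the second task begins before the first has finished (serialization is lost), and the fourth nested run() crosses the assumed limit.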
Jack
<<openibd>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openibd
Type: application/octet-stream
Size: 24304 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20050919/feab5d9d/attachment.obj>