[ofa-general] [Bug 14235] New: SRP initiator lockup
bugzilla-daemon at bugzilla.kernel.org
Sat Sep 26 07:54:37 PDT 2009
http://bugzilla.kernel.org/show_bug.cgi?id=14235
Summary: SRP initiator lockup
Product: Drivers
Version: 2.5
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Infiniband/RDMA
AssignedTo: drivers_infiniband-rdma at kernel-bugs.osdl.org
ReportedBy: bart.vanassche at gmail.com
Regression: No
If an SRP target processes SRP I/O slowly enough, the SRP initiator locks up.
This issue is 100% reproducible with the following setup:
Target:
* Kernel 2.6.30.4 with SCST patches applied and kernel debugging enabled.
* SCST r1153 with EXTRA_CFLAGS += -DCONFIG_SCST_TRACING -DCONFIG_SCST_DEBUG -g
added in srpt/src/Makefile and with EXTRA_CFLAGS += -DCONFIG_SCST_TRACING added
in scst/src/Makefile.
* ib_srpt loaded with kernel module parameters thread=0 and
processing_delay_in_us=500.
Initiator:
* Kernel 2.6.31.1 with kernel debugging enabled.
* SRP login has been performed as follows:
rmmod ib_srp; modprobe ib_srp;
ibsrpdm -c | while read target_info; do echo "${target_info}";
echo "${target_info}" > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target; done
* After SRP login succeeded the following fio command was started:
fio --rw=rw --bs=64M --rwmixread=100 --numjobs=1 --iodepth=1 --sync=0
--direct=1 --ioengine=sync --filename=/dev/${srp_initiator_device} --name=test
--loops=1000 --runtime=600 --size=2G
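The initiator steps above can be collected into one script. What follows is only a minimal sketch, assuming the port name srp-mlx4_0-1 from the report. The names DRY_RUN, SRP_PORT, SRP_DEV, run, srp_login and start_fio are hypothetical and introduced purely for illustration; they are not part of ib_srp, ibsrpdm or fio. DRY_RUN defaults to 1, so the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch of the initiator-side reproduction steps from the report.
# DRY_RUN, SRP_PORT, SRP_DEV, run(), srp_login() and start_fio() are
# illustrative names, not part of ib_srp, ibsrpdm or fio.

DRY_RUN=${DRY_RUN:-1}                 # 1 = print commands only (assumed default)
SRP_PORT=${SRP_PORT:-srp-mlx4_0-1}    # port directory name taken from the report
SRP_DEV=${SRP_DEV:-sdX}               # placeholder for the SRP block device

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reload ib_srp and log in to every target that ibsrpdm reports.
srp_login() {
    run rmmod ib_srp
    run modprobe ib_srp
    if [ "$DRY_RUN" = 1 ]; then
        echo "ibsrpdm -c | (write each line to /sys/class/infiniband_srp/$SRP_PORT/add_target)"
    else
        ibsrpdm -c | while read -r target_info; do
            echo "$target_info"
            echo "$target_info" > "/sys/class/infiniband_srp/$SRP_PORT/add_target"
        done
    fi
}

# Start the fio job with the options quoted in the report.
start_fio() {
    run fio --rw=rw --bs=64M --rwmixread=100 --numjobs=1 --iodepth=1 \
        --sync=0 --direct=1 --ioengine=sync --filename="/dev/$SRP_DEV" \
        --name=test --loops=1000 --runtime=600 --size=2G
}
```

After sourcing the file, calling srp_login followed by start_fio with DRY_RUN=1 prints the command sequence; setting DRY_RUN=0 and SRP_DEV to the actual device would run it.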
After a few minutes fio locked up (I/O rate dropped from 1500 MB/s to 0 MB/s)
and the following kernel message started appearing periodically:
INFO: task fio:6389 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fio D 0000000000000000 0 6389 6388 0x00000000
ffff880071dc5bd8 0000000000000046 ffff880071dc5b08 000000018107764d
0000000000012cc0 000000000000de20 0000000000000001 ffff880070cd8000
ffff880070cd83b0 0000000100000000 000000010001193e ffff88007fb99050
Call Trace:
[<ffffffff812ec5e5>] ? _spin_unlock_irqrestore+0x65/0x80
[<ffffffff812e9b37>] io_schedule+0x37/0x50
[<ffffffff8110cff2>] __blockdev_direct_IO+0x692/0xd80
[<ffffffff810e0357>] ? get_super+0x27/0xc0
[<ffffffff8110b169>] blkdev_direct_IO+0x49/0x50
[<ffffffff8110a1f0>] ? blkdev_get_blocks+0x0/0xc0
[<ffffffff810a1799>] generic_file_aio_read+0x679/0x690
[<ffffffff810dc35a>] ? __dentry_open+0x13a/0x340
[<ffffffff810de091>] do_sync_read+0xf1/0x140
[<ffffffff810775ed>] ? trace_hardirqs_on_caller+0x14d/0x1a0
[<ffffffff810662f0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff810775ed>] ? trace_hardirqs_on_caller+0x14d/0x1a0
[<ffffffff8107764d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff810ded28>] vfs_read+0xc8/0x180
[<ffffffff810deed0>] sys_read+0x50/0x90
[<ffffffff8100be6b>] system_call_fastpath+0x16/0x1b
no locks held by fio/6389.
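To check whether a lockup like the one above has occurred, the hung-task reports can be filtered out of the kernel log. This is a small sketch under the assumption that the message has the exact form shown above; hung_tasks is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Sketch: list tasks reported by the hung-task detector in a kernel log
# read from standard input. hung_tasks() is an illustrative name.

hung_tasks() {
    # Matches lines such as:
    #   INFO: task fio:6389 blocked for more than 120 seconds.
    # and prints "<name> <pid> <seconds>" for each match.
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds\..*/\1 \2 \3/p'
}
```

A possible use is `dmesg | hung_tasks`, which prints one line per report and nothing if no task has been flagged.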