[Scst-devel] [ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs
Chris Worley
worleys at gmail.com
Mon Sep 21 08:56:36 PDT 2009
On Mon, Sep 21, 2009 at 9:22 AM, Bart Van Assche
<bart.vanassche at gmail.com> wrote:
> On Wed, Sep 9, 2009 at 12:29 AM, Chris Worley <worleys at gmail.com> wrote:
>> [ ... ]
>> But, the same issue occurs... the apps on the initiator hang, and the
>> target thinks all is well. An app will hang in one of the file
>> systems... the others seem to be working well (even though they are
>> comprised of the same drives as the hung fs/app), for example: you can
>> do a "find ." from their root w/o hanging "find", but if you try that
>> in the fs where the app is hung, "find" will hang. Lvscan/pvscan will
>> hang too.
>>
>> Strangely, restarting the target (removing ib_srpt and scst_vdisk
>> modules, then re-registering the disks with scst_vdisk and
>> re-modprobing ib_srpt from scratch) causes the apps on the initiator
>> to un-hang and make progress again... (but eventually hang again...
>> seemingly more readily than before).
>>
>> While nothing other than the messages you'd expect (from
>> re-registering the drives to the initiator logging in) occurs on the
>> target, the initiator has much to say during this re-registration
>> period, starting w/ the time-out (that has been shown previously):
>>
>> Sep 8 22:04:07 nameme kernel: sd 30:0:0:3: timing out command, waited 360s
>> Sep 8 22:04:07 nameme kernel: sd 30:0:0:3: SCSI error: return code = 0x06000000
>> Sep 8 22:04:07 nameme kernel: end_request: I/O error, dev sdo, sector 45304704
>> [ ... ]
>
> Hello Chris,
>
> Unless you report otherwise, I assume that the above issue (SRP
> timeouts) has been solved by the solution I sent you via private
> e-mail, namely to load the SRPT kernel module with the parameter
> 'thread' set to one (modprobe ib_srpt thread=1).
I do view that as a work-around, as it implies there is an issue in
the threads... and multiple threads do provide more performance (which
is what IB is all about).
I very much appreciate the work-around, though... this has been such a
show-stopper for me.
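For anyone who wants the work-around to persist across module reloads,
a minimal sketch follows. It assumes a distribution whose modprobe reads
option files from /etc/modprobe.d/ (older setups may use
/etc/modprobe.conf instead); the file name is only an illustration, and
the parameter is the one Bart gave above:

```
# /etc/modprobe.d/ib_srpt.conf  (hypothetical file name)
# Same effect as "modprobe ib_srpt thread=1" on every load of the module.
options ib_srpt thread=1
```

The option only takes effect the next time ib_srpt is loaded, so the
module would still need to be removed and re-probed once.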
Thanks,
Chris
>
> Bart.
>