[ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs

Chris Worley worleys at gmail.com
Sun Aug 9 11:19:23 PDT 2009


On Sun, Aug 9, 2009 at 11:26 AM, Joe Landman
<landman at scalableinformatics.com> wrote:
> Chris Worley wrote:
>>
>> I'm running a target composed of: RHEL5.2/2.6.18-92.el5 (fresh off
>> the CD.. never updated) and its embedded IB stack (not the latest
>> OFED) w/ SCST rev 1029 8-Aug-2009 ("svn info").
>>
>> I'm running a W2008S (fully patched) initiator w/
>> MLNX_WinOF_2_0_5_wlh_x64_fre_2_0_5_4453.
>>
>> Using Mellanox QDR cards/switch.
>>
>> Writes over SRP, as measured from the initiator using IOMeter, get
>> proper performance (i.e. 1.2GB/s).
>>
>> Reads get about 30% performance (i.e. 500MB/s instead of 1.6GB/s).
>
> Chris:
>
>  What is the backing store capable of?  That is, if you are doing, say dd's
> streaming from disk, what rate do you see?  Or are you doing this with a
> RAMDISK to check protocol performance?

I tested my local performance before testing SRP.

These are ioDrives.  I'm running two, so the local performance is
1.6GB/s for reads.  I've run up to four ioDrives through one QDR IB
link w/ Linux host and initiator, and got 2.7GB/s to the initiator.
That was with an upgraded distro on the target; here I'm testing on
someone else's machine and don't have permission to upgrade it yet.
The rev of WinOF could also be a factor.

>
>  The dd's should look something like this on the RHEL machine
>
>        write:
>
>                dd if=/dev/zero of=/path/to/target bs=1M count=32k
>
> (make sure the product of count * bs is greater than 2x system ram)
>
>        read:
>
>                dd if=/path/to/target of=/dev/null bs=1M count=32k
>
> If you are not getting 1.6 GB/s out of the file system locally, you won't
> get it out of the target over the network.

1.6GB/s out of two ioDrives is no problem locally.
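
As a rough sketch of the sizing rule quoted above (count * bs greater
than 2x system RAM), the Python below computes the smallest dd count
that satisfies it.  The function names are hypothetical, and reading
RAM through os.sysconf is an assumption that holds on Linux but not
necessarily elsewhere.

```python
import os


def system_ram_bytes() -> int:
    """Physical RAM in bytes (assumes a Linux-style os.sysconf)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def dd_count(ram_bytes: int, bs_bytes: int = 1 << 20, factor: int = 2) -> int:
    """Smallest count such that count * bs_bytes > factor * ram_bytes."""
    target = factor * ram_bytes
    # Integer division plus one guarantees a strictly greater product.
    return target // bs_bytes + 1


def dd_read_command(path: str, ram_bytes: int, bs_bytes: int = 1 << 20) -> str:
    """Build a streaming-read dd command line sized to defeat the page cache."""
    count = dd_count(ram_bytes, bs_bytes)
    return f"dd if={path} of=/dev/null bs={bs_bytes} count={count}"
```

For example, dd_count(16 * 2**30) gives 32769 blocks of 1 MiB for a
16 GiB machine, so the count=32k in the quoted commands is only large
enough when the box has less than 16 GiB of RAM.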

>  The backing store is usually one
> of the slower aspects.
>
> For our units, this is what we are seeing:
>
> dd if=/dev/zero of=/data/big.file ...
> 10240+0 records in
> 10240+0 records out
> 171798691840 bytes (172 GB) copied, 94.8258 seconds, 1.8 GB/s
>
> [root@jr5 ~]# dd if=/data/big.file of=/dev/null bs=16M
> 10240+0 records in
> 10240+0 records out
> 171798691840 bytes (172 GB) copied, 76.6224 seconds,  2.2 GB/s
>
> So our writes and reads through SCST should be less than 1.8 and 2.2 GB/s
> respectively.
>
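
The rates dd reports above can be cross-checked from the byte and
second counts.  A minimal sketch in Python, assuming dd's decimal
units (1 GB = 10^9 bytes); the helper name is hypothetical:

```python
def gb_per_s(total_bytes: int, seconds: float) -> float:
    """Throughput in decimal GB/s, the unit dd prints."""
    return total_bytes / seconds / 1e9


# Figures copied from the quoted dd output.
TOTAL_BYTES = 171798691840

# 10240 records at bs=16M (the read command) account for every byte.
assert 10240 * 16 * 2**20 == TOTAL_BYTES

write_rate = gb_per_s(TOTAL_BYTES, 94.8258)
read_rate = gb_per_s(TOTAL_BYTES, 76.6224)
print(f"write {write_rate:.1f} GB/s, read {read_rate:.1f} GB/s")

# The SRP read shortfall from the original report: 500MB/s of 1.6GB/s.
print(f"SRP read fraction: {500 / 1600:.0%}")
```

The same arithmetic puts the SRP read rate at about 31% of the local
rate, consistent with the "about 30%" in the original report.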
>> And while reading, IOMeter eventually hangs the system (Windows
>> becomes unresponsive to GUI interaction).  In this state, I see iostat
>
> Hmmm....  We had IOMeter running continuously over a 10GbE link to an
> SCST-based target at SC09.  The backing store could provide ~700 MB/s, and
> we saw 500 MB/s over ~4 days, with benchmarks running continuously
> throughout each day.
>
>> reporting transfers at the same low read rate from the target... so
>> there's IB traffic, but, given IOMeter's tasks are 10 minutes each, it
>> acts like a "skipping record" (sorry if you young folks don't
>> know what that is... but I can't think of another way to describe it),
>> never moving on to the next benchmark, just endlessly repeating
>> the same I/O over and over again.  If I unload then reload the mlx4_ib
>> driver on the target, then the Windows system quickly recovers, but
>> IOMeter remains hung and needs to be killed.
>>
>> So, I have a lot of experimentation to do on the target in 1)
>> upgrading the target or changing the distro altogether and 2) using
>> OFED instead of built-in IB stack on the target to try to see if I can
>> budge this issue.
>>
>> But, I was wondering if somebody might have a hint on this _or_ have a
>> known target distro/kernel setup that works reliably w/ Windows-based
>> SRP initiators.
>
> SCST works (the versions we have used, 1.0.0, 1.0.1, ...) reliably with
> Windows initiators for XP, XP64, 2003, and 2008.  Look in the Windows error
> log, and see if you are getting driver timeouts.  See if you have an updated
> driver.

I'm worried more about the underlying IB stack and kernel on the
target side.  It would be best to know exactly which distro, kernel,
and OFED revisions (unless you're using the distro's built-in IB
stack) you're using on the target.  The WinOF version you're using on
the Windows side would be helpful info too. Can you relay these?

Thanks,

Chris
>
> Regards,
>
> Joe
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics, Inc.
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
>       http://scalableinformatics.com/jackrabbit
> phone: +1 734 786 8423 x121
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
>


