[ofa-general] WinOF_2_0_5/SRP initiator: slow reads and eventually hangs

Joe Landman landman at scalableinformatics.com
Sun Aug 9 10:26:29 PDT 2009


Chris Worley wrote:
> I'm running a target comprised of: RHEL5.2/2.6.18-92.el5 (fresh off
> the CD.. never updated) and its embedded IB stack (not the latest
> OFED) w/ SCST rev 1029 8-Aug-2009 ("svn info").
> 
> I'm running a W2008S (fully patched) initiator w/
> MLNX_WinOF_2_0_5_wlh_x64_fre_2_0_5_4453.
> 
> Using Mellanox QDR cards/switch.
> 
> Writes over SRP, as measured from the initiator using IOMeter, get
> proper performance (i.e. 1.2GB/s).
> 
> Reads get about 30% performance (i.e. 500MB/s instead of 1.6GB/s).

Chris:

   What is the backing store capable of?  That is, if you are doing, say 
dd's streaming from disk, what rate do you see?  Or are you doing this 
with a RAMDISK to check protocol performance?

   The dd's should look something like this on the RHEL machine:

	write:
	
		dd if=/dev/zero of=/path/to/target bs=1M count=32k

(make sure the product of count * bs is greater than 2x system RAM, so
the page cache cannot absorb the test and inflate the result)

	read:

		dd if=/path/to/target of=/dev/null bs=1M count=32k

If you are not getting 1.6 GB/s out of the file system locally, you 
won't get it out of the target over the network.  The backing store is 
usually one of the slower aspects.
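
A rough sketch of sizing the run so that count * bs clears 2x RAM, assuming a Linux box with /proc/meminfo and bs fixed at 1M as in the commands above. The helper name dd_count_for_ram is made up for illustration; it only prints the dd command rather than running it.

```shell
#!/bin/sh
# Sketch: pick a dd count so that count * 1 MiB is strictly greater than
# 2x system RAM. dd_count_for_ram is a hypothetical helper, not a standard tool.

# Usage: dd_count_for_ram <MemTotal in kB>
dd_count_for_ram() {
    kb=$1
    # 2x RAM converted from kB to MiB, plus one block to stay strictly above it
    echo $(( kb * 2 / 1024 + 1 ))
}

if [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    count=$(dd_count_for_ram "$mem_kb")
    # Print the command instead of running it; /path/to/target is a placeholder
    echo "dd if=/dev/zero of=/path/to/target bs=1M count=$count"
fi
```

On a machine with 16 GB of RAM this gives a count a little over 32k, which is where the count=32k figure above would just barely be too small.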

For our units, this is what we are seeing:

dd if=/dev/zero of=/data/big.file ...
10240+0 records in
10240+0 records out
171798691840 bytes (172 GB) copied, 94.8258 seconds, 1.8 GB/s

[root@jr5 ~]# dd if=/data/big.file of=/dev/null bs=16M
10240+0 records in
10240+0 records out
171798691840 bytes (172 GB) copied, 76.6224 seconds,  2.2 GB/s

So our writes and reads through SCST cannot exceed 1.8 and 2.2 GB/s
respectively; the local rates are the upper bound.
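
For reference, the rates dd prints are just bytes divided by seconds, in decimal GB (1e9 bytes). A small sketch that redoes the arithmetic on the two runs above; rate_gbs is a made-up helper name, and it assumes awk is available.

```shell
#!/bin/sh
# Sketch: recompute dd's reported throughput from its byte count and
# elapsed time. rate_gbs is a hypothetical helper, not part of dd.

# Usage: rate_gbs <bytes> <seconds>
rate_gbs() {
    # dd reports decimal units, so divide by 1e9 rather than 2^30
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1e9 }'
}

rate_gbs 171798691840 94.8258   # write run, prints 1.8
rate_gbs 171798691840 76.6224   # read run, prints 2.2
```

The same helper can be pointed at numbers from a dd run on your backing store to compare against what IOMeter reports over SRP.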

> And while reading, IOMeter eventually hangs the system (Windows
> becomes unresponsive to GUI interaction).  In this state, I see iostat

Hmmm....  We had IOMeter running continuously over a 10GbE link to an
SCST-based target at SC09.  The backing store could provide ~700 MB/s,
and we saw 500 MB/s sustained for ~4 days, with benchmarks running
continuously all day long.

> reporting transfers at the same low read rate from the target... so
> there's IB traffic, but, given IOMeter's tasks are 10 minutes each, it
> acts like it's a "skipping record" (sorry if you young folks don't
> know what that is... but I can't think of another way to describe it)
> and never moving on to the next benchmark, just endlessly repeating
> the same I/O over and over again.  If I unload then reload the mlx4_ib
> driver on the target, then the Windows system quickly returns, but
> IOMeter remains hung and needs killed.
> 
> So, I have a lot of experimentation to do on the target in 1)
> upgrading the target or changing the distro altogether and 2) using
> OFED instead of built-in IB stack on the target to try to see if I can
> budge this issue.
> 
> But, I was wondering if somebody might have a hint on this _or_ have a
> known target distro/kernel setup that works reliably w/ Windows-based
> SRP initiators.

SCST works reliably (the versions we have used: 1.0.0, 1.0.1, ...) with
Windows initiators on XP, XP64, 2003, and 2008.  Look in the Windows
error log and see if you are getting driver timeouts.  Also check
whether an updated driver is available.

Regards,

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615


