[ofw] SRP bug?

Randy Kreiser rkreiser at datadirectnet.com
Mon Dec 24 10:39:09 PST 2007


Hi Leonid, answers are below!

Randy

-----Original Message-----
From: Leonid Keller [mailto:leonid at mellanox.co.il] 
Sent: Sunday, December 23, 2007 3:53 AM
To: Randy Kreiser; rob at systemfabricworks.com; ofw at lists.openfabrics.org
Subject: RE: [ofw] SRP bug?

Hi Randy,

Thank you for the reply. 
A few more questions:
Does it fail with ModeFlags=0 (the default value)?

	A) Yes, it fails with settings of 0, 1 and 3.

What software runs on the target side?

	A) We are running a target driver DDN version 3.08

What kind of device is that appliance (from the SRP point of view)?

	A) Dumb block device (RAID controller with 8 luns).

Does data transfer work in raw mode (without formatting)?

	A) Yes, we set up a CXFS client running Windows and it reads and
writes to your heart's content!

(you can check that with Iometer)
TIA

Leonid 

> -----Original Message-----
> From: Randy Kreiser [mailto:rkreiser at datadirectnet.com]
> Sent: Friday, December 21, 2007 4:53 PM
> To: Leonid Keller; rob at systemfabricworks.com; 
> ofw at lists.openfabrics.org
> Subject: RE: [ofw] SRP bug?
> 
> Leonid, I set the registry value you wanted to "1" and it fails much
> more quickly, but that was the only change I saw; it still fails the
> format.
> 
> Randy
> 
> -----Original Message-----
> From: Leonid Keller [mailto:leonid at mellanox.co.il]
> Sent: Thursday, December 20, 2007 4:50 AM
> To: rob at systemfabricworks.com; ofw at lists.openfabrics.org
> Cc: Randy Kreiser
> Subject: RE: [ofw] SRP bug?
> 
> Hi Rob,
> 
> Thank you for the elaborate analysis. It seems right.
> I'd like to get some more information; maybe you or someone else can
> help.
> 
> Did this trace come from an IB sniffer?
> (Otherwise we can't be sure that the corruption happens on the
> initiator's side.)
> 
> How often does it happen?
> 
> How can one reproduce it ?
> 
> What SRP target is being used ?
> 
> Could we ask someone (and whom) to perform experiments?
> For example, I'd suggest setting ModeFlags to 1 under
> HKLM\SYSTEM\CurrentControlSet\Services\ibsrp\parameters,
> restarting the SRP driver and rerunning the test.
> 
> Leonid
> 
> 
> 
> > -----Original Message-----
> > From: ofw-bounces at lists.openfabrics.org 
> > [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Robert H.B.
> > Netzer
> > Sent: Tuesday, December 18, 2007 8:38 PM
> > To: ofw at lists.openfabrics.org
> > Cc: 'Randy Kreiser'
> > Subject: [ofw] SRP bug?
> > 
> > I have recently been shown a trace of an SRP session between the
> > WinOF 1.0.1 SRP initiator and a DDN S2A9550 storage appliance that
> > has the following suspicious SRP_CMD.  It seems to contain a bad
> > virtual address.  Here is the payload of the send from the initiator
> > to the appliance (this is a few hundred cmds into the stream):
> > 
> > 02000000 00200100 EF010000 00000000
> > 00000000 00000000 00000000 00000000
> > 2A000064 00220000 20000000 00000000
> > 00000000 05A83364 A8002201 00000010
> > 00004000 03006209 04006309 AA002301
> > 00004000
> > 
> > Consulting the SRP and SCSI specs and decoding this:
> > 
> > The first row indicates that it's an SRP_CMD, that there is one 
> > data-out buffer descriptor, and that it's an "indirect data buffer 
> > descriptor" (type 2h, encoded in the high nibble of the sixth byte 
> > above).
> > 
> > The SCSI CDB starts in the third row and is a write (10-byte CDB).
> > The length is 20h blocks (16k bytes).
> > 
> > The data-out buffer descriptor starts at byte 48 (fourth row) and
> > consists of a 16-byte "indirect table memory descriptor", a
> > four-byte total length (00004000), and one 16-byte "partial memory
> > descriptor" (there is one of these because the data-out buffer
> > descriptor count, the 7th byte in the SRP_CMD, is 1).
> > 
> > The suspicious part is the partial memory descriptor, which is this 
> > (copying the last four words from above): 03006209
> > 04006309 AA002301 00004000.  This is a virtual address of
> > 03006209 04006309, a memory handle (AA002301) that looks like the 
> > other ones earlier in the trace, and a data length of 16k.
> > 
> > The SRP stream gets into trouble when the target does an RDMA Read 
> > Request using this virtual address -- it looks bogus.
> > 
> > I'm hoping that someone can double-check my decoding of this
> > packet, and perhaps Tzachi could take a look.
> > 
> > Rob Netzer
> > System Fabric Works, Inc.
> > 
> > 
> > _______________________________________________
> > ofw mailing list
> > ofw at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
> > 
> 
> 
> 
> 
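For anyone who wants to double-check the decoding in Rob's original
message, here is a minimal Python sketch that unpacks the quoted payload
at the offsets he describes (CDB at byte 32, data-out buffer descriptor
at byte 48). It is not taken from the WinOF sources. It assumes the dump
lists bytes in wire order, that multi-byte fields are big-endian, and
that the LUNs use 512-byte blocks (which is what "20h blocks (16k
bytes)" implies). The function and key names are made up for
illustration.

```python
import struct

# Payload words copied verbatim from the trace quoted in this thread.
PAYLOAD_HEX = """
02000000 00200100 EF010000 00000000
00000000 00000000 00000000 00000000
2A000064 00220000 20000000 00000000
00000000 05A83364 A8002201 00000010
00004000 03006209 04006309 AA002301
00004000
"""

BLOCK_SIZE = 512  # assumption: 512-byte logical blocks


def parse_memory_descriptor(buf, offset):
    """Unpack one 16-byte memory descriptor:
    8-byte virtual address, 4-byte memory handle, 4-byte length."""
    va, handle, length = struct.unpack_from(">QII", buf, offset)
    return {"va": va, "handle": handle, "length": length}


def decode_srp_cmd(payload_hex):
    """Decode the fields discussed in the thread (hypothetical helper)."""
    buf = bytes.fromhex("".join(payload_hex.split()))
    cdb = buf[32:48]                      # CDB starts in the third row
    result = {
        "opcode": buf[0],                 # 02h = SRP_CMD
        "data_out_format": buf[5] >> 4,   # high nibble of the sixth byte
        "data_in_format": buf[5] & 0x0F,
        "data_out_count": buf[6],         # the 7th byte
        "scsi_opcode": cdb[0],            # 2Ah = WRITE(10)
        "lba": struct.unpack_from(">I", cdb, 2)[0],
        "blocks": struct.unpack_from(">H", cdb, 7)[0],
    }
    # Indirect data buffer descriptor, starting at byte 48 (fourth row).
    result["table"] = parse_memory_descriptor(buf, 48)
    result["total_length"] = struct.unpack_from(">I", buf, 64)[0]
    result["partials"] = [
        parse_memory_descriptor(buf, 68 + 16 * i)
        for i in range(result["data_out_count"])
    ]
    return result


if __name__ == "__main__":
    cmd = decode_srp_cmd(PAYLOAD_HEX)
    print("SRP opcode      : %02Xh" % cmd["opcode"])
    print("data-out format : %Xh" % cmd["data_out_format"])
    print("SCSI opcode     : %02Xh" % cmd["scsi_opcode"])
    print("transfer length : %d blocks (%d bytes)"
          % (cmd["blocks"], cmd["blocks"] * BLOCK_SIZE))
    print("total length    : %d bytes" % cmd["total_length"])
    for d in cmd["partials"]:
        print("partial desc    : va=%016X handle=%08X len=%d"
              % (d["va"], d["handle"], d["length"]))
```

Under those assumptions the CDB transfer length times the block size
matches the total length field (4000h), so the lengths are consistent
and only the virtual address in the partial memory descriptor stands
out.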





