[ofw] RE: ofw Digest, Vol 11, Issue 28

Ken Dieter (kedieter) kedieter at cisco.com
Wed Dec 26 14:42:41 PST 2007


Q

Ken Dieter
kedieter at cisco.com
630-701-5809
(sent from handheld PDA)

 -----Original Message-----
From: 	ofw-request at lists.openfabrics.org [mailto:ofw-request at lists.openfabrics.org]
Sent:	Tuesday, December 25, 2007 03:00 PM Eastern Standard Time
To:	ofw at lists.openfabrics.org
Subject:	ofw Digest, Vol 11, Issue 28

Send ofw mailing list submissions to
	ofw at lists.openfabrics.org

To subscribe or unsubscribe via the World Wide Web, visit
	http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
or, via email, send a message with subject or body 'help' to
	ofw-request at lists.openfabrics.org

You can reach the person managing the list at
	ofw-owner at lists.openfabrics.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of ofw digest..."


Today's Topics:

   1. RE:  SRP bug? (Bill Boas)


----------------------------------------------------------------------

Message: 1
Date: Mon, 24 Dec 2007 14:06:17 -0800
From: "Bill Boas" <bboas at systemfabricworks.com>
Subject: RE: [ofw] SRP bug?
To: "'Randy Kreiser'" <rkreiser at datadirectnet.com>,	"'Leonid Keller'"
	<leonid at mellanox.co.il>, <rob at systemfabricworks.com>,
	<ofw at lists.openfabrics.org>
Message-ID: <008501c84679$35d15ec0$6401a8c0 at YOURCB10AA3FFD>
Content-Type: text/plain;	charset="us-ascii"

Randy, thanks for responding to Leonid's questions.

Leonid,

The US Gov customers are anxious to learn if we are making progress
understanding the cause(s) of this bug.

Do you have access to suitable hardware and software at the Mellanox
facility where you are (Yokneam) to duplicate this bug and run further
tests to diagnose the root causes?

Bill.

Bill Boas
VP, Business  Development
System Fabric Works
510-375-8840
bboas at systemfabricworks.com
www.systemfabricworks.com


-----Original Message-----
From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Randy Kreiser
Sent: Monday, December 24, 2007 10:39 AM
To: 'Leonid Keller'; rob at systemfabricworks.com; ofw at lists.openfabrics.org
Subject: RE: [ofw] SRP bug?

Hi Leonid, answers are below!

Randy

-----Original Message-----
From: Leonid Keller [mailto:leonid at mellanox.co.il] 
Sent: Sunday, December 23, 2007 3:53 AM
To: Randy Kreiser; rob at systemfabricworks.com; ofw at lists.openfabrics.org
Subject: RE: [ofw] SRP bug?

Hi Randy,

Thank you for the reply. 
A few more questions:
Does it fail with ModeFlags=0 (the default value)?

	A) Yes, it fails with settings of 0, 1, and 3.

What software runs on the target side?

	A) We are running the DDN target driver, version 3.08.

What kind of device is that appliance (from the SRP point of view)?

	A) Dumb block device (RAID controller with 8 LUNs).

Does data transfer work in raw mode (without formatting)?

	A) Yes, we set up a CXFS client running Windows and it reads and
writes to your heart's content!

(you can check that with Iometer)
TIA

Leonid 

> -----Original Message-----
> From: Randy Kreiser [mailto:rkreiser at datadirectnet.com]
> Sent: Friday, December 21, 2007 4:53 PM
> To: Leonid Keller; rob at systemfabricworks.com; 
> ofw at lists.openfabrics.org
> Subject: RE: [ofw] SRP bug?
> 
> Leonid, I set the registry value you wanted to "1" and it fails much
> more quickly, but that was the only change I saw; it still fails the
> format.
> 
> Randy
> 
> -----Original Message-----
> From: Leonid Keller [mailto:leonid at mellanox.co.il]
> Sent: Thursday, December 20, 2007 4:50 AM
> To: rob at systemfabricworks.com; ofw at lists.openfabrics.org
> Cc: Randy Kreiser
> Subject: RE: [ofw] SRP bug?
> 
> Hi Rob,
> 
> Thank you for the elaborate analysis. It seems right.
> I'd like to get some more information, maybe you or someone else can 
> help.
> 
> Did this trace come from an IB sniffer?
> (Otherwise we can't be sure that the corruption happens on the
> initiator's side.)
> 
> How often does it happen?
> 
> How can one reproduce it?
> 
> Which SRP target is being used?
> 
> Could we ask someone (and whom) to perform experiments?
> For example, I'd suggest setting ModeFlags to 1 in
> HKLM\SYSTEM\CurrentControlSet\Services\ibsrp\parameters,
> restarting the SRP driver, and rerunning the test.
> 
> Leonid
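
The experiment suggested above comes down to a single registry write. Here is a minimal sketch that only builds the corresponding `reg add` command line. It assumes ModeFlags is stored as a REG_DWORD (the thread does not state the value type), and the helper name mode_flags_command is hypothetical. It does not restart the SRP driver, since the thread does not say how that is done.

```python
# Key path copied from the message above.
KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\ibsrp\parameters"


def mode_flags_command(value):
    """Build a `reg add` command line that sets ModeFlags to `value`.

    Assumption: ModeFlags is a REG_DWORD; the thread does not say so.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("ModeFlags must fit in a 32-bit DWORD")
    return ["reg", "add", KEY, "/v", "ModeFlags",
            "/t", "REG_DWORD", "/d", str(value), "/f"]


if __name__ == "__main__":
    # Print (do not run) the commands for the settings tried in this thread.
    for setting in (0, 1, 3):
        print(" ".join(mode_flags_command(setting)))
```

The printed commands would need to be run from an elevated prompt on the initiator host, followed by a driver restart, before rerunning the format test.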
> 
> 
> 
> > -----Original Message-----
> > From: ofw-bounces at lists.openfabrics.org 
> > [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Robert H.B.
> > Netzer
> > Sent: Tuesday, December 18, 2007 8:38 PM
> > To: ofw at lists.openfabrics.org
> > Cc: 'Randy Kreiser'
> > Subject: [ofw] SRP bug?
> > 
> > I have recently been shown a trace of an SRP session between the
> > WinOF 1.0.1 SRP initiator and a DDN S2A9550 storage appliance that
> > has the following suspicious SRP_CMD.  It seems to contain a bad
> > virtual address.  Here is the payload of the send from the initiator
> > to the appliance (this is a few hundred cmds into the stream):
> > 
> > 02000000 00200100 EF010000 00000000
> > 00000000 00000000 00000000 00000000
> > 2A000064 00220000 20000000 00000000
> > 00000000 05A83364 A8002201 00000010
> > 00004000 03006209 04006309 AA002301
> > 00004000
> > 
> > Consulting the SRP and SCSI specs and decoding this:
> > 
> > The first row indicates that it's an SRP_CMD, that there is one 
> > data-out buffer descriptor, and that it's an "indirect data buffer 
> > descriptor" (type 2h, encoded in the high nibble of the sixth byte 
> > above).
> > 
> > The SCSI CDB starts in the third row and is a write (10-byte CDB).
> > The length is 20h blocks (16k bytes).
> > 
> > The data-out buffer descriptor starts at byte 48 (fourth row) and
> > consists of a 16-byte "indirect table memory descriptor", a four-byte
> > total length (00004000), and one 16-byte "partial memory descriptor"
> > (there is one of these because the data-out buffer descriptor count,
> > the 7th byte in the SRP_CMD, is 1).
> > 
> > The suspicious part is the partial memory descriptor, which is this
> > (copying the last four words from above):
> > 03006209 04006309 AA002301 00004000.  This is a virtual address of
> > 03006209 04006309, a memory handle (AA002301) that looks like the
> > other ones earlier in the trace, and a data length of 16k.
> > 
> > The SRP stream gets into trouble when the target does an RDMA Read 
> > Request using this virtual address -- it looks bogus.
> > 
> > I'm hoping that someone can double-check my decoding of this packet,
> > and perhaps Tzachi could take a look.
> > 
> > Rob Netzer
> > System Fabric Works, Inc.
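
The decoding in Rob's message can be cross-checked mechanically. The sketch below parses the quoted payload using the offsets he describes: format nibble in the sixth byte, descriptor count in the seventh, CDB in the third row, data-out descriptor at byte 48. The function name decode_srp_cmd and the dictionary keys are hypothetical, and the 512-byte block size is an assumption that happens to agree with "20h blocks (16k bytes)".

```python
import struct

# Payload copied from the trace quoted above (hex words in wire order).
PAYLOAD_HEX = """
02000000 00200100 EF010000 00000000
00000000 00000000 00000000 00000000
2A000064 00220000 20000000 00000000
00000000 05A83364 A8002201 00000010
00004000 03006209 04006309 AA002301
00004000
"""

BLOCK_SIZE = 512  # assumption: 512-byte logical blocks


def decode_srp_cmd(hex_text, block_size=BLOCK_SIZE):
    """Decode an SRP_CMD with a 10-byte CDB and an indirect data-out descriptor."""
    buf = bytes.fromhex("".join(hex_text.split()))
    out = {
        "opcode": buf[0],             # 02h = SRP_CMD
        "data_out_fmt": buf[5] >> 4,  # high nibble of the sixth byte
        "data_out_desc_cnt": buf[6],  # seventh byte
        "tag": buf[8:16].hex(),
    }

    # 10-byte CDB at byte 32: opcode, flags, 4-byte LBA, group,
    # 2-byte transfer length, control (all big-endian).
    cdb_op, _flags, lba, _group, blocks, _ctl = struct.unpack(">BBIBHB", buf[32:42])
    out.update(cdb_opcode=cdb_op, lba=lba, blocks=blocks,
               cdb_bytes=blocks * block_size)

    # Indirect descriptor at byte 48: table descriptor (VA, key, length),
    # then the 4-byte total length, then one 16-byte entry per descriptor.
    table_va, table_key, table_len, total_len = struct.unpack(">QIII", buf[48:68])
    out.update(table_va=table_va, table_key=table_key,
               table_len=table_len, total_len=total_len)
    out["descriptors"] = [
        dict(zip(("va", "key", "length"),
                 struct.unpack(">QII", buf[68 + 16 * i:84 + 16 * i])))
        for i in range(out["data_out_desc_cnt"])
    ]
    return out


if __name__ == "__main__":
    cmd = decode_srp_cmd(PAYLOAD_HEX)
    print(f"opcode={cmd['opcode']:02X}h fmt={cmd['data_out_fmt']}h "
          f"count={cmd['data_out_desc_cnt']}")
    print(f"CDB op={cmd['cdb_opcode']:02X}h blocks={cmd['blocks']:X}h "
          f"bytes={cmd['cdb_bytes']}")
    for d in cmd["descriptors"]:
        print(f"va={d['va']:016X} key={d['key']:08X} len={d['length']:X}h")
```

Parsed this way, the fields agree with the reading in the message: one indirect data-out descriptor, a 10-byte write of 20h blocks, and a single partial memory descriptor with virtual address 03006209 04006309, handle AA002301, and length 4000h. The bytes alone do not show whether that address is actually invalid; that still needs the initiator-side check Rob asks for.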
> > 
> > 
> > _______________________________________________
> > ofw mailing list
> > ofw at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
> > 
> 
> 
> 
> 





------------------------------


End of ofw Digest, Vol 11, Issue 28
***********************************