[nvmewin] NVMe 1.1 : Clarification on PRP2 Entry list
Luse, Paul E
paul.e.luse at intel.com
Mon Apr 1 11:40:25 PDT 2013
You didn’t change any compile options (like enabling DUMB_DRIVER) or anything? Besides adding the print that Kwok suggests (did you do that?), you can also define PRP_DBG, which prints a lot of information about the PRPs. Don’t try anything that generates a lot of IO with it enabled; it’s only useful for specific cases. It might provide another hint, but keep in mind that this define slightly changes the code path in question, so it could affect whatever your failure mode is.
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of SANTOSH SINGH
Sent: Monday, April 01, 2013 5:06 AM
To: Kong, Kwok
Cc: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] NVMe 1.1 : Clarification on PRP2 Entry list
Hi Kwok,
I did not modify the driver code or introduce any bug, but I have seen this behaviour a couple of times. I will capture the debug trace.
Regards
Santosh
------- Original Message -------
Sender : Kong, Kwok <Kwok.Kong at idt.com>
Date : Mar 27, 2013 07:56 (GMT+09:00)
Title : RE: RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Santosh,
Something is not right here. The driver pre-allocates the PRP lists, each holding up to 32 entries, at initialization time. PRP2 must point to a list at one of the following offsets within a page:
- 0x000
- 0x100
- 0x200
- …
- …
- 0xE00
- 0xF00
PRP2 cannot point to an offset 0xFF0 as in your example.
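The alignment rule above can be checked with a few lines of C. This is only a sketch, not code from the driver: the constants and the helper name prp_list_offset_is_valid are hypothetical, and it assumes 8-byte PRP entries, 32 entries per pre-allocated list (so 0x100 bytes per list) and a 4KB memory page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only (not taken from the driver source). */
#define PRP_ENTRY_SIZE        8u       /* bytes per PRP entry */
#define PRP_ENTRIES_PER_LIST  32u      /* entries in each pre-allocated list */
#define PRP_LIST_SIZE         (PRP_ENTRY_SIZE * PRP_ENTRIES_PER_LIST) /* 0x100 */
#define MEMORY_PAGE_SIZE      0x1000u  /* 4KB memory page */

/* Hypothetical helper: true if PRP2 points at the start of a list slot,
 * i.e. its offset within the page is 0x000, 0x100, ..., 0xF00. */
static bool prp_list_offset_is_valid(uint64_t prp2)
{
    /* The lists are expected to tile the page exactly. */
    assert(MEMORY_PAGE_SIZE % PRP_LIST_SIZE == 0u);

    uint64_t offset = prp2 & (MEMORY_PAGE_SIZE - 1u);
    return (offset % PRP_LIST_SIZE) == 0u;
}
```

Under these assumptions the reported PRP2 value 0xbe166ff0 fails the check, because its page offset 0xFF0 is not a multiple of 0x100.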
You can confirm this in your driver by adding the following debug message to the function NVMeInitFreeQ in the file nvmeinit.c.
Add the printout at around line 720:
/* Save the address of current list for calculating next list */
CurPRPList = (ULONG_PTR)pCmdInfo->pPRPList;
pCmdInfo->prpListPhyAddr = StorPortGetPhysicalAddress(pAE,
                                                      NULL,
                                                      pCmdInfo->pPRPList,
                                                      &prpListSz);
StorPortDebugPrint(INFO, "NVMeInitFreeQ : Entry#%d List starts 0x%llX\n",
                   Entry, pCmdInfo->pPRPList);
Here is a sample output:
STORMINI: NVMeInitFreeQ : Entry#0 List starts 0xFFFFFA800D69E100
STORMINI: NVMeInitFreeQ : Entry#1 List starts 0xFFFFFA800D69E200
STORMINI: NVMeInitFreeQ : Entry#2 List starts 0xFFFFFA800D69E300
STORMINI: NVMeInitFreeQ : Entry#3 List starts 0xFFFFFA800D69E400
STORMINI: NVMeInitFreeQ : Entry#4 List starts 0xFFFFFA800D69E500
STORMINI: NVMeInitFreeQ : Entry#5 List starts 0xFFFFFA800D69E600
STORMINI: NVMeInitFreeQ : Entry#6 List starts 0xFFFFFA800D69E700
STORMINI: NVMeInitFreeQ : Entry#7 List starts 0xFFFFFA800D69E800
STORMINI: NVMeInitFreeQ : Entry#8 List starts 0xFFFFFA800D69E900
STORMINI: NVMeInitFreeQ : Entry#9 List starts 0xFFFFFA800D69EA00
STORMINI: NVMeInitFreeQ : Entry#10 List starts 0xFFFFFA800D69EB00
STORMINI: NVMeInitFreeQ : Entry#11 List starts 0xFFFFFA800D69EC00
STORMINI: NVMeInitFreeQ : Entry#12 List starts 0xFFFFFA800D69ED00
STORMINI: NVMeInitFreeQ : Entry#13 List starts 0xFFFFFA800D69EE00
STORMINI: NVMeInitFreeQ : Entry#14 List starts 0xFFFFFA800D69EF00
STORMINI: NVMeInitFreeQ : Entry#15 List starts 0xFFFFFA800D69F000
STORMINI: NVMeInitFreeQ : Entry#16 List starts 0xFFFFFA800D69F100
STORMINI: NVMeInitFreeQ : Entry#17 List starts 0xFFFFFA800D6A0100
STORMINI: NVMeInitFreeQ : Entry#18 List starts 0xFFFFFA800D6A0200
STORMINI: NVMeInitFreeQ : Entry#19 List starts 0xFFFFFA800D6A0300
…
…
Did you modify the driver and introduce a bug?
We have not seen this problem in our testing environment.
Thanks
-Kwok
From: SANTOSH SINGH [mailto:santosh.s2 at samsung.com]
Sent: Tuesday, March 26, 2013 3:32 AM
To: Robles, Raymond C
Cc: Kong, Kwok; technical at nvmexpress.org; Onufryk, Peter; Wilcox, Matthew R; nvmewin at lists.openfabrics.org
Subject: Re: RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Hi Ray,
Attached is the summary of the screen shot.
[cid:image001.png at 01CE2ECD.B1F79810]
We can discuss in the next WG call.
Regards
Santosh
------- Original Message -------
Sender : Robles, Raymond C <raymond.c.robles at intel.com>
Date : Mar 22, 2013 05:17 (GMT+09:00)
Title : RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Hi Santosh,
I am one of the original authors of the OFA Windows NVMe driver (sorry, I’m late to this thread). What do you believe is the problem? The Windows OFA driver constructs PRP lists per the NVMe spec. We’ve run for several days with numerous data integrity testing tools (without error).
Do you believe that the PRP list is incorrectly constructed? Based on the screen shot you sent out, the second PRP entry in the submission queue entry points to a PRP list… it should not contain the 2nd PRP entry. This is per the NVMe spec.
Thanks,
Ray
From: Kong, Kwok [mailto:Kwok.Kong at idt.com]
Sent: Thursday, March 21, 2013 11:14 AM
To: santosh.s2 at samsung.com
Cc: technical at nvmexpress.org; Onufryk, Peter; Wilcox, Matthew R
Subject: RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Santosh,
The specification is very clear on how the PRP list should be constructed. If you see a problem with any driver, then it is a driver problem and not a specification problem.
Please send your question to nvmewin at lists.openfabrics.org if you believe there is a driver bug.
What is the LBA size in your testing: 512B or 4KB?
Thanks
-Kwok
From: SANTOSH SINGH [mailto:santosh.s2 at samsung.com]
Sent: Thursday, March 21, 2013 3:55 AM
To: Kong, Kwok
Cc: technical at nvmexpress.org; Onufryk, Peter; 'Wilcox, Matthew R'
Subject: Re: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Sorry, somehow the attachment was missing. Resending.
[cid:image002.png at 01CE2ECD.B1F79810]
Regards
Santosh
------- Original Message -------
Sender : SANTOSH SINGH <santosh.s2 at samsung.com> Senior Chief Engineer/SRI-Bangalore-SSD Solutions/Samsung Electronics
Date : Mar 21, 2013 19:45 (GMT+09:00)
Title : RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Hi Kwok,
I got the scenario reproduced again, while issuing the FS format command.
Following are the debug details.
Page size 4k
Data Transfer size was 16 LBA
PRP1 Entry 0x3e8fd060
PRP2 Entry list 0xbe166ff0
Total no. of PRP entries 17
The 16 PRP entries should fit in a single page. But the PRP2 offset
(0xbe166ff0) is not correct: the list runs off the page (from 0xbe166ff0
to 0xbe167000) into the next contiguous page. Attached is a
snapshot of the debug window for detailed analysis.
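The arithmetic behind this observation can be sketched in C. The helper name prp_entries_before_page_end is hypothetical, and the sketch assumes 8-byte PRP entries and the 4k page size reported above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of 8-byte PRP entries that fit between
 * list_addr and the end of the memory page that contains it.
 * page_size must be a power of two. */
static uint32_t prp_entries_before_page_end(uint64_t list_addr, uint32_t page_size)
{
    assert(page_size != 0u && (page_size & (page_size - 1u)) == 0u);

    uint64_t offset = list_addr & (uint64_t)(page_size - 1u);
    return (uint32_t)((page_size - offset) / 8u);
}
```

With the values reported above, prp_entries_before_page_end(0xbe166ff0, 4096) evaluates to 2, since only 0x10 bytes remain before 0xbe167000. A 16-entry list starting at that address would therefore continue into the next page.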
Regards
Santosh
-----Original Message-----
From: Kong, Kwok [mailto:Kwok.Kong at idt.com]
Sent: Thursday, March 21, 2013 1:22 AM
To: Wilcox, Matthew R; santosh.s2 at samsung.com
Cc: Onufryk, Peter; technical at nvmexpress.org
Subject: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Santosh,
Your conclusion that the OFA driver uses the PRP list format shown in
figure 2 is incorrect.
The OFA driver does neither figure 1 nor figure 2.
By default, the max request size that the mini-port supports is 128KB. If
the request size is bigger than 128KB, the port driver sends multiple 128KB
requests to the OFA mini-port driver.
The OFA mini-port driver pre-allocates the PRP list buffers during
initialization. The PRP entries (a max of 32 entries for 128KB request
size) never run off the end of a page as shown in figure 1 or figure 2. All
PRP entries are guaranteed to fit within a single Memory Page.
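As a rough illustration of where the figure of 32 entries comes from, the sketch below counts the PRP list entries a request needs. It is not driver code: the helper names are hypothetical, and it assumes a 4KB memory page, with PRP1 describing the first page and PRP2 pointing to a list only when the buffer touches more than two pages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of memory pages touched by a buffer.
 * page_size must be a power of two and length must be non-zero. */
static uint32_t pages_spanned(uint64_t buf_addr, uint32_t length, uint32_t page_size)
{
    assert(length != 0u);
    assert(page_size != 0u && (page_size & (page_size - 1u)) == 0u);

    uint64_t first = buf_addr / page_size;
    uint64_t last  = (buf_addr + length - 1u) / page_size;
    return (uint32_t)(last - first + 1u);
}

/* PRP1 describes the first page. With exactly two pages, PRP2 holds the
 * second page address directly. With more than two pages, PRP2 points to
 * a list with one entry for every page after the first. */
static uint32_t prp_list_entries_needed(uint64_t buf_addr, uint32_t length, uint32_t page_size)
{
    uint32_t pages = pages_spanned(buf_addr, length, page_size);
    return (pages > 2u) ? pages - 1u : 0u;
}
```

A page-aligned 128KB buffer touches 32 pages and needs 31 list entries. A 128KB buffer that starts part-way into a page touches 33 pages and needs 32, which is the maximum for this request size.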
Thanks
-Kwok
-----Original Message-----
From: Wilcox, Matthew R [mailto:matthew.r.wilcox at intel.com]
Sent: Wednesday, March 20, 2013 8:26 AM
To: santosh.s2 at samsung.com; Kong, Kwok
Cc: Onufryk, Peter; technical at nvmexpress.org
Subject: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list
The Linux driver does neither figure 1, nor figure 2.
If the number of PRP entries requires more than one page, it starts at the
beginning of a page. If it requires less than a page, it may start in the
middle of a page, but will never run off the end of a page as shown in
figure 2.
I have not reviewed the OFA driver to see what it does.
________________________________
From: SANTOSH SINGH [santosh.s2 at samsung.com]
Sent: March 19, 2013 8:01 PM
To: Kong, Kwok; Wilcox, Matthew R
Cc: Onufryk, Peter; technical at nvmexpress.org
Subject: Re: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Hi Kwok, Matthew,
I verified the OFA driver too, and it prepares the PRP List entry as in
fig-2.
Is there any reason why both drivers (Linux and OFA) prepare the PRP2 list as in
fig-2? Will later versions of the drivers change to fig-1?
Regards
Santosh
------- Original Message -------
Sender : Kong, Kwok
Date : Mar 20, 2013 01:01 (GMT+09:00)
Title : RE: NVMe 1.1 : Clarification on PRP2 Entry list
I believe the specification clearly indicates that figure 1 is
correct.
"The last entry within a memory page, as indicated by the memory page size
in the CC.MPS field, shall be a PRP List pointer if there is more than a
single memory page of data to be transferred.".
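One way to read the quoted sentence is sketched below in C. This is an illustration of the figure 1 interpretation only, not code from any driver: the helper names are hypothetical, and it assumes 8-byte PRP entries and that each continuation list starts at the beginning of a memory page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PRP_ENTRY_SIZE 8u  /* bytes per PRP entry (assumption for this sketch) */

/* Hypothetical helper: true if the 8-byte slot at slot_addr is the last
 * slot in its memory page. page_size must be a power of two. */
static bool is_last_slot_in_page(uint64_t slot_addr, uint32_t page_size)
{
    assert(page_size >= PRP_ENTRY_SIZE && (page_size & (page_size - 1u)) == 0u);
    return (slot_addr & (uint64_t)(page_size - 1u)) ==
           (uint64_t)(page_size - PRP_ENTRY_SIZE);
}

/* Count the PRP List pointers needed to describe data_entries data pages
 * when the list starts at list_addr. The last slot of a page becomes a
 * pointer only if more than one data entry is still outstanding. */
static uint32_t chain_pointers_needed(uint64_t list_addr, uint32_t data_entries,
                                      uint32_t page_size)
{
    uint32_t pointers = 0u;
    uint64_t slot = list_addr;

    while (data_entries > 0u) {
        if (is_last_slot_in_page(slot, page_size) && data_entries > 1u) {
            pointers++;
            slot = 0u;  /* assumed page-aligned continuation; only the offset matters */
        } else {
            data_entries--;
            slot += PRP_ENTRY_SIZE;
        }
    }
    return pointers;
}
```

For example, a list that starts at offset 0xFE0 of a 4KB page and must describe 8 data pages holds three data entries, then a list pointer in the last slot of the page, and the remaining five entries go in the next list.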
Thanks
-Kwok
From: Onufryk, Peter [mailto:Peter.Onufryk at idt.com]
Sent: Tuesday, March 19, 2013 7:22 AM
To: Santosh Singh; technical at nvmexpress.org
Subject: RE: NVMe 1.1 : Clarification on PRP2 Entry list
Santosh,
John and I discussed this and we believe that Figure 1 is correct and that
the best way to clarify this is by adding a figure showing it to the spec.
This will be the first item on the agenda in this week's calls.
Regards,
Peter
From: Santosh Singh [mailto:santosh.s2 at samsung.com]
Sent: Tuesday, March 19, 2013 5:49 AM
To: technical at nvmexpress.org
Subject: NVMe 1.1 : Clarification on PRP2 Entry list
Hi All,
I received a query from a design engineer about PRP2 when it points to an
entry list that is not memory page aligned.
The below paragraph is from section 4.3 'Physical Region Page Entry and
List' of Spec 1.1.
[cid:Z5JE7EUABGFC at namo.co.kr]
PRP entry 2, when pointing to a list, may also have a non-zero offset within
a memory page, which means that it is not memory page aligned.
The last entry within a memory page shall be a list pointer. There are
two possible understandings of this:
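[Figures not preserved in this archive. Fig:1 and Fig:2 each showed PRP entry 2 pointing to a list that starts part-way through the memory page 1000h-1FFFh, an "Address of PRP list" entry pointing to a further list page at 5000h-5FFFh, and the page boundary. In Fig:1 the list pointer appears to follow Entry 4 at the end of the first page, with Entry 5 to Entry 8 in the page at 5000h. In Fig:2 the entries appear to continue past the page boundary (Entry 5 through Entry 511, up to 2FFFh) before the list pointer, with Entry 512 to Entry 514 in the page at 5000h.]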
So I just want to verify with others that, as per the line in section 4.3, 'A
physical region page list (PRP List) is a set of PRP entries in a single
page of contiguous memory',
fig:2 is the correct understanding.
Thanks & Regards
Santosh
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20130401/bf43f391/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 81077 bytes
Desc: image001.png
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20130401/bf43f391/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 126969 bytes
Desc: image002.png
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20130401/bf43f391/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.gif
Type: image/gif
Size: 14036 bytes
Desc: image003.gif
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20130401/bf43f391/attachment.gif>