From santosh.s2 at samsung.com Mon Apr 1 05:05:49 2013
From: santosh.s2 at samsung.com (SANTOSH SINGH)
Date: Mon, 01 Apr 2013 12:05:49 +0000 (GMT)
Subject: [nvmewin] NVMe 1.1 : Clarification on PRP2 Entry list
Message-ID: <5D.4C.24454.D1879515@epcpsbgx4.samsung.com>

An HTML attachment was scrubbed...

From paul.e.luse at intel.com Mon Apr 1 11:40:25 2013
From: paul.e.luse at intel.com (Luse, Paul E)
Date: Mon, 1 Apr 2013 18:40:25 +0000
Subject: [nvmewin] NVMe 1.1 : Clarification on PRP2 Entry list
In-Reply-To: <5D.4C.24454.D1879515@epcpsbgx4.samsung.com>
References: <5D.4C.24454.D1879515@epcpsbgx4.samsung.com>
Message-ID: <82C9F782B054C94B9FC04A331649C77A07C84514@FMSMSX106.amr.corp.intel.com>

You didn’t change any compile options (like enabling DUMB_DRIVER) or anything? Besides adding the print that Kwok suggests (did you do that?), you can also define PRP_DBG, which prints a ton of info on the PRPs and might provide another hint. Don’t try anything that generates a lot of IO with it; it’s only useful for specific things. Keep in mind that this define changes the code path in question slightly, so it could affect whatever your failure mode is.

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of SANTOSH SINGH
Sent: Monday, April 01, 2013 5:06 AM
To: Kong, Kwok
Cc: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] NVMe 1.1 : Clarification on PRP2 Entry list

Hi Kwok,

I did not modify the driver code or introduce any bug, but I have seen this behaviour a couple of times. I will capture the debug trace.

Regards
Santosh

------- Original Message -------
Sender : Kong, Kwok
Date : Mar 27, 2013 07:56 (GMT+09:00)
Title : RE: RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Santosh,

Something is not right here. The driver pre-allocates the PRP lists during initialization, each with room for up to 32 entries. PRP2 must point to a list at one of the following offsets:

- 0x000
- 0x100
- 0x200
- …
- …
- 0xE00
- 0xF00

PRP2 cannot point to an offset 0xFF0 as in your example. You can confirm this in your driver by adding the following debugging message in the file nvmeinit.c, in function NVMeInitFreeQ. The printout goes around Line# 720...

    /* Save the address of current list for calculating next list */
    CurPRPList = (ULONG_PTR)pCmdInfo->pPRPList;
    pCmdInfo->prpListPhyAddr = StorPortGetPhysicalAddress(pAE, NULL, pCmdInfo->pPRPList, &prpListSz);
    StorPortDebugPrint(INFO, "NVMeInitFreeQ : Entry#%d List starts 0x%llX\n", Entry, pCmdInfo->pPRPList);

Here is a sample output:

STORMINI: NVMeInitFreeQ : Entry#0 List starts 0xFFFFFA800D69E100
STORMINI: NVMeInitFreeQ : Entry#1 List starts 0xFFFFFA800D69E200
STORMINI: NVMeInitFreeQ : Entry#2 List starts 0xFFFFFA800D69E300
STORMINI: NVMeInitFreeQ : Entry#3 List starts 0xFFFFFA800D69E400
STORMINI: NVMeInitFreeQ : Entry#4 List starts 0xFFFFFA800D69E500
STORMINI: NVMeInitFreeQ : Entry#5 List starts 0xFFFFFA800D69E600
STORMINI: NVMeInitFreeQ : Entry#6 List starts 0xFFFFFA800D69E700
STORMINI: NVMeInitFreeQ : Entry#7 List starts 0xFFFFFA800D69E800
STORMINI: NVMeInitFreeQ : Entry#8 List starts 0xFFFFFA800D69E900
STORMINI: NVMeInitFreeQ : Entry#9 List starts 0xFFFFFA800D69EA00
STORMINI: NVMeInitFreeQ : Entry#10 List starts 0xFFFFFA800D69EB00
STORMINI: NVMeInitFreeQ : Entry#11 List starts 0xFFFFFA800D69EC00
STORMINI: NVMeInitFreeQ : Entry#12 List starts 0xFFFFFA800D69ED00
STORMINI: NVMeInitFreeQ : Entry#13 List starts 0xFFFFFA800D69EE00
STORMINI: NVMeInitFreeQ : Entry#14 List starts 0xFFFFFA800D69EF00
STORMINI: NVMeInitFreeQ : Entry#15 List starts 0xFFFFFA800D69F000
STORMINI: NVMeInitFreeQ : Entry#16 List starts 0xFFFFFA800D69F100
STORMINI: NVMeInitFreeQ : Entry#17 List starts 0xFFFFFA800D6A0100
STORMINI: NVMeInitFreeQ : Entry#18 List starts 0xFFFFFA800D6A0200
STORMINI: NVMeInitFreeQ : Entry#19 List starts 0xFFFFFA800D6A0300
…
…

Did you modify the driver and create a bug in your driver? We have not seen this problem in our testing environment.

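The offsets listed above follow from the list size: 32 entries of 8 bytes each occupy 0x100 bytes, so lists packed back to back start on 0x100 boundaries, and a full list starting at 0xF00 ends exactly at the page boundary. A minimal sketch of that check in C, assuming a 4KB memory page and 8-byte PRP entries; the function names are hypothetical and are not part of the OFA driver:

```c
#include <assert.h>
#include <stdint.h>

#define PRP_ENTRY_SIZE 8u  /* each PRP entry is a 64-bit physical address */

/* Byte offset of addr within its memory page (page_size must be a power of two). */
uint64_t page_offset(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & (page_size - 1);
}

/* Returns 1 if a list of `entries` PRP entries starting at `list_addr`
 * would extend past the end of the memory page that contains it. */
int prp_list_crosses_page(uint64_t list_addr, uint64_t entries,
                          uint64_t page_size)
{
    uint64_t bytes = entries * PRP_ENTRY_SIZE;
    return page_offset(list_addr, page_size) + bytes > page_size;
}

/* Returns 1 if `list_addr` falls on a slot boundary when lists of
 * `entries_per_list` entries are packed back to back (0x100 apart for 32). */
int on_preallocated_slot(uint64_t list_addr, uint64_t entries_per_list)
{
    return list_addr % (entries_per_list * PRP_ENTRY_SIZE) == 0;
}
```

With these helpers, prp_list_crosses_page(0xbe166ff0, 16, 4096) returns 1, because offset 0xFF0 leaves room for only two entries before the page ends, while a 32-entry list at offset 0xF00 fits exactly. on_preallocated_slot(0xbe166ff0, 32) returns 0, which is the sense in which that address cannot come from the pre-allocated lists.
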
Thanks
-Kwok

From: SANTOSH SINGH [mailto:santosh.s2 at samsung.com]
Sent: Tuesday, March 26, 2013 3:32 AM
To: Robles, Raymond C
Cc: Kong, Kwok; technical at nvmexpress.org; Onufryk, Peter; Wilcox, Matthew R; nvmewin at lists.openfabrics.org
Subject: Re: RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Hi Ray,

Attached is the summary of the screen shot. We can discuss in the next WG call.

Regards
Santosh

------- Original Message -------
Sender : Robles, Raymond C
Date : Mar 22, 2013 05:17 (GMT+09:00)
Title : RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Hi Santosh,

I am one of the original authors of the OFA Windows NVMe driver (sorry, I’m late to this thread). What do you believe is the problem? The Windows OFA driver constructs PRP lists per the NVMe spec. We’ve run for several days with numerous data integrity testing tools (without error). Do you believe that the PRP list is incorrectly constructed?

Based on the screen shot you sent out, the second PRP entry in the submission queue entry points to a PRP list… it should not contain the 2nd PRP entry. This is per the NVMe spec.

Thanks,
Ray

From: Kong, Kwok [mailto:Kwok.Kong at idt.com]
Sent: Thursday, March 21, 2013 11:14 AM
To: santosh.s2 at samsung.com
Cc: technical at nvmexpress.org; Onufryk, Peter; Wilcox, Matthew R
Subject: RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Santosh,

The specification is very clear on how the PRP list should be constructed. If you see a problem with any driver, then it is a driver problem and not a specification problem. Please send your question to nvmewin at lists.openfabrics.org if you believe there is a driver bug.

What is the LBA size for your testing? 512B or 4KB?

Thanks
-Kwok

From: SANTOSH SINGH [mailto:santosh.s2 at samsung.com]
Sent: Thursday, March 21, 2013 3:55 AM
To: Kong, Kwok
Cc: technical at nvmexpress.org; Onufryk, Peter; 'Wilcox, Matthew R'
Subject: Re: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Sorry, somehow the attachment is missing. Resending.

Regards
Santosh

------- Original Message -------
Sender : SANTOSH SINGH, Senior Chief Engineer/SRI-Bangalore-SSD Solutions/Samsung Electronics
Date : Mar 21, 2013 19:45 (GMT+09:00)
Title : RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Hi Kwok,

I got the scenario reproduced again while issuing the FS format command. The debug details are as follows.

Page size: 4k
Data Transfer size: 16 LBA
PRP1 Entry: 0x3e8fd060
PRP2 Entry list: 0xbe166ff0
Total no. of PRP entries: 17

The 16 PRP entries should fit in a single page, but the PRP2 address 0xbe166ff0 is not correct: the list runs off the page (from 0xbe166ff0 into 0xbe167000), which is the next contiguous page. Attached is the snapshot of the debug window for the detailed analysis.

Regards
Santosh

-----Original Message-----
From: Kong, Kwok [mailto:Kwok.Kong at idt.com]
Sent: Thursday, March 21, 2013 1:22 AM
To: Wilcox, Matthew R; santosh.s2 at samsung.com
Cc: Onufryk, Peter; technical at nvmexpress.org
Subject: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Santosh,

Your verification that the OFA driver uses the PRP list format in figure 2 is incorrect. The OFA driver does neither figure 1, nor figure 2.

By default, the max request size that the mini-port supports is 128KB. If the request size is bigger than 128KB, the port driver sends multiple 128KB requests to the OFA mini-port driver. The OFA mini-port driver pre-allocates the PRP list buffers during initialization. The PRP entries (a max of 32 entries for 128KB request size) never run off the end of a page as shown in figure 1 or figure 2.
All PRP entries are guaranteed to fit within a single Memory Page.

Thanks
-Kwok

-----Original Message-----
From: Wilcox, Matthew R [mailto:matthew.r.wilcox at intel.com]
Sent: Wednesday, March 20, 2013 8:26 AM
To: santosh.s2 at samsung.com; Kong, Kwok
Cc: Onufryk, Peter; technical at nvmexpress.org
Subject: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

The Linux driver does neither figure 1, nor figure 2. If the number of PRP entries requires more than one page, it starts at the beginning of a page. If it requires less than a page, it may start in the middle of a page, but will never run off the end of a page as shown in figure 2. I have not reviewed the OFA driver to see what it does.

________________________________
From: SANTOSH SINGH [santosh.s2 at samsung.com]
Sent: March 19, 2013 8:01 PM
To: Kong, Kwok; Wilcox, Matthew R
Cc: Onufryk, Peter; technical at nvmexpress.org
Subject: Re: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Hi Kwok, Matthew,

I verified the OFA driver too, and it prepares the PRP List entries as in fig-2. Is there any reason why both drivers (Linux and OFA) prepare the PRP2 list like fig-2? Will later versions of the drivers change to fig-1?

Regards
Santosh

------- Original Message -------
Sender : Kong, Kwok
Date : Mar 20, 2013 01:01 (GMT+09:00)
Title : RE: NVMe 1.1 : Clarification on PRP2 Entry list

I believe the specification clearly indicates that figure 1 is correct: "The last entry within a memory page, as indicated by the memory page size in the CC.MPS field, shall be a PRP List pointer if there is more than a single memory page of data to be transferred."

Thanks
-Kwok

From: Onufryk, Peter [mailto:Peter.Onufryk at idt.com]
Sent: Tuesday, March 19, 2013 7:22 AM
To: Santosh Singh; technical at nvmexpress.org
Subject: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Santosh,

John and I discussed this and we believe that Figure 1 is correct and that the best way to clarify this is by adding a figure showing it to the spec.
This will be the first item on the agenda in this week's calls.

Regards,
Peter

From: Santosh Singh [mailto:santosh.s2 at samsung.com]
Sent: Tuesday, March 19, 2013 5:49 AM
To: technical at nvmexpress.org
Subject: NVMe 1.1 : Clarification on PRP2 Entry list

Hi All,

I got a query from a design engineer about PRP2 when it points to an entry list and is not memory page aligned. The paragraph below is from section 4.3 'Physical Region Page Entry and List' of Spec 1.1.

[image: spec excerpt, scrubbed from the archive]

PRP entry 2, when pointing to a list, may also have a non-zero offset within a memory page, meaning that it is not memory page aligned. The last entry within a memory page shall be a list pointer. There are the following two understandings of this:

[Fig:1: PRP entry 2 points to a list in page 1000h-1FFFh that holds Entry 1 to Entry 4 followed by the address of the next PRP list; that list, in page 5000h-5FFFh, holds Entry 5 to Entry 8.]

[Fig:2: PRP entry 2 points to a list that starts in page 1000h-1FFFh with Entry 1 to Entry 4 and continues across the page boundary, up to 2FFFh, with Entry 5 through Entry 511 followed by the address of the next PRP list; that list, in page 5000h-5FFFh, holds Entry 512 to Entry 514.]

So I just want to verify with others that, as per the line in section 4.3 'A physical region page list (PRP List) is a set of PRP entries in a single page of contiguous memory', fig:2 is the correct understanding.

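The Figure 1 reading, in which the last entry of a memory page becomes a pointer to the next list whenever more entries remain, can be sketched as a small calculation. This is an illustration only: it assumes 8-byte entries and that every chained list starts at offset 0 of its page, and the function name is hypothetical rather than taken from any driver.

```c
#include <assert.h>
#include <stdint.h>

#define PRP_ENTRY_SIZE 8u  /* each PRP entry is a 64-bit address */

/*
 * Number of memory pages a chained PRP list occupies when the last slot
 * of a page is used as a pointer to the next list whenever entries remain.
 *   first_offset: byte offset of the first list within its page
 *   entries:      number of data-page entries the list must hold
 */
unsigned prp_list_pages(uint64_t first_offset, uint64_t entries,
                        uint64_t page_size)
{
    assert(page_size >= 2 * PRP_ENTRY_SIZE && page_size % PRP_ENTRY_SIZE == 0);
    assert(first_offset < page_size && first_offset % PRP_ENTRY_SIZE == 0);

    uint64_t slots = (page_size - first_offset) / PRP_ENTRY_SIZE;
    unsigned pages = 1;

    while (entries > slots) {
        entries -= slots - 1;                /* last slot holds the next-list pointer */
        slots = page_size / PRP_ENTRY_SIZE;  /* assumed: chained lists are page aligned */
        pages++;
    }
    return pages;
}
```

For a list that starts at offset 0xFF0 and must hold 16 entries, prp_list_pages(0xFF0, 16, 4096) returns 2: only two slots remain in the first page, so under this reading the second slot would have to hold a list pointer. A 32-entry list at offset 0xF00 stays within one page.
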
Thanks & Regards
Santosh

From Kwok.Kong at idt.com Mon Apr 1 14:03:16 2013
From: Kwok.Kong at idt.com (Kong, Kwok)
Date: Mon, 1 Apr 2013 21:03:16 +0000
Subject: [nvmewin] NVMe 1.1 : Clarification on PRP2 Entry list
In-Reply-To: <2D.4C.24454.C1879515@epcpsbgx4.samsung.com>
References: <2D.4C.24454.C1879515@epcpsbgx4.samsung.com>
Message-ID: <05CD7821AE397547A01AC160FBC231474BC8EF41@corpmail1.na.ads.idt.com>

Santosh,

What version of Windows are you running? Is it 32-bit or 64-bit? Officially, the current release of the driver supports Windows 7 64-bit only. It does not work with 32-bit Windows.

Thanks
-Kwok

From Kwok.Kong at idt.com Mon Apr 1 16:50:55 2013
From: Kwok.Kong at idt.com (Kong, Kwok)
Date: Mon, 1 Apr 2013 23:50:55 +0000
Subject: [nvmewin] [nvemwin] Review driver development status for the up coming June release
Message-ID: <05CD7821AE397547A01AC160FBC231474BC8F07A@corpmail1.na.ads.idt.com>

When: Thursday, April 11, 2013 1:00 PM-2:00 PM (UTC-08:00) Pacific Time (US & Canada).
Where: Bridge: 888-270-9936, Access code: 811938#

Note: The GMT offset above does not reflect daylight saving time adjustments.

*~*~*~*~*~*~*~*~*~*

Agenda:
- Report Patch check in since last call
- Review driver 1.2 release status
  o IDT - Windows 6 64-bit, Server 2008R2 64-bit, Server 2012 64-bit and NVMe 1.00e support
  o LSI - TRIM command support
  o Huawei - Hibernation as a boot drive
- Review driver release roadmap for 2013
- Review ModeSense Translation issue as reported by Dharani Kotte
- Review PRP2 issue as reported by Santosh
- Do we need any regular call ?
- AOB

Please email me any items that you would like to add to the agenda.

Regards,
-Kwok

You are invited to attend an AT&T Connect iMeeting.

To connect to the Web Conference:
=============================
Click here: https://connect9.uc.att.com/service32/meet/?ExEventID=8811938&CT=M

TO CONNECT WITH YOUR *TELEPHONE ONLY* (no computer):
===================================================
1. Choose one of the following numbers to dial:
   If you are calling from an office location with on-site number(s) (listed below), try this number first. If you do not have on-site access, or you are not a member of the host's company/organization, use one of the other numbers shown.
   * Toll-Free Number (in USA): 888-270-9936.
   * Caller-Paid number: 602-333-0032
   * Blackberry (Toll-Free Number): 888-270-9936x811938#
   * A number in your country or in a country close to you (may be toll free): https://www.teleconference.att.com/servlet/glbAccess?process=1&accessNumber=8882709936&accessCode=811938
2. When prompted, enter the Meeting Access Code: 811938#

To prepare in advance for the conference (for all devices): https://connect9.uc.att.com/service32/Prepare/.
To view supported Operating Systems and devices: http://www.uc.att.com/support/SupportedDevices.html

Powered by AT&T Connect.

From raymond.c.robles at intel.com Tue Apr 2 10:06:52 2013
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Tue, 2 Apr 2013 17:06:52 +0000
Subject: [nvmewin] QEMU with NVM Express Support Uploaded to OFA Repository
Message-ID: <49158E750348AA499168FD41D889836061B90F83@FMSMSX105.amr.corp.intel.com>

Hello all,

It has been brought to my attention that the open source QEMU repo does not contain the version that supports NVMe. In short, there is a patch waiting to be pushed, but the maintainer of the QEMU repo has numerous comments on the patch, and this has stalled the integration of NVMe support into the master QEMU repo.

In lieu of this, I've uploaded Intel's copy of the QEMU source that has NVMe support (since the NVMe support patch came from Intel). This version of QEMU is not proprietary and only implements NVMe per spec.

I've added a new directory in the root directory named qemu/... and it contains:

qemu_5_March_2013.tgz
README

Please let me know if you have any questions. I will keep the distribution list informed when the official patch is pushed to the QEMU master repo.

Thanks,
Ray

Raymond C. Robles
NVM Solutions Group | Internal SSD Engineering
Technology & Manufacturing Group
Intel Corporation
Desk: 480.554.2600
Mobile: 480.399.0645

From santosh.s2 at samsung.com Thu Apr 4 03:25:48 2013
From: santosh.s2 at samsung.com (SANTOSH SINGH)
Date: Thu, 04 Apr 2013 10:25:48 +0000 (GMT)
Subject: [nvmewin] NVMe 1.1 : Clarification on PRP2 Entry list
Message-ID: <44.86.08014.C255D515@epcpsbgx2.samsung.com>

An HTML attachment was scrubbed...

From Kwok.Kong at idt.com Thu Apr 4 07:49:23 2013
From: Kwok.Kong at idt.com (Kong, Kwok)
Date: Thu, 4 Apr 2013 14:49:23 +0000
Subject: [nvmewin] NVMe 1.1 : Clarification on PRP2 Entry list
In-Reply-To:
References:
Message-ID: <05CD7821AE397547A01AC160FBC231474BC8FFA3@corpmail1.na.ads.idt.com>

Santosh,

The current driver supports Windows 7 64-bit only. Windows 7 32-bit is not supported. 32-bit Windows will be supported in the end-of-2013 release. There is no owner to work on Windows 32-bit support yet. It would be great if you could take ownership of Windows 7 and Windows 8 32-bit support.

Thanks
-Kwok

From: SANTOSH SINGH [mailto:santosh.s2 at samsung.com]
Sent: Thursday, April 04, 2013 3:26 AM
To: Kong, Kwok
Cc: nvmewin at lists.openfabrics.org
Subject: Re: RE: RE: RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Hi Kwok,

It's a Windows 7 32-bit system, and I haven't seen the same issue on a Windows 64-bit system.

Regards
Santosh

------- Original Message -------
Sender : Kong, Kwok
Date : Apr 02, 2013 06:03 (GMT+09:00)
Title : RE: RE: RE: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Santosh,

What version of Windows are you running? Is it 32-bit or 64-bit? Officially, the current release of the driver supports Windows 7 64-bit only. It does not work with 32-bit Windows.

Thanks
-Kwok

From: SANTOSH SINGH [mailto:santosh.s2 at samsung.com]
Sent: Thursday, March 21, 2013 3:55 AM
To: Kong, Kwok
Cc: technical at nvmexpress.org; Onufryk, Peter; 'Wilcox, Matthew R'
Subject: Re: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Sorry, somehow the attachment was missing. Resending.

[cid:image002.png at 01CE3108.1FE00EC0]

Regards
Santosh

------- Original Message -------
Sender : SANTOSH SINGH Senior Chief Engineer/SRI-Bangalore-SSD Solutions/Samsung Electronics
Date : Mar 21, 2013 19:45 (GMT+09:00)
Title : RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Hi Kwok,

I reproduced the scenario again while issuing the FS format command. The debug details are as follows.

Page size: 4k
Data transfer size: 16 LBA
PRP1 entry: 0x3e8fd060
PRP2 entry list: 0xbe166ff0
Total no. of PRP entries: 17

The 16 PRP entries should fit in a single page. But the PRP2 offset (0xbe166ff0) is not correct: the list runs off the page (from 0xbe166ff0 into 0xbe167000), which is the next contiguous page. Attached is a snapshot of the debug window for detailed analysis.

Regards
Santosh

-----Original Message-----
From: Kong, Kwok [mailto:Kwok.Kong at idt.com]
Sent: Thursday, March 21, 2013 1:22 AM
To: Wilcox, Matthew R; santosh.s2 at samsung.com
Cc: Onufryk, Peter; technical at nvmexpress.org
Subject: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list

Santosh,

Your verification that the OFA driver uses the PRP list format in figure 2 is incorrect. The OFA driver does neither figure 1 nor figure 2. By default, the max request size that the mini-port supports is 128KB. If the request size is bigger than 128KB, the port driver sends multiple 128KB requests to the OFA mini-port driver. The OFA mini-port driver pre-allocates the PRP list buffers during initialization. The PRP entries (a max of 32 entries for 128KB request size) never run off the end of a page as shown in figure 1 or figure 2.
All PRP entries are guaranteed to fit within a single Memory Page. Thanks -Kwok -----Original Message----- From: Wilcox, Matthew R [mailto:matthew.r.wilcox at intel.com] Sent: Wednesday, March 20, 2013 8:26 AM To: santosh.s2 at samsung.com; Kong, Kwok Cc: Onufryk, Peter; technical at nvmexpress.org Subject: RE: RE: NVMe 1.1 : Clarification on PRP2 Entry list The Linux driver does neither figure 1, nor figure 2. If the number of PRP entries requires more than one page, it starts at the beginning of a page. If it requires less than a page, it may start in the middle of a page, but will never run off the end of a page as shown in figure 2. I have not reviewed the OFA driver to see what it does. ________________________________ From: SANTOSH SINGH [santosh.s2 at samsung.com] Sent: March 19, 2013 8:01 PM To: Kong, Kwok; Wilcox, Matthew R Cc: Onufryk, Peter; technical at nvmexpress.org Subject: Re: RE: NVMe 1.1 : Clarification on PRP2 Entry list Hi Kwok, Matthew, I verified the OFA driver too and that prepares the PRP List entry as in fig-2. Any reason why both the drivers(Linux and OFA) prepares the PRP2 list like fig-2. Will it change in the later version of drivers as fig-1. Regards Santosh ------- Original Message ------- Sender : Kong, Kwok Date : Mar 20, 2013 01:01 (GMT+09:00) Title : RE: NVMe 1.1 : Clarification on PRP2 Entry list I believe the specification is very clear to indicate that figure 1 is correct. "The last entry within a memory page, as indicated by the memory page size in the CC.MPS field, shall be a PRP List pointer if there is more than a single memory page of data to be transferred.". Thanks -Kwok From: Onufryk, Peter [mailto:Peter.Onufryk at idt.com] Sent: Tuesday, March 19, 2013 7:22 AM To: Santosh Singh; technical at nvmexpress.org Subject: RE: NVMe 1.1 : Clarification on PRP2 Entry list Santosh, John and I discussed this and we believe that Figure 1 is correct and that the best way to clarify this is by adding a figure showing it to the spec. 
This will be the first item on the agenda in this week's calls.

Regards,
Peter

From: Santosh Singh [mailto:santosh.s2 at samsung.com]
Sent: Tuesday, March 19, 2013 5:49 AM
To: technical at nvmexpress.org
Subject: NVMe 1.1 : Clarification on PRP2 Entry list

Hi All,

I got a query from a design engineer about PRP2 when it points to an entry list and is not memory page aligned. The paragraph below is from section 4.3 'Physical Region Page Entry and List' of Spec 1.1.

[cid:Z5JE7EUABGFC at namo.co.kr]

PRP entry 2, when pointing to a list, may also have a non-zero offset within a memory page, which means that it is not memory page aligned. The last entry within a memory page shall be a list pointer. There are two possible understandings of this:

[Figures: two candidate layouts for a PRP list addressed by PRP entry 2 at a non-zero page offset.
Fig:1: Entries 1 to 4 in page 1000h-1FFFh, followed by the address of the next PRP list; Entries 5 to 8 in page 5000h-5FFFh.
Fig:2: Entries 1 to 4 in page 1000h-1FFFh, continuing across the page boundary with Entries 5 to 511 up to 2FFFh, followed by the address of the next PRP list; Entries 512 to 514 in page 5000h-5FFFh.]

So I just want to verify with others that, as per the line in section 4.3, 'A physical region page list (PRP List) is a set of PRP entries in a single page of contiguous memory', fig:2 is the correct understanding.
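
The difference between the two readings can be made concrete with a small calculation. This is a minimal sketch in C, assuming a 4 KB memory page, 8-byte PRP entries, an 8-byte-aligned list address, and the Fig:1 reading, in which the last slot of a memory page holds a pointer to the next list whenever more entries remain. The function name prp_list_pages_needed is hypothetical and is not taken from any driver.

```c
#include <stdint.h>

#define PAGE_SIZE       4096u  /* assumed memory page size (4 KB) */
#define PRP_ENTRY_BYTES 8u     /* assumed size of one PRP entry */

/* Hypothetical helper: number of list pages needed to hold `entries`
 * data-page entries when the first list starts at list_addr.
 * Returns 0 if list_addr is not 8-byte aligned (assumed to be invalid). */
unsigned prp_list_pages_needed(uint64_t list_addr, unsigned entries)
{
    unsigned slots;
    unsigned pages = 1;

    if (list_addr % PRP_ENTRY_BYTES != 0)
        return 0;

    /* Slots left between the list start and the end of its page. */
    slots = (unsigned)((PAGE_SIZE - (list_addr % PAGE_SIZE)) / PRP_ENTRY_BYTES);

    while (entries > slots) {
        /* The last slot in this page is used as the pointer to the next list. */
        entries -= slots - 1;
        /* Every following list starts page aligned and has a full page of slots. */
        slots = PAGE_SIZE / PRP_ENTRY_BYTES;
        pages++;
    }
    return pages;
}
```

For a list that starts at 0x1FD8 and must describe 8 entries, the first page has five slots: four entries plus the pointer. The function returns 2, which matches the Fig:1 layout. A page-aligned list with 512 entries needs a single page, since no chaining pointer is required.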
Thanks & Regards
Santosh

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png Type: image/png Size: 81077 bytes Desc: image001.png URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png Type: image/png Size: 126969 bytes Desc: image002.png URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.gif Type: image/gif Size: 14036 bytes Desc: image003.gif URL:

From rrandall at micron.com Fri Apr 5 07:19:31 2013
From: rrandall at micron.com (Robert Randall (rrandall))
Date: Fri, 5 Apr 2013 14:19:31 +0000
Subject: [nvmewin] a little help with QEMU NVMe and Windows
Message-ID: <70C73440F9F7C24F81A11355C292A9B574B2FEFF@NTXBOIMBX04.micron.com>

Hi Paul and the mailing list,

Paul, you might remember me; we met at last year's SNIA SDC. We had a brief discussion regarding getting Micron more involved in the NVMe open source community. I am the Windows driver lead for the PCIe SSD products at Micron.
While we have had success getting QEMU NVMe and Linux working just fine following the guidance provided on the nvmecompliance github project we are struggling to get a Windows setup working properly. I am using Linux as the host OS, compiling qemunvme according to the information on github. While everything builds fine and the guest OS (Win8 Pro) installs fine and runs there is a small problem; the NVMe controller doesn't appear to have a PCIe bus to connect to. When I review the device details of the virtual Win8 machine I can see that there are NO PCIe root complexes available on the machine. However, RWEverything does show that the NVMe controller is present and I suspect that is only because the PCI configuration space is accessible but nothing else is because of the bus type mismatch. It feels like this will not work properly without a PCIe root port device being added to the virtual machine. I've not been able to find any useful help by searching the web. Is there some advice you can provide or some site or email thread you can point me to that will help me resolve this issue? We want to test our Windows NVMe driver in qemunvme but we are stuck. Best regards, Robert. Robert Randall Windows Driver Architect Micron Technologies, Inc. 3001 Broadway St NE Minneapolis, MN 55413-2657 desk: 612.884.2592 mobile: 612.770.9612 rrandall at micron.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From paul.e.luse at intel.com Fri Apr 5 08:54:38 2013 From: paul.e.luse at intel.com (Luse, Paul E) Date: Fri, 5 Apr 2013 15:54:38 +0000 Subject: [nvmewin] a little help with QEMU NVMe and Windows In-Reply-To: <70C73440F9F7C24F81A11355C292A9B574B2FEFF@NTXBOIMBX04.micron.com> References: <70C73440F9F7C24F81A11355C292A9B574B2FEFF@NTXBOIMBX04.micron.com> Message-ID: <82C9F782B054C94B9FC04A331649C77A07C88F08@FMSMSX106.amr.corp.intel.com> Hi Randall, Ray sent a note out recently regarding a required QEMU patch that's coming soon (missing from public repo for some reason), can't find it now but Ray maybe you can shoot an email out once it's been pushed so Randall can try again then? Thx Paul From: Robert Randall (rrandall) [mailto:rrandall at micron.com] Sent: Friday, April 05, 2013 7:20 AM To: Luse, Paul E Cc: Robert Randall (rrandall); nvmewin at lists.openfabrics.org Subject: a little help with QEMU NVMe and Windows Importance: High Hi Paul and the mailing list, Paul, you might remember me; we met at last year's SNIA SDC. We had a brief discussion regarding getting Micron more involved in the NVMe open source community. I am the Windows driver lead for the PCIe SSD products at Micron. While we have had success getting QEMU NVMe and Linux working just fine following the guidance provided on the nvmecompliance github project we are struggling to get a Windows setup working properly. I am using Linux as the host OS, compiling qemunvme according to the information on github. While everything builds fine and the guest OS (Win8 Pro) installs fine and runs there is a small problem; the NVMe controller doesn't appear to have a PCIe bus to connect to. When I review the device details of the virtual Win8 machine I can see that there are NO PCIe root complexes available on the machine. 
However, RWEverything does show that the NVMe controller is present and I suspect that is only because the PCI configuration space is accessible but nothing else is because of the bus type mismatch. It feels like this will not work properly without a PCIe root port device being added to the virtual machine. I've not been able to find any useful help by searching the web. Is there some advice you can provide or some site or email thread you can point me to that will help me resolve this issue? We want to test our Windows NVMe driver in qemunvme but we are stuck. Best regards, Robert. Robert Randall Windows Driver Architect Micron Technologies, Inc. 3001 Broadway St NE Minneapolis, MN 55413-2657 desk: 612.884.2592 mobile: 612.770.9612 rrandall at micron.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.c.robles at intel.com Mon Apr 8 15:31:19 2013 From: raymond.c.robles at intel.com (Robles, Raymond C) Date: Mon, 8 Apr 2013 22:31:19 +0000 Subject: [nvmewin] NVMe Windows repo is LOCKED - Pushing IDT fix for SNTI changes Message-ID: <49158E750348AA499168FD41D889836061B946F0@FMSMSX105.amr.corp.intel.com> Locking the Windows NVMe repo. Pushing Alex's (IDT) fix for buffer overrun for reads and writes, 0 length reads/writes (excluding READ6/WRITE6), and not surfacing namespaces with metadata enabled. Thanks, Ray [cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles NVM Solutions Group | Internal SSD Engineering Technology & Manufacturing Group Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 1756 bytes Desc: image001.png URL: From raymond.c.robles at intel.com Mon Apr 8 15:41:36 2013 From: raymond.c.robles at intel.com (Robles, Raymond C) Date: Mon, 8 Apr 2013 22:41:36 +0000 Subject: [nvmewin] NVMe Windows repo is UNLOCKED - Pushing IDT fix for SNTI changes Message-ID: <49158E750348AA499168FD41D889836061B94712@FMSMSX105.amr.corp.intel.com> Latest patch from IDT (Read/Write fixes) has been pushed to the trunk. New tag created = IDT_SNTI_Read_Write_Fixes. If anyone has any questions, please feel free to contact me. Thanks, Ray From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Monday, April 08, 2013 3:31 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] NVMe Windows repo is LOCKED - Pushing IDT fix for SNTI changes Locking the Windows NVMe repo. Pushing Alex's (IDT) fix for buffer overrun for reads and writes, 0 length reads/writes (excluding READ6/WRITE6), and not surfacing namespaces with metadata enabled. Thanks, Ray [cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles NVM Solutions Group | Internal SSD Engineering Technology & Manufacturing Group Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 1756 bytes Desc: image001.png URL: From Alex.Chang at idt.com Mon Apr 8 15:45:12 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Mon, 8 Apr 2013 22:45:12 +0000 Subject: [nvmewin] NVMe Windows repo is UNLOCKED - Pushing IDT fix for SNTI changes In-Reply-To: <49158E750348AA499168FD41D889836061B94712@FMSMSX105.amr.corp.intel.com> References: <49158E750348AA499168FD41D889836061B94712@FMSMSX105.amr.corp.intel.com> Message-ID: <548C5470AAD9DA4A85D259B663190D361FFDE8A1@corpmail1.na.ads.idt.com> Thank you very much, Ray. 
Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Monday, April 08, 2013 3:42 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] NVMe Windows repo is UNLOCKED - Pushing IDT fix for SNTI changes Latest patch from IDT (Read/Write fixes) has been pushed to the trunk. New tag created = IDT_SNTI_Read_Write_Fixes. If anyone has any questions, please feel free to contact me. Thanks, Ray From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Monday, April 08, 2013 3:31 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] NVMe Windows repo is LOCKED - Pushing IDT fix for SNTI changes Locking the Windows NVMe repo. Pushing Alex's (IDT) fix for buffer overrun for reads and writes, 0 length reads/writes (excluding READ6/WRITE6), and not surfacing namespaces with metadata enabled. Thanks, Ray [cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles NVM Solutions Group | Internal SSD Engineering Technology & Manufacturing Group Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 1756 bytes Desc: image001.png URL: From Kwok.Kong at idt.com Thu Apr 11 15:42:39 2013 From: Kwok.Kong at idt.com (Kong, Kwok) Date: Thu, 11 Apr 2013 22:42:39 +0000 Subject: [nvmewin] NVMe Windows driver working group meeting note (4-11-13) Message-ID: <05CD7821AE397547A01AC160FBC231474BC920E1@corpmail1.na.ads.idt.com> NVMe OFA Windows Driver Meeting Note (April 11, 2013) Release 1.2 Status ================== - IDT and Intel have done testing with the following Windows version and the driver seems to be working without any problem. 
IDT will do more testing and is on schedule to finish the testing before June. NVMe 1.00e enhancement is expected to be completed in May.
    - Windows 8 64-bit support
    - Windows Server 2008R2 64-bit support
    - Windows Server 2012 64-bit
- LSI has confirmed that they have allocated resources to work on the TRIM command. The TRIM command support is expected to be completed before the end of May.
- Huawei has started development of Hibernation as a boot drive. Huawei should have no problem finishing the development before the end of May.

Release Roadmap for 2013
========================
- There is no change to the roadmap.

Release 1.2 (June)
- Supports the following Windows versions in addition to Windows 7 - 64 bits (IDT)
    - Windows 8 64-bit
    - Windows Server 2008R2 64-bit
    - Windows Server 2012 64-bit
- TRIM command support (LSI)
- NVMe 1.00e enhancement (IDT)
- Hibernation as a boot drive (Huawei)

Release 1.3 (Dec)
- Support additional Windows versions
    - Windows 7 32-bit
    - Windows 8 32-bit
- Windows 8 features:
    - Extended SRB format
    - SMART handling via new Extended SRB format

Features that will not be supported in 2013 (will be reviewed mid-year):

NVMe 1.1 support:
- multi-path
- SGL
- Get/Set feature update
- Autonomous power state transition
- Host Identifier
- Reservation Notification Mask
- Reservation Persistence
- identify structure update
- write zeros command

Other feature:
- End-to-end protection (Server 2012 supports this)

Known problems that will be fixed
=================================
- Not accessing NVMe registers in their native width. (Ray - Intel)
- ModeSense Translation issue. (Dharani - SanDisk)
- format nvm error. (Judy - Samsung)
- Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset. (Judy - Samsung)

SCSI Translation
================
- Yong Chen (Huawei) will represent this working group to work with Microsoft on the SCSI translation.
He is going to report back to this working group and the NVMe WG if he sees any discrepancy between the NVMe SCSI translation recommendation and the Microsoft SCSI compliance testing.

Next Meeting
============
- Next meeting will be in June. Kwok Kong of IDT will set up the meeting in May.

From Dharani.Kotte at sandisk.com Fri Apr 12 09:58:28 2013
From: Dharani.Kotte at sandisk.com (Dharani Kotte)
Date: Fri, 12 Apr 2013 16:58:28 +0000
Subject: [nvmewin] ***UNCHECKED*** [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] New Patch From Sandisk
Message-ID: <23EC73C80FB59046A6B7B8EB7B3826591F27784E@MILMBXIP01.sdcorp.global.sandisk.com>

Hi all,

I am attaching a new patch that includes the following changes:

1. In nvmeSnti.c
   a. Allocate a global buffer of 256 bytes for preparing the mode sense return data.
   b. For any mode sense, this buffer is used for the mode sense data along with the requested pages in it.
   c. Copy the required data, according to the alloc_length requested in the CDB, to the Srb->DataBuffer and update the dataTransferLength accordingly.
   d. In function SntiTranslateReturnAllModePagesResponse(), the mode sense data header offset was calculated incorrectly, which ended in a BSOD in some cases. The code was modified to fix this issue.

Please review the changes and provide feedback if you have any. If nobody disagrees with the changes, I will remind Ray to merge them in two weeks.

Thanks,
Dharani.

________________________________
PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited.
If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: source_sndk_04_12_2013.zip Type: application/x-zip-compressed Size: 168133 bytes Desc: source_sndk_04_12_2013.zip URL: From raymond.c.robles at intel.com Fri Apr 12 11:35:11 2013 From: raymond.c.robles at intel.com (Robles, Raymond C) Date: Fri, 12 Apr 2013 18:35:11 +0000 Subject: [nvmewin] New Patch From Sandisk In-Reply-To: <23EC73C80FB59046A6B7B8EB7B3826591F27784E@MILMBXIP01.sdcorp.global.sandisk.com> References: <23EC73C80FB59046A6B7B8EB7B3826591F27784E@MILMBXIP01.sdcorp.global.sandisk.com> Message-ID: <49158E750348AA499168FD41D889836061B97C65@FMSMSX105.amr.corp.intel.com> Dharani, Thank you for submitting this patch. Can you please provide the password for the zip file? Thanks! LSI/IDT, Let's try to have this patch reviewed by April 16th (2 weeks from now). I will review for Intel and provide any feedback, if necessary. Thanks, Ray From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Dharani Kotte Sent: Friday, April 12, 2013 9:58 AM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] New Patch From Sandisk Hi all, I am attaching a new patch that includes the following changes: 1. In nvmeSnti.c a. Allocate a global buffer of 256bytes size for modesense return data preparation b. For any mode sense this buffer is used for modesense data along with the requested pages in it. c. Copy the required data according to the alloc_length requested in the cdb to the Srb->DataBuffer and update the dataTransferLength accordingly d. 
In function SntiTranslateReturnAllModePagesResponse() the modesense data header offset is calculated wrong which ends up in BSOD in some cases modified code to fix this issue Please review the changes and provide feedbacks if you have any. If nobody disagrees with the changes, I will remind Ray to merge them in two weeks. Thanks, Dharani. ________________________________ PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dharani.Kotte at sandisk.com Fri Apr 12 11:36:10 2013 From: Dharani.Kotte at sandisk.com (Dharani Kotte) Date: Fri, 12 Apr 2013 18:36:10 +0000 Subject: [nvmewin] New Patch From Sandisk In-Reply-To: <49158E750348AA499168FD41D889836061B97C65@FMSMSX105.amr.corp.intel.com> References: <23EC73C80FB59046A6B7B8EB7B3826591F27784E@MILMBXIP01.sdcorp.global.sandisk.com> <49158E750348AA499168FD41D889836061B97C65@FMSMSX105.amr.corp.intel.com> Message-ID: <23EC73C80FB59046A6B7B8EB7B3826591F2778C8@MILMBXIP01.sdcorp.global.sandisk.com> Sorry I forgot, "sndk1234" Thanks, Dharani. From: Robles, Raymond C [mailto:raymond.c.robles at intel.com] Sent: Friday, April 12, 2013 11:35 AM To: Dharani Kotte; nvmewin at lists.openfabrics.org Subject: RE: New Patch From Sandisk Dharani, Thank you for submitting this patch. Can you please provide the password for the zip file? Thanks! 
LSI/IDT, Let's try to have this patch reviewed by April 16th (2 weeks from now). I will review for Intel and provide any feedback, if necessary. Thanks, Ray From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Dharani Kotte Sent: Friday, April 12, 2013 9:58 AM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] New Patch From Sandisk Hi all, I am attaching a new patch that includes the following changes: 1. In nvmeSnti.c a. Allocate a global buffer of 256bytes size for modesense return data preparation b. For any mode sense this buffer is used for modesense data along with the requested pages in it. c. Copy the required data according to the alloc_length requested in the cdb to the Srb->DataBuffer and update the dataTransferLength accordingly d. In function SntiTranslateReturnAllModePagesResponse() the modesense data header offset is calculated wrong which ends up in BSOD in some cases modified code to fix this issue Please review the changes and provide feedbacks if you have any. If nobody disagrees with the changes, I will remind Ray to merge them in two weeks. Thanks, Dharani. ________________________________ PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Alex.Chang at idt.com Fri Apr 12 11:45:27 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Fri, 12 Apr 2013 18:45:27 +0000 Subject: [nvmewin] New Patch From Sandisk In-Reply-To: <49158E750348AA499168FD41D889836061B97C65@FMSMSX105.amr.corp.intel.com> References: <23EC73C80FB59046A6B7B8EB7B3826591F27784E@MILMBXIP01.sdcorp.global.sandisk.com> <49158E750348AA499168FD41D889836061B97C65@FMSMSX105.amr.corp.intel.com> Message-ID: <548C5470AAD9DA4A85D259B663190D361FFDEAB6@corpmail1.na.ads.idt.com> Hi Ray, Do you mean April 26th? Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Friday, April 12, 2013 11:35 AM To: Dharani Kotte; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] New Patch From Sandisk Dharani, Thank you for submitting this patch. Can you please provide the password for the zip file? Thanks! LSI/IDT, Let's try to have this patch reviewed by April 16th (2 weeks from now). I will review for Intel and provide any feedback, if necessary. Thanks, Ray From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Dharani Kotte Sent: Friday, April 12, 2013 9:58 AM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] New Patch From Sandisk Hi all, I am attaching a new patch that includes the following changes: 1. In nvmeSnti.c a. Allocate a global buffer of 256bytes size for modesense return data preparation b. For any mode sense this buffer is used for modesense data along with the requested pages in it. c. Copy the required data according to the alloc_length requested in the cdb to the Srb->DataBuffer and update the dataTransferLength accordingly d. 
In function SntiTranslateReturnAllModePagesResponse() the modesense data header offset is calculated wrong which ends up in BSOD in some cases modified code to fix this issue Please review the changes and provide feedbacks if you have any. If nobody disagrees with the changes, I will remind Ray to merge them in two weeks. Thanks, Dharani. ________________________________ PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.c.robles at intel.com Fri Apr 12 11:50:42 2013 From: raymond.c.robles at intel.com (Robles, Raymond C) Date: Fri, 12 Apr 2013 18:50:42 +0000 Subject: [nvmewin] New Patch From Sandisk In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFDEAB6@corpmail1.na.ads.idt.com> References: <23EC73C80FB59046A6B7B8EB7B3826591F27784E@MILMBXIP01.sdcorp.global.sandisk.com> <49158E750348AA499168FD41D889836061B97C65@FMSMSX105.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFDEAB6@corpmail1.na.ads.idt.com> Message-ID: <49158E750348AA499168FD41D889836061B97C8A@FMSMSX105.amr.corp.intel.com> Yes, good catch... typo on my part. Please have the review complete by April 26th. Thanks, Ray From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Friday, April 12, 2013 11:45 AM To: Robles, Raymond C; Dharani Kotte; nvmewin at lists.openfabrics.org Subject: RE: New Patch From Sandisk Hi Ray, Do you mean April 26th? 
Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Friday, April 12, 2013 11:35 AM To: Dharani Kotte; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] New Patch From Sandisk Dharani, Thank you for submitting this patch. Can you please provide the password for the zip file? Thanks! LSI/IDT, Let's try to have this patch reviewed by April 16th (2 weeks from now). I will review for Intel and provide any feedback, if necessary. Thanks, Ray From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Dharani Kotte Sent: Friday, April 12, 2013 9:58 AM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] New Patch From Sandisk Hi all, I am attaching a new patch that includes the following changes: 1. In nvmeSnti.c a. Allocate a global buffer of 256bytes size for modesense return data preparation b. For any mode sense this buffer is used for modesense data along with the requested pages in it. c. Copy the required data according to the alloc_length requested in the cdb to the Srb->DataBuffer and update the dataTransferLength accordingly d. In function SntiTranslateReturnAllModePagesResponse() the modesense data header offset is calculated wrong which ends up in BSOD in some cases modified code to fix this issue Please review the changes and provide feedbacks if you have any. If nobody disagrees with the changes, I will remind Ray to merge them in two weeks. Thanks, Dharani. ________________________________ PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. 
If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).

From raymond.c.robles at intel.com Thu Apr 18 14:46:29 2013
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Thu, 18 Apr 2013 21:46:29 +0000
Subject: [nvmewin] Handling pending commands when processing Format [changing from NVMe WG dist. list to OFA NVMe Windows Driver dist. list]
Message-ID: <49158E750348AA499168FD41D88983606254536E@ORSMSX152.amr.corp.intel.com>

Hello Judy/Wonmoon,

Sorry for the late response. The “hot remove” state is just a state that we enter when the driver receives a Format command. Basically, this state removes the namespace(s) from the topology by calling StorPortNotification() with BusChangeDetected. This removes the “SCSI target/disk” associated with each namespace from the OS (because Storport will re-enumerate the controller and the driver will not expose the namespaces about to be formatted) so that the format can occur on the relevant namespaces. By signaling Windows that the namespaces have been “removed”, all I/O will be stopped by the OS. Then the format can complete.

Once the format is complete, we perform the opposite action to “hot add” the namespace back into the topology by calling StorPortNotification() with BusChangeDetected… only this time, we will surface the namespace(s) again when Storport re-enumerates. No queues are deleted in this state, no memory is de-allocated, and nothing else changes about the namespace.
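The hot remove / hot add idea described here can be sketched in plain C. This is a minimal simulation under stated assumptions, not the driver's code: `Namespace`, `notify_bus_change()`, `count_reported_luns()`, `hot_remove()` and `hot_add()` are hypothetical names, and the stub only stands in for the real `StorPortNotification()` call with `BusChangeDetected`, which needs the WDK and cannot run stand-alone.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NAMESPACES 4

/* Hypothetical per-namespace bookkeeping; not the driver's real structure. */
typedef struct {
    unsigned nsid;
    bool     exposed;   /* reported to the OS as a SCSI LUN when true */
} Namespace;

static Namespace g_ns[MAX_NAMESPACES] = {
    { 1, true }, { 2, true }, { 3, true }, { 4, true }
};

/* Counts how many times the OS was asked to rescan the bus. */
static int g_bus_change_events = 0;

/* Stub standing in for StorPortNotification() with BusChangeDetected. */
static void notify_bus_change(void)
{
    g_bus_change_events++;
}

/* What a rescan would find: only namespaces the driver still exposes. */
static int count_reported_luns(void)
{
    int n = 0;
    for (int i = 0; i < MAX_NAMESPACES; i++) {
        if (g_ns[i].exposed) {
            n++;
        }
    }
    return n;
}

/* Flip the visibility of one namespace, then ask for a rescan. */
static void set_exposed(unsigned nsid, bool exposed)
{
    assert(nsid >= 1 && nsid <= MAX_NAMESPACES);
    g_ns[nsid - 1].exposed = exposed;
    notify_bus_change();
}

static void hot_remove(unsigned nsid) { set_exposed(nsid, false); }
static void hot_add(unsigned nsid)    { set_exposed(nsid, true);  }
```

Calling `hot_remove(2)` leaves three namespaces visible to a rescan, and `hot_add(2)` restores all four. Nothing else about the namespace changes in between, which mirrors the point that no queues are deleted and no memory is freed.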
This is simply the first step (in a 3-step sequence) when formatting a namespace, as we cannot format a namespace while the OS is aware of its presence and could potentially be sending I/O to a stale namespace config (e.g. a changed LBA/sector size).

Let me know if this answers your question.

Thanks,
Ray

From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Wednesday, April 17, 2013 6:21 AM
To: WONMOON CHEON; 강미경; technical at nvmexpress.org
Subject: RE: RE: Handling pending commands when processing Format

>>Would you elaborate more about the "hot-remove" state? In this state, do you mean that all the IO SQ/CQs are deleted? Or, waiting for completions of all the outstanding IOs?

The IO SQ/CQs are definitely NOT deleted. I would need to look more closely through the driver code to see how IOs previously sent to the namespaces which are marked as OFFLINE are handled/finished/quiesced. There are other folks on this thread who no doubt have more history/intimate knowledge of this driver than I do and who may answer that question more quickly than I can… Also, perhaps this discussion should be moved to the OFA driver forum, since it has turned into a driver-specific thread at this point. What do folks think?

Judy

From: 천원문 [mailto:wm.cheon at samsung.com]
Sent: Wednesday, April 17, 2013 1:20 AM
To: Judy Brock-SSI; 강미경; technical at nvmexpress.org
Subject: Re: RE: Handling pending commands when processing Format

Hi Judy,

Would you elaborate more about the "hot-remove" state? In this state, do you mean that all the IO SQ/CQs are deleted? Or, waiting for completions of all the outstanding IOs?
Thanks, Wonmoon ------- Original Message ------- Sender : Judy Brock-SSI> Date : 2013-04-17 16:39 (GMT+09:00) Title : RE: Handling pending commands when processing Format Hi, I should clarify that it is not the Windows operating system – but rather the Windows OFA NVMe driver - that, from what I can see, does a “hot-remove” of all namespace(s) associated with a device before allowing a format operation to begin; “hot remove” is just the name for an internal state in the driver format nvm state machine. Before beginning the actual format op, the driver internally marks all namespaces associated with the format operation “offline”. It then notifies the OS that there has been a “bus change” event (via an OS-specific API). This in turn will cause the OS to rescan (re-enumerate) the “bus” (the pseudo SCSI bus, that is – we expose NVM namespaces as SCSI luns). Since all the pertinent namespaces have been marked offline internally, the bus rescan won’t detect any valid SCSI luns (because the driver will not report any). Hence from the OS point of view, any SCSI lun(s) previously mapped to the namespace(s) to be formatted will have disappeared/will be unaddressable while the format operation is in progress. Judy From: Judy Brock-SSI Sent: Tuesday, April 16, 2013 8:22 PM To: 'mkkang.kang at samsung.com'; technical at nvmexpress.org Subject: RE: Handling pending commands when processing Format Mikyeong, I haven’t looked at the Linux driver but I know that Windows hot-removes all namespace(s) associated with a device before allowing a format operation to begin. And a namespace can’t be removed while there is IO outstanding to it so that answers your question regarding IOs being completed before format begins. It also answers the question about requests being sent to a namespace while format is in progress – can’t happen. 
Thanks, Judy From: 강미경 [mailto:mkkang.kang at samsung.com] Sent: Tuesday, April 16, 2013 7:07 PM To: technical at nvmexpress.org Subject: Handling pending commands when processing Format Dear All, Format NVM command may change the Namespace repository, and it will be executed out of order like any other commands. Therefore, Format NVM command may affect other commands that are pending execution in the device, if any. 1) How does an OFA/linux driver handle 'Format NVM command'? Does a host make sure that all commands for a particular NSID are completed before sending 'Format NVM command'? 2) If a host driver does not behave like 1) above, how can a device handle other pending commands which were previously submitted in a SQ? It seems like we need an additional status code. e.g. Abort due to Namespace Format 3) Let's suppose that 'Format NVM command' is in progress. If the host driver sends subsequent commands to the namespace being formatted, should the device reply directly with a 'Namespace not Ready'? [1.0e spec] If the device does not reply directly and the format operation takes long time, then, I/O command will timeout and the host may send the reset. But if commands are responded with 'Namespace Not Ready', host may not issue the reset. Therefore, direct reply seems to be needed. [1.1 spec. ECN 001] There is Format progress indicator. The host driver can check format progress any time, therefore, there is no concern about reset during format command. Best Regards, Mikyeong Kang ________________________________ Kang MiKyeong Flash Memory Planning/Enabling Group, Memory Div. SAMSUNG ELECTRONICS, Co., Ltd.. 
Phone: 82-31-208-3857 Mobile: 82-10-9369-0177 E-mail: mkkang.kang at samsung.com

From judy.brock at ssi.samsung.com Thu Apr 18 16:47:24 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Apr 2013 23:47:24 +0000
Subject: [nvmewin] Handling pending commands when processing Format [changing from NVMe WG dist. list to OFA NVMe Windows Driver dist. list]
In-Reply-To: <49158E750348AA499168FD41D88983606254536E@ORSMSX152.amr.corp.intel.com>
References: <49158E750348AA499168FD41D88983606254536E@ORSMSX152.amr.corp.intel.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A512FBDBF5E@SSIEXCH-MB2.ssi.samsung.com>

Hi Ray,

[Ray wrote] Let me know if this answers your question.

I don’t think it does. What I wrote below was, I think, pretty much the same as what you wrote, or at least that was my intention ☺. However, the piece I couldn’t explain (because I haven’t looked into it) is how the driver holds off the beginning of the actual Format NVM operation until whatever old IOs were already in progress for the namespace(s) before the format request was received are completed back to the caller, aborted, or whatever, so that there are no old live requests hanging around, still in the driver, before the format op begins.
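One way a driver could hold off a format until earlier I/O has drained is sketched below in C. This is purely illustrative and is not taken from the OFA driver, whose actual behavior is the open question in this message. `NsIoTracker` and all four functions are hypothetical names, and the sketch is single-threaded; a real driver would need locking or atomics around the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-namespace tracker; not a structure from the driver. */
typedef struct {
    bool     offline;       /* set when a format request arrives */
    unsigned outstanding;   /* commands submitted but not yet completed */
} NsIoTracker;

/* Accept a new I/O unless the namespace has been taken offline. */
static bool ns_submit(NsIoTracker *t)
{
    if (t->offline) {
        return false;       /* new I/O is rejected during the format window */
    }
    t->outstanding++;
    return true;
}

/* Account for one completed (or aborted) command. */
static void ns_complete(NsIoTracker *t)
{
    assert(t->outstanding > 0);
    t->outstanding--;
}

/* First step of a format request: stop accepting new I/O. */
static void ns_begin_format_request(NsIoTracker *t)
{
    t->offline = true;
}

/* The format may start only once old I/O has drained. */
static bool ns_can_start_format(const NsIoTracker *t)
{
    return t->offline && t->outstanding == 0;
}
```

With two commands in flight, `ns_can_start_format()` stays false after `ns_begin_format_request()` until both have been passed to `ns_complete()`. Whether to wait, abort, or simply let old commands finish on their own is exactly the design choice being asked about here.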
In other words, does the driver hold off starting the format cmd till the outstanding IOs are completed? Or do we perhaps just drop them on the floor and let the OS figure out that those requests are permanently lost/gone due to the LUNs having disappeared (my guess is, the latter is what we do)? Or do we try to abort them all? And so on.

So again, we do understand how to get the OS to avoid sending new I/O requests to stale namespaces, but how exactly are the old I/O reqs (those existing at the time the format request comes in) handled? At least that is my current question ☺

Thanks,
Judy

From raymond.c.robles at intel.com Thu Apr 18 18:07:58 2013
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Fri, 19 Apr 2013 01:07:58 +0000
Subject: [nvmewin] Handling pending commands when processing Format [changing from NVMe WG dist. list to OFA NVMe Windows Driver dist. list]
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A512FBDBF5E@SSIEXCH-MB2.ssi.samsung.com>
References: <49158E750348AA499168FD41D88983606254536E@ORSMSX152.amr.corp.intel.com> <36E8D38D6B771A4BBDB1C0D800158A512FBDBF5E@SSIEXCH-MB2.ssi.samsung.com>
Message-ID: <49158E750348AA499168FD41D889836062545566@ORSMSX152.amr.corp.intel.com>

Hi Judy,

Ahhh… I see now. I didn’t answer that question below. The format command is essentially built into the driver by a state machine. When we receive the format command we immediately issue the *hot remove* command… but that is done inline. So, once we call Storport to kick off the enumeration, we simply return back to handling the format command in NVMeStartIoProcessIoctl(). The appropriate states are set along the way to indicate progress. Once the namespace is removed from the “OS view”, then the format is processed like any other command (via ProcessIo).
The callback is set up to call NVMeIoctlFormatNVMCAllback() and the variable “FormatNvmInfo->AddNamespaceNeeded” is set to TRUE so that on the completion side we remember to have the OS re-enumerate after we are done. Once the NVM format completes, the callback is invoked in the completion DPC. Then on the completion side we issue Identify Controller and Identify Namespace so that our cached driver data for the namespace(s) formatted are up to date. In the last state, after getting the Identify Namespace struct, we’ll call *hot add*, which is described below.

Note that at no point do we “wait” for any I/O to finish. Format is a dangerous command… especially via pass through IOCTL. We talked about this quite a bit in the beginning of developing this driver. But essentially, if a format comes down for a namespace, any I/O outstanding to the controller (there won’t be anything that needs to be sent… I/O will either be at the device or on the CQ) will simply complete via normal operation or be aborted at the controller… but Storport won’t care because the SCSI target was already removed upon initially receiving the format command. Any I/O in the CQ will be completed and handled by Storport correctly.

Thanks,
Ray

From raymond.c.robles at intel.com Fri Apr 19 17:28:28 2013
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Sat, 20 Apr 2013 00:28:28 +0000
Subject: [nvmewin] New NVMe Supported QEMU Version Uploaded to OFA NVMEWIN Repo
Message-ID: <49158E750348AA499168FD41D889836062545F76@ORSMSX152.amr.corp.intel.com>

Hello,

I've uploaded a new revision of QEMU that emulates the NVMe device as a PCIe device (instead of a PCI device). Plus this revision has a few minor modifications. The new tar ball is located under the qemu directory.

Thanks,
Ray

Raymond C. Robles
NVM Solutions Group | Internal SSD Engineering
Technology & Manufacturing Group
Intel Corporation
Desk: 480.554.2600
Mobile: 480.399.0645

From judy.brock at ssi.samsung.com Mon Apr 22 23:27:03 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Tue, 23 Apr 2013 06:27:03 +0000
Subject: [nvmewin] Handling pending commands when processing Format [changing from NVMe WG dist. list to OFA NVMe Windows Driver dist. list]
In-Reply-To: <49158E750348AA499168FD41D889836062545566@ORSMSX152.amr.corp.intel.com>
References: <49158E750348AA499168FD41D88983606254536E@ORSMSX152.amr.corp.intel.com> <36E8D38D6B771A4BBDB1C0D800158A512FBDBF5E@SSIEXCH-MB2.ssi.samsung.com> <49158E750348AA499168FD41D889836062545566@ORSMSX152.amr.corp.intel.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A512FBED70F@SSIEXCH-MB2.ssi.samsung.com>

Hi Ray,

Thanks for all the details. The stuff about hot adding the namespaces back in, reissuing Identify Controller/Identify Device, etc was very easy to follow in the driver; it’s clear how & why that’s done.
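The format sequence Ray describes in this thread (hot remove done inline, Format NVM submitted like any other command, then Identify Controller, Identify Namespace and hot add on the completion side) can be sketched as a small state machine in C. This is a simplified model under stated assumptions, not the driver's implementation: `FormatState`, `FormatCtx`, `format_start()` and `format_on_completion()` are hypothetical names, the `add_namespace_needed` field only mirrors the idea of `FormatNvmInfo->AddNamespaceNeeded`, and a counter stands in for the Storport bus change notification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states; the driver's own state names are not shown here. */
typedef enum {
    FMT_IDLE,
    FMT_FORMAT_SENT,          /* Format NVM submitted like any other command */
    FMT_IDENTIFY_CTRL_SENT,   /* refreshing cached controller data */
    FMT_IDENTIFY_NS_SENT,     /* refreshing cached namespace data */
    FMT_DONE                  /* namespace surfaced to the OS again */
} FormatState;

typedef struct {
    FormatState state;
    bool        add_namespace_needed;     /* remember to hot add when done */
    int         bus_change_notifications; /* stand-in for Storport notifications */
} FormatCtx;

/* IOCTL path: hot remove happens inline, then the format is submitted. */
static void format_start(FormatCtx *c)
{
    assert(c->state == FMT_IDLE);
    c->bus_change_notifications++;        /* hot remove */
    c->add_namespace_needed = true;
    c->state = FMT_FORMAT_SENT;
}

/* Completion path: called once for each command that finishes. */
static void format_on_completion(FormatCtx *c)
{
    switch (c->state) {
    case FMT_FORMAT_SENT:
        c->state = FMT_IDENTIFY_CTRL_SENT;
        break;
    case FMT_IDENTIFY_CTRL_SENT:
        c->state = FMT_IDENTIFY_NS_SENT;
        break;
    case FMT_IDENTIFY_NS_SENT:
        if (c->add_namespace_needed) {
            c->bus_change_notifications++;    /* hot add */
            c->add_namespace_needed = false;
        }
        c->state = FMT_DONE;
        break;
    default:
        break;                            /* nothing pending in other states */
    }
}
```

Three completions after `format_start()` walk the context to `FMT_DONE` with two bus change notifications in total, one for the hot remove and one for the hot add. Note that nothing in this flow waits for earlier I/O, matching the statement that the driver never waits for any I/O to finish.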
However, I’m still having some difficulty with the issue of finishing up outstanding IOs before starting the format. I may just be covering ground the group covered a long time ago, but I’m wondering if all the assumptions below are valid. I’m also interested in knowing how the group validated them. Here are my concerns:

>> if a format comes down for a namespace, any I/O outstanding to the controller (there won’t be anything that needs to be sent… I/O will either be at the device or on the CQ)

Why do we think there won’t be anything that needs to be sent? The app that sends a pass through IOCTL to do the format is presumably completely independent from, say, any file IO that might be going on on behalf of other apps, or even raw IO from apps like Iometer. It seems like lots of IO could still be coming in when the format IOCTL is received.

>> any I/O outstanding to the controller … will simply complete via normal operation or be aborted at the controller… but Storport won’t care because the SCSI target was already removed upon initially receiving the format command. Any I/O in the CQ will be completed and handled by Storport correctly.

I am wondering about two of the assumptions about timing in the paragraph above.

a) I don’t think it’s true that Storport can handle commands completing for a device it no longer has a record of, i.e. after the target was removed due to re-enumeration. I think it will have torn down its own structures for any old LUN(s) we had previously exposed, and that would include any record it had of commands outstanding for those old LUNs. I think it isn’t going to hold on to ghost requests on behalf of devices that it no longer has a record of, because at that point it has no place to store such requests and no object to associate them with.
b) We are assuming that when we call StorPortNotification() with BusChangeDetected, Storport will come back into the driver to rescan the bus (either via a bunch of Inquiry cmds or via a Report Luns cmd) and will finish all the work associated with the bus scan, updating its record of device topology (i.e. removing any SCSI target/LUNs that were associated with the NVMe device we are about to format), all before it returns from our call to StorPortNotification() and before the driver code continues on to call ProcessIo to start the real Format NVM operation. I don’t know that that is a safe assumption to make. The bus scan could be deferred till the driver returns. Or even if it is launched right away, could the bus scan take place on a different processor while the processor that is running through the driver just continues on its way?

In my experience, it is the driver’s and the controller’s joint responsibility to make sure that all outstanding IOs are completed back to the caller one way or the other before starting the format op. That means aborting whatever can be aborted and also making absolutely sure that no live request left over from a Namespace that has been removed ever gets completed back to the host after the Namespace has been removed.

Thanks,
Judy

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Thursday, April 18, 2013 6:08 PM
To: Judy Brock-SSI; WONMOON CHEON; ???; nvmewin at lists.openfabrics.org
Subject: RE: RE: Handling pending commands when processing Format [changing from NVMe WG dist. list to OFA NVMe Windows Driver dist. list]

Hi Judy,

Ahhh… I see now. I didn’t answer that question below. The format command is essentially built into the driver by a state machine. When we receive the format command we immediately issue the *hot remove* command… but that is done inline. So, once we call Storport to kick off the enumeration, we simply return back to handling the format command in NVMeStartIoProcessIoctl().
The appropriate states are set along the way to indicate progress. Once the namespace is removed from the “OS view”, the format is processed like any other command (via ProcessIo). The callback is set up to call NVMeIoctlFormatNVMCAllback() and the variable “FormatNvmInfo->AddNamespaceNeeded” is set to TRUE so that on the completion side we remember to have the OS re-enumerate after we are done. Once the NVM format completes, the callback is invoked in the completion DPC. Then, on the completion side, we issue Identify Controller and Identify Namespace so that our cached driver data for the formatted namespace(s) is up to date. In the last state, after getting the Identify Namespace struct, we’ll call *hot add*, which is described below.

Note that at no point do we “wait” for any I/O to finish. Format is a dangerous command… especially via pass-through IOCTL. We talked about this quite a bit in the beginning of developing this driver. But essentially, if a format comes down for a namespace, any I/O outstanding to the controller (there won’t be anything that needs to be sent… I/O will either be at the device or on the CQ) will simply complete via normal operation or be aborted at the controller… but Storport won’t care because the SCSI target was already removed upon initially receiving the format command. Any I/O in the CQ will be completed and handled by Storport correctly.

Thanks,
Ray

From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Thursday, April 18, 2013 4:47 PM
To: Robles, Raymond C; WONMOON CHEON; ???; nvmewin at lists.openfabrics.org
Subject: RE: RE: Handling pending commands when processing Format [changing from NVMe WG dist. list to OFA NVMe Windows Driver dist. list]

Hi Ray,

[Ray wrote] Let me know if this answers your question.

I don’t think it does. What I wrote below was, I think, pretty much the same as what you wrote, or at least that was my intention ☺.
However, the piece I couldn’t explain (because I haven’t looked into it) is how the driver holds off the beginning of the actual Format NVM operation until whatever old IOs were already in progress for the namespace(s) before the format request was received are completed back to the caller, aborted, or whatever, so that there are no old live requests hanging around, still in the driver, before the format op begins. In other words, does the driver hold off starting the format cmd until the outstanding IOs are completed? Or do we perhaps just drop them on the floor and let the OS figure out that those requests are permanently lost/gone due to the LUNs having disappeared (my guess is that the latter is what we do)? Or do we try to abort them all? And so on.

So again, we do understand how to get the OS to avoid sending new I/O requests to stale namespaces, but how exactly are the old I/O requests (those existing at the time the format request comes in) handled? At least that is my current question ☺

Thanks,
Judy

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Thursday, April 18, 2013 2:46 PM
To: Judy Brock-SSI; WONMOON CHEON; ???; nvmewin at lists.openfabrics.org
Subject: RE: RE: Handling pending commands when processing Format [changing from NVMe WG dist. list to OFA NVMe Windows Driver dist. list]

Hello Judy/Wonmoon,

Sorry for the late response. The “hot remove” state is just a state that we enter when the driver receives a Format command. Basically, this state will remove the namespace(s) from the topology by calling StorPortNotification() with BusChangeDetected. This will remove the “SCSI target/disk” associated with each namespace from the OS (because Storport will re-enumerate the controller and the driver will not expose the namespaces about to be formatted) so that the format can occur on the relevant namespaces. By signaling Windows that the namespaces have been “removed”, all I/O will be stopped by the OS. Then the format can complete.
Once the format is complete, we perform the opposite action to “hot add” the namespace back into the topology by calling StorPortNotification() with BusChangeDetected… only this time, we will surface the namespace(s) again when Storport re-enumerates.

No queues are deleted in this state, no memory is de-allocated, and nothing else changes about the namespace. This is simply the first step (in a 3-step sequence) when formatting a namespace, as we cannot format a namespace while the OS is aware of its presence and could potentially be sending I/O to a stale namespace config (i.e., changing LBA/sector size).

Let me know if this answers your question.

Thanks,
Ray

From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Wednesday, April 17, 2013 6:21 AM
To: WONMOON CHEON; 강미경; technical at nvmexpress.org
Subject: RE: RE: Handling pending commands when processing Format

>> Would you elaborate more about the "hot-remove" state? In this state, do you mean that all the IO SQ/CQs are deleted? Or, waiting for completions of all the outstanding IOs?

The IO SQ/CQs are definitely NOT deleted. I would need to look more closely through the driver code to see how IOs previously sent to the namespaces which are marked as OFFLINE are handled/finished/quiesced. There are other folks on this thread who no doubt have more history/intimate knowledge of this driver than I do and who may answer that question more quickly than I can… Also, perhaps this discussion should be moved to the OFA driver forum since it has turned into a driver-specific thread at this point. What do folks think?

Judy

From: 천원문 [mailto:wm.cheon at samsung.com]
Sent: Wednesday, April 17, 2013 1:20 AM
To: Judy Brock-SSI; 강미경; technical at nvmexpress.org
Subject: Re: RE: Handling pending commands when processing Format

Hi Judy,

Would you elaborate more about the "hot-remove" state? In this state, do you mean that all the IO SQ/CQs are deleted? Or, waiting for completions of all the outstanding IOs?
Thanks,
Wonmoon

------- Original Message -------
Sender : Judy Brock-SSI
Date : 2013-04-17 16:39 (GMT+09:00)
Title : RE: Handling pending commands when processing Format

Hi,

I should clarify that it is not the Windows operating system but rather the Windows OFA NVMe driver that, from what I can see, does a “hot-remove” of all namespace(s) associated with a device before allowing a format operation to begin; “hot remove” is just the name for an internal state in the driver's Format NVM state machine. Before beginning the actual format op, the driver internally marks all namespaces associated with the format operation “offline”. It then notifies the OS that there has been a “bus change” event (via an OS-specific API). This in turn will cause the OS to rescan (re-enumerate) the “bus” (the pseudo SCSI bus, that is; we expose NVM namespaces as SCSI LUNs). Since all the pertinent namespaces have been marked offline internally, the bus rescan won’t detect any valid SCSI LUNs (because the driver will not report any). Hence, from the OS point of view, any SCSI LUN(s) previously mapped to the namespace(s) to be formatted will have disappeared/will be unaddressable while the format operation is in progress.

Judy

From: Judy Brock-SSI
Sent: Tuesday, April 16, 2013 8:22 PM
To: 'mkkang.kang at samsung.com'; technical at nvmexpress.org
Subject: RE: Handling pending commands when processing Format

Mikyeong,

I haven’t looked at the Linux driver, but I know that Windows hot-removes all namespace(s) associated with a device before allowing a format operation to begin. And a namespace can’t be removed while there is IO outstanding to it, so that answers your question regarding IOs being completed before format begins. It also answers the question about requests being sent to a namespace while format is in progress: that can’t happen.
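The hot-remove behaviour described in this thread (mark the namespaces offline, signal a bus change, report no LUNs for them during the rescan) can be sketched in plain C as below. This is only an illustration under stated assumptions: the types and names (`Adapter`, `NamespaceEntry`, `hot_remove`, `hot_add`, `report_luns`) are hypothetical and are not the driver's, and a simple counter stands in for the OS bus-change notification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NAMESPACES 4 /* illustrative limit, not taken from the driver */

/* Hypothetical per-namespace record; the real driver's structures differ. */
typedef struct {
    uint32_t nsid;    /* NVMe namespace ID */
    bool     present; /* namespace exists on the controller */
    bool     offline; /* hidden from the OS while a format is in progress */
} NamespaceEntry;

typedef struct {
    NamespaceEntry ns[MAX_NAMESPACES];
    unsigned       bus_change_count; /* stand-in for the OS notification */
} Adapter;

/* Flip the offline flag and record that a rescan was requested. */
static bool set_namespace_offline(Adapter *a, uint32_t nsid, bool offline)
{
    for (size_t i = 0; i < MAX_NAMESPACES; i++) {
        if (a->ns[i].present && a->ns[i].nsid == nsid) {
            a->ns[i].offline = offline;
            a->bus_change_count++;
            return true;
        }
    }
    return false; /* unknown namespace: nothing changed */
}

/* "Hot remove": hide the namespace before the format starts. */
static bool hot_remove(Adapter *a, uint32_t nsid)
{
    return set_namespace_offline(a, nsid, true);
}

/* "Hot add": surface the namespace again after the format completes. */
static bool hot_add(Adapter *a, uint32_t nsid)
{
    return set_namespace_offline(a, nsid, false);
}

/* Rescan handler: report only namespaces that are present and online. */
static size_t report_luns(const Adapter *a, uint32_t *out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_NAMESPACES && n < cap; i++) {
        if (a->ns[i].present && !a->ns[i].offline) {
            out[n++] = a->ns[i].nsid;
        }
    }
    return n;
}
```

The sketch covers only what the OS sees during a rescan. It says nothing about requests that are already in flight when the namespace is hidden, which is the question still open in this thread.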
Thanks,
Judy

From: 강미경 [mailto:mkkang.kang at samsung.com]
Sent: Tuesday, April 16, 2013 7:07 PM
To: technical at nvmexpress.org
Subject: Handling pending commands when processing Format

Dear All,

The Format NVM command may change the namespace repository, and it will be executed out of order like any other command. Therefore, the Format NVM command may affect other commands that are pending execution in the device, if any.

1) How does an OFA/Linux driver handle the Format NVM command? Does a host make sure that all commands for a particular NSID are completed before sending the Format NVM command?

2) If a host driver does not behave as in 1) above, how can a device handle other pending commands which were previously submitted in an SQ? It seems like we need an additional status code, e.g. "Abort due to Namespace Format".

3) Let's suppose that a Format NVM command is in progress. If the host driver sends subsequent commands to the namespace being formatted, should the device reply directly with 'Namespace Not Ready'?

[1.0e spec] If the device does not reply directly and the format operation takes a long time, then the I/O command will time out and the host may send a reset. But if commands are answered with 'Namespace Not Ready', the host may not issue the reset. Therefore, a direct reply seems to be needed.

[1.1 spec, ECN 001] There is a format progress indicator. The host driver can check format progress at any time; therefore, there is no concern about a reset during the format command.

Best Regards,
Mikyeong Kang

Kang MiKyeong
Flash Memory Planning/Enabling Group, Memory Div.
SAMSUNG ELECTRONICS, Co., Ltd.
Phone: 82-31-208-3857
Mobile: 82-10-9369-0177
E-mail: mkkang.kang at samsung.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 34869 bytes
Desc: image001.jpg
URL:

From Dharani.Kotte at sandisk.com Tue Apr 30 08:36:34 2013
From: Dharani.Kotte at sandisk.com (Dharani Kotte)
Date: Tue, 30 Apr 2013 15:36:34 +0000
Subject: [nvmewin] New Patch From Sandisk
Message-ID: <23EC73C80FB59046A6B7B8EB7B38265923248B63@MILMBXIP02.sdcorp.global.sandisk.com>

Hi all,

It's been almost two weeks since I sent out the patch. Please let us know if you're okay with it.

Thanks,
Dharani.

From: Dharani Kotte
Sent: Friday, April 12, 2013 9:58 AM
To: nvmewin at lists.openfabrics.org
Subject: New Patch From Sandisk

Hi all,

I am attaching a new patch that includes the following changes:

1. In nvmeSnti.c
   a. Allocate a global buffer of 256 bytes for preparing the mode sense return data.
   b. For any mode sense, this buffer is used for the mode sense data along with the requested pages in it.
   c. Copy the required data, according to the alloc_length requested in the CDB, to the Srb->DataBuffer and update the dataTransferLength accordingly.
   d. In function SntiTranslateReturnAllModePagesResponse(), the mode sense data header offset is calculated incorrectly, which ends up in a BSOD in some cases; modified the code to fix this issue.

Please review the changes and provide feedback if you have any. If nobody disagrees with the changes, I will remind Ray to merge them in two weeks.

Thanks,
Dharani.

________________________________
PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited.
If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).

From Alex.Chang at idt.com Tue Apr 30 09:22:57 2013
From: Alex.Chang at idt.com (Chang, Alex)
Date: Tue, 30 Apr 2013 16:22:57 +0000
Subject: [nvmewin] New Patch From Sandisk
In-Reply-To: <23EC73C80FB59046A6B7B8EB7B38265923248B63@MILMBXIP02.sdcorp.global.sandisk.com>
References: <23EC73C80FB59046A6B7B8EB7B38265923248B63@MILMBXIP02.sdcorp.global.sandisk.com>
Message-ID: <548C5470AAD9DA4A85D259B663190D361FFE04D2@corpmail1.na.ads.idt.com>

Hi Dharani,

I'd suggest that it's safer to initialize gModeSenseBuf to all zeros each time before it is used, just in case there is some unexpected data in the buffer. Other than that, I am fine with the patch.

Thanks,
Alex

From Dharani.Kotte at sandisk.com Tue Apr 30 10:08:32 2013
From: Dharani.Kotte at sandisk.com (Dharani Kotte)
Date: Tue, 30 Apr 2013 17:08:32 +0000
Subject: [nvmewin] New Patch From Sandisk
In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFE04D2@corpmail1.na.ads.idt.com>
References: <23EC73C80FB59046A6B7B8EB7B38265923248B63@MILMBXIP02.sdcorp.global.sandisk.com> <548C5470AAD9DA4A85D259B663190D361FFE04D2@corpmail1.na.ads.idt.com>
Message-ID: <23EC73C80FB59046A6B7B8EB7B38265923248B9F@MILMBXIP02.sdcorp.global.sandisk.com>

Sure.

Thanks,
Dharani.

From Dharani.Kotte at sandisk.com Tue Apr 30 11:14:07 2013
From: Dharani.Kotte at sandisk.com (Dharani Kotte)
Date: Tue, 30 Apr 2013 18:14:07 +0000
Subject: [nvmewin] ***UNCHECKED*** [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] New Patch From Sandisk
In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFE04D2@corpmail1.na.ads.idt.com>
References: <23EC73C80FB59046A6B7B8EB7B38265923248B63@MILMBXIP02.sdcorp.global.sandisk.com> <548C5470AAD9DA4A85D259B663190D361FFE04D2@corpmail1.na.ads.idt.com>
Message-ID: <23EC73C80FB59046A6B7B8EB7B38265923248BD1@MILMBXIP02.sdcorp.global.sandisk.com>

Attaching the code updated according to Alex's suggestion.

Passwd: sndk1234

Thanks,
Dharani.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: source_sndk_04_30_2013.zip
Type: application/x-zip-compressed
Size: 168169 bytes
Desc: source_sndk_04_30_2013.zip
URL:

From raymond.c.robles at intel.com Tue Apr 30 11:52:19 2013
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Tue, 30 Apr 2013 18:52:19 +0000
Subject: [nvmewin] New Patch From Sandisk
In-Reply-To: <23EC73C80FB59046A6B7B8EB7B38265923248BD1@MILMBXIP02.sdcorp.global.sandisk.com>
References: <23EC73C80FB59046A6B7B8EB7B38265923248B63@MILMBXIP02.sdcorp.global.sandisk.com> <548C5470AAD9DA4A85D259B663190D361FFE04D2@corpmail1.na.ads.idt.com> <23EC73C80FB59046A6B7B8EB7B38265923248BD1@MILMBXIP02.sdcorp.global.sandisk.com>
Message-ID: <49158E750348AA499168FD41D88983606254CB45@ORSMSX152.amr.corp.intel.com>

Hi Dharani,

Thank you for putting this patch together. Please see my feedback below:

- Line 52: Note that we have a file called nvmeSntiTypes.h. This file contains all #defines. As per our existing convention, all #defines should be placed in this file (versus at the top of a .c file).
- Line 3339/3412/3502/3593/3756/5391/5433/5442/5453: Our coding convention states that no line be longer than 80 characters. These lines all exceed 80 characters.
- Line 5410: SntiTranslateAllModePagesResponse() contains new if-else blocks. Our coding convention dictates that open curly braces be placed on the same line as the if or else statement. You can refer to other if-else clauses in the code as examples.
- Line 4017: SntiCreateModeDataHeader() no longer needs the SRB pointer to be passed in as a parameter since you are using a global temp buffer to build up the mode sense data. That parameter can be removed.
- General: The method of using a global buffer for a single mode sense command opens up a potential race condition for multiple mode sense commands submitted on top of each other. The idea behind putting the SCSI to NVMe translation phase in the HwStorBuildIo phase was to help performance. We initially knew that we would never be sharing any data structures or memory in calls to BuildIo. There is no spinlock or protection for calls to BuildIo, and MSDN makes it clear that these BuildIo calls are not synchronized and that any synchronization of memory/data must be done within the function implementations, or else StartIo should be used instead (since Storport will acquire the StartIoSpinLock). So there is a hole here w/r/t the global buffer being populated for multiple mode sense commands at the same time. I believe you have the right idea, but the correct solution is to create a temp mode sense buffer inside the SRB extension. This way, each command has its own temp mode sense buffer for building the mode sense response, and then, when ready to copy over to the SRB data buffer, it's pulling from a buffer that can only be accessed by the current command (instead of a global buffer that may have been overwritten by another mode sense request from another BuildIo call). Note that the driver collaboration group made the decision to never acquire any type of lock in BuildIo. Let me know if you have questions on this solution. The additional size to the SRB extension is not an issue. This fix is critical for this patch to be accepted.

Thanks,
Ray
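The per-command buffer idea from Ray's review can be sketched in plain C as follows. This is a minimal illustration, not the driver's code: `RequestExtension` and `build_mode_sense` are hypothetical names, the real SRB extension layout and Storport types are not used, and only the 256-byte size comes from the patch description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODE_SENSE_BUF_SIZE 256 /* size quoted in the patch description */

/* Hypothetical stand-in for a per-request (SRB) extension. */
typedef struct {
    uint8_t mode_sense_buf[MODE_SENSE_BUF_SIZE]; /* private scratch space */
    size_t  mode_sense_len;                      /* bytes built so far */
} RequestExtension;

/*
 * Build a mode sense response in the request's own scratch buffer.
 * No buffer is shared between requests, so two unsynchronized callers
 * cannot overwrite each other's data and no lock is required.
 */
static size_t build_mode_sense(RequestExtension *ext,
                               const uint8_t *pages, size_t pages_len)
{
    /* Zero first so stale bytes never leak into the response. */
    memset(ext->mode_sense_buf, 0, sizeof(ext->mode_sense_buf));

    if (pages_len > MODE_SENSE_BUF_SIZE) {
        pages_len = MODE_SENSE_BUF_SIZE; /* clamp to the scratch size */
    }
    if (pages_len > 0) {
        memcpy(ext->mode_sense_buf, pages, pages_len);
    }
    ext->mode_sense_len = pages_len;
    return pages_len;
}
```

With one `RequestExtension` per command, a second mode sense request that arrives while the first is still being built writes into its own extension and leaves the first untouched. The zeroing step also covers Alex's earlier suggestion to clear the buffer before each use.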
From Dharani.Kotte at sandisk.com Tue Apr 30 12:03:25 2013
From: Dharani.Kotte at sandisk.com (Dharani Kotte)
Date: Tue, 30 Apr 2013 19:03:25 +0000
Subject: [nvmewin] New Patch From Sandisk
In-Reply-To: <49158E750348AA499168FD41D88983606254CB45@ORSMSX152.amr.corp.intel.com>
References: <23EC73C80FB59046A6B7B8EB7B38265923248B63@MILMBXIP02.sdcorp.global.sandisk.com> <548C5470AAD9DA4A85D259B663190D361FFE04D2@corpmail1.na.ads.idt.com> <23EC73C80FB59046A6B7B8EB7B38265923248BD1@MILMBXIP02.sdcorp.global.sandisk.com> <49158E750348AA499168FD41D88983606254CB45@ORSMSX152.amr.corp.intel.com>
Message-ID: <23EC73C80FB59046A6B7B8EB7B38265923248C18@MILMBXIP02.sdcorp.global.sandisk.com>

Thank you for the feedback. I agree that the global buffer is not good, and I have made the changes to allocate this buffer from the srbext. I will provide the code with the requested modifications soon.

Thanks,
Dharani.
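Item (c) of the patch description (copy only as much mode sense data as the CDB's allocation length allows, then report that length as the transfer length) can be sketched in plain C as below. The function and parameter names are hypothetical; the real driver works on the SRB's data buffer and transfer-length fields rather than on bare pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a prepared mode sense response into the caller's data buffer.
 * The copy is limited by three things: how much data was built, the
 * allocation length from the CDB, and the size of the destination.
 * Returns the number of bytes copied, which the caller would report
 * as the data transfer length.
 */
static uint32_t copy_mode_sense(uint8_t *dst, uint32_t dst_len,
                                const uint8_t *src, uint32_t built_len,
                                uint32_t alloc_length)
{
    uint32_t n = built_len;

    if (n > alloc_length) {
        n = alloc_length; /* initiator asked for less than we built */
    }
    if (n > dst_len) {
        n = dst_len;      /* never write past the destination buffer */
    }
    if (n > 0) {
        memcpy(dst, src, n);
    }
    return n;
}
```

Clamping against the destination size as well as the allocation length is a defensive choice in this sketch; whether the patch does the same is not stated in the thread.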