From Alex.Chang at idt.com Tue Jul 2 16:04:55 2013
From: Alex.Chang at idt.com (Chang, Alex)
Date: Tue, 2 Jul 2013 23:04:55 +0000
Subject: [nvmewin] Intel Byte Enable Patch
In-Reply-To: <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com>
References: <6B4557D9CF036C4E8F9D6C561818DABB365CE5AD@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4A66@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D764C@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4A88@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D7721@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4B26@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D774C@FMSMSX112.amr.corp.intel.com>,
 <548C5470AAD9DA4A85D259B663190D361FFF4B41@corpmail1.na.ads.idt.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4B8E@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com>
Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com>

Hi Kris,

When I started to build the patch with WDK 7600, it gave me an error and complained about:

#if (_WIN32_WINNT > _WIN32_WINNT_WIN7) AND defined(_WIN64)

Can you change it to:

#if (_WIN32_WINNT > _WIN32_WINNT_WIN7) && defined(_WIN64)

Thanks,
Alex
________________________________
From: Murray, Kris R [mailto:kris.r.murray at intel.com]
Sent: Friday, June 28, 2013 11:04 AM
To: nvmewin at lists.openfabrics.org
Cc: Chang, Alex
Subject: RE: Intel Byte Enable Patch

See attached zip file with password: intel1234

In summary, the patch changes every reference to the Capabilities register into a read of the entire 64-bit register. This is to avoid any Byte Enabled traffic that may be generated across the PCIe bus. The registers are read using 'StorPortReadRegisterUlong()' except on Windows 8 builds with a 64-bit platform.
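As a rough illustration of the approach summarized above (read the whole register through aligned 32-bit accesses, then pick fields out of a local copy), here is a minimal C sketch. It is not the patch itself: `read_reg32` is a hypothetical stand-in for a StorPort register read so that the code can run outside a driver, and all helper names are invented for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit MMIO read such as
 * StorPortReadRegisterUlong(); here it only dereferences a pointer so the
 * sketch can run outside a driver. */
static uint32_t read_reg32(const volatile uint32_t *reg)
{
    return *reg;
}

/* Read a 64-bit register as two aligned 32-bit accesses (low dword first)
 * and combine them into one local value. */
static uint64_t read_reg64_as_two_dwords(const volatile uint32_t *reg)
{
    uint64_t lo = read_reg32(&reg[0]);
    uint64_t hi = read_reg32(&reg[1]);
    return (hi << 32) | lo;
}

/* Extract a bit field from the local copy; no further bus access occurs.
 * width must be between 1 and 63 for the mask computation to be valid. */
static uint64_t extract_field(uint64_t value, unsigned shift, unsigned width)
{
    return (value >> shift) & ((UINT64_C(1) << width) - 1);
}
```

Assuming the NVMe CAP layout (MQES in bits 15:0, DSTRD in bits 35:32), `extract_field(cap, 0, 16)` and `extract_field(cap, 32, 4)` would then return those fields from the copy, so the compiler has no reason to emit sub-dword loads against the device.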
Thanks,
~Kris

From: Chang, Alex [mailto:Alex.Chang at idt.com]
Sent: Thursday, June 27, 2013 7:14 PM
To: Murray, Kris R; nvmewin at lists.openfabrics.org
Subject: RE: Intel Byte Enable Patch

Hi Kris,

The patch from SanDisk has been pushed. You may go ahead and rebase and resend your patch for review when you're ready.

Thanks,
Alex
________________________________
From: Chang, Alex
Sent: Thursday, June 27, 2013 5:11 PM
To: Murray, Kris R; nvmewin at lists.openfabrics.org
Subject: RE: Intel Byte Enable Patch

You may refer to the wrapping used for the TRIM command support code. We will release two binary packages: one built with WDK 7600 for Windows 7, Windows Server 2008 R2 and Windows Server 2012, and the other built with WDK 8/VS2012 for Windows 8, where the API is compiled in.

Alex
________________________________
From: Murray, Kris R [mailto:kris.r.murray at intel.com]
Sent: Thursday, June 27, 2013 4:59 PM
To: Chang, Alex; nvmewin at lists.openfabrics.org
Subject: RE: Intel Byte Enable Patch

We're using WDK 8.0 and compiling integrated with Visual Studio 2012. I did try my old WDK 7600 compiler and it reported 'StorPortReadRegisterUlong64' as undefined. How about wrapping it with (NTDDI_VERSION >= NTDDI_WIN8)?

~Kris

From: Chang, Alex [mailto:Alex.Chang at idt.com]
Sent: Thursday, June 27, 2013 4:45 PM
To: Murray, Kris R; nvmewin at lists.openfabrics.org
Subject: RE: Intel Byte Enable Patch

Per this link, it says so:
http://msdn.microsoft.com/en-us/library/windows/hardware/hh967741(v=vs.85).aspx

I checked the storport.h that comes with WDK 7600; it does not define StorPortReadRegisterUlong64. Which WDK version do you compile with?

Thanks,
Alex
________________________________
From: Murray, Kris R [mailto:kris.r.murray at intel.com]
Sent: Thursday, June 27, 2013 4:36 PM
To: Chang, Alex; nvmewin at lists.openfabrics.org
Subject: RE: Intel Byte Enable Patch

Are you sure? When we looked at the definition in storport.h (and wdm.h) it doesn't appear to be wrapped in any OS version compile switches. Also, I tested it on Windows 7 in QEMU and it ran without error. I verified in WinDbg that correct values were read using this function.

From: Chang, Alex [mailto:Alex.Chang at idt.com]
Sent: Thursday, June 27, 2013 1:44 PM
To: Murray, Kris R; nvmewin at lists.openfabrics.org
Subject: RE: Intel Byte Enable Patch

Sounds good to me. However, the API is only available on Windows 8. You might want to add:

#if _WIN32_WINNT > _WIN32_WINNT_WIN7

Thanks,
Alex
________________________________
From: Murray, Kris R [mailto:kris.r.murray at intel.com]
Sent: Thursday, June 27, 2013 1:32 PM
To: Chang, Alex; nvmewin at lists.openfabrics.org
Subject: RE: Intel Byte Enable Patch

Correct, Ray was going to use an array for the doorbell registers but it turns out that didn't need to happen. I saw that after I submitted, but decided to wait till the rebase to remove it from the patch.

Another change I plan to make is adding a compile option for _WIN64 that will use the StorPortReadRegisterUlong64(...) function instead of 2x 32-bit reads.

Thanks,
~Kris

From: Chang, Alex [mailto:Alex.Chang at idt.com]
Sent: Thursday, June 27, 2013 1:25 PM
To: Murray, Kris R; nvmewin at lists.openfabrics.org
Subject: RE: Intel Byte Enable Patch

Hi Kris,

After reviewing your patch, I notice that the variable "IODB" (lines 900 and 1013 in nvmeinit.c) is declared and initialized but never used. I think you meant to read back the initial value of the doorbell pointer with it?

Thanks,
Alex
________________________________
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Murray, Kris R
Sent: Wednesday, June 12, 2013 4:45 PM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] ***UNCHECKED*** Intel Byte Enable Patch

All,

Attached is the patch to fix issues where accessing memory-mapped controller register fields directly would generate single-byte accesses across the PCIe bus. The fix is to call the StorPort functions to read those registers. The 4 places this is done are NVMeFindAdapter, NVMeInitCplQueue, NVMeSubQueue, and NVMeInitCallback.

Password: intel1234

Testing done using IOMeter and SCSI Compliance with logs attached.

Please review and provide feedback in the next couple of weeks. Upon acceptance I'll rebase after the other patches make it through.

Thanks,
Kris

From kris.r.murray at intel.com Wed Jul 3 07:45:52 2013
From: kris.r.murray at intel.com (Murray, Kris R)
Date: Wed, 3 Jul 2013 14:45:52 +0000
Subject: [nvmewin] ***UNCHECKED*** RE: Intel Byte Enable Patch
In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com>
References: <6B4557D9CF036C4E8F9D6C561818DABB365CE5AD@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4A66@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D764C@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4A88@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D7721@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4B26@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D774C@FMSMSX112.amr.corp.intel.com>,
 <548C5470AAD9DA4A85D259B663190D361FFF4B41@corpmail1.na.ads.idt.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4B8E@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com>
Message-ID: <6B4557D9CF036C4E8F9D6C561818DABB365D87D4@FMSMSX112.amr.corp.intel.com>

Odd, in my WDK 7600 it builds just fine. I even verified the correct function calls in disassembly when running it on QEMU. What is the error you see?

In any case, I made the change and verified no compile errors.

Password is intel1234

~Kris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Intel_ByteEnable_Patch.zip
Type: application/x-zip-compressed
Size: 168379 bytes
Desc: Intel_ByteEnable_Patch.zip

From barrett.n.mayes at intel.com Wed Jul 3 09:57:25 2013
From: barrett.n.mayes at intel.com (Mayes, Barrett N)
Date: Wed, 3 Jul 2013 16:57:25 +0000
Subject: [nvmewin] Intel Byte Enable Patch
In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com>
References: <6B4557D9CF036C4E8F9D6C561818DABB365CE5AD@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4A66@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D764C@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4A88@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D7721@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4B26@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D774C@FMSMSX112.amr.corp.intel.com>,
 <548C5470AAD9DA4A85D259B663190D361FFF4B41@corpmail1.na.ads.idt.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4B8E@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com>
Message-ID:

I fell behind a bit in looking at these patches, so I apologize for making this comment late.

_WIN32_WINNT is a user-mode #define. If you want to switch on the version of the WDK environment you're using to build drivers, the check should be against NTDDI_VERSION.
For example:

#if (NTDDI_VERSION >= NTDDI_WIN8)

NTDDI_WIN* are defined in \include\shared\sdkddkver.h

#define NTDDI_WIN6 0x06000000
#define NTDDI_VISTA NTDDI_WIN6
#define NTDDI_WIN7 0x06010000
#define NTDDI_WIN8 0x06020000
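The NTDDI_VERSION check suggested in Barrett's message can be sketched in plain C as follows. This is only an illustration under stated assumptions: the `#ifndef` fallbacks exist so it compiles outside the WDK, `USE_64BIT_REGISTER_READ` and `reads_per_64bit_register` are hypothetical names, and combining the version check with `defined(_WIN64)` mirrors the condition discussed in the thread rather than the final patch.

```c
/* Values as quoted from sdkddkver.h in the thread; the #ifndef fallbacks
 * exist only so this sketch compiles outside the WDK. */
#ifndef NTDDI_WIN7
#define NTDDI_WIN7 0x06010000
#endif
#ifndef NTDDI_WIN8
#define NTDDI_WIN8 0x06020000
#endif

/* Assumption for the sketch: pretend we target Windows 7 when no WDK
 * build environment has set NTDDI_VERSION. */
#ifndef NTDDI_VERSION
#define NTDDI_VERSION NTDDI_WIN7
#endif

/* Hypothetical macro: 1 when a single 64-bit register read may be compiled
 * in, 0 when the code must fall back to two 32-bit reads. */
#if (NTDDI_VERSION >= NTDDI_WIN8) && defined(_WIN64)
#define USE_64BIT_REGISTER_READ 1
#else
#define USE_64BIT_REGISTER_READ 0
#endif

/* Hypothetical helper: number of bus reads the selected path would issue
 * for one 64-bit register. */
static int reads_per_64bit_register(void)
{
    return USE_64BIT_REGISTER_READ ? 1 : 2;
}
```

Keeping the decision in one macro means the call sites need only a single `#if USE_64BIT_REGISTER_READ`, and a WDK 7600 build never sees a reference to the 64-bit read function.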
From Alex.Chang at idt.com Wed Jul 3 09:59:55 2013
From: Alex.Chang at idt.com (Chang, Alex)
Date: Wed, 3 Jul 2013 16:59:55 +0000
Subject: [nvmewin] Intel Byte Enable Patch
In-Reply-To: <6B4557D9CF036C4E8F9D6C561818DABB365D87D4@FMSMSX112.amr.corp.intel.com>
References: <6B4557D9CF036C4E8F9D6C561818DABB365CE5AD@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4A66@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D764C@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4A88@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D7721@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4B26@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D774C@FMSMSX112.amr.corp.intel.com>,
 <548C5470AAD9DA4A85D259B663190D361FFF4B41@corpmail1.na.ads.idt.com>
 <548C5470AAD9DA4A85D259B663190D361FFF4B8E@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com>
 <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com>
 <6B4557D9CF036C4E8F9D6C561818DABB365D87D4@FMSMSX112.amr.corp.intel.com>
Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF6A7E@corpmail1.na.ads.idt.com>

When compiling, it complains error C2220: warning treated as error.

Alex
URL: From Alex.Chang at idt.com Wed Jul 3 10:53:38 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Wed, 3 Jul 2013 17:53:38 +0000 Subject: [nvmewin] Intel Byte Enable Patch In-Reply-To: References: <6B4557D9CF036C4E8F9D6C561818DABB365CE5AD@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4A66@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D764C@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4A88@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D7721@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4B26@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D774C@FMSMSX112.amr.corp.intel.com>, <548C5470AAD9DA4A85D259B663190D361FFF4B41@corpmail1.na.ads.idt.com> <548C5470AAD9DA4A85D259B663190D361FFF4B8E@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com> Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF6A9D@corpmail1.na.ads.idt.com> Thank you, Barrett. Since I will add a patch before our release by the end of July, your suggestion will be included in that patch. Regards, Alex ________________________________ From: Mayes, Barrett N [mailto:barrett.n.mayes at intel.com] Sent: Wednesday, July 03, 2013 9:57 AM To: Chang, Alex; Murray, Kris R; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch I fell behind a bit in looking at these patches so I apologize for making this comment late. _WIN32_WINNT is a user-mode #define. If you want to switch on the version of the WDK environment you're using to build drivers, the check should against the NTDDI_VERSION. 
For example: #if (NTDDI_VERSION >= NTDDI_WIN8) NTDDI_WIN* are defined in \include\shared\sdkddkver.h #define NTDDI_WIN6 0x06000000 #define NTDDI_VISTA NTDDI_WIN6 #define NTDDI_WIN7 0x06010000 #define NTDDI_WIN8 0x06020000 From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Chang, Alex Sent: Tuesday, July 02, 2013 4:05 PM To: Murray, Kris R; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] Intel Byte Enable Patch Hi Kris, When I started to build the patch via WDK 7600, it gave me error and complained: #if (_WIN32_WINNT > _WIN32_WINNT_WIN7) AND defined(_WIN64) Can you change it to: #if (_WIN32_WINNT > _WIN32_WINNT_WIN7) && defined(_WIN64) Thanks, Alex ________________________________ From: Murray, Kris R [mailto:kris.r.murray at intel.com] Sent: Friday, June 28, 2013 11:04 AM To: nvmewin at lists.openfabrics.org Cc: Chang, Alex Subject: RE: Intel Byte Enable Patch See attached zip file with password: intel1234 In summary, the patch change is whenever the Capabilities register is referenced that is replaced with reading the entire 64-bit register. This is to avoid any Byte Enabled traffic that may be generated across the PCIe bus. The registers are read using 'StorPortReadRegisterUlong()' except on Windows 8 builds with a 64-bit platform. Thanks, ~Kris From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, June 27, 2013 7:14 PM To: Murray, Kris R; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch Hi Kris, The patch from SanDisk had been pushed. You may go ahead re-base and re-send out your patch for review when you're ready. Thanks, Alex ________________________________ From: Chang, Alex Sent: Thursday, June 27, 2013 5:11 PM To: Murray, Kris R; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch You may refer to the wrapping for TRIM command support codes. We will release two binary packages. 
One built with WDK 7600 for Windows 7, Windows Server 2008 R2 and Windows Server 2012. The other one built with WDK 8/VS2012 for Windows 8, where the API is compiled in. Alex ________________________________ From: Murray, Kris R [mailto:kris.r.murray at intel.com] Sent: Thursday, June 27, 2013 4:59 PM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch We're using WDK 8.0 and compiling integrated with Visual Studio 2012. I did try my old WDK 7600 compiler and it gave the error for 'storPortReadRegisterUlong64' as undefined. How about wrapping it with (NTDDI_VERSION >= NTDDI_WIN8)? ~Kris From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, June 27, 2013 4:45 PM To: Murray, Kris R; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch Per this link, it says so: http://msdn.microsoft.com/en-us/library/windows/hardware/hh967741(v=vs.85).aspx I checked the storport.h coming with WDK 7600, it does not define StorPortReadRegisterUlong64. Which WDK version you compile with? Thanks, Alex ________________________________ From: Murray, Kris R [mailto:kris.r.murray at intel.com] Sent: Thursday, June 27, 2013 4:36 PM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch Are you sure? When we looked at the definition in storport.h (and wdm.h) it doesn't appear to be wrapped in any OS version compile switches. Also, I tested it on Windows 7 in QEMU and it ran without error. I verified in WinDbg that correct values were read using this function. From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, June 27, 2013 1:44 PM To: Murray, Kris R; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch Sounds good to me. However, the API is only available on Windows 8. 
You might want to add: #if _WIN32_WINNT > _WIN32_WINNT_WIN7 Thanks, Alex ________________________________ From: Murray, Kris R [mailto:kris.r.murray at intel.com] Sent: Thursday, June 27, 2013 1:32 PM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch Correct, Ray was going to use an array for the doorbell registers but it turns out that didn't need to happen. I saw that after I submitted, but decided to wait till the rebase to remove it from the patch. Another change I plan to make is adding a compile option for _WIN64 that will use the StorPortReadRegisterUlong64(...) function instead of 2x 32-bit reads. Thanks, ~Kris From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, June 27, 2013 1:25 PM To: Murray, Kris R; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch Hi Kris, After reviewing your patch, I notice that the variable "IODB" (lines 900 and 1013 in nvmeinit.c) is declared and initialized but never used. I think you meant to read back the initial value of the doorbell pointer with it? Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Murray, Kris R Sent: Wednesday, June 12, 2013 4:45 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** Intel Byte Enable Patch All, Attached is the patch to fix issues where accessing memory-mapped controller register fields directly would generate single-byte accesses across the PCIe bus. The fix is to call the StorPort functions to read those registers. The 4 places this is done are NVMeFindAdapter, NVMeInitCplQueue, NVMeSubQueue, and NVMeInitCallback. Password: intel1234 Testing was done using IOMeter and SCSI Compliance, with logs attached. Please review and provide feedback in the next couple of weeks. Upon acceptance I'll rebase after the other patches make it through. Thanks, Kris -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Alex.Chang at idt.com Wed Jul 3 14:36:56 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Wed, 3 Jul 2013 21:36:56 +0000 Subject: [nvmewin] Intel Byte Enable Patch In-Reply-To: <6B4557D9CF036C4E8F9D6C561818DABB365D87D4@FMSMSX112.amr.corp.intel.com> References: <6B4557D9CF036C4E8F9D6C561818DABB365CE5AD@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4A66@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D764C@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4A88@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D7721@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4B26@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D774C@FMSMSX112.amr.corp.intel.com>, <548C5470AAD9DA4A85D259B663190D361FFF4B41@corpmail1.na.ads.idt.com> <548C5470AAD9DA4A85D259B663190D361FFF4B8E@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D87D4@FMSMSX112.amr.corp.intel.com> Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF6ADB@corpmail1.na.ads.idt.com> Thanks, Kris. Hi Rick, Could you provide your approval at your earliest convenience if you are fine with the patch? I plan to push it by the end of this week and currently I am testing it on Windows 8 and it works fine on Windows 7 and others. Thanks, Alex ________________________________ From: Murray, Kris R [mailto:kris.r.murray at intel.com] Sent: Wednesday, July 03, 2013 7:46 AM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch Odd, in my WDK 7600 it builds just fine. I even verified the correct function calls in disassembly when running it on QEMU. What is the error you see? In any case, I made the change and verified no compile errors. 
Password is intel1234 ~Kris -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Rick.Knoblaugh at lsi.com Wed Jul 3 17:36:04 2013 From: Rick.Knoblaugh at lsi.com (Knoblaugh, Rick) Date: Wed, 3 Jul 2013 18:36:04 -0600 Subject: [nvmewin] Intel Byte Enable Patch In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFF6ADB@corpmail1.na.ads.idt.com> References: <6B4557D9CF036C4E8F9D6C561818DABB365CE5AD@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4A66@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D764C@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4A88@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D7721@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4B26@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D774C@FMSMSX112.amr.corp.intel.com>, <548C5470AAD9DA4A85D259B663190D361FFF4B41@corpmail1.na.ads.idt.com> <548C5470AAD9DA4A85D259B663190D361FFF4B8E@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D87D4@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF6ADB@corpmail1.na.ads.idt.com> Message-ID: <4565AEA676113A449269C2F3A549520F01382E11CF@cosmail03.lsi.com> Hi Alex, Sorry for the delay. Yes, we are good with this patch. Have a good holiday weekend. Thanks, -Rick From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Chang, Alex Sent: Wednesday, July 03, 2013 2:37 PM To: Murray, Kris R; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] Intel Byte Enable Patch Thanks, Kris. Hi Rick, Could you provide your approval at your earliest convenience if you are fine with the patch? I plan to push it by the end of this week and currently I am testing it on Windows 8 and it works fine on Windows 7 and others. 
Thanks, Alex
From Alex.Chang at idt.com Wed Jul 3 17:44:06 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Thu, 4 Jul 2013 00:44:06 +0000 Subject: [nvmewin] Intel Byte Enable Patch In-Reply-To: <4565AEA676113A449269C2F3A549520F01382E11CF@cosmail03.lsi.com> References: <6B4557D9CF036C4E8F9D6C561818DABB365CE5AD@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4A66@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D764C@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4A88@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D7721@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF4B26@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D774C@FMSMSX112.amr.corp.intel.com>, <548C5470AAD9DA4A85D259B663190D361FFF4B41@corpmail1.na.ads.idt.com> <548C5470AAD9DA4A85D259B663190D361FFF4B8E@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D7A6E@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF69FF@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365D87D4@FMSMSX112.amr.corp.intel.com> <548C5470AAD9DA4A85D259B663190D361FFF6ADB@corpmail1.na.ads.idt.com> <4565AEA676113A449269C2F3A549520F01382E11CF@cosmail03.lsi.com> Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF6B1E@corpmail1.na.ads.idt.com> Thanks, Rick. Alex ________________________________ From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com] Sent: Wednesday, July 03, 2013 5:36 PM To: Chang, Alex; Murray, Kris R; nvmewin at lists.openfabrics.org Subject: RE: Intel Byte Enable Patch Hi Alex, Sorry for the delay. Yes, we are good with this patch. Have a good holiday weekend.
Thanks, -Rick
From Alex.Chang at idt.com Fri Jul 5 15:10:23 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Fri, 5 Jul 2013 22:10:23 +0000 Subject: [nvmewin] NVMe Windows DB Is LOCKED - Pushing Patch From Intel - Byte Enable Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF6B62@corpmail1.na.ads.idt.com> Locking NVMe Windows DB. Thanks, Alex _______________________________________________ nvmewin mailing list nvmewin at lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/nvmewin
From Alex.Chang at idt.com Fri Jul 5 15:40:58 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Fri, 5 Jul 2013 22:40:58 +0000 Subject: [nvmewin] NVMe Windows Repo Is UNLOCKED - Byte Enable Fix Pushed Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF6B73@corpmail1.na.ads.idt.com> Hi all, Latest patch from Intel (Byte Enable Fix) has been pushed to trunk. A new tag has also been created as "byte_enable_fix". If anyone has any questions, please feel free to contact me. Thanks, Alex
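The register-read approach discussed in this thread (read the whole 64-bit Capabilities register through an accessor, using two 32-bit reads by default and a single 64-bit read only on 64-bit Windows 8 builds) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch: read_reg32, read_reg64 and read_cap are hypothetical stand-ins for StorPortReadRegisterUlong, StorPortReadRegisterUlong64 and whatever helper the driver actually uses, and the fallback NTDDI_* definitions are there only so the sketch compiles outside the WDK.

```c
#include <assert.h>
#include <stdint.h>

/* Fallbacks so the sketch builds outside the WDK; a real driver build
 * takes these from sdkddkver.h. The default target is an assumption. */
#ifndef NTDDI_WIN7
#define NTDDI_WIN7 0x06010000
#endif
#ifndef NTDDI_WIN8
#define NTDDI_WIN8 0x06020000
#endif
#ifndef NTDDI_VERSION
#define NTDDI_VERSION NTDDI_WIN7
#endif

/* Hypothetical stand-in for StorPortReadRegisterUlong: one 32-bit read. */
static uint32_t read_reg32(const volatile uint32_t *reg)
{
    assert(reg != 0);
    return *reg;
}

#if (NTDDI_VERSION >= NTDDI_WIN8) && defined(_WIN64)
/* Hypothetical stand-in for StorPortReadRegisterUlong64: one 64-bit read. */
static uint64_t read_reg64(const volatile uint64_t *reg)
{
    assert(reg != 0);
    return *reg;
}
#endif

/* Read a 64-bit register as whole dwords (or one qword), never as
 * individual bytes or bit-fields. */
uint64_t read_cap(const volatile void *cap)
{
#if (NTDDI_VERSION >= NTDDI_WIN8) && defined(_WIN64)
    return read_reg64((const volatile uint64_t *)cap);
#else
    const volatile uint32_t *dw = (const volatile uint32_t *)cap;
    uint64_t lo = read_reg32(dw);      /* lower dword, offset 0 */
    uint64_t hi = read_reg32(dw + 1);  /* upper dword, offset 4 */
    return (hi << 32) | lo;
#endif
}
```

With the default Windows 7 target assumed here, the sketch takes the two-dword path. Note that the guard is spelled with `&&`; the preprocessor has no `AND` operator, which is the build error reported earlier in the thread.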
From Arpit.Patel at lsi.com Sat Jul 6 12:09:54 2013 From: Arpit.Patel at lsi.com (Patel, Arpit) Date: Sat, 6 Jul 2013 13:09:54 -0600 Subject: [nvmewin] (no subject) Message-ID: A
From Alex.Chang at idt.com Mon Jul 8 16:58:48 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Mon, 8 Jul 2013 23:58:48 +0000 Subject: [nvmewin] New Patches? In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFF6B73@corpmail1.na.ads.idt.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6B73@corpmail1.na.ads.idt.com> Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF6C66@corpmail1.na.ads.idt.com> Hi all, The last patch has been pushed to trunk, and I have a patch ready to check in as well. If you have a patch ready for review, please let me know by replying to this message. If I don't hear from you by the end of tomorrow, I will send out my patch then. Thanks, Alex
From Yong.sc.Chen at huawei.com Tue Jul 9 00:42:20 2013 From: Yong.sc.Chen at huawei.com (Yong Chen) Date: Tue, 9 Jul 2013 07:42:20 +0000 Subject: [nvmewin] New Patches? In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFF6C66@corpmail1.na.ads.idt.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6B73@corpmail1.na.ads.idt.com> <548C5470AAD9DA4A85D259B663190D361FFF6C66@corpmail1.na.ads.idt.com> Message-ID: <02EC085151D99A469E06988E94FEBCDB1C42DBBC@dfweml513-mbs.china.huawei.com> I expect to have a patch for review Wednesday. It is the crash dump & hibernation change. From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Chang, Alex Sent: Monday, July 08, 2013 4:59 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] New Patches? Hi all, The last patch has been pushed to trunk, and I have a patch ready to check in as well. If you have a patch ready for review, please let me know by replying to this message. If I don't hear from you by the end of tomorrow, I will send out my patch then.
Thanks, Alex
From Yong.sc.Chen at huawei.com Wed Jul 10 10:45:22 2013 From: Yong.sc.Chen at huawei.com (Yong Chen) Date: Wed, 10 Jul 2013 17:45:22 +0000 Subject: [nvmewin] New Patches? In-Reply-To: <02EC085151D99A469E06988E94FEBCDB1C42DBBC@dfweml513-mbs.china.huawei.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6B73@corpmail1.na.ads.idt.com> <548C5470AAD9DA4A85D259B663190D361FFF6C66@corpmail1.na.ads.idt.com> <02EC085151D99A469E06988E94FEBCDB1C42DBBC@dfweml513-mbs.china.huawei.com> Message-ID: <02EC085151D99A469E06988E94FEBCDB1C42DEF8@dfweml513-mbs.china.huawei.com> Hi, Alex and all, I still have one last issue to resolve and not ready for review. You can go ahead with your patches. Thanks, Yong From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Yong Chen Sent: Tuesday, July 09, 2013 12:42 AM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] New Patches? I expect to have a patch for review Wednesday. It is the crash dump & hibernation change.
From Alex.Chang at idt.com Wed Jul 10 14:12:59 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Wed, 10 Jul 2013 21:12:59 +0000 Subject: [nvmewin] ***UNCHECKED*** New Patch From IDT Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> Hi all, I am attaching a new patch that includes the following changes:
1. Fixed a system crash when powering off the system. It happened when looking for pending requests after the queue entry buffers had already been freed.
2. Replaced the _WIN32_WINNT_WINx compile-time checks with NTDDI_WINx.
The password is idt1234 I have run through all required tests successfully. Please provide your feedback after reviewing it. Rick and Kris, if you're okay with it, please send out your approval as well. Thanks, Alex -------------- next part -------------- A non-text attachment was scrubbed... Name: idt_patch_0710.zip Type: application/x-zip-compressed Size: 171298 bytes Desc: idt_patch_0710.zip URL:
From kris.r.murray at intel.com Thu Jul 11 11:17:01 2013 From: kris.r.murray at intel.com (Murray, Kris R) Date: Thu, 11 Jul 2013 18:17:01 +0000 Subject: [nvmewin] New Patch From IDT In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> Message-ID: <6B4557D9CF036C4E8F9D6C561818DABB365DA99C@FMSMSX112.amr.corp.intel.com> Alex, Change looks good to me. Thanks, ~Kris From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Chang, Alex Sent: Wednesday, July 10, 2013 2:13 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** New Patch From IDT Hi all, I am attaching a new patch that includes the following changes: 1. Fixed a system crash when powering off the system. It happened when looking for pending requests after the queue entry buffers had already been freed. 2.
Replaced WIN32_WINNT_WINx compiling flags with NTDDI_WINx. The password is idt1234 I had run through all required tests successfully. Please provide your feedbacks after reviewing it. Rick and Kris, If you're okay with it, please send out your approval as well. Thanks, Alex
From Rick.Knoblaugh at lsi.com Thu Jul 11 11:19:50 2013 From: Rick.Knoblaugh at lsi.com (Knoblaugh, Rick) Date: Thu, 11 Jul 2013 12:19:50 -0600 Subject: [nvmewin] New Patch From IDT In-Reply-To: <6B4557D9CF036C4E8F9D6C561818DABB365DA99C@FMSMSX112.amr.corp.intel.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365DA99C@FMSMSX112.amr.corp.intel.com> Message-ID: <4565AEA676113A449269C2F3A549520F013839D912@cosmail03.lsi.com> Hi Alex, Looks good to me as well. Thanks, -Rick From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Murray, Kris R Sent: Thursday, July 11, 2013 11:17 AM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] New Patch From IDT Alex, Change looks good to me. Thanks, ~Kris
From Alex.Chang at idt.com Thu Jul 11 13:39:27 2013 From: Alex.Chang at idt.com (Chang, Alex) Date: Thu, 11 Jul 2013 20:39:27 +0000 Subject: [nvmewin] New Patch From IDT In-Reply-To: <4565AEA676113A449269C2F3A549520F013839D912@cosmail03.lsi.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <6B4557D9CF036C4E8F9D6C561818DABB365DA99C@FMSMSX112.amr.corp.intel.com> <4565AEA676113A449269C2F3A549520F013839D912@cosmail03.lsi.com> Message-ID: <548C5470AAD9DA4A85D259B663190D361FFF745C@corpmail1.na.ads.idt.com> Thank you very much, Kris and Rick. I will push it in the trunk later. Alex ________________________________ From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com] Sent: Thursday, July 11, 2013 11:20 AM To: Murray, Kris R; Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: New Patch From IDT Hi Alex, Looks good to me as well. Thanks, -Rick
From PoYen.Chang at pmcs.com Tue Jul 16 14:08:02 2013 From: PoYen.Chang at pmcs.com (Po-Yen Chang) Date: Tue, 16 Jul 2013 14:08:02 -0700 Subject: [nvmewin] NVMe Windows DB Is LOCKed - Pushing Patch From IDT - Invalid Buffer Access When Shutting Down Message-ID: <40A0B8B92CE0F94685A03264958540C4D7CF73@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
From PoYen.Chang at pmcs.com Tue Jul 16 14:35:48 2013 From: PoYen.Chang at pmcs.com (Po-Yen Chang) Date: Tue, 16 Jul 2013 14:35:48 -0700 Subject: [nvmewin] NVMe Windows Repo Is UNLOCKED - Invalid Buffer Access When Shutting Down Fix Pushed Message-ID: <40A0B8B92CE0F94685A03264958540C4D7CF8C@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Hi all, Latest patch from IDT (Fix Invalid Buffer Access When Shutting Down) has been pushed to trunk. A new tag has been created as "fix_inv_buf_access_when_shutting_down". If anyone has any questions, please feel free to contact me. Thanks, Alex
From Yong.sc.Chen at huawei.com Wed Jul 17 10:54:00 2013 From: Yong.sc.Chen at huawei.com (Yong Chen) Date: Wed, 17 Jul 2013 17:54:00 +0000 Subject: [nvmewin] need help to verify Hibernation on your NVMe controller Message-ID: <02EC085151D99A469E06988E94FEBCDB1C42EDBA@dfweml513-mbs.china.huawei.com> Hi, All, I have hibernation change working on the controller I have. But the change still has some issues in waking up after 2nd hibernation. I would like to try out on different hardware, after having done extensive troubleshooting in KD. The hibernation is wrapped up with SCSIOP_SYNCHRONIZE_CACHE (i.e. NVM_Flush) and SHUTDOWN requests and the driver seems to have completed them correctly.
Please ‘r’ me if you have a bootable controller that works with the NVMe Windows reference driver. I won't disclose such info to anyone.

Thanks,
Yong
________________________________
Yong Chen
Storage Architect
华为技术有限公司 Huawei Technologies Co., Ltd
Office: 408-330-5482
Mobile: 425-922-0658
Email: yong.sc.chen at huawei.com
2330 Central Expressway
Santa Clara, CA 95050
http://www.huawei.com

This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 6737 bytes
Desc: image001.jpg
URL:

From Kwok.Kong at pmcs.com Wed Jul 17 13:28:37 2013
From: Kwok.Kong at pmcs.com (Kwok Kong)
Date: Wed, 17 Jul 2013 13:28:37 -0700
Subject: [nvmewin] Have you any patch ready for the view before 1.2 release ?
Message-ID: <40A0B8B92CE0F94685A03264958540C4D7D315@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>

All,

We plan to make a 1.2 release this month. We expect two more patches for this release, one from Yong Chen of Huawei and the other from Judy of Samsung.

Yong is still debugging his feature (hibernation) and I don't think he can make it into the 1.2 release. His feature will be checked in after the 1.2 release. Yong is ok with this.

Judy, are you ready to send out your patch this Friday? If not, then your patch has to be delayed until after the 1.2 release.
I plan to ask Alex to build a 1.2 release candidate for final testing next week. If there is no problem, then a final 1.2 release will be made at the end of July.

Please let me know if you have any questions/comment.

Thanks
-Kwok
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From judy.brock at ssi.samsung.com Wed Jul 17 22:08:05 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Jul 2013 05:08:05 +0000
Subject: [nvmewin] NvmeStartio path critical section handling not protected from NVMe ISR?
In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com>
References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131277267@SSIEXCH-MB3.ssi.samsung.com>

All,

Under Windows Server 2012, I've seen a crash where NVMeStartIo() gets interrupted by our ISR at a time when it's in the middle of manipulating a linked list critical data structure which the ISR then goes on to attempt to manipulate also - which results in a crash.
Below is the call stack - see where I've inserted the comment "<---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK"

2: kd> kc
Call Site
nt!RtlpBreakWithStatusInstruction
nt!KiBugCheckDebugBreak
nt!KeBugCheck2
nt!KeBugCheckEx
nt!KiBugCheckDispatch
nt!KiFastFailDispatch
nt!KiRaiseSecurityCheckFailure
nvme!RtlFailFast
nvme!FatalListEntryError
nvme!RtlpCheckListEntry
nvme!InsertTailList
nvme!NVMeCompleteCmd
nvme!NVMeIsrMsix
nt!KiInterruptDispatch
<---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK
nvme!RemoveHeadList
nvme!NVMeGetCmdEntry
nvme!ProcessIo
nvme!NVMeStartIo
storport!RaidpAdapterContinueScatterGather
hal!HalpAllocateAdapterCallbackV2
hal!IoFreeAdapterChannelV2
hal!HalAllocateAdapterChannelV2
hal!HalBuildScatterGatherListV2
storport!RaUnitStartIo
storport!RaidUnitCompleteRequest
storport!RaidpAdapterRedirectDpcRoutine
nt!KiExecuteAllDpcs
nt!KiRetireDpcList

I looked through the code and noticed we never call StorPortAcquireSpinLock to acquire the InterruptLock to protect us from such pre-emption. Another way to achieve this would be to indicate we run at half-duplex rather than full-duplex, but that would degrade the general performance of the driver.

I'm not sure why we didn't run into this long before now - is there some other re-entrance protection algorithm besides the two above that others are aware of?

If not, I believe we need to fix this asap.

Suggestions:

A. Simplest approach is to lock down all of NVMeStartIo as per below (not tested yet), but we might almost as well run half-duplex if we do this:

1. At the very top of NVMeStartIo:

    /* we should never be holding the interrupt lock upon entry to NVMeStartIo.
     * Acquire the Interrupt Spin Lock to protect against getting hit by our ISR.
     */
    if (NULL == pAdapterExtension->hInterruptLock) {
        StorPortAcquireSpinLock(pAdapterExtension,
                                InterruptLock,
                                NULL,
                                &pAdapterExtension->hInterruptLock);
    } else {
        ASSERT(FALSE);
    }

2. At the very top of IO_StorPortNotification:

    PNVME_DEVICE_EXTENSION pAE = (PNVME_DEVICE_EXTENSION) pHwDeviceExtension;

    /* if we got here from NvmeStartIo we need to release the interrupt lock */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }

3. At the very bottom of NVMeStartIo:

    /* if we didn't release the Interrupt Lock in one of the calls to
     * IO_StorPortNotification above we need to release before we exit NVMEStartIo
     */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }
    return TRUE;
    } /* NVMeStartIo */

B. Better approach is to just lock ProcessIo(). But code exists in that routine which acquires the StartIo lock - we can't take locks out of order or we'll cause deadlock. Right now that code never gets invoked - what was it for? Do we still need it? Can ProcessIo() get called from non-StartIo paths? Can it get called multiple times?

Not having been involved in the initial development of this driver, I would need to study the flow to make sure to respect the StorPort lock acquiring/releasing hierarchy rules at all times. If those conversant in the overall developmental history and architecture of this driver could share their thoughts, that would be great.

Thanks,
Judy
-------------- next part --------------
An HTML attachment was scrubbed...
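[Editorial sketch] The hand-off in suggestion A (acquire at the top of NVMeStartIo, release either in the notification path or at the bottom, but never twice) can be modeled in plain C as below. This is only a user-space illustration under stated assumptions: `mock_adapter`, `mock_acquire_interrupt_lock`, `mock_release_if_held`, `mock_notification` and `mock_start_io` are hypothetical stand-ins, not StorPort or driver functions, and ownership is tracked with a boolean flag instead of testing a lock handle against NULL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter extension (not the driver's type). */
typedef struct {
    bool interrupt_lock_held;  /* ownership flag used instead of a NULL test */
    int  acquire_count;        /* how many times the lock was taken */
    int  release_count;        /* how many times the lock was dropped */
} mock_adapter;

/* Stand-in for taking the interrupt lock at the top of StartIo. */
static void mock_acquire_interrupt_lock(mock_adapter *ae)
{
    assert(!ae->interrupt_lock_held);  /* must not already be held on entry */
    ae->interrupt_lock_held = true;
    ae->acquire_count++;
}

/* Release only if this path still owns the lock, so it is dropped once. */
static void mock_release_if_held(mock_adapter *ae)
{
    if (ae->interrupt_lock_held) {
        ae->interrupt_lock_held = false;  /* clear ownership before releasing */
        ae->release_count++;
    }
}

/* Stand-in for the notification wrapper: drops the lock before completing. */
static void mock_notification(mock_adapter *ae)
{
    mock_release_if_held(ae);
    /* ... the request would be completed back to the port driver here ... */
}

/* Stand-in for StartIo: lock, maybe complete inline, always unlock on exit. */
static bool mock_start_io(mock_adapter *ae, bool completes_inline)
{
    mock_acquire_interrupt_lock(ae);
    if (completes_inline) {
        mock_notification(ae);  /* releases the lock on this path */
    }
    mock_release_if_held(ae);   /* no-op if the notification already released */
    return true;
}
```

Calling `mock_start_io` with either value of `completes_inline` leaves the acquire and release counts equal, which is the invariant steps 1 to 3 above are trying to maintain.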
URL: From judy.brock at ssi.samsung.com Thu Jul 18 00:33:14 2013 From: judy.brock at ssi.samsung.com (Judy Brock-SSI) Date: Thu, 18 Jul 2013 07:33:14 +0000 Subject: [nvmewin] Iometer hang In-Reply-To: <548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com> References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com> <4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com> <49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com> <4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com> <548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com> Message-ID: <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com> Alex, We are seeing the same problem you describe below with IOMeter stopping right after hitting "Start Tests" with 4k sequential writes. To be fair, I haven't tried top of repository tree, I was using TRIM_command_support label revision 72. Did you ever find out what caused this problem? Thanks, Judy From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Chang, Alex Sent: Thursday, June 13, 2013 9:40 AM To: Knoblaugh, Rick; Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] LSI Trim Patch Hi Rick, I did some basic tests like disk formats, SCSICompliance, SDStress and IOMeter. They're all working fine except IOMeter, which I configured as 4Kbyte, sequential writes. IOMeter stops right after hitting "Start Tests" (green flag). Do you see the problem when you tested it? Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Knoblaugh, Rick Sent: Friday, June 07, 2013 6:38 PM To: Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** LSI Trim Patch Hi Ray, Per your request, since we will switch order, moving Intel patch to the number 3 position, I'm sending LSI's Trim patch. 
Password for the attached file is: lsi1234. Also, I have attached a document here that describes what was changed/added. It would be great if everyone can review and please let me know any feedback. Thanks. -Rick From: Robles, Raymond C [mailto:raymond.c.robles at intel.com] Sent: Monday, June 03, 2013 5:12 PM To: Knoblaugh, Rick; nvmewin at lists.openfabrics.org Subject: RE: Sandisk patch delay Hi Rick, That's great news! LSI will be 3rd in line after IDT and Intel. Thanks for the contributions to TRIM. Thanks, Ray From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com] Sent: Monday, June 03, 2013 4:57 PM To: Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: RE: Sandisk patch delay Hi Ray, We also have the patch for Trim. It is ready to send. Please let me know when you would like me to send out. Thanks, -Rick From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Monday, June 03, 2013 4:44 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] Sandisk patch delay Hello, It appears the Sandisk patch for Mode Sense is taking longer than expected. In order to keep things moving along with the OFA driver, I'm going to take the Sandisk patch offline for now until they can resolve their issues. Once they have worked out the kinks, they can re-submit. In the meantime, Alex from IDT has a patch he'd like to push and I also have a patch I'd like to push. Alex, please send your patch out for code review as soon as possible and then I will send out my patch immediately after. Thanks, Ray [cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles NVM Solutions Group | Internal SSD Engineering Technology & Manufacturing Group Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png
Type: image/png
Size: 1756 bytes
Desc: image001.png
URL:

From judy.brock at ssi.samsung.com Thu Jul 18 06:22:04 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Jul 2013 13:22:04 +0000
Subject: [nvmewin] NvmeStartio path critical section handling not protected from NVMe ISR?
References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131277B4C@SSIEXCH-MB3.ssi.samsung.com>

I just thought of another way to handle this problem. Could we not call StorPortSynchronizeAccess() with a pointer back to our ProcessIo() routine? ProcessIo would get called before the call to StorPortSynchronizeAccess() returns, and this would have the effect of guaranteeing synchronization with our ISR. This seems like a much cleaner solution than a lock-acquiring approach.

I still don't know if there are any issues with ProcessIo being called multiple times, from non-StartIo code paths, etc. - that would still need to be looked at.

Thanks,
Judy
-------------- next part --------------
An HTML attachment was scrubbed...
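[Editorial sketch] The StorPortSynchronizeAccess() idea is a callback pattern: the caller hands over a routine plus a context pointer, and the routine runs to completion, synchronized with the interrupt handler, before the call returns. The model below shows only that shape in portable C. `mock_synchronize_access`, `process_io_synced` and `mock_ext` are hypothetical names for illustration; the real StorPort call and the driver's ProcessIo() are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Callback shape: device extension plus an opaque context, returns success. */
typedef bool (*sync_routine)(void *dev_ext, void *context);

/* Hypothetical device extension for this model only. */
typedef struct {
    bool isr_blocked;   /* true while the synchronized routine is running */
    int  free_entries;  /* stand-in for the free command-entry list */
} mock_ext;

/* Stand-in for the port driver's synchronize call: block the "ISR",
 * run the routine to completion, unblock, and pass the result back. */
static bool mock_synchronize_access(void *dev_ext, sync_routine fn, void *ctx)
{
    mock_ext *ext = dev_ext;
    ext->isr_blocked = true;
    bool ok = fn(dev_ext, ctx);
    ext->isr_blocked = false;
    return ok;
}

/* Stand-in for the list manipulation that crashed when interrupted. */
static bool process_io_synced(void *dev_ext, void *context)
{
    mock_ext *ext = dev_ext;
    int *taken = context;

    assert(ext->isr_blocked);   /* the ISR cannot touch the list right now */
    if (ext->free_entries == 0) {
        return false;           /* nothing to hand out; caller must retry */
    }
    ext->free_entries--;        /* safe: no concurrent list manipulation */
    *taken += 1;
    return true;
}
```

One design consequence worth noting: whatever runs inside the synchronized routine holds off the interrupt handler for its whole duration, so only the list manipulation itself would want to live there.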
URL:

From judy.brock at ssi.samsung.com Thu Jul 18 06:34:11 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Jul 2013 13:34:11 +0000
Subject: [nvmewin] NVMeResetBus design
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131277BB2@SSIEXCH-MB3.ssi.samsung.com>

In our testing, we create a situation where we put the NVMe driver under heavy I/O load with Iometer and then cause the device to stop responding. This results in I/O request timeouts, which eventually cause the driver to be called at its HwStorResetBus entry point (NVMeResetBus). I have some feedback on the current architecture of that routine:

1. Among other things, NVMeResetBus schedules a DPC to complete any pending commands. This creates a situation where, upon return from this entry point, there are still cmds outstanding which don't get completed till the DPC runs. According to the WDK, this doesn't appear to be legal - all outstanding cmds have to be completed by the HwStorResetBus routine before it returns:

    HwResetBus
    Pointer to the miniport driver's HwStorResetBus routine, which is a required entry point for all miniport drivers. This member has the same meaning for the Storport version of the HW_INITIALIZATION_DATA structure as it does for the SCSI Port version of the structure. For more information, see the HwResetBus member of HW_INITIALIZATION_DATA (SCSI)

and

    HwScsiResetBus must complete any outstanding requests by calling ScsiPortCompleteRequest with the SrbStatus value SRB_STATUS_BUS_RESET or, for individual SRBs, ScsiPortNotification with this status value.

and

    The port driver pauses all device IO queues for the adapter and then calls the HwStorResetBus routine at IRQL DISPATCH_LEVEL after acquiring the StartIo spin lock. A miniport driver is responsible for completing SRBs received by HwStorStartIo for PathId during this routine and setting their status to SRB_STATUS_BUS_RESET if necessary.

Since HwStorResetBus must finish its work before returning, it can't schedule a DPC to do so later on. The logic which schedules a DPC should be removed.

2. Code should be added to call StorPortPause() to hold off any new requests till StorPortResume() is called.

3. Code should be added to call StorPortSynchronizeAccess() in order to synchronize with HwStorInterrupt. A callback routine in the NVMe driver should also be added for NVMeResetBus to do the synchronized work in. HwStorResetBus is already synchronized with HwStorStartIo since the port driver calls it only after acquiring the StartIo spinlock.

4. We should implement a driver-internal global (per "adapter") flag signifying we are busy with reset processing and thus can't allow new I/O requests to go through to the hardware.

5. Code should be added to call StorPortResume() when all work is complete.

6. We should refer to the WDK-supplied LSI parallel SCSI StorPort miniport sample driver for an example of all of the above.

Thanks,
Judy
-------------- next part --------------
An HTML attachment was scrubbed...
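[Editorial sketch] The ordering proposed in items 1 to 5 above (flag the reset, pause new requests, complete every outstanding command before returning, then resume) can be sketched as a small state model. Everything here is hypothetical and user-space only: `reset_adapter`, `submit_cmd` and `reset_bus` are illustrative names, and the pause, synchronize and resume steps are represented by flags and comments rather than real StorPort calls.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CMDS 4

/* Command slot states for this model; CMD_DONE_BUS_RESET stands in for
 * completing an SRB with a bus-reset status. */
typedef enum { CMD_FREE = 0, CMD_PENDING, CMD_DONE_BUS_RESET } cmd_state;

/* Hypothetical per-adapter state (not the driver's device extension). */
typedef struct {
    cmd_state cmds[MAX_CMDS];
    bool reset_in_progress;  /* item 4: per-adapter "busy with reset" flag */
    bool paused;             /* items 2 and 5: pause/resume stand-in */
} reset_adapter;

/* New I/O is refused while the adapter is paused or resetting. */
static bool submit_cmd(reset_adapter *a, int slot)
{
    if (a->paused || a->reset_in_progress) {
        return false;
    }
    a->cmds[slot] = CMD_PENDING;
    return true;
}

/* Completes all pending commands before returning; nothing is deferred. */
static int reset_bus(reset_adapter *a)
{
    int completed = 0;

    a->reset_in_progress = true;  /* item 4 */
    a->paused = true;             /* item 2: hold off new requests */

    /* Item 3: in a real miniport this loop would run synchronized with
     * the interrupt handler. Item 1: it finishes before we return. */
    for (int i = 0; i < MAX_CMDS; i++) {
        if (a->cmds[i] == CMD_PENDING) {
            a->cmds[i] = CMD_DONE_BUS_RESET;
            completed++;
        }
    }

    a->paused = false;            /* item 5: resume once all work is done */
    a->reset_in_progress = false;
    return completed;
}
```

The return value is only there so the model can be checked; the point is that no command is still pending when `reset_bus` returns.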
URL:

From PoYen.Chang at pmcs.com Thu Jul 18 08:56:43 2013
From: PoYen.Chang at pmcs.com (Po-Yen Chang)
Date: Thu, 18 Jul 2013 08:56:43 -0700
Subject: [nvmewin] Iometer hang
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com>
References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com>
 <4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com>
 <49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com>
 <4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com>
 <548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com>
 <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com>
Message-ID: <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>

Hi Judy,

Could you please let me know how you built the binary and which operating system you ran on? There are some compatibility issues here, and that's why we need to release a separate binary package for Windows 8, where TRIM is enabled.

Thanks,
Alex
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 1756 bytes
Desc: image001.png
URL:

From judy.brock at ssi.samsung.com Thu Jul 18 09:01:02 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Jul 2013 16:01:02 +0000
Subject: [nvmewin] Iometer hang
In-Reply-To: <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com>
 <4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com>
 <49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com>
 <4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com>
 <548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com>
 <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com>
 <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131277CB6@SSIEXCH-MB3.ssi.samsung.com>

I ran on Windows Server 2012 and the binary (not built by me) was built with VS 2012/Win 8 WDK. Can you explain what you mean by compatibility issues? Did you isolate the root cause of the hang?
Thanks,
Judy
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 1756 bytes
Desc: image001.png
URL:

From PoYen.Chang at pmcs.com Thu Jul 18 09:18:49 2013
From: PoYen.Chang at pmcs.com (Po-Yen Chang)
Date: Thu, 18 Jul 2013 09:18:49 -0700
Subject: [nvmewin] Iometer hang
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A5131277CB6@SSIEXCH-MB3.ssi.samsung.com>
References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com>
 <4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com>
 <49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com>
 <4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com>
 <548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com>
 <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com>
 <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
 <36E8D38D6B771A4BBDB1C0D800158A5131277CB6@SSIEXCH-MB3.ssi.samsung.com>
Message-ID: <40A0B8B92CE0F94685A03264958540C4DEBFDA@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>

Judy,

No, I haven't had the chance to get to the bottom of it, and Yong from Huawei promised to find out more information from the Microsoft contacts he has. Here are the suggested methods to build the binary for different operating systems:

For Windows 7, Server 2008 R2, Server 2012, Windows 8 (TRIM disabled):
    Within the WDK 7600 build environment, or
    Within VS 2012 when configured for Windows 7 in Project Property.

For Windows 8 (TRIM enabled):
    Within VS 2012 when configured for Windows 8 in Project Property.
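[Editorial sketch] The two build configurations above differ in the target OS version the compiler sees, and the thread mentions that the driver's version checks were moved to the NTDDI_WINx macros. A minimal illustration of that kind of compile-time gate follows. The macro values are the ones published in the Windows SDK header sdkddkver.h; the fallback to a Windows 7 target and the helper `trim_compiled_in` are assumptions made only so this example builds outside a WDK environment, and are not taken from the driver source.

```c
#include <assert.h>

/* Values as published in sdkddkver.h; defined here only if the build
 * environment has not already provided them. */
#ifndef NTDDI_WIN7
#define NTDDI_WIN7 0x06010000
#endif
#ifndef NTDDI_WIN8
#define NTDDI_WIN8 0x06020000
#endif

/* Assumption for this sketch: default to a Windows 7 target when the
 * build environment has not set NTDDI_VERSION. */
#ifndef NTDDI_VERSION
#define NTDDI_VERSION NTDDI_WIN7
#endif

/* Hypothetical helper, not a function from the driver: reports whether
 * Windows 8-only code paths (such as TRIM support) were compiled in. */
static int trim_compiled_in(void)
{
#if (NTDDI_VERSION >= NTDDI_WIN8)
    return 1;   /* Windows 8 target: the newer API may be referenced */
#else
    return 0;   /* Windows 7 class target: Windows 8 code is left out */
#endif
}
```

Building the same file with the target version set to NTDDI_WIN8 (for example through the project's target OS setting) would flip the result to 1 without any source change, which is the effect the two build methods above rely on.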
Thanks,
Alex

________________________________
From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Thursday, July 18, 2013 9:01 AM
To: Po-Yen Chang; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Iometer hang

I ran on Windows Server 2012 and the binary (not built by me) was built with VS 2012/Win 8 WDK. Can you explain what you mean by compatibility issues? Did you isolate the root cause of the hang?

Thanks,
Judy

From: Po-Yen Chang [mailto:PoYen.Chang at pmcs.com]
Sent: Thursday, July 18, 2013 8:57 AM
To: Judy Brock-SSI; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Iometer hang

Hi Judy,

Could you please let me know how you built the binary and which operating system you ran on? There are some compatibility issues here and that's why we need to release a separate binary package for Windows 8, where TRIM is enabled.

Thanks,
Alex

________________________________
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Judy Brock-SSI
Sent: Thursday, July 18, 2013 12:33 AM
To: Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: [nvmewin] Iometer hang

Alex,

We are seeing the same problem you describe below with IOMeter stopping right after hitting "Start Tests" with 4k sequential writes. To be fair, I haven't tried top of repository tree, I was using TRIM_command_support label revision 72. Did you ever find out what caused this problem?

Thanks,
Judy

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Chang, Alex
Sent: Thursday, June 13, 2013 9:40 AM
To: Knoblaugh, Rick; Robles, Raymond C; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] LSI Trim Patch

Hi Rick,

I did some basic tests like disk formats, SCSICompliance, SDStress and IOMeter.
They're all working fine except IOMeter, which I configured as 4Kbyte, sequential writes. IOMeter stops right after hitting "Start Tests" (green flag). Do you see the problem when you tested it?

Thanks,
Alex

From judy.brock at ssi.samsung.com Thu Jul 18 09:36:19 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Jul 2013 16:36:19 +0000
Subject: [nvmewin] Iometer hang
In-Reply-To: <40A0B8B92CE0F94685A03264958540C4DEBFDA@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com><49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com><548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277CB6@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFDA@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131277D90@SSIEXCH-MB3.ssi.samsung.com>

Hi Alex,

I'm still confused.

1. I thought the suggested method for building for Windows Server 2012 (TRIM enabled) should be the same as for Windows 8 (TRIM enabled) but that is not what it says below - there is no mention of building for Windows Server 2012 w/TRIM enabled.
Is it a mistake below or is it really the case that we are suggesting that folks need to disable the TRIM code in the Server 2012 environment as it says below? Because if we build for Server 2012 using either WDK 7600 build environment or within VS 2012 when configured for Windows 7 in Project Property, we are disabling the TRIM code in that environment, correct?

2. When you say "I haven't got the chance to get to the bottom of it and Yong from Huawei promised to find out more information from Microsoft contacts he has.", get to the bottom of what? Why are we asking MS about an Iometer hang? Unless perhaps it's the case that you see this problem with Server 2012 (TRIM enabled) but not Windows 8 (TRIM enabled)....is that the case? Please clarify why we think this is an OS issue or why MS may have light to shed.

3. Which versions of the OS did you see this problem on?

4. When I tried to debug the problem using a checked build version of the driver, I got a whole different symptom. In fact that's when I saw the crash which showed NVMe ISR interrupting NVMEStartIo critical section due to lack of synchronization between the two paths. So while it's good to have discovered that particular hole in the driver, I was unable to debug the Iometer hang because it only manifested with the free build driver binary. Does anyone know how to configure the build to generate free build symbols? It won't be as painless but I could debug the free build with the help of a symbol file at least - and maybe a mixed assembly/source listing and a map file...

Thanks,
Judy

From PoYen.Chang at pmcs.com Thu Jul 18 10:00:54 2013
From: PoYen.Chang at pmcs.com (Po-Yen Chang)
Date: Thu, 18 Jul 2013 10:00:54 -0700
Subject: [nvmewin] Iometer hang
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A5131277D90@SSIEXCH-MB3.ssi.samsung.com>
References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com><49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com><548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277CB6@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFDA@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277D90@SSIEXCH-MB3.ssi.samsung.com>
Message-ID: <40A0B8B92CE0F94685A03264958540C4DEC020@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>

Hi Judy,

See my comments in red...

________________________________
From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Thursday, July 18, 2013 9:36 AM
To: Po-Yen Chang; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Iometer hang

Hi Alex,

I'm still confused.

1. I thought the suggested method for building for Windows Server 2012 (TRIM enabled) should be the same as for Windows 8 (TRIM enabled) but that is not what it says below - there is no mention of building for Windows Server 2012 w/TRIM enabled. Is it a mistake below or is it really the case that we are suggesting that folks need to disable the TRIM code in the Server 2012 environment as it says below?
Because if we build for Server 2012 using either WDK 7600 build environment or within VS 2012 when configured for Windows 7 in Project Property, we are disabling the TRIM code in that environment, correct?

[Alex] I am so sure TRIM is supported in Server 2012. Could you please confirm that?

2. When you say "I haven't got the chance to get to the bottom of it and Yong from Huawei promised to find out more information from Microsoft contacts he has.", get to the bottom of what? Why are we asking MS about an Iometer hang? Unless perhaps it's the case that you see this problem with Server 2012 (TRIM enabled) but not Windows 8 (TRIM enabled)....is that the case? Please clarify why we think this is an OS issue or why MS may have light to shed.

[Alex] After tracing back the emails I exchanged with Rick (who implemented the TRIM) for the IOMeter issue, it was caused by a missing COMPLETE_IN_DPC compile flag when configuring the project in VS 2012. Do you specify that?

3. Which versions of the OS did you see this problem on?

[Alex] Without specifying COMPLETE_IN_DPC, I've seen the IOMeter issue on Windows 7 as well.

4. When I tried to debug the problem using a checked build version of the driver, I got a whole different symptom. In fact that's when I saw the crash which showed NVMe ISR interrupting NVMEStartIo critical section due to lack of synchronization between the two paths. So while it's good to have discovered that particular hole in the driver, I was unable to debug the Iometer hang because it only manifested with the free build driver binary. Does anyone know how to configure the build to generate free build symbols? It won't be as painless but I could debug the free build with the help of a symbol file at least - and maybe a mixed assembly/source listing and a map file...

[Alex] See my comments on #2.
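
[Illustrative sketch] The thread names COMPLETE_IN_DPC as the compile flag whose absence led to the IOMeter hang, but does not show what the flag changes. Judging only by its name, it plausibly selects between finishing I/O completions inside the interrupt service routine and deferring them to a DPC; that reading is an assumption. The portable snippet below models that general pattern with hypothetical names (sketch_device, sketch_isr, sketch_dpc). It is not the driver's code, and a real StorPort miniport would queue the deferred work with a call such as StorPortIssueDpc rather than setting a flag.

```c
#include <stddef.h>

/* Assumption: the flag is defined, as it would be when listed in the
 * project's preprocessor definitions. */
#ifndef COMPLETE_IN_DPC
#define COMPLETE_IN_DPC 1
#endif

/* Hypothetical stand-in for per-adapter state. */
typedef struct {
    size_t posted;     /* completions posted by the device, not yet handled */
    size_t completed;  /* requests completed back to the upper layers */
    int dpc_scheduled; /* nonzero while deferred work is pending */
} sketch_device;

/* Hands every posted completion back to the upper layers. */
static void sketch_drain(sketch_device *dev)
{
    dev->completed += dev->posted;
    dev->posted = 0;
}

/* Deferred routine: runs later, outside the interrupt handler. */
static void sketch_dpc(sketch_device *dev)
{
    dev->dpc_scheduled = 0;
    sketch_drain(dev);
}

/* Interrupt handler: either defers the work or does it inline. */
static void sketch_isr(sketch_device *dev)
{
#if defined(COMPLETE_IN_DPC)
    dev->dpc_scheduled = 1; /* keep the ISR short; the DPC drains later */
#else
    sketch_drain(dev);      /* complete requests directly in the ISR */
#endif
}
```

With the flag defined, a call to sketch_isr leaves the posted completions untouched until sketch_dpc runs. Without it, the ISR does the completion work itself, which is the kind of path where an unsynchronized ISR and start-I/O routine could interfere with each other.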
Thanks,
Judy

From Rick.Knoblaugh at lsi.com Thu Jul 18 14:24:49 2013
From: Rick.Knoblaugh at lsi.com (Knoblaugh, Rick)
Date: Thu, 18 Jul 2013 15:24:49 -0600
Subject: [nvmewin] Iometer hang
In-Reply-To: <40A0B8B92CE0F94685A03264958540C4DEC020@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com><49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com><548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277CB6@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFDA@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277D90@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC020@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
Message-ID: <4565AEA676113A449269C2F3A549520F013839E37A@cosmail03.lsi.com>

Windows Server 2012 has the same storage stack as Windows 8, so it should support Trim as well.

Thanks,
-Rick

From judy.brock at ssi.samsung.com Thu Jul 18 15:38:31 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Jul 2013 22:38:31 +0000
Subject: [nvmewin] Iometer hang
In-Reply-To: <40A0B8B92CE0F94685A03264958540C4DEC020@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com><49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com><548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277CB6@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFDA@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277D90@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC020@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131277EAF@SSIEXCH-MB3.ssi.samsung.com>

Hi Alex,

Thanks for clarifying the problem - I'm sure it will work with TRIM enabled in 2012 now! Will let you know.

Judy

From: Po-Yen Chang [mailto:PoYen.Chang at pmcs.com]
Sent: Thursday, July 18, 2013 10:01 AM
To: Judy Brock-SSI; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Iometer hang

Hi Judy,

See my comments in red...

________________________________
From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Thursday, July 18, 2013 9:36 AM
To: Po-Yen Chang; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Iometer hang

Hi Alex,

I'm still confused.

1.
I thought the suggested method for building for Windows Server 2012 (TRIM enabled) should be the same as for Windows 8 (TRIM enabled) but that is not what it says below - there is no mention of building for Windows Server 2012 w/TRIM enabled. Is it a mistake below or is it really the case that we are suggesting that folks need to disable the TRIM code in the Server 2012 environment as it says below? Because if we build for Server 2012 using either WDK 7600 build environment or within VS 2012 when configured for Windows 7 in Project Property, we are disabling the TRIM code in that environment, correct? I am so sure TRIM is supported in Server 2012. Could you please confirm that?
2. When you say "I haven't got the chance to get to the bottom of it and Yong from Huawei promised to find out more information from Microsoft contacts he has.", get to the bottom of what? Why are we asking MS about an Iometer hang? Unless perhaps it's the case that you see this problem with Server 2012 (TRIM enabled) but not Windows 8 (TRIM enabled)....is that the case? Please clarify why we think this is an OS issue or why MS may have light to shed. After tracing back the emails I exchanged with Rick (who implemented the TRIM) for the IOMeter issue, it was caused by a missing COMPLETE_IN_DPC compile flag when configuring the project in VS 2012. Do you specify that?
3. Which versions of the OS did you see this problem on? Without specifying COMPLETE_IN_DPC, I've seen the IOMeter issue on Windows 7 as well.
4. When I tried to debug the problem using a checked build version of the driver, I got a whole different symptom. In fact that's when I saw the crash which showed NVMe ISR interrupting NVMEStartIo critical section due to lack of synchronization between the two paths. So while it's good to have discovered that particular hole in the driver, I was unable to debug the Iometer hang because it only manifested with the free build driver binary.
Does anyone know how to configure the build to generate free build symbols? It won't be as painless but I could debug the free build with the help of a symbol file at least - and maybe a mixed assembly/source listing and a map file... See my comments on #2. Thanks, Judy From: Po-Yen Chang [mailto:PoYen.Chang at pmcs.com] Sent: Thursday, July 18, 2013 9:19 AM To: Judy Brock-SSI; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] Iometer hang Judy, No, I haven't got the chance to get to the bottom of it and Yong from Huawei promised to find out more information from Microsoft contacts he has. Here are the suggested methods to build the binary for different Operating Systems: For Windows 7, Server 2008 R2, Server 2012, Windows 8 (TRIM disabled): Within WDK 7600 build environment, or Within VS 2012 when configured for Windows 7 in Project Property. For Windows 8 (TRIM enabled): Within VS 2012 when configured for Windows 8 in Project Property. Thanks, Alex ________________________________ From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com] Sent: Thursday, July 18, 2013 9:01 AM To: Po-Yen Chang; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] Iometer hang I ran on Windows Server 2012 and the binary (not built by me) was built with VS 2012/Win 8 WDK. Can you explain what you mean by compatibility issues? Did you isolate the root cause of the hang? Thanks, Judy From: Po-Yen Chang [mailto:PoYen.Chang at pmcs.com] Sent: Thursday, July 18, 2013 8:57 AM To: Judy Brock-SSI; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] Iometer hang Hi Judy, Could you please let me know how you built the binary and which operating system you ran on? There are some compatibility issue here and that's why we need to release separate binary package for Windows 8, where TRIM is enabled. 
Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Judy Brock-SSI Sent: Thursday, July 18, 2013 12:33 AM To: Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org Subject: [nvmewin] Iometer hang Alex, We are seeing the same problem you describe below with IOMeter stopping right after hitting "Start Tests" with 4k sequential writes. To be fair, I haven't tried top of repository tree, I was using TRIM_command_support label revision 72. Did you ever find out what caused this problem? Thanks, Judy From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Chang, Alex Sent: Thursday, June 13, 2013 9:40 AM To: Knoblaugh, Rick; Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] LSI Trim Patch Hi Rick, I did some basic tests like disk formats, SCSICompliance, SDStress and IOMeter. They're all working fine except IOMeter, which I configured as 4Kbyte, sequential writes. IOMeter stops right after hitting "Start Tests" (green flag). Do you see the problem when you tested it? Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Knoblaugh, Rick Sent: Friday, June 07, 2013 6:38 PM To: Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** LSI Trim Patch Hi Ray, Per your request, since we will switch order, moving Intel patch to the number 3 position, I'm sending LSI's Trim patch. Password for the attached file is: lsi1234. Also, I have attached a document here that describes what was changed/added. It would be great if everyone can review and please let me know any feedback. Thanks. 
-Rick From: Robles, Raymond C [mailto:raymond.c.robles at intel.com] Sent: Monday, June 03, 2013 5:12 PM To: Knoblaugh, Rick; nvmewin at lists.openfabrics.org Subject: RE: Sandisk patch delay Hi Rick, That's great news! LSI will be 3rd in line after IDT and Intel. Thanks for the contributions to TRIM. Thanks, Ray From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com] Sent: Monday, June 03, 2013 4:57 PM To: Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: RE: Sandisk patch delay Hi Ray, We also have the patch for Trim. It is ready to send. Please let me know when you would like me to send out. Thanks, -Rick From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Monday, June 03, 2013 4:44 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] Sandisk patch delay Hello, It appears the Sandisk patch for Mode Sense is taking longer than expected. In order to keep things moving along with the OFA driver, I'm going to take the Sandisk patch offline for now until they can resolve their issues. Once they have worked out the kinks, they can re-submit. In the meantime, Alex from IDT has a patch he'd like to push and I also have a patch I'd like to push. Alex, please send your patch out for code review as soon as possible and then I will send out my patch immediately after. Thanks, Ray [cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles NVM Solutions Group | Internal SSD Engineering Technology & Manufacturing Group Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 1756 bytes Desc: image001.png URL:

From judy.brock at ssi.samsung.com Thu Jul 18 16:16:46 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Jul 2013 23:16:46 +0000
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com>

So it looks like the reason this problem was not seen before is that it only surfaces when the COMPLETE_IN_DPC compile flag is not set. In other words, the COMPLETE_IN_ISR path is broken because it accesses our HwDeviceExtension without being synchronized with other paths in the driver which do the same.

We can either fix the path which does completions in the ISR or get rid of that option entirely. Since it's generally considered bad practice to do that kind of work in an ISR because it's supposed to be as lean and mean as possible, would the team be averse to getting rid of the logic which optionally allows completions to be handled by the ISR? If we insist on retaining it, a) we should come up with a good reason why and b) we should fix it asap because it is definitely not safe to use in its present form. Personally I vote for removal - we wouldn't need the COMPLETE_IN_DPC flag either anymore if we go that route.

Thanks, Judy

From: Judy Brock-SSI
Sent: Thursday, July 18, 2013 6:22 AM
To: Judy Brock-SSI; 'nvmewin at lists.openfabrics.org'
Subject: RE: NvmeStartio path critical section handling not protected from NVMe ISR?

I just thought of another way to handle this problem. Could we not call StorPortSynchronizeAccess() with a pointer back to our ProcessIo() routine? ProcessIo would get called before the call to StorPortSynchronizeAccess() returns and this would have the effect of guaranteeing synchronization with our ISR. This seems like a much cleaner solution than a lock-acquiring approach. I still don't know if there are any issues with ProcessIo being called multiple times, from non-StartIo code paths, etc. - would still need to be looked at.

Thanks, Judy

From: Judy Brock-SSI
Sent: Wednesday, July 17, 2013 10:08 PM
To: nvmewin at lists.openfabrics.org
Subject: NvmeStartio path critical section handling not protected from NVMe ISR?

All,

Under Windows Server 2012, I've seen a crash where NVMeStartIo() gets interrupted by our ISR at a time when it's in the middle of manipulating a linked list critical data structure which the ISR then goes on to attempt to manipulate also - which results in a crash. Below is the call stack - see where I've inserted the comment "<---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK"

2: kd> kc
Call Site
nt!RtlpBreakWithStatusInstruction
nt!KiBugCheckDebugBreak
nt!KeBugCheck2
nt!KeBugCheckEx
nt!KiBugCheckDispatch
nt!KiFastFailDispatch
nt!KiRaiseSecurityCheckFailure
nvme!RtlFailFast
nvme!FatalListEntryError
nvme!RtlpCheckListEntry
nvme!InsertTailList
nvme!NVMeCompleteCmd
nvme!NVMeIsrMsix
nt!KiInterruptDispatch <---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK
nvme!RemoveHeadList
nvme!NVMeGetCmdEntry
nvme!ProcessIo
nvme!NVMeStartIo
storport!RaidpAdapterContinueScatterGather
hal!HalpAllocateAdapterCallbackV2
hal!IoFreeAdapterChannelV2
hal!HalAllocateAdapterChannelV2
hal!HalBuildScatterGatherListV2
storport!RaUnitStartIo
storport!RaidUnitCompleteRequest
storport!RaidpAdapterRedirectDpcRoutine
nt!KiExecuteAllDpcs
nt!KiRetireDpcList

I looked through the code and noticed we never call StorPortAcquireSpinLock to acquire the InterruptLock to protect us from such pre-emption. Another way to achieve this would be to indicate we run at half-duplex rather than full-duplex but that would degrade the general performance of the driver. I'm not sure why we didn't run into this way before now - is there some other re-entrance protection algorithm besides the two above that others are aware of? If not, I believe we need to fix this asap.

Suggestions:

A. Simplest approach is to lock down all of NVMeStartIo as per below (not tested yet) but we may almost as well run half-duplex if we do this:

1. At the very top of NVMeStartIo:

    /* we should never be holding the interrupt lock upon entry to NVMeStartIo.
     * Acquire the Interrupt Spin Lock to protect against getting hit by our ISR.
     */
    if (NULL == pAdapterExtension->hInterruptLock) {
        StorPortAcquireSpinLock(pAdapterExtension, InterruptLock, NULL,
                                &pAdapterExtension->hInterruptLock);
    } else {
        ASSERT(FALSE);
    }

2. At the very top of IO_StorPortNotification:

    PNVME_DEVICE_EXTENSION pAE = (PNVME_DEVICE_EXTENSION) pHwDeviceExtension;

    /* if we got here from NvmeStartIo we need to release the interrupt lock */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }

3. At the very bottom of NVMeStartIo:

    /* if we didn't release the Interrupt Lock in one of the calls to
     * IO_StorPortNotification above we need to release before we exit NVMEStartIo
     */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }
    return TRUE;
    } /* NVMeStartIo */

B. Better approach is to just lock ProcessIo(). But code exists in that routine which acquires the StartIo lock - we can't take locks out of order or we'll cause deadlock. Right now that code never gets invoked - what was it for? Do we still need it? Can ProcessIo() get called from non-StartIo Paths? Can it get called multiple times?
Not having been involved in the initial development of this driver, I would need to study the flow to make sure to respect the StorPort lock acquiring/releasing hierarchy rules at all times. If those conversant in the overall developmental history and architecture of this driver could share their thoughts, that would be great. Thanks, Judy

From PoYen.Chang at pmcs.com Thu Jul 18 16:27:07 2013
From: PoYen.Chang at pmcs.com (Po-Yen Chang)
Date: Thu, 18 Jul 2013 16:27:07 -0700
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com>
References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com>
Message-ID: <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>

Judy, I feel the same way as well. Let's wait for the response from LSI and Intel on this. If they all agree, I will go ahead remove it. Thanks, Alex

From kris.r.murray at intel.com Thu Jul 18 16:29:16 2013
From: kris.r.murray at intel.com (Murray, Kris R)
Date: Thu, 18 Jul 2013 23:29:16 +0000
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
In-Reply-To: <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
Message-ID: <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com>

Judy, I have no problems removing it. ~Kris
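Editorial sketch, not part of the original thread: the race described in this thread, and the StorPortSynchronizeAccess() idea raised as one possible fix, can be modeled roughly in user space as below. This is not driver code and not the fix the list settled on. Every Mock* type and mock_* function is hypothetical, the "interrupt" is just a function call, and the wrapper only imitates the general shape of running a callback while the interrupt is held off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's device extension. */
typedef struct {
    bool lock_held;     /* models the StorPort interrupt spin lock */
    bool list_busy;     /* true while the free list is half-updated */
    bool irq_pending;   /* an interrupt arrived while the lock was held */
    int  free_entries;  /* models the length of the free command-entry list */
    int  corruptions;   /* times the ISR touched a half-updated list */
} MockDevExt;

typedef bool (*MockSyncRoutine)(MockDevExt *ext, void *ctx);

/* Models the completion ISR returning a command entry to the free list. */
void mock_isr(MockDevExt *ext)
{
    if (ext->list_busy) {
        ext->corruptions++;   /* the real driver fast-failed in InsertTailList */
    }
    ext->free_entries++;
}

/* Models the device raising its interrupt; it is deferred while the lock is held. */
void mock_raise_interrupt(MockDevExt *ext)
{
    if (ext->lock_held) {
        ext->irq_pending = true;
    } else {
        mock_isr(ext);
    }
}

/* Models the StartIo path removing an entry; an interrupt arrives mid-update. */
bool mock_process_io(MockDevExt *ext, void *ctx)
{
    (void)ctx;
    if (ext->free_entries == 0) {
        return false;
    }
    ext->list_busy = true;        /* list is now in an inconsistent state */
    mock_raise_interrupt(ext);    /* an earlier command completes right now */
    ext->free_entries--;
    ext->list_busy = false;
    return true;
}

/* Imitates the shape of a synchronize-access call: run the routine with the
 * interrupt held off, then let any deferred interrupt run afterwards. */
bool mock_synchronize_access(MockDevExt *ext, MockSyncRoutine routine, void *ctx)
{
    bool ok;

    assert(routine != NULL);
    ext->lock_held = true;
    ok = routine(ext, ctx);
    ext->lock_held = false;
    if (ext->irq_pending) {
        ext->irq_pending = false;
        mock_isr(ext);
    }
    return ok;
}
```

Calling mock_process_io() directly lets the mock ISR run against the half-updated list, which is the situation in the call stack; routing the same call through mock_synchronize_access() defers the ISR until the update is finished. Whether the real ProcessIo() can safely be run this way depends on the open questions in the thread (other callers, lock ordering with the StartIo lock).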
From judy.brock at ssi.samsung.com Thu Jul 18 16:37:57 2013
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Thu, 18 Jul 2013 23:37:57 +0000
Subject: [nvmewin] Iometer hang
In-Reply-To: <4565AEA676113A449269C2F3A549520F013839E37A@cosmail03.lsi.com>
References: <49158E750348AA499168FD41D88983606257AE07@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBB1FBD5@cosmail03.lsi.com><49158E750348AA499168FD41D88983606257AEE1@FMSMSX105.amr.corp.intel.com><4565AEA676113A449269C2F3A549520FDBBD3022@cosmail03.lsi.com><548C5470AAD9DA4A85D259B663190D361FFEED9E@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A51312772D2@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFBC@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277CB6@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEBFDA@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <36E8D38D6B771A4BBDB1C0D800158A5131277D90@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC020@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <4565AEA676113A449269C2F3A549520F013839E37A@cosmail03.lsi.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131278021@SSIEXCH-MB3.ssi.samsung.com>

>> Windows Server 2012 has the same storage stack as Windows 8, so it should support Trim as well.

Yes, that was the point of my question - that's what I thought. However, the suggested build methods for different Operating Systems provided below do not include a build for Windows Server 2012 with TRIM enabled, so that's why I questioned it. For future reference, is everyone ok with modifying the "official" suggested build methods to read as follows (new text I've added is in red):

For Windows 7, Server 2008 R2, Server 2012 (TRIM disabled), Windows 8 (TRIM disabled):
    Within WDK 7600 build environment, or
    Within VS 2012 when configured for Windows 7 in Project Property.
For Windows 8 (TRIM enabled), Server 2012 (TRIM enabled):
    Within VS 2012 when configured for Windows 8 in Project Property.
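Editorial sketch, not part of the original message: the build matrix above comes down to two compile-time switches, the target OS version (which decides whether the TRIM code is compiled in) and the COMPLETE_IN_DPC flag. The fragment below is a minimal illustration of how such switches could be expressed. It assumes the TRIM code is gated on NTDDI_VERSION >= NTDDI_WIN8, which may not match the macro the driver actually uses. The helper functions and the REQUIRE_COMPLETE_IN_DPC opt-in guard are hypothetical names introduced only for this example.

```c
#include <stdbool.h>

/* Values as defined in the Windows SDK header sdkddkver.h, repeated here only
 * so the sketch compiles outside a WDK environment. */
#ifndef NTDDI_WIN7
#define NTDDI_WIN7 0x06010000
#endif
#ifndef NTDDI_WIN8
#define NTDDI_WIN8 0x06020000
#endif

/* Assumption: a build that sets no target is treated as a Windows 7 build. */
#ifndef NTDDI_VERSION
#define NTDDI_VERSION NTDDI_WIN7
#endif

/* Hypothetical opt-in guard: fail the build early if the project
 * configuration forgot COMPLETE_IN_DPC. */
#if defined(REQUIRE_COMPLETE_IN_DPC) && !defined(COMPLETE_IN_DPC)
#error "COMPLETE_IN_DPC is not defined for this project configuration"
#endif

/* Hypothetical helper: would a build targeting this OS version include TRIM? */
bool target_enables_trim(unsigned long ntddi_version)
{
    return ntddi_version >= NTDDI_WIN8;
}

/* Hypothetical helper: true when this build compiles the TRIM path in. */
bool build_has_trim(void)
{
#if (NTDDI_VERSION >= NTDDI_WIN8)
    return true;
#else
    return false;
#endif
}

/* Hypothetical helper: true when completions are deferred to a DPC. */
bool build_completes_in_dpc(void)
{
#if defined(COMPLETE_IN_DPC)
    return true;
#else
    return false;
#endif
}
```

A guard of this kind would surface a missing COMPLETE_IN_DPC definition at compile time instead of as a hang or crash at run time, which is how the problem showed up in this thread.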
>> After tracing back the emails I exchanged with Rick (who implemented the TRIM) for the IOMeter issue, it was caused by a missing COMPLETE_IN_DPC compile flag when configuring the project in VS 2012. Do you specify that? And this totally explains how, when I tried to debug this using a checked build driver, the symptom could have morphed into an entirely different scenario, namely the StartIo-Clobbered-by-ISR problem; the latter also being a direct result of not specifying COMPLETE_IN_DPC. The timing must have changed enough between the free and checked builds to cause the two different flavors of grief :). Phew, I guess I'm not losing my driver developer marbles after all (at least not based on observing those wildly different driver behaviors)! Thanks, Judy From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com] Sent: Thursday, July 18, 2013 2:25 PM To: Po-Yen Chang; Judy Brock-SSI; Chang, Alex; Robles,Raymond C; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] Iometer hang Windows Server 2012 has the same storage stack as Windows 8, so it should support Trim as well. Thanks, -Rick From: Po-Yen Chang [mailto:PoYen.Chang at pmcs.com] Sent: Thursday, July 18, 2013 10:01 AM To: Judy Brock-SSI; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] Iometer hang Hi Judy, See my comments in red... ________________________________ From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com] Sent: Thursday, July 18, 2013 9:36 AM To: Po-Yen Chang; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] Iometer hang Hi Alex, I'm still confused. 1. I thought the suggested method for building for Windows Server 2012 (TRIM enabled) should be the same as for Windows 8 (TRIM enabled) but that is not what it says below - there is no mention of building for Windows Server 2012 w/TRIM enabled.
Is it a mistake below or is it really the case that we are suggesting that folks need to disable the TRIM code in the Server 2012 environment as it says below? Because if we build for Server 2012 using either WDK 7600 build environment or within VS 2012 when configured for Windows 7 in Project Property, we are disabling the TRIM code in that environment, correct? I am so sure TRIM is supported in Server 2012. Could you please confirm that? 2. When you say "I haven't got the chance to get to the bottom of it and Yong from Huawei promised to find out more information from Microsoft contacts he has.", get to the bottom of what? Why are we asking MS about an Iometer hang? Unless perhaps it's the case that you see this problem with Server 2012 (TRIM enabled) but not Windows 8 (TRIM enabled)....is that the case? Please clarify why we think this is an OS issue or why MS may have light to shed. After tracing back the emails I exchanged with Rick (who implemented the TRIM) for IOMeter issue, it was caused by mssing COMPLETE_IN_DPC compiling flag when configuring project in VS 2012. Do you specify that? 3. Which versions of the OS did you see this problem on? Without specifying COMPLETE_IN_DPC, I've seen the IOMeter issue on Windows 7 as well. 4. When I tried to debug the problem using a checked build version of the driver, I got a whole different symptom. In fact that's when I saw the crash which showed NVMe ISR interrupting NVMEStartIo critical section due to lack of synchronization between the two paths. So while it's good to have discovered that particular hole in the driver, I was unable to debug the Iometer hang because it only manifested with the free build driver binary. Does anyone know how to configure the build to generate free build symbols? It won't be as painless but I could debug the free build with the help of a symbol file at least - and maybe a mixed assembly/source listing and a map file... See my comments on #2. 
Thanks,
Judy

From: Po-Yen Chang [mailto:PoYen.Chang at pmcs.com]
Sent: Thursday, July 18, 2013 9:19 AM
To: Judy Brock-SSI; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Iometer hang

Judy,

No, I haven't got the chance to get to the bottom of it and Yong from Huawei promised to find out more information from Microsoft contacts he has. Here are the suggested methods to build the binary for different Operating Systems:

For Windows 7, Server 2008 R2, Server 2012, Windows 8 (TRIM disabled):
    Within WDK 7600 build environment, or
    Within VS 2012 when configured for Windows 7 in Project Property.

For Windows 8 (TRIM enabled):
    Within VS 2012 when configured for Windows 8 in Project Property.

Thanks,
Alex

________________________________
From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Thursday, July 18, 2013 9:01 AM
To: Po-Yen Chang; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Iometer hang

I ran on Windows Server 2012 and the binary (not built by me) was built with VS 2012/Win 8 WDK. Can you explain what you mean by compatibility issues? Did you isolate the root cause of the hang?

Thanks,
Judy

From: Po-Yen Chang [mailto:PoYen.Chang at pmcs.com]
Sent: Thursday, July 18, 2013 8:57 AM
To: Judy Brock-SSI; Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Iometer hang

Hi Judy,

Could you please let me know how you built the binary and which operating system you ran on? There are some compatibility issues here and that's why we need to release a separate binary package for Windows 8, where TRIM is enabled.
Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Judy Brock-SSI Sent: Thursday, July 18, 2013 12:33 AM To: Chang, Alex; Knoblaugh, Rick; Robles,Raymond C; nvmewin at lists.openfabrics.org Subject: [nvmewin] Iometer hang Alex, We are seeing the same problem you describe below with IOMeter stopping right after hitting "Start Tests" with 4k sequential writes. To be fair, I haven't tried top of repository tree, I was using TRIM_command_support label revision 72. Did you ever find out what caused this problem? Thanks, Judy From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Chang, Alex Sent: Thursday, June 13, 2013 9:40 AM To: Knoblaugh, Rick; Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] LSI Trim Patch Hi Rick, I did some basic tests like disk formats, SCSICompliance, SDStress and IOMeter. They're all working fine except IOMeter, which I configured as 4Kbyte, sequential writes. IOMeter stops right after hitting "Start Tests" (green flag). Do you see the problem when you tested it? Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Knoblaugh, Rick Sent: Friday, June 07, 2013 6:38 PM To: Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: [nvmewin] ***UNCHECKED*** LSI Trim Patch Hi Ray, Per your request, since we will switch order, moving Intel patch to the number 3 position, I'm sending LSI's Trim patch. Password for the attached file is: lsi1234. Also, I have attached a document here that describes what was changed/added. It would be great if everyone can review and please let me know any feedback. Thanks. 
-Rick From: Robles, Raymond C [mailto:raymond.c.robles at intel.com] Sent: Monday, June 03, 2013 5:12 PM To: Knoblaugh, Rick; nvmewin at lists.openfabrics.org Subject: RE: Sandisk patch delay Hi Rick, That's great news! LSI will be 3rd in line after IDT and Intel. Thanks for the contributions to TRIM. Thanks, Ray From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com] Sent: Monday, June 03, 2013 4:57 PM To: Robles, Raymond C; nvmewin at lists.openfabrics.org Subject: RE: Sandisk patch delay Hi Ray, We also have the patch for Trim. It is ready to send. Please let me know when you would like me to send out. Thanks, -Rick From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Monday, June 03, 2013 4:44 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] Sandisk patch delay Hello, It appears the Sandisk patch for Mode Sense is taking longer than expected. In order to keep things moving along with the OFA driver, I'm going to take the Sandisk patch offline for now until they can resolve their issues. Once they have worked out the kinks, they can re-submit. In the meantime, Alex from IDT has a patch he'd like to push and I also have a patch I'd like to push. Alex, please send your patch out for code review as soon as possible and then I will send out my patch immediately after. Thanks, Ray [cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles NVM Solutions Group | Internal SSD Engineering Technology & Manufacturing Group Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png
Type: image/png
Size: 1756 bytes
Desc: image001.png
URL:

From james.p.freyensee at intel.com Thu Jul 18 16:43:28 2013
From: james.p.freyensee at intel.com (Freyensee, James P)
Date: Thu, 18 Jul 2013 23:43:28 +0000
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
In-Reply-To: <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com>
References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com>
 <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com>
 <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
 <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com>
Message-ID: <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com>

Out of curiosity, what was the original reason to have the ISR path in the first place? If it is currently in the driver code, there must have been some purpose to be able to either compile it using an ISR or a DPC.

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Murray, Kris R
Sent: Thursday, July 18, 2013 4:29 PM
To: Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?

Judy,

I have no problems removing it.

~Kris

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Po-Yen Chang
Sent: Thursday, July 18, 2013 4:27 PM
To: Judy Brock-SSI; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?

Judy,

I feel the same way as well. Let's wait for the response from LSI and Intel on this. If they all agree, I will go ahead and remove it.
Thanks,
Alex

________________________________
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Judy Brock-SSI
Sent: Thursday, July 18, 2013 4:17 PM
To: Judy Brock-SSI; nvmewin at lists.openfabrics.org
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?

So it looks like the reason this problem was not seen before is because it only surfaces when the COMPLETE_IN_DPC compile flag is not set. In other words, the COMPLETE_IN_ISR path is broken because it accesses our HwDeviceExtension without being synchronized with other paths in the driver which do the same.

We can either fix the path which does completions in the ISR or get rid of that option entirely. Since it's generally considered bad practice to do that kind of work in an ISR because it's supposed to be as lean and mean as possible, would the team be averse to getting rid of the logic which optionally allows completions to be handled by the ISR? If we insist on retaining it, a) we should come up with a good reason why and b) we should fix it asap because it is definitely not safe to use in its present form.

Personally I vote for removal - we wouldn't need the COMPLETE_IN_DPC flag either anymore if we go that route.

Thanks,
Judy

From: Judy Brock-SSI
Sent: Thursday, July 18, 2013 6:22 AM
To: Judy Brock-SSI; 'nvmewin at lists.openfabrics.org'
Subject: RE: NvmeStartio path critical section handling not protected from NVMe ISR?

I just thought of another way to handle this problem. Could we not call StorPortSynchronizeAccess() with a pointer back to our ProcessIo() routine? ProcessIo would get called before the call to StorPortSynchronizeAccess() returns and this would have the effect of guaranteeing synchronization with our ISR. This seems like a much cleaner solution than a lock-acquiring approach.
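The synchronized-callback idea above can be pictured with a small user-space model. This is only a sketch under stated assumptions: it does not call the real StorPort API, every name in it (DeviceExt, sketch_synchronize_access, sketch_process_io, sketch_completion_isr) is hypothetical, and "interrupts" are simulated by a plain function call made in the middle of a read-modify-write on a shared counter that stands in for the free command-entry list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the adapter's device extension. */
typedef struct DeviceExt {
    int  free_entries;   /* stand-in for the free command-entry list */
    bool isr_blocked;    /* true while a synchronized routine runs */
    int  pending_isr;    /* interrupts that arrived while blocked */
} DeviceExt;

typedef bool (*SyncRoutine)(DeviceExt *ext, void *context);

/* Mock ISR: completing a command returns one entry to the list. */
static void sketch_completion_isr(DeviceExt *ext)
{
    if (ext->isr_blocked) {      /* held off: remember it for later */
        ext->pending_isr++;
        return;
    }
    ext->free_entries++;
}

/* Mock of the synchronize-access idea: run the routine with the ISR
 * held off, then deliver any interrupts that arrived meanwhile. */
static bool sketch_synchronize_access(DeviceExt *ext, SyncRoutine routine,
                                      void *context)
{
    bool ok;
    ext->isr_blocked = true;
    ok = routine(ext, context);
    ext->isr_blocked = false;
    while (ext->pending_isr > 0) {
        ext->pending_isr--;
        sketch_completion_isr(ext);
    }
    return ok;
}

/* Stand-in for ProcessIo: takes one entry off the list, with a
 * simulated interrupt arriving between the read and the write-back. */
static bool sketch_process_io(DeviceExt *ext, void *context)
{
    int snapshot;
    (void)context;
    if (ext->free_entries == 0) {
        return false;
    }
    snapshot = ext->free_entries;      /* read */
    sketch_completion_isr(ext);        /* interrupt lands mid-update */
    ext->free_entries = snapshot - 1;  /* write back */
    return true;
}
```

Called directly, sketch_process_io() loses the ISR's update (4 entries become 3 even though one was taken and one returned). Routed through sketch_synchronize_access(), the ISR's work is deferred until the routine finishes and the count ends at 4.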
I still don't know if there are any issues with ProcessIo being called multiple times, from non-StartIo code paths, etc. - would still need to be looked at.

Thanks,
Judy

From: Judy Brock-SSI
Sent: Wednesday, July 17, 2013 10:08 PM
To: nvmewin at lists.openfabrics.org
Subject: NvmeStartio path critical section handling not protected from NVMe ISR?

All,

Under Windows Server 2012, I've seen a crash where NVMeStartIo() gets interrupted by our ISR at a time when it's in the middle of manipulating a linked list critical data structure which the ISR then goes on to attempt to manipulate also - which results in a crash. Below is the call stack - see where I've inserted the comment "<---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK"

2: kd> kc
Call Site
nt!RtlpBreakWithStatusInstruction
nt!KiBugCheckDebugBreak
nt!KeBugCheck2
nt!KeBugCheckEx
nt!KiBugCheckDispatch
nt!KiFastFailDispatch
nt!KiRaiseSecurityCheckFailure
nvme!RtlFailFast
nvme!FatalListEntryError
nvme!RtlpCheckListEntry
nvme!InsertTailList
nvme!NVMeCompleteCmd
nvme!NVMeIsrMsix
nt!KiInterruptDispatch <---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK
nvme!RemoveHeadList
nvme!NVMeGetCmdEntry
nvme!ProcessIo
nvme!NVMeStartIo
storport!RaidpAdapterContinueScatterGather
hal!HalpAllocateAdapterCallbackV2
hal!IoFreeAdapterChannelV2
hal!HalAllocateAdapterChannelV2
hal!HalBuildScatterGatherListV2
storport!RaUnitStartIo
storport!RaidUnitCompleteRequest
storport!RaidpAdapterRedirectDpcRoutine
nt!KiExecuteAllDpcs
nt!KiRetireDpcList

I looked through the code and noticed we never call StorPortAcquireSpinLock to acquire the InterruptLock to protect us from such pre-emption. Another way to achieve this would be to indicate we run at half-duplex rather than full-duplex but that would degrade the general performance of the driver.
I'm not sure why we didn't run into this way before now - is there some other re-entrance protection algorithm besides the two above that others are aware of? If not, I believe we need to fix this asap.

Suggestions:

A. Simplest approach is to lock down all of NVMeStartIo as per below (not tested yet) but we may almost as well run half-duplex if we do this:

1. At the very top of NVMeStartIo:

    /* we should never be holding the interrupt lock upon entry to NVMeStartIo.
     * Acquire the Interrupt Spin Lock to protect against getting hit by our ISR.
     */
    if (NULL == pAdapterExtension->hInterruptLock) {
        StorPortAcquireSpinLock(pAdapterExtension, InterruptLock, NULL, &pAdapterExtension->hInterruptLock);
    } else {
        ASSERT(FALSE);
    }

2. At the very top of IO_StorPortNotification:

    PNVME_DEVICE_EXTENSION pAE = (PNVME_DEVICE_EXTENSION) pHwDeviceExtension;

    /* if we got here from NvmeStartIo we need to release the interrupt lock */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }

3. At the very bottom of NVMeStartIo:

    /* if we didn't release the Interrupt Lock in one of the calls to
     * IO_StorPortNotification above we need to release before we exit NVMEStartIo */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }
    return TRUE;
    } /* NVMeStartIo */

B. Better approach is to just lock ProcessIo(). But code exists in that routine which acquires the StartIo lock - we can't take locks out of order or we'll cause deadlock. Right now that code never gets invoked - what was it for? Do we still need it? Can ProcessIo() get called from non-StartIo Paths? Can it get called multiple times?
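The lock-ordering concern in suggestion B can be made concrete with a tiny rank checker. This is a user-space sketch, not driver code: the names (LockRank, LockTracker, tracker_acquire, tracker_release) are hypothetical, and the ordering used here (DPC lock first, then StartIo lock, then interrupt lock) is an assumption about the StorPort hierarchy that should be checked against the StorPortAcquireSpinLock documentation before relying on it.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed hierarchy: a lock may only be taken while holding locks of
 * strictly lower rank. The ranks themselves are an assumption. */
typedef enum {
    LOCK_RANK_DPC       = 1,
    LOCK_RANK_STARTIO   = 2,
    LOCK_RANK_INTERRUPT = 3
} LockRank;

#define MAX_HELD 3

/* Tracks which ranks the current path holds, in acquisition order. */
typedef struct {
    LockRank held[MAX_HELD];
    int      depth;
} LockTracker;

/* Returns false instead of acquiring when the order would be violated. */
static bool tracker_acquire(LockTracker *t, LockRank rank)
{
    if (t->depth > 0 && t->held[t->depth - 1] >= rank) {
        return false;   /* out of order: potential deadlock */
    }
    assert(t->depth < MAX_HELD);
    t->held[t->depth++] = rank;
    return true;
}

/* Locks must be released in reverse order of acquisition. */
static bool tracker_release(LockTracker *t, LockRank rank)
{
    if (t->depth == 0 || t->held[t->depth - 1] != rank) {
        return false;
    }
    t->depth--;
    return true;
}
```

Under this assumed ordering, a path that already holds the interrupt lock and then tries to take the StartIo lock is rejected by the checker, which is the out-of-order case the email warns about.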
Not having been involved in the initial development of this driver, I would need to study the flow to make sure to respect the StorPort lock acquiring/releasing hierarchy rules at all times. If those conversant in the overall developmental history and architecture of this driver could share their thoughts, that would be great.

Thanks,
Judy
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From paul.e.luse at intel.com Thu Jul 18 16:55:57 2013
From: paul.e.luse at intel.com (Luse, Paul E)
Date: Thu, 18 Jul 2013 23:55:57 +0000
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
In-Reply-To: <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com>
References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com>
 <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com>
 <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
 <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com>
 <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com>
Message-ID: <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com>

So it was originally in there during early dev to measure system performance impact of completing in the ISR or pushing off to a DPC; general old school rule of thumb is to minimize ISR work and finish everything else in a DPC to be friendlier to the system. With NVMe completion being so lightweight we figured we could get away without the DPC and it used to work both ways under heavy stress. As both methods are not always tested (after all it's a compile switch) clearly it's easy to break one of them.

I actually ran the tests DPC vs ISR back then and there was no significant impact either way. I was using xperf and I'm sure I shared the results with the other members of the original team as well - doubt I have them anymore but I'll look.
Either way, at this point in time it's probably a good simplification to pick one method and remove the compile switch for the other. I'd probably stick with the DPC route as (a) there was no major benefit from finishing in ISR and (b) sounds like it's busted now anyway :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Rick.Knoblaugh at lsi.com Thu Jul 18 17:15:30 2013
From: Rick.Knoblaugh at lsi.com (Knoblaugh, Rick)
Date: Thu, 18 Jul 2013 18:15:30 -0600
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
In-Reply-To: <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com>
References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com>
 <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com>
 <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
 <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com>
 <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com>
 <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com>
Message-ID: <4565AEA676113A449269C2F3A549520F013839E3CC@cosmail03.lsi.com>

Hi Paul,

Thanks for explaining history behind that one. I do recall that now. Yes, I would agree, may as well go ahead and remove and stick with DPC approach.

Thanks,
-Rick

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Luse, Paul E
Sent: Thursday, July 18, 2013 4:56 PM
To: Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
So it was originally in there during early dev to measure system performance impact of completing in the ISR or pushing off to a DPC; general old school rule of thumb is to minimize ISR work and finish everything else in a DPC to be friendlier to the system. With NVMe completion being so lightweight we figured we could get away without the DPC and it used to work both ways under heavy stress. As both methods are not always tested (after all it's a compile switch) clearly it's easy to break one of them.

I actually ran the tests DPC vs ISR back then and there was no significant impact either way. I was using xperf and I'm sure I shared the results with the other members of the original team as well - doubt I have them anymore but I'll look.

Either way, at this point in time it's probably a good simplification to pick one method and remove the compile switch for the other. I'd probably stick with the DPC route as (a) there was no major benefit from finishing in ISR and (b) sounds like it's busted now anyway :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From PoYen.Chang at pmcs.com Thu Jul 18 17:33:01 2013
From: PoYen.Chang at pmcs.com (Po-Yen Chang)
Date: Thu, 18 Jul 2013 17:33:01 -0700
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartiopath critical section handling not protected from NVMe ISR?
In-Reply-To: <4565AEA676113A449269C2F3A549520F013839E3CC@cosmail03.lsi.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com><36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com><40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca><6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com><2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com> <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com> <4565AEA676113A449269C2F3A549520F013839E3CC@cosmail03.lsi.com> Message-ID: <40A0B8B92CE0F94685A03264958540C4DEC25C@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Hi all, Sounds like we all agree to remove the compiling flag. I will prepare a patch and send it out for your review after running thru the required tests. Regards, Alex ________________________________ From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com] Sent: Thursday, July 18, 2013 5:16 PM To: Luse, Paul E; Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartiopath critical section handling not protected from NVMe ISR? Hi Paul, Thanks for explaining history behind that one. I do recall that now. Yes, I would agree, may as well go ahead and remove and stick with DPC approach. Thanks, -Rick From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Luse, Paul E Sent: Thursday, July 18, 2013 4:56 PM To: Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? 
So it was originally in there during early dev to measure system performance impact of completing in the ISR or pushing off to a DPC; general old school rule of thumb is to minimize ISR work and finish everything else in a DPC to be friendlier to the system. With NVMe completion being so lightweight we figured we could get away without the DPC and it used to work both ways under heavy stress. As both methods are not always tested (after all it's a compile switch) clearly it's easy to break one of them. I actually ran the tests DPC vs ISR back then and there was no significant impact either way. I was using xperf and I'm sure I shared the results with the other members of the original team as well - doubt I have them anymore but I'll look. Either way, at this point in time it's probably a good simplification to pick one method and remove the compile switch for the other. I'd probably stick with the DPC route as (a) there was no major benefit from finishing in ISR and (b) sounds like it's busted now anyway :) From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Freyensee, James P Sent: Thursday, July 18, 2013 4:43 PM To: Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Out of curiosity, what was the original reason to have the ISR path in the first place? If it is currently in the driver code, there must have been some purpose to be able to either compile it using an ISR or a DPC. From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Murray, Kris R Sent: Thursday, July 18, 2013 4:29 PM To: Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
Judy, I have no problems removing it. ~Kris From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Po-Yen Chang Sent: Thursday, July 18, 2013 4:27 PM To: Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Judy, I feel the same way as well. Let's wait for the response from LSI and Intel on this. If they all agree, I will go ahead and remove it. Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Judy Brock-SSI Sent: Thursday, July 18, 2013 4:17 PM To: Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? So it looks like the reason this problem was not seen before is that it only surfaces when the COMPLETE_IN_DPC compile flag is not set. In other words, the COMPLETE_IN_ISR path is broken because it accesses our HwDeviceExtension without being synchronized with other paths in the driver which do the same. We can either fix the path which does completions in the ISR or get rid of that option entirely. Since it's generally considered bad practice to do that kind of work in an ISR because it's supposed to be as lean and mean as possible, would the team be averse to getting rid of the logic which optionally allows completions to be handled by the ISR? If we insist on retaining it, a) we should come up with a good reason why and b) we should fix it asap because it is definitely not safe to use in its present form. Personally I vote for removal - we wouldn't need the COMPLETE_IN_DPC flag either anymore if we go that route.
Thanks, Judy From: Judy Brock-SSI Sent: Thursday, July 18, 2013 6:22 AM To: Judy Brock-SSI; 'nvmewin at lists.openfabrics.org' Subject: RE: NvmeStartio path critical section handling not protected from NVMe ISR? I just thought of another way to handle this problem. Could we not call StorPortSynchronizeAccess() with a pointer back to our ProcessIo() routine? ProcessIo would get called before the call to StorPortSynchronizeAccess() returns and this would have the effect of guaranteeing synchronization with our ISR. This seems like a much cleaner solution that a lock-acquiring approach. I still don't know if there are any issues with ProcessIo being called multiple times, from non-StartIo code paths, etc. - would still need to be looked at. Thanks, Judy -------------- next part -------------- An HTML attachment was scrubbed... URL:
From Yong.sc.Chen at huawei.com Thu Jul 18 18:36:19 2013 From: Yong.sc.Chen at huawei.com (Yong Chen) Date: Fri, 19 Jul 2013 01:36:19 +0000 Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
In-Reply-To: <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com> <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com> <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com> Message-ID: <02EC085151D99A469E06988E94FEBCDB1C42F32A@dfweml513-mbs.china.huawei.com> I think completion in ISR served some purposes as I used this switch during development. Unfortunately I found it broken as well during hibernation waking-up: SntiTranslateStartStopUnit() callback routine issues ProcessIo() directly in ISR , which will bluescreen. I was about to put a note in the sources in my code change. So they are multiple ways hitting it as we know it now. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From paul.e.luse at intel.com Thu Jul 18 19:21:29 2013 From: paul.e.luse at intel.com (Luse, Paul E) Date: Fri, 19 Jul 2013 02:21:29 +0000 Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? In-Reply-To: <02EC085151D99A469E06988E94FEBCDB1C42F32A@dfweml513-mbs.china.huawei.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com> <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com> <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com> <02EC085151D99A469E06988E94FEBCDB1C42F32A@dfweml513-mbs.china.huawei.com> Message-ID: <82C9F782B054C94B9FC04A331649C77A2FC02217@FMSMSX112.amr.corp.intel.com> All of the issues could of course be fixed, the point I think is that there's no real reason to fix them.... -------------- next part -------------- An HTML attachment was scrubbed... URL:
From judy.brock at ssi.samsung.com Fri Jul 19 04:29:49 2013 From: judy.brock at ssi.samsung.com (Judy Brock-SSI) Date: Fri, 19 Jul 2013 11:29:49 +0000 Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? In-Reply-To: <82C9F782B054C94B9FC04A331649C77A2FC02217@FMSMSX112.amr.corp.intel.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com> <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com> <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com> <02EC085151D99A469E06988E94FEBCDB1C42F32A@dfweml513-mbs.china.huawei.com> <82C9F782B054C94B9FC04A331649C77A2FC02217@FMSMSX112.amr.corp.intel.com> Message-ID: <36E8D38D6B771A4BBDB1C0D800158A5131278438@SSIEXCH-MB3.ssi.samsung.com> I am wondering now though if there is some need for at least some variant/vestiges of completion-in-isr-instead-of-dpc. Isn't it the case that in dump mode - ie either during crash-dump or hibernation - the driver is not allowed to schedule a DPC in it's ISR to do completions? We could put some logic in the ISR to check whether we are in dump mode or not and if we are, do the completions in the ISR itself.
After all, if we are in dump mode, we are single-threaded anyway and won't have to worry about pre-emption of our StartIo routine by our ISR. Thanks, Judy From: Luse, Paul E [mailto:paul.e.luse at intel.com] Sent: Thursday, July 18, 2013 7:21 PM To: Yong Chen; Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? All of the issues could of course be fixed; the point, I think, is that there's no real reason to fix them.... From: Yong Chen [mailto:Yong.sc.Chen at huawei.com] Sent: Thursday, July 18, 2013 6:36 PM To: Luse, Paul E; Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? I think completion in ISR served some purposes as I used this switch during development. Unfortunately I found it broken as well when waking up from hibernation: the SntiTranslateStartStopUnit() callback routine issues ProcessIo() directly in the ISR, which will bluescreen. I was about to put a note in the sources in my code change. So there are multiple ways of hitting it, as we now know. From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Luse, Paul E Sent: Thursday, July 18, 2013 4:56 PM To: Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? So it was originally in there during early dev to measure system performance impact of completing in the ISR or pushing off to a DPC; general old school rule of thumb is to minimize ISR work and finish everything else in a DPC to be friendlier to the system.
With NVMe completion being so lightweight we figured we could get away without the DPC and it used to work both ways under heavy stress. As both methods are not always tested (after all it's a compile switch) clearly it's easy to break one of them. I actually ran the tests DPC vs ISR back then and there was no significant impact either way. I was using xperf and I'm sure I shared the results with the other members of the original team as well - doubt I have them anymore but I'll look. Either way, at this point in time it's probably a good simplification to pick one method and remove the compile switch for the other. I'd probably stick with the DPC route as (a) there was no major benefit from finishing in ISR and (b) sounds like it's busted now anyway :) From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Freyensee, James P Sent: Thursday, July 18, 2013 4:43 PM To: Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Out of curiosity, what was the original reason to have the ISR path in the first place? If it is currently in the driver code, there must have been some purpose to be able to either compile it using an ISR or a DPC. From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Murray, Kris R Sent: Thursday, July 18, 2013 4:29 PM To: Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Judy, I have no problems removing it.
~Kris From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Po-Yen Chang Sent: Thursday, July 18, 2013 4:27 PM To: Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Judy, I feel the same way as well. Let's wait for the response from LSI and Intel on this. If they all agree, I will go ahead and remove it. Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Judy Brock-SSI Sent: Thursday, July 18, 2013 4:17 PM To: Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? So it looks like the reason this problem was not seen before is that it only surfaces when the COMPLETE_IN_DPC compile flag is not set. In other words, the COMPLETE_IN_ISR path is broken because it accesses our HwDeviceExtension without being synchronized with other paths in the driver which do the same. We can either fix the path which does completions in the ISR or get rid of that option entirely. Since it's generally considered bad practice to do that kind of work in an ISR because it's supposed to be as lean and mean as possible, would the team be averse to getting rid of the logic which optionally allows completions to be handled by the ISR? If we insist on retaining it, a) we should come up with a good reason why and b) we should fix it asap because it is definitely not safe to use in its present form. Personally I vote for removal - we wouldn't need the COMPLETE_IN_DPC flag either anymore if we go that route.
Thanks, Judy From: Judy Brock-SSI Sent: Thursday, July 18, 2013 6:22 AM To: Judy Brock-SSI; 'nvmewin at lists.openfabrics.org' Subject: RE: NvmeStartio path critical section handling not protected from NVMe ISR? I just thought of another way to handle this problem. Could we not call StorPortSynchronizeAccess() with a pointer back to our ProcessIo() routine? ProcessIo would get called before the call to StorPortSynchronizeAccess() returns and this would have the effect of guaranteeing synchronization with our ISR. This seems like a much cleaner solution than a lock-acquiring approach. I still don't know if there are any issues with ProcessIo being called multiple times, from non-StartIo code paths, etc. - would still need to be looked at. Thanks, Judy From: Judy Brock-SSI Sent: Wednesday, July 17, 2013 10:08 PM To: nvmewin at lists.openfabrics.org Subject: NvmeStartio path critical section handling not protected from NVMe ISR? All, Under Windows Server 2012, I've seen a crash where NVMeStartIo() gets interrupted by our ISR at a time when it's in the middle of manipulating a linked list critical data structure which the ISR then goes on to attempt to manipulate also - which results in a crash.
Below is the call stack - see where I've inserted the comment "<---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK"

2: kd> kc
Call Site
nt!RtlpBreakWithStatusInstruction
nt!KiBugCheckDebugBreak
nt!KeBugCheck2
nt!KeBugCheckEx
nt!KiBugCheckDispatch
nt!KiFastFailDispatch
nt!KiRaiseSecurityCheckFailure
nvme!RtlFailFast
nvme!FatalListEntryError
nvme!RtlpCheckListEntry
nvme!InsertTailList
nvme!NVMeCompleteCmd
nvme!NVMeIsrMsix
nt!KiInterruptDispatch <---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK
nvme!RemoveHeadList
nvme!NVMeGetCmdEntry
nvme!ProcessIo
nvme!NVMeStartIo
storport!RaidpAdapterContinueScatterGather
hal!HalpAllocateAdapterCallbackV2
hal!IoFreeAdapterChannelV2
hal!HalAllocateAdapterChannelV2
hal!HalBuildScatterGatherListV2
storport!RaUnitStartIo
storport!RaidUnitCompleteRequest
storport!RaidpAdapterRedirectDpcRoutine
nt!KiExecuteAllDpcs
nt!KiRetireDpcList

I looked through the code and noticed we never call StorPortAcquireSpinLock to acquire the InterruptLock to protect us from such pre-emption. Another way to achieve this would be to indicate we run at half-duplex rather than full-duplex but that would degrade the general performance of the driver. I'm not sure why we didn't run into this way before now - is there some other re-entrance protection algorithm besides the two above that others are aware of? If not, I believe we need to fix this asap.

Suggestions:

A. Simplest approach is to lock down all of NVMeStartIo as per below (not tested yet), but we may almost as well run half-duplex if we do this:

1. At the very top of NVMeStartIo:

    /* we should never be holding the interrupt lock upon entry to NVMeStartIo.
     * Acquire the Interrupt Spin Lock to protect against getting hit by our ISR.
     */
    if (NULL == pAdapterExtension->hInterruptLock) {
        StorPortAcquireSpinLock(pAdapterExtension, InterruptLock, NULL, &pAdapterExtension->hInterruptLock);
    } else {
        ASSERT(FALSE);
    }

2. At the very top of IO_StorPortNotification:

    PNVME_DEVICE_EXTENSION pAE = (PNVME_DEVICE_EXTENSION) pHwDeviceExtension;

    /* if we got here from NvmeStartIo we need to release the interrupt lock */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }

3. At the very bottom of NVMeStartIo:

    /* if we didn't release the Interrupt Lock in one of the calls to
     * IO_StorPortNotification above we need to release before we exit NVMEStartIo
     */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }
    return TRUE;
} /* NVMeStartIo */

B. Better approach is to just lock ProcessIo(). But code exists in that routine which acquires the StartIo lock - we can't take locks out of order or we'll cause deadlock. Right now that code never gets invoked - what was it for? Do we still need it? Can ProcessIo() get called from non-StartIo paths? Can it get called multiple times?

Not having been involved in the initial development of this driver, I would need to study the flow to make sure to respect the StorPort lock acquiring/releasing hierarchy rules at all times. If those conversant in the overall developmental history and architecture of this driver could share their thoughts, that would be great. Thanks, Judy -------------- next part -------------- An HTML attachment was scrubbed... URL: From Yong.sc.Chen at huawei.com Fri Jul 19 11:08:25 2013 From: Yong.sc.Chen at huawei.com (Yong Chen) Date: Fri, 19 Jul 2013 18:08:25 +0000 Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A5131278438@SSIEXCH-MB3.ssi.samsung.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com> <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com> <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com> <02EC085151D99A469E06988E94FEBCDB1C42F32A@dfweml513-mbs.china.huawei.com> <82C9F782B054C94B9FC04A331649C77A2FC02217@FMSMSX112.amr.corp.intel.com> <36E8D38D6B771A4BBDB1C0D800158A5131278438@SSIEXCH-MB3.ssi.samsung.com> Message-ID: <02EC085151D99A469E06988E94FEBCDB1C42F518@dfweml513-mbs.china.huawei.com> Judy, you are right. There is no ISR or DPC in dump mode. The switch simply saves one function call. From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com] Sent: Friday, July 19, 2013 4:30 AM To: Luse, Paul E; Yong Chen; Freyensee, James P; Murray, Kris R; Po-Yen Chang; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? I am wondering now though if there is some need for at least some variant/vestiges of completion-in-isr-instead-of-dpc. Isn't it the case that in dump mode - ie either during crash-dump or hibernation - the driver is not allowed to schedule a DPC in it's ISR to do completions? We could put some logic in the ISR to check whether we are in dump mode or not and if we are, do the completions in the ISR itself. After all, if we are in dump mode, we are single-threaded anyway and won't have to worry about pre-emption of our StartIo routine by our ISR. 
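[Editorial sketch] The dump-mode question raised here comes down to one branch in whatever routine services completions: drain them inline when no DPC can be scheduled, otherwise defer to a DPC. The model below is plain C with hypothetical names throughout (`mock_state`, `in_dump_mode`, `handle_completion_event`); it does not use the StorPort API and is only meant to make the proposed decision concrete, under the assumption that the driver can tell whether it is running in dump mode.

```c
#include <stdbool.h>

/* Which path a completion event took in this model. */
typedef enum {
    COMPLETED_INLINE,
    DEFERRED_TO_DPC
} completion_path;

/* Hypothetical driver state; every field is an assumption for illustration. */
typedef struct {
    bool in_dump_mode;      /* true during crash dump or hibernation */
    int  pending;           /* completions waiting to be processed */
    int  completed_inline;  /* completions drained without a DPC */
    int  dpc_requests;      /* how many times a DPC was requested */
} mock_state;

/* In dump mode, drain pending completions on the spot, since no DPC
 * will run. Otherwise leave the queue alone and request a DPC. */
static completion_path handle_completion_event(mock_state *s)
{
    if (s->in_dump_mode) {
        s->completed_inline += s->pending;
        s->pending = 0;
        return COMPLETED_INLINE;
    }
    s->dpc_requests++;
    return DEFERRED_TO_DPC;
}
```

Because the choice is made at run time from a flag, both paths stay in every build, unlike a compile switch that leaves one path untested.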
-------------- next part -------------- An HTML attachment was scrubbed... URL: From judy.brock at ssi.samsung.com Fri Jul 19 15:45:42 2013 From: judy.brock at ssi.samsung.com (Judy Brock-SSI) Date: Fri, 19 Jul 2013 22:45:42 +0000 Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? In-Reply-To: <02EC085151D99A469E06988E94FEBCDB1C42F518@dfweml513-mbs.china.huawei.com> References: <548C5470AAD9DA4A85D259B663190D361FFF6E02@corpmail1.na.ads.idt.com> <36E8D38D6B771A4BBDB1C0D800158A5131277F9B@SSIEXCH-MB3.ssi.samsung.com> <40A0B8B92CE0F94685A03264958540C4DEC20D@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> <6B4557D9CF036C4E8F9D6C561818DABB365E3A33@FMSMSX112.amr.corp.intel.com> <2D98093777D3FD46A36253F35FE9D6938086C18C@ORSMSX109.amr.corp.intel.com> <82C9F782B054C94B9FC04A331649C77A2FC01DFA@FMSMSX112.amr.corp.intel.com> <02EC085151D99A469E06988E94FEBCDB1C42F32A@dfweml513-mbs.china.huawei.com> <82C9F782B054C94B9FC04A331649C77A2FC02217@FMSMSX112.amr.corp.intel.com> <36E8D38D6B771A4BBDB1C0D800158A5131278438@SSIEXCH-MB3.ssi.samsung.com> <02EC085151D99A469E06988E94FEBCDB1C42F518@dfweml513-mbs.china.huawei.com> Message-ID: <36E8D38D6B771A4BBDB1C0D800158A51312785E8@SSIEXCH-MB3.ssi.samsung.com> Then let's get rid of it - to retain it, we would need to fix the holes we are aware of and probably others we haven't been hit by yet. And then we need to remember to maintain both paths - saving one function call is not worth the headaches. Thoughts?
Judy From: Yong Chen [mailto:Yong.sc.Chen at huawei.com] Sent: Friday, July 19, 2013 11:08 AM To: Judy Brock-SSI; Luse, Paul E; Freyensee, James P; Murray, Kris R; Po-Yen Chang; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Judy, you are right. There is no ISR or DPC in dump mode. The switch simply saves one function call. From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com] Sent: Friday, July 19, 2013 4:30 AM To: Luse, Paul E; Yong Chen; Freyensee, James P; Murray, Kris R; Po-Yen Chang; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? I am wondering now though if there is some need for at least some variant/vestiges of completion-in-isr-instead-of-dpc. Isn't it the case that in dump mode - ie either during crash-dump or hibernation - the driver is not allowed to schedule a DPC in it's ISR to do completions? We could put some logic in the ISR to check whether we are in dump mode or not and if we are, do the completions in the ISR itself. After all, if we are in dump mode, we are single-threaded anyway and won't have to worry about pre-emption of our StartIo routine by our ISR. Thanks, Judy From: Luse, Paul E [mailto:paul.e.luse at intel.com] Sent: Thursday, July 18, 2013 7:21 PM To: Yong Chen; Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? All of the issues could of course be fixed, the point I think is that there's no real reason to fix them.... 
From: Yong Chen [mailto:Yong.sc.Chen at huawei.com] Sent: Thursday, July 18, 2013 6:36 PM To: Luse, Paul E; Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: RE: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? I think completion in ISR served some purposes as I used this switch during development. Unfortunately I found it broken as well during hibernation waking-up: SntiTranslateStartStopUnit() callback routine issues ProcessIo() directly in ISR , which will bluescreen. I was about to put a note in the sources in my code change. So they are multiple ways hitting it as we know it now. From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Luse, Paul E Sent: Thursday, July 18, 2013 4:56 PM To: Freyensee, James P; Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? So it was originally in there during early dev to measure system performance impact of completing in the ISR or pushing off to a DPC; general old school rule of thumb is to minimize ISR work and finish everything else in a DPC to be friendlier to the system. With NVMe completion being so lightweight we figured we could get away without the DPC and it used to work both ways under heavy stress. As both methods are not always tested (after all it's a compile switch) clearly its easy to break one of them. I actually ran the tests DPC vs ISR back then and there was no significant impact either way. I was using xperf and I'm sure I shared the results with the other members of the original team as well - doubt I have them anymore but I'll look. 
Either way, at this point in time its probably is a good simplification to pick one method and remove the compile switch for the other. I'd probably stick with the DPC route as (a) there was no major benefit from finishing in ISR and (b) sounds like its busted now anyway :) From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Freyensee, James P Sent: Thursday, July 18, 2013 4:43 PM To: Murray, Kris R; Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Out of curiosity, what was the original reason to have the ISR path in the first place? If it is currently in the driver code, there must had been some purpose to be able to either compile it using an ISR or a DPC. From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Murray, Kris R Sent: Thursday, July 18, 2013 4:29 PM To: Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Judy, I have no problems removing it. ~Kris From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Po-Yen Chang Sent: Thursday, July 18, 2013 4:27 PM To: Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? Judy, I feel the same way as well. Let's wait for the response from LSI and Intel on this. If they all agree, I will go ahead remove it. 
Thanks, Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Judy Brock-SSI Sent: Thursday, July 18, 2013 4:17 PM To: Judy Brock-SSI; nvmewin at lists.openfabrics.org Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR? So it looks like the reason this problem was not seen before is because it only surfaces when the COMPLETE_IN_DPC compile flag is not set. In other words, the COMPLETE_IN_ISR path is broken because it accesses our HwDeviceExtension without being synchronized with other paths in the driver which do the same. We can either fix the path which does completions in the ISR or get rid of that option entirely. Since it's generally considered bad practice to do that kind of work in an ISR because it's supposed to be as lean and mean as possible, would the team be adverse to getting rid of the logic which optionally allows completions to be handled by the ISR? If we insist on retaining it, a) we should come up with a good reason why and b) we should fix it asap because it is definitely not safe to use in its present form. Personally I vote for removal - we wouldn't need the COMPLETE_IN_DPC flag either anymore if we go that route. Thanks, Judy From: Judy Brock-SSI Sent: Thursday, July 18, 2013 6:22 AM To: Judy Brock-SSI; 'nvmewin at lists.openfabrics.org' Subject: RE: NvmeStartio path critical section handling not protected from NVMe ISR? I just thought of another way to handle this problem. Could we not call StorPortSynchronizeAccess() with a pointer back to our ProcessIo() routine? ProcessIo would get called before the call to StorPortSynchronizeAccess() returns and this would have the effect of guaranteeing synchronization with our ISR. This seems like a much cleaner solution that a lock-acquiring approach. 
I still don't know if there are any issues with ProcessIo being called multiple times, from non-StartIo code paths, etc. - would still need to be looked at. Thanks, Judy From: Judy Brock-SSI Sent: Wednesday, July 17, 2013 10:08 PM To: nvmewin at lists.openfabrics.org Subject: NvmeStartio path critical section handling not protected from NVMe ISR? All, Under Windows Server 2012, I've seen a crash where NVMeStartIo() gets interrupted by our ISR at a time when it's in the middle of manipulating a linked list critical data structure which the ISR then goes on to attempt to manipulate also - which results in a crash. Below is the call stack - see where I've inserted the comment "<---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK" 2: kd> kc Call Site nt!RtlpBreakWithStatusInstruction nt!KiBugCheckDebugBreak nt!KeBugCheck2 nt!KeBugCheckEx nt!KiBugCheckDispatch nt!KiFastFailDispatch nt!KiRaiseSecurityCheckFailure nvme!RtlFailFast nvme!FatalListEntryError nvme!RtlpCheckListEntry nvme!InsertTailList nvme!NVMeCompleteCmd nvme!NVMeIsrMsix nt!KiInterruptDispatch <---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK nvme!RemoveHeadList nvme!NVMeGetCmdEntry nvme!ProcessIo nvme!NVMeStartIo storport!RaidpAdapterContinueScatterGather hal!HalpAllocateAdapterCallbackV2 hal!IoFreeAdapterChannelV2 hal!HalAllocateAdapterChannelV2 hal!HalBuildScatterGatherListV2 storport!RaUnitStartIo storport!RaidUnitCompleteRequest storport!RaidpAdapterRedirectDpcRoutine nt!KiExecuteAllDpcs nt!KiRetireDpcList I looked through the code and noticed we never call StorPortAcquireSpinLock to acquire the InterruptLock to protect us from such pre-emption. Another way to achieve this would be to indicate we run at half-duplex rather than full-duplex but that would degrade the general performance of the driver. 
I'm not sure why we didn't run into this way before now - is there some other re-entrance protection algorithm besides the two above that others are aware of? If not, I believe we need to fix this asap.

Suggestions:

A. Simplest approach is to lock down all of NVMeStartIo as per below (not tested yet) but we may almost as well run half-duplex if we do this:

1. At the very top of NVMeStartIo:

    /* we should never be holding the interrupt lock upon entry to NVMeStartIo.
     * Acquire the Interrupt Spin Lock to protect against getting hit by our ISR.
     */
    if (NULL == pAdapterExtension->hInterruptLock) {
        StorPortAcquireSpinLock(pAdapterExtension, InterruptLock, NULL, &pAdapterExtension->hInterruptLock);
    } else {
        ASSERT(FALSE);
    }

2. At the very top of IO_StorPortNotification:

    PNVME_DEVICE_EXTENSION pAE = (PNVME_DEVICE_EXTENSION) pHwDeviceExtension;

    /* if we got here from NvmeStartIo we need to release the interrupt lock */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }

3. At the very bottom of NVMeStartIo:

    /* if we didn't release the Interrupt Lock in one of the calls to
     * IO_StorPortNotification above we need to release before we exit NVMeStartIo
     */
    if (NULL != pAE->hInterruptLock) {
        STOR_LOCK_HANDLE hInterruptLockCopy = pAE->hInterruptLock;
        pAE->hInterruptLock = NULL;
        StorPortReleaseSpinLock(pAE, &hInterruptLockCopy);
    }
    return TRUE;
    } /* NVMeStartIo */

B. Better approach is to just lock ProcessIo(). But code exists in that routine which acquires the StartIo lock - we can't take locks out of order or we'll cause deadlock. Right now that code never gets invoked - what was it for? Do we still need it? Can ProcessIo() get called from non-StartIo Paths? Can it get called multiple times?
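The lock-ordering concern in suggestion B can be sketched as a small stand-alone check. This is plain C with no StorPort calls. The rank values, the LockTracker type, and the helper names are hypothetical. The ordering itself (the StartIo lock must not be requested while the interrupt lock is held) is the assumption implied by the paragraph above, not something verified against the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative ranks only: a lock may be taken only if its rank is
 * higher than that of every lock already held. */
typedef enum {
    LOCK_NONE      = 0,
    LOCK_DPC       = 1,
    LOCK_STARTIO   = 2,
    LOCK_INTERRUPT = 3
} LockRank;

/* Tracks the locks held on one code path, in acquisition order. */
typedef struct {
    LockRank held[3];
    int      depth;
} LockTracker;

static LockRank HighestHeld(const LockTracker *t)
{
    return (t->depth > 0) ? t->held[t->depth - 1] : LOCK_NONE;
}

/* Returns false instead of acquiring when the request is out of order,
 * which is the situation that could deadlock in a real driver. */
static bool TryAcquireInOrder(LockTracker *t, LockRank lock)
{
    if (lock <= HighestHeld(t)) {
        return false;
    }
    assert(t->depth < 3);
    t->held[t->depth++] = lock;
    return true;
}

/* Releases the most recently acquired lock. */
static void ReleaseLast(LockTracker *t)
{
    assert(t->depth > 0);
    t->depth--;
}
```

With this model, a path that already holds LOCK_INTERRUPT and then asks for LOCK_STARTIO is rejected, while taking LOCK_STARTIO first and LOCK_INTERRUPT second is accepted. That is why wrapping all of ProcessIo() in the interrupt lock would conflict with code inside it that takes the StartIo lock.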
Not having been involved in the initial development of this driver, I would need to study the flow to make sure to respect the StorPort lock acquiring/releasing hierarchy rules at all times. If those conversant in the overall developmental history and architecture of this driver could share their thoughts, that would be great. Thanks, Judy -------------- next part -------------- An HTML attachment was scrubbed... URL: From rrandall at micron.com Mon Jul 22 12:27:58 2013 From: rrandall at micron.com (Robert Randall (rrandall)) Date: Mon, 22 Jul 2013 19:27:58 +0000 Subject: [nvmewin] Building for Windows 8 and Windows 8+ Message-ID: <70C73440F9F7C24F81A11355C292A9B5902DB0FE@NTXBOIMBX04.micron.com> All, Are there plans to supply a build environment for v8 of the WDK and forward? The current source tree appears to rely only on the older tools (build, etc). With Microsoft forcing folks to use Visual Studio 11 / 2012 starting with v8 of the WDK does everyone use a private build environment or are there plans to add a VS Solution or VS compatible makefile to the repository? I tend to be IDE agnostic and prefer makefiles but either one would work fine. I may be able to volunteer for the task... Best regards, Robert. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rrandall at micron.com Mon Jul 22 12:54:21 2013 From: rrandall at micron.com (Robert Randall (rrandall)) Date: Mon, 22 Jul 2013 19:54:21 +0000 Subject: [nvmewin] driver tracing volunteer Message-ID: <70C73440F9F7C24F81A11355C292A9B5902DB164@NTXBOIMBX04.micron.com> All, I believe that implementing driver tracing (the in-kernel WMI based tracing, WPP tracing, etc) is on the to-do list for the project. I am volunteering for the task. Is the correct place to start a formal proposal on how to integrate tracing into the driver? Best regards, Robert -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Kwok.Kong at pmcs.com Mon Jul 22 13:13:39 2013 From: Kwok.Kong at pmcs.com (Kwok Kong) Date: Mon, 22 Jul 2013 13:13:39 -0700 Subject: [nvmewin] driver tracing volunteer In-Reply-To: <70C73440F9F7C24F81A11355C292A9B5902DB164@NTXBOIMBX04.micron.com> References: <70C73440F9F7C24F81A11355C292A9B5902DB164@NTXBOIMBX04.micron.com> Message-ID: <40A0B8B92CE0F94685A03264958540C4DEC985@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Robert, You come to the right place. It is great that you volunteer to implement this feature. Please send out your proposal to this group for review. We don't have any formal procedure to approve any new features. All you need to do is to implement the new features, fully test it and send the source out for final testing by others and the approval. Once you get the approval , the code will be checked into by Ray from intel (it is Alex from PMC-Sierra for now as Ray is on Sabbatical). It is always better to send the proposal to the group before spending lots of time on the implementation to avoid making big changes during code review to get the final approval. Thanks -Kwok From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robert Randall (rrandall) Sent: Monday, July 22, 2013 12:54 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] driver tracing volunteer All, I believe that implementing driver tracing (the in-kernel WMI based tracing, WPP tracing, etc) is on the to-do list for the project. I am volunteering for the task. Is the correct place to start a formal proposal on how to integrate tracing into the driver? Best regards, Robert -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From Alex.Chang at pmcs.com Tue Jul 23 09:42:17 2013
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Tue, 23 Jul 2013 09:42:17 -0700
Subject: [nvmewin] ***UNCHECKED*** OFA Windows NVMe Driver Release Version 1.2 Candidate
Message-ID: <40A0B8B92CE0F94685A03264958540C4DECCB9@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>

Hi all,

Please find the attached release candidate, which I have run through all required tests. The changes are:

1. Removal of the compile flag COMPLETE_IN_DPC and the associated code.
2. Because the released driver will be test signed, nvme.inf has been changed so that the driver installs as long as the device reports Class Code 010802.

The password is ofawin.

Please review the changes and provide your feedback at your earliest convenience. We plan to release it before the end of July once the approvals are received from Intel and LSI.

Thanks,
Alex
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ReleaseV1.2.zip
Type: application/x-zip-compressed
Size: 170653 bytes
Desc: ReleaseV1.2.zip
URL:
From rrandall at micron.com Tue Jul 23 14:56:36 2013
From: rrandall at micron.com (Robert Randall (rrandall))
Date: Tue, 23 Jul 2013 21:56:36 +0000
Subject: [nvmewin] driver tracing volunteer
In-Reply-To: <40A0B8B92CE0F94685A03264958540C4DEC985@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
References: <70C73440F9F7C24F81A11355C292A9B5902DB164@NTXBOIMBX04.micron.com> <40A0B8B92CE0F94685A03264958540C4DEC985@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca>
Message-ID: <70C73440F9F7C24F81A11355C292A9B5902DBF85@NTXBOIMBX04.micron.com>

Quick overview and proposal on driver tracing to get the conversation started. My apologies to those already steeped in the WPP tea leaves.

From a code perspective driver tracing has two major pieces: the trace levels and trace flags definitions, and the macro used to generate the trace message data.
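The two pieces named above can be sketched in plain C, without the WPP preprocessor. Everything here is hypothetical and for illustration only: the flag names, the g_enabledFlags variable, the EmitTrace helper, and the SKETCH_TRACE macro are not WPP APIs. Unlike real WPP tracing, which records raw data for later formatting, this sketch formats the message immediately.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Piece 1: flag definitions, one bit per message class (illustrative names). */
enum {
    T_ERR    = 1u << 0,
    T_WARN   = 1u << 1,
    T_HWINIT = 1u << 2
};

static unsigned g_enabledFlags;    /* set by whoever controls the trace session */
static int      g_messagesEmitted; /* counts messages that passed the flag check */

/* Writes one printf-style message, prefixed with source file and line. */
static void EmitTrace(const char *file, int line, const char *fmt, ...)
{
    va_list args;

    fprintf(stderr, "%s:%d ", file, line);
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    fputc('\n', stderr);
    g_messagesEmitted++;
}

/* Piece 2: the trace macro. A disabled class costs one bitwise test,
 * and its arguments are never evaluated or formatted. */
#define SKETCH_TRACE(flag, ...)                               \
    do {                                                      \
        if (g_enabledFlags & (flag)) {                        \
            EmitTrace(__FILE__, __LINE__, __VA_ARGS__);       \
        }                                                     \
    } while (0)
```

A call then looks like SKETCH_TRACE(T_HWINIT, "MaximumTransferLength 0x%x", value). With only T_HWINIT set in g_enabledFlags, a T_WARN call falls through the bit test and emits nothing.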
The on-line documentation begins here http://msdn.microsoft.com/en-us/library/windows/hardware/ff556204(v=vs.85).aspx.

The in-kernel tracing stores the raw data only. Microsoft provides utilities to format the raw data into human readable output (tracefmt.exe for example). There is a collection of default data which is collected automatically like source file name, line number, etc. Data the code inserts works like a printf call, for example:

    TRACE(_T_HWINIT, "MaximumTransferLength 0x%x", PortConfig->MaximumTransferLength);

In STOR Port miniport drivers I've not experienced a need for both trace levels and trace flags. I've used trace flags only. This allows for up to 32 different trace message classes or types to be defined. Here is a proposed definition of WPP trace flags and a few required macros.

    #define WPP_CONTROL_GUIDS \
        WPP_DEFINE_CONTROL_GUID(nvme,(A_GUID), \
            WPP_DEFINE_BIT(_T_ERR) \
            WPP_DEFINE_BIT(_T_WARN) \
            WPP_DEFINE_BIT(_T_DEBUG) \
            WPP_DEFINE_BIT(_T_API_ERR) \
            WPP_DEFINE_BIT(_T_IO_ERR) \
            WPP_DEFINE_BIT(_T_LOCKS) \
            WPP_DEFINE_BIT(_T_INT) \
            WPP_DEFINE_BIT(_T_PNP) \
            WPP_DEFINE_BIT(_T_PWR) \
            WPP_DEFINE_BIT(_T_HWINIT) \
            WPP_DEFINE_BIT(_T_IO_TX) \
            WPP_DEFINE_BIT(_T_IO_RX) \
            WPP_DEFINE_BIT(_T_DEV_ERR) \
            WPP_DEFINE_BIT(_T_SCSI_ERR) \
            WPP_DEFINE_BIT(_T_FUA) \
            WPP_DEFINE_BIT(_T_IOCTL_ERR) \
            WPP_DEFINE_BIT(DebugFlag16) \
            WPP_DEFINE_BIT(DebugFlag17) \
            WPP_DEFINE_BIT(DebugFlag18) \
            WPP_DEFINE_BIT(DebugFlag19) \
            WPP_DEFINE_BIT(DebugFlag20) \
            WPP_DEFINE_BIT(DebugFlag21) \
            WPP_DEFINE_BIT(DebugFlag22) \
            WPP_DEFINE_BIT(DebugFlag23) \
            WPP_DEFINE_BIT(DebugFlag24) \
            WPP_DEFINE_BIT(DebugFlag25) \
            WPP_DEFINE_BIT(DebugFlag26) \
            WPP_DEFINE_BIT(DebugFlag27) \
            WPP_DEFINE_BIT(DebugFlag28) \
            WPP_DEFINE_BIT(DebugFlag29) \
            WPP_DEFINE_BIT(DebugFlag30) \
            WPP_DEFINE_BIT(DebugFlag31) \
            )

    #define WPP_FLAGS_LOGGER(flags) WPP_LEVEL_LOGGER(flags)
    #define WPP_FLAGS_ENABLED(flags) WPP_LEVEL_ENABLED(flags)

    // enable global logger / boot time logging support
    #define WPP_GLOBALLOGGER

The WPP preprocessor is executed against each source file containing trace messages. The preprocessor is relatively clever and will catch most argument problems at the time of preprocessing. Determining whether a trace call is enabled is inexpensive (a bitwise operation), so the overhead of leaving tracing in a production driver is quite low. Of course, noise in the hot path should be minimal. The preprocessor generates all of the C macros used to generate the trace macro. All of the formatting information is stored in the PDB of the driver using #pragma statements. The generated header file matches the name of the source file and has a .tmh extension. This header file must be included in the source file.

Collecting trace information can be a bit complicated but it is very useful. For example, you can choose to capture data to in-kernel buffers only and format the human readable output directly in the debugger (Windbg). I tend to find tracing most useful when debugging device bring-up, power state transitions, and plug and play transitions. This can be very handy when an HCK test such as CHAOS fails and you're left wondering why.

A few additional examples:

    TRACE(_T_HWINIT, "nvme bar not found");

or

    TRACE(_T_HWINIT, "FATAL: passive initialization failed");

Since we are working with C macros we have some freedom. We can, for example, identify the adapter instance using a macro:

    TRACE(_T_HWINIT, WPPIDFMT "FATAL: passive initialization failed", EXT_TO_ID(pAE));

Where EXT_TO_ID could be anything we prefer to use such as (BDF).

Have a read and share feedback, questions, etc.

Best regards,
Robert.

From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Monday, July 22, 2013 3:14 PM
To: Robert Randall (rrandall); nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] driver tracing volunteer

Robert,

You come to the right place. It is great that you volunteer to implement this feature. Please send out your proposal to this group for review.
We don't have any formal procedure to approve any new features. All you need to do is to implement the new features, fully test it and send the source out for final testing by others and the approval. Once you get the approval , the code will be checked into by Ray from intel (it is Alex from PMC-Sierra for now as Ray is on Sabbatical). It is always better to send the proposal to the group before spending lots of time on the implementation to avoid making big changes during code review to get the final approval. Thanks -Kwok From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robert Randall (rrandall) Sent: Monday, July 22, 2013 12:54 PM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] driver tracing volunteer All, I believe that implementing driver tracing (the in-kernel WMI based tracing, WPP tracing, etc) is on the to-do list for the project. I am volunteering for the task. Is the correct place to start a formal proposal on how to integrate tracing into the driver? Best regards, Robert -------------- next part -------------- An HTML attachment was scrubbed... URL: From Alex.Chang at pmcs.com Fri Jul 26 14:16:43 2013 From: Alex.Chang at pmcs.com (Alex Chang) Date: Fri, 26 Jul 2013 14:16:43 -0700 Subject: [nvmewin] NVMe Windows DB Is LOCKED - Pushing Patch From PMC - Revision 1.2 Release Message-ID: <40A0B8B92CE0F94685A03264958540C4E43245@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Locking NVMe Windows DB. Thanks, Alex nvmewin mailing list nvmewin at lists.openfabrics.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Alex.Chang at pmcs.com Fri Jul 26 14:40:25 2013 From: Alex.Chang at pmcs.com (Alex Chang) Date: Fri, 26 Jul 2013 14:40:25 -0700 Subject: [nvmewin] NVMe Windows Repo Is UNLOCKED - Completed in DPC Only Pushed Message-ID: <40A0B8B92CE0F94685A03264958540C4E43260@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Hi all, Latest patch from PMC (Removal of COMPLETE_IN_DPC) had been pushed to trunk. Meanwhile, a new tag called "Completed_in_dpc_only" had been created. If anyone has any questions, please feel free to contact me. Thanks, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From Alex.Chang at pmcs.com Mon Jul 29 15:22:46 2013 From: Alex.Chang at pmcs.com (Alex Chang) Date: Mon, 29 Jul 2013 15:22:46 -0700 Subject: [nvmewin] ***UNCHECKED*** FW: NVMe Windows Driver Release 1.2 Package Message-ID: <40A0B8B92CE0F94685A03264958540C4E436F2@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Hi all, FYI. Since this is the first release from this group, I'd like to collect more feedbacks from all of subscribers. Thanks, Alex From: Alex Chang Sent: Monday, July 29, 2013 2:16 PM To: kris.r.murray at intel.com; Patel, Arpit (Arpit.Patel at lsi.com) Cc: Kwok Kong; Knoblaugh, Rick (Rick.Knoblaugh at lsi.com); 'raymond.c.robles at intel.com' Subject: NVMe Windows Driver Release 1.2 Package Hi Kris and Arpit, I have prepared the release package for your review. Please find it in the attachment. Password is ofanvme. As you may find that I added a readme.txt to detail the release related information. Please review it and let me know if you want to add something. Once you approve it, I will create a new directory called "releases" at the same level of trunk, tags, etc. Under "releases", a directory called "revision_1.2" will be created to accommodate the package. I have tested it on all supported Windows flavors via drive formatting, IOmeter 4K read/write, SDStress and SCSICompliance. Please test it out and give me any feedbacks if you have. 
If you're fine with it, please notify me your approval as soon as possible. I plan to release it before end of July (this coming Wednesday) per our agreement in the meeting in June. Thanks, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: revision_1.2.zip Type: application/x-zip-compressed Size: 234509 bytes Desc: revision_1.2.zip URL: From ram_sunee at yahoo.com Wed Jul 31 07:48:11 2013 From: ram_sunee at yahoo.com (Ramesh Mangamuri) Date: Wed, 31 Jul 2013 07:48:11 -0700 (PDT) Subject: [nvmewin] Release of the nvmewin driver Message-ID: <1375282091.54104.YahooMailNeo@web163904.mail.gq1.yahoo.com> Hello, Can someone please confirm if there will be official release of nvmewin driver today, as planned in JUNE meeting ?. If so, when can I download the driver ? Best Regards, Ramesh   ******************************************************************************************************************************************** This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the sender.If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. ********************************************************************************************************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Alex.Chang at pmcs.com Wed Jul 31 08:10:44 2013 From: Alex.Chang at pmcs.com (Alex Chang) Date: Wed, 31 Jul 2013 08:10:44 -0700 Subject: [nvmewin] Release of the nvmewin driver In-Reply-To: <1375282091.54104.YahooMailNeo@web163904.mail.gq1.yahoo.com> References: <1375282091.54104.YahooMailNeo@web163904.mail.gq1.yahoo.com> Message-ID: <40A0B8B92CE0F94685A03264958540C4EB0435@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Hi Ramesh, The release will be finished by the end of this week. You may download it after you receive a notification email when it's been released. Thanks, Alex From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Ramesh Mangamuri Sent: Wednesday, July 31, 2013 7:48 AM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] Release of the nvmewin driver Hello, Can someone please confirm if there will be official release of nvmewin driver today, as planned in JUNE meeting ?. If so, when can I download the driver ? Best Regards, Ramesh -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Kwok.Kong at pmcs.com Wed Jul 31 08:16:23 2013 From: Kwok.Kong at pmcs.com (Kwok Kong) Date: Wed, 31 Jul 2013 08:16:23 -0700 Subject: [nvmewin] Release of the nvmewin driver In-Reply-To: <1375282091.54104.YahooMailNeo@web163904.mail.gq1.yahoo.com> References: <1375282091.54104.YahooMailNeo@web163904.mail.gq1.yahoo.com> Message-ID: <40A0B8B92CE0F94685A03264958540C4EB0438@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Ramesh, A release is planned today or the latest this Friday. Announcement will be made to this mailing list after a release is made. Thanks -Kwok From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Ramesh Mangamuri Sent: Wednesday, July 31, 2013 7:48 AM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] Release of the nvmewin driver Hello, Can someone please confirm if there will be official release of nvmewin driver today, as planned in JUNE meeting ?. If so, when can I download the driver ? Best Regards, Ramesh -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Dharani.Kotte at sandisk.com Wed Jul 31 09:52:56 2013 From: Dharani.Kotte at sandisk.com (Dharani Kotte) Date: Wed, 31 Jul 2013 16:52:56 +0000 Subject: [nvmewin] ***UNCHECKED*** [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] OFA NVMe Windows driver contribution - 32-bit support Win7/Win8 Message-ID: <23EC73C80FB59046A6B7B8EB7B382659320578CB@SACMBXIP01.sdcorp.global.sandisk.com> The attached is the code with 32-bit support tested on the Win7/Win8 32-bit systems. Please review it and let me know the comments. Password: sndk1234 Thanks, Dharani. From: Kwok Kong [mailto:Kwok.Kong at pmcs.com] Sent: Wednesday, July 31, 2013 9:29 AM To: Dharani Kotte Cc: Dave Landsman; Gurpreet Anand; Sumant Patro Subject: RE: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] OFA NVMe Windows driver contribution Dharani, This is great. Would you please email it out to the following mailing list asking for review and approval ? nvmewin at lists.openfabrics.org thanks -Kwok From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com] Sent: Wednesday, July 31, 2013 9:08 AM To: Kong, Kwok (Kwok.Kong at idt.com); Kwok Kong Cc: Dave Landsman; Gurpreet Anand; Sumant Patro Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] OFA NVMe Windows driver contribution Hi Kwok, The attached is the code with 32-bit support tested on the Win7/Win8 32-bit systems. Can you please send it for review and let me know the comments. Password: sndk1234 Thanks, Dharani. -----Original Message----- From: Kong, Kwok [mailto:Kwok.Kong at idt.com] Sent: Wednesday, June 26, 2013 12:58 To: Dave Landsman Subject: OFA NVMe Windows driver contribution Dave, We are working on a NVMe Windows driver feature planning for the Dec 2013 release. Samsung cannot take on the task to get the driver to support windows 32-bit systems. I wonder if Sandisk can take on the task to get the driver to work in Windows 32-bit system. It is much appreciated if Sandisk can take this task on. Please let me know what you think. 
Thanks -Kwok ________________________________ PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies). -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: source_32bit_support_07_17_2013_Review.zip Type: application/x-zip-compressed Size: 174216 bytes Desc: source_32bit_support_07_17_2013_Review.zip URL: From Alex.Chang at pmcs.com Wed Jul 31 11:40:31 2013 From: Alex.Chang at pmcs.com (Alex Chang) Date: Wed, 31 Jul 2013 11:40:31 -0700 Subject: [nvmewin] NVMe Windows Driver Released As Revision 1.2 Message-ID: <40A0B8B92CE0F94685A03264958540C4EB0594@bby1exm14.pmc_nt.nt.pmc-sierra.bc.ca> Dear all, As we planned in our meeting of June, the first formal release as revision 1.2 from NVMe Windows Work Group has been made available now. You may download it from NVMe Windows Work Group SVN Repository: http://www.openfabrics.org/svnrepo/nvmewin After downloading it, you may find the release package under the newly-created directory called "releases". More information can also be found in readme.txt file. Should you have any questions, please reply to this email list. Thanks, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: