From justina_lai at phison.com Tue Jun 6 01:20:19 2017
From: justina_lai at phison.com (Justina Lai)
Date: Tue, 6 Jun 2017 08:20:19 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To:
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com> <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com>
Message-ID:

Hi Raymond,

Will you help to release an official version to resolve this issue?
Thank you!

Best Regards,
Justina Lai #5707

From: Justina Lai
Sent: Friday, May 26, 2017 10:02 AM
To: 'Robles, Raymond C'; nvmewin at lists.openfabrics.org
Cc: Larry Li; 'umaparepalli at gmail.com'
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Raymond,

Thanks for reply.
Please find attachment. It's modified based on 4/8 157 version.
Thanks!

Best Regards,
Justina Lai #5707

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Friday, May 26, 2017 1:36 AM
To: Justina Lai; nvmewin at lists.openfabrics.org
Cc: Larry Li; 'umaparepalli at gmail.com'
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Justina,

The issue you mention below is a known issue. I thought there was a patch pushed to fix this, but if not, then my recommendation is to provide a patch that resolves this issue. The main issue is for commands that require 2 child commands, the lock is acquired a second time for the second time on the completions side of the first child command. It should be a very simple fix.

Normally, I request that community members who find the issue submit patches to resolve this issue. This is our model. Could you please provide the patch for the fix?

NOTE: Uma Parepalli is the new OFA chair and will handle this issue moving forward.

Thanks...
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Justina Lai
Sent: Thursday, May 25, 2017 1:03 AM
To: nvmewin at lists.openfabrics.org
Cc: Larry Li
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Dear Sir/Madam,

We are facing compatibility issue on our PS5007 NVMe device with recent 1.5 version nvme driver. Please check below issue description.

Phison PS5007 supports 7 IO queues, and if we use PS5007 with OFA driver on the platform core number>7, ex: 8-core or 12-core PC, we will see PC hang up and cannot enter OS like below:

[cid:image001.jpg at 01D2DEE0.CD885430]

After debugging on our side, we found the fail is caused by below flow:

----------------------------------------------------------------------
..........
1. SntiTranslateModeSense()
..........
2. IoCompletionRoutine()
    if (pDpc != NULL) {
        ASSERT(pAE->ntldrDump == FALSE);
        if (pAE->MultipleCoresToSingleQueueFlag) {
            StorPortAcquireSpinLock(pAE, StartIoLock, NULL, &StartLockHandle);   ----> execute lock
        } else {
            StorPortAcquireSpinLock(pAE, DpcLock, pDpc, &DpcLockhandle);
        }
    }
    .........
    callStorportNotification = pSrbExtension->pNvmeCompletionRoutine(pAE, (PVOID)pSrbExtension)
                               && (pSrbExtension->pSrb != NULL);
    .......
3. SntiTranslateModeSenseResponse()
    .......
    case MODE_SENSE_RETURN_ALL:
    .......
    if (supportsVwc == TRUE) {
        pSrbExt->pNvmeCompletionRoutine = SntiCompletionCallbackRoutine;
        /* Finally, make sure we issue the GET FEATURES command */
        SntiBuildGetFeaturesCmd(pSrbExt, VOLATILE_WRITE_CACHE);
        ioStarted = ProcessIo(pSrbExt->pNvmeDevExt, pSrbExt, NVME_QUEUE_TYPE_ADMIN, TRUE);
4. ProcessIo()
    ........
    if (AcquireLock == TRUE) {
        StorPortAcquireSpinLock(pAdapterExtension, StartIoLock, NULL, &hStartIoLock);   ----> double execute lock and cause (Assertion failure - code c0000420)
    }
----------------------------------------------------------------------

Since many users are now using our PS5007 device with 8-core or 12-core PC, they are facing this issue right now.
Could you please help to modify driver to solve this problem asap?
Any unclear point, please kindly let us know.
Thank you very much for the help!

Best Regards,
Justina Lai #5707

This message and any attachments are confidential and may be legally privileged. Any unauthorized review, use or distribution by anyone other than the intended recipient is strictly prohibited. If you are not the intended recipient, please immediately notify the sender, completely delete the message and any attachments, and destroy all copies. Your cooperation will be highly appreciated.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
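The double acquisition in the flow above can be reproduced in miniature. What follows is a minimal user-space sketch, not driver code and not the patch attached to this thread: `demo_lock`, `demo_acquire`, `demo_release`, `demo_process_io` and `demo_completion` are hypothetical names, and the lock reports a second acquisition as a failed return value where a real non-recursive spin lock would hang or assert. It assumes the completion path already holds the lock when the callback issues the follow-up command, as steps 2 to 4 describe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spin lock such as StartIoLock. */
typedef struct {
    bool held;
} demo_lock;

/* Returns false when the lock is already held, so the self-deadlock
 * shows up as a return value instead of a hang. */
bool demo_acquire(demo_lock *lock)
{
    assert(lock != 0);
    if (lock->held) {
        return false;
    }
    lock->held = true;
    return true;
}

void demo_release(demo_lock *lock)
{
    lock->held = false;
}

/* Plays the role of ProcessIo(..., AcquireLock): takes the lock only when asked. */
bool demo_process_io(demo_lock *lock, bool acquire_lock)
{
    if (acquire_lock && !demo_acquire(lock)) {
        return false;               /* second acquisition by the same owner */
    }
    /* ... the command would be submitted to the queue here ... */
    if (acquire_lock) {
        demo_release(lock);
    }
    return true;
}

/* Plays the role of the completion path: the lock is taken first, then the
 * completion callback issues a follow-up command through demo_process_io(). */
bool demo_completion(demo_lock *lock, bool child_acquires_lock)
{
    bool ok;

    if (!demo_acquire(lock)) {
        return false;
    }
    ok = demo_process_io(lock, child_acquires_lock);
    demo_release(lock);
    return ok;
}
```

Calling `demo_completion(&lock, true)` fails because the follow-up command tries to take a lock its caller already owns, while `demo_completion(&lock, false)` succeeds. Having the caller that already holds the lock pass FALSE is one common way to avoid this kind of failure; whether the submitted patch takes that route is not stated in the thread.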
From raymond.c.robles at intel.com Tue Jun 13 13:31:15 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Tue, 13 Jun 2017 20:31:15 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To:
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com> <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com>
Message-ID: <49158E750348AA499168FD41D889836082D1BEA2@fmsmsx117.amr.corp.intel.com>

Hi Justina,

I'm still waiting for feedback from Samsung and HGST on your patch.

Tom/Judy/Suman,
Can you please provide feedback on Phison's patch for the BSOD fix for grabbing the same lock twice?

Thanks...
Ray

From raymond.c.robles at intel.com Tue Jun 13 16:57:13 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Tue, 13 Jun 2017 23:57:13 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To:
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com> <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com>
Message-ID: <49158E750348AA499168FD41D889836082D1C2E1@fmsmsx117.amr.corp.intel.com>

Hi Justina,

I've reached out to the reviewing companies for feedback. One question for you, have you unit tested your fix? If so, can you provide your configs, platforms, and method for unit testing.

Thanks...
Ray

From justina_lai at phison.com Wed Jun 14 01:20:06 2017
From: justina_lai at phison.com (Justina Lai)
Date: Wed, 14 Jun 2017 08:20:06 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D1C2E1@fmsmsx117.amr.corp.intel.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com> <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D1C2E1@fmsmsx117.amr.corp.intel.com>
Message-ID: <0ec8b43775a14322b6fabb25c6a7c725@ExMBX2.phison.com>

Hi Raymond,

Yes, we have tested in our environment.

* Device: Phison PS5007
* Platform:

    MB                                     CPU                 CORE
    GIGABYTE X99                           I7 5820K 3.3GHz     12
    Customer's notebook (not launch yet)   I7 7700HQ 2.8GHz    8

* Test Method:
  With driver 1.5 version: after install driver and cold boot, PC hang up and cannot enter OS like below. Same situation for win7/win8.1/win10.

  [cid:image001.jpg at 01D2E529.ED0540D0]

  With our modified version: after install driver and cold boot, PC can enter OS successfully.

Thanks!

Best Regards,
Justina Lai #5707

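The two test platforms have 12 and 8 cores against the 7 IO queues reported for PS5007, which is the condition under which the trace shows `MultipleCoresToSingleQueueFlag` being set. A minimal sketch of that condition follows. The helper names are hypothetical, and the modulo mapping is only an assumed illustration of how several cores could end up on one queue, not the driver's actual assignment logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when at least one queue must serve several cores. */
bool demo_cores_share_queue(unsigned cores, unsigned io_queues)
{
    return io_queues > 0 && cores > io_queues;
}

/* Hypothetical round-robin mapping of a core index to a 0-based queue index.
 * This is an assumption made for illustration only. */
unsigned demo_queue_for_core(unsigned core, unsigned io_queues)
{
    assert(io_queues > 0);
    return core % io_queues;
}
```

With 7 queues, `demo_cores_share_queue(8, 7)` and `demo_cores_share_queue(12, 7)` are both true, while a 4-core machine would not share queues, which is consistent with the hang being reported only on platforms with more than 7 cores.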
From raymond.c.robles at intel.com Wed Jun 14 14:40:26 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Wed, 14 Jun 2017 21:40:26 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <0ec8b43775a14322b6fabb25c6a7c725@ExMBX2.phison.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com> <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D1C2E1@fmsmsx117.amr.corp.intel.com> <0ec8b43775a14322b6fabb25c6a7c725@ExMBX2.phison.com>
Message-ID: <49158E750348AA499168FD41D889836082D1CB9F@fmsmsx117.amr.corp.intel.com>

Hi Justina,

Thanks for providing the information below. As a friendly reminder, the OFA NVMe Win patch process is documented in our archives. Here is a copy of the patch submission process. Please let me know when you have completed full unit testing as outlined. Intel is currently running the unit tests outlined below on your patch. Note all tools needed are in the Process:

* Submitter needs to base their changes on the latest (and re-base/re-test prior to sending their patch).
* They send the patch to the email list "nvmewin at lists.openfabrics.org".
* Some review will happen over the reflector. The maintainer will send a message out that the db is locked when they're ready to apply the patch, which will be once at least one member from each mandatory reviewing company on the review panel has approved (can be via email or con call if needed). Once the patch is applied, the maintainer will send an email out.

Patch Contents:

* Code changes, short summary for SVN log, more verbose write up for release notes, confirmation of what Windows platforms had been tested.
* Patch must follow the coding guidelines as attached.
* All source files should be zipped up into a .zip file with password enabled. The zip file name should properly describe the main changes of the patch.

Reviews:

* Patches submitted by anyone, email to distribution list "nvmewin at lists.openfabrics.org".
* Patch submission should include time sensitivity/expectations.
* Patch submission should include justification for the patch (what value will it add, and if there are tradeoffs, what are they and why would we want to take a hit). If multiple implementation options were considered, what data/reasoning was behind the implementation choice.
* Patch submission should include files modified and explanation of code changes in each file.
* At a minimum reviews need to be completed by Intel, HGST, and Samsung representatives.
* Reviews include compliance with coding guidelines (in SVN) as well as logic.

Unit Testing (all patches and release candidates require, at a minimum, the following testing):

* 1 hour of data integrity testing using sdstress (Microsoft Tool)
* 1 hour of heavy stress testing using IOMETER covering, at least, 512B, 4KB and 128KB ranging from 1 OIO to 64 OIO both sequential and random
* Quick and slow format of both MBR and GPT partitioning
* Microsoft SCSI Compliance, no failures except (warnings OK)
* Additional testing with other tools is encouraged
* Occurs in all supported OSs for the release
    * 64-bit, Windows 7, 8.0, 8.1, server 2008R2 and 2012
    * 32-bit, Windows 7, 8.0
* Minimum test platform is latest QEMU. Those with their HW should test on it as well.
    * QEMU is available at https://github.com/nvmeqemu/nvmeqemu

Thanks...
Ray

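The IOMETER requirement in the unit testing list (512B, 4KB and 128KB transfers, 1 to 64 outstanding I/Os, sequential and random) can be written out as a test matrix. This is a minimal sketch under stated assumptions: the process text does not say which queue depths between 1 and 64 to use, so the doubling steps here are an assumption, and `demo_iometer_case` and `demo_build_sweep` are hypothetical names that are not part of IOMETER or the driver.

```c
#include <assert.h>
#include <stddef.h>

/* One hypothetical IOMETER access specification. */
typedef struct {
    unsigned block_bytes;   /* transfer size in bytes */
    unsigned outstanding;   /* outstanding I/Os (OIO) */
    int random;             /* 0 = sequential, 1 = random */
} demo_iometer_case;

/* Writes up to `cap` cases into `out` and returns how many cases the
 * full sweep contains, so callers can size the buffer. */
size_t demo_build_sweep(demo_iometer_case *out, size_t cap)
{
    static const unsigned sizes[] = { 512u, 4096u, 131072u }; /* 512B, 4KB, 128KB */
    size_t n = 0;

    assert(out != NULL || cap == 0);
    for (size_t s = 0; s < sizeof sizes / sizeof sizes[0]; ++s) {
        /* Assumed steps: 1, 2, 4, ..., 64 outstanding I/Os. */
        for (unsigned oio = 1; oio <= 64; oio *= 2) {
            for (int rnd = 0; rnd <= 1; ++rnd) {
                if (n < cap) {
                    out[n].block_bytes = sizes[s];
                    out[n].outstanding = oio;
                    out[n].random = rnd;
                }
                ++n;
            }
        }
    }
    return n;
}
```

Under the doubling assumption the sweep has 3 sizes, 7 queue depths and 2 access patterns, so 42 cases in total. A different choice of intermediate depths would change that count but not the structure.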
If you are not the intended recipient, please immediately notify the sender, completely delete the message and any attachments, and destroy all copies. Your cooperation will be highly appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 11633 bytes Desc: image001.jpg URL: From thomas.freeman at wdc.com Fri Jun 16 12:47:26 2017 From: thomas.freeman at wdc.com (Tom Freeman) Date: Fri, 16 Jun 2017 19:47:26 +0000 Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver In-Reply-To: <49158E750348AA499168FD41D889836082D1BEA2@fmsmsx117.amr.corp.intel.com> References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com> <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com> , <49158E750348AA499168FD41D889836082D1BEA2@fmsmsx117.amr.corp.intel.com> Message-ID: Ray, I just discovered this email - it was routed to my junk folder. I'll review early next week Tom Freeman Software Engineer, Device Manager and Driver Development Western Digital Email: Thomas.Freeman at wdc.com Office:+1-507-322-2311 -------- Original Message -------- Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver From: "Robles, Raymond C" Date: Jun 13, 2017, 3:31 PM To: Justina Lai ,nvmewin at lists.openfabrics.org Hi Justina, I’m still waiting for feedback from Samsung and HGST on your patch. Tom/Judy/Suman, Can you please provide feedback on Phison’s patch for the BSOD fix for grabbing the same lock twice? Thanks… Ray From: Justina Lai [mailto:justina_lai at phison.com] Sent: Thursday, May 25, 2017 7:08 PM To: Robles, Raymond C ; nvmewin at lists.openfabrics.org Cc: Larry Li ; 'umaparepalli at gmail.com' Subject: RE: Compatibility issue with 1.5 version nvme driver Hi Raymond, Thanks for reply. 
Please find attachment. It’s modified based on 4/8 157 version.

Thanks!

Best Regards,
Justina Lai #5707

From raymond.c.robles at intel.com Fri Jun 16 14:51:06 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Fri, 16 Jun 2017 21:51:06 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To:
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com>
 <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com>
 , <49158E750348AA499168FD41D889836082D1BEA2@fmsmsx117.amr.corp.intel.com>
Message-ID: <49158E750348AA499168FD41D889836082D1E513@fmsmsx117.amr.corp.intel.com>

Hi Tom,

No worries... thank you very much.

Thanks...
Ray

From: Tom Freeman [mailto:thomas.freeman at wdc.com]
Sent: Friday, June 16, 2017 12:47 PM
To: Robles, Raymond C; Justina Lai; nvmewin at lists.openfabrics.org
Cc: umaparepalli at gmail.com; Larry Li
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Ray,

I just discovered this email - it was routed to my junk folder. I'll review early next week

Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer:

This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited.
If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.

From thomas.freeman at wdc.com Mon Jun 19 08:17:21 2017
From: thomas.freeman at wdc.com (Tom Freeman)
Date: Mon, 19 Jun 2017 15:17:21 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D1BEA2@fmsmsx117.amr.corp.intel.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com>
 <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com>
 <49158E750348AA499168FD41D889836082D1BEA2@fmsmsx117.amr.corp.intel.com>
Message-ID:

Ray,

Western Digital approves the fix.

FYI, it appears this also addresses a similar situation during format. I see the same issue in the following code path:

IoCompletionRoutine -> NVMeIoctlFormatNVMCallback -> FormatNVMGetIdentify -> ProcessIo

Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From raymond.c.robles at intel.com Mon Jun 19 13:36:39 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Mon, 19 Jun 2017 20:36:39 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
Message-ID: <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com>

Hi Justina,

Yes, we did receive the email and have been emailing you. I notice that your emails keep getting bounced off the reflector list and I have to approve them. Could you please officially subscribe to the reflector email list so that your emails are not bounced (which requires me to approve and forward them)?

There was an additional request for you to run the normal unit tests required for all OFA patches.
I'll forward that email again after you officially subscribe to the email list. Did you receive that email?

Thanks...
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Justina Lai
Sent: Wednesday, May 24, 2017 10:18 PM
To: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi,

Have you received my mail on 5/22?

Thanks!

Best Regards,
Justina Lai #5707

From: Justina Lai
Sent: Monday, May 22, 2017 1:14 PM
To: 'nvmewin at lists.openfabrics.org'
Cc: Larry Li
Subject: Compatibility issue with 1.5 version nvme driver

Dear Sir/Madam,

We are facing a compatibility issue on our PS5007 NVMe device with the recent 1.5 version of the nvme driver. Please see the issue description below.

The Phison PS5007 supports 7 IO queues. If we use the PS5007 with the OFA driver on a platform with more than 7 cores (for example, an 8-core or 12-core PC), the PC hangs and cannot enter the OS, as shown below:

[cid:image001.jpg at 01D2E900.A1138AD0]

After debugging on our side, we found that the failure is caused by the flow below:

------------------------------------------------------------
..........
1. SntiTranslateModeSense()
..........
2. IoCompletionRoutine()
    if (pDpc != NULL) {
        ASSERT(pAE->ntldrDump == FALSE);
        if (pAE->MultipleCoresToSingleQueueFlag) {
            StorPortAcquireSpinLock(pAE, StartIoLock, NULL, &StartLockHandle);   ----------------------> execute lock
        } else {
            StorPortAcquireSpinLock(pAE, DpcLock, pDpc, &DpcLockhandle);
        }
    }
    .........
    callStorportNotification =
        pSrbExtension->pNvmeCompletionRoutine(pAE, (PVOID)pSrbExtension)
        && (pSrbExtension->pSrb != NULL);
    .......
3. SntiTranslateModeSenseResponse()
    .......
    case MODE_SENSE_RETURN_ALL:
    .......
    if (supportsVwc == TRUE) {
        pSrbExt->pNvmeCompletionRoutine = SntiCompletionCallbackRoutine;
        /* Finally, make sure we issue the GET FEATURES command */
        SntiBuildGetFeaturesCmd(pSrbExt, VOLATILE_WRITE_CACHE);
        ioStarted = ProcessIo(pSrbExt->pNvmeDevExt, pSrbExt, NVME_QUEUE_TYPE_ADMIN, TRUE);
4. ProcessIo()
    ........
    if (AcquireLock == TRUE) {
        StorPortAcquireSpinLock(pAdapterExtension, StartIoLock, NULL, &hStartIoLock);   ------------------------> double execute lock and cause (Assertion failure - code c0000420)
    }
------------------------------------------------------------

Since many users are now using our PS5007 device with 8-core or 12-core PCs, they are facing this issue right now. Could you please help to modify the driver to solve this problem asap? If any point is unclear, please let us know.

Thank you very much for the help!

Best Regards,
Justina Lai #5707
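Editorial sketch of the double acquisition reported in this thread: a completion path takes StartIoLock, the completion callback then issues a second child command through ProcessIo() with AcquireLock set to TRUE, and the same non-recursive lock is requested again. The code below is a small user-space model of that pattern, not driver code. Every name in it (model_lock_t, model_acquire, model_release, model_process_io, model_completion) is hypothetical, and skipping the inner acquire when the caller already holds the lock is only one plausible fix; the thread does not show what the Phison patch actually changes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spin lock such as StartIoLock. */
typedef struct {
    bool held;
    int  double_acquires;   /* attempts to take the lock while already held */
} model_lock_t;

/* Returns true if the lock was taken, false on a re-acquire attempt.
 * A real spin lock would hang or assert here instead of returning. */
static bool model_acquire(model_lock_t *lock)
{
    if (lock->held) {
        lock->double_acquires++;
        return false;
    }
    lock->held = true;
    return true;
}

static void model_release(model_lock_t *lock)
{
    lock->held = false;
}

/* Models ProcessIo(): optionally takes the lock around command submission. */
static bool model_process_io(model_lock_t *lock, bool acquire_lock)
{
    if (acquire_lock && !model_acquire(lock)) {
        return false;           /* caller already holds the lock */
    }
    /* ... the child command would be submitted here ... */
    if (acquire_lock) {
        model_release(lock);
    }
    return true;
}

/* Models the completion path: the lock is held while the completion
 * callback issues the second child command. */
static bool model_completion(model_lock_t *lock, bool child_acquires_lock)
{
    bool ok;

    model_acquire(lock);
    ok = model_process_io(lock, child_acquires_lock);
    model_release(lock);
    return ok;
}
```

In this model, model_completion(&lock, true) reproduces the reported situation (the inner acquire is refused and counted), while model_completion(&lock, false) completes normally because the inner call relies on the lock the completion path already holds.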
From raymond.c.robles at intel.com Mon Jun 19 13:39:07 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Mon, 19 Jun 2017 20:39:07 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D1CB9F@fmsmsx117.amr.corp.intel.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com>
 <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com>
 <49158E750348AA499168FD41D889836082D1C2E1@fmsmsx117.amr.corp.intel.com>
 <0ec8b43775a14322b6fabb25c6a7c725@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D1CB9F@fmsmsx117.amr.corp.intel.com>
Message-ID: <49158E750348AA499168FD41D889836082D1F5BB@fmsmsx117.amr.corp.intel.com>

Resending request for additional unit testing for Phison patch. Justina, please confirm that you have received this request.

Thanks...
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Wednesday, June 14, 2017 2:40 PM
To: Justina Lai; nvmewin at lists.openfabrics.org
Cc: 'umaparepalli at gmail.com'; Larry Li
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Justina,

Thanks for providing the information below. As a friendly reminder, the OFA NVMe Win patch process is documented in our archives. Here is a copy of the patch submission process. Please let me know when you have completed full unit testing as outlined. Intel is currently running the unit tests outlined below on your patch.

Note all tools needed are in the Process:
* Submitter needs to base their changes on the latest (and re-base/re-test prior to sending their patch).
* They send the patch to the email list "nvmewin at lists.openfabrics.org".
* Some review will happen over the reflector. The maintainer will send a message out that the db is locked when they're ready to apply the patch, which will be once at least one member from each mandatory reviewing company on the review panel has approved (can be via email or con call if needed). Once the patch is applied, the maintainer will send an email out.

Patch Contents:
* Code changes, short summary for SVN log, more verbose write up for release notes, confirmation of what Windows platforms had been tested.
* Patch must follow the coding guidelines as attached.
* All source files should be zipped up into a .zip file with password enabled. The zip file name should properly describe the main changes of the patch.

Reviews:
* Patches submitted by anyone, email to distribution list "nvmewin at lists.openfabrics.org".
* Patch submission should include time sensitivity/expectations.
* Patch submission should include justification for the patch (what value will it add, and if there are tradeoffs, what are they and why would we want to take a hit). If multiple implementation options were considered, what data/reasoning was behind the implementation choice.
* Patch submission should include files modified and explanation of code changes in each file.
* At a minimum reviews need to be completed by Intel, HGST, and Samsung representatives.
* Reviews include compliance with coding guidelines (in SVN) as well as logic.

Unit Testing (all patches and release candidates require, at a minimum, the following testing):
* 1 hour of data integrity testing using sdstress (Microsoft Tool)
* 1 hour of heavy stress testing using IOMETER covering, at least, 512B, 4KB and 128KB ranging from 1 OIO to 64 OIO both sequential and random
* Quick and slow format of both MBR and GPT partitioning
* Microsoft SCSI Compliance, no failures (warnings OK)
* Additional testing with other tools is encouraged
* Occurs in all supported OSs for the release
  * 64-bit, Windows 7, 8.0, 8.1, server 2008R2 and 2012
  * 32-bit, Windows 7, 8.0
* Minimum test platform is latest QEMU. Those with their HW should test on it as well.
  * QEMU is available at https://github.com/nvmeqemu/nvmeqemu

Thanks...
Ray

From: Justina Lai [mailto:justina_lai at phison.com]
Sent: Wednesday, June 14, 2017 1:20 AM
To: Robles, Raymond C; nvmewin at lists.openfabrics.org
Cc: Larry Li; 'umaparepalli at gmail.com'
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Raymond,

Yes, we have tested in our environment.

* Device: Phison PS5007
* Platform:

  MB                                    CPU                 CORE
  GIGABYTE X99                          I7 5820K 3.3GHz     12
  Customer's notebook (not launch yet)  I7 7700HQ 2.8GHz    8

* Test Method:
  With driver 1.5 version: after installing the driver and a cold boot, the PC hangs and cannot enter the OS, like below. Same situation for win7/win8.1/win10.

  [cid:image001.jpg at 01D2E901.6B1A23C0]

  With our modified version: after installing the driver and a cold boot, the PC can enter the OS successfully.

Thanks!

Best Regards,
Justina Lai #5707

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Wednesday, June 14, 2017 7:57 AM
To: Justina Lai; nvmewin at lists.openfabrics.org
Cc: Larry Li; 'umaparepalli at gmail.com'
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Justina,

I've reached out to the reviewing companies for feedback. One question for you, have you unit tested your fix?
If so, can you provide your configs, platforms, and method for unit testing.

Thanks...
Ray
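The failure flow reported in this thread (the completion path takes StartIoLock, then the completion callback calls ProcessIo with AcquireLock == TRUE and takes it again) can be reproduced in a small user-space model. This is only a sketch under stated assumptions, not driver code: `model_lock`, `model_adapter`, `model_process_io` and `model_completion` are hypothetical names, the lock is a plain flag rather than a Storport spin lock, and a failed re-acquire returns `false` where the real driver would hang or assert.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spin lock. */
typedef struct {
    bool held;
} model_lock;

/* Hypothetical stand-in for the adapter extension (pAE in the thread). */
typedef struct {
    model_lock start_io_lock;
    model_lock dpc_lock;
    bool multiple_cores_to_single_queue; /* models MultipleCoresToSingleQueueFlag */
} model_adapter;

/* Returns false if the lock is already held; a real spin lock would spin forever. */
bool model_acquire(model_lock *lock)
{
    if (lock->held) {
        return false;
    }
    lock->held = true;
    return true;
}

void model_release(model_lock *lock)
{
    assert(lock->held); /* releasing a lock that is not held is a model bug */
    lock->held = false;
}

/* Models ProcessIo(..., AcquireLock): optionally takes StartIoLock around submission. */
bool model_process_io(model_adapter *ae, bool acquire_lock)
{
    if (acquire_lock && !model_acquire(&ae->start_io_lock)) {
        return false; /* second acquire in the same call stack */
    }
    /* The command would be placed on the submission queue here. */
    if (acquire_lock) {
        model_release(&ae->start_io_lock);
    }
    return true;
}

/* Models the completion path when the completion callback issues a child command. */
bool model_completion(model_adapter *ae)
{
    model_lock *lock = ae->multiple_cores_to_single_queue
                           ? &ae->start_io_lock
                           : &ae->dpc_lock;
    bool child_started;

    model_acquire(lock);
    child_started = model_process_io(ae, true); /* AcquireLock == TRUE */
    model_release(lock);
    return child_started;
}
```

With `multiple_cores_to_single_queue` set (the more-cores-than-queues case), `model_completion` returns `false` because the same lock is requested twice in one call stack. With the flag clear, the completion path holds the DPC lock instead and the child command starts.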
From judy.brock at samsung.com Mon Jun 19 23:07:51 2017
From: judy.brock at samsung.com (Judy Brock)
Date: Tue, 20 Jun 2017 06:07:51 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To:
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <7125daf8d16c4290bbf16208e2dbe16e@ExMBX2.phison.com>
 <49158E750348AA499168FD41D88983607C75ACC4@fmsmsx117.amr.corp.intel.com>
 <49158E750348AA499168FD41D889836082D1BEA2@fmsmsx117.amr.corp.intel.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A51879B5677@SSIEXCH-MB3.ssi.samsung.com>

Hi Ray,

We are currently reviewing.

Thanks,
Judy

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Tom Freeman
Sent: Monday, June 19, 2017 8:17 AM
To: Robles, Raymond C; Justina Lai; nvmewin at lists.openfabrics.org
Cc: 'umaparepalli at gmail.com'; Larry Li
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Ray,

Western Digital approves the fix. FYI, it appears this also addresses a similar situation during format. I see the same issue in the following code path:

IoCompletionRoutine -> NVMeIoctlFormatNVMCallback -> FormatNVMGetIdentify -> ProcessIo

Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Tuesday, June 13, 2017 3:31 PM
To: Justina Lai; nvmewin at lists.openfabrics.org
Cc: 'umaparepalli at gmail.com'; Larry Li
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Justina,

I'm still waiting for feedback from Samsung and HGST on your patch.

Tom/Judy/Suman,

Can you please provide feedback on Phison's patch for the BSOD fix for grabbing the same lock twice?

Thanks...
Ray

From justina_lai at phison.com Mon Jun 19 23:45:10 2017
From: justina_lai at phison.com (Justina Lai)
Date: Tue, 20 Jun 2017 06:45:10 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com>
Message-ID: <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com>

Dear Raymond,

Yes, I have received your mail regarding the unit test. We are now carrying out the test and hope to finish it within this week.

May I know how to officially subscribe to the reflector email list? I saw my mail is already in the Non-digested Members of nvmewin. Should I switch to digest mode?

Thank you!
Best Regards,
Justina Lai
#5707

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Tuesday, June 20, 2017 5:37 AM
To: Justina Lai
Cc: nvmewin at lists.openfabrics.org
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Justina,

Yes, we did receive the email and have been emailing you. I notice that your emails keep getting bounced off the reflector list and I have to approve them. Could you please officially subscribe to the reflector email list so that your emails are not bounced (which requires me to approve and forward them)?

There was an additional request for you to run the normal unit tests required for all OFA patches. I'll forward that email again after you officially subscribe to the email list. Did you receive that email?

Thanks...
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Justina Lai
Sent: Wednesday, May 24, 2017 10:18 PM
To: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi,

Have you received my mail on 5/22? Thanks!

Best Regards,
Justina Lai
#5707

From raymond.c.robles at intel.com Thu Jun 22 15:55:28 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Thu, 22 Jun 2017 22:55:28 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com>
 <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com>
Message-ID: <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com>

Hi Justina,

Please go to the Open Fabrics Alliance website and subscribe to our reflector on this page: http://lists.openfabrics.org/mailman/listinfo/nvmewin

The main NVMe WG page can be found here: https://www.openfabrics.org/index.php/working-groups.html

Thanks...
Ray

From raymond.c.robles at intel.com Thu Jun 22 16:20:23 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Thu, 22 Jun 2017 23:20:23 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com>
 <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com>
Message-ID: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com>

Hi Justina,

Intel has reviewed and tested your patch. Please see the following comments:

* Need to remove code blocks in #if 0
* When multiple CPU cores are mapped to a single queue, the patch removes startIoLock from processIO, so startIoLock is used only in IOCompletionDpcRoutine:
* This will block NvmeStartIo when the driver is processing a completion entry.
* A couple of options:
  o Indicate in nvmeSnti.c when handling operations that require more than one command to the drive, and have those locking needs passed to ProcessIo appropriately. It does not seem plausible to isolate the changes to nvmeIO.c only.
  o You could also instead remove startIoLock from IoCompletionDpcRoutine while keeping startIoLock at processIo.
    This way we can avoid the double lock and also make NvmeStartIo and IoCompletionDpcRoutine run in parallel.

The key takeaway is that StartIo and IoCompletionDpcRoutine must be able to run concurrently with no deadlocks. As you've coded the patch, a deadlock is possible. Please revise your patch to account for this scenario.

Thanks...
Ray

From sm.kumar at samsung.com Fri Jun 23 00:31:28 2017
From: sm.kumar at samsung.com (MEENAKSHIKUMAR SOMASUNDARAM)
Date: Fri, 23 Jun 2017 07:31:28 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com>
References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com>
 <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com>
 <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com>
Message-ID: <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6>

An HTML attachment was scrubbed...

From raymond.c.robles at intel.com Fri Jun 23 12:17:22 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Fri, 23 Jun 2017 19:17:22 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6>
References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com>
 <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com>
 <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com>
 <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com>
 <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com>
 <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6>
Message-ID: <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com>

Hi Meenakshikumar,

Thanks for reviewing this patch and my comments. I agree with your statement below (highlighted). I was attempting to drive the possible solution where cores == queues, when admin commands can come from both Storport and the completion path. As you state, there are still several areas where the lock is acquired for commands in the completion path. To address this, I believe the correct solution is to not grab the StartIO lock in the completion path, but a DPC lock instead. Both run at the same IRQL, and will be handled appropriately by the OS with no deadlock.

Essentially:

· The OFA driver is acquiring the StartIO Lock from processIo & IOCompletionDpcRoutine. So there is a possibility of deadlock when an OS command requires more than one NVMe command.
· I'm proposing we acquire the StartIO Lock from processIo and a DPC Lock from IOCompletionDpcRoutine. So there is no possibility of the same lock being acquired twice in a single call stack.
· I would also suggest removing any lingering areas where we issue a new DPC from IOCompletionDpcRoutine to call processIo, so that the DPC waiting time for the lock is reduced.

Thanks…
Ray

From: MEENAKSHIKUMAR SOMASUNDARAM [mailto:sm.kumar at samsung.com]
Sent: Friday, June 23, 2017 12:31 AM
To: Robles, Raymond C; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Ray,

This is Meenakshikumar from Samsung. I have reviewed the fix and it looks fine for handling the pAE->MultipleCoresToSingleQueueFlag = TRUE case when a new admin command is issued in the completion path. For your comment, I am sharing my thoughts below:

• When multiple cpu cores are mapped with single queue, the patch is removing startIoLock from processIO and startIoLock used only in IOCompletionDpcRoutine:
• This will block NvmeStartIo when driver is processing a completion entry.

==> In the OFA driver, Storport acquires the StartIoLock before calling NVMeStartIo. We acquire the StartIo lock in the completion path only if cores != queues, to synchronize multiple cores trying to access a single queue. So in the OFA driver, StartIO and DPCs run concurrently only if cores == queues.

Apart from this, in the pAE->MultipleCoresToSingleQueueFlag = FALSE case, the intention to synchronize issuing an admin command from StartIO and the various completion paths is not handled in a similar fashion in all cases. For example, in SntiTranslateModeSenseResponse(), FormatNVMGetIdentify(), etc., the lock is acquired. But in SntiTranslateTemperatureResponse(), SntiTranslateStartStopUnitResponse(), SntiTranslateWriteBufferResponse() etc., the StartIO lock is not acquired. There could be a race condition between an IOCTL and the completion path trying to issue an admin command. This should be addressed in the OFA driver, possibly in a different patch.
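The locking change proposed in this exchange (StartIO lock taken only in processIo, a DPC lock taken in the completion DPC) can be checked with a small model. This is a hedged sketch in plain C, not driver code: `call_stack`, `take`, `drop`, `process_io` and `completion_issues_child` are hypothetical names, and the model only tracks which locks a single call stack holds. It does not model IRQL, multiple CPUs, or contention between StartIo and DPCs, so it says nothing about the race conditions raised above.

```c
#include <stdbool.h>

/* One bit per lock class; the names are illustrative only. */
enum lock_id {
    LOCK_START_IO = 1,
    LOCK_DPC = 2
};

/* Locks held by one call stack, plus a flag recording any re-acquire. */
typedef struct {
    unsigned held;
    bool double_acquire;
} call_stack;

void take(call_stack *cs, enum lock_id id)
{
    if (cs->held & (unsigned)id) {
        cs->double_acquire = true; /* same lock requested twice in this stack */
    }
    cs->held |= (unsigned)id;
}

void drop(call_stack *cs, enum lock_id id)
{
    cs->held &= ~(unsigned)id;
}

/* Models processIo taking the StartIO lock around command submission. */
void process_io(call_stack *cs)
{
    take(cs, LOCK_START_IO);
    /* The command would be submitted here. */
    drop(cs, LOCK_START_IO);
}

/*
 * Models a completion DPC whose callback issues a child command.
 * completion_lock is the lock the completion path holds while it runs.
 * Returns true when no lock is acquired twice in the call stack.
 */
bool completion_issues_child(enum lock_id completion_lock)
{
    call_stack cs = { 0u, false };

    take(&cs, completion_lock);
    process_io(&cs);
    drop(&cs, completion_lock);
    return !cs.double_acquire;
}
```

Calling `completion_issues_child(LOCK_START_IO)` models the current behaviour when several cores share a queue and reports a double acquire. Calling it with `LOCK_DPC` models the proposal and does not.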
Thanks,
Meenakshikumar

From: Justina Lai
Sent: Monday, May 22, 2017 1:14 PM
To: 'nvmewin at lists.openfabrics.org'
Cc: Larry Li
Subject: Compatibility issue with 1.5 version nvme driver

Dear Sir/Madam,

We are facing a compatibility issue on our PS5007 NVMe device with the recent 1.5 version of the nvme driver. Please see the issue description below.

The Phison PS5007 supports 7 IO queues. If we use the PS5007 with the OFA driver on a platform with more than 7 cores (for example an 8-core or 12-core PC), the PC hangs and cannot enter the OS, as shown below:

[cid:image001.jpg at 01D2EC19.225614D0]

After debugging on our side, we found the failure is caused by the flow below:

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
……….
1. SntiTranslateModeSense()
……….
2. IoCompletionRoutine()
    if (pDpc != NULL) {
        ASSERT(pAE->ntldrDump == FALSE);
        if (pAE->MultipleCoresToSingleQueueFlag) {
            StorPortAcquireSpinLock(pAE, StartIoLock, NULL, &StartLockHandle);   ----------------------> execute lock
        } else {
            StorPortAcquireSpinLock(pAE, DpcLock, pDpc, &DpcLockhandle);
        }
    }
    ………
    callStorportNotification = pSrbExtension->pNvmeCompletionRoutine(pAE, (PVOID)pSrbExtension) && (pSrbExtension->pSrb != NULL);
    …….
3. SntiTranslateModeSenseResponse()
    …….
    case MODE_SENSE_RETURN_ALL:
    …….
    if (supportsVwc == TRUE) {
        pSrbExt->pNvmeCompletionRoutine = SntiCompletionCallbackRoutine;
        /* Finally, make sure we issue the GET FEATURES command */
        SntiBuildGetFeaturesCmd(pSrbExt, VOLATILE_WRITE_CACHE);
        ioStarted = ProcessIo(pSrbExt->pNvmeDevExt, pSrbExt, NVME_QUEUE_TYPE_ADMIN, TRUE);
4. ProcessIo()
    ……..
if (AcquireLock == TRUE) { StorPortAcquireSpinLock(pAdapterExtension, StartIoLock, NULL, &hStartIoLock); ------------------------> double execute lock and cause (Assertion failure - code c0000420) } --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Since many users are now using our PS5007 device with 8-core or 12-core PC, they are facing this issue right now. Could you please help to modify driver to solve this problem asap? Any unclear point, please kindly let us know. Thank you very much for the help! Best Regards, Justina Lai #5707 This message and any attachments are confidential and may be legally privileged. Any unauthorized review, use or distribution by anyone other than the intended recipient is strictly prohibited. If you are not the intended recipient, please immediately notify the sender, completely delete the message and any attachments, and destroy all copies. Your cooperation will be highly appreciated. This message and any attachments are confidential and may be legally privileged. Any unauthorized review, use or distribution by anyone other than the intended recipient is strictly prohibited. If you are not the intended recipient, please immediately notify the sender, completely delete the message and any attachments, and destroy all copies. Your cooperation will be highly appreciated. _______________________________________________ nvmewin mailing list nvmewin at lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/nvmewin [cid:image002.png at 01D2EC19.225614D0] [http://ext.samsung.net/mail/ext/v1/external/status/update?userid=sm.kumar&do=bWFpbElEPTIwMTcwNjIzMDczMTI4ZXBjbXM1cDY4MTZmNDk4MDA2ODllYzdlY2VlMGI0YjdlNzdmN2I3MSZyZWNpcGllbnRBZGRyZXNzPXJheW1vbmQuYy5yb2JsZXNAaW50ZWwuY29t] -------------- next part -------------- An HTML attachment was scrubbed... 

From thomas.freeman at wdc.com Mon Jun 26 13:31:26 2017
From: thomas.freeman at wdc.com (Tom Freeman)
Date: Mon, 26 Jun 2017 20:31:26 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com>
References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com> <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6> <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com>
Message-ID:

Hi Ray,

I am missing something in the deadlock discussion. Since IoCompletionRoutine and StartIo run at the same IRQL, wouldn’t one run to completion before the other is scheduled, thus preventing the deadlock?

Thanks,
Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Friday, June 23, 2017 2:17 PM
To: sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Meenakshikumar,

Thanks for reviewing this patch and my comments. I agree with your statement below (highlighted). I was attempting to drive the possible solution where cores == queues… when admin commands can come from both Storport and the completion path. As you state, there are still several areas where the lock is acquired for commands in the completion path. In order to address this, I believe the correct solution is to not grab the StartIO lock in the completion path, but instead a DPC lock. Both run at the same IRQL, and will be handled appropriately by the OS with no deadlock. Essentially:

· The OFA driver is acquiring the StartIO Lock from processIo & IOCompletionDpcRoutine. So there is a possibility of deadlock when an OS command requires more than one NVMe command.
· I’m proposing we acquire the StartIO Lock from processIo and a DPC Lock from IOCompletionDpcRoutine. So there is no possibility of the same lock being acquired twice in a single call stack.
· I would also suggest removing any lingering areas where we issue a new DPC from IOCompletionDpcRoutine to call processIo… so that DPC waiting time for the lock is reduced.

Thanks…
Ray
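
The locking split proposed in this thread (StartIo lock taken in processIo, DPC lock taken in IOCompletionDpcRoutine) can be illustrated with a small user-space model in C. This is only a sketch under stated assumptions, not driver code: the lock is modelled as a plain flag so that a second acquisition by the same call stack is reported instead of hanging, and every name here (model_lock_t, model_adapter_t, model_process_io, model_completion) is a hypothetical stand-in. The real driver uses StorPortAcquireSpinLock with StartIoLock or DpcLock, as in the flow quoted in the original report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spin lock. 'held' only records
 * whether the current (single) call stack already owns the lock. */
typedef struct {
    bool held;
} model_lock_t;

/* Returns false where a real non-recursive lock would be requested a second
 * time by the call stack that already owns it. */
static bool model_acquire(model_lock_t *lock)
{
    if (lock->held) {
        return false;
    }
    lock->held = true;
    return true;
}

static void model_release(model_lock_t *lock)
{
    assert(lock->held);   /* releasing a lock that is not held is a bug */
    lock->held = false;
}

/* Hypothetical adapter state: one StartIo lock and one DPC lock. */
typedef struct {
    model_lock_t start_io_lock;
    model_lock_t dpc_lock;
} model_adapter_t;

/* Models ProcessIo() issuing a child command under the StartIo lock. */
static bool model_process_io(model_adapter_t *ae)
{
    if (!model_acquire(&ae->start_io_lock)) {
        return false;   /* same lock requested twice in one call stack */
    }
    /* ... the command would be placed on the submission queue here ... */
    model_release(&ae->start_io_lock);
    return true;
}

/* Models the completion DPC. With use_dpc_lock == false it takes the StartIo
 * lock (the reported flow); with true it takes the DPC lock (the proposal).
 * The completion callback then issues a second command via ProcessIo. */
static bool model_completion(model_adapter_t *ae, bool use_dpc_lock)
{
    model_lock_t *lock = use_dpc_lock ? &ae->dpc_lock : &ae->start_io_lock;
    bool issued;

    if (!model_acquire(lock)) {
        return false;
    }
    issued = model_process_io(ae);
    model_release(lock);
    return issued;
}
```

In this model, model_completion(&ae, false) fails because the completion path and ProcessIo ask for the same lock, while model_completion(&ae, true) succeeds because the two paths hold different locks. The model is single-threaded and says nothing about contention between cores or about the IOCTL race raised earlier in the thread.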

From raymond.c.robles at intel.com Mon Jun 26 17:45:48 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Tue, 27 Jun 2017 00:45:48 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To:
References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com> <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6> <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com>
Message-ID: <49158E750348AA499168FD41D889836082D236CD@fmsmsx117.amr.corp.intel.com>

Hi Tom,

Your statement is correct. I’m probably not being clear… sorry for the confusion. My assertion is that when cores == queues, the driver should not be grabbing the StartIo lock as a practice. The StartIo lock is meant for Storport to acquire for the call to the HwStartIo callback. If we are in the completion DPC, then that means we have already acquired the DPC lock, and we should not be acquiring any other lock as part of that DPC calling back into ProcessIo for a second command. That is the deadlock to which I’m referring (and the issue that Meenakshikumar from Samsung references). In my opinion, this issue should be resolved as part of this patch. Thoughts?

Thanks…
Ray
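
Ray's point that a completion DPC, which already holds the DPC lock, should not take any further lock when it calls back into ProcessIo for a second command can be sketched in plain C. This is a hedged illustration, not the driver's implementation: the AcquireLock flag comes from the ProcessIo excerpt quoted in the original report, while sketch_adapter_t, sketch_process_io and sketch_completion_dpc are hypothetical names, and the locks are modelled as flags on a single call stack.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state: which locks the current call stack holds,
 * plus a counter standing in for commands placed on the queue. */
typedef struct {
    bool dpc_lock_held;
    bool start_io_lock_held;
    int  commands_submitted;
} sketch_adapter_t;

/* Models ProcessIo(..., AcquireLock): optionally takes the StartIo lock
 * around submission. Returns false where a real non-recursive spin lock
 * would be requested again by the call stack that already owns it. */
static bool sketch_process_io(sketch_adapter_t *ae, bool acquire_lock)
{
    if (acquire_lock) {
        if (ae->start_io_lock_held) {
            return false;
        }
        ae->start_io_lock_held = true;
    }

    ae->commands_submitted++;   /* stand-in for queueing the command */

    if (acquire_lock) {
        ae->start_io_lock_held = false;
    }
    return true;
}

/* Models a completion callback that needs a second (child) command.
 * The DPC holds its own lock and asks ProcessIo for no additional lock. */
static bool sketch_completion_dpc(sketch_adapter_t *ae)
{
    bool issued;

    assert(!ae->dpc_lock_held);
    ae->dpc_lock_held = true;

    issued = sketch_process_io(ae, false);

    /* Only the DPC lock is held on this call stack at this point. */
    assert(ae->dpc_lock_held && !ae->start_io_lock_held);

    ae->dpc_lock_held = false;
    return issued;
}
```

The sketch only demonstrates the call-stack property under discussion: no second lock is requested while the DPC lock is held. Whether submitting without the StartIo lock is otherwise safe, for example against an IOCTL issuing an admin command at the same time, is a separate question that this model does not answer.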
_______________________________________________ nvmewin mailing list nvmewin at lists.openfabrics.org http://lists.openfabrics.org/mailman/listinfo/nvmewin [cid:image002.png at 01D2EE8D.0DB3EAE0] [http://ext.samsung.net/mail/ext/v1/external/status/update?userid=sm.kumar&do=bWFpbElEPTIwMTcwNjIzMDczMTI4ZXBjbXM1cDY4MTZmNDk4MDA2ODllYzdlY2VlMGI0YjdlNzdmN2I3MSZyZWNpcGllbnRBZGRyZXNzPXJheW1vbmQuYy5yb2JsZXNAaW50ZWwuY29t] Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer: This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 11633 bytes Desc: image001.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... 
From justina_lai at phison.com Tue Jun 27 03:10:35 2017
From: justina_lai at phison.com (Justina Lai)
Date: Tue, 27 Jun 2017 10:10:35 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com>
Message-ID: <4416b1013881453fa83011ff80dd47d5@ExMBX2.phison.com>

Dear Raymond,

Sorry for the late reply.

Option-1
Please refer to the attached nvmeSnti.c.
Modification: SntiTranslateModeSenseResponse() decides whether the SpinLock in ProcessIo() is executed.

Option-2
Please refer to the attached nvmeStd.c.
Modification: Directly mark out the SpinLock/ReleaseSpinLock calls in IoCompletionRoutine().

Thanks!

Best Regards,
Justina Lai
#5707

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Friday, June 23, 2017 7:20 AM
To: Justina Lai
Cc: nvmewin at lists.openfabrics.org; Larry Li
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Justina,

Intel has reviewed and tested your patch. Please see the following comments:

* Need to remove code blocks in #if 0
* When multiple cpu cores are mapped to a single queue, the patch removes startIoLock from processIO, and startIoLock is used only in IOCompletionDpcRoutine:
  * This will block NvmeStartIo when the driver is processing a completion entry.
* A couple of options:
  * Indicate in nvmeSnti.c when handling operations that require more than one command to the drive, and have those locking needs passed to ProcessIo appropriately. It does not seem plausible to isolate the changes to nvmeIO.c only.
  * You could also instead remove startIoLock from IoCompletionDpcRoutine while keeping startIoLock at processIo. This way we can avoid the double lock and also make NvmeStartIo and IoCompletionDpcRoutine run in parallel.

The key takeaway is that StartIo and IoCompletionDpcRoutine must be able to run concurrently with no deadlocks. As you've coded the patch, a deadlock is possible. Please revise your patch to account for this scenario.

Thanks...
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Thursday, June 22, 2017 3:55 PM
To: Justina Lai
Cc: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Justina,

Please go to the Open Fabrics Alliance website and subscribe to our reflector on this page:
http://lists.openfabrics.org/mailman/listinfo/nvmewin

The main NVMe WG page can be found here:
https://www.openfabrics.org/index.php/working-groups.html

Thanks...
Ray

From: Justina Lai [mailto:justina_lai at phison.com]
Sent: Monday, June 19, 2017 11:45 PM
To: Robles, Raymond C
Cc: nvmewin at lists.openfabrics.org
Subject: RE: Compatibility issue with 1.5 version nvme driver

Dear Raymond,

Yes, I have received your mail regarding the unit test. We are now carrying out the test and hope to finish it within this week.

May I know how to officially subscribe to the reflector email list? I saw my mail is already in the Non-digested Members of nvmewin. Should I switch to digest mode?

Thank you!

Best Regards,
Justina Lai
#5707

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Tuesday, June 20, 2017 5:37 AM
To: Justina Lai
Cc: nvmewin at lists.openfabrics.org
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Justina,

Yes, we did receive the email and have been emailing you. I notice that your emails keep getting bounced off the reflector list and I have to approve them. Could you please officially subscribe to the reflector email list so that your emails are not bounced (which requires me to approve and forward them)?

There was an additional request for you to run the normal unit tests required for all OFA patches. I'll forward that email again after you officially subscribe to the email list. Did you receive that email?

Thanks...
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Justina Lai
Sent: Wednesday, May 24, 2017 10:18 PM
To: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi,

Have you received my mail on 5/22?

Thanks!

Best Regards,
Justina Lai
#5707

From: Justina Lai
Sent: Monday, May 22, 2017 1:14 PM
To: 'nvmewin at lists.openfabrics.org'
Cc: Larry Li
Subject: Compatibility issue with 1.5 version nvme driver

Dear Sir/Madam,

We are facing a compatibility issue on our PS5007 NVMe device with the recent 1.5 version of the nvme driver. Please check the issue description below.

The Phison PS5007 supports 7 IO queues. If we use the PS5007 with the OFA driver on a platform with more than 7 cores, e.g. an 8-core or 12-core PC, the PC hangs and cannot enter the OS, as shown below:

[cid:image001.jpg at 01D2EF70.83F23570]

After debugging on our side, we found the failure is caused by the flow below:

----------------------------------------------------------------------
..........
1. SntiTranslateModeSense()
..........
2. IoCompletionRoutine()
    if (pDpc != NULL) {
        ASSERT(pAE->ntldrDump == FALSE);
        if (pAE->MultipleCoresToSingleQueueFlag) {
            StorPortAcquireSpinLock(pAE, StartIoLock, NULL, &StartLockHandle);   /* <-- lock is taken here */
        } else {
            StorPortAcquireSpinLock(pAE, DpcLock, pDpc, &DpcLockhandle);
        }
    }
    .........
    callStorportNotification = pSrbExtension->pNvmeCompletionRoutine(pAE, (PVOID)pSrbExtension)
                               && (pSrbExtension->pSrb != NULL);
    .......
3. SntiTranslateModeSenseResponse()
    .......
    case MODE_SENSE_RETURN_ALL:
    .......
    if (supportsVwc == TRUE) {
        pSrbExt->pNvmeCompletionRoutine = SntiCompletionCallbackRoutine;
        /* Finally, make sure we issue the GET FEATURES command */
        SntiBuildGetFeaturesCmd(pSrbExt, VOLATILE_WRITE_CACHE);
        ioStarted = ProcessIo(pSrbExt->pNvmeDevExt, pSrbExt, NVME_QUEUE_TYPE_ADMIN, TRUE);
4. ProcessIo()
    ........
    if (AcquireLock == TRUE) {
        StorPortAcquireSpinLock(pAdapterExtension, StartIoLock, NULL, &hStartIoLock);   /* <-- lock is taken a second time and causes (Assertion failure - code c0000420) */
    }
----------------------------------------------------------------------

Since many users are now using our PS5007 device with 8-core or 12-core PCs, they are facing this issue right now. Could you please help to modify the driver to solve this problem asap? If any point is unclear, please kindly let us know.

Thank you very much for the help!

Best Regards,
Justina Lai
#5707

This message and any attachments are confidential and may be legally privileged. Any unauthorized review, use or distribution by anyone other than the intended recipient is strictly prohibited. If you are not the intended recipient, please immediately notify the sender, completely delete the message and any attachments, and destroy all copies. Your cooperation will be highly appreciated.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 11633 bytes
Desc: image001.jpg
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: nvmeSnti.c
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: nvmeStd.c
URL:

From thomas.freeman at wdc.com Tue Jun 27 11:33:05 2017
From: thomas.freeman at wdc.com (Tom Freeman)
Date: Tue, 27 Jun 2017 18:33:05 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D236CD@fmsmsx117.amr.corp.intel.com>
References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com> <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6> <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D236CD@fmsmsx117.amr.corp.intel.com>
Message-ID:

Thanks Ray,

I'm still confused - and I think it relates to a basic question I have about dpc routing. If I understand it correctly, IoCompletionRoutine is queued to run on the core matching its submission/completion queue. But, that instance of IoCompletionRoutine could simultaneously run on other cores. That's why we get the DpcLock, to synchronize multiple threads running the same instance of the dpc, IoCompletionRoutine.
If that's correct, I think the following example shows why we need to get the StartIo lock in the completion path when it is issuing another NVMe command.

The code path StartIo->...->ProcessIo issues i/o to submission queue 3 (on core 2). The StartIo lock is automatically taken.

Simultaneously, IoCompletionRoutine->....->ProcessIo() for comp queue 3 could be executing on a core other than core 2. If so, unless IoCompletionRoutine->....->ProcessIo() gets the StartIo lock, it will not be synchronized with StartIo->...->ProcessIo() on core 2.

Let me know your thoughts,

Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Monday, June 26, 2017 7:46 PM
To: Tom Freeman; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Tom,

Your statement is correct. I’m probably not being clear… sorry for the confusion.

My assertion is that when cpu == cores, the driver should not be grabbing the StartIo lock as a practice. The StartIo lock is meant for Storport to acquire for the call to the HwStartIo callback. If we are in the completion DPC, then that means we have already acquired the DPC lock, and we should not be acquiring any other lock as part of that DPC calling back into ProcessIo for a second command. That is the deadlock to which I’m referring (and the issue that Meenakshikumar from Samsung references).

In my opinion, this issue should be resolved as part of this patch. Thoughts?

Thanks…
Ray

From: Tom Freeman [mailto:thomas.freeman at wdc.com]
Sent: Monday, June 26, 2017 1:31 PM
To: Robles, Raymond C; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Ray,

I'm missing something in the deadlock discussion. Since IoCompletionRoutine and StartIo run at the same IRQL, wouldn’t one run to completion before the other is scheduled, thus preventing the deadlock?

Thanks,

Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Friday, June 23, 2017 2:17 PM
To: sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Meenakshikumar,

Thanks for reviewing this patch and my comments. I agree with your statement below (highlighted). I was attempting to drive the possible solution where cores == queues… when admin commands can come from both Storport and the completion path.

As you state, there are still several areas where the lock is acquired for commands in the completion path. In order to address this, I believe the correct solution is to actually not grab the StartIO lock in the completion path, but instead a DPC lock. Both run at the same IRQL, and will be handled appropriately by the OS with no deadlock.

Essentially:

* The OFA driver is acquiring the StartIO Lock from processIo & IOCompletionDpcRoutine. So there is a possibility of deadlock when we have an OS command that requires more than one NVMe command.
* I’m proposing we acquire the StartIO Lock from processIo and a DPC Lock from IOCompletionDpcRoutine. So there is no possibility of the same lock being acquired twice in a single call stack.
* I would also suggest removing any lingering areas where we issue a new DPC from IOCompletionDpcRoutine to call processIo… so that the DPC waiting time for the lock is reduced.

Thanks…
Ray

From: MEENAKSHIKUMAR SOMASUNDARAM [mailto:sm.kumar at samsung.com]
Sent: Friday, June 23, 2017 12:31 AM
To: Robles, Raymond C; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Ray,

This is Meenakshikumar from Samsung. I have reviewed the fix and it looks fine for handling the pAE->MultipleCoresToSingleQueueFlag = TRUE case while a new admin command is issued in the completion path.

On your comment, I am sharing my thoughts below:

* When multiple cpu cores are mapped with single queue, the patch is removing startIoLock from processIO and startIoLock used only in IOCompletionDpcRoutine:
  * This will block NvmeStartIo when driver is processing a completion entry.

==> In the OFA driver, Storport will acquire the StartIoLock before calling NVMeStartIo. We acquire the StartIo lock in the completion path only if cores != queues, to synchronize multiple cores trying to access a single queue. So in the OFA driver, StartIO and the DPCs will run concurrently only if cores == queues.

Apart from this, in the pAE->MultipleCoresToSingleQueueFlag = FALSE case, the intention to synchronize issuing an admin command from StartIO and the various completion paths is not handled in a similar fashion in all cases. For example, in SntiTranslateModeSenseResponse(), FormatNVMGetIdentify(), etc., the lock is acquired. But in SntiTranslateTemperatureResponse(), SntiTranslateStartStopUnitResponse(), SntiTranslateWriteBufferResponse() etc., the StartIO lock is not acquired. There could be a race condition between an IOCTL and a completion path trying to issue an admin command. This should be addressed in the OFA driver, perhaps in a different patch.

Thanks,
Meenakshikumar

Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer: This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.
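The example in Tom Freeman's message (StartIo on core 2 and the completion path on another core both submitting to queue 3) can be illustrated with a small deterministic model. This is only a sketch under stated assumptions, not OFA driver code: ModelSubQueue and every model_* function are hypothetical names, and the "interleaving" is written out by hand in a single thread rather than produced by real concurrency. It shows that if both paths read the queue tail before either advances it, one command is lost, which is the kind of problem a common lock around the whole submission avoids.

```c
#include <assert.h>

#define QUEUE_DEPTH 8

/* Hypothetical model of one submission queue; not the driver's structure. */
typedef struct {
    unsigned tail;                 /* index of the next free slot */
    unsigned slots[QUEUE_DEPTH];   /* command ids written by submitters */
} ModelSubQueue;

/* A submission is split into the two steps that can interleave across cores. */
static unsigned model_read_tail(const ModelSubQueue *q)
{
    return q->tail;
}

static void model_write_and_advance(ModelSubQueue *q, unsigned seen_tail,
                                    unsigned cmd_id)
{
    assert(seen_tail < QUEUE_DEPTH);
    q->slots[seen_tail] = cmd_id;
    q->tail = seen_tail + 1;
}

/*
 * No common lock: the StartIo path and the completion path both read the
 * tail before either one advances it, so the second write reuses slot 0.
 */
unsigned model_submit_unsynchronized(ModelSubQueue *q)
{
    unsigned seen_by_startio = model_read_tail(q);  /* StartIo path */
    unsigned seen_by_dpc = model_read_tail(q);      /* completion path */

    model_write_and_advance(q, seen_by_startio, 100);
    model_write_and_advance(q, seen_by_dpc, 200);   /* overwrites command 100 */
    return q->tail;
}

/*
 * Common lock held around each submission: read and write happen as one
 * unit per submitter, modeled here by running them back to back.
 */
unsigned model_submit_serialized(ModelSubQueue *q)
{
    model_write_and_advance(q, model_read_tail(q), 100);
    model_write_and_advance(q, model_read_tail(q), 200);
    return q->tail;
}
```

In the unsynchronized variant the queue ends up with a tail of 1 and only command 200 in slot 0; in the serialized variant both commands are present and the tail is 2.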
Name: image002.png Type: image/png Size: 33527 bytes Desc: image002.png URL: From raymond.c.robles at intel.com Tue Jun 27 16:28:53 2017 From: raymond.c.robles at intel.com (Robles, Raymond C) Date: Tue, 27 Jun 2017 23:28:53 +0000 Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver In-Reply-To: References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com> <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6> <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D236CD@fmsmsx117.amr.corp.intel.com> Message-ID: <49158E750348AA499168FD41D889836082D24324@fmsmsx117.amr.corp.intel.com> Hi Tom, Thanks for bearing with me on the details of this. After looking more closely at the code, I agree with your assessment in the case where a DPC is issued for a secondary, internal command (we do need to acquire the StartIo lock). I’ll wait for Phison to catch up on this thread and provide the update on their unit testing. Thanks… Ray From: Tom Freeman [mailto:thomas.freeman at wdc.com] Sent: Tuesday, June 27, 2017 11:33 AM To: Robles, Raymond C ; sm.kumar at samsung.com; Justina Lai Cc: Larry Li ; nvmewin at lists.openfabrics.org Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Thanks Ray, I'm still confused - and I think it relates to a basic question I have about dpc routing. If I understand it correctly, IoCompletionRoutine is queued to run on the core matching its submission/completion queue. But, that instance of IoCompletionRoutine could simultaneously run on other cores. 
That's why we get the DpcLock, to synchronize multiple threads running the same instance of the dpc, IoCompletionRoutine. If that's correct, I think the following example shows why we need to get the StartIo lock in the completion path when it is issuing another NVMe command. The code path StartIo->...->ProcessIo issues i/o to submission queue 3 (on core 2). StartIo lock is automatically taken. Simultaneously, IoCompletionRoutine->....->ProcessIo() for comp queue 3 could be executing on a core other than core 2. If so, unless IoCompletionRoutine->....->ProcessIo() gets the StartIo lock, it will not be synchronized with StartIo->...->Process() on core 2. Let me know your thoughts, Tom Freeman Software Engineer, Device Manager and Driver Development Western Digital® Email: Thomas.Freeman at wdc.com Office: +1-507-322-2311 From: Robles, Raymond C [mailto:raymond.c.robles at intel.com] Sent: Monday, June 26, 2017 7:46 PM To: Tom Freeman >; sm.kumar at samsung.com; Justina Lai > Cc: Larry Li >; nvmewin at lists.openfabrics.org Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Hi Tom, Your statement is correct. I’m probably not being clear… sorry for the confusion. My assertion is that when cpu == cores, the driver should not be grabbing the StartIo lock as a practice. The StartIo lock is meant for Storport to aquire for call to the HwStartIo callback. If we are in the completion DPC, then that means we have already acquired the DPC lock, and we should not be acquiring any other lock as part of that DPC calling back into ProcessIo for a second command. That is the deadlock to which I’m referring (and the issue that Meenakshikumar from Samsung references). In my opinion, this issue should be resolved as part of this patch. Thoughts? 
Thanks… Ray From: Tom Freeman [mailto:thomas.freeman at wdc.com] Sent: Monday, June 26, 2017 1:31 PM To: Robles, Raymond C >; sm.kumar at samsung.com; Justina Lai > Cc: Larry Li >; nvmewin at lists.openfabrics.org Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Hi Ray, I missing something in the deadlock discussion. Since IoCompletionRoutine and StartIo run at the same IRQL, wouldn’t one run to completion before the other is scheduled thus preventing the deadlock? Thanks, Tom Freeman Software Engineer, Device Manager and Driver Development Western Digital® Email: Thomas.Freeman at wdc.com Office: +1-507-322-2311 From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Friday, June 23, 2017 2:17 PM To: sm.kumar at samsung.com; Justina Lai > Cc: Larry Li >; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Hi Meenakshikumar, Thanks for reviewing this patch and my comments. I agree with your statement below (highlighted). I was attempting to drive the possible solution where cores == queues… when admin commands can come from both Storport and the completion path. As you state, there are still several areas where a the lock is acquired for commands in the completion path. In order to address, I believe the correct solution is to actually not grab the StartIO lock in the completion path, but instead a DPC lock. Both run at the same IRQL, and will be handled appropriately by the OS with no deadlock. Essentially: · The OFA driver is acquiring StartIO Lock from processIo & IOCompletionDpcRoutine. So there is possibility for deadlock when we have an OS command requires more than one NVMe command. · I’m proposing we acquire the StartIO Lock from processIo and a DPC Lock from IOCompletionDpcRoutine. So there is no possibility of same lock acquired twice in single call stack. 
· I would also suggest removing any lingering areas where we issue a new DPC from IOCompletionDpcRoutine to call processIo… so that DPC waiting time for lock is reduced. Thanks… Ray From: MEENAKSHIKUMAR SOMASUNDARAM [mailto:sm.kumar at samsung.com] Sent: Friday, June 23, 2017 12:31 AM To: Robles, Raymond C >; Justina Lai > Cc: Larry Li >; nvmewin at lists.openfabrics.org Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Hi Ray, This is Meenakshikumar from Samsung. I have reviewed the fix and it looks fine to handle pAE->MultipleCoresToSingleQueueFlag = TRUE case while a new admin command is issued in completion path. For your comment, I am sharing my thoughts below : • When multiple cpu cores are mapped with single queue, the patch is removing startIoLock from processIO and startIoLock used only in IOCompletionDpcRoutine: • This will block NvmeStartIo when driver is processing a completion entry. ==> In OFA driver, Storport will acquire the StartIoLock before calling NVMeStartIo. We acquire StartIo lock only in the completion path in case if cores != queues, to synchronize multiple cores trying to access single queue. So in OFA driver, only if cores == queues, then both StartIO and DPCs will run concurrently. Apart from this, in pAE->MultipleCoresToSingleQueueFlag = FALSE case, the intention to synchronize issuing admin command from StartIO & various completion paths is not handled in similar fashion in all cases. For example, in SntiTranslateModeSenseResponse(), FormatNVMGetIdentify(), etc., lock is acquired. But in SntiTranslateTemperatureResponse(), SntiTranslateStartStopUnitResponse(), SntiTranslateWriteBufferResponse() etc., StartIO lock is not acquired. There could be race condition b/w an IOCTL and Completion path trying to issue admin command. This should be addressed in the OFA driver, might be in a different patch. 
Thanks, Meenakshikumar --------- Original Message --------- Sender : Robles, Raymond C > Date : 2017-06-23 04:50 (GMT+5:30) Title : Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Hi Justina, Intel has reviewed your and tested your patch. Please see the following comments: • Need to remove code blocks in #if 0 • When multiple cpu cores are mapped with single queue, the patch is removing startIoLock from processIO and startIoLock used only in IOCompletionDpcRoutine: • This will block NvmeStartIo when driver is processing a completion entry. • A couple of options: o Indicate in nvmeSnti.c when handling operations that require more than one command to the drive, and have that locking needs passed to ProcessIo appropriately. It does not seem plausible to isolate the changes to nvmeIO.c only. o You could also instead remove startIoLock from IoCompletionDpcRoutine while keeping startIoLock at processIo. This way we can avoid double lock and also we can make NvmeStartIo and IoCompletionDpcRoutine parallel. The key takeaway is that StartIo and IoCompletionDpcRoutine must be able to run concurrently with no deadlocks. As you’ve coded the patch, a deadlock is possible. Please revise your patch to account for this scenario. 
Thanks… Ray From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Thursday, June 22, 2017 3:55 PM To: Justina Lai > Cc: nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Hi Justina, Please go the Open Fabrics Alliance website and subscribe to our reflector on this page: http://lists.openfabrics.org/mailman/listinfo/nvmewin The main NVMe WG page can be found here: https://www.openfabrics.org/index.php/working-groups.html Thanks… Ray From: Justina Lai [mailto:justina_lai at phison.com] Sent: Monday, June 19, 2017 11:45 PM To: Robles, Raymond C > Cc: nvmewin at lists.openfabrics.org Subject: RE: Compatibility issue with 1.5 version nvme driver Dear Raymond, Yes, I have received your mail regarding unit test. We are now carrying on the test and hope to finish it within this week. May I know how to officially subscribe to the reflector email list? I saw my mail is already in the Non-digested Members of nvmewin. Should I set as digested mode? Thank you! Best Regards, Justina Lai #5707 From: Robles, Raymond C [mailto:raymond.c.robles at intel.com] Sent: Tuesday, June 20, 2017 5:37 AM To: Justina Lai > Cc: nvmewin at lists.openfabrics.org Subject: RE: Compatibility issue with 1.5 version nvme driver Hi Justina, Yes, we did receive the email and have been emailing you. I notice that your emails keep getting bounced off the reflector list and I have to approve them. Could you please officially subscribe to the reflector email list so that your emails are not bounced (which require me to approve and forward). There was an additional request for you to run the normal unit tests required for all OFA patches. I’ll forward that email again after you officially subscribe the email list. Did you receive that email? 
Thanks…
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Justina Lai
Sent: Wednesday, May 24, 2017 10:18 PM
To: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi,

Have you received my mail on 5/22?

Thanks!

Best Regards,
Justina Lai #5707

From: Justina Lai
Sent: Monday, May 22, 2017 1:14 PM
To: 'nvmewin at lists.openfabrics.org'
Cc: Larry Li
Subject: Compatibility issue with 1.5 version nvme driver

Dear Sir/Madam,

We are facing a compatibility issue on our PS5007 NVMe device with the recent 1.5 version of the nvme driver. Please see the issue description below.

The Phison PS5007 supports 7 IO queues. If we use the PS5007 with the OFA driver on a platform with more than 7 cores (for example, an 8-core or 12-core PC), the PC hangs and cannot enter the OS, as shown below:

[cid:image001.jpg at 01D2EE8D.0DB3EAE0]

After debugging on our side, we found that the failure is caused by the following flow:

----------------------------------------------------------------------
..........
1. SntiTranslateModeSense()
..........
2. IoCompletionRoutine()

    if (pDpc != NULL) {
        ASSERT(pAE->ntldrDump == FALSE);
        if (pAE->MultipleCoresToSingleQueueFlag) {
            /* ----> first acquisition of StartIoLock */
            StorPortAcquireSpinLock(pAE, StartIoLock, NULL, &StartLockHandle);
        } else {
            StorPortAcquireSpinLock(pAE, DpcLock, pDpc, &DpcLockhandle);
        }
    }
    .........
    callStorportNotification =
        pSrbExtension->pNvmeCompletionRoutine(pAE, (PVOID)pSrbExtension)
        && (pSrbExtension->pSrb != NULL);
    .......

3. SntiTranslateModeSenseResponse()

    .......
    case MODE_SENSE_RETURN_ALL:
    .......
    if (supportsVwc == TRUE) {
        pSrbExt->pNvmeCompletionRoutine = SntiCompletionCallbackRoutine;
        /* Finally, make sure we issue the GET FEATURES command */
        SntiBuildGetFeaturesCmd(pSrbExt, VOLATILE_WRITE_CACHE);
        ioStarted = ProcessIo(pSrbExt->pNvmeDevExt, pSrbExt, NVME_QUEUE_TYPE_ADMIN, TRUE);

4. ProcessIo()

    ........
    if (AcquireLock == TRUE) {
        /* ----> second acquisition of the same StartIoLock;
           causes (Assertion failure - code c0000420) */
        StorPortAcquireSpinLock(pAdapterExtension, StartIoLock, NULL, &hStartIoLock);
    }
----------------------------------------------------------------------

Since many users are now using our PS5007 device with 8-core or 12-core PCs, they are facing this issue right now. Could you please help modify the driver to solve this problem as soon as possible? If anything is unclear, please let us know.

Thank you very much for the help!

Best Regards,
Justina Lai #5707

This message and any attachments are confidential and may be legally privileged. Any unauthorized review, use or distribution by anyone other than the intended recipient is strictly prohibited. If you are not the intended recipient, please immediately notify the sender, completely delete the message and any attachments, and destroy all copies. Your cooperation will be highly appreciated.
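The double acquisition in the flow above can be reproduced in miniature. The following is a minimal sketch, assuming a mock non-recursive lock that records a second acquire instead of hanging; `MockLock`, `mock_acquire`, `mock_release`, `mock_process_io` and `mock_completion_path` are hypothetical names for illustration only and are not part of the OFA driver or Storport.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive lock such as StartIoLock.
 * It only tracks ownership so that a second acquire can be counted
 * instead of hanging; it is not the Storport implementation. */
typedef struct {
    bool held;
    int  double_acquires;   /* times an already-held lock was requested */
} MockLock;

/* Returns true if the lock was taken, false if it was already held. */
bool mock_acquire(MockLock *lock)
{
    if (lock->held) {
        lock->double_acquires++;   /* a real spin lock would hang or assert here */
        return false;
    }
    lock->held = true;
    return true;
}

void mock_release(MockLock *lock)
{
    assert(lock->held);
    lock->held = false;
}

/* Step 4 of the flow: submit a command, optionally taking the lock first. */
void mock_process_io(MockLock *start_io_lock, bool acquire_lock)
{
    bool took_lock = acquire_lock && mock_acquire(start_io_lock);
    /* ... the command would be placed on the submission queue here ... */
    if (took_lock) {
        mock_release(start_io_lock);
    }
}

/* Steps 2-3 of the flow for the MultipleCoresToSingleQueueFlag case:
 * the completion routine holds the lock while the completion callback
 * issues a second command. Returns the number of double acquires seen. */
int mock_completion_path(bool callback_acquires_lock)
{
    MockLock start_io_lock = { false, 0 };

    mock_acquire(&start_io_lock);                             /* step 2 */
    mock_process_io(&start_io_lock, callback_acquires_lock);  /* steps 3-4 */
    mock_release(&start_io_lock);

    return start_io_lock.double_acquires;
}
```

Calling `mock_completion_path(true)` models the reported flow (the callback passes TRUE to ProcessIo while the lock is already held), and `mock_completion_path(false)` models a callback that leaves locking to the completion routine.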
_______________________________________________
nvmewin mailing list
nvmewin at lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/nvmewin

Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer: This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 11633 bytes
Desc: image001.jpg
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 33527 bytes
Desc: image002.png
URL:

From raymond.c.robles at intel.com  Tue Jun 27 16:31:08 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Tue, 27 Jun 2017 23:31:08 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <4416b1013881453fa83011ff80dd47d5@ExMBX2.phison.com>
References: <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com> <4416b1013881453fa83011ff80dd47d5@ExMBX2.phison.com>
Message-ID: <49158E750348AA499168FD41D889836082D24345@fmsmsx117.amr.corp.intel.com>

Hi Justina,

Could you please reply to the latest emails in this thread, where Tom and I are discussing the possible solutions? After that discussion, I'm OK with acquiring the StartIo lock as long as it's in a new DPC spawned by the IOCompletionDpcRoutine.

Please also report the results of the unit tests previously outlined.

Thanks...
Ray

From: Justina Lai [mailto:justina_lai at phison.com]
Sent: Tuesday, June 27, 2017 3:11 AM
To: Robles, Raymond C
Cc: nvmewin at lists.openfabrics.org; Larry Li
Subject: RE: Compatibility issue with 1.5 version nvme driver

Dear Raymond,

Sorry for the late reply.

Option-1
Please refer to the attached nvmeSnti.c.
Modification: SntiTranslateModeSenseResponse() determines whether process() executes the SpinLock.

Option-2
Please refer to the attached nvmeStd.c.
Modification: Directly mark SpinLock/ReleaseSpinLock in IoCompletionRoutine().

Thanks!
Best Regards,
Justina Lai #5707

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Friday, June 23, 2017 7:20 AM
To: Justina Lai
Cc: nvmewin at lists.openfabrics.org; Larry Li
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Justina,

Intel has reviewed and tested your patch. Please see the following comments:

* Need to remove code blocks in #if 0
* When multiple cpu cores are mapped to a single queue, the patch removes startIoLock from processIO, and startIoLock is used only in IOCompletionDpcRoutine:
    * This will block NvmeStartIo when the driver is processing a completion entry.
* A couple of options:
    * Indicate in nvmeSnti.c when handling operations that require more than one command to the drive, and have those locking needs passed to ProcessIo appropriately. It does not seem plausible to isolate the changes to nvmeIO.c only.
    * You could instead remove startIoLock from IoCompletionDpcRoutine while keeping startIoLock in processIo. This way we avoid the double lock, and NvmeStartIo and IoCompletionDpcRoutine can run in parallel.

The key takeaway is that StartIo and IoCompletionDpcRoutine must be able to run concurrently with no deadlocks. As you've coded the patch, a deadlock is possible. Please revise your patch to account for this scenario.

Thanks...
Ray

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 11633 bytes
Desc: image001.jpg
URL:

From judy.brock at samsung.com  Fri Jun 30 06:15:45 2017
From: judy.brock at samsung.com (Judy Brock)
Date: Fri, 30 Jun 2017 13:15:45 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D24324@fmsmsx117.amr.corp.intel.com>
References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com> <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6> <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D236CD@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D24324@fmsmsx117.amr.corp.intel.com>
Message-ID: <7df726714f614aca94deac3018fee192@samsung.com>

Hi Ray et al,

We just want to clarify our position/feedback after all the recent back and forth.

In the MultipleCoresToSingleQueueFlag == TRUE completion path case:

• we understand and agree with acquiring the StartIO lock in the IOCompletion routine. We do not support acquiring a DPC lock instead.
• we agree the code SHOULD NOT try to take the StartIO lock in ProcessIO when issuing a new secondary command (this caused the deadlock Justina reported)
• we are OK with Justina’s original fix with respect to this issue

In the MultipleCoresToSingleQueueFlag == FALSE completion path case (i.e. cores == queues):

• The IOCompletion routine should take a DPC lock (as it does today)
• The various translation completion command handlers should all be consistent and SHOULD take the StartIO lock in ProcessIO when issuing a new command

Thanks,
Judy

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Tuesday, June 27, 2017 4:29 PM
To: Tom Freeman; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Tom,

Thanks for bearing with me on the details of this. After looking more closely at the code, I agree with your assessment in the case where a DPC is issued for a secondary, internal command (we do need to acquire the StartIo lock).

I’ll wait for Phison to catch up on this thread and provide the update on their unit testing.

Thanks…
Ray

From: Tom Freeman [mailto:thomas.freeman at wdc.com]
Sent: Tuesday, June 27, 2017 11:33 AM
To: Robles, Raymond C; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Thanks Ray,

I'm still confused - and I think it relates to a basic question I have about dpc routing. If I understand it correctly, IoCompletionRoutine is queued to run on the core matching its submission/completion queue. But that instance of IoCompletionRoutine could simultaneously run on other cores. That's why we get the DpcLock: to synchronize multiple threads running the same instance of the dpc, IoCompletionRoutine.

If that's correct, I think the following example shows why we need to get the StartIo lock in the completion path when it is issuing another NVMe command.

The code path StartIo->...->ProcessIo issues i/o to submission queue 3 (on core 2). The StartIo lock is automatically taken. Simultaneously, IoCompletionRoutine->....->ProcessIo() for completion queue 3 could be executing on a core other than core 2. If so, unless IoCompletionRoutine->....->ProcessIo() gets the StartIo lock, it will not be synchronized with StartIo->...->ProcessIo() on core 2.

Let me know your thoughts,
Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Monday, June 26, 2017 7:46 PM
To: Tom Freeman; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Tom,

Your statement is correct. I’m probably not being clear… sorry for the confusion.

My assertion is that when cores == queues, the driver should not be grabbing the StartIo lock as a practice. The StartIo lock is meant for Storport to acquire for the call to the HwStartIo callback. If we are in the completion DPC, then we have already acquired the DPC lock, and we should not be acquiring any other lock as part of that DPC calling back into ProcessIo for a second command. That is the deadlock to which I’m referring (and the issue that Meenakshikumar from Samsung references).

In my opinion, this issue should be resolved as part of this patch. Thoughts?

Thanks…
Ray

From: Tom Freeman [mailto:thomas.freeman at wdc.com]
Sent: Monday, June 26, 2017 1:31 PM
To: Robles, Raymond C; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Ray,

I'm missing something in the deadlock discussion. Since IoCompletionRoutine and StartIo run at the same IRQL, wouldn’t one run to completion before the other is scheduled, thus preventing the deadlock?

Thanks,
Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Friday, June 23, 2017 2:17 PM
To: sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Meenakshikumar,

Thanks for reviewing this patch and my comments. I agree with your statement below (highlighted). I was attempting to drive the possible solution where cores == queues… when admin commands can come from both Storport and the completion path.

As you state, there are still several areas where the lock is acquired for commands in the completion path. To address this, I believe the correct solution is to not grab the StartIO lock in the completion path at all, but a DPC lock instead. Both run at the same IRQL, and will be handled appropriately by the OS with no deadlock.

Essentially:

• The OFA driver is acquiring the StartIO lock from processIo and from IOCompletionDpcRoutine. So there is a possibility of deadlock when an OS command requires more than one NVMe command.
• I’m proposing we acquire the StartIO lock from processIo and a DPC lock from IOCompletionDpcRoutine. So there is no possibility of the same lock being acquired twice in a single call stack.
• I would also suggest removing any lingering areas where we issue a new DPC from IOCompletionDpcRoutine to call processIo… so that the DPC waiting time for the lock is reduced.

Thanks…
Ray

From: MEENAKSHIKUMAR SOMASUNDARAM [mailto:sm.kumar at samsung.com]
Sent: Friday, June 23, 2017 12:31 AM
To: Robles, Raymond C; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Ray,

This is Meenakshikumar from Samsung.
I have reviewed the fix, and it looks fine for handling the pAE->MultipleCoresToSingleQueueFlag = TRUE case when a new admin command is issued in the completion path.

Regarding your comment, I am sharing my thoughts below:

• When multiple cpu cores are mapped with single queue, the patch is removing startIoLock from processIO and startIoLock used only in IOCompletionDpcRoutine:
    • This will block NvmeStartIo when driver is processing a completion entry.

==> In the OFA driver, Storport acquires the StartIoLock before calling NVMeStartIo. We acquire the StartIo lock in the completion path only when cores != queues, to synchronize multiple cores trying to access a single queue. So in the OFA driver, StartIO and the DPCs run concurrently only if cores == queues.

Apart from this, in the pAE->MultipleCoresToSingleQueueFlag = FALSE case, synchronization of admin commands issued from StartIO and from the various completion paths is not handled consistently. For example, in SntiTranslateModeSenseResponse(), FormatNVMGetIdentify(), etc., the lock is acquired. But in SntiTranslateTemperatureResponse(), SntiTranslateStartStopUnitResponse(), SntiTranslateWriteBufferResponse(), etc., the StartIO lock is not acquired. There could be a race condition between an IOCTL and the completion path when both try to issue an admin command. This should be addressed in the OFA driver, possibly in a different patch.

Thanks,
Meenakshikumar
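The locking position summarized at the top of this message can be written down as a small decision table. This is only a sketch under stated assumptions: `CompletionLock`, `completion_lock_for` and `process_io_should_acquire_start_io` are hypothetical names introduced for illustration, not functions in the OFA driver, and the boolean result is meant to correspond to the AcquireLock argument that the quoted flow shows being passed to ProcessIo.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two locks discussed in the thread. */
typedef enum {
    LOCK_DPC,        /* per-DPC lock taken when cores == queues */
    LOCK_START_IO    /* StartIo lock taken when cores share a queue */
} CompletionLock;

/* Which lock the completion routine takes, per the summarized position:
 * StartIo lock when MultipleCoresToSingleQueueFlag is TRUE, DPC lock
 * otherwise. */
CompletionLock completion_lock_for(bool multiple_cores_to_single_queue)
{
    return multiple_cores_to_single_queue ? LOCK_START_IO : LOCK_DPC;
}

/* Whether a completion-path handler that issues a secondary command
 * should ask ProcessIo to take the StartIo lock.
 *  - TRUE case:  the completion routine already holds the StartIo lock,
 *                so taking it again would be the reported double acquire.
 *  - FALSE case: only the DPC lock is held, so the StartIo lock is
 *                needed to synchronize with the StartIo submission path. */
bool process_io_should_acquire_start_io(bool multiple_cores_to_single_queue)
{
    return completion_lock_for(multiple_cores_to_single_queue) != LOCK_START_IO;
}
```

If every translation completion handler derived its AcquireLock argument from one helper like this, the inconsistency Meenakshikumar points out between handlers would not arise; whether that fits the actual driver structure is for the patch authors to judge.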
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 11633 bytes
Desc: image001.jpg
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 33527 bytes
Desc: image002.png
URL:

From raymond.c.robles at intel.com  Fri Jun 30 10:30:37 2017
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Fri, 30 Jun 2017 17:30:37 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <7df726714f614aca94deac3018fee192@samsung.com>
References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com> <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6> <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D236CD@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D24324@fmsmsx117.amr.corp.intel.com> <7df726714f614aca94deac3018fee192@samsung.com>
Message-ID: <49158E750348AA499168FD41D889836082D271FA@fmsmsx117.amr.corp.intel.com>

Hi Judy,

Thank you for your feedback. I think we are all in agreement on Phison’s patch. My only concern is the original patch did not address any changes in nvmeSnti.c where the completion path needed to invoke the StartIo lock appropriately. Did you observe the same?
Thanks… Ray From: Judy Brock [mailto:judy.brock at samsung.com] Sent: Friday, June 30, 2017 6:16 AM To: Robles, Raymond C ; Tom Freeman ; sm.kumar at samsung.com; Justina Lai Cc: Larry Li ; nvmewin at lists.openfabrics.org Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Hi Ray et al, We just want to clarify our position/feedback after all the recent back and forth. In the MultipleCoresToSingleQueueFlag == TRUE completion path case: • we understand and agree with acquiring the StartIO lock in the IOCompletion Routine. We do not support acquiring a DPC lock instead. • we agree the code SHOULD NOT try to take the StartIO lock in ProcessIO if trying to issue a new secondary command (this caused the deadlock Justina reported) • we are ok with Justina’s original fix with respect to this issue In the MultipleCoresToSingleQueueFlag == FALSE completion path case (i.e. cores == queues): • The IOCompletion routine should take a DPC lock (as it does today) • The various translation completion command handlers should all be consistent and SHOULD take the StartIO lock in ProcessIO if trying to issue a new command Thanks, Judy From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Tuesday, June 27, 2017 4:29 PM To: Tom Freeman; sm.kumar at samsung.com; Justina Lai Cc: Larry Li; nvmewin at lists.openfabrics.org Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver Hi Tom, Thanks for bearing with me on the details of this. After looking more closely at the code, I agree with your assessment in the case where a DPC is issued for a secondary, internal command (we do need to acquire the StartIo lock). I’ll wait for Phison to catch up on this thread and provide the update on their unit testing. 
Thanks…
Ray

From: Tom Freeman [mailto:thomas.freeman at wdc.com]
Sent: Tuesday, June 27, 2017 11:33 AM
To: Robles, Raymond C; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Thanks Ray,

I'm still confused, and I think it relates to a basic question I have about DPC routing. If I understand it correctly, IoCompletionRoutine is queued to run on the core matching its submission/completion queue. But that instance of IoCompletionRoutine could simultaneously run on other cores. That's why we get the DpcLock: to synchronize multiple threads running the same instance of the DPC, IoCompletionRoutine.

If that's correct, I think the following example shows why we need to get the StartIo lock in the completion path when it is issuing another NVMe command. The code path StartIo->...->ProcessIo issues I/O to submission queue 3 (on core 2). The StartIo lock is automatically taken. Simultaneously, IoCompletionRoutine->....->ProcessIo() for completion queue 3 could be executing on a core other than core 2. If so, unless IoCompletionRoutine->....->ProcessIo() gets the StartIo lock, it will not be synchronized with StartIo->...->ProcessIo() on core 2.

Let me know your thoughts,
Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Monday, June 26, 2017 7:46 PM
To: Tom Freeman; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Tom,

Your statement is correct. I’m probably not being clear… sorry for the confusion. My assertion is that when cpu == cores, the driver should not be grabbing the StartIo lock as a practice. The StartIo lock is meant for Storport to acquire for the call to the HwStartIo callback.
If we are in the completion DPC, then that means we have already acquired the DPC lock, and we should not be acquiring any other lock as part of that DPC calling back into ProcessIo for a second command. That is the deadlock to which I’m referring (and the issue that Meenakshikumar from Samsung references). In my opinion, this issue should be resolved as part of this patch. Thoughts?

Thanks…
Ray

From: Tom Freeman [mailto:thomas.freeman at wdc.com]
Sent: Monday, June 26, 2017 1:31 PM
To: Robles, Raymond C; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Ray,

I'm missing something in the deadlock discussion. Since IoCompletionRoutine and StartIo run at the same IRQL, wouldn’t one run to completion before the other is scheduled, thus preventing the deadlock?

Thanks,
Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital®
Email: Thomas.Freeman at wdc.com
Office: +1-507-322-2311

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Friday, June 23, 2017 2:17 PM
To: sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Meenakshikumar,

Thanks for reviewing this patch and my comments. I agree with your statement below (highlighted). I was attempting to drive the possible solution where cores == queues… when admin commands can come from both Storport and the completion path. As you state, there are still several areas where the lock is acquired for commands in the completion path. In order to address this, I believe the correct solution is to actually not grab the StartIO lock in the completion path, but instead a DPC lock. Both run at the same IRQL, and will be handled appropriately by the OS with no deadlock.
Essentially:
• The OFA driver is acquiring the StartIO lock from processIo and IOCompletionDpcRoutine, so there is a possibility of deadlock when an OS command requires more than one NVMe command.
• I’m proposing we acquire the StartIO lock from processIo and a DPC lock from IOCompletionDpcRoutine, so there is no possibility of the same lock being acquired twice in a single call stack.
• I would also suggest removing any lingering areas where we issue a new DPC from IOCompletionDpcRoutine to call processIo… so that DPC waiting time for the lock is reduced.

Thanks…
Ray

From: MEENAKSHIKUMAR SOMASUNDARAM [mailto:sm.kumar at samsung.com]
Sent: Friday, June 23, 2017 12:31 AM
To: Robles, Raymond C; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: RE: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Ray,

This is Meenakshikumar from Samsung. I have reviewed the fix and it looks fine to handle the pAE->MultipleCoresToSingleQueueFlag = TRUE case while a new admin command is issued in the completion path.

For your comment, I am sharing my thoughts below:
• When multiple cpu cores are mapped with single queue, the patch is removing startIoLock from processIO and startIoLock used only in IOCompletionDpcRoutine:
    • This will block NvmeStartIo when driver is processing a completion entry.
==> In the OFA driver, Storport will acquire the StartIoLock before calling NVMeStartIo. We acquire the StartIo lock in the completion path only if cores != queues, to synchronize multiple cores trying to access a single queue. So in the OFA driver, StartIO and DPCs will run concurrently only if cores == queues.

Apart from this, in the pAE->MultipleCoresToSingleQueueFlag = FALSE case, the intention to synchronize issuing admin commands from StartIO and the various completion paths is not handled in a similar fashion in all cases. For example, in SntiTranslateModeSenseResponse(), FormatNVMGetIdentify(), etc., the lock is acquired.
But in SntiTranslateTemperatureResponse(), SntiTranslateStartStopUnitResponse(), SntiTranslateWriteBufferResponse(), etc., the StartIO lock is not acquired. There could be a race condition between an IOCTL and the completion path trying to issue an admin command. This should be addressed in the OFA driver, possibly in a different patch.

Thanks,
Meenakshikumar

--------- Original Message ---------
Sender : Robles, Raymond C
Date : 2017-06-23 04:50 (GMT+5:30)
Title : Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Justina,

Intel has reviewed and tested your patch. Please see the following comments:
• Need to remove code blocks in #if 0
• When multiple cpu cores are mapped with single queue, the patch is removing startIoLock from processIO and startIoLock used only in IOCompletionDpcRoutine:
    • This will block NvmeStartIo when driver is processing a completion entry.
• A couple of options:
    o Indicate in nvmeSnti.c when handling operations that require more than one command to the drive, and have those locking needs passed to ProcessIo appropriately. It does not seem plausible to isolate the changes to nvmeIO.c only.
    o You could also instead remove startIoLock from IoCompletionDpcRoutine while keeping startIoLock at processIo. This way we can avoid the double lock and also make NvmeStartIo and IoCompletionDpcRoutine run in parallel.

The key takeaway is that StartIo and IoCompletionDpcRoutine must be able to run concurrently with no deadlocks. As you’ve coded the patch, a deadlock is possible. Please revise your patch to account for this scenario.
Thanks…
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Thursday, June 22, 2017 3:55 PM
To: Justina Lai
Cc: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Justina,

Please go to the Open Fabrics Alliance website and subscribe to our reflector on this page: http://lists.openfabrics.org/mailman/listinfo/nvmewin

The main NVMe WG page can be found here: https://www.openfabrics.org/index.php/working-groups.html

Thanks…
Ray

From: Justina Lai [mailto:justina_lai at phison.com]
Sent: Monday, June 19, 2017 11:45 PM
To: Robles, Raymond C
Cc: nvmewin at lists.openfabrics.org
Subject: RE: Compatibility issue with 1.5 version nvme driver

Dear Raymond,

Yes, I have received your mail regarding the unit test. We are now carrying out the test and hope to finish it within this week.

May I know how to officially subscribe to the reflector email list? I saw my mail is already in the Non-digested Members of nvmewin. Should I set it to digested mode?

Thank you!

Best Regards,
Justina Lai
#5707

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Tuesday, June 20, 2017 5:37 AM
To: Justina Lai
Cc: nvmewin at lists.openfabrics.org
Subject: RE: Compatibility issue with 1.5 version nvme driver

Hi Justina,

Yes, we did receive the email and have been emailing you. I notice that your emails keep getting bounced off the reflector list and I have to approve them. Could you please officially subscribe to the reflector email list so that your emails are not bounced (which requires me to approve and forward them)?

There was an additional request for you to run the normal unit tests required for all OFA patches. I’ll forward that email again after you officially subscribe to the email list. Did you receive that email?
Thanks…
Ray

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Justina Lai
Sent: Wednesday, May 24, 2017 10:18 PM
To: nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi,

Have you received my mail on 5/22?

Thanks!

Best Regards,
Justina Lai
#5707

From: Justina Lai
Sent: Monday, May 22, 2017 1:14 PM
To: 'nvmewin at lists.openfabrics.org'
Cc: Larry Li
Subject: Compatibility issue with 1.5 version nvme driver

Dear Sir/Madam,

We are facing compatibility issue on our PS5007 NVMe device with recent 1.5 version nvme driver. Please check below issue description.

Phison PS5007 supports 7 IO queues, and if we use PS5007 with OFA driver on the platform core number>7, ex: 8-core or 12-core PC, we will see PC hang up and cannot enter OS like below:

[cid:image001.jpg at 01D2EE8D.0DB3EAE0]

After debugging on our side, we found the fail is caused by below flow:

------------------------------------------------------------
....
1. SntiTranslateModeSense()
....
2. IoCompletionRoutine()

    if (pDpc != NULL) {
        ASSERT(pAE->ntldrDump == FALSE);
        if (pAE->MultipleCoresToSingleQueueFlag) {
            StorPortAcquireSpinLock(pAE, StartIoLock, NULL, &StartLockHandle);    ----> execute lock
        } else {
            StorPortAcquireSpinLock(pAE, DpcLock, pDpc, &DpcLockhandle);
        }
    }
    ....
    callStorportNotification =
        pSrbExtension->pNvmeCompletionRoutine(pAE, (PVOID)pSrbExtension)
        && (pSrbExtension->pSrb != NULL);
    ....

3. SntiTranslateModeSenseResponse()

    ....
    case MODE_SENSE_RETURN_ALL:
    ....
        if (supportsVwc == TRUE) {
            pSrbExt->pNvmeCompletionRoutine = SntiCompletionCallbackRoutine;

            /* Finally, make sure we issue the GET FEATURES command */
            SntiBuildGetFeaturesCmd(pSrbExt, VOLATILE_WRITE_CACHE);

            ioStarted = ProcessIo(pSrbExt->pNvmeDevExt,
                                  pSrbExt,
                                  NVME_QUEUE_TYPE_ADMIN,
                                  TRUE);

4. ProcessIo()

    ....
    if (AcquireLock == TRUE) {
        StorPortAcquireSpinLock(pAdapterExtension, StartIoLock, NULL, &hStartIoLock);    ----> double execute lock and cause (Assertion failure - code c0000420)
    }
------------------------------------------------------------

Since many users are now using our PS5007 device with 8-core or 12-core PC, they are facing this issue right now. Could you please help to modify driver to solve this problem asap?

Any unclear point, please kindly let us know. Thank you very much for the help!

Best Regards,
Justina Lai
#5707

This message and any attachments are confidential and may be legally privileged. Any unauthorized review, use or distribution by anyone other than the intended recipient is strictly prohibited. If you are not the intended recipient, please immediately notify the sender, completely delete the message and any attachments, and destroy all copies. Your cooperation will be highly appreciated.
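The flow reported above can be reproduced in miniature outside the driver. What follows is a minimal, single-threaded sketch in plain C and not OFA driver code: SimLock, sim_acquire, sim_release, sim_process_io and sim_completion_path are hypothetical names that only model a non-recursive lock such as StartIoLock. As an assumption made for the demo, the simulated acquire returns false on a second acquisition instead of spinning forever, so the double lock from steps 2 and 4 becomes observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spin lock such as StartIoLock. */
typedef struct {
    bool held;
} SimLock;

/* Returns false instead of spinning when the lock is already held, so the
 * self-deadlock can be observed in a single-threaded demo. */
static bool sim_acquire(SimLock *lock)
{
    if (lock->held) {
        return false;
    }
    lock->held = true;
    return true;
}

static void sim_release(SimLock *lock)
{
    lock->held = false;
}

/* Models ProcessIo(): takes the lock only when the caller asks for it. */
static bool sim_process_io(SimLock *start_io_lock, bool acquire_lock)
{
    if (acquire_lock && !sim_acquire(start_io_lock)) {
        return false; /* second acquisition: the real driver would hang or assert */
    }
    /* ... the command would be submitted to the queue here ... */
    if (acquire_lock) {
        sim_release(start_io_lock);
    }
    return true;
}

/* Models the completion path: when several cores share one queue, the
 * completion routine takes the StartIo lock first, then the completion
 * callback issues a secondary command through sim_process_io(). The DPC
 * lock taken in the other case is a different lock and is not modeled. */
static bool sim_completion_path(bool multiple_cores_to_single_queue,
                                bool child_acquires_lock)
{
    SimLock start_io_lock = { false };
    bool ok;

    if (multiple_cores_to_single_queue) {
        sim_acquire(&start_io_lock);
    }
    ok = sim_process_io(&start_io_lock, child_acquires_lock);
    if (multiple_cores_to_single_queue) {
        sim_release(&start_io_lock);
    }
    return ok;
}
```

Under these assumptions, sim_completion_path(true, true) returns false, which mirrors the double acquisition in the reported flow, while sim_completion_path(true, false) returns true, which corresponds to passing FALSE for the AcquireLock argument when the completion path already holds the lock.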
_______________________________________________
nvmewin mailing list
nvmewin at lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/nvmewin

Western Digital Corporation (and its subsidiaries) E-mail Confidentiality Notice & Disclaimer: This e-mail and any files transmitted with it may contain confidential or legally privileged information of WDC and/or its affiliates, and are intended solely for the use of the individual or entity to which they are addressed. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete the e-mail in its entirety from your system.

From judy.brock at samsung.com Fri Jun 30 10:37:28 2017
From: judy.brock at samsung.com (Judy Brock)
Date: Fri, 30 Jun 2017 17:37:28 +0000
Subject: [nvmewin] Compatibility issue with 1.5 version nvme driver
In-Reply-To: <49158E750348AA499168FD41D889836082D271FA@fmsmsx117.amr.corp.intel.com>
References: <49158E750348AA499168FD41D889836082D21974@fmsmsx117.amr.corp.intel.com> <72e3a49d013142819ddb77a9a2c941f3@ExMBX2.phison.com> <61e5436482f94a7f9a13ddb98942cf1b@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D1F59A@fmsmsx117.amr.corp.intel.com> <41156d2cf2ac489c91c045c0a4069e57@ExMBX2.phison.com> <49158E750348AA499168FD41D889836082D21916@fmsmsx117.amr.corp.intel.com> <20170623073128epcms5p6816f49800689ec7ecee0b4b7e77f7b71@epcms5p6> <49158E750348AA499168FD41D889836082D2216E@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D236CD@fmsmsx117.amr.corp.intel.com> <49158E750348AA499168FD41D889836082D24324@fmsmsx117.amr.corp.intel.com> <7df726714f614aca94deac3018fee192@samsung.com> <"49158E750348AA499168FD41D889836082D 271FA"@fmsmsx117.amr.corp.intel.com>
Message-ID: <726bf4aac48747e6b4011471a440ba8f@samsung.com>

Yes, we agree. That is why we included the following statement below:

“The various translation completion command handlers should all be consistent and SHOULD take the StartIO lock in ProcessIO if trying to issue a new command”

Right now, they are not all consistent.

Thanks,
Judy

From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C
Sent: Friday, June 30, 2017 10:31 AM
To: Judy Brock; Tom Freeman; sm.kumar at samsung.com; Justina Lai
Cc: Larry Li; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] Compatibility issue with 1.5 version nvme driver

Hi Judy,

Thank you for your feedback. I think we are all in agreement on Phison’s patch.
My only concern is the original patch did not address any changes in nvmeSnti.c where the completion path needed to invoke the StartIo lock appropriately. Did you observe the same?

Thanks…
Ray
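The locking position stated in this thread for the two MultipleCoresToSingleQueueFlag cases can be written down as a small decision table. The sketch below is an illustration only: the enum and both function names are hypothetical and do not exist in the OFA driver. It encodes which lock the completion routine takes, and whether a translation completion handler should ask ProcessIo to take the StartIo lock when it issues a secondary command.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two locks discussed in the thread. */
typedef enum {
    SIM_LOCK_START_IO,
    SIM_LOCK_DPC
} SimCompletionLock;

/* Lock taken by the completion routine:
 *   flag TRUE  (several cores share one queue) -> StartIo lock
 *   flag FALSE (cores == queues)               -> DPC lock */
static SimCompletionLock sim_completion_lock(bool multiple_cores_to_single_queue)
{
    return multiple_cores_to_single_queue ? SIM_LOCK_START_IO : SIM_LOCK_DPC;
}

/* Whether a completion handler issuing a secondary command should have
 * ProcessIo acquire the StartIo lock:
 *   flag TRUE  -> no, the completion routine already holds it
 *   flag FALSE -> yes, only the DPC lock is held at that point */
static bool sim_secondary_cmd_acquires_start_io(bool multiple_cores_to_single_queue)
{
    return !multiple_cores_to_single_queue;
}
```

A handler written against such a table would compute the AcquireLock argument from the flag instead of hard-coding TRUE or FALSE, which is one way to make the handlers consistent with each other; whether the driver should be structured this way is left open by the thread.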