[nvmewin] Samsung patch for Hot plug fixes
Judy Brock-SSI
judy.brock at ssi.samsung.com
Wed Nov 5 13:55:33 PST 2014
Hi All,
>>And since this is not a fatal error, we can ignore checking the SC and SCT bits.
And again, just to emphasize, the purpose of learning is not to see if we can access media w/out errors – the purpose is just to establish the MSI-x/logical processor relationship. This can be confirmed with the authors of the original learning code.
>> Also, when the device returns NAMESPACE_NOT_READY, the driver does not complete core learning, and this may result in performance degradation because a context switch will happen when the device interrupts.
To clarify, the above describes the behavior of the driver if it does not ignore the SC and SCT bits. This is not the behavior of the submitted hot-plug patch driver.
Thanks,
Judy
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Wednesday, November 05, 2014 8:03 AM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] FW: RE: Samsung patch for Hot plug fixes
Dear all,
Please see the reply from Suman and provide your thoughts.
Thanks,
Alex
From: SUMAN PRAKASH B [mailto:suman.p at samsung.com]
Sent: Wednesday, November 05, 2014 3:55 AM
To: Alex Chang; nvmewin at lists.openfabrics.org; judy.brock at ssi.samsung.com
Subject: Re: RE: Samsung patch for Hot plug fixes
Hi Alex,
1. In the NVMeInitCallback function, why don't we need to check SC and SCT anymore for the NVMeWaitOnLearnMapping case?
[Suman] As Judy has mentioned, the namespace may not be ready early on during initialization. We have tested with different devices and observed that one of the devices returns NAMESPACE_NOT_READY. Since this is not a fatal error, we can ignore checking the SC and SCT bits. Also, when the device returns NAMESPACE_NOT_READY, the driver does not complete core learning, and this may result in performance degradation because a context switch will happen when the device interrupts.
2. The resolution of the surprise removal timer is set to 1 second. What are the reasons for setting it to 1 second rather than another value?
[Suman] Generally, reading a device register in a PCIe SSD is costly and can degrade performance. When we implemented this logic, there was no impact on performance when the driver read the device register frequently, maybe because the register read was in a separate thread and not in the IO path. Still, we did not want to read the register very frequently, and when the device is hot removed during IO, it should not take long to be removed from device manager. Hence the 1 second delay was a trade-off between "not reading the device register frequently" and "removing the device from device manager as soon as possible".
Thanks,
Suman
------- Original Message -------
Sender : Alex Chang <Alex.Chang at pmcs.com>
Date : Nov 05, 2014 01:41 (GMT+05:00)
Title : RE: Samsung patch for Hot plug fixes
Hi Suman,
I have a couple of questions for you:
1. In the NVMeInitCallback function, why don't we need to check SC and SCT anymore for the NVMeWaitOnLearnMapping case?
2. The resolution of the surprise removal timer is set to 1 second. What are the reasons for setting it to 1 second rather than another value?
Thank you!
Alex
From: SUMAN PRAKASH B [mailto:suman.p at samsung.com]
Sent: Tuesday, November 04, 2014 2:50 AM
To: nvmewin at lists.openfabrics.org; Alex Chang; judy.brock at ssi.samsung.com
Subject: Samsung patch for Hot plug fixes
Hi Everyone,
We have a patch for the Hot plug fixes.
Please find attached the source code. The password is samsung123
Please find the change description below -
1) Surprise removal while IOs are in progress.
To reproduce this scenario -
Connect the disk and run IOmeter on the disk volume. While IOs are in progress, surprise remove the device. The user expects the device to be removed from device manager immediately and IOmeter to increase its error count field. This does not happen, since the OFA driver does not handle this scenario.
Resolution -
a. Added a new function IsDeviceRemoved(). This is a recursive function. It compares the current value of the Version register with the old value and, in case of a mismatch, completes the outstanding commands with SRB_STATUS_ERROR. (nvmeStd.c/IsDeviceRemoved)
b. Start the timer for IsDeviceRemoved() when the NextDriverState is set to StartComplete. (nvmeStat.c/NVMeRunning)
c. Stop the timer for IsDeviceRemoved() in case of ScsiStopAdapter. (nvmeStat.c/NVMeAdapterControl)
d. Restart the timer for IsDeviceRemoved() in case of ScsiRestartAdapter. (nvmeStat.c/NVMeAdapterControl)
e. Stop the timer for IsDeviceRemoved() in case of SRB_FUNCTION_SHUTDOWN. (nvmeStd.c/NVMeBuildIo)
f. If the DeviceRemovedDuringIO flag is set to TRUE, complete the SRBs with SRB_STATUS_ERROR for the IOs. This handles IOs received after the device has been surprise removed. (nvmeStd.c/NVMeBuildIo)
g. Modified the prototype of the NVMeDetectPendingCmds function. When the device is surprise removed while IOs are pending, the outstanding IOs have to be completed with SRB_STATUS_ERROR. (nvmeIo.c/NVMeDetectPendingCmds)
h. Call the NVMeDetectPendingCmds function with SRB_STATUS_BUS_RESET. (nvmeInit.c/NVMeNormalShutdown, nvmePwrMgmt.c/NVMeAdapterControlPowerDown, nvmeStd.c/RecoveryDpcRoutine)
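The removal check in steps a through f can be pictured roughly as follows. This is only a minimal user-space sketch, not the patch code: the structure, field names, and status enum are hypothetical stand-ins, the Version register is simulated through a plain pointer, and the timer itself is reduced to a function that reports whether it should be re-armed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for SRB completion states; not Storport values. */
enum srb_status_sketch { SRB_PENDING_SKETCH, SRB_ERROR_SKETCH };

/* Hypothetical, much reduced device extension. */
struct dev_ext_sketch {
    volatile const uint32_t *vs_reg;      /* mapped Version register (simulated) */
    uint32_t saved_vs;                    /* value read when the adapter was found */
    bool removed_during_io;               /* plays the role of DeviceRemovedDuringIO */
    enum srb_status_sketch pending[4];    /* outstanding commands, fixed size for the sketch */
    size_t pending_count;
};

/* One timer tick. Returns true if the timer should be re-armed. */
static bool check_device_removed(struct dev_ext_sketch *dev)
{
    assert(dev != NULL && dev->vs_reg != NULL);

    if (*dev->vs_reg == dev->saved_vs)
        return true;                      /* device still answers: keep polling */

    /* Mismatch: treat as surprise removal and fail everything outstanding. */
    dev->removed_during_io = true;
    for (size_t i = 0; i < dev->pending_count; i++)
        dev->pending[i] = SRB_ERROR_SKETCH;
    dev->pending_count = 0;
    return false;                         /* no point polling a removed device */
}

/* Gate for newly arriving IO: reject once removal has been detected. */
static bool accept_new_io(const struct dev_ext_sketch *dev)
{
    assert(dev != NULL);
    return !dev->removed_during_io;
}
```

In the driver the tick would be scheduled by the timer described in steps b through e (at the 1 second interval discussed in this thread), and the rejection in accept_new_io corresponds to step f.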
2) Memory leak issues.
To reproduce this scenario -
a. Memory leak observed during hot removal in Resource monitor->Non-paged pool. (On Server2012R2 -> Task Manager -> Performance -> Non-paged pool)
b. Memory leak observed during disable/enable the NVMe controller in device manager.
Resolution -
To fix memory leak, in NVMeBuildIo()->SRB_FUNCTION_PNP, when PnPAction is StorRemoveDevice(disable controller) and StorSurpriseRemoval(hot remove device), NVMeAdapterControlPowerDown() is invoked to stop the adapter and then NVMeFreeBuffers is invoked to free the memory. At this point, since the ShutdownInProgress is set in NVMeAdapterControlPowerDown(), nothing is done during NVMeAdapterControl() - ScsiStopAdapter -> NVMeAdapterControlPowerDown().
3) Surprise Removal during Disk Initialization
To reproduce this scenario -
Hot insert the device and hot remove the device immediately. At this point, our driver might be executing the initialization state machine in NVMePassiveInitialize. The device will not be immediately removed from the device manager. The while loop will be active until passiveTimeout happens, and then the system will BSOD.
Resolution -
a. Read the Version register. This is used to compare against the value in version register after a surprise removal. (nvmeStd.c/NVMeFindAdapter)
b. Read the Version Register and compare with old Version Register value(i.e. value read in NVMeFindAdapter). Mismatch in these values means surprise removal. (nvmeStd.c/NVMePassiveInitialize)
c. Set the NextDriverState to NVMeStartComplete and DeviceRemovedDuringIO to TRUE and return TRUE from NVMePassiveInitialize.
d. Driver may get commands in NVMeBuildIo, where driver returns SRB_STATUS_ERROR when DeviceRemovedDuringIO is set to TRUE.
e. Then NVMeAdapterControl() - ScsiStopAdapter is executed.
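As a rough illustration of steps a through c, the initialization loop can bail out as soon as the Version register no longer matches the saved value, instead of spinning until the passive timeout. The sketch below is a simplified model under stated assumptions: the state names, the structure, and the steps_remaining counter (standing in for the real state machine work) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver states for the sketch. */
enum init_state_sketch {
    INIT_RUNNING_SKETCH,
    INIT_START_COMPLETE_SKETCH,
    INIT_FAILED_SKETCH
};

struct init_ctx_sketch {
    volatile const uint32_t *vs_reg;  /* mapped Version register (simulated) */
    uint32_t saved_vs;                /* value read in the FindAdapter step */
    unsigned steps_remaining;         /* stand-in for remaining init work */
    enum init_state_sketch state;
    bool removed_during_io;           /* plays the role of DeviceRemovedDuringIO */
};

/* Polls until init finishes, the device disappears, or max_polls is reached. */
static bool passive_initialize_sketch(struct init_ctx_sketch *ctx, unsigned max_polls)
{
    assert(ctx != NULL && ctx->vs_reg != NULL);

    for (unsigned poll = 0; poll < max_polls; poll++) {
        if (*ctx->vs_reg != ctx->saved_vs) {
            /* Surprise removal: mark it and report success so that the
             * normal stop-adapter path can run afterwards. */
            ctx->removed_during_io = true;
            ctx->state = INIT_START_COMPLETE_SKETCH;
            return true;
        }
        if (ctx->steps_remaining == 0) {
            ctx->state = INIT_START_COMPLETE_SKETCH;
            return true;              /* initialization finished normally */
        }
        ctx->steps_remaining--;       /* one unit of simulated init work */
    }

    ctx->state = INIT_FAILED_SKETCH;  /* timed out */
    return false;
}
```

Returning TRUE on removal, as the description states, lets later commands be failed in the BuildIo path (step d) rather than leaving the loop to time out.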
4) Delay in removing the device from device manager after hot removal of device.
When the device is hot removed, NVMeAdapterControlPowerDown() -> NVMeResetAdapter() -> NVMeWaitForCtrlRDY() is invoked, which sets the EN bit to 0 and waits for the RDY bit to become 0. Since the device is physically removed, the memory mapped registers will become all 1's and the RDY bit will never become 0. Hence the while loop in NVMeWaitForCtrlRDY() stays active for some time even after device removal, and the device is not removed from device manager immediately.
Resolution -
Check the value of CSTS. If it is 0xFFFFFFFF, then the device has been surprise removed, so return TRUE. (nvmeStd.c/NVMeWaitForCtrlRDY)
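A minimal sketch of such a wait loop is shown below, assuming a simulated CSTS register and a poll budget in place of the driver's real stall and timeout logic. The function name, macro names, and the polls_used out-parameter are hypothetical; the only facts relied on are that RDY is bit 0 of CSTS and that reads from a removed PCIe device return all 1's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CSTS_RDY_SKETCH      0x1u         /* RDY is bit 0 of CSTS */
#define REG_ALL_ONES_SKETCH  0xFFFFFFFFu  /* what a removed device reads back as */

/* Waits for CSTS.RDY to equal expected_rdy (0 or 1).
 * Returns true on success or when the device is gone, false on timeout.
 * polls_used reports how many register reads were needed. */
static bool wait_for_ctrl_rdy_sketch(volatile const uint32_t *csts,
                                     uint32_t expected_rdy,
                                     unsigned max_polls,
                                     unsigned *polls_used)
{
    assert(csts != NULL && polls_used != NULL);

    for (*polls_used = 1; *polls_used <= max_polls; (*polls_used)++) {
        uint32_t value = *csts;

        if (value == REG_ALL_ONES_SKETCH)
            return true;   /* surprise removed: stop waiting right away */

        if ((value & CSTS_RDY_SKETCH) == expected_rdy)
            return true;   /* controller reached the requested state */

        /* A real driver would stall for a short interval here. */
    }

    *polls_used = max_polls;
    return false;          /* RDY never changed within the budget */
}
```

Without the all 1's check, a removed device would read back RDY = 1 on every poll, so a wait for RDY = 0 would always run to the full timeout, which is the delay described above.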
5) Avoid redundant call of NVMeResetAdapter()
a. File/Function: nvmeInit.c/NVMeEnableAdapter - Removed the NVMeResetAdapter() function call from NVMeEnableAdapter() as it is redundant. NVMeResetAdapter() is invoked in RecoveryDpcRoutine() and then invoked again in NVMeEnableAdapter().
b. In the NVMeInitialize() function, the EN and RDY bits are set to 0 before NVMeEnableAdapter() is invoked, but NVMeResetAdapter() then performs the same operation again.
6) When testing hot insertion with different devices, we observed some devices returned NAMESPACE_NOT_READY for IO commands during learning cores and disk initialization(report luns, inquiry, etc). To address this issue and provide support for these devices in the driver, we have done the following changes.
a. During learning cores, the driver sends read commands on all the queues to get the core to MSI-x mapping. When the read commands complete and interrupt, in NVMeInitCallback(), if the SC and SCT values are not 0, then learning cores is not completed. This check is not required, as the driver wants only the core to MSI-x mapping. Since this is not a fatal error, we can skip checking the SC and SCT values; otherwise learning is left incomplete, which will impact performance. (nvmeInit.c/NVMeInitCallback).
b. Following the above, when the initialization state machine is complete and kernel starts sending SCSI commands for disk initialization, and when device returns NAMESPACE_NOT_READY, this has to be translated to the corresponding SCSI sense data so that the commands will be re-tried after some time. (nvmeSnti.c/genericCommandStatusTable[]).
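The two changes in item 6 can be sketched together as below. This is an illustrative model rather than the patch itself: all type and function names are hypothetical, and the table entry for Namespace Not Ready (generic status code 0x82 mapped to CHECK CONDITION, sense key NOT READY, ASC 0x04 LOGICAL UNIT NOT READY) is an assumption based on the usual NVMe to SCSI translation, not a copy of the entry added to genericCommandStatusTable[].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CORES_SKETCH 8

/* Hypothetical learning table: MSI-x vector seen for each core, -1 if unknown. */
struct learn_map_sketch {
    int vector_for_core[MAX_CORES_SKETCH];
    unsigned learned;
};

static void learn_map_init(struct learn_map_sketch *m)
{
    assert(m != NULL);
    for (size_t i = 0; i < MAX_CORES_SKETCH; i++)
        m->vector_for_core[i] = -1;
    m->learned = 0;
}

/* Completion of a learning read. The status (SCT/SC) is deliberately not
 * checked: only the core to vector relationship matters here. */
static void learning_completion(struct learn_map_sketch *m, unsigned core,
                                int vector, uint8_t sct, uint8_t sc)
{
    assert(m != NULL && core < MAX_CORES_SKETCH);
    (void)sct;
    (void)sc;
    if (m->vector_for_core[core] < 0)
        m->learned++;
    m->vector_for_core[core] = vector;
}

/* Hypothetical SCSI result: status byte plus sense key, ASC, ASCQ. */
struct scsi_result_sketch {
    uint8_t scsi_status, sense_key, asc, ascq;
};

/* Translates an NVMe generic status code. Returns false for codes this
 * sketch does not cover, rather than inventing a default mapping. */
static bool translate_generic_status(uint8_t sc, struct scsi_result_sketch *out)
{
    assert(out != NULL);
    switch (sc) {
    case 0x00: /* Successful Completion -> GOOD, NO SENSE */
        *out = (struct scsi_result_sketch){ 0x00, 0x00, 0x00, 0x00 };
        return true;
    case 0x82: /* Namespace Not Ready -> CHECK CONDITION, NOT READY (assumed) */
        *out = (struct scsi_result_sketch){ 0x02, 0x02, 0x04, 0x00 };
        return true;
    default:
        return false;
    }
}
```

With a NOT READY sense key reported, the upper layers are expected to retry the command later, which matches the intent described in item b.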
Tested the following.
- WHCK on Win7 and 2012R2
- Install/Uninstall, Enable/Disable, FS Format
- Hibernation/Resume, Sleep/Resume
- IOmeter
- Hot removal while IOmeter is running.
- Hot removal immediately after hot insertion.
- Continuous hot insert and remove operations.
- Check for device removal after following sequence - Hot insert, system hibernation, Hot remove, system resume.
- Check for device presence after following sequence - System hibernation, hot insert, system resume.
- Memory leaks during hot plug operations and disable/enable.
Thanks,
Suman