[nvmewin] Patch for handling surprise removal in IOCTL path

Thomas Freeman thomas.freeman at hgst.com
Thu May 26 14:15:32 PDT 2016


Suman,
I've reviewed changes and have no comments/concerns.
Thank you,
Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital Corporation
e.  Thomas.freeman at hgst.com
o.  +1-507-322-2311


From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of SUMAN PRAKASH B
Sent: Monday, May 16, 2016 8:21 AM
To: nvmewin at lists.openfabrics.org; raymond.c.robles at intel.com
Subject: [nvmewin] Patch for handling surprise removal in IOCTL path


Hi All,



This patch includes changes to support device surprise removal in the IOCTL path. Samsung already submitted a patch to handle surprise removal during normal I/O in 2014. This patch is an extension of that previous patch.

I have written a detailed overview of the changes in the attached doc file (the contents are also copied below), and the attached zip file contains the source code.

Password is samsungnvme

Please let me know if you have any questions.

Thanks,
Suman

***************************************



Handling device Surprise removal in IOCTL path:

    The current OFA driver needs no change to support surprise removal when no I/Os are outstanding. But when I/Os are outstanding and the device is surprise removed, the tool (for example, IOMeter) hangs and the device stays listed in Device Manager for a very long time. Since NVMe is a surprise-removal-capable device, the user expectation is that the tool should exit/stop gracefully and the device should be removed from Device Manager immediately.

    The Windows storage stack currently does not handle surprise removal while I/Os are outstanding. To handle this scenario, we submitted a patch in 2014. But that patch handled surprise removal only during normal I/O commands, not when IOCTL commands from proprietary tools were outstanding.
    In the previous patch, we used StorPortNotification(RequestTimerCall,...) to handle surprise removal during I/O. This API does not detect surprise removal while IOCTL commands are executing. For example, if the device is surprise removed while a Format NVM command with Secure Erase or Cryptographic Erase is executing, the device is not removed from Device Manager and the system hangs.

    To fix this issue, we rely on StorPortRequestTimer(), which is supported in Windows 8 and later versions and serves the same purpose as StorPortNotification(RequestTimerCall,...). When StorPortRequestTimer() is used, device surprise removal during IOCTL commands is also detected. Hence StorPortNotification(RequestTimerCall,...) can be used for the Win 7 kernel, and StorPortRequestTimer() for Win 8 and later kernels.
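
    The version split described above can be illustrated with a minimal C sketch. It assumes a runtime check on the NT version number (Windows 7 is NT 6.1, Windows 8 is NT 6.2). The enum and function names are hypothetical, and the actual driver may well make this selection at compile time instead.

```c
#include <assert.h>

/* Hypothetical labels for the two timer mechanisms named in the text. */
typedef enum {
    TIMER_VIA_NOTIFICATION,   /* StorPortNotification(RequestTimerCall,...) */
    TIMER_VIA_REQUEST_TIMER   /* StorPortRequestTimer() */
} TimerMechanism;

/*
 * Pick a mechanism from the NT version number.
 * Assumption: Windows 7 reports 6.1 and Windows 8 reports 6.2.
 */
static TimerMechanism SelectTimerMechanism(unsigned major, unsigned minor)
{
    /* The sketch only reasons about NT 6.x and later kernels. */
    assert(major >= 6u);

    if (major > 6u || minor >= 2u) {
        return TIMER_VIA_REQUEST_TIMER;   /* Windows 8 and later */
    }
    return TIMER_VIA_NOTIFICATION;        /* Windows 7 and earlier 6.x */
}
```

    With this sketch, SelectTimerMechanism(6, 1) picks the notification-based timer and SelectTimerMechanism(6, 2) picks StorPortRequestTimer().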



Code changes:

1. In NVMeFindAdapter(), initialize the timer routine using StorPortInitializeTimer() API.



2. In NVMeRunning(), in NVMeStartComplete, start the surprise removal timer routine.



3. Stop the timer routine using StorPortRequestTimer() in the following functions:
 a. In NVMeAdapterControl(), in ScsiStopAdapter, when shutting down the device
 b. In NVMeBuildIo(), for SRB_FUNCTION_SHUTDOWN
 c. In NVMeBuildIo(), for SRB_FUNCTION_PNP, if PnPAction is StorRemoveDevice or StorSurpriseRemoval and ntldrDump is FALSE



4. Free the timer routine using StorPortFreeTimer() in the following functions:
 a. In NVMePassiveInitialize(), when the initialization state machine fails
 b. In NVMeAdapterControl(), in ScsiStopAdapter, when shutting down the device
 c. In NVMeBuildIo(), for SRB_FUNCTION_SHUTDOWN
 d. In NVMeBuildIo(), for SRB_FUNCTION_PNP, if PnPAction is StorRemoveDevice or StorSurpriseRemoval and ntldrDump is FALSE
 e. In NVMeStartIo(), when NextDriverState is not NVMeStartComplete
 f. In RecoveryDpcRoutine()



5. Added a new surprise removal timer routine, IsDeviceRemoved(), for Windows 8 and later kernels, which uses StorPortFreeTimer() and StorPortRequestTimer().
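
The timer lifecycle in items 1 to 5 (initialize, start, stop, free, plus a polling callback) can be sketched roughly as below. This is a mock, not the StorPort API: MockTimer, MockAdapter and all function names are hypothetical. It also assumes that removal is detected by a register read returning all ones, which is the usual symptom of a missing PCIe device but is not stated in this mail, and that the callback re-arms itself while the device is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mock timer standing in for a StorPort timer handle (hypothetical). */
typedef struct {
    bool     allocated;   /* set by TimerInit, cleared by TimerFree */
    bool     armed;       /* a callback is pending */
    uint32_t period_us;   /* last requested interval */
} MockTimer;

/* Mock adapter state; field names are hypothetical. */
typedef struct {
    MockTimer timer;
    uint32_t  status_register;  /* value the next register read returns */
    bool      removed;          /* set once removal is detected */
} MockAdapter;

#define REG_ALL_ONES   0xFFFFFFFFu  /* assumed read result once the device is gone */
#define POLL_PERIOD_US 500000u      /* arbitrary interval for this sketch */

static void TimerInit(MockTimer *t)
{
    t->allocated = true;
    t->armed = false;
    t->period_us = 0;
}

/* In this mock, a period of 0 cancels a pending callback; nonzero arms one. */
static void TimerRequest(MockTimer *t, uint32_t period_us)
{
    assert(t->allocated);
    t->period_us = period_us;
    t->armed = (period_us != 0);
}

/* Safe to call from several teardown paths: a second call is a no-op. */
static void TimerFree(MockTimer *t)
{
    if (!t->allocated) {
        return;
    }
    t->armed = false;
    t->allocated = false;
}

/* Polling callback: free the timer on removal, otherwise re-arm it. */
static void IsDeviceRemovedSketch(MockAdapter *a)
{
    a->timer.armed = false;  /* one-shot: this firing consumed the request */
    if (a->status_register == REG_ALL_ONES) {
        a->removed = true;
        TimerFree(&a->timer);
        return;
    }
    TimerRequest(&a->timer, POLL_PERIOD_US);
}
```

Making the free path idempotent is a design choice of the sketch; it reflects that item 4 lists several places where the timer may be freed.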



6. In FormatNVMGetIdentify(), when the structure to retrieve is Identify Namespace, the lunId is set to INVALID_LUN_EXTN, which causes a BSOD. The function NVMeIsNamespaceVisible() is now invoked to correct this. We observed the BSOD when a secure erase in the IOCTL path was in progress and the device was surprise removed.
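
The kind of guard that item 6 describes might look like the following sketch. Everything here is hypothetical: the table layout, the sentinel value, and the signature of SketchIsNamespaceVisible() are assumptions for illustration and do not reproduce the driver's NVMeIsNamespaceVisible(). The point is only that the table is never indexed with the invalid sentinel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SLOTS 4u           /* small table for the sketch */
#define SKETCH_INVALID_LUN 0xFFFFFFFFu  /* stand-in for INVALID_LUN_EXTN */

typedef struct {
    uint32_t namespace_id;  /* 0 means the slot is unused */
    bool     visible;
} SketchLunEntry;

/*
 * Resolve a namespace ID to a table index.
 * Returns true and writes a valid index only when the namespace is visible;
 * otherwise *lun_id is left at the invalid sentinel.
 */
static bool SketchIsNamespaceVisible(const SketchLunEntry *table,
                                     uint32_t nsid, uint32_t *lun_id)
{
    assert(table != NULL && lun_id != NULL);
    *lun_id = SKETCH_INVALID_LUN;
    for (uint32_t i = 0; i < SKETCH_TABLE_SLOTS; i++) {
        if (table[i].namespace_id == nsid && table[i].visible) {
            *lun_id = i;
            return true;
        }
    }
    return false;
}

/* Returns the entry for nsid, or NULL; never indexes with the sentinel. */
static const SketchLunEntry *SketchGetLunForIdentify(const SketchLunEntry *table,
                                                     uint32_t nsid)
{
    uint32_t lun_id;
    if (!SketchIsNamespaceVisible(table, nsid, &lun_id)) {
        return NULL;
    }
    return &table[lun_id];
}
```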



7. In NVMeBuildIo(), we have a check to block read/write commands while Format NVM is in progress. But while Format NVM is in progress, we also get requests other than SRB_FUNCTION_EXECUTE_SCSI, such as SRB_FUNCTION_PNP or SRB_FUNCTION_IO_CONTROL, in which case opCode = GET_OPCODE(Srb) at line number 1125 results in a BSOD. This is observed when secure erase is executed in the IOCTL path, the system is hibernated, and then the device is surprise removed. Hence the code that blocks Read/Write commands while Format NVM is in progress has been moved inside the SRB_FUNCTION_EXECUTE_SCSI handling.
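
A minimal sketch of the reordering in item 7 follows. The types and constants are local stand-ins rather than the WDK definitions, and SketchBuildIo() is a hypothetical simplification of NVMeBuildIo(). It shows the opcode being read only for SCSI requests, so PnP and IOCTL requests never reach the opcode lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for SRB function codes (not the WDK values). */
typedef enum { FN_EXECUTE_SCSI, FN_IO_CONTROL, FN_PNP } SketchSrbFunction;

/* SCSI opcodes: READ(10), WRITE(10), INQUIRY. */
enum { OP_READ = 0x28, OP_WRITE = 0x2A, OP_INQUIRY = 0x12 };

typedef struct {
    SketchSrbFunction function;
    const uint8_t    *cdb;  /* NULL unless function is FN_EXECUTE_SCSI */
} SketchSrb;

typedef enum { BUILD_CONTINUE, BUILD_REJECT_BUSY } BuildResult;

static BuildResult SketchBuildIo(const SketchSrb *srb, bool format_in_progress)
{
    assert(srb != NULL);

    switch (srb->function) {
    case FN_EXECUTE_SCSI:
        /* Only here is a CDB expected, so only here is the opcode read. */
        if (format_in_progress && srb->cdb != NULL) {
            uint8_t op = srb->cdb[0];
            if (op == OP_READ || op == OP_WRITE) {
                return BUILD_REJECT_BUSY;
            }
        }
        return BUILD_CONTINUE;

    case FN_IO_CONTROL:
    case FN_PNP:
    default:
        /* No opcode lookup for non-SCSI requests. */
        return BUILD_CONTINUE;
    }
}
```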



8. Changes in inf file:
    a. The following line has been included to allow interrupts on processors beyond group 0. It is required for the StorPortInitializePerfOpts() MessageTargets to work correctly when a system has multiple processor groups; otherwise we observe a BSOD. Also, when we tested on a Server 2012R2 system that has 1 group by default and 32 logical processors, some WHQL tests created multiple groups, after which the driver did not work properly unless the line below was included in the inf file.
         HKR, Interrupt Management\Affinity Policy, GroupPolicy, %REG_DWORD%, 1

    b. The following line has been included to specify that the device's interrupts are of high priority.
         HKR, Interrupt Management\Affinity Policy, DevicePriority, %REG_DWORD%, 3



9. As requested by HGST, MAX_NAMESPACES is changed to 128. Correspondingly, DUMP_BUFFER_SIZE is changed from (5*64*1024) to ((5*64*1024) + (sizeof(NVME_LUN_EXTENSION)*MAX_NAMESPACES)). If DUMP_BUFFER_SIZE is not increased, the memory allocation for pAE->pLunExtensionTable in NVMeInitialize() will fail in dump mode.

Also, the correct way to implement the above would be to allocate memory for pAE->pLunExtensionTable in NVMeInitCallback() in the NVMeWaitOnIdentifyCtrl state, where we get the number of namespaces supported by the device, and to remove the allocations for pAE->pLunExtensionTable in dump mode (NVMeInitialize()) and in normal mode (NVMePassiveInitialize()).

For this patch, we have set DUMP_BUFFER_SIZE to ((5*64*1024) + (sizeof(NVME_LUN_EXTENSION) * MAX_NAMESPACES)). Please note that as per MSDN, we should not allocate more than 32 KB for RequestedDumpBufferSize.
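
The size arithmetic in item 9 can be worked through with a short sketch. The mail does not give sizeof(NVME_LUN_EXTENSION), so the sketch takes it as a parameter; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NAMESPACES 128u
#define SKETCH_BASE_DUMP_SIZE (5u * 64u * 1024u)  /* 327680 bytes, the old value */

/*
 * New dump buffer size: the old base size plus room for one LUN extension
 * per namespace. lun_extension_size stands in for sizeof(NVME_LUN_EXTENSION).
 */
static size_t SketchDumpBufferSize(size_t lun_extension_size)
{
    /* Guard against overflow of the multiplication and addition. */
    assert(lun_extension_size <=
           ((size_t)-1 - SKETCH_BASE_DUMP_SIZE) / SKETCH_MAX_NAMESPACES);

    return SKETCH_BASE_DUMP_SIZE + lun_extension_size * SKETCH_MAX_NAMESPACES;
}
```

For a purely illustrative extension size of 1024 bytes, the table adds 128 * 1024 = 131072 bytes, for a total of 458752 bytes.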



10. Implemented the ScsiAdapterPower control type in NVMeAdapterControl(). ScsiAdapterPower executes at IRQL <= DISPATCH_LEVEL and is supported only from Windows 8. It is implemented because we need to free the buffer and stop the timer only at IRQL <= DISPATCH_LEVEL. It also avoids executing the initialization state machine in NVMeReInitializeController in ScsiRestartAdapter (DIRQL) during resume from hibernation.



We have tested the following:

a. WHQL tests

b. Connect device -> execute secure erase in IOCTL path -> during secure erase execution remove device -> expectation: disk should be removed and the tool (executing the secure erase) should exit gracefully

c. Connect device -> hibernation -> remove device -> resume -> expectation: disk should be removed

d. Connect device -> run IOMeter -> execute secure erase in IOCTL path -> hibernation -> remove device -> resume -> expectation: disk should be removed and tools should exit gracefully

e. OS installation



