[nvmewin] Patch for handling surprise removal in IOCTL path

Robles, Raymond C raymond.c.robles at intel.com
Tue May 31 14:23:06 PDT 2016


Hi Suman… Intel has no comments/concerns. Everything looks good. I’ll push the patch in the next day.
I believe Samsung had one more patch for concurrent channels, correct? Was there anything after that? I’d like to start planning for the mid-year release.
Thank you… Ray
From: SUMAN PRAKASH B [mailto:suman.p at samsung.com]
Sent: Monday, May 30, 2016 8:00 AM
To: Thomas Freeman <thomas.freeman at hgst.com>; nvmewin at lists.openfabrics.org; Robles, Raymond C <raymond.c.robles at intel.com>
Subject: Re: RE: [nvmewin] Patch for handling surprise removal in IOCTL path




Thanks Tom.



To other reviewers: Please let me know if there is any comments/concerns.



- Suman



------- Original Message -------

Sender : Thomas Freeman<thomas.freeman at hgst.com<mailto:thomas.freeman at hgst.com>>

Date : May 27, 2016 02:45 (GMT+05:30)

Title : RE: [nvmewin] Patch for handling surprise removal in IOCTL path


Suman,
I’ve reviewed changes and have no comments/concerns.
Thank you,
Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital Corporation
e.  Thomas.freeman at hgst.com<mailto:Thomas.freeman at hgst.com>
o.  +1-507-322-2311


From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of SUMAN PRAKASH B
Sent: Monday, May 16, 2016 8:21 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>; raymond.c.robles at intel.com<mailto:raymond.c.robles at intel.com>
Subject: [nvmewin] Patch for handling surprise removal in IOCTL path


Hi All,



This patch includes changes for supporting device surprise removal in IOCTL path. Samsung already submitted a patch to handle surprise removal during normal I/O in 2014. This patch is an extention to the previous patch.

I have made a detailed overview of the changes in the attached doc file(the contents are also copied here below) and the attached zip file contains the source code.

Password is samsungnvme

Please let me know if you have any questions.

Thanks,
Suman

***************************************



Handling device Surprise removal in IOCTL path:

    Current OFA driver does not need any change to support surprise removal when no I/Os are outstanding. But when I/Os are outstanding and the device is surprise removed, the tool(lets say IOMeter) hangs and the device is not removed from the device manager for very long time. Since NVMe is suprise remove capable device, user expectation is that the tool should exit/stop gracefully and the device should be removed from the device manage immediately.

    The Windows storage stack currently does not handle surprise removal during outstanding I/Os. To handle this scenario, we have submitted a patch in 2014. But this patch was handling surprise removal during normal I/O commands and not when IOCTL commands from proprietary tools were outstanding.
    In the previous patch, we had implemented the StorPortNotification(RequestTimerCall,…) to handle surprise removal durign I/O. This API does not detect surprise removal when executing IOCTL commands. For example, when executing Format NVM command with Secure erase or Cryptographic erase, when device is surprise removed, the device is not removed from the device manager and system hangs.

    To fix this issue, in Windows 8 and later versions, there is support for StorPortRequestTimer() which does the same as StorPortNotification(RequestTimerCall,…). When StorPortRequestTimer() is used, device surprise removal during IOCTL commands is also detected. Hence for Win 7 kernel, StorPortNotification(RequestTimerCall,…) and for Win 8 and above kernels, StorPortRequestTimer() can be used.



Code changes:

1. In NVMeFindAdapter(), initialize the timer routine using StorPortInitializeTimer() API.



2. In NVMeRunning(), in NVMeStartComplete, start the surprise removal timer routine.



3. Stop the timer routine using StorPortRequestTimer() in following functions:
 a. In NVMeAdapterControl(), in ScsiStopAdapter, when shutting down the device
 b. In NVMeBuildIo(), for SRB_FUNCTION_SHUTDOWN
 c. In NVMeBuildIo(), for SRB_FUNCTION_PNP, if PnPAction is StorRemoveDevice or StorSurpriseRemoval and ntldrDump is FALSE



4. Free the timer routine using StorPortFreeTimer() in following functions:
 a. In NVMePassiveInitialize(), when the initialization state machine fails
 b. In NVMeAdapterControl(), in ScsiStopAdapter, when shutting down the device
 c. In NVMeBuildIo(), for SRB_FUNCTION_SHUTDOWN,
 d. In NVMeBuildIo(), for SRB_FUNCTION_PNP, if PnPAction is StorRemoveDevice or StorSurpriseRemoval and ntldrDump is FALSE
 e. In NVMeStartIo(), when NextDriverState is not NVMeStartComplete
 f. In RecoveryDpcRoutine()



5. New Surprise removal timer routine IsDeviceRemoved() for kernels above Windows 8 which uses StorPortFreeTimer() and StorPortRequestTimer().



6. In FormatNVMGetIdentify(), when the structure to retrieve is Identify namespace, the lunId is set to INVALID_LUN_EXTN, because of which BSOD is observed. Invoked the function NVMeIsNamespaceVisible() to correct this. We observed this when secure erase in IOCTL is in progress, and device is surprise removed.



7. In NVMeBuildIo(), to block read/write commands when format nvm is in progress, we have a check. But when formatNVM is in progress, we get requests other than SRB_FUNCTION_EXECUTE_SCSI, like SRB_FUNCTION_PNP or SRB_FUNCTION_IO_CONTROL, in which case opCode = GET_OPCODE(Srb) in line number 1125 results in BSOD. This is observed when secure erase is executed in IOCTL path, and system is hibernated and then device is surprise removed. Hence moved the blocking of Read/Write commands while Format NVM is in progress code inside SRB_FUNCTION_EXECUTE_SCSI.



8. Changes in inf file:
    a. Following line has been included to allow interrupts on processors beyond group 0. For the StorPortInitializePerfOpts() MessageTargets to work correctly when a system has multiple groups are, the following line is required, otherwise we observe BSOD. Also, when we tested in a system which has 1 group by default and 32 logical processors in Server 2012R2, some WHQL tests create multiple groups after which the driver does not work properly if the below line is not included in inf file.
         HKR, Interrupt Management\Affinity Policy, GroupPolicy, %REG_DWORD%, 1

    b. Following line has been included to specifies that the device's interrupts are of high priority.
         HKR, Interrupt Management\Affinity Policy, DevicePriority, %REG_DWORD%, 3



9. As requested by HGST, the MAX_NAMESPACES is changed to 128. Correspondingly, the DUMP_BUFFER_SIZE is also changed from (5*64*1024) to ((5*64*1024) + (sizeof(NVME_LUN_EXTENSION)*MAX_NAMESPACES)). If DUMP_BUFFER_SIZE is not increased, in dump mode, in NVMeInitialize(), memory allocation for pAE->pLunExtensionTable will fail.

Also, the correct method to implement the above would be to allocate memory for pAE->pLunExtensionTable in NVMeInitCallback() in the NVMeWaitOnIdentifyCtrl state, where we get the number of Namespaces supported by the device, and remove the memory allocated for pAE->pLunExtensionTable in dump mode(NVMeInitialize()) and in normal mode(NVMePassiveInitialize()).

For this patch, we have set DUMP_BUFFER_SIZE to ((5*64*1024) + (sizeof(NVME_LUN_EXTENSION) * MAX_NAMESPACES)). Please note that as per MSDN, we should not allocate more than 32 KB for RequestedDumpBufferSize.



10. Implemented ScsiAdapterPower controltype in NVMeAdapterControl(). ScsiAdapterPower executes in <= DISPATCH_IRQL. This is supported only from Windows 8. Since we need to free the buffer and stop the timer only at IRQL <= DISPATCH_IRQL, this is implemented. Also, this will avoid executing initialization state machine in NVMeReInitializeController in ScsiRestartAdapter(DIRQL) during resume from hibernation.



We have tested following:

a. WHQL tests

b. Connect device -> execute secure erase in IOCTL path -> during secure erase execution remove device -> expectation: disk should be removed and tool(executing the secure erase) should exit gracefully

c. Connect device -> hibernation -> remove device -> resume -> expectation: disk should be removed

d. Connect device -> run IOMeter -> execute secure erase in IOCTL path -> hibernation -> remove device -> resume -> expectation: disk should be removed and tools should exit gracefully

c. OS installation




[http://ext.samsung.net/mailcheck/SeenTimeChecker?do=e7bc4897eaf7ac7a189c3eecb549349dcf4c3e2988f12776313cb48408a29da1bbef98290b9ea6844a835b19b8ec809fc7b41e955949e5c8a728c55b39cc59eacf878f9a26ce15a0]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20160531/29da19e4/attachment.html>


More information about the nvmewin mailing list