[nvmewin] NVMeResetBus design

Judy Brock-SSI judy.brock at ssi.samsung.com
Thu Jul 18 06:34:11 PDT 2013


In our testing, we create a situation where we put the NVMe driver under heavy I/O load with Iometer and then cause the device to stop responding.  This results in I/O request timeouts which eventually causes the driver to be called at it's HwStorResetBus entry point (NVMeResetBus).  I have some feedback on the current architecture of that routine:


1.      Among other things, NMeResetBus schedules a DPC to complete any pending commands. This creates a situation where upon return from this entry point, there are still cmds outstanding which don't get completed till the DPC runs.  According to the WDK, this doesn't appear to be legal - all outstanding cmds have to be completed by the HwStorResetBus routine before it returns:

HwResetBus

Pointer to the miniport driver's HwStorResetBus<ms-help://MS.WDK.v10.7600.091201/Storage_r/hh/Storage_r/stormini_b3051379-4caa-4502-9492-a21672cfbf0d.xml.htm> routine, which is a required entry point for all miniport drivers. This member has the same meaning for the Storport version of the HW_INITIALIZATION_DATA structure as it does for the SCSI Port version of the structure. For more information, see the HwResetBus member of HW_INITIALIZATION_DATA (SCSI)
and
HwScsiResetBus must complete any outstanding requests by calling ScsiPortCompleteRequest with the SrbStatus value SRB_STATUS_BUS_RESET or, for individual SRBs, ScsiPortNotification with this status value.
and
The port driver pauses all device IO queues for the adapter and then calls the HwStorResetBus routine at IRQL DISPATCH_LEVEL after acquiring the StartIo spin lock. A miniport driver is responsible for completing SRBs received by HwStorStartIo<http://msdn.microsoft.com/en-us/library/windows/hardware/ff557423(v=vs.85).aspx> for PathId during this routine and setting their status to SRB_STATUS_BUS_RESET if necessary

Since HwStorResetBus must finish its work before returning; it can't schedule a DPC to do so later on. The logic which schedules a DPC should be removed.


2.      Code should be added to call StorPortPause() to hold off any new requests till StorPortResume() is called.


3.      Code should be added to call  StorPortSynchronizeAccess() in order to synchronize with HwStorInterrupt. A callback routine in the NVMe driver should also be added for NVMeResetBus to do the synchronized work in. HwStorResetBus is already synchronized with HwStorStartIo since the port driver calls it only after acquiring the StartIo spinlock.



4.      We should implement a driver-internal global (per "adapter") flag signifying we are busy with reset processing and thus can't allow new I/O requests to go through to the hardware.



5.      Code should be added to call StorPortResume() when all work is complete.



6.      We should refer to the WDK-supplied LSI parallel SCSI StorPort miniport sample driver for an example of all of the above.



Thanks,
Judy

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20130718/f7e66dbb/attachment.html>


More information about the nvmewin mailing list