[nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
Freyensee, James P
james.p.freyensee at intel.com
Thu Jul 18 16:43:28 PDT 2013
Out of curiosity, what was the original reason to have the ISR path in the first place? If it is currently in the driver code, there must have been some purpose in being able to compile it to complete I/Os in either the ISR or a DPC.
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Murray, Kris R
Sent: Thursday, July 18, 2013 4:29 PM
To: Po-Yen Chang; Judy Brock-SSI; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
Judy,
I have no problems removing it.
~Kris
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Po-Yen Chang
Sent: Thursday, July 18, 2013 4:27 PM
To: Judy Brock-SSI; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
Judy,
I feel the same way as well. Let's wait for the response from LSI and Intel on this. If they all agree, I will go ahead and remove it.
Thanks,
Alex
________________________________
From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Judy Brock-SSI
Sent: Thursday, July 18, 2013 4:17 PM
To: Judy Brock-SSI; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] COMPLETE_IN_DPC flag & relationship to NvmeStartio path critical section handling not protected from NVMe ISR?
So it looks like the reason this problem was not seen before is that it only surfaces when the COMPLETE_IN_DPC compile flag is not set. In other words, the COMPLETE_IN_ISR path is broken because it accesses our HwDeviceExtension without being synchronized with the other paths in the driver that do the same.
We can either fix the path that does completions in the ISR or get rid of that option entirely. Since it's generally considered bad practice to do that kind of work in an ISR (an ISR is supposed to be as lean and mean as possible), would the team be averse to getting rid of the logic that optionally allows completions to be handled by the ISR?
If we insist on retaining it, a) we should come up with a good reason why and b) we should fix it asap because it is definitely not safe to use in its present form.
Personally I vote for removal - we wouldn't need the COMPLETE_IN_DPC flag anymore either if we go that route.
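For anyone who hasn't looked at this code path recently, here is a minimal sketch of the compile-time choice in question. This is illustrative only - CompletionDpc and ProcessCompletions are stand-in names, not the driver's actual identifiers:

BOOLEAN NVMeIsrMsix(PVOID pIsrContext, ULONG MsgId)
{
    PNVME_DEVICE_EXTENSION pAE = (PNVME_DEVICE_EXTENSION)pIsrContext;

#ifdef COMPLETE_IN_DPC
    /* Lean ISR: just queue a DPC; the completion queue gets walked
     * later at DISPATCH_LEVEL, where StorPort locks can be used. */
    StorPortIssueDpc(pAE, &pAE->CompletionDpc, (PVOID)(ULONG_PTR)MsgId, NULL);
#else
    /* COMPLETE_IN_ISR: walk the completion queue right here at DIRQL.
     * This is the path that touches our shared lists unsynchronized. */
    ProcessCompletions(pAE, MsgId);
#endif

    return TRUE;
}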
Thanks,
Judy
From: Judy Brock-SSI
Sent: Thursday, July 18, 2013 6:22 AM
To: Judy Brock-SSI; 'nvmewin at lists.openfabrics.org'
Subject: RE: NvmeStartio path critical section handling not protected from NVMe ISR?
I just thought of another way to handle this problem.
Could we not call StorPortSynchronizeAccess() with a pointer back to our ProcessIo() routine? ProcessIo() would then get called before StorPortSynchronizeAccess() returns, which would have the effect of guaranteeing synchronization with our ISR.
This seems like a much cleaner solution than a lock-acquiring approach.
I still don't know if there are any issues with ProcessIo being called multiple times, from non-StartIo code paths, etc. - that would still need to be looked at.
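As a rough sketch of what I have in mind (untested; SYNCED_IO_CTX and ProcessIoSynced are names I just made up, and the ProcessIo parameter list is simplified here):

typedef struct _SYNCED_IO_CTX {
    PNVME_SRB_EXTENSION pSrbExt;
    BOOLEAN             Result;
} SYNCED_IO_CTX, *PSYNCED_IO_CTX;

/* HW_SYNCHRONIZED_ACCESS callback: StorPort runs this synchronized with
 * our ISR, so ProcessIo cannot race the interrupt handler. */
BOOLEAN ProcessIoSynced(PVOID pHwDeviceExtension, PVOID pContext)
{
    PSYNCED_IO_CTX pCtx = (PSYNCED_IO_CTX)pContext;

    pCtx->Result = ProcessIo((PNVME_DEVICE_EXTENSION)pHwDeviceExtension,
                             pCtx->pSrbExt);
    return pCtx->Result;
}

Then in NVMeStartIo, instead of calling ProcessIo() directly:

SYNCED_IO_CTX Ctx = { pSrbExt, FALSE };
StorPortSynchronizeAccess(pAE, ProcessIoSynced, &Ctx);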
Thanks,
Judy
From: Judy Brock-SSI
Sent: Wednesday, July 17, 2013 10:08 PM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: NvmeStartio path critical section handling not protected from NVMe ISR?
All,
Under Windows Server 2012, I've seen a crash where NVMeStartIo() gets interrupted by our ISR while it is in the middle of manipulating a critical linked-list data structure, which the ISR then goes on to manipulate as well. Below is the call stack - see where I've inserted the comment "<---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK"
2: kd> kc
Call Site
nt!RtlpBreakWithStatusInstruction
nt!KiBugCheckDebugBreak
nt!KeBugCheck2
nt!KeBugCheckEx
nt!KiBugCheckDispatch
nt!KiFastFailDispatch
nt!KiRaiseSecurityCheckFailure
nvme!RtlFailFast
nvme!FatalListEntryError
nvme!RtlpCheckListEntry
nvme!InsertTailList
nvme!NVMeCompleteCmd
nvme!NVMeIsrMsix
nt!KiInterruptDispatch <---STARTIO PATH GETS CLOBBERED BY OUR INTERRUPT HANDLER BECAUSE WE AREN'T HOLDING THE INTERRUPT SPIN LOCK
nvme!RemoveHeadList
nvme!NVMeGetCmdEntry
nvme!ProcessIo
nvme!NVMeStartIo
storport!RaidpAdapterContinueScatterGather
hal!HalpAllocateAdapterCallbackV2
hal!IoFreeAdapterChannelV2
hal!HalAllocateAdapterChannelV2
hal!HalBuildScatterGatherListV2
storport!RaUnitStartIo
storport!RaidUnitCompleteRequest
storport!RaidpAdapterRedirectDpcRoutine
nt!KiExecuteAllDpcs
nt!KiRetireDpcList
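To make the race concrete, here is the pair of unsynchronized list operations visible in that stack, in sketch form (FreeQList and the variable names are stand-ins, not necessarily the actual fields):

/* StartIo path - NVMeGetCmdEntry pulling a free command entry: */
pListEntry = RemoveHeadList(&pQueue->FreeQList);   /* multi-step pointer update */

/* ISR path - NVMeCompleteCmd returning an entry, preempting mid-update: */
InsertTailList(&pQueue->FreeQList, &pCmdEntry->ListEntry);

/* If the interrupt lands between the individual pointer writes inside
 * RemoveHeadList, InsertTailList sees a half-updated list and
 * RtlpCheckListEntry fail-fasts - that is the FatalListEntryError above. */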
I looked through the code and noticed we never call StorPortAcquireSpinLock to acquire the InterruptLock to protect us from such preemption. Another way to achieve this would be to indicate that we run half-duplex rather than full-duplex, but that would degrade the overall performance of the driver.
I'm not sure why we didn't run into this long before now - is there some other re-entrancy protection mechanism besides the two above that others are aware of? If not, I believe we need to fix this asap. Suggestions:
A. The simplest approach is to lock down all of NVMeStartIo as shown below (not tested yet), but we may almost as well run half-duplex if we do this:
1. At the very top of NVMeStartIo:
/* We should never be holding the interrupt lock upon entry to NVMeStartIo.
 * Acquire the interrupt spin lock to protect against getting hit by our ISR.
 * (bInterruptLockHeld is a BOOLEAN added to the device extension next to
 * the STOR_LOCK_HANDLE hInterruptLock, since a STOR_LOCK_HANDLE itself
 * can't be compared against NULL.)
 */
if (!pAdapterExtension->bInterruptLockHeld) {
    StorPortAcquireSpinLock(pAdapterExtension,
                            InterruptLock,
                            NULL,
                            &pAdapterExtension->hInterruptLock);
    pAdapterExtension->bInterruptLockHeld = TRUE;
} else {
    ASSERT(FALSE);
}
2. At the very top of IO_StorPortNotification:

PNVME_DEVICE_EXTENSION pAE = (PNVME_DEVICE_EXTENSION)pHwDeviceExtension;

/* If we got here from NVMeStartIo, we need to release the interrupt lock.
 * Clear the flag before releasing so a nested call can't double-release. */
if (pAE->bInterruptLockHeld) {
    pAE->bInterruptLockHeld = FALSE;
    StorPortReleaseSpinLock(pAE, &pAE->hInterruptLock);
}
3. At the very bottom of NVMeStartIo:
/* If we didn't release the interrupt lock in one of the calls to
 * IO_StorPortNotification above, we need to release it before we exit
 * NVMeStartIo.
 */
if (pAE->bInterruptLockHeld) {
    pAE->bInterruptLockHeld = FALSE;
    StorPortReleaseSpinLock(pAE, &pAE->hInterruptLock);
}
return TRUE;
} /* NVMeStartIo */
B. A better approach is to lock just ProcessIo(). But code already exists in that routine which acquires the StartIo lock, and we can't take locks out of order or we'll cause a deadlock. Right now that code never gets invoked - what was it for? Do we still need it? Can ProcessIo() get called from non-StartIo paths? Can it get called multiple times? Not having been involved in the initial development of this driver, I would need to study the flow to make sure the StorPort lock acquiring/releasing hierarchy rules are respected at all times. If those conversant with the overall development history and architecture of this driver could share their thoughts, that would be great.
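For reference, here is the ordering concern in sketch form. If I'm reading the StorPort documentation right, the lock hierarchy is DpcLock, then StartIoLock, then InterruptLock, acquired in that order. So if ProcessIo internally does something like this (hStartIoLock is illustrative):

STOR_LOCK_HANDLE hStartIoLock;

StorPortAcquireSpinLock(pAE, StartIoLock, NULL, &hStartIoLock);
/* ... queue the command ... */
StorPortReleaseSpinLock(pAE, &hStartIoLock);

then wrapping the ProcessIo call in InterruptLock would mean taking StartIoLock while already holding InterruptLock - backwards with respect to the documented order, and a classic recipe for deadlock.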
Thanks,
Judy