[nvmewin] Samsung Patch for Bus Reset Enhancements
Parag Sheth
parag.sheth at seagate.com
Mon Nov 3 15:53:48 PST 2014
Hi Alex,
The changes look good. I approve the patch.
Thanks
Parag Sheth
On Mon, Nov 3, 2014 at 1:49 PM, Alex Chang <Alex.Chang at pmcs.com> wrote:
> Thanks, Carolyn.
>
>
>
> Alex
>
>
>
> *From:* Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
> *Sent:* Monday, November 03, 2014 1:30 PM
> *To:* Alex Chang; suman.p at samsung.com; Judy Brock-SSI;
> nvmewin at lists.openfabrics.org
> *Subject:* RE: RE: Samsung Patch for Bus Reset Enhancements
>
>
>
> Hi Alex, I approve the patch.
>
>
>
> Carolyn
>
>
>
> *From:* Alex Chang [mailto:Alex.Chang at pmcs.com <Alex.Chang at pmcs.com>]
> *Sent:* Monday, November 03, 2014 2:25 PM
> *To:* suman.p at samsung.com; Judy Brock-SSI; Foster, Carolyn D;
> nvmewin at lists.openfabrics.org
> *Subject:* RE: RE: Samsung Patch for Bus Reset Enhancements
>
>
>
> Hi Carolyn and Parag/Rick,
>
>
>
> Please let me know if you approve the patch.
>
>
>
> Thank you very much,
>
> Alex
>
>
>
> *From:* SUMAN PRAKASH B [mailto:suman.p at samsung.com <suman.p at samsung.com>]
>
> *Sent:* Wednesday, October 29, 2014 5:02 AM
> *To:* Judy Brock-SSI; Foster, Carolyn D; Alex Chang;
> nvmewin at lists.openfabrics.org
> *Subject:* Re: RE: Samsung Patch for Bus Reset Enhancements
>
>
>
> Hi Alex,
>
> I have revised the patch with the following review comments incorporated -
>
> 1. Both polledMode and hwResetInProg are used for exactly the same purpose.
> [Suman] As suggested by Judy, replaced all occurrences of hwResetInProg
> and polledMode with a single variable polledResetInProg.
>
> 2. There is a call of StorPortResume(pAE) in Line2434 of nvmestd.c, which
> is redundant.
> [Suman] Changed code to call the StorPortResume(pAE) in Line2434 of
> nvmestd.c only when pAE->DriverState.NextDriverState != NVMeStartComplete.
>
> 3. To comply with our agreed coding style and make the logic easier, may I
> suggest changing Line#184 of nvmestat.c to:
>     if (pAE->ntldrDump == FALSE) {
>         if (pAE->polledMode == FALSE) {
>             NVMeRunning(pAE);
>         } else {
>             /*
>              * we poll if we're launching the reinit state machine from
>              * HwStorResetBus or HwStorAdapterControl->ScsiRestartAdapter path
>              */
>             NVMeRunning(pAE);
>             /* TO val is based on CAP register plus a few, 5, seconds to init post RDY */
>             passiveTimeout = pAE->uSecCrtlTimeout +
>                              (STORPORT_TIMER_CB_us * MICRO_TO_SEC);
>             ...
>             return (pAE->DriverState.NextDriverState == NVMeStartComplete) ?
>                    TRUE : FALSE;
>         }
>     } else {
>         PRES_MAPPING_TBL pRMT = &pAE->ResMapTbl;
>         .....
>     }
> [Suman] Corrected as per suggestion.
>
> 4. Rename IoCompletionDpcRoutine to avoid confusion, since this routine
> will be called in both DPC and polled mode.
> [Suman] As suggested by Judy, renamed this routine to IoCompletionRoutine
> and added comment in header section that this routine can either be
> scheduled to run as a DPC or called directly.
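
Review item 2 above (resume only when the state machine did not reach NVMeStartComplete) can be illustrated with a minimal C sketch. It is not the patch code: every type and function name below is a hypothetical stand-in, and the real StorPortResume() call is replaced by a counter so the logic can be compiled and checked outside the driver. The sketch assumes, as Alex noted in his review, that the success path already resumes the adapter at the end of NVMeRunning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's state values. */
typedef enum {
    SKETCH_WAIT_ON_RDY,
    SKETCH_START_COMPLETE,   /* stands in for NVMeStartComplete */
    SKETCH_STATE_FAILED
} SketchState;

/* Hypothetical stand-in for the adapter extension (pAE). */
typedef struct {
    SketchState next_state;
    int resume_calls;        /* how many times the mock resume ran */
} SketchAdapter;

/* Mock of StorPortResume(): only records that it was called. */
void sketch_resume(SketchAdapter *ae)
{
    assert(ae != NULL);
    ae->resume_calls++;
}

/* Last step of the state machine on success: mark complete and resume. */
void sketch_state_machine_done(SketchAdapter *ae)
{
    assert(ae != NULL);
    ae->next_state = SKETCH_START_COMPLETE;
    sketch_resume(ae);
}

/*
 * Caller side after the state machine has stopped: resume here only when
 * the success path did not already do so, which avoids a redundant call.
 */
bool sketch_finish_reinit(SketchAdapter *ae)
{
    assert(ae != NULL);
    if (ae->next_state != SKETCH_START_COMPLETE) {
        sketch_resume(ae);
        return false;
    }
    return true;
}
```

With this shape the adapter is resumed exactly once on both the success path and the failure path.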
>
> Please find attached the revised patch. Password is samsung123.
>
> Thanks all for reviewing.
>
> Thanks,
> Suman
>
>
>
>
>
> ------- *Original Message* -------
>
> *Sender* : Judy Brock-SSI<judy.brock at ssi.samsung.com>
>
> *Date* : Oct 29, 2014 05:24 (GMT+05:00)
>
> *Title* : RE: Samsung Patch for Bus Reset Enhancements
>
>
>
> Hi Carolyn,
>
> Thank you for your feedback. It is a good idea to eliminate confusion over
> the currently-named routine – IoCompletionDpcRoutine - which does indeed
> imply running as a DPC.
>
> We could either create a new function, as you suggest, or alternatively
> just rename IoCompletionDpcRoutine directly to, for example,
> “ProcessIoCompletions” or just “IoCompletionRoutine” and then indicate at
> the top of that routine that it can either be scheduled to run as a DPC or
> called directly, depending on context. If we take the word “Dpc” out of it,
> I think it might eliminate all potential confusion.
>
> If it is ok with you, I think renaming IoCompletionDpcRoutine so that it no
> longer implies the method by which it's invoked might be preferable, as this
> would also cover/clarify the case where, in dump mode, NVMeIsrMsix is
> currently calling IoCompletionDpcRoutine directly.
>
> If you still prefer a new routine, to cover both cases above, the name
> should imply the direct call nature of the function rather than polled mode
> per se since, if we call it from the ISR it could lead to the opposite
> confusion – a function that implies no ISR involvement being called from
> the ISR itself. We could:
>
> Create a new function called ImmediateCompletionProcessing (or some other
> name that implies non-deferred processing) which in turn would call
> IoCompletionDpcRoutine.
>
> But again, I think renaming IoCompletionDpcRoutine is better because no
> matter what we name a new function, if it still calls a routine with the
> word “Dpc” in it, it will still have the potential to create confusion in
> (especially new) readers' minds.
>
> Thanks,
>
> Judy
>
> *From:* Foster, Carolyn D [mailto:carolyn.d.foster at intel.com
> <carolyn.d.foster at intel.com>]
> *Sent:* Tuesday, October 28, 2014 4:24 PM
> *To:* Alex Chang; Judy Brock-SSI; suman.p at samsung.com;
> nvmewin at lists.openfabrics.org
> *Subject:* RE: Samsung Patch for Bus Reset Enhancements
>
> Hi Alex, Judy and Suman,
>
> I have completed my testing of the proposed patch and verified that it
> works. I also agree with Judy’s comments below. Since we’re now calling a
> function that is normally called in a DPC, for clarity, I would like to see
> a new function that calls the DPC routine instead. You could give the new
> function a name that indicates that it will handle command completions in
> polled mode. Then that new function would call the DPC routine directly,
> and the new function would be called from RunningStartAttempt. This way
> it doesn’t look like the RunningStartAttempt routine is doing anything with
> DPCs.
>
> Thanks,
>
> Carolyn
>
> *From:* Alex Chang [mailto:Alex.Chang at pmcs.com <Alex.Chang at pmcs.com>]
> *Sent:* Monday, October 27, 2014 8:57 PM
> *To:* Judy Brock-SSI; Foster, Carolyn D; suman.p at samsung.com;
> nvmewin at lists.openfabrics.org
> *Subject:* RE: Samsung Patch for Bus Reset Enhancements
>
> Hi Carolyn,
>
> Please let us know what you think.
>
> Thanks,
>
> Alex
>
> *From:* Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com
> <judy.brock at ssi.samsung.com>]
> *Sent:* Thursday, October 23, 2014 6:45 PM
> *To:* Foster, Carolyn D; Alex Chang; suman.p at samsung.com;
> nvmewin at lists.openfabrics.org
> *Subject:* RE: Samsung Patch for Bus Reset Enhancements
>
> Hi Carolyn,
>
> Replies inline below, marked [Judy] (originally in blue).
>
> Thanks,
>
> Judy
>
> -----Original Message-----
> From: nvmewin-bounces at lists.openfabrics.org [
> mailto:nvmewin-bounces at lists.openfabrics.org
> <nvmewin-bounces at lists.openfabrics.org>] On Behalf Of Foster, Carolyn D
> Sent: Wednesday, October 22, 2014 4:29 PM
> To: Alex Chang; suman.p at samsung.com; nvmewin at lists.openfabrics.org
> Cc: cgps at samsung.com
> Subject: Re: [nvmewin] Samsung Patch for Bus Reset Enhancements
>
> Hi Suman,
>
> I have some feedback in addition to Alex's comments. I believe there is
> an issue with the loop that was added to NVMeRunningStartAttempt. The
> issue is that IoCompletionDpcRoutine was never meant to be called
> directly. It was architected and designed to always run from a DPC.
>
> [Judy]
>
> That’s because at runtime, we don’t want to be doing
> time-consuming request-completion work in the ISR. Therefore the work is
> offloaded to a DPC which runs at a lower IRQL. However, the work we need
> to do to process cmd completions is fixed - there is actually no innate
> architectural design impediment in the routine itself to calling this
> routine directly in the two scenarios our patch addresses – i.e., those
> situations where by architectural definition we are expected to finish all
> work before returning to the caller (and in our case, that includes
> sending and completing multiple commands in our init state machine).
> Those scenarios are the two that Suman listed in the change notes:
>
> a) NVMeResetBus
>
> b) NVMeAdapterControl-> ScsiRestartAdapter
>
> By design, we don’t want to schedule a DPC to handle completions for the
> commands generated by the init state machine in these 2 reset paths – we
> want to poll. That’s why we make the direct call instead.
>
> It's possible that a command from the init state machine could generate
> an interrupt and run the IoCompletionDpcRoutine before it can be called in
> RunningStartAttempt.
>
> [Judy]
>
> This can’t happen.
>
> If an interrupt is generated on behalf of a command from the init state
> machine during the first scenario above (NVMeResetBus), the hwResetInProg
> flag at the top of the ISR causes us to return immediately:
>
>     NVMeIsrMsix (
>         …
>         if (pAE->hwResetInProg)
>             return TRUE;
>
> The second scenario above (NVMeAdapterControl-> ScsiRestartAdapter) is not
> interrupt-driven by definition. That is, at the time it is called,
> interrupts aren’t enabled. But even if they were, the hwResetInProg flag
> would catch it.
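
The guard Judy describes can be sketched in portable C. This is only an illustration under stated assumptions: the struct, the function names, and the counters are hypothetical, DPC scheduling is modelled as incrementing a counter, and the flag stands in for hwResetInProg (later renamed polledResetInProg in the revised patch). The point shown is that while the flag is set the interrupt handler claims the interrupt without scheduling deferred work, so completions are only handled by the direct call from the polling path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the adapter extension (pAE). */
typedef struct {
    bool polled_reset_in_prog;   /* stands in for hwResetInProg */
    int pending_completions;     /* entries the device has posted */
    int completions_processed;   /* entries the driver has handled */
    int dpcs_queued;             /* models scheduling a DPC */
} SketchAdapter;

/*
 * Interrupt handler sketch: during a polled reset, claim the interrupt
 * and return at once, leaving the completions for the polling loop.
 */
bool sketch_isr(SketchAdapter *ae)
{
    assert(ae != NULL);
    if (ae->polled_reset_in_prog) {
        return true;
    }
    ae->dpcs_queued++;           /* normal runtime path: defer the work */
    return true;
}

/*
 * Completion routine sketch: the same code can run from the deferred
 * path or be called directly; it drains whatever is pending.
 */
int sketch_io_completion_routine(SketchAdapter *ae)
{
    assert(ae != NULL);
    int handled = ae->pending_completions;
    ae->completions_processed += handled;
    ae->pending_completions = 0;
    return handled;
}
```

Because the handler never schedules deferred work while the flag is set, the direct call is the only consumer of completions during the reset, which is the property the reply above relies on.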
>
> A better solution would be to have a loop similar to the one at the end of
> NVMePassiveInitialize where RunningStartAttempt is called, and is followed
> by a loop that waits for the state machine to complete.
>
> [Judy] This is actually the first approach we took and were intending to
> use but we found it didn’t work. The reason was the loop you refer to is
> periodic timer-driven but the timer was not getting scheduled in the
> NVMeAdapterControl-> ScsiRestartAdapter path as there is no timer available
> at that point. The reason this is not an issue for the current OFA driver
> is because we launch the state machine but then return from the call to
> NVMeAdapterControl and let the state machine run asynchronously and
> complete outside of that context (violates the spec).
>
> As the patch is currently written I am not comfortable approving it. This
> change to wait for the state machine's completion could be made in the new
> ReinitializeController function, and then you wouldn't need the changes to
> RunningStartAttempt or any of the polledmode code.
>
> [Judy] The approach you propose will not work for the reason explained
> above. Again, we too had first hoped it would but it won’t. Hence we went
> to a polled-mode model. Since we have to finish all work before returning
> anyway and since reset bus is not a performance path, there is no downside
> to polling.
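
The polled-mode model discussed above can be sketched as a bounded loop in C. This is a simplified illustration, not the code in the patch: the names are hypothetical, the controller is simulated by a counter of outstanding init commands, and the delay between polls is modelled by adding to an elapsed-time field instead of actually stalling. It shows the two properties the thread cares about: the caller does not return until the state machine has finished or the timeout has expired, and no timer callback is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's state values. */
typedef enum {
    SKETCH_RUNNING,
    SKETCH_START_COMPLETE,   /* stands in for NVMeStartComplete */
    SKETCH_FAILED
} SketchState;

/* Hypothetical stand-in for the adapter extension (pAE). */
typedef struct {
    SketchState next_state;
    int commands_remaining;   /* simulated init commands still in flight */
    unsigned long waited_us;  /* simulated elapsed time */
} SketchAdapter;

/* Direct (non-deferred) completion processing: retire one command. */
void sketch_process_completions(SketchAdapter *ae)
{
    assert(ae != NULL);
    if (ae->next_state != SKETCH_RUNNING) {
        return;
    }
    if (ae->commands_remaining > 0) {
        ae->commands_remaining--;
    }
    if (ae->commands_remaining == 0) {
        ae->next_state = SKETCH_START_COMPLETE;
    }
}

/*
 * Launch the (simulated) state machine and poll until it completes or
 * the timeout expires. Returns true only on successful completion.
 */
bool sketch_poll_until_complete(SketchAdapter *ae,
                                unsigned long timeout_us,
                                unsigned long interval_us)
{
    assert(ae != NULL);
    assert(interval_us > 0);
    ae->next_state = SKETCH_RUNNING;
    ae->waited_us = 0;
    while (ae->next_state == SKETCH_RUNNING && ae->waited_us < timeout_us) {
        ae->waited_us += interval_us;     /* a real driver would stall here */
        sketch_process_completions(ae);   /* direct call, no DPC, no timer */
    }
    return ae->next_state == SKETCH_START_COMPLETE;
}
```

In the thread the timeout is derived from the controller timeout in the CAP register plus a few seconds; here it is simply a parameter.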
>
> Thanks,
>
> Carolyn
>
> -----Original Message-----
>
> From: nvmewin-bounces at lists.openfabrics.org [
> mailto:nvmewin-bounces at lists.openfabrics.org
> <nvmewin-bounces at lists.openfabrics.org>] On Behalf Of Alex Chang
>
> Sent: Tuesday, October 21, 2014 1:20 PM
>
> To: suman.p at samsung.com; nvmewin at lists.openfabrics.org
>
> Cc: cgps at samsung.com
>
> Subject: Re: [nvmewin] Samsung Patch for Bus Reset Enhancements
>
> Hi Suman,
>
> (1) There is a call of StorPortResume(pAE) in Line2434 of nvmestd.c, which
> is redundant because, when NextDriverState is NVMeStartComplete,
> StorPortResume has already been called at the end of NVMeRunning.
>
> (2) To comply with our agreed coding style and make the logic easier, may
> I suggest changing Line#184 of nvmestat.c to:
>
>     if (pAE->ntldrDump == FALSE) {
>         if (pAE->polledMode == FALSE) {
>             NVMeRunning(pAE);
>         } else {
>             /*
>              * we poll if we're launching the reinit state machine from
>              * HwStorResetBus or HwStorAdapterControl->ScsiRestartAdapter path
>              */
>             NVMeRunning(pAE);
>             /* TO val is based on CAP register plus a few, 5, seconds to init post RDY */
>             passiveTimeout = pAE->uSecCrtlTimeout +
>                              (STORPORT_TIMER_CB_us * MICRO_TO_SEC);
>             ...
>             return (pAE->DriverState.NextDriverState == NVMeStartComplete) ?
>                    TRUE : FALSE;
>         }
>     } else {
>         PRES_MAPPING_TBL pRMT = &pAE->ResMapTbl;
>         .....
>     }
>
> Thank you!
>
> Alex
>
> From: SUMAN PRAKASH B [mailto:suman.p at samsung.com <suman.p at samsung.com>]
>
> Sent: Wednesday, October 15, 2014 6:00 AM
>
> To: nvmewin at lists.openfabrics.org
>
> Cc: Alex Chang; cgps at samsung.com
>
> Subject: Samsung Patch for Bus Reset Enhancements
>
> Hi Everyone,
>
> We have a patch for the Bus Reset Enhancements.
>
> Please find attached the source code. The password is samsung123
>
> Please find the change description below -
>
> 1. There are multiple paths in the driver that reset the controller and
> execute the initialization state machine. Our patch is not concerned with
> the majority of those paths. Aside from a few additional isolated
> modifications, our patch focuses on the two paths that are supposed to be
> synchronous, i.e., they should not return to the caller until all work is
> completed, but which currently are not. They are:
>
> a) NVMeResetBus (and)
>
> b) NVMeAdapterControl-> ScsiRestartAdapter
>
> We have introduced a new routine, NVMeReInitializeController(), which will
> be invoked from NVMeResetBus() and NVMeAdapterControl() -
> ScsiRestartAdapter. This routine will reset and initialize the controller
> and then complete the requests. It will not return until the initialization
> state machine is complete.
>
> We disallow processing of any SRB in NVMeStartIo() when NextDriverState !=
> NVMeStartComplete. In this way we direct the PowerUp operations to be
> executed in NVMeAdapterControl() - ScsiRestartAdapter only. When resuming
> from hibernation, for example, NVMeStartIo() will not process the POWER SRB.
> Instead, the Power Up operations will be invoked in
> NVMeAdapterControl()->ScsiRestartAdapter.
>
> Additionally, miniport drivers should disregard requests to reset the bus
> when ntldrDump is set to TRUE in NVMeResetBus(), but the current
> implementation processes this request.
>
> 2. When pAE->ntldrDump is TRUE, in the NVMeMapCore2Queue() routine, the
> pPGT value is NULL. Hence a BSOD occurs when executing ULONG coreNum =
> (ULONG)(pPN->Number + pPGT->BaseProcessor). We fixed the problem by
> accessing pPGT only when ntldrDump is FALSE.
>
> 3. In ProcessIo(), when IoStatus is set to NOT_SUBMITTED, the SRB is not
> completed. Due to this, a BSOD was occurring when executing the WHCK test "DP
> WLK - Hot-Add - Device test". We fixed the problem by changing the code to
> complete the SRB when IoStatus is NOT_SUBMITTED.
>
> 4. We changed the use of StorPortBusy()/StorPortReady() to
> StorPortPause()/StorPortResume(), since StorPortBusy() will not prevent new
> I/Os from coming in once the current ones in the driver have been completed.
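
Items 2 and 3 of the change description can be sketched in portable C. Nothing here is taken from the patch itself: the types, names, and status values are hypothetical, the choice of core 0 in dump mode is an assumption made only for illustration, and the error status written into the request is a placeholder because the message does not say which SRB status the patch uses. The first function touches the processor group table only outside dump mode; the second completes a request that could not be submitted instead of leaving it outstanding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the processor group table entry (pPGT). */
typedef struct {
    unsigned base_processor;
} SketchProcGroup;

/*
 * Item 2: compute the core number without dereferencing the group table
 * in dump mode, where the table pointer is NULL.
 * Assumption for illustration only: dump mode uses core 0.
 */
bool sketch_core_number(bool ntldr_dump, const SketchProcGroup *pgt,
                        unsigned proc_number, unsigned *core_out)
{
    assert(core_out != NULL);
    if (ntldr_dump) {
        *core_out = 0;
        return true;
    }
    if (pgt == NULL) {
        return false;             /* defensive: nothing to add the base from */
    }
    *core_out = proc_number + pgt->base_processor;
    return true;
}

typedef enum { SKETCH_SUBMITTED, SKETCH_NOT_SUBMITTED } SketchIoStatus;

/* Hypothetical stand-in for an SRB. */
typedef struct {
    bool completed;               /* handed back to the port driver? */
    int status;                   /* placeholder status code */
} SketchSrb;

#define SKETCH_SRB_STATUS_ERROR 1 /* placeholder, not a real SRB status */

/*
 * Item 3: when the request cannot be submitted, complete it with an
 * error status so that it is never left pending.
 */
SketchIoStatus sketch_process_io(SketchSrb *srb, bool queue_has_room)
{
    assert(srb != NULL);
    if (!queue_has_room) {
        srb->status = SKETCH_SRB_STATUS_ERROR;
        srb->completed = true;
        return SKETCH_NOT_SUBMITTED;
    }
    return SKETCH_SUBMITTED;      /* completion arrives later from the device */
}
```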
>
> Tested the following on Win7 and Windows 2012R2.
>
> - WHCK
>
> - Install/Uninstall, Enable/Disable, FS Format
>
> - Hibernation/Resume, Sleep/Resume
>
> - IOmeter
>
> Thanks,
>
> Suman
>
> _______________________________________________
>
> nvmewin mailing list
>
> nvmewin at lists.openfabrics.org
>
> http://lists.openfabrics.org/mailman/listinfo/nvmewin
> <https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.openfabrics.org_mailman_listinfo_nvmewin&d=AAMGaQ&c=IGDlg0lD0b-nebmJJ0Kp8A&r=QOwFo5M7MYyQeT06CcSuSQHSUdSO20xC9GZe6-T9Svk&m=6bm0ONW6oD10WH5fcm7if9lz_4yuDxwQDOPCn5mU9Yw&s=Ju1jOe05ck9uEvZgQNGbhlcz8MS97eK9bwyDdJlF8SQ&e=>
>