From Alex.Chang at pmcs.com  Wed Apr  2 14:23:57 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Wed, 2 Apr 2014 21:23:57 +0000
Subject: [nvmewin] PMC New Patch
In-Reply-To: <B3A485AFDDB1DD4598621E85E8EB67A83AB22196@FMSMSX105.amr.corp.intel.com>
References: <E1729D5DBAB9E948BA87B76FDFA1298A0C88ADEB@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AB21B8D@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B052@BBYEXM01.pmc-sierra.internal>
	<49158E750348AA499168FD41D88983606269EFB4@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B06E@BBYEXM01.pmc-sierra.internal>
	<49158E750348AA499168FD41D88983606269F0EA@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B161@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AB22196@FMSMSX105.amr.corp.intel.com>
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C88B587@BBYEXM01.pmc-sierra.internal>

Hi all,

It's been more than a week and the only feedback received was from Intel. I decided to revise it based on Carolyn's suggestion and send it out for your final review and test. Will start collecting approvals next Monday.

Thanks,
Alex

From: Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
Sent: Friday, March 28, 2014 4:01 PM
To: Alex Chang; Robles, Raymond C; nvmewin at lists.openfabrics.org
Subject: RE: PMC New Patch

Thank you Alex, I think it's a configuration issue on my part.  The only feedback I really have for you about this patch is in NVMeAllocateMem in nvmeInit.c.  On line 184, if the initial allocation attempt failed, we try to allocate from node 0.  I'd like to see this changed to MM_ANY_NODE_OK instead of specifically hard coding it for node 0.  I know this isn't something specific to your patch, but I think it will be a bit more generic and flexible.
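
The fallback suggested above can be sketched in portable C. This is not the code from nvmeInit.c: `alloc_with_node_fallback`, `node_alloc_fn`, `toy_alloc` and `SKETCH_ANY_NODE_OK` are hypothetical names made up for illustration, and the constant is only a user-mode placeholder for the WDK's MM_ANY_NODE_OK. The sketch assumes an allocator that returns NULL when the requested node cannot satisfy the request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Placeholder for the WDK's MM_ANY_NODE_OK so the sketch builds in user mode. */
#define SKETCH_ANY_NODE_OK 0x80000000u

/* Hypothetical allocator type: returns NULL when `node` cannot satisfy the request. */
typedef void *(*node_alloc_fn)(size_t bytes, uint32_t node);

/* Try the preferred NUMA node first; on failure, retry with "any node is
 * acceptable" instead of hard-coding node 0. */
void *alloc_with_node_fallback(node_alloc_fn alloc, size_t bytes,
                               uint32_t preferred_node)
{
    void *buf;

    assert(alloc != NULL);
    buf = alloc(bytes, preferred_node);
    if (buf == NULL && preferred_node != SKETCH_ANY_NODE_OK) {
        buf = alloc(bytes, SKETCH_ANY_NODE_OK);
    }
    return buf;
}

/* Toy allocator for demonstration only: node 1 is the only node with free
 * memory, so a hard-coded retry on node 0 would fail while an any-node
 * retry succeeds. */
void *toy_alloc(size_t bytes, uint32_t node)
{
    if (node == 1u || node == SKETCH_ANY_NODE_OK) {
        return malloc(bytes);
    }
    return NULL;
}
```

With the toy allocator, a request that prefers node 2 fails on that node and then succeeds through the any-node retry, whereas a retry pinned to node 0 would have returned NULL.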

I have one or two more tests I'd like to wrap up on Monday, but I think the patch is looking good so far.

Thanks!
Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Thursday, March 27, 2014 5:56 PM
To: Robles, Raymond C; Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: PMC New Patch

Hi Ray and Carolyn,

Just to let you know, I retested it as a boot driver with hibernation using Patch#22, Patch#23, and the patch I sent out. All three work properly.

Regards,
Alex

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Thursday, March 27, 2014 12:33 PM
To: Alex Chang; Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: PMC New Patch

Understood. However, the I/O that is failing is the I/O issued while entering S4 with the hiber-driver loaded, as a boot device. This needs to work regardless of any other issues being seen; otherwise S4 as a boot device is not functional in the OFA driver.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Thursday, March 27, 2014 12:27 PM
To: Robles, Raymond C; Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: PMC New Patch

Hi Ray,

That's what I thought, too. For some reason, after coming back from S4, IOMeter stops and reports error messages. After I terminate and relaunch IOMeter, it works fine.

Regards,
Alex

From: Robles, Raymond C [mailto:raymond.c.robles at intel.com]
Sent: Thursday, March 27, 2014 12:18 PM
To: Alex Chang; Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: PMC New Patch

Shouldn't S4 work as both a boot and a data device after the hibernation support patch? I/O generated by the OS while the hiber driver is running (to write out the hiber-file) should work regardless of any IOMeter workloads.

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Thursday, March 27, 2014 12:14 PM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] PMC New Patch

Hi Carolyn,

Since I don't think my changes introduced the problem, I replaced the driver with the one tagged "Patch#22_Hibernation_Support", used our device as a secondary drive, and ran IOMeter to issue I/O to the drive. I saw IOMeter report errors even though the system and our device came back from hibernation properly. With no I/O in progress, S4 works fine with the device as either boot drive or secondary drive. Could you please verify that on your side as well? Once it's confirmed as a known issue, we need to decide when to fix it.

Regards,
Alex

From: Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
Sent: Thursday, March 27, 2014 11:18 AM
To: Alex Chang; nvmewin at lists.openfabrics.org
Subject: RE: PMC New Patch

Hi Alex,

Were you able to test S4 as a boot device?  I am seeing some issues with the I/O during hiber driver execution.  The hiber driver enumeration and initialization seem to complete with no issues, but after the first call to start I/O for the inquiry, I'm not seeing any more I/O happen.  I will try to debug further, but is this something you can look into?

Thanks,
Carolyn

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Monday, March 24, 2014 4:30 PM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] ***UNCHECKED*** PMC New Patch

Hi all,

Please find the attached patch from PMC-Sierra. The password is pmc123. To speed up the process and meet our next release date, please review the changes and provide feedback as soon as possible. For each outstanding patch, we collect feedback for about a week after it is sent out, and then send a revised patch that incorporates it. I will follow up for approval after a week or so to allow more testing and review if necessary.
Summary of changes:

1.       SRB Extension support for Windows 8 and up.

Files changed: nvmeStd.c, nvmeSnti.c, nvmeStat.c, nvmePwrMgmt.c, nvmeInit.c and the related header files.

2.       PRP list building for IOCTL and internal requests.

Files changed: nvmeStd.c, nvmeInit.c and nvmestd.h.

3.       Performance issue in Windows 8/Server 2012.

File changed: nvmeStd.c (removed the StorPortGetUncachedExtension call from NVMeFindAdapter)

4.       NVMeInitAdminQueues return value.

File changed: nvmeStd.c (returns a Storport-defined status instead of TRUE/FALSE)

5.       Non-contiguous Namespace ID support.

Files changed: nvmeStat.c and nvmeInit.c (when fetching the Namespace Structure fails because the Namespace ID is invalid, even though it does not exceed the NN field of the Controller Structure, the driver moves on to the next Namespace ID as long as that ID is not larger than NN)

6.       Stopped using mask bits as the core index to allocate/identify core tables.

Files changed: nvmeStd.c, nvmeInit.c and the related header files.

7.       Added support for the logical processor groups defined by Windows.

Files changed: nvmeStd.c, nvmeInit.c and the related header files.

8.       Core-MSI vector-Queue mapping, CMD_ENTRY synchronization and FreeQList access issues, all of which are related to using core mask bits as the core index (#6) and the lack of logical processor group support (#7).
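
The namespace scan described in item 5 can be illustrated with a minimal sketch. It is not the driver's code: `scan_namespaces`, `identify_ns_fn` and `toy_identify` are hypothetical names, and the callback stands in for issuing an Identify Namespace command and checking whether the returned structure is valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns true when `nsid` names a valid namespace. */
typedef bool (*identify_ns_fn)(uint32_t nsid, void *ctx);

/* Walk Namespace IDs 1..nn. An invalid ID is skipped instead of ending the
 * scan, so gaps in the ID space do not hide later namespaces.
 * Returns the number of valid IDs written to `found`. */
size_t scan_namespaces(uint32_t nn, identify_ns_fn identify, void *ctx,
                       uint32_t *found, size_t max_found)
{
    size_t count = 0;
    uint32_t nsid;

    assert(identify != NULL && found != NULL);
    /* `nsid != 0` stops the loop if nsid wraps around when nn is UINT32_MAX. */
    for (nsid = 1; nsid != 0 && nsid <= nn && count < max_found; nsid++) {
        if (!identify(nsid, ctx)) {
            continue; /* invalid ID within NN: move on to the next one */
        }
        found[count++] = nsid;
    }
    return count;
}

/* Toy device for demonstration: bit n of the mask marks Namespace ID n valid. */
bool toy_identify(uint32_t nsid, void *ctx)
{
    const uint32_t *valid_mask = ctx;

    return nsid < 32u && ((*valid_mask >> nsid) & 1u) != 0u;
}
```

For a toy device with NN = 5 and valid IDs 1, 3 and 4, the scan reports three namespaces. A loop that stopped at the first invalid ID would report only namespace 1.
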
Platforms tested:

1.       Windows 7 64-bit

2.       Windows Server 2008 R2

3.       Windows 8 64-bit

4.       Windows Server 2012
Tests run:

1.       Installation (clean and update), uninstallation, enable/disable, and hibernation and resume.

2.       IOMeter 4K combined read/write, random and sequential.

3.       SCSI Compliance.

4.       SDStress.

5.       Quick/full disk formats.

6.       Non-contiguous Namespace IDs.

Thanks,
Alex

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pmc_patch_v3_0402_2014.zip
Type: application/x-zip-compressed
Size: 177768 bytes
Desc: pmc_patch_v3_0402_2014.zip
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140402/5d8d5b18/attachment.bin>

From kensopenfabrics at catlowcommunications.com  Fri Apr  4 20:55:43 2014
From: kensopenfabrics at catlowcommunications.com (Ken Strandberg)
Date: Fri, 4 Apr 2014 20:55:43 -0700
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
Message-ID: <CAFXU464ie6rzUgyet-KwGmq9EVj15o3OuP_TBCSgVVpoALfs7Q@mail.gmail.com>

Have you tried refreshing your browser (F5)?


On Mon, Mar 31, 2014 at 8:42 AM, Steve Wise <swise at opengridcomputing.com> wrote:

> FYI: This URL isn't working:
>
> t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
> --2014-03-31 11:00:39--
> http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
> Resolving www.openfabrics.org... 69.55.231.74
> Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2014-03-31 11:00:40 ERROR 404: Not Found.
>
>
> > -----Original Message-----
> > From: ewg-bounces at lists.openfabrics.org [mailto:ewg-bounces at lists.openfabrics.org] On
> > Behalf Of kens at flatbed.openfabrics.org
> > Sent: Monday, March 31, 2014 10:13 AM
> > To: nvmewin at openfabrics.org; ewg at openfabrics.org
> > Subject: [ewg] links and such
> >
> > We are migrating all web service to hardware. Some links and URLs are not yet working,
> > but I am diligently trying to solve the issues. The web site, lists server, and mail
> > server are running. Bugs are at bugs.openfabrics.org/bugzilla/. The git daemon is
> > running, but the web interface is not yet up. SVN is available through a client at
> > svn://flatbed.openfabrics.org. The web interface is not up yet. My goal is to have
> > them running today.
> >
> > Thanks for your patience. And thanks to Vladimir for help in getting the git daemon
> > running.
> >
> > Ken
>
> _______________________________________________
> nvmewin mailing list
> nvmewin at lists.openfabrics.org
> http://lists.openfabrics.org/mailman/listinfo/nvmewin
>

From Alex.Chang at pmcs.com  Tue Apr  8 09:23:09 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Tue, 8 Apr 2014 16:23:09 +0000
Subject: [nvmewin] FW: PMC New Patch
In-Reply-To: <897af0d467c145d98601c47f5a9d983a@DM2PR07MB285.namprd07.prod.outlook.com>
References: <E1729D5DBAB9E948BA87B76FDFA1298A0C88ADEB@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AB21B8D@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B052@BBYEXM01.pmc-sierra.internal>
	<49158E750348AA499168FD41D88983606269EFB4@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B06E@BBYEXM01.pmc-sierra.internal>
	<49158E750348AA499168FD41D88983606269F0EA@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B161@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AB22196@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B970@BBYEXM01.pmc-sierra.internal>
	<897af0d467c145d98601c47f5a9d983a@DM2PR07MB285.namprd07.prod.outlook.com>
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C88BA25@BBYEXM01.pmc-sierra.internal>

Hi all,

I have received approvals from Intel and LSI and will push the patch later today. Thank you for reviewing/testing the patch.

Regards,
Alex


From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com]
Sent: Tuesday, April 08, 2014 9:06 AM
To: Alex Chang; Foster, Carolyn D
Cc: Kwok Kong
Subject: RE: PMC New Patch

Hi Alex,

We approve the patch. Thanks.

-Rick

________________________________
From: Alex Chang <Alex.Chang at pmcs.com>
Sent: Monday, April 07, 2014 10:54 AM
To: Foster, Carolyn D; Knoblaugh, Rick
Cc: Kwok Kong
Subject: RE: PMC New Patch

Good morning, Carolyn and Rick,

Hope you had a great weekend. I plan to push this patch by the end of this week. If you approve it, please let me know at your earliest convenience.

Thanks a lot,
Alex




From carolyn.d.foster at intel.com  Tue Apr  8 09:06:58 2014
From: carolyn.d.foster at intel.com (Foster, Carolyn D)
Date: Tue, 8 Apr 2014 16:06:58 +0000
Subject: [nvmewin] PMC New Patch
In-Reply-To: <E1729D5DBAB9E948BA87B76FDFA1298A0C88B587@BBYEXM01.pmc-sierra.internal>
References: <E1729D5DBAB9E948BA87B76FDFA1298A0C88ADEB@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AB21B8D@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B052@BBYEXM01.pmc-sierra.internal>
	<49158E750348AA499168FD41D88983606269EFB4@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B06E@BBYEXM01.pmc-sierra.internal>
	<49158E750348AA499168FD41D88983606269F0EA@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B161@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AB22196@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88B587@BBYEXM01.pmc-sierra.internal>
Message-ID: <B3A485AFDDB1DD4598621E85E8EB67A83AB25689@FMSMSX105.amr.corp.intel.com>

Hi Alex, I approve this patch.

Thanks!
Carolyn




From Alex.Chang at pmcs.com  Tue Apr  8 09:56:33 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Tue, 8 Apr 2014 16:56:33 +0000
Subject: [nvmewin] NVMe Windows DB Is LOCKED - Pushing Patch From PMC For
 SRB Ext. Support And Bug Fixes
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C88BA58@BBYEXM01.pmc-sierra.internal>

Locking NVMe Windows DB.

Thanks,
Alex

nvmewin mailing list
nvmewin at lists.openfabrics.org

From Alex.Chang at pmcs.com  Wed Apr  9 11:41:22 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Wed, 9 Apr 2014 18:41:22 +0000
Subject: [nvmewin] NVMe Windows DB Is UNLOCKED - Pushing Patch From PMC For
 SRB Ext. Support And Bug Fixes
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C88BC64@BBYEXM01.pmc-sierra.internal>

Hi all,

Thank you for reviewing/testing the patch from PMC.
The patch has been pushed into the source base, and a new tag called "Patch#24_SRBExt_CoreGroup_N_Bug_Fixes" has been created under the tags directory.
Next scheduled changes include:

1.       Removing CHATHAM-related code.

2.       Handling failures in learning the CPU core to MSI vector mappings.
Intel will rebase, add the changes and send a patch out for review/test when it's ready.
Should you have any questions, please reply to the email address listed below.

Thanks,
Alex

nvmewin mailing list
nvmewin at lists.openfabrics.org


From james.p.freyensee at intel.com  Fri Apr 11 10:10:46 2014
From: james.p.freyensee at intel.com (Freyensee, James P)
Date: Fri, 11 Apr 2014 17:10:46 +0000
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
Message-ID: <2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>

Is the new NVMe Windows Driver site fully functional yet?  From the main NVM Express website:

http://www.nvmexpress.org/products/

The "Windows Driver" link is broken.

Thanks!

-----Original Message-----
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Steve Wise
Sent: Monday, March 31, 2014 8:43 AM
To: kens at flatbed.openfabrics.org; nvmewin at openfabrics.org; ewg at openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such

FYI: This URL isn't working:

t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
--2014-03-31 11:00:39--
http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
Resolving www.openfabrics.org... 69.55.231.74 Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-03-31 11:00:40 ERROR 404: Not Found.


> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org 
> [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of 
> kens at flatbed.openfabrics.org
> Sent: Monday, March 31, 2014 10:13 AM
> To: nvmewin at openfabrics.org; ewg at openfabrics.org
> Subject: [ewg] links and such
> 
> We are migrating all web service to hardware. Some links and urls are
> not yet working, but I am diligently trying to solve the issues. The web
> site, lists server, and mail server are running. Bugs are at
> bugs.openfabrics.org/bugzilla/. The git daemon is running, but the web
> interface is not yet up. SVN is available through a client at
> svn://flatbed.openfabrics.org. The web interface is not up yet. My goal
> is to have them running today.
>
> Thanks for your patience. And thanks to Vladimir for help in getting
> the git daemon running.
> 
> Ken

_______________________________________________
nvmewin mailing list
nvmewin at lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/nvmewin


From kens at openfabrics.org  Fri Apr 11 12:07:15 2014
From: kens at openfabrics.org (Ken Strandberg)
Date: Fri, 11 Apr 2014 12:07:15 -0700
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
	<2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
Message-ID: <CAFXU466Cs2LfOjoG4igp5Fc3p-qfG9Lm7uZiRUOTg_HaY5TkDA@mail.gmail.com>

I'm working on fixing this external links issue.


On Fri, Apr 11, 2014 at 10:10 AM, Freyensee, James P <
james.p.freyensee at intel.com> wrote:

> Is the new NVMe Windows Driver site fully functional yet?  From the main
> NVM Express website:
>
> http://www.nvmexpress.org/products/
>
> The "Windows Driver" link is broken.
>
> Thanks!
>
> -----Original Message-----
> From: nvmewin-bounces at lists.openfabrics.org [mailto:
> nvmewin-bounces at lists.openfabrics.org] On Behalf Of Steve Wise
> Sent: Monday, March 31, 2014 8:43 AM
> To: kens at flatbed.openfabrics.org; nvmewin at openfabrics.org;
> ewg at openfabrics.org
> Subject: Re: [nvmewin] [ewg] links and such
>
> FYI: This URL isn't working:
>
> t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
> --2014-03-31 11:00:39--  http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
> Resolving www.openfabrics.org... 69.55.231.74
> Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2014-03-31 11:00:40 ERROR 404: Not Found.
>
>
> > -----Original Message-----
> > From: ewg-bounces at lists.openfabrics.org
> > [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of
> > kens at flatbed.openfabrics.org
> > Sent: Monday, March 31, 2014 10:13 AM
> > To: nvmewin at openfabrics.org; ewg at openfabrics.org
> > Subject: [ewg] links and such
> >
> > We are migrating all web service to hardware. Some links and urls are
> > not yet working, but I am diligently trying to solve the issues. The web
> > site, lists server, and mail server are running. Bugs are at
> > bugs.openfabrics.org/bugzilla/. The git daemon is running, but the web
> > interface is not yet up. SVN is available through a client at
> > svn://flatbed.openfabrics.org. The web interface is not up yet. My goal
> > is to have them running today.
> >
> > Thanks for your patience. And thanks to Vladimir for help in getting
> > the git daemon running.
> >
> > Ken
>
> _______________________________________________
> nvmewin mailing list
> nvmewin at lists.openfabrics.org
> http://lists.openfabrics.org/mailman/listinfo/nvmewin
>


-- 


*Ken Strandberg*
*Webmanager/SysAdmin*
*OpenFabrics Alliance*
kens at openfabrics.org
www.openfabrics.org

From kens at openfabrics.org  Sat Apr 12 08:45:09 2014
From: kens at openfabrics.org (Ken Strandberg)
Date: Sat, 12 Apr 2014 08:45:09 -0700
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
	<2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
Message-ID: <CAFXU465Yj3y7tcDtTmFi1Zr6n7SGti=Nu107FUH5ZArDo4hyLw@mail.gmail.com>

With the recent server migration, URLs to the OFA site need /index.php/
prefixing the URI. www.openfabrics.org/URI should be changed to
www.openfabrics.org/index.php/URI. I've sent a request to
info at nvmexpress.org to update their link.
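
For anyone updating stored links in bulk, the rewrite rule above could be sketched roughly as follows. This is only an illustration: the function name add_index_php and its return convention are made up for this example, and it assumes the rule is simply "insert /index.php in front of the path", which may not hold for every URL on the site.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: rewrite "host/path" as "host/index.php/path".
 * An optional "scheme://" in front of the host is preserved.
 * Returns 0 on success, -1 if the URL has no path or 'out' is too small.
 */
static int add_index_php(const char *url, char *out, size_t out_size)
{
    const char *scheme, *host, *slash;
    int n;

    assert(url != NULL && out != NULL);

    /* Skip "scheme://" if present so its slashes are not taken for the path. */
    scheme = strstr(url, "://");
    host = (scheme != NULL) ? scheme + 3 : url;

    /* The first '/' after the host marks the start of the path. */
    slash = strchr(host, '/');
    if (slash == NULL)
        return -1;

    if (strncmp(slash, "/index.php/", 11) == 0) {
        /* Already rewritten: copy unchanged. */
        n = snprintf(out, out_size, "%s", url);
    } else {
        /* Copy everything up to the path, then the prefix, then the path. */
        n = snprintf(out, out_size, "%.*s/index.php%s",
                     (int)(slash - url), url, slash);
    }

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With this sketch, "www.openfabrics.org/URI" becomes "www.openfabrics.org/index.php/URI", and a URL that already carries the prefix is left alone.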


On Fri, Apr 11, 2014 at 10:10 AM, Freyensee, James P <
james.p.freyensee at intel.com> wrote:

> Is the new NVMe Windows Driver site fully functional yet?  From the main
> NVM Express website:
>
> http://www.nvmexpress.org/products/
>
> The "Windows Driver" link is broken.
>
> Thanks!
>
> -----Original Message-----
> From: nvmewin-bounces at lists.openfabrics.org [mailto:
> nvmewin-bounces at lists.openfabrics.org] On Behalf Of Steve Wise
> Sent: Monday, March 31, 2014 8:43 AM
> To: kens at flatbed.openfabrics.org; nvmewin at openfabrics.org;
> ewg at openfabrics.org
> Subject: Re: [nvmewin] [ewg] links and such
>
> FYI: This URL isn't working:
>
> t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
> --2014-03-31 11:00:39--  http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
> Resolving www.openfabrics.org... 69.55.231.74
> Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2014-03-31 11:00:40 ERROR 404: Not Found.
>
>
> > -----Original Message-----
> > From: ewg-bounces at lists.openfabrics.org
> > [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of
> > kens at flatbed.openfabrics.org
> > Sent: Monday, March 31, 2014 10:13 AM
> > To: nvmewin at openfabrics.org; ewg at openfabrics.org
> > Subject: [ewg] links and such
> >
> > We are migrating all web service to hardware. Some links and urls are
> > not yet working, but I am diligently trying to solve the issues. The web
> > site, lists server, and mail server are running. Bugs are at
> > bugs.openfabrics.org/bugzilla/. The git daemon is running, but the web
> > interface is not yet up. SVN is available through a client at
> > svn://flatbed.openfabrics.org. The web interface is not up yet. My goal
> > is to have them running today.
> >
> > Thanks for your patience. And thanks to Vladimir for help in getting
> > the git daemon running.
> >
> > Ken
>
> _______________________________________________
> nvmewin mailing list
> nvmewin at lists.openfabrics.org
> http://lists.openfabrics.org/mailman/listinfo/nvmewin
>


-- 


*Ken Strandberg*
*Webmanager/SysAdmin*
*OpenFabrics Alliance*
kens at openfabrics.org
www.openfabrics.org

From judy.brock at ssi.samsung.com  Sun Apr 13 00:43:48 2014
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Sun, 13 Apr 2014 07:43:48 +0000
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <CAFXU465Yj3y7tcDtTmFi1Zr6n7SGti=Nu107FUH5ZArDo4hyLw@mail.gmail.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
	<2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
	<CAFXU465Yj3y7tcDtTmFi1Zr6n7SGti=Nu107FUH5ZArDo4hyLw@mail.gmail.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A516B618E62@SSIEXCH-MB3.ssi.samsung.com>

Hello,

Looks like www.openfabrics.org is down - can't get to it at all at the moment.

Thanks,
Judy


From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Ken Strandberg
Sent: Saturday, April 12, 2014 8:45 AM
To: Freyensee, James P
Cc: nvmewin at openfabrics.org; ewg at openfabrics.org; Steve Wise; kens at flatbed.openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such

With the recent server migration, URLs to the OFA site need /index.php/ prefixing the URI. www.openfabrics.org/URI should be changed to www.openfabrics.org/index.php/URI. I've sent a request to info at nvmexpress.org to update their link.

On Fri, Apr 11, 2014 at 10:10 AM, Freyensee, James P <james.p.freyensee at intel.com> wrote:
Is the new NVMe Windows Driver site fully functional yet?  From the main NVM Express website:

http://www.nvmexpress.org/products/

The "Windows Driver" link is broken.

Thanks!

-----Original Message-----
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Steve Wise
Sent: Monday, March 31, 2014 8:43 AM
To: kens at flatbed.openfabrics.org; nvmewin at openfabrics.org; ewg at openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such
FYI: This URL isn't working:

t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
--2014-03-31 11:00:39--  http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
Resolving www.openfabrics.org... 69.55.231.74
Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-03-31 11:00:40 ERROR 404: Not Found.


> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org
> [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of
> kens at flatbed.openfabrics.org
> Sent: Monday, March 31, 2014 10:13 AM
> To: nvmewin at openfabrics.org; ewg at openfabrics.org
> Subject: [ewg] links and such
>
> We are migrating all web service to hardware. Some links and urls are
> not yet working, but I am diligently trying to solve the issues. The web
> site, lists server, and mail server are running. Bugs are at
> bugs.openfabrics.org/bugzilla/. The git daemon is running, but the web
> interface is not yet up. SVN is available through a client at
> svn://flatbed.openfabrics.org. The web interface is not up yet. My goal
> is to have them running today.
>
> Thanks for your patience. And thanks to Vladimir for help in getting
> the git daemon running.
>
> Ken

_______________________________________________
nvmewin mailing list
nvmewin at lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/nvmewin


--

Ken Strandberg
Webmanager/SysAdmin
OpenFabrics Alliance
kens at openfabrics.org
www.openfabrics.org

From carolyn.d.foster at intel.com  Tue Apr 15 16:07:29 2014
From: carolyn.d.foster at intel.com (Foster, Carolyn D)
Date: Tue, 15 Apr 2014 23:07:29 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
Message-ID: <B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>

The password is intel1234

Problem statement:
The current OFA driver assumes a one to one mapping of MSI vectors, queues and CPU cores.  If there is not a one to one mapping then the driver does not go through learning mode and we see a performance drop.  Learning mode is how we maintain our source core interrupt steering, where we map MSI vectors to CPU cores.  This mapping allows the driver to issue and complete commands on the same CPU core.

Proposed changes:
Instead of giving up on learn mapping if there are differences between MSI vectors and CPU cores, we will proceed as normal with learning mode.  We allocate the Core table for the max number of cores, and if any CPU cores have not been mapped at the end of learning mode, we map them to Submission queues in a round robin fashion.  We also take into account cases where the MSI vectors are not mapped contiguously, or where the numbers of submission queues, completion queues, and cores differ from one another.  These changes still won't have 100% functionality of the source core interrupt steering, but performance is better than if we don't try at all.  Most of the changes are in the initialization path; there was no change to the IO path.
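
As a rough illustration of the round robin fallback described above (not the code in the patch), the step that runs once learning mode has finished might look something like the sketch below. The core_entry type, the UNMAPPED sentinel, and the function name are all hypothetical; the sketch assumes I/O submission queues are numbered 1..num_io_queues and that queue 0 is never handed to a core for I/O.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel: a core whose queue was not learned. */
#define UNMAPPED 0

/* Hypothetical, simplified core table entry; the real table holds more. */
typedef struct {
    unsigned short sub_queue;  /* I/O submission queue ID, or UNMAPPED */
} core_entry;

/*
 * Give every core that learning mode left unmapped an I/O submission
 * queue, cycling through queues 1..num_io_queues in order.
 * Returns the number of cores that were assigned here.
 */
static unsigned assign_unmapped_round_robin(core_entry *table,
                                            unsigned num_cores,
                                            unsigned num_io_queues)
{
    unsigned core;
    unsigned next_queue = 1;
    unsigned assigned = 0;

    assert(table != NULL);

    if (num_io_queues == 0)
        return 0;  /* nothing to hand out */

    for (core = 0; core < num_cores; core++) {
        if (table[core].sub_queue != UNMAPPED)
            continue;  /* learned mapping stays untouched */

        table[core].sub_queue = (unsigned short)next_queue;
        next_queue = (next_queue % num_io_queues) + 1;  /* wrap back to 1 */
        assigned++;
    }

    return assigned;
}
```

For example, with 8 cores, 3 I/O queues, and cores 0-2 already learned, cores 3-7 would end up on queues 1, 2, 3, 1, 2. Cores that share a queue this way lose strict source core steering, which matches the caveat above that the result is not 100% of the original functionality.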

Also in this patch is the removal of the #defines for the CHATHAM prototype hardware.

Unit Tests:
Tested the following on Windows 7 and Windows 8 based systems.
Booted from a system with more CPU cores than MSI vectors.
IO stress on a setup with fewer IO queues than CPU cores and MSI vectors
Ran SCSI compliance tests
Ran SDStress
Ran IOmeter
Hibernate
Format (quick and slow) of MBR and GPT
Install/Uninstall, Enable/Disable


Thanks!
Carolyn

-------------- next part --------------
A non-text attachment was scrubbed...
Name: CPULearnMappingFixes.zip
Type: application/x-zip-compressed
Size: 169430 bytes
Desc: CPULearnMappingFixes.zip
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140415/2ad71eca/attachment.bin>

From Alex.Chang at pmcs.com  Tue Apr 15 16:13:59 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Tue, 15 Apr 2014 23:13:59 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C892100@BBYEXM01.pmc-sierra.internal>

Thank you very much, Carolyn.

Hi all,

Please start reviewing/testing the patch and provide feedback if you have any at your earliest convenience.

Thanks,
Alex

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Tuesday, April 15, 2014 4:07 PM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode


Content-Type: text/plain; charset=UTF-8

Content-Transfer-Encoding: 8bit

Date: %%SENT_DATE%%

Subject: Suspect Message Quarantined


WARNING: The virus scanner was unable to scan an attachment in an email message sent to you.  This attachment could possibly contain viruses or other malicious programs.  The attachment could not be scanned for the following reasons:


%%DESC%%


The full message and the attachment have been stored in the quarantine.


The identifier for this message is '%%QID%%'.


Access the quarantine at:

https://puremessage.pmc-sierra.bc.ca:28443/


For more information on PMC's Anti-Spam system:

http://pmc-intranet/wiki/index.php/Outlook:Anti-Spam_FAQ


IT Services

PureMessage Admin


The password is intel1234

Problem statement:
The current OFA driver assumes a one to one mapping of MSI vectors, queues and CPU cores.  If there is not a one to one mapping then the driver does not go through learning mode and we see a performance drop.  Learning mode is how we maintain our source core interrupt steering, where we map MSI vectors to CPU cores.  This mapping allows the driver to issue and complete commands on the same CPU core.

Proposed changes:
Instead of giving up on learn mapping if there are differences between MSI vectors and CPU cores, we will proceed as normal with learning mode.  We allocate the Core table for the max number of cores, and if at the end of learning mode, any CPU cores have not been mapped, we will map them to Submission queues in a round robin fashion.  We also take into account if the MSI vectors are not mapped contiguously, or if the number of submission and completion queues are different from each other and the number of cores.  These changes still won't have 100% functionality of the source core interrupt steering, but performance is better than if we don't try at all.  Most of the changes are in the initialization path, there was no change to the IO path.

Also in this patch is the removal of the #defines for the CHATHAM prototype hardware.

Unit Tests:
Tested the following on Windows 7 and Windows 8 based systems.
Booted from a system with more CPU cores than MSI vectors.
IO stress on a setup with fewer IO queues than CPU cores and MSI vectors
Ran SCSI compliance tests
Ran SDStress
Ran IOmeter
Hibernate
Format (quick and slow) of MBR and GPT
Install/Uninstall, Enable/Disable


Thanks!
Carolyn


From Alex.Chang at pmcs.com  Wed Apr 16 11:30:42 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Wed, 16 Apr 2014 18:30:42 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>

Hi Carolyn,

Did you re-base the sources before adding your changes? The patch you sent out does not seem to include what I added in Patch#24.

Thanks,
Alex

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Tuesday, April 15, 2014 4:07 PM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode


The password is intel1234

Problem statement:
The current OFA driver assumes a one to one mapping of MSI vectors, queues and CPU cores.  If there is not a one to one mapping then the driver does not go through learning mode and we see a performance drop.  Learning mode is how we maintain our source core interrupt steering, where we map MSI vectors to CPU cores.  This mapping allows the driver to issue and complete commands on the same CPU core.

Proposed changes:
Instead of giving up on learn mapping if there are differences between MSI vectors and CPU cores, we will proceed as normal with learning mode.  We allocate the Core table for the max number of cores, and if at the end of learning mode, any CPU cores have not been mapped, we will map them to Submission queues in a round robin fashion.  We also take into account if the MSI vectors are not mapped contiguously, or if the number of submission and completion queues are different from each other and the number of cores.  These changes still won't have 100% functionality of the source core interrupt steering, but performance is better than if we don't try at all.  Most of the changes are in the initialization path, there was no change to the IO path.

Also in this patch is the removal of the #defines for the CHATHAM prototype hardware.

Unit Tests:
Tested the following on Windows 7 and Windows 8 based systems.
Booted from a system with more CPU cores than MSI vectors.
IO stress on a setup with fewer IO queues than CPU cores and MSI vectors
Ran SCSI compliance tests
Ran SDStress
Ran IOmeter
Hibernate
Format (quick and slow) of MBR and GPT
Install/Uninstall, Enable/Disable


Thanks!
Carolyn


From carolyn.d.foster at intel.com  Wed Apr 16 15:00:28 2014
From: carolyn.d.foster at intel.com (Foster, Carolyn D)
Date: Wed, 16 Apr 2014 22:00:28 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>
Message-ID: <B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>

My apologies, I sent out the wrong version. Thank you, Alex.  I have attached the correct rebased version.

The password is still intel123

Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Wednesday, April 16, 2014 11:31 AM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Carolyn,

Did you re-base the sources before adding your changes? The patch you sent out seems not including what I added in Patch#24.

Thanks,
Alex

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Tuesday, April 15, 2014 4:07 PM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode


The password is intel1234

Problem statement:
The current OFA driver assumes a one to one mapping of MSI vectors, queues and CPU cores.  If there is not a one to one mapping then the driver does not go through learning mode and we see a performance drop.  Learning mode is how we maintain our source core interrupt steering, where we map MSI vectors to CPU cores.  This mapping allows the driver to issue and complete commands on the same CPU core.

Proposed changes:
Instead of giving up on learn mapping if there are differences between MSI vectors and CPU cores, we will proceed as normal with learning mode.  We allocate the Core table for the max number of cores, and if at the end of learning mode, any CPU cores have not been mapped, we will map them to Submission queues in a round robin fashion.  We also take into account if the MSI vectors are not mapped contiguously, or if the number of submission and completion queues are different from each other and the number of cores.  These changes still won't have 100% functionality of the source core interrupt steering, but performance is better than if we don't try at all.  Most of the changes are in the initialization path, there was no change to the IO path.

Also in this patch is the removal of the #defines for the CHATHAM prototype hardware.

Unit Tests:
Tested the following on Windows 7 and Windows 8 based systems.
Booted from a system with more CPU cores than MSI vectors.
IO stress on a setup with fewer IO queues than CPU cores and MSI vectors
Ran SCSI compliance tests
Ran SDStress
Ran IOmeter
Hibernate
Format (quick and slow) of MBR and GPT
Install/Uninstall, Enable/Disable


Thanks!
Carolyn

-------------- next part --------------
A non-text attachment was scrubbed...
Name: IntelCPUPatch_v1_04162014.zip
Type: application/x-zip-compressed
Size: 172360 bytes
Desc: IntelCPUPatch_v1_04162014.zip
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140416/05fd6596/attachment.bin>

From Alex.Chang at pmcs.com  Wed Apr 16 15:09:09 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Wed, 16 Apr 2014 22:09:09 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <13527_1397685728_534EFDE0_13527_9074_1_B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>
	<13527_1397685728_534EFDE0_13527_9074_1_B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C8929DD@BBYEXM01.pmc-sierra.internal>

Thank you, Carolyn, for fixing it in a timely manner. Now, we review/test it...

Regards,
Alex

From: Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
Sent: Wednesday, April 16, 2014 3:00 PM
To: Alex Chang; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode


My apologies, I sent out the wrong version, thank you Alex.  I have attached the correct rebased version.

The password is still intel123

Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Wednesday, April 16, 2014 11:31 AM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Carolyn,

Did you re-base the sources before adding your changes? The patch you sent out seems not including what I added in Patch#24.

Thanks,
Alex

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Tuesday, April 15, 2014 4:07 PM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode


The password is intel1234

Problem statement:
The current OFA driver assumes a one to one mapping of MSI vectors, queues and CPU cores.  If there is not a one to one mapping then the driver does not go through learning mode and we see a performance drop.  Learning mode is how we maintain our source core interrupt steering, where we map MSI vectors to CPU cores.  This mapping allows the driver to issue and complete commands on the same CPU core.

Proposed changes:
Instead of giving up on learn mapping if there are differences between MSI vectors and CPU cores, we will proceed as normal with learning mode.  We allocate the Core table for the max number of cores, and if at the end of learning mode, any CPU cores have not been mapped, we will map them to Submission queues in a round robin fashion.  We also take into account if the MSI vectors are not mapped contiguously, or if the number of submission and completion queues are different from each other and the number of cores.  These changes still won't have 100% functionality of the source core interrupt steering, but performance is better than if we don't try at all.  Most of the changes are in the initialization path, there was no change to the IO path.

Also in this patch is the removal of the #defines for the CHATHAM prototype hardware.

Unit Tests:
Tested the following on Windows 7 and Windows 8 based systems.
Booted from a system with more CPU cores than MSI vectors.
IO stress on a setup with fewer IO queues than CPU cores and MSI vectors
Ran SCSI compliance tests
Ran SDStress
Ran IOmeter
Hibernate
Format (quick and slow) of MBR and GPT
Install/Uninstall, Enable/Disable


Thanks!
Carolyn


From judy.brock at ssi.samsung.com  Mon Apr 21 12:03:57 2014
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Mon, 21 Apr 2014 19:03:57 +0000
Subject: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset
 Fixes
In-Reply-To: <23EC73C80FB59046A6B7B8EB7B3826593DA6D6E0@SACMBXIP02.sdcorp.global.sandisk.com>
References: <E1729D5DBAB9E948BA87B76FDFA1298A0C884EEB@BBYEXM01.pmc-sierra.internal>
	<23EC73C80FB59046A6B7B8EB7B3826593BDF0B5C@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C889023@BBYEXM01.pmc-sierra.internal>
	<23EC73C80FB59046A6B7B8EB7B3826593BDF134C@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C889036@BBYEXM01.pmc-sierra.internal>
	<26455_1392829222_5304E326_26455_6404_1_23EC73C80FB59046A6B7B8EB7B3826593BDF1488@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88952B@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AAF9C48@FMSMSX105.amr.corp.intel.com>
	<23EC73C80FB59046A6B7B8EB7B3826593DA6D6E0@SACMBXIP02.sdcorp.global.sandisk.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A516B61AC50@SSIEXCH-MB3.ssi.samsung.com>

Hi Carolyn and Dharani et al,

[Carolyn wrote] Line 2219: StorPortSynchronizeAccess - I don't understand why this is needed.  The SynchronizeReset function looks very much like the recovery DPC routine, which should already be synchronized with Start IO and the interrupt DPC.
[Dharani wrote] Line 2219: StorPortSynchronizeAccess, This is the request from Samsung suggested by Judy. Below is the reference mail.

The reason it was needed is because when the reference email was sent (7/18/2013), the OFA driver had a flag called COMPLETE_IN_DPC which controlled whether completions were handled in ISR context directly vs handled later in  DPC context.  So there was a true ISR context to contend with which the HwResetBus routine had to be synchronized with.

One way to do that is to use StorPortSynchronizeAccess as the legacy LSI miniport does, another way is to acquire/release the StorPortInterruptLock ourselves in our HwResetBus routine as the storport AHCI  miniport does, for example. One way or another, we needed to synchronize with our HwInterrupt routine at that time. We have since eliminated the COMPLETE_IN_DPC  flag along with the path which did completions in the ISR directly (see http://lists.openfabrics.org/pipermail/nvmewin/2013-July/000608.html  ) so Carolyn is correct -  NVMeResetBus is currently synchronized with StartIo and the Interrupt DPC already.

However, that does not mean the original recovery DPC routine could be scheduled from NVMeResetBus as it originally was, since that routine still must not schedule a DPC: by definition, all work must be completed before returning.  Below please find some additional feedback on the restructured reset logic. I apologize for not having provided it in the originally requested timeframe. I hope it can be discussed and dealt with as the team decides is most convenient.

1. NVMeResetBus routine does not need to use StorPortSynchronizeAccess to synchronize with the ISR because we no longer do completions directly in ISR context as we did at the time I wrote my original recommendations for revising the HwBusReset routine.
See http://lists.openfabrics.org/pipermail/nvmewin/2013-July/000608.html in which we decided to remove the COMPLETE_IN_DPC flag which allowed the driver to switch between completions in ISR context vs completions in DPC context.  However, there is still a need to not schedule a DPC from NVMeResetBus as was the case prior to the reset patch effort. All work must be completed before returning from the call to NVMeResetBus.

2. Even if called directly, the current logic in the NVMeSynchronizeReset() routine has the same problem as the original code had in that it will not wait for all necessary work to be done before returning. After resetting the controller and completing all outstanding requests, it starts the re-initialize state machine with a call to NVMeRunningStartAttempt(). However, upon return from that call, there is no logic in place to wait for the initialization state machine to run to completion. We just fall straight through, allow IOs to resume, and return.
There needs to be logic, similar to that in NVMePassiveInitialize, which waits for pAE->DriverState.NextDriverState to become either NVMeStartComplete or NVMeStartFailed in a while loop which calls NVMeStallExecution between checks, up to some maximum amount of time.
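
The wait described in item 2 could look roughly like the sketch below. It is a stand-alone simulation, not driver code: the SKETCH_* types and the SketchStall helper are hypothetical stand-ins for pAE->DriverState.NextDriverState and NVMeStallExecution, and the poll interval and time budget are made-up parameters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's state values; the real driver
 * keeps the state in pAE->DriverState.NextDriverState. */
typedef enum {
    SketchStartInProgress,
    SketchStartComplete,   /* stands in for NVMeStartComplete */
    SketchStartFailed      /* stands in for NVMeStartFailed */
} SKETCH_STATE;

typedef struct {
    volatile SKETCH_STATE NextDriverState;
    uint32_t PollsUntilDone;   /* simulation only: stalls before init finishes */
} SKETCH_EXTENSION;

/* Simulated stand-in for NVMeStallExecution: instead of busy-waiting, each
 * call advances the fake init state machine by one step. */
static void SketchStall(SKETCH_EXTENSION *pAE, uint32_t microSeconds)
{
    (void)microSeconds;
    if (pAE->NextDriverState == SketchStartInProgress) {
        if (pAE->PollsUntilDone == 0) {
            pAE->NextDriverState = SketchStartComplete;
        } else {
            pAE->PollsUntilDone--;
        }
    }
}

/* Poll until the init state machine reaches a terminal state or the time
 * budget is used up. Returns true only if the state is StartComplete. */
static bool SketchWaitForStartDone(SKETCH_EXTENSION *pAE,
                                   uint32_t pollIntervalUs,
                                   uint32_t maxWaitUs)
{
    uint32_t waitedUs = 0;

    while (pAE->NextDriverState != SketchStartComplete &&
           pAE->NextDriverState != SketchStartFailed) {
        if (waitedUs >= maxWaitUs) {
            return false;   /* timed out: caller should not resume IO */
        }
        SketchStall(pAE, pollIntervalUs);
        waitedUs += pollIntervalUs;
    }
    return pAE->NextDriverState == SketchStartComplete;
}
```

The point of the return value is that the reset path can decide whether to let IOs resume or report the failure, instead of falling straight through.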

3. The Recovery DPC routine has the same problem as NVMeSynchronizeReset - there is no logic in place to wait for the initialization state machine to run to completion after the call to NVMeRunningStartAttempt() which starts it off.

4. NVMeWaitForCtrlRDY should have a return value that can be checked to see if it was successful or not. Right now, everywhere it is called the code forges ahead regardless of whether the RDY bit is in the desired state or not. Specifically, the following changes should be made (highlighted; line numbers based on the most recently circulated Intel patch):
In NvmeStd.h:
BOOLEAN NVMeWaitForCtrlRDY(
    __in PNVME_DEVICE_EXTENSION pAE,
    __in ULONG expectedValue
);
In NvmeStd.c:
Line 1978:
BOOLEAN NVMeWaitForCtrlRDY(
    __in PNVME_DEVICE_EXTENSION pAE,
    __in ULONG expectedValue
)
{
    NVMe_CONTROLLER_STATUS CSTS = {0};
    ULONG time = 0;

    CSTS.AsUlong =
        StorPortReadRegisterUlong(pAE,
                                  &pAE->pCtrlRegister->CSTS.AsUlong);
    while (CSTS.RDY != expectedValue) {
        NVMeCrashDelay(STORPORT_TIMER_CB_us, pAE->ntldrDump);
        time += STORPORT_TIMER_CB_us;
        if (time > pAE->uSecCrtlTimeout) {
            return FALSE;
        }
        CSTS.AsUlong =
            StorPortReadRegisterUlong(pAE,
                                      &pAE->pCtrlRegister->CSTS.AsUlong);
    }
    return TRUE;
}
Line 651:
if (NVMeWaitForCtrlRDY(pAE, 1) == FALSE) {
    StorPortDebugPrint(INFO,
        "NVMeInitialize: EN bit set to 1 but RDY bit set to 0\n");
    return FALSE;
}

Line 661:
if (NVMeWaitForCtrlRDY(pAE, 0) == FALSE) {
    StorPortDebugPrint(INFO,
        "NVMeInitialize: EN bit set to 0 but RDY bit won't clear- still 1\n");
    return FALSE;
}
etc.

5. NVMeCompleteCmd should have a return value that can be checked to see if it was successful or not. Right now, everywhere it is called the code forges ahead regardless of whether the command entry was valid or a controller reset was triggered. NVMeResetController is called from several places in the driver. One of the routines which it is called from is NVMeCompleteCmd:

VOID NVMeCompleteCmd{
. . .
    if ((pCmdEntry->Pending == FALSE) || (pCmdEntry->Context == NULL)) {
        /*
         * Something bad happened so reset the adapter and hope for the best
         */
        NVMeResetController(pAE, NULL);
        return;
    }

Since NVMeCompleteCmd has no return value, this fatal error return is never detected in any of the places that the function is called from (quite a few) - the logic just proceeds on as if everything is fine. In some cases NVMeCompleteCmd can be called over and over (if it is called from DetectPendingCmds or IoCompletionDpcRoutine for example) which may in turn cause repeated calls to NVMeResetController.
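
To illustrate item 5, here is a rough sketch of a completion routine that reports failure so a calling loop can stop after the first reset request. Everything in it is hypothetical and simplified (the SKETCH_* types, the reset counter, the drain loop); it only mirrors the Pending/Context check quoted above and is not the driver's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified command entry; mirrors the two fields that
 * NVMeCompleteCmd checks in the excerpt above. */
typedef struct {
    bool  Pending;
    void *Context;
} SKETCH_CMD_ENTRY;

typedef struct {
    unsigned ResetCount;   /* counts calls standing in for NVMeResetController */
} SKETCH_ADAPTER;

static void SketchResetController(SKETCH_ADAPTER *pAE)
{
    pAE->ResetCount++;
}

/* Returns false when the entry is inconsistent and a reset was requested,
 * true when the command was completed normally. */
static bool SketchCompleteCmd(SKETCH_ADAPTER *pAE, SKETCH_CMD_ENTRY *pCmdEntry)
{
    if (!pCmdEntry->Pending || pCmdEntry->Context == NULL) {
        SketchResetController(pAE);
        return false;
    }
    pCmdEntry->Pending = false;
    return true;
}

/* A caller in the style of a completion loop: it stops at the first
 * failure instead of triggering one reset per bad entry. */
static size_t SketchDrainCompletions(SKETCH_ADAPTER *pAE,
                                     SKETCH_CMD_ENTRY *entries,
                                     size_t count)
{
    size_t completed = 0;

    for (size_t i = 0; i < count; i++) {
        if (!SketchCompleteCmd(pAE, &entries[i])) {
            break;   /* reset already requested; do not keep going */
        }
        completed++;
    }
    return completed;
}
```

With two bad entries in the list, the loop above requests a single reset, whereas a VOID version called for every entry would request one per bad entry.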

6. There is redundancy in the new routine NVMeWaitForCtrlRDY() and the routine NVMeWaitOnReady(). Although the new routine is missing a return value (see item #4), we don't need both - we can get rid of the old routine.

7. In NvmeStd.c, line 646:
Except for the first sentence, this comment is not accurate and should be removed:
/*
 * Before we transition to 0, make sure the ctrl is actually RDY
 * NOTE:  Some HW implementations may not require this wait and if not
 * then it could be removed as waiting at this IRQL is not recommended.
 * The spec is not clear on whether we need to wait for RDY to
 * transition EN back to 0 or not.
 */
NVM Express 1.0e and beyond includes the following statement in the definition of the EN bit (emphasis added): "Setting this field from a '0' to a '1' when CSTS.RDY is a '1,' or setting this field from a '1' to a '0' when CSTS.RDY is a '0,' has undefined results."

8. The routine NVMeResetAdapter() sets CC.EN to 0 without ever checking to make sure that CSTS.RDY is set to '1' first. This check has to be included in this routine. Since it is not, there are many paths in the driver where there is no prior check for this condition:
a) NVMeInitAdminQueues -> NVMeEnableAdapter -> NVMeResetAdapter
b) NVMeNormalShutdown -> NVMeResetAdapter
c) NVMeAdapterControlPowerDown -> NVMeResetAdapter
d) NVMeSynchronizeReset -> NVMeResetAdapter
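
As a rough illustration of items 7 and 8, the sketch below models a reset that waits for RDY to become 1 before clearing EN, and then waits for RDY to become 0. The registers are simulated in a plain struct; SKETCH_CTRL, SketchPoll and the poll limits are hypothetical and only stand in for the CC/CSTS register accesses and stall calls in the real driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated controller: RDY follows EN after a number of polls. */
typedef struct {
    uint32_t CcEn;        /* models CC.EN */
    uint32_t CstsRdy;     /* models CSTS.RDY */
    uint32_t RdyLatency;  /* simulation only: polls before RDY follows EN */
    bool     Stuck;       /* simulation only: RDY never changes */
} SKETCH_CTRL;

/* One polling step of the simulated hardware. */
static void SketchPoll(SKETCH_CTRL *c)
{
    if (c->Stuck || c->CstsRdy == c->CcEn) {
        return;
    }
    if (c->RdyLatency > 0) {
        c->RdyLatency--;
        return;
    }
    c->CstsRdy = c->CcEn;
}

/* Wait for RDY to reach the expected value, giving up after maxPolls. */
static bool SketchWaitForRdy(SKETCH_CTRL *c, uint32_t expected, uint32_t maxPolls)
{
    for (uint32_t polls = 0; c->CstsRdy != expected; polls++) {
        if (polls >= maxPolls) {
            return false;
        }
        SketchPoll(c);
    }
    return true;
}

/* Reset in the spirit of item 8: never clear EN while RDY is still 0. */
static bool SketchResetAdapter(SKETCH_CTRL *c, uint32_t maxPolls)
{
    if (c->CcEn == 1) {
        /* Clearing EN while RDY == 0 has undefined results, so wait first. */
        if (!SketchWaitForRdy(c, 1, maxPolls)) {
            return false;
        }
        c->CcEn = 0;
    }
    /* Now wait for the controller to report that it is no longer ready. */
    return SketchWaitForRdy(c, 0, maxPolls);
}
```

Putting the check inside the reset routine itself would cover all of the call paths listed in a) through d) at once.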

9. In the RecoveryDpcRoutine():
a) the code does not need to set CC.EN to '0' and then wait for CSTS.RDY to become 0 because right after it does so, it calls NVMeResetAdapter which does the exact same thing.
b) is there an actual requirement for the following code?:
    /* 10 msec "settle" delay post reset */
    NVMeStallExecution(pAE, 10000);
c) is it really safe and/or required to always acquire/release the StartIo lock?

10. This is not feedback related to the reset logic per se, but do we really need the NVMeCallArbiter() function at this point? I think we could replace all occurrences of
NVMeCallArbiter(pAE);
with
    if (pAE->ntldrDump == FALSE) {
        StorPortNotification(RequestTimerCall,
                             pAE,
                             NVMeRunning,
                             pAE->DriverState.CheckbackInterval);
    }

Thanks,
Judy

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Dharani Kotte
Sent: Tuesday, February 25, 2014 4:28 AM
To: Foster, Carolyn D; Alex Chang; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Carolyn,

Line 1384: I can take care of this item.

Line 2219: StorPortSynchronizeAccess, This is the request from Samsung suggested by Judy. Below is the reference mail.

In our testing, we create a situation where we put the NVMe driver under heavy I/O load with Iometer and then cause the device to stop responding.  This results in I/O request timeouts which eventually cause the driver to be called at its HwStorResetBus entry point (NVMeResetBus).  I have some feedback on the current architecture of that routine:


1.       Among other things, NVMeResetBus schedules a DPC to complete any pending commands. This creates a situation where upon return from this entry point, there are still cmds outstanding which don't get completed till the DPC runs.  According to the WDK, this doesn't appear to be legal - all outstanding cmds have to be completed by the HwStorResetBus routine before it returns:

HwResetBus

Pointer to the miniport driver's HwStorResetBus<ms-help://MS.WDK.v10.7600.091201/Storage_r/hh/Storage_r/stormini_b3051379-4caa-4502-9492-a21672cfbf0d.xml.htm> routine, which is a required entry point for all miniport drivers. This member has the same meaning for the Storport version of the HW_INITIALIZATION_DATA structure as it does for the SCSI Port version of the structure. For more information, see the HwResetBus member of HW_INITIALIZATION_DATA (SCSI)
and
HwScsiResetBus must complete any outstanding requests by calling ScsiPortCompleteRequest with the SrbStatus value SRB_STATUS_BUS_RESET or, for individual SRBs, ScsiPortNotification with this status value.
and
The port driver pauses all device IO queues for the adapter and then calls the HwStorResetBus routine at IRQL DISPATCH_LEVEL after acquiring the StartIo spin lock. A miniport driver is responsible for completing SRBs received by HwStorStartIo<http://msdn.microsoft.com/en-us/library/windows/hardware/ff557423(v=vs.85).aspx> for PathId during this routine and setting their status to SRB_STATUS_BUS_RESET if necessary

Since HwStorResetBus must finish its work before returning; it can't schedule a DPC to do so later on. The logic which schedules a DPC should be removed.


2.       Code should be added to call StorPortPause() to hold off any new requests till StorPortResume() is called.


3.       Code should be added to call  StorPortSynchronizeAccess() in order to synchronize with HwStorInterrupt. A callback routine in the NVMe driver should also be added for NVMeResetBus to do the synchronized work in. HwStorResetBus is already synchronized with HwStorStartIo since the port driver calls it only after acquiring the StartIo spinlock.


4.       We should implement a driver-internal global (per "adapter") flag signifying we are busy with reset processing and thus can't allow new I/O requests to go through to the hardware.


5.       Code should be added to call StorPortResume() when all work is complete.


6.       We should refer to the WDK-supplied LSI parallel SCSI StorPort miniport sample driver for an example of all of the above.
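
The ordering in items 1 through 5 can be sketched as follows. This is a stand-alone simulation under stated assumptions: the SKETCH_* names are hypothetical, the Paused flag only models the effect of StorPortPause()/StorPortResume(), and the controller reset and re-initialization are reduced to a comment.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_SRBS 8

typedef enum {
    SketchSrbPending,
    SketchSrbBusReset    /* stands in for SRB_STATUS_BUS_RESET */
} SKETCH_SRB_STATUS;

typedef struct {
    bool              Paused;           /* models StorPortPause/StorPortResume */
    bool              ResetInProgress;  /* per-adapter flag from item 4 */
    SKETCH_SRB_STATUS Srb[SKETCH_MAX_SRBS];
    size_t            SrbCount;
    size_t            PendingCount;
} SKETCH_RESET_ADAPTER;

/* StartIo-side check: refuse new requests while a reset is running. */
static bool SketchStartIo(SKETCH_RESET_ADAPTER *a)
{
    if (a->ResetInProgress || a->SrbCount >= SKETCH_MAX_SRBS) {
        return false;
    }
    a->Srb[a->SrbCount++] = SketchSrbPending;
    a->PendingCount++;
    return true;
}

/* Reset-bus flow: everything is finished before the function returns,
 * so no DPC is scheduled to complete requests later. */
static bool SketchResetBus(SKETCH_RESET_ADAPTER *a)
{
    a->Paused = true;             /* item 2: hold off new requests */
    a->ResetInProgress = true;    /* item 4: driver-internal busy flag */

    /* item 1: complete every outstanding request with bus-reset status */
    for (size_t i = 0; i < a->SrbCount; i++) {
        if (a->Srb[i] == SketchSrbPending) {
            a->Srb[i] = SketchSrbBusReset;
            a->PendingCount--;
        }
    }

    /* The controller reset and re-initialization would run to completion here. */

    a->ResetInProgress = false;
    a->Paused = false;            /* item 5: resume once all work is done */
    return a->PendingCount == 0;
}
```

The key property is that PendingCount is already zero when SketchResetBus returns, which is what the WDK text quoted above asks of HwStorResetBus.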


Thanks,
Judy


Thanks,
Dharani.


From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Monday, February 24, 2014 3:51 PM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: Re: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Alex and Dharani,

I have been reviewing the code and performing some tests and I have some concerns about this patch.

In nvmeStd.c:
Line 1384: NVMeProcessAbortLunReset - This change will now send abort commands for all pending requests when a RESET_LOGICAL_UNIT request comes in, instead of issuing the RecoveryDpc routine.  This change concerns me the most.  During a reset there is no need to send individual abort requests for outstanding commands.  When the LUN reset comes in, we will set CC.EN to 0 and the spec clearly states that "the controller shall not process commands nor post completion queue entries to the completion queue."  This reset behavior has been accounted for in the driver, by design.  In the LUN reset case, we should continue to issue the recovery DPC routine, which will complete all outstanding commands.

What should happen here is that the new processAbortLun function should be moved under the SRB_FUNCTION_ABORT_COMMAND case only.  Then the processAbortLunReset function should only send one abort and not abort all outstanding commands.

Also, during testing, I hit a D1 BSOD when I tried to step through the code.  I ran IO and forced a timeout by using the debugger to skip over the line of code that rings the submission queue doorbell.  The IO should be timed out by storport, which will then send a reset lun.

Line 2219: StorPortSynchronizeAccess - I don't understand why this is needed.  The SynchronizeReset function looks very much like the recovery DPC routine, which should already be synchronized with Start IO and the interrupt DPC.

Thanks,
Carolyn


From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Wednesday, February 19, 2014 10:06 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Thank you, Dharani.

Hi all,

Please review/test the attached reset fix patch from Sandisk and provide your feedback.

Thank you very much,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Wednesday, February 19, 2014 9:00 AM
To: Alex Chang
Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes


Hi Alex,

The attached is the patch source for review. I have tested the I/O running over night.

Areas to focus on when testing this patch:
1. Test abort/LUN resets.
2. Test chip reset.
3. Test the format command.
4. Test the firmware download command.

Password is "sndk1234"

Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, February 18, 2014 12:15 PM
To: Dharani Kotte
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Great!

Thanks,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Tuesday, February 18, 2014 12:14 PM
To: Alex Chang
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Just testing after merging the code; I should be able to send it tomorrow morning.
Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, February 18, 2014 12:13 PM
To: Dharani Kotte
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Hi Dharani,

Just a friendly reminder, could you please send out your patch as soon as it's ready?

Many thanks,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, February 14, 2014 10:18 AM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Sure Alex.
Dharani.

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Friday, February 14, 2014 10:17 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Good morning, Dharani,

As you may know, both the Intel and Huawei patches have been added to the OFA source base. You may now rebase your changes and send a patch out for review/test. Thank you very much for contributing the fixes.

Regards,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Wednesday, January 15, 2014 2:08 PM
To: Alex Chang; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] RE: Would you please help to resolve a few OFA NVMe driver problems ?



Hi Alex,

Attached is the source for the preliminary review. I have run the IO and SCSI compliance tests. I don't have a drive which supports abort/LUN resets, and I am not sure how to test the format command.

Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Friday, December 20, 2013 11:54 AM
To: Dharani Kotte; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Happy Holidays to you all.
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, December 20, 2013 11:52 AM
To: Alex Chang; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Thank you for the explanation. Sure I will take look.
Happy Holidays.
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Friday, December 20, 2013 11:44 AM
To: Kwok Kong; Dharani Kotte; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Hi Dharani,

The controller reset can be issued either from the host or from the driver itself. Currently, the driver seems to handle both in the same manner via a single entry point, "NVMeResetController". In the "from the host" case, the driver needs to separate SRB_FUNCTION_RESET_... requests from the NVME_RESET_DEVICE ioctl request with respect to handling pending IOs. In the "driver itself" case, the related error recovery code needs to be re-examined as well.
Judy from Samsung suggested referring to the storahci.sys driver sample code for Windows 7/8 for reset bus logic examples and detailed recommendations.

Thank you,
Alex


From: Kwok Kong
Sent: Friday, December 20, 2013 9:08 AM
To: Dharani Kotte; Akshay Mathur; Alex Chang
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Dharani,

Yes, these are the three areas that you are committed to.

Alex,

Please send more details on the "Controller reset does not handle all cases"  to Dharani.

Thanks

-Kwok

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, December 20, 2013 9:02 AM
To: Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Hi Kwok,

I think the items below are the ones we are committing to:
- Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset
- Controller reset does not handle all cases
- orphaned requests

Can somebody provide a little more detail on the expectation for the item "Controller reset does not handle all cases"?

Thanks,
Dharani.


From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Thursday, December 19, 2013 6:53 PM
To: Akshay Mathur
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Excellent! Your help is much appreciated.

Dharani,

Please let me know if you have any question.

Happy holiday to all of you.

-Kwok

From: Akshay Mathur [mailto:Akshay.Mathur at sandisk.com]
Sent: Thursday, December 19, 2013 6:51 PM
To: Kwok Kong
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Kwok,
You are welcome. We are pleased to contribute to the community and appreciate you driving it!

We will try our best to complete the implementation by the end of January, but we may not be able to complete comprehensive testing by that time. This is because of overlaps with a few internal business deliverables and a company-wide shutdown for the next 1.5 weeks.

Anyway, Dharani will be in touch with you as he makes progress.
Thanks
Akshay

From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Tuesday, December 17, 2013 4:21 PM
To: Akshay Mathur
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Akshay,

Thanks for your willingness to contribute to the driver.   I am looking for a patch before end of Jan 2014, the earlier the better.
Please let me know if Sandisk can commit to that.

Your help is much appreciated.

Thanks

-Kwok

From: Akshay Mathur [mailto:Akshay.Mathur at sandisk.com]
Sent: Tuesday, December 17, 2013 4:11 PM
To: Kwok Kong
Cc: Dharani Kotte; Dave Landsman; Akshay Mathur
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Kwok,
I manage the Software and driver development team at SanDisk/ESS.
We are certainly willing to contribute to fixing the problems listed below but before we can commit, we would like to get clarification on the timeline i.e. by when these fixes are expected to be completed.
Thanks
Akshay Mathur
Sr Software Manager, Enterprise Storage Solutions
951 SanDisk Drive, Building #5  |  Milpitas, CA 95035 U.S.A.  |  Direct  +1 408.801.1336  |
Cell +1 856.607.7323  |  Corporate +1 408.801.1000  |  Akshay.Mathur at sandisk.com<mailto:Akshay.Mathur at sandisk.com>


From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Wednesday, December 11, 2013 18:00
To: Dave Landsman
Cc: Dharani Kotte
Subject: Would you please help to resolve a few OFA NVMe driver problems ?

Dave and Dharani,

There are some issues with the current OFA driver that need to be fixed. PMC is working on resolving some of the problems. Intel has agreed to work on the following two problems:
- remove #define for CHATHAM2
- Learning of CPU core to Vector failure handling

I am also making request to other companies to work on some of the issues.

I wonder if your company can work on the following three problems:
                - Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset
                - Controller reset does not handle all cases
                - orphaned requests

Please let me know if your company can work on these three issues.

Thanks

-Kwok



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140421/f19a1a63/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 9449 bytes
Desc: image001.jpg
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140421/f19a1a63/attachment.jpg>

From Dharani.Kotte at sandisk.com  Mon Apr 21 12:17:08 2014
From: Dharani.Kotte at sandisk.com (Dharani Kotte)
Date: Mon, 21 Apr 2014 19:17:08 +0000
Subject: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset
 Fixes
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A516B61AC50@SSIEXCH-MB3.ssi.samsung.com>
References: <E1729D5DBAB9E948BA87B76FDFA1298A0C884EEB@BBYEXM01.pmc-sierra.internal>
	<23EC73C80FB59046A6B7B8EB7B3826593BDF0B5C@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C889023@BBYEXM01.pmc-sierra.internal>
	<23EC73C80FB59046A6B7B8EB7B3826593BDF134C@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C889036@BBYEXM01.pmc-sierra.internal>
	<26455_1392829222_5304E326_26455_6404_1_23EC73C80FB59046A6B7B8EB7B3826593BDF1488@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88952B@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AAF9C48@FMSMSX105.amr.corp.intel.com>
	<23EC73C80FB59046A6B7B8EB7B3826593DA6D6E0@SACMBXIP02.sdcorp.global.sandisk.com>
	<36E8D38D6B771A4BBDB1C0D800158A516B61AC50@SSIEXCH-MB3.ssi.samsung.com>
Message-ID: <23EC73C80FB59046A6B7B8EB7B3826593DA967BE@SACMBXIP01.sdcorp.global.sandisk.com>

Hi Judy,

It is a long email; let me take a look at the code and get back to you.

Thank you for bringing it up.

Thanks,
Dharani.

From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Monday, April 21, 2014 12:04 PM
To: nvmewin at openfabrics.org
Cc: Foster, Carolyn D; Dharani Kotte; Alex Chang
Subject: RE: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Carolyn and Dharani et al,

[Carolyn wrote] Line 2219: StorPortSynchronizeAccess - I don't understand why this is needed.  The SynchronizeReset function looks very much like the recovery DPC routine, which should already be synchronized with Start IO and the interrupt DPC.
[Dharani wrote] Line 2219: StorPortSynchronizeAccess, This is the request from Samsung suggested by Judy. Below is the reference mail.

The reason it was needed is because when the reference email was sent (7/18/2013), the OFA driver had a flag called COMPLETE_IN_DPC which controlled whether completions were handled in ISR context directly vs handled later in  DPC context.  So there was a true ISR context to contend with which the HwResetBus routine had to be synchronized with.

One way to do that is to use StorPortSynchronizeAccess as the legacy LSI miniport does, another way is to acquire/release the StorPortInterruptLock ourselves in our HwResetBus routine as the storport AHCI  miniport does, for example. One way or another, we needed to synchronize with our HwInterrupt routine at that time. We have since eliminated the COMPLETE_IN_DPC  flag along with the path which did completions in the ISR directly (see http://lists.openfabrics.org/pipermail/nvmewin/2013-July/000608.html  ) so Carolyn is correct -  NVMeResetBus is currently synchronized with StartIo and the Interrupt DPC already.

However, that does not mean the original recovery DPC routine could be scheduled from NVMeResetBus as it originally was, since that routine still must not schedule a DPC: by definition, all work must be completed before returning.  Below please find some additional feedback on the restructured reset logic. I apologize for not having provided it in the originally requested timeframe. I hope it can be discussed and dealt with as the team decides is most convenient.

1. NVMeResetBus routine does not need to use StorPortSynchronizeAccess to synchronize with the ISR because we no longer do completions directly in ISR context as we did at the time I wrote my original recommendations for revising the HwBusReset routine.
See http://lists.openfabrics.org/pipermail/nvmewin/2013-July/000608.html in which we decided to remove the COMPLETE_IN_DPC flag which allowed the driver to switch between completions in ISR context vs completions in DPC context.  However, there is still a need to not schedule a DPC from NVMeResetBus as was the case prior to the reset patch effort. All work must be completed before returning from the call to NVMeResetBus.

2. Even if called directly, the current logic in the NVMeSynchronizeReset() routine has the same problem as the original code had in that it will not wait for all necessary work to be done before returning. After resetting the controller and completing all outstanding requests, it starts the re-initialize state machine with a call to NVMeRunningStartAttempt(). However, upon return from that call, there is no logic in place to wait for the initialization state machine to run to completion. We just fall straight through, allow IOs to resume, and return.
There needs to be logic, similar to that in NVMePassiveInitialize, which waits for pAE->DriverState.NextDriverState to become either NVMeStartComplete or NVMeStartFailed in a while loop which calls NVMeStallExecution between checks, up to some maximum amount of time.

3. The Recovery DPC routine has the same problem as NVMeSynchronizeReset - there is no logic in place to wait for the initialization state machine to run to completion after the call to NVMeRunningStartAttempt() which starts it off.

4. NVMeWaitForCtrlRDY should have a return value that can be checked to see if it was successful or not. Right now, everywhere it is called the code forges ahead regardless of whether the RDY bit is in the desired state or not. Specifically, the following changes should be made (highlighted; line numbers based on the most recently circulated Intel patch):
In NvmeStd.h:
BOOLEAN NVMeWaitForCtrlRDY(
    __in PNVME_DEVICE_EXTENSION pAE,
    __in ULONG expectedValue
);
In NvmeStd.c:
Line 1978:
BOOLEAN NVMeWaitForCtrlRDY(
    __in PNVME_DEVICE_EXTENSION pAE,
    __in ULONG expectedValue
)
{
    NVMe_CONTROLLER_STATUS CSTS = {0};
    ULONG time = 0;

    CSTS.AsUlong =
        StorPortReadRegisterUlong(pAE,
                                  &pAE->pCtrlRegister->CSTS.AsUlong);
    while (CSTS.RDY != expectedValue) {
        NVMeCrashDelay(STORPORT_TIMER_CB_us, pAE->ntldrDump);
        time += STORPORT_TIMER_CB_us;
        if (time > pAE->uSecCrtlTimeout) {
            return FALSE;
        }
        CSTS.AsUlong =
            StorPortReadRegisterUlong(pAE,
                                      &pAE->pCtrlRegister->CSTS.AsUlong);
    }
    return TRUE;
}
Line 651:
if (NVMeWaitForCtrlRDY(pAE, 1) == FALSE) {
    StorPortDebugPrint(INFO,
        "NVMeInitialize: EN bit set to 1 but RDY bit set to 0\n");
    return FALSE;
}

Line 661:
if (NVMeWaitForCtrlRDY(pAE, 0) == FALSE) {
    StorPortDebugPrint(INFO,
        "NVMeInitialize: EN bit set to 0 but RDY bit won't clear- still 1\n");
    return FALSE;
}
etc.

5. NVMeCompleteCmd should have a return value that can be checked to see if it was successful or not. Right now, everywhere it is called the code forges ahead regardless of whether the command entry was valid or a controller reset was triggered. NVMeResetController is called from several places in the driver. One of the routines which it is called from is NVMeCompleteCmd:

VOID NVMeCompleteCmd{
. . .
    if ((pCmdEntry->Pending == FALSE) || (pCmdEntry->Context == NULL)) {
        /*
         * Something bad happened so reset the adapter and hope for the best
         */
        NVMeResetController(pAE, NULL);
        return;
    }

Since NVMeCompleteCmd has no return value, this fatal error return is never detected in any of the places that the function is called from (quite a few) - the logic just proceeds on as if everything is fine. In some cases NVMeCompleteCmd can be called over and over (if it is called from DetectPendingCmds or IoCompletionDpcRoutine for example) which may in turn cause repeated calls to NVMeResetController.
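
A minimal sketch of what item 5 is asking for, assuming nothing about the real driver beyond what is quoted above. All names prefixed with Sketch/SKETCH are hypothetical stand-ins (the reset call is simulated with a counter), so this only illustrates the control flow: the completion routine reports failure, and a looping caller stops instead of triggering one reset per bad entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's command entry. */
typedef struct {
    bool  Pending;
    void *Context;
} SKETCH_CMD_ENTRY;

/* Hypothetical stand-in for the adapter extension. */
typedef struct {
    unsigned ResetCount;  /* counts simulated NVMeResetController() calls */
    unsigned Completed;   /* counts successfully completed entries */
} SKETCH_CMD_ADAPTER;

/* Returns false when the entry is inconsistent and a reset was requested. */
bool SketchCompleteCmd(SKETCH_CMD_ADAPTER *ae, SKETCH_CMD_ENTRY *entry)
{
    if (!entry->Pending || entry->Context == NULL) {
        ae->ResetCount++;  /* stands in for NVMeResetController(pAE, NULL) */
        return false;
    }
    entry->Pending = false;
    ae->Completed++;
    return true;
}

/* A looping caller: stops at the first failure so the reset fires once. */
size_t SketchDrainCompletions(SKETCH_CMD_ADAPTER *ae,
                              SKETCH_CMD_ENTRY *entries, size_t count)
{
    size_t done = 0;
    for (size_t i = 0; i < count; i++) {
        if (!SketchCompleteCmd(ae, &entries[i])) {
            break;
        }
        done++;
    }
    return done;
}
```

With a VOID routine the loop above would have no way to stop, which is the repeated-reset case described for DetectPendingCmds and IoCompletionDpcRoutine.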

6. There is redundancy between the new routine NVMeWaitForCtrlRDY() and the routine NVMeWaitOnReady(). Although the new routine is missing a return value (see item #4), we don't need both - we can get rid of the old routine.

7. In NvmeStd.c, line 646:
Except for the first sentence, this comment is not accurate and should be removed:
/*
* Before we transition to 0, make sure the ctrl is actually RDY
* NOTE:  Some HW implementations may not require this wait and  if not then it could be removed as waiting at this IRQL is  not recommended.  The spec is not clear on whether we
* need  to wait for RDY to transition EN back to 0 or not.
*/
NVM Express 1.0e and beyond includes the following statement in the definition of the EN bit (emphasis added): "Setting this field from a '0' to a '1' when CSTS.RDY is a '1,' or setting this field from a '1' to a '0' when CSTS.RDY is a '0,' has undefined results."

8. The routine NVMeResetAdapter() sets CC.EN to 0 without ever checking to make sure that CSTS.RDY is set to '1' first. This check has to be included in this routine. Since it is not, there are many paths in the driver where there is no prior check for this condition:
a) NVMeInitAdminQueues -> NVMeEnableAdapter -> NVMeResetAdapter
b) NVMeNormalShutdown -> NVMeResetAdapter
c) NVMeAdapterControlPowerDown -> NVMeResetAdapter
d) NVMeSynchronizeReset -> NVMeResetAdapter
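
To make item 8 concrete, here is a small sketch of the guard, run against a simulated controller rather than real registers. Everything named Sketch/SKETCH is hypothetical; the real routine would read CSTS through StorPortReadRegisterUlong as in item 4. The point is only the ordering: if EN is 1, wait for RDY to be 1 before clearing EN, then wait for RDY to drop to 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated controller; fields stand in for CC.EN and CSTS.RDY. */
typedef struct {
    uint32_t en;        /* stands in for CC.EN */
    uint32_t rdy;       /* stands in for CSTS.RDY */
    uint32_t rdyDelay;  /* simulation only: polls before RDY follows EN,
                           UINT32_MAX means RDY never changes */
} SKETCH_CTRL;

/* Simulated register read; RDY catches up with EN after rdyDelay polls. */
static uint32_t SketchReadRdy(SKETCH_CTRL *c)
{
    if (c->rdy != c->en && c->rdyDelay != UINT32_MAX) {
        if (c->rdyDelay == 0) {
            c->rdy = c->en;
        } else {
            c->rdyDelay--;
        }
    }
    return c->rdy;
}

/* Bounded poll, mirroring a NVMeWaitForCtrlRDY that returns a BOOLEAN. */
static bool SketchWaitForRdy(SKETCH_CTRL *c, uint32_t expected,
                             uint32_t maxPolls)
{
    for (uint32_t i = 0; i <= maxPolls; i++) {
        if (SketchReadRdy(c) == expected) {
            return true;
        }
    }
    return false;
}

/* Clears EN only once RDY is 1, then waits for RDY to clear. */
bool SketchResetAdapter(SKETCH_CTRL *c, uint32_t maxPolls)
{
    if (c->en == 1) {
        /* Clearing EN while RDY is still 0 has undefined results. */
        if (!SketchWaitForRdy(c, 1, maxPolls)) {
            return false;
        }
        c->en = 0;
    }
    return SketchWaitForRdy(c, 0, maxPolls);
}
```

What the driver should do when RDY never reaches 1 (fail the reset, or fall back to some other recovery) is a policy question this sketch does not answer; it simply returns false and leaves EN untouched.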

9. In the RecoveryDpcRoutine():
a) the code does not need to set CC.EN to '0' and then wait for CSTS.RDY to become 0 because right after it does so, it calls NVMeResetAdapter which does the exact same thing.
b) is there an actual requirement for the following code?:
    /* 10 msec "settle" delay post reset */
    NVMeStallExecution(pAE, 10000);
c) is it really safe and/or required to always acquire/release the StartIo lock?

10. This is not feedback related to the reset logic per se, but do we really need the NVMeCallArbiter() function at this point? I think we could replace all occurrences of
NVMeCallArbiter(pAE);
with
if (pAE->ntldrDump == FALSE) {
    StorPortNotification(RequestTimerCall,
                         pAE,
                         NVMeRunning,
                         pAE->DriverState.CheckbackInterval);
}

Thanks,
Judy

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Dharani Kotte
Sent: Tuesday, February 25, 2014 4:28 AM
To: Foster, Carolyn D; Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: Re: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Carolyn,

Line 1384: I can take care of this item.

Line 2219: StorPortSynchronizeAccess, This is the request from Samsung suggested by Judy. Below is the reference mail.

In our testing, we create a situation where we put the NVMe driver under heavy I/O load with Iometer and then cause the device to stop responding.  This results in I/O request timeouts which eventually cause the driver to be called at its HwStorResetBus entry point (NVMeResetBus).  I have some feedback on the current architecture of that routine:


1.       Among other things, NVMeResetBus schedules a DPC to complete any pending commands. This creates a situation where, upon return from this entry point, there are still cmds outstanding which don't get completed till the DPC runs.  According to the WDK, this doesn't appear to be legal - all outstanding cmds have to be completed by the HwStorResetBus routine before it returns:

HwResetBus

Pointer to the miniport driver's HwStorResetBus<ms-help://MS.WDK.v10.7600.091201/Storage_r/hh/Storage_r/stormini_b3051379-4caa-4502-9492-a21672cfbf0d.xml.htm> routine, which is a required entry point for all miniport drivers. This member has the same meaning for the Storport version of the HW_INITIALIZATION_DATA structure as it does for the SCSI Port version of the structure. For more information, see the HwResetBus member of HW_INITIALIZATION_DATA (SCSI)
and
HwScsiResetBus must complete any outstanding requests by calling ScsiPortCompleteRequest with the SrbStatus value SRB_STATUS_BUS_RESET or, for individual SRBs, ScsiPortNotification with this status value.
and
The port driver pauses all device IO queues for the adapter and then calls the HwStorResetBus routine at IRQL DISPATCH_LEVEL after acquiring the StartIo spin lock. A miniport driver is responsible for completing SRBs received by HwStorStartIo<http://msdn.microsoft.com/en-us/library/windows/hardware/ff557423(v=vs.85).aspx> for PathId during this routine and setting their status to SRB_STATUS_BUS_RESET if necessary

Since HwStorResetBus must finish its work before returning, it can't schedule a DPC to do so later on. The logic which schedules a DPC should be removed.


2.       Code should be added to call StorPortPause() to hold off any new requests till StorPortResume() is called.


3.       Code should be added to call  StorPortSynchronizeAccess() in order to synchronize with HwStorInterrupt. A callback routine in the NVMe driver should also be added for NVMeResetBus to do the synchronized work in. HwStorResetBus is already synchronized with HwStorStartIo since the port driver calls it only after acquiring the StartIo spinlock.


4.       We should implement a driver-internal global (per "adapter") flag signifying we are busy with reset processing and thus can't allow new I/O requests to go through to the hardware.


5.       Code should be added to call StorPortResume() when all work is complete.


6.       We should refer to the WDK-supplied LSI parallel SCSI StorPort miniport sample driver for an example of all of the above.
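
The ordering in items 1-5 above can be sketched as follows. This is a simulation only: the steps are recorded in a log instead of calling Storport, and every Sketch/SKETCH/STEP name is hypothetical. In the real driver the pause and resume steps would be the StorPortPause() and StorPortResume() calls named above, and completing requests would set SRB_STATUS_BUS_RESET on each SRB.

```c
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_MAX_STEPS = 8 };

typedef enum {
    STEP_PAUSE,         /* stands in for StorPortPause() */
    STEP_RESET_HW,      /* stands in for resetting the controller */
    STEP_COMPLETE_ALL,  /* complete every outstanding request */
    STEP_RESUME         /* stands in for StorPortResume() */
} SKETCH_STEP;

/* Hypothetical per-adapter state, including the "reset busy" flag. */
typedef struct {
    bool        ResetInProgress;
    unsigned    Outstanding;
    SKETCH_STEP Log[SKETCH_MAX_STEPS];
    size_t      LogLen;
} SKETCH_BUS;

static void SketchLog(SKETCH_BUS *ae, SKETCH_STEP step)
{
    if (ae->LogLen < SKETCH_MAX_STEPS) {
        ae->Log[ae->LogLen++] = step;
    }
}

/* New I/O is refused while the reset flag is set (item 4). */
bool SketchStartIo(SKETCH_BUS *ae)
{
    if (ae->ResetInProgress) {
        return false;
    }
    ae->Outstanding++;
    return true;
}

/* All work finishes before returning; nothing is deferred to a DPC. */
bool SketchResetBus(SKETCH_BUS *ae)
{
    ae->ResetInProgress = true;
    SketchLog(ae, STEP_PAUSE);
    SketchLog(ae, STEP_RESET_HW);
    ae->Outstanding = 0;  /* every request completed with bus-reset status */
    SketchLog(ae, STEP_COMPLETE_ALL);
    ae->ResetInProgress = false;
    SketchLog(ae, STEP_RESUME);
    return true;
}
```

Synchronization with the interrupt path (item 3) is deliberately left out of the sketch.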


Thanks,
Judy


Thanks,
Dharani.


From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Monday, February 24, 2014 3:51 PM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: Re: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Alex and Dharani,

I have been reviewing the code and performing some tests and I have some concerns about this patch.

In nvmeStd.c:
Line 1384: NVMeProcessAbortLunReset - This change will now send abort commands for all pending requests when a RESET_LOGICAL_UNIT request comes in, instead of issuing the RecoveryDpc routine.  This change concerns me the most.  During a reset there is no need to send individual abort requests for outstanding commands.  When the LUN reset comes in, we will set CC.EN to 0 and the spec clearly states that "the controller shall not process commands nor post completion queue entries to the completion queue."  This reset behavior has been accounted for in the driver, by design.  In the LUN reset case, we should continue to issue the recovery DPC routine, which will complete all outstanding commands.

What should happen here is that the new processAbortLun function should be moved under the SRB_FUNCTION_ABORT_COMMAND only.  Then the processAbortLunReset function should only send one abort and not abort all outstanding commands.

Also, during testing, I hit a D1 BSOD when I tried to step through the code.  I ran IO and forced a timeout by using the debugger to skip over the line of code that rings the submission queue doorbell.  The IO should be timed out by storport, which will then send a reset lun.

Line 2219: StorPortSynchronizeAccess - I don't understand why this is needed.  The SynchronizeReset function looks very much like the recovery DPC routine, which should already be synchronized with Start IO and the interrupt DPC.

Thanks,
Carolyn


From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Wednesday, February 19, 2014 10:06 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Thank you, Dharani.

Hi all,

Please review/test the attached reset fix patch from Sandisk and provide your feedback.

Thank you very much,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Wednesday, February 19, 2014 9:00 AM
To: Alex Chang
Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes


Hi Alex,

Attached is the patch source for review. I have tested it with I/O running overnight.

Areas to focus on when testing this patch:
1. Test abort/LUN resets.
2. Test chip reset.
3. Test the format command.
4. Test the firmware download command.

Password is "sndk1234"

Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, February 18, 2014 12:15 PM
To: Dharani Kotte
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Great!

Thanks,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Tuesday, February 18, 2014 12:14 PM
To: Alex Chang
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Just testing after merging the code; I should be able to send it tomorrow morning.
Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, February 18, 2014 12:13 PM
To: Dharani Kotte
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Hi Dharani,

Just a friendly reminder, could you please send out your patch as soon as it's ready?

Many thanks,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, February 14, 2014 10:18 AM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Sure Alex.
Dharani.

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Friday, February 14, 2014 10:17 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Good morning, Dharani,

As you may know, both the Intel and Huawei patches have been added to the OFA source base. You may now re-base your changes and send a patch out for review/test. Thank you very much for contributing the fixes.

Regards,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Wednesday, January 15, 2014 2:08 PM
To: Alex Chang; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] RE: Would you please help to resolve a few OFA NVMe driver problems ?


Hi Alex,

The attached is the source for the preliminary review. I have tested the IO and scsi compliance test. I don't have a drive which supports abort/lun resets, not sure how to test the format command.

Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Friday, December 20, 2013 11:54 AM
To: Dharani Kotte; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Happy Holidays to you all.
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, December 20, 2013 11:52 AM
To: Alex Chang; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Thank you for the explanation. Sure, I will take a look.
Happy Holidays.
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Friday, December 20, 2013 11:44 AM
To: Kwok Kong; Dharani Kotte; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Hi Dharani,

The controller reset can be issued either from the host or from the driver itself. Currently, the driver seems to handle both in the same manner via the single entry point "NVMeResetController". In the "from the host" case, the driver needs to separate SRB_FUNCTION_RESET_... requests from the NVME_RESET_DEVICE ioctl request with respect to handling pending IOs. In the "driver itself" case, we need to re-examine the related error recovery code as well.
Judy from Samsung suggested referring to the storahci.sys driver sample code for Windows 7/8 for reset bus logic examples and detailed recommendations.

Thank you,
Alex


From: Kwok Kong
Sent: Friday, December 20, 2013 9:08 AM
To: Dharani Kotte; Akshay Mathur; Alex Chang
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Dharani,

Yes, these are the three areas that you are committed to.

Alex,

Please send more details on the "Controller reset does not handle all cases"  to Dharani.

Thanks

-Kwok

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, December 20, 2013 9:02 AM
To: Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Hi Kwok,

I think the items below are the ones we are committing to:
- Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset
- Controller reset does not handle all cases
- orphaned requests

Can somebody provide a little more detail on what is expected for the item "Controller reset does not handle all cases"?

Thanks,
Dharani.


From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Thursday, December 19, 2013 6:53 PM
To: Akshay Mathur
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Excellent! Your help is much appreciated.

Dharani,

Please let me know if you have any question.

Happy holiday to all of you.

-Kwok

From: Akshay Mathur [mailto:Akshay.Mathur at sandisk.com]
Sent: Thursday, December 19, 2013 6:51 PM
To: Kwok Kong
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Kwok,
You are welcome. We are pleased to contribute to the community and appreciate you driving it!

We will try our best to complete the implementation by the end of January, but we may not be able to complete comprehensive testing by that time. This is because of overlaps with a few internal business deliverables and a company-wide shut-down for the next 1.5 weeks.

Anyway, Dharani will be in touch with you as he makes progress.
Thanks
Akshay

From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Tuesday, December 17, 2013 4:21 PM
To: Akshay Mathur
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Akshay,

Thanks for your willingness to contribute to the driver.   I am looking for a patch before end of Jan 2014, the earlier the better.
Please let me know if Sandisk can commit to that.

Your help is much appreciated.

Thanks

-Kwok

From: Akshay Mathur [mailto:Akshay.Mathur at sandisk.com]
Sent: Tuesday, December 17, 2013 4:11 PM
To: Kwok Kong
Cc: Dharani Kotte; Dave Landsman; Akshay Mathur
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Kwok,
I manage the Software and driver development team at SanDisk/ESS.
We are certainly willing to contribute to fixing the problems listed below but before we can commit, we would like to get clarification on the timeline i.e. by when these fixes are expected to be completed.
Thanks
Akshay Mathur
Sr Software Manager, Enterprise Storage Solutions
951 SanDisk Drive, Building #5  |  Milpitas, CA 95035 U.S.A.  |  Direct  +1 408.801.1336  |
Cell +1 856.607.7323  |  Corporate +1 408.801.1000  |  Akshay.Mathur at sandisk.com<mailto:Akshay.Mathur at sandisk.com>


From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Wednesday, December 11, 2013 18:00
To: Dave Landsman
Cc: Dharani Kotte
Subject: Would you please help to resolve a few OFA NVMe driver problems ?

Dave and Dharani,

There are some issues with the current OFA driver that need to be fixed. PMC is working on resolving some of the problems. Intel has agreed to work on the following two problems:
- remove #define for CHATHAM2
- Learning of CPU core to Vector failure handling

I am also making request to other companies to work on some of the issues.

I wonder if your company can work on the following three problems:
- Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset
- Controller reset does not handle all cases
- orphaned requests

Please let me know if your company can work on these three issues.

Thanks

-Kwok


________________________________

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).


From Uma.Parepalli at skhms.com  Mon Apr 21 12:23:36 2014
From: Uma.Parepalli at skhms.com (Uma Parepalli)
Date: Mon, 21 Apr 2014 19:23:36 +0000
Subject: [nvmewin]   Relationship between System Power States,
 Device Power States and Link Power States
Message-ID: <efedbc253018413ebe2cb235d36a2ab1@N111XMB0240.SKHMS.COM>

I am trying to understand the relationship between the System and Device Power States used by the Windows drivers, and how these are related to the Link Power States.
Thank you,
Uma

The information contained in this e-mail is considered confidential of SK hynix memory solutions Inc. and intended only for the persons addressed or copied in this e-mail. Any unauthorized use, dissemination of the information, or copying of this message is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email.

From judy.brock at ssi.samsung.com  Mon Apr 21 12:26:04 2014
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Mon, 21 Apr 2014 19:26:04 +0000
Subject: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset
 Fixes
In-Reply-To: <23EC73C80FB59046A6B7B8EB7B3826593DA967BE@SACMBXIP01.sdcorp.global.sandisk.com>
References: <E1729D5DBAB9E948BA87B76FDFA1298A0C884EEB@BBYEXM01.pmc-sierra.internal>
	<23EC73C80FB59046A6B7B8EB7B3826593BDF0B5C@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C889023@BBYEXM01.pmc-sierra.internal>
	<23EC73C80FB59046A6B7B8EB7B3826593BDF134C@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C889036@BBYEXM01.pmc-sierra.internal>
	<26455_1392829222_5304E326_26455_6404_1_23EC73C80FB59046A6B7B8EB7B3826593BDF1488@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88952B@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AAF9C48@FMSMSX105.amr.corp.intel.com>
	<23EC73C80FB59046A6B7B8EB7B3826593DA6D6E0@SACMBXIP02.sdcorp.global.sandisk.com>
	<36E8D38D6B771A4BBDB1C0D800158A516B61AC50@SSIEXCH-MB3.ssi.samsung.com>
	<23EC73C80FB59046A6B7B8EB7B3826593DA967BE@SACMBXIP01.sdcorp.global.sandisk.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A516B61AC9D@SSIEXCH-MB3.ssi.samsung.com>

Hi Dharani,

Please note, definitely not all of this feedback is related to the reset patch but you are just the lucky one who got the entire feedback "snapshot" - again, I'm not sure how the team will want to evaluate the various pieces of input. I'd appreciate feedback on the feedback from the team!

thanks :)

Judy

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Monday, April 21, 2014 12:17 PM
To: Judy Brock-SSI; nvmewin at openfabrics.org
Cc: Foster, Carolyn D; Alex Chang
Subject: RE: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Judy,

It is a long email; let me take a look at the code and get back to you.

Thank you for bringing it up.

Thanks,
Dharani.

From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Monday, April 21, 2014 12:04 PM
To: nvmewin at openfabrics.org<mailto:nvmewin at openfabrics.org>
Cc: Foster, Carolyn D; Dharani Kotte; Alex Chang
Subject: RE: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Carolyn and Dharani et al,

[Carolyn wrote] Line 2219: StorPortSynchronizeAccess - I don't understand why this is needed.  The SynchronizeReset function looks very much like the recovery DPC routine, which should already be synchronized with Start IO and the interrupt DPC.
[Dharani wrote] Line 2219: StorPortSynchronizeAccess, This is the request from Samsung suggested by Judy. Below is the reference mail.

The reason it was needed is because when the reference email was sent (7/18/2013), the OFA driver had a flag called COMPLETE_IN_DPC which controlled whether completions were handled in ISR context directly vs handled later in  DPC context.  So there was a true ISR context to contend with which the HwResetBus routine had to be synchronized with.

One way to do that is to use StorPortSynchronizeAccess as the legacy LSI miniport does, another way is to acquire/release the StorPortInterruptLock ourselves in our HwResetBus routine as the storport AHCI  miniport does, for example. One way or another, we needed to synchronize with our HwInterrupt routine at that time. We have since eliminated the COMPLETE_IN_DPC  flag along with the path which did completions in the ISR directly (see http://lists.openfabrics.org/pipermail/nvmewin/2013-July/000608.html  ) so Carolyn is correct -  NVMeResetBus is currently synchronized with StartIo and the Interrupt DPC already.

However, that does not mean the original recovery DPC routine could be scheduled from NVMeResetBus as it originally was, since there is still a need to not schedule a DPC from within that routine - by definition, all work must be completed before returning.  Below please find some additional feedback on the restructured reset logic. I apologize for not having provided it in the originally requested timeframe. I hope it can be discussed and dealt with as the team decides is most convenient.

1. NVMeResetBus routine does not need to use StorPortSynchronizeAccess to synchronize with the ISR because we no longer do completions directly in ISR context as we did at the time I wrote my original recommendations for revising the HwBusReset routine.
See http://lists.openfabrics.org/pipermail/nvmewin/2013-July/000608.html in which we decided to remove the COMPLETE_IN_DPC flag which allowed the driver to switch between completions in ISR context vs completions in DPC context.  However, there is still a need to not schedule a DPC from NVMeResetBus as was the case prior to the reset patch effort. All work must be completed before returning from the call to NVMeResetBus.

2. Even if called directly, the current logic in the NVMeSynchronizeReset() routine has the same problem as the original code had in that it will not wait for all necessary work to be done before returning. After resetting the controller and completing all outstanding requests, it starts the re-initialize state machine with a call to NVMeRunningStartAttempt(). However, upon return from that call, there is no logic in place to wait for the initialization state machine to run to completion. We just fall straight through, allow IOs to resume, and return.
There needs to be logic, similar to that in NVMePassiveInitialize, which waits for pAE->DriverState.NextDriverState to become either NVMeStartComplete or NVMeStartFailed in a while loop which calls NVMeStallExecution between checks, up to some maximum amount of time.
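
A rough sketch of the bounded wait described in item 2, written against a simulated state machine. The SKETCH_* types and SketchStall() are hypothetical stand-ins for pAE->DriverState.NextDriverState, NVMeStartComplete, NVMeStartFailed and NVMeStallExecution; the step and maximum times are parameters because the thread does not say what values the driver should use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's init states. */
typedef enum {
    SketchStartInProgress,  /* any intermediate state */
    SketchStartComplete,    /* stands in for NVMeStartComplete */
    SketchStartFailed       /* stands in for NVMeStartFailed */
} SKETCH_STATE;

typedef struct {
    volatile SKETCH_STATE NextDriverState;
    uint32_t              PollsUntilDone;  /* simulation only; 0 = never */
    SKETCH_STATE          FinalState;      /* simulation only */
} SKETCH_EXT;

/* Stands in for NVMeStallExecution(); also advances the fake state
 * machine so the example can run without hardware. */
static void SketchStall(SKETCH_EXT *ext, uint32_t us)
{
    (void)us;
    if (ext->PollsUntilDone > 0 && --ext->PollsUntilDone == 0) {
        ext->NextDriverState = ext->FinalState;
    }
}

/* Polls until init completes or fails, giving up after maxUs. */
bool SketchWaitForInit(SKETCH_EXT *ext, uint32_t stepUs, uint32_t maxUs)
{
    uint32_t waited = 0;

    while (ext->NextDriverState != SketchStartComplete &&
           ext->NextDriverState != SketchStartFailed) {
        if (waited >= maxUs) {
            return false;  /* timed out: init never finished */
        }
        SketchStall(ext, stepUs);
        waited += stepUs;
    }
    return ext->NextDriverState == SketchStartComplete;
}
```

A caller such as the reset path would allow I/O to resume only when this returns true, and treat both the failed and the timed-out cases as a failed reset.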

3. The Recovery DPC routine has the same problem as NVMeSynchronizeReset - there is no logic in place to wait for the initialization state machine to run to completion after the call to NVMeRunningStartAttempt(), which starts it off.

4. NVMeWaitForCtrlRDY should have a return value that can be checked to see if it was successful or not. Right now, everywhere it is called the code forges ahead regardless of whether the RDY bit is in the desired state or not. Specifically, the following changes should be made (highlighted; line numbers based on the most recently circulated Intel patch):
In NvmeStd.h:
BOOLEAN NVMeWaitForCtrlRDY(
    __in PNVME_DEVICE_EXTENSION pAE,
    __in ULONG expectedValue
);
In NvmeStd.c:
Line 1978:
BOOLEAN NVMeWaitForCtrlRDY(
    __in PNVME_DEVICE_EXTENSION pAE,
    __in ULONG expectedValue
)
{
    NVMe_CONTROLLER_STATUS CSTS = {0};
    ULONG time = 0;

    CSTS.AsUlong =
        StorPortReadRegisterUlong(pAE,
                                  &pAE->pCtrlRegister->CSTS.AsUlong);
    while (CSTS.RDY != expectedValue) {
        NVMeCrashDelay(STORPORT_TIMER_CB_us, pAE->ntldrDump);
        time += STORPORT_TIMER_CB_us;
        if (time > pAE->uSecCrtlTimeout) {
            return FALSE;
        }
        CSTS.AsUlong =
            StorPortReadRegisterUlong(pAE,
                                      &pAE->pCtrlRegister->CSTS.AsUlong);
    }
    return TRUE;
}
Line 651:
if (NVMeWaitForCtrlRDY(pAE, 1) == FALSE) {
    StorPortDebugPrint(INFO,
        "NVMeInitialize: EN bit set to 1 but RDY bit set to 0\n");
    return FALSE;
}

Line 661:
if (NVMeWaitForCtrlRDY(pAE, 0) == FALSE) {
    StorPortDebugPrint(INFO,
        "NVMeInitialize: EN bit set to 0 but RDY bit won't clear- still 1\n");
    return FALSE;
}
etc.
5. NVMeCompleteCmd should have a return value that can be checked to see if it was successful or not. Right now, everywhere it is called the code forges ahead regardless of whether the call ended in an error. NVMeResetController is called from several places in the driver. One of the routines it is called from is NVMeCompleteCmd:

VOID NVMeCompleteCmd{
. . .
    if ((pCmdEntry->Pending == FALSE) || (pCmdEntry->Context == NULL)) {
        /*
         * Something bad happened so reset the adapter and hope for the best
         */
        NVMeResetController(pAE, NULL);
        return;
    }

Since NVMeCompleteCmd has no return value, this fatal error return is never detected in any of the places that the function is called from (quite a few) - the logic just proceeds on as if everything is fine. In some cases NVMeCompleteCmd can be called over and over (if it is called from DetectPendingCmds or IoCompletionDpcRoutine for example) which may in turn cause repeated calls to NVMeResetController.

6. There is redundancy between the new routine NVMeWaitForCtrlRDY() and the routine NVMeWaitOnReady(). Although the new routine is missing a return value (see item #4), we don't need both - we can get rid of the old routine.

7. In NvmeStd.c, line 646:
Except for the first sentence, this comment is not accurate and should be removed:
/*
* Before we transition to 0, make sure the ctrl is actually RDY
* NOTE:  Some HW implementations may not require this wait and  if not then it could be removed as waiting at this IRQL is  not recommended.  The spec is not clear on whether we
* need  to wait for RDY to transition EN back to 0 or not.
*/
NVM Express 1.0e and beyond includes the following statement in the definition of the EN bit (emphasis added): "Setting this field from a '0' to a '1' when CSTS.RDY is a '1,' or setting this field from a '1' to a '0' when CSTS.RDY is a '0,' has undefined results."

8. The routine NVMeResetAdapter() sets CC.EN to 0 without ever checking to make sure that CSTS.RDY is set to '1' first. This check has to be included in this routine. Since it is not, there are many paths in the driver where there is no prior check for this condition:
a) NVMeInitAdminQueues -> NVMeEnableAdapter -> NVMeResetAdapter
b) NVMeNormalShutdown -> NVMeResetAdapter
c) NVMeAdapterControlPowerDown -> NVMeResetAdapter
d) NVMeSynchronizeReset -> NVMeResetAdapter

9. In the RecoveryDpcRoutine():
a) the code does not need to set CC.EN to '0' and then wait for CSTS.RDY to become 0 because right after it does so, it calls NVMeResetAdapter which does the exact same thing.
b) is there an actual requirement for the following code?:
    /* 10 msec "settle" delay post reset */
    NVMeStallExecution(pAE, 10000);
c) is it really safe and/or required to always acquire/release the StartIo lock?

10. This is not feedback related to the reset logic per se, but do we really need the NVMeCallArbiter() function at this point? I think we could replace all occurrences of
NVMeCallArbiter(pAE);
with
if (pAE->ntldrDump == FALSE) {
    StorPortNotification(RequestTimerCall,
                         pAE,
                         NVMeRunning,
                         pAE->DriverState.CheckbackInterval);
}

Thanks,
Judy

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Dharani Kotte
Sent: Tuesday, February 25, 2014 4:28 AM
To: Foster, Carolyn D; Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: Re: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Carolyn,

Line 1384: I can take care of this item.

Line 2219: StorPortSynchronizeAccess, This is the request from Samsung suggested by Judy. Below is the reference mail.

In our testing, we create a situation where we put the NVMe driver under heavy I/O load with Iometer and then cause the device to stop responding.  This results in I/O request timeouts which eventually cause the driver to be called at its HwStorResetBus entry point (NVMeResetBus).  I have some feedback on the current architecture of that routine:


1.       Among other things, NVMeResetBus schedules a DPC to complete any pending commands. This creates a situation where upon return from this entry point, there are still cmds outstanding which don't get completed till the DPC runs.  According to the WDK, this doesn't appear to be legal - all outstanding cmds have to be completed by the HwStorResetBus routine before it returns:

HwResetBus

Pointer to the miniport driver's HwStorResetBus<ms-help://MS.WDK.v10.7600.091201/Storage_r/hh/Storage_r/stormini_b3051379-4caa-4502-9492-a21672cfbf0d.xml.htm> routine, which is a required entry point for all miniport drivers. This member has the same meaning for the Storport version of the HW_INITIALIZATION_DATA structure as it does for the SCSI Port version of the structure. For more information, see the HwResetBus member of HW_INITIALIZATION_DATA (SCSI)
and
HwScsiResetBus must complete any outstanding requests by calling ScsiPortCompleteRequest with the SrbStatus value SRB_STATUS_BUS_RESET or, for individual SRBs, ScsiPortNotification with this status value.
and
The port driver pauses all device IO queues for the adapter and then calls the HwStorResetBus routine at IRQL DISPATCH_LEVEL after acquiring the StartIo spin lock. A miniport driver is responsible for completing SRBs received by HwStorStartIo<http://msdn.microsoft.com/en-us/library/windows/hardware/ff557423(v=vs.85).aspx> for PathId during this routine and setting their status to SRB_STATUS_BUS_RESET if necessary

Since HwStorResetBus must finish its work before returning, it can't schedule a DPC to do so later on. The logic which schedules a DPC should be removed.


2.       Code should be added to call StorPortPause() to hold off any new requests till StorPortResume() is called.


3.       Code should be added to call  StorPortSynchronizeAccess() in order to synchronize with HwStorInterrupt. A callback routine in the NVMe driver should also be added for NVMeResetBus to do the synchronized work in. HwStorResetBus is already synchronized with HwStorStartIo since the port driver calls it only after acquiring the StartIo spinlock.


4.       We should implement a driver-internal global (per "adapter") flag signifying we are busy with reset processing and thus can't allow new I/O requests to go through to the hardware.


5.       Code should be added to call StorPortResume() when all work is complete.


6.       We should refer to the WDK-supplied LSI parallel SCSI StorPort miniport sample driver for an example of all of the above.
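
For illustration only, the ordering proposed in items 1, 2, 4 and 5 can be modelled outside the driver roughly as follows. This is a minimal sketch, not driver code: every name beginning with Sketch is hypothetical, the "paused" field merely stands in for StorPortPause()/StorPortResume(), and the StorPortSynchronizeAccess() step from item 3 is left out. The point it tries to show is that every outstanding command is completed with a bus-reset status before the reset routine returns, with no deferred DPC.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the OFA driver's real types. */
enum { SKETCH_MAX_CMDS = 8 };

typedef enum { SKETCH_PENDING, SKETCH_SUCCESS, SKETCH_BUS_RESET } SketchStatus;

typedef struct {
    bool         inUse;
    SketchStatus status;
} SketchCmd;

typedef struct {
    bool      resetInProgress;  /* item 4: per-adapter "busy with reset" flag */
    bool      paused;           /* models StorPortPause()/StorPortResume()    */
    SketchCmd cmds[SKETCH_MAX_CMDS];
} SketchAdapter;

/* Models the StartIo path: new I/O is refused while a reset is running. */
bool SketchStartIo(SketchAdapter *a, size_t slot)
{
    if (a->resetInProgress || a->paused)
        return false;
    if (slot >= SKETCH_MAX_CMDS || a->cmds[slot].inUse)
        return false;
    a->cmds[slot].inUse = true;
    a->cmds[slot].status = SKETCH_PENDING;
    return true;
}

/*
 * Models the reset-bus entry point: all work is finished before returning.
 * Returns the number of commands completed with the bus-reset status.
 */
size_t SketchResetBus(SketchAdapter *a)
{
    size_t completed = 0;

    a->paused = true;            /* item 2: hold off new requests */
    a->resetInProgress = true;   /* item 4: block the hardware path */

    /* (the controller reset itself would happen here) */

    for (size_t i = 0; i < SKETCH_MAX_CMDS; i++) {
        if (a->cmds[i].inUse) {
            a->cmds[i].status = SKETCH_BUS_RESET;  /* item 1: complete now */
            a->cmds[i].inUse = false;
            completed++;
        }
    }

    a->resetInProgress = false;
    a->paused = false;           /* item 5: resume once all work is done */
    return completed;
}
```

In this model a caller that issues two commands and then calls SketchResetBus() sees both completed with SKETCH_BUS_RESET by the time the call returns, and can submit new I/O immediately afterwards.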


Thanks,
Judy


Thanks,
Dharani.


From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Monday, February 24, 2014 3:51 PM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: Re: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Alex and Dharani,

I have been reviewing the code and performing some tests and I have some concerns about this patch.

In nvmeStd.c:
Line 1384: NVMeProcessAbortLunReset - This change will now send abort commands for all pending requests when a RESET_LOGICAL_UNIT request comes in, instead of issuing the RecoveryDpc routine.  This change concerns me the most.  During a reset there is no need to send individual abort requests for outstanding commands.  When the LUN reset comes in, we will set CC.EN to 0 and the spec clearly states that "the controller shall not process commands nor post completion queue entries to the completion queue."  This reset behavior has been accounted for in the driver, by design.  In the LUN reset case, we should continue to issue the recovery DPC routine, which will complete all outstanding commands.

What should happen here is that the new processAbortLun function should be moved under the SRB_FUNCTION_ABORT_COMMAND only.  Then the processAbortLunReset function should only send one abort and not abort all outstanding commands.

Also, during testing, I hit a D1 BSOD when I tried to step through the code.  I ran IO and forced a timeout by using the debugger to skip over the line of code that rings the submission queue doorbell.  The IO should be timed out by storport, which will then send a reset lun.

Line 2219: StorPortSynchronizeAccess - I don't understand why this is needed.  The SynchronizeReset function looks very much like the recovery DPC routine, which should already be synchronized with Start IO and the interrupt DPC.

Thanks,
Carolyn


From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Wednesday, February 19, 2014 10:06 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Thank you, Dharani.

Hi all,

Please review/test the attached reset fix patch from Sandisk and provide your feedback.

Thank you very much,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Wednesday, February 19, 2014 9:00 AM
To: Alex Chang
Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes


Content-Type: text/plain; charset=UTF-8

Content-Transfer-Encoding: 8bit

Date: %%SENT_DATE%%

Subject: Suspect Message Quarantined


WARNING: The virus scanner was unable to scan an attachment in an email message sent to you.  This attachment could possibly contain viruses or other malicious programs.  The attachment could not be scanned for the following reasons:


%%DESC%%


The full message and the attachment have been stored in the quarantine.


The identifier for this message is '%%QID%%'.


Access the quarantine at:

https://puremessage.pmc-sierra.bc.ca:28443/


For more information on PMC's Anti-Spam system:

http://pmc-intranet/wiki/index.php/Outlook:Anti-Spam_FAQ


IT Services

PureMessage Admin


Hi Alex,

The attached is the patch source for review. I have tested the I/O running over night.

Areas to focus on when testing this patch:
1. Test abort/LUN resets.
2. Test chip reset.
3. Test the format command.
4. Test the Firmware download command.

Password is "sndk1234"

Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, February 18, 2014 12:15 PM
To: Dharani Kotte
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Great!

Thanks,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Tuesday, February 18, 2014 12:14 PM
To: Alex Chang
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Just testing after merging the code; I should be able to send it tomorrow morning.
Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, February 18, 2014 12:13 PM
To: Dharani Kotte
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Hi Dharani,

Just a friendly reminder, could you please send out your patch as soon as it's ready?

Many thanks,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, February 14, 2014 10:18 AM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Sure Alex.
Dharani.

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Friday, February 14, 2014 10:17 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Good morning, Dharani,

As you may know, both Intel and Huawei patches had been added into OFA source base. Now, you may re-base your changes and send a patch out for review/test. Thank you very much for contributing the fixes.

Regards,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Wednesday, January 15, 2014 2:08 PM
To: Alex Chang; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] RE: Would you please help to resolve a few OFA NVMe driver problems ?



Hi Alex,

The attached is the source for the preliminary review. I have run the I/O and SCSI compliance tests. I don't have a drive that supports abort/LUN resets, and I am not sure how to test the format command.

Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Friday, December 20, 2013 11:54 AM
To: Dharani Kotte; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Happy Holidays to you all.
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, December 20, 2013 11:52 AM
To: Alex Chang; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Thank you for the explanation. Sure I will take look.
Happy Holidays.
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Friday, December 20, 2013 11:44 AM
To: Kwok Kong; Dharani Kotte; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Hi Dharani,

The controller reset can be issued either from the host or from the driver itself. Currently, the driver seems to handle both in the same manner, via the single entry point "NVMeResetController". In the "from the host" case, the driver needs to separate the SRB_FUNCTION_RESET_... requests from the NVME_RESET_DEVICE ioctl request with respect to handling pending IOs. In the "driver itself" case, the related error recovery code needs to be re-examined as well.
Judy from Samsung suggested referring to the storahci.sys sample driver code for Windows 7/8 for reset bus logic examples and detailed recommendations.

Thank you,
Alex


From: Kwok Kong
Sent: Friday, December 20, 2013 9:08 AM
To: Dharani Kotte; Akshay Mathur; Alex Chang
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Dharani,

Yes, these are the three areas that you are committed to.

Alex,

Please send more details on the "Controller reset does not handle all cases"  to Dharani.

Thanks

-Kwok

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, December 20, 2013 9:02 AM
To: Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Hi Kwok,

I think the items below are the ones that we are committing to:
- Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset
- Controller reset does not handle all cases
- orphaned requests

Can somebody provide a little more detail on what is expected for the item "Controller reset does not handle all cases"?

Thanks,
Dharani.


From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Thursday, December 19, 2013 6:53 PM
To: Akshay Mathur
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Excellent! Your help is much appreciated.

Dharani,

Please let me know if you have any question.

Happy holiday to all of you.

-Kwok

From: Akshay Mathur [mailto:Akshay.Mathur at sandisk.com]
Sent: Thursday, December 19, 2013 6:51 PM
To: Kwok Kong
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Kwok,
You are welcome. We are pleased to contribute to the community and appreciate you driving it!

We will try our best to complete the implementation by end of January but we may not be able to complete comprehensive testing by that time. This is because of overlaps with few internal business deliverables and a company-wide shut-down for next 1.5 weeks.

Anyway, Dharani will be in touch with you as he makes progress.
Thanks
Akshay

From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Tuesday, December 17, 2013 4:21 PM
To: Akshay Mathur
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Akshay,

Thanks for your willingness to contribute to the driver.   I am looking for a patch before end of Jan 2014, the earlier the better.
Please let me know if Sandisk can commit to that.

Your help is much appreciated.

Thanks

-Kwok

From: Akshay Mathur [mailto:Akshay.Mathur at sandisk.com]
Sent: Tuesday, December 17, 2013 4:11 PM
To: Kwok Kong
Cc: Dharani Kotte; Dave Landsman; Akshay Mathur
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Kwok,
I manage the Software and driver development team at SanDisk/ESS.
We are certainly willing to contribute to fixing the problems listed below but before we can commit, we would like to get clarification on the timeline i.e. by when these fixes are expected to be completed.
Thanks
Akshay Mathur
Sr Software Manager, Enterprise Storage Solutions
951 SanDisk Drive, Building #5  |  Milpitas, CA 95035 U.S.A.  |  Direct  +1 408.801.1336  |
Cell +1 856.607.7323  |  Corporate +1 408.801.1000  |  Akshay.Mathur at sandisk.com<mailto:Akshay.Mathur at sandisk.com>


From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Wednesday, December 11, 2013 18:00
To: Dave Landsman
Cc: Dharani Kotte
Subject: Would you please help to resolve a few OFA NVMe driver problems ?

Dave and Dharani,

There are some issues with the current OFA driver that need to be fixed. PMC is working on resolving some of the problems. Intel has agreed to work on the following two problems:
- remove #define for CHATHAM2
- Learning of CPU core to Vector failure handling

I am also making request to other companies to work on some of the issues.

I wonder if your company can work on the following three problems:
- Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset
- Controller reset does not handle all cases
- orphaned requests

Please let me know if your company can work on these three issues.

Thanks

-Kwok





From Alex.Chang at pmcs.com  Tue Apr 22 09:26:22 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Tue, 22 Apr 2014 16:26:22 +0000
Subject: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset
 Fixes
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A516B61AC50@SSIEXCH-MB3.ssi.samsung.com>
References: <E1729D5DBAB9E948BA87B76FDFA1298A0C884EEB@BBYEXM01.pmc-sierra.internal>
	<23EC73C80FB59046A6B7B8EB7B3826593BDF0B5C@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C889023@BBYEXM01.pmc-sierra.internal>
	<23EC73C80FB59046A6B7B8EB7B3826593BDF134C@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C889036@BBYEXM01.pmc-sierra.internal>
	<26455_1392829222_5304E326_26455_6404_1_23EC73C80FB59046A6B7B8EB7B3826593BDF1488@SACMBXIP01.sdcorp.global.sandisk.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C88952B@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AAF9C48@FMSMSX105.amr.corp.intel.com>
	<23EC73C80FB59046A6B7B8EB7B3826593DA6D6E0@SACMBXIP02.sdcorp.global.sandisk.com>
	<36E8D38D6B771A4BBDB1C0D800158A516B61AC50@SSIEXCH-MB3.ssi.samsung.com>
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C89A1CF@BBYEXM01.pmc-sierra.internal>

Hi Judy,

Thank you for your suggestions/comments. When reviewing/testing Dharani's patch, we believed the reset-related changes had covered most cases and paths. We're going to release revision 1.3 soon and, unless some urgent fixes are required immediately, I'd suggest we discuss further and include whatever changes the community agrees on in revision 1.4 later. Please let us know what you think.

Regards,
Alex

From: Judy Brock-SSI [mailto:judy.brock at ssi.samsung.com]
Sent: Monday, April 21, 2014 12:04 PM
To: nvmewin at openfabrics.org
Cc: Foster, Carolyn D; Dharani Kotte (Dharani.Kotte at sandisk.com); Alex Chang
Subject: RE: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Carolyn and Dharani et al,

[Carolyn wrote] Line 2219: StorPortSynchronizeAccess - I don't understand why this is needed.  The SynchronizeReset function looks very much like the recovery DPC routine, which should already be synchronized with Start IO and the interrupt DPC.
[Dharani wrote] Line 2219: StorPortSynchronizeAccess, This is the request from Samsung suggested by Judy. Below is the reference mail.

It was needed because, when the reference email was sent (7/18/2013), the OFA driver had a flag called COMPLETE_IN_DPC which controlled whether completions were handled directly in ISR context or later in DPC context. So there was a true ISR context to contend with, and the HwResetBus routine had to be synchronized with it.

One way to do that is to use StorPortSynchronizeAccess, as the legacy LSI miniport does; another way is to acquire/release the StorPortInterruptLock ourselves in our HwResetBus routine, as the storport AHCI miniport does, for example. One way or another, we needed to synchronize with our HwInterrupt routine at that time. We have since eliminated the COMPLETE_IN_DPC flag along with the path which did completions in the ISR directly (see http://lists.openfabrics.org/pipermail/nvmewin/2013-July/000608.html ), so Carolyn is correct - NVMeResetBus is currently synchronized with StartIo and the Interrupt DPC already.

However, that does not mean the original recovery DPC routine could be scheduled from NVMeResetBus as it originally was, since there is still a need to not schedule a DPC from within that routine - by definition, all work must be completed before returning.  Below please find some additional feedback on the restructured reset logic. I apologize for not having provided it in the originally requested timeframe. I hope it can be discussed and dealt with as the team decides is most convenient.

1. NVMeResetBus routine does not need to use StorPortSynchronizeAccess to synchronize with the ISR because we no longer do completions directly in ISR context as we did at the time I wrote my original recommendations for revising the HwBusReset routine.
See http://lists.openfabrics.org/pipermail/nvmewin/2013-July/000608.html in which we decided to remove the COMPLETE_IN_DPC flag which allowed the driver to switch between completions in ISR context vs completions in DPC context.  However, there is still a need to not schedule a DPC from NVMeResetBus as was the case prior to the reset patch effort. All work must be completed before returning from the call to NVMeResetBus.

2. Even if called directly, the current logic in the NVMeSynchronizeReset() routine has the same problem as the original code had in that it will not wait for all necessary work to be done before returning. After resetting the controller and completing all outstanding requests, it starts the re-initialize state machine with a call to NVMeRunningStartAttempt(). However, upon return from that call, there is no logic in place to wait for the initialization state machine to run to completion. We just fall straight through, allow IOs to resume, and return.
There needs to be logic, similar to that in NVMePassiveInitialize, which waits for pAE->DriverState.NextDriverState to become either NVMeStartComplete or NVMeStartFailed in a while loop which calls NVMeStallExecution between checks, up to some maximum amount of time.
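
As a rough illustration of the wait described in item 2, the loop could look something like the sketch below. It is a stand-alone model under stated assumptions: the Sketch* names are hypothetical, SketchStall() stands in for NVMeStallExecution() and also advances a simulated state machine so the example can run on its own, and the poll interval and maximum wait are arbitrary caller-supplied values rather than anything taken from the driver.

```c
#include <stdbool.h>

/* Hypothetical model of the driver state; not the real driver types. */
typedef enum {
    SketchStartInProgress,
    SketchStartComplete,
    SketchStartFailed
} SketchState;

typedef struct {
    SketchState nextDriverState;
    unsigned    stepsUntilDone;  /* simulation only: polls left before finishing */
    bool        willFail;        /* simulation only: final state is "failed"     */
} SketchExt;

/* Stand-in for NVMeStallExecution(); also advances the simulated state machine. */
static void SketchStall(SketchExt *e, unsigned usec)
{
    (void)usec;
    if (e->nextDriverState != SketchStartInProgress)
        return;
    if (e->stepsUntilDone > 0)
        e->stepsUntilDone--;
    if (e->stepsUntilDone == 0)
        e->nextDriverState = e->willFail ? SketchStartFailed
                                         : SketchStartComplete;
}

/*
 * Poll until the init state machine reports complete or failed, or until
 * maxUsec has elapsed. Returns true only if initialization completed.
 */
bool SketchWaitForStart(SketchExt *e, unsigned pollUsec, unsigned maxUsec)
{
    unsigned waited = 0;

    while (e->nextDriverState != SketchStartComplete &&
           e->nextDriverState != SketchStartFailed) {
        if (waited >= maxUsec)
            return false;        /* timed out: still in progress */
        SketchStall(e, pollUsec);
        waited += pollUsec;
    }
    return e->nextDriverState == SketchStartComplete;
}
```

A caller would only allow I/O to resume when the function returns true, and would treat both a failed start and a timeout as a failed reset.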

3. The Recovery DPC routine has the same problem as NVMeSynchronizeReset - there is no logic in place to wait for the initialization state machine to run to completion after the call to NVMeRunningStartAttempt() which starts it off.

4. NVMeWaitForCtrlRDY should have a return value that can be checked to see if it was successful or not. Right now, everywhere it is called the code forges ahead regardless of whether the RDY bit is in the desired state or not. Specifically, the following changes should be made (highlighted; line numbers based on the most recently circulated Intel patch):
In NvmeStd.h:
BOOLEAN NVMeWaitForCtrlRDY(
    __in PNVME_DEVICE_EXTENSION pAE,
    __in ULONG expectedValue
);
In NvmeStd.c:
Line 1978:
BOOLEAN NVMeWaitForCtrlRDY(
    __in PNVME_DEVICE_EXTENSION pAE,
    __in ULONG expectedValue
)
{
    NVMe_CONTROLLER_STATUS CSTS = {0};
    ULONG time = 0;

    CSTS.AsUlong =
        StorPortReadRegisterUlong(pAE,
                                  &pAE->pCtrlRegister->CSTS.AsUlong);
    while (CSTS.RDY != expectedValue) {
        NVMeCrashDelay(STORPORT_TIMER_CB_us, pAE->ntldrDump);
        time += STORPORT_TIMER_CB_us;
        if (time > pAE->uSecCrtlTimeout) {
            return FALSE;
        }
        CSTS.AsUlong =
            StorPortReadRegisterUlong(pAE,
                                      &pAE->pCtrlRegister->CSTS.AsUlong);
    }
    return TRUE;
}
Line 651:
if (NVMeWaitForCtrlRDY(pAE, 1) == FALSE) {
    StorPortDebugPrint(INFO,
        "NVMeInitialize: EN bit set to 1 but RDY bit set to 0\n");
    return FALSE;
}

Line 661:
if (NVMeWaitForCtrlRDY(pAE, 0) == FALSE) {
    StorPortDebugPrint(INFO,
        "NVMeInitialize: EN bit set to 0 but RDY bit won't clear- still 1\n");
    return FALSE;
}
etc.

5. NVMeCompleteCmd should have a return value that can be checked to see if it was successful or not. Right now, everywhere it is called the code forges ahead regardless of whether it succeeded or not. NVMeResetController is called from several places in the driver. One of the routines which it is called from is NVMeCompleteCmd:

VOID NVMeCompleteCmd{
. . .
    if ((pCmdEntry->Pending == FALSE) || (pCmdEntry->Context == NULL)) {
        /*
         * Something bad happened so reset the adapter and hope for the best
         */
        NVMeResetController(pAE, NULL);
        return;
    }

Since NVMeCompleteCmd has no return value, this fatal error return is never detected in any of the places that the function is called from (quite a few) - the logic just proceeds on as if everything is fine. In some cases NVMeCompleteCmd can be called over and over (if it is called from DetectPendingCmds or IoCompletionDpcRoutine for example) which may in turn cause repeated calls to NVMeResetController.
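
To make the suggestion in item 5 concrete, here is a small stand-alone sketch of a completion routine that reports failure and a caller loop that stops at the first failure instead of triggering one reset per bad entry. All Sketch* names and the simplified command entry are assumptions made for the example; the real NVMeCompleteCmd takes different parameters and does much more.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
typedef struct {
    bool  pending;
    void *context;
} SketchCmdEntry;

typedef struct {
    unsigned resetCount;  /* counts how often a reset was requested */
} SketchAdapterExt;

static void SketchResetController(SketchAdapterExt *a)
{
    a->resetCount++;
}

/* Returns false when an inconsistent entry forced a controller reset. */
bool SketchCompleteCmd(SketchAdapterExt *a, SketchCmdEntry *e)
{
    if (!e->pending || e->context == NULL) {
        SketchResetController(a);
        return false;
    }
    e->pending = false;   /* normal completion */
    return true;
}

/*
 * Caller loosely modelled on a completion loop: it stops at the first
 * failure so that a single bad entry leads to a single reset request.
 * Returns the number of entries completed successfully.
 */
size_t SketchDrainCompletions(SketchAdapterExt *a,
                              SketchCmdEntry *entries, size_t count)
{
    size_t done = 0;

    for (size_t i = 0; i < count; i++) {
        if (!SketchCompleteCmd(a, &entries[i]))
            break;
        done++;
    }
    return done;
}
```

With a VOID return, the same loop would have called the reset path once for every inconsistent entry it encountered.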

6. There is redundancy between the new routine NVMeWaitForCtrlRDY() and the routine NVMeWaitOnReady(). Although the new routine is missing a return value (see item #4), we don't need both - we can get rid of the old routine.

7. In NvmeStd.c, line 646:
Except for the first sentence, this comment is not accurate and should be removed:
/*
 * Before we transition to 0, make sure the ctrl is actually RDY
 * NOTE:  Some HW implementations may not require this wait and
 * if not then it could be removed as waiting at this IRQL is
 * not recommended.  The spec is not clear on whether we
 * need to wait for RDY to transition EN back to 0 or not.
 */
NVM Express 1.0e and beyond includes the following statement in the definition of the EN bit (emphasis added): "Setting this field from a '0' to a '1' when CSTS.RDY is a '1,' or setting this field from a '1' to a '0' when CSTS.RDY is a '0,' has undefined results."

8. The routine NVMeResetAdapter() sets CC.EN to 0 without ever checking to make sure that CSTS.RDY is set to '1' first. This check has to be included in this routine. Since it is not, there are many paths in the driver where there is no prior check for this condition:
a) NVMeInitAdminQueues -> NVMeEnableAdapter -> NVMeResetAdapter
b) NVMeNormalShutdown -> NVMeResetAdapter
c) NVMeAdapterControlPowerDown -> NVMeResetAdapter
d) NVMeSynchronizeReset -> NVMeResetAdapter
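
One possible shape for the check described in item 8 is sketched below, again as a stand-alone model rather than driver code. The register struct, the "hung" simulation flag, and the Sketch* function names are all hypothetical, and the simulated hardware simply lets RDY follow EN on the next poll. What the sketch tries to show is that EN is only cleared after RDY has been observed as 1, so that every path ending in the reset routine inherits the check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register model; not the driver's register layout. */
typedef struct {
    uint32_t en;    /* models CC.EN    */
    uint32_t rdy;   /* models CSTS.RDY */
    bool     hung;  /* simulation only: RDY never changes when true */
} SketchCtrl;

/* Poll until RDY equals expected, giving up after maxPolls polls. */
static bool SketchWaitRdy(SketchCtrl *c, uint32_t expected, unsigned maxPolls)
{
    for (unsigned polls = 0; c->rdy != expected; polls++) {
        if (polls >= maxPolls)
            return false;
        if (!c->hung)
            c->rdy = c->en;  /* simulated hardware: RDY catches up with EN */
    }
    return true;
}

/*
 * Reset sketch: clear EN only once RDY is 1, then wait for RDY to drop.
 * Returns false (leaving EN untouched) if RDY never reaches 1.
 */
bool SketchResetAdapter(SketchCtrl *c, unsigned maxPolls)
{
    if (c->en == 1) {
        if (!SketchWaitRdy(c, 1, maxPolls))
            return false;    /* a 1 -> 0 transition here would be undefined */
        c->en = 0;
    }
    return SketchWaitRdy(c, 0, maxPolls);
}
```

Because the check lives inside the reset routine itself, callers such as the four paths listed above would not each need their own RDY test; they would only need to look at the return value.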

9. In the RecoveryDpcRoutine():
a) the code does not need to set CC.EN to '0' and then wait for CSTS.RDY to become 0 because right after it does so, it calls NVMeResetAdapter which does the exact same thing.
b) is there an actual requirement for the following code?:
    /* 10 msec "settle" delay post reset */
    NVMeStallExecution(pAE, 10000);
c) is it really safe and/or required to always acquire/release the StartIo lock?

10. This is not feedback related to the Reset logic per se, but do we really need the NVMeCallArbiter() function at this point? I think we could replace all occurrences of
    NVMeCallArbiter(pAE);
with
    if (pAE->ntldrDump == FALSE) {
        StorPortNotification(RequestTimerCall,
                             pAE,
                             NVMeRunning,
                             pAE->DriverState.CheckbackInterval);
    }

Thanks,
Judy

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Dharani Kotte
Sent: Tuesday, February 25, 2014 4:28 AM
To: Foster, Carolyn D; Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: Re: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Carolyn,

Line 1384: I can take care of this item.

Line 2219: StorPortSynchronizeAccess, This is the request from Samsung suggested by Judy. Below is the reference mail.

In our testing, we create a situation where we put the NVMe driver under heavy I/O load with Iometer and then cause the device to stop responding.  This results in I/O request timeouts which eventually cause the driver to be called at its HwStorResetBus entry point (NVMeResetBus).  I have some feedback on the current architecture of that routine:


1.       Among other things, NVMeResetBus schedules a DPC to complete any pending commands. This creates a situation where upon return from this entry point, there are still cmds outstanding which don't get completed till the DPC runs.  According to the WDK, this doesn't appear to be legal - all outstanding cmds have to be completed by the HwStorResetBus routine before it returns:

HwResetBus

Pointer to the miniport driver's HwStorResetBus<ms-help://MS.WDK.v10.7600.091201/Storage_r/hh/Storage_r/stormini_b3051379-4caa-4502-9492-a21672cfbf0d.xml.htm> routine, which is a required entry point for all miniport drivers. This member has the same meaning for the Storport version of the HW_INITIALIZATION_DATA structure as it does for the SCSI Port version of the structure. For more information, see the HwResetBus member of HW_INITIALIZATION_DATA (SCSI)
and
HwScsiResetBus must complete any outstanding requests by calling ScsiPortCompleteRequest with the SrbStatus value SRB_STATUS_BUS_RESET or, for individual SRBs, ScsiPortNotification with this status value.
and
The port driver pauses all device IO queues for the adapter and then calls the HwStorResetBus routine at IRQL DISPATCH_LEVEL after acquiring the StartIo spin lock. A miniport driver is responsible for completing SRBs received by HwStorStartIo<http://msdn.microsoft.com/en-us/library/windows/hardware/ff557423(v=vs.85).aspx> for PathId during this routine and setting their status to SRB_STATUS_BUS_RESET if necessary

Since HwStorResetBus must finish its work before returning, it can't schedule a DPC to do so later on. The logic which schedules a DPC should be removed.


2.       Code should be added to call StorPortPause() to hold off any new requests till StorPortResume() is called.


3.       Code should be added to call  StorPortSynchronizeAccess() in order to synchronize with HwStorInterrupt. A callback routine in the NVMe driver should also be added for NVMeResetBus to do the synchronized work in. HwStorResetBus is already synchronized with HwStorStartIo since the port driver calls it only after acquiring the StartIo spinlock.


4.       We should implement a driver-internal global (per "adapter") flag signifying we are busy with reset processing and thus can't allow new I/O requests to go through to the hardware.


5.       Code should be added to call StorPortResume() when all work is complete.


6.       We should refer to the WDK-supplied LSI parallel SCSI StorPort miniport sample driver for an example of all of the above.


Thanks,
Judy


Thanks,
Dharani.


From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Monday, February 24, 2014 3:51 PM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: Re: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Hi Alex and Dharani,

I have been reviewing the code and performing some tests and I have some concerns about this patch.

In nvmeStd.c:
Line 1384: NVMeProcessAbortLunReset - This change will now send abort commands for all pending requests when a RESET_LOGICAL_UNIT request comes in, instead of issuing the RecoveryDpc routine.  This change concerns me the most.  During a reset there is no need to send individual abort requests for outstanding commands.  When the LUN reset comes in, we will set CC.EN to 0 and the spec clearly states that "the controller shall not process commands nor post completion queue entries to the completion queue."  This reset behavior has been accounted for in the driver, by design.  In the LUN reset case, we should continue to issue the recovery DPC routine, which will complete all outstanding commands.

What should happen here is that the new processAbortLun function should be moved under the SRB_FUNCTION_ABORT_COMMAND only.  Then the processAbortLunReset function should only send one abort and not abort all outstanding commands.

Also, during testing, I hit a D1 BSOD when I tried to step through the code.  I ran IO and forced a timeout by using the debugger to skip over the line of code that rings the submission queue doorbell.  The IO should be timed out by storport, which will then send a LUN reset.

Line 2219: StorPortSynchronizeAccess - I don't understand why this is needed.  The SynchronizeReset function looks very much like the recovery DPC routine, which should already be synchronized with Start IO and the interrupt DPC.

Thanks,
Carolyn


From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Wednesday, February 19, 2014 10:06 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] ***UNCHECKED*** FW: Re-send Sandisk Patch For Reset Fixes

Thank you, Dharani.

Hi all,

Please review/test the attached reset fix patch from Sandisk and provide your feedback.

Thank you very much,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Wednesday, February 19, 2014 9:00 AM
To: Alex Chang
Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes


Content-Type: text/plain; charset=UTF-8

Content-Transfer-Encoding: 8bit

Date: %%SENT_DATE%%

Subject: Suspect Message Quarantined


WARNING: The virus scanner was unable to scan an attachment in an email message sent to you.  This attachment could possibly contain viruses or other malicious programs.  The attachment could not be scanned for the following reasons:


%%DESC%%


The full message and the attachment have been stored in the quarantine.


The identifier for this message is '%%QID%%'.


Access the quarantine at:

https://puremessage.pmc-sierra.bc.ca:28443/


For more information on PMC's Anti-Spam system:

http://pmc-intranet/wiki/index.php/Outlook:Anti-Spam_FAQ


IT Services

PureMessage Admin


Hi Alex,

The attached is the patch source for review. I have tested the I/O running over night.

Areas to focus on when testing this patch:
1. Test abort/LUN resets.
2. Test chip reset.
3. Test the format command.
4. Test the firmware download command.

Password is "sndk1234"

Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, February 18, 2014 12:15 PM
To: Dharani Kotte
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Great!

Thanks,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Tuesday, February 18, 2014 12:14 PM
To: Alex Chang
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Just testing after merging the code; I should be able to send it tomorrow morning.
Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, February 18, 2014 12:13 PM
To: Dharani Kotte
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Hi Dharani,

Just a friendly reminder, could you please send out your patch as soon as it's ready?

Many thanks,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, February 14, 2014 10:18 AM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: RE: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Sure Alex.
Dharani.

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Friday, February 14, 2014 10:17 AM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] Re-send Sandisk Patch For Reset Fixes

Good morning, Dharani,

As you may know, both the Intel and Huawei patches have been added to the OFA source base. You may now rebase your changes and send out a patch for review/test. Thank you very much for contributing the fixes.

Regards,
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Wednesday, January 15, 2014 2:08 PM
To: Alex Chang; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: [WARNING - ENCRYPTED ATTACHMENT NOT VIRUS SCANNED] RE: Would you please help to resolve a few OFA NVMe driver problems ?


Hi Alex,

The attached is the source for the preliminary review. I have run IO tests and the SCSI compliance test. I don't have a drive that supports abort/LUN resets, and I am not sure how to test the format command.

Thanks,
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Friday, December 20, 2013 11:54 AM
To: Dharani Kotte; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Happy Holidays to you all.
Alex

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, December 20, 2013 11:52 AM
To: Alex Chang; Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Thank you for the explanation. Sure, I will take a look.
Happy Holidays.
Dharani.

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Friday, December 20, 2013 11:44 AM
To: Kwok Kong; Dharani Kotte; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Hi Dharani,

The controller reset can be issued either from the host or by the driver itself. Currently, the driver seems to handle both in the same manner via a single entry point, "NVMeResetController". In the "from the host" case, the driver needs to separate the SRB_FUNCTION_RESET_... requests from the NVME_RESET_DEVICE ioctl request in terms of how pending IOs are handled. In the "driver itself" case, the related error recovery code needs to be re-examined as well.
Judy from Samsung suggested referring to the storahci.sys sample driver code for Windows 7/8 for reset bus logic examples and detailed recommendations.

Thank you,
Alex


From: Kwok Kong
Sent: Friday, December 20, 2013 9:08 AM
To: Dharani Kotte; Akshay Mathur; Alex Chang
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Dharani,

Yes, these are the three areas that you are committed to.

Alex,

Please send more details on the "Controller reset does not handle all cases"  to Dharani.

Thanks

-Kwok

From: Dharani Kotte [mailto:Dharani.Kotte at sandisk.com]
Sent: Friday, December 20, 2013 9:02 AM
To: Kwok Kong; Akshay Mathur
Cc: Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Hi Kwok,

I think the items below are the ones we are committing to:
- Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset
- Controller reset does not handle all cases
- orphaned requests

Can somebody provide a little more detail on what is expected for the item "Controller reset does not handle all cases"?

Thanks,
Dharani.


From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Thursday, December 19, 2013 6:53 PM
To: Akshay Mathur
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Excellent! Your help is much appreciated.

Dharani,

Please let me know if you have any questions.

Happy holiday to all of you.

-Kwok

From: Akshay Mathur [mailto:Akshay.Mathur at sandisk.com]
Sent: Thursday, December 19, 2013 6:51 PM
To: Kwok Kong
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Kwok,
You are welcome. We are pleased to contribute to the community and appreciate you driving it!

We will try our best to complete the implementation by the end of January, but we may not be able to complete comprehensive testing by that time. This is because of overlaps with a few internal business deliverables and a company-wide shutdown for the next 1.5 weeks.

Anyway, Dharani will be in touch with you as he makes progress.
Thanks
Akshay

From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Tuesday, December 17, 2013 4:21 PM
To: Akshay Mathur
Cc: Dharani Kotte; Dave Landsman
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Akshay,

Thanks for your willingness to contribute to the driver.   I am looking for a patch before end of Jan 2014, the earlier the better.
Please let me know if Sandisk can commit to that.

Your help is much appreciated.

Thanks

-Kwok

From: Akshay Mathur [mailto:Akshay.Mathur at sandisk.com]
Sent: Tuesday, December 17, 2013 4:11 PM
To: Kwok Kong
Cc: Dharani Kotte; Dave Landsman; Akshay Mathur
Subject: RE: Would you please help to resolve a few OFA NVMe driver problems ?

Kwok,
I manage the Software and driver development team at SanDisk/ESS.
We are certainly willing to contribute to fixing the problems listed below but before we can commit, we would like to get clarification on the timeline i.e. by when these fixes are expected to be completed.
Thanks
Akshay Mathur
Sr Software Manager, Enterprise Storage Solutions
951 SanDisk Drive, Building #5  |  Milpitas, CA 95035 U.S.A.  |  Direct  +1 408.801.1336  |
Cell +1 856.607.7323  |  Corporate +1 408.801.1000  |  Akshay.Mathur at sandisk.com<mailto:Akshay.Mathur at sandisk.com>


From: Kwok Kong [mailto:Kwok.Kong at pmcs.com]
Sent: Wednesday, December 11, 2013 18:00
To: Dave Landsman
Cc: Dharani Kotte
Subject: Would you please help to resolve a few OFA NVMe driver problems ?

Dave and Dharani,

There are some issues with the current OFA driver that need to be fixed. PMC is working on resolving some of the problems. Intel has agreed to work on the following two problems:
- remove #define for CHATHAM2
- Learning of CPU core to Vector failure handling

I am also asking other companies to work on some of the issues.

I wonder if your company can work on the following three problems:
                - Not handling CSTS.RDY status (from 1->0 and 0->1) properly on NVMe reset
                - Controller reset does not handle all cases
                - orphaned requests

Please let me know if your company can work on these three issues.

Thanks

-Kwok


________________________________

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140422/17583c97/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 9449 bytes
Desc: image001.jpg
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140422/17583c97/attachment.jpg>

From Alex.Chang at pmcs.com  Tue Apr 22 19:52:32 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Wed, 23 Apr 2014 02:52:32 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <13527_1397685728_534EFDE0_13527_9074_1_B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>
	<13527_1397685728_534EFDE0_13527_9074_1_B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C89A259@BBYEXM01.pmc-sierra.internal>

Hi Rick,

I have finished reviewing and testing the patch. Once you have done so as well, please let us know whether you approve.

Thanks,
Alex

From: Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
Sent: Wednesday, April 16, 2014 3:00 PM
To: Alex Chang; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode


My apologies, I sent out the wrong version, thank you Alex.  I have attached the correct rebased version.

The password is still intel123

Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Wednesday, April 16, 2014 11:31 AM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Carolyn,

Did you rebase the sources before adding your changes? The patch you sent out does not seem to include what I added in Patch#24.

Thanks,
Alex

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Tuesday, April 15, 2014 4:07 PM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode


The password is intel1234

Problem statement:
The current OFA driver assumes a one-to-one mapping of MSI vectors, queues and CPU cores.  If there is not a one-to-one mapping, the driver does not go through learning mode and we see a performance drop.  Learning mode is how we maintain our source core interrupt steering, where we map MSI vectors to CPU cores.  This mapping allows the driver to issue and complete commands on the same CPU core.

Proposed changes:
Instead of giving up on learning the mapping when the numbers of MSI vectors and CPU cores differ, we proceed with learning mode as normal.  We allocate the core table for the maximum number of cores, and if any CPU cores have not been mapped by the end of learning mode, we map them to submission queues in round-robin fashion.  We also account for MSI vectors that are not mapped contiguously, and for cases where the numbers of submission queues, completion queues and cores all differ.  These changes still won't provide 100% of the source core interrupt steering functionality, but performance is better than if we don't try at all.  Most of the changes are in the initialization path; there is no change to the IO path.
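
The round-robin fallback described above can be sketched as follows. This assumes a simplified core table with one entry per core, where queue ID 0 (the admin queue in NVMe) marks a core that learning mode left unmapped; the `SketchCoreEntry` layout and function name are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Queue 0 is the NVMe admin queue, so 0 can mark "no I/O queue assigned yet". */
#define SKETCH_UNMAPPED 0u

/* Hypothetical, simplified core table entry. */
typedef struct {
    unsigned sub_queue;  /* submission queue ID used by this core */
} SketchCoreEntry;

/* Assign every still-unmapped core to an I/O submission queue, cycling
 * through queue IDs 1..num_sub_queues. Returns how many cores were filled. */
static size_t sketch_fill_unmapped(SketchCoreEntry *cores, size_t num_cores,
                                   unsigned num_sub_queues)
{
    assert(cores != NULL && num_sub_queues > 0);
    size_t filled = 0;
    unsigned next = 1;  /* I/O queue IDs start at 1 */
    for (size_t i = 0; i < num_cores; i++) {
        if (cores[i].sub_queue != SKETCH_UNMAPPED) {
            continue;   /* learning mode already mapped this core */
        }
        cores[i].sub_queue = next;
        next = (next % num_sub_queues) + 1;
        filled++;
    }
    return filled;
}
```

With six cores, two queues, and only the first two cores learned, the remaining four cores would alternate between queue 1 and queue 2.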

Also in this patch is the removal of the #defines for the CHATHAM prototype hardware.

Unit Tests:
Tested the following on Windows 7 and Windows 8 based systems.
Booted from a system with more CPU cores than MSI vectors.
IO stress on a setup with fewer IO queues than CPU cores and MSI vectors
Ran SCSI compliance tests
Ran SDStress
Ran IOmeter
Hibernate
Format (quick and slow) of MBR and GPT
Install/Uninstall, Enable/Disable


Thanks!
Carolyn

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/c95efe5e/attachment.html>

From judy.brock at ssi.samsung.com  Wed Apr 23 01:19:01 2014
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Wed, 23 Apr 2014 08:19:01 +0000
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <CAFXU465Yj3y7tcDtTmFi1Zr6n7SGti=Nu107FUH5ZArDo4hyLw@mail.gmail.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
	<2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
	<CAFXU465Yj3y7tcDtTmFi1Zr6n7SGti=Nu107FUH5ZArDo4hyLw@mail.gmail.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A516B61B577@SSIEXCH-MB3.ssi.samsung.com>

Hello,

I am wondering what the correct link is to the Windows driver SVN repo - the link on nvmexpress.org is still broken as is the one referred to in the following URL: https://www.openfabrics.org/index.php/developer-tools/nvme-windows-development.html

I've tried inserting index.php as directed below into the old link but that doesn't seem to work either.

Thanks,
Judy

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Ken Strandberg
Sent: Saturday, April 12, 2014 8:45 AM
To: Freyensee, James P
Cc: nvmewin at openfabrics.org; ewg at openfabrics.org; Steve Wise; kens at flatbed.openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such

With the recent server migration, URLs to the OFA site need /index.php/ prefixing the URI. www.openfabrics.org/URI should be changed to www.openfabrics/index.php/URI. I've sent a request to info at nvmexpress.org to update their link.

On Fri, Apr 11, 2014 at 10:10 AM, Freyensee, James P <james.p.freyensee at intel.com<mailto:james.p.freyensee at intel.com>> wrote:
Is the new NVMe Windows Driver site fully functional yet?  From the main NVM Express website:

http://www.nvmexpress.org/products/

The "Windows Driver" link is broken.

Thanks!

-----Original Message-----
From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org>] On Behalf Of Steve Wise
Sent: Monday, March 31, 2014 8:43 AM
To: kens at flatbed.openfabrics.org<mailto:kens at flatbed.openfabrics.org>; nvmewin at openfabrics.org<mailto:nvmewin at openfabrics.org>; ewg at openfabrics.org<mailto:ewg at openfabrics.org>
Subject: Re: [nvmewin] [ewg] links and such
FYI: This URL isn't working:

t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
--2014-03-31 11:00:39--  http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
Resolving www.openfabrics.org... 69.55.231.74
Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-03-31 11:00:40 ERROR 404: Not Found.


> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org<mailto:ewg-bounces at lists.openfabrics.org>
> [mailto:ewg-bounces at lists.openfabrics.org<mailto:ewg-bounces at lists.openfabrics.org>] On Behalf Of
> kens at flatbed.openfabrics.org<mailto:kens at flatbed.openfabrics.org>
> Sent: Monday, March 31, 2014 10:13 AM
> To: nvmewin at openfabrics.org<mailto:nvmewin at openfabrics.org>; ewg at openfabrics.org<mailto:ewg at openfabrics.org>
> Subject: [ewg] links and such
>
> We are migrating all web service to hardware. Some links and URLs are
> not yet working, but I am diligently trying to solve the issues. The web
> site, lists server, and mail server are running. Bugs are at
> bugs.openfabrics.org/bugzilla/. The git daemon is running, but the web
> interface is not yet up. SVN is available through a client at
> svn://flatbed.openfabrics.org. The web interface is not up yet. My goal
> is to have them running today.
>
> Thanks for your patience. And thanks to Vladimir for help in getting
> the git daemon running.
>
> Ken

_______________________________________________
nvmewin mailing list
nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
http://lists.openfabrics.org/mailman/listinfo/nvmewin


--

Ken Strandberg
Webmanager/SysAdmin
OpenFabrics Alliance
kens at openfabrics.org<mailto:kens at openfabrics.org>
www.openfabrics.org<http://www.openfabrics.org>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/ec2695c9/attachment.html>

From kens at openfabrics.org  Wed Apr 23 07:15:56 2014
From: kens at openfabrics.org (Ken Strandberg)
Date: Wed, 23 Apr 2014 07:15:56 -0700
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <36E8D38D6B771A4BBDB1C0D800158A516B61B577@SSIEXCH-MB3.ssi.samsung.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
	<2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
	<CAFXU465Yj3y7tcDtTmFi1Zr6n7SGti=Nu107FUH5ZArDo4hyLw@mail.gmail.com>
	<36E8D38D6B771A4BBDB1C0D800158A516B61B577@SSIEXCH-MB3.ssi.samsung.com>
Message-ID: <CAFXU466k+7hOOHO3tGGpz-m=_zDhDeXHLHf_y1eCavC=vZYYog@mail.gmail.com>

Hi Judy,

Wow. I am sincerely sorry I didn't fix the links on the OFA web page. The
SVN (Windows and NVMEwin) pages I missed completely. They still pointed to
the old server. I've fixed them now.

Use an SVN client, like Tortoise SVN, to browse, check out, and update the
repos. Through a web browser you can only read/check out. Have you accessed
the repos before? Do you have an account for writing to them, assuming
you're a contributor? If you don't, let me know and I'll set up your
account. If you continue to have problems, please give me a call at
775.690.6575.

Ken


On Wed, Apr 23, 2014 at 1:19 AM, Judy Brock-SSI
<judy.brock at ssi.samsung.com> wrote:

>  Hello,
>
>
>
> I am wondering what the correct link is to the Windows driver SVN repo –
> the link on nvmexpress.org is still broken as is the one referred to in
> the following URL:
> https://www.openfabrics.org/index.php/developer-tools/nvme-windows-development.html
>
>
>
> I’ve tried inserting index.php as directed below into the old link but
> that doesn’t seem to work either.
>
>
>
> Thanks,
>
> Judy
>
>
>
> *From:* nvmewin-bounces at lists.openfabrics.org [mailto:
> nvmewin-bounces at lists.openfabrics.org] *On Behalf Of *Ken Strandberg
>
> *Sent:* Saturday, April 12, 2014 8:45 AM
> *To:* Freyensee, James P
> *Cc:* nvmewin at openfabrics.org; ewg at openfabrics.org; Steve Wise;
> kens at flatbed.openfabrics.org
> *Subject:* Re: [nvmewin] [ewg] links and such
>
>
>
> With the recent server migration, URLs to the OFA site need /index.php/
> prefixing the URI. www.openfabrics.org/URI should be changed to
> www.openfabrics/index.php/URI. I've sent a request to info at nvmexpress.org to update their link.
>
>
>
> On Fri, Apr 11, 2014 at 10:10 AM, Freyensee, James P <
> james.p.freyensee at intel.com> wrote:
>
> Is the new NVMe Windows Driver site fully functional yet?  From the main
> NVM Express website:
>
> http://www.nvmexpress.org/products/
>
> The "Windows Driver" link is broken.
>
> Thanks!
>
>
> -----Original Message-----
> From: nvmewin-bounces at lists.openfabrics.org [mailto:
> nvmewin-bounces at lists.openfabrics.org] On Behalf Of Steve Wise
> Sent: Monday, March 31, 2014 8:43 AM
> To: kens at flatbed.openfabrics.org; nvmewin at openfabrics.org;
> ewg at openfabrics.org
> Subject: Re: [nvmewin] [ewg] links and such
>
> FYI: This URL isn't working:
>
> t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
> --2014-03-31 11:00:39--  http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
> Resolving www.openfabrics.org... 69.55.231.74
> Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2014-03-31 11:00:40 ERROR 404: Not Found.
>
>
> > -----Original Message-----
> > From: ewg-bounces at lists.openfabrics.org
> > [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of
> > kens at flatbed.openfabrics.org
> > Sent: Monday, March 31, 2014 10:13 AM
> > To: nvmewin at openfabrics.org; ewg at openfabrics.org
> > Subject: [ewg] links and such
> >
> > We are migrating all web service to hardware. Some links and URLs are
> > not yet working, but I am diligently trying to solve the issues. The web
> > site, lists server, and mail server are running. Bugs are at
> > bugs.openfabrics.org/bugzilla/. The git daemon is running, but the web
> > interface is not yet up. SVN is available through a client at
> > svn://flatbed.openfabrics.org. The web interface is not up yet. My goal
> > is to have them running today.
> >
> > Thanks for your patience. And thanks to Vladimir for help in getting
> > the git daemon running.
> >
> > Ken
>
> _______________________________________________
> nvmewin mailing list
> nvmewin at lists.openfabrics.org
> http://lists.openfabrics.org/mailman/listinfo/nvmewin
>
>
>
>
>
> --
>
>
>
> *Ken Strandberg*
>
> *Webmanager/SysAdmin*
>
> *OpenFabrics Alliance*
>
> kens at openfabrics.org
>
> www.openfabrics.org
>


-- 


*Ken Strandberg*
*Webmanager/SysAdmin*
*OpenFabrics Alliance*
kens at openfabrics.org
www.openfabrics.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/637a3500/attachment.html>

From carolyn.d.foster at intel.com  Wed Apr 23 10:46:36 2014
From: carolyn.d.foster at intel.com (Foster, Carolyn D)
Date: Wed, 23 Apr 2014 17:46:36 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <E1729D5DBAB9E948BA87B76FDFA1298A0C89A259@BBYEXM01.pmc-sierra.internal>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>
	<13527_1397685728_534EFDE0_13527_9074_1_B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C89A259@BBYEXM01.pmc-sierra.internal>
Message-ID: <B3A485AFDDB1DD4598621E85E8EB67A83AB3575B@FMSMSX105.amr.corp.intel.com>

Hello Alex and Rick,

We found an issue with this patch: it was possible for the number of MSI messages granted to be 0, in which case the number of IO queues would also be set to 0.  I have made a small change to fix this in the attached zip file.
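
One way to guard against the zero-queue case described above is sketched below. The actual change in the attached patch is not shown in this thread, so this is only an assumed illustration: the function name and parameters are hypothetical, and the rule used here is simply "take the smallest of the three limits, but never fewer than one IO queue".

```c
#include <assert.h>

/* Hypothetical helper: pick the number of IO queues from the core count,
 * the number of MSI messages granted, and the number of queues the
 * controller allows. Falls back to a single queue if the minimum is 0,
 * which happens when no MSI messages were granted. */
static unsigned sketch_io_queue_count(unsigned cores, unsigned msi_granted,
                                      unsigned queues_allowed)
{
    assert(cores > 0 && queues_allowed > 0);
    unsigned n = cores;
    if (queues_allowed < n) {
        n = queues_allowed;
    }
    if (msi_granted < n) {
        n = msi_granted;
    }
    if (n == 0) {
        n = 1;  /* assumption: one shared IO queue is better than none */
    }
    return n;
}
```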

The password is intel123

Thanks,
Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, April 22, 2014 7:53 PM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Rick,

I have finished reviewing and testing the patch. Once you have done so as well, please let us know whether you approve.

Thanks,
Alex

From: Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
Sent: Wednesday, April 16, 2014 3:00 PM
To: Alex Chang; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: RE: NVMe OFA patch for CPU Learning mode


My apologies, I sent out the wrong version, thank you Alex.  I have attached the correct rebased version.

The password is still intel123

Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Wednesday, April 16, 2014 11:31 AM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Carolyn,

Did you rebase the sources before adding your changes? The patch you sent out does not seem to include what I added in Patch#24.

Thanks,
Alex

From: nvmewin-bounces at lists.openfabrics.org<mailto:nvmewin-bounces at lists.openfabrics.org> [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Tuesday, April 15, 2014 4:07 PM
To: nvmewin at lists.openfabrics.org<mailto:nvmewin at lists.openfabrics.org>
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode


The password is intel1234

Problem statement:
The current OFA driver assumes a one-to-one mapping of MSI vectors, queues and CPU cores.  If there is not a one-to-one mapping, the driver does not go through learning mode and we see a performance drop.  Learning mode is how we maintain our source core interrupt steering, where we map MSI vectors to CPU cores.  This mapping allows the driver to issue and complete commands on the same CPU core.

Proposed changes:
Instead of giving up on learning the mapping when the numbers of MSI vectors and CPU cores differ, we proceed with learning mode as normal.  We allocate the core table for the maximum number of cores, and if any CPU cores have not been mapped by the end of learning mode, we map them to submission queues in round-robin fashion.  We also account for MSI vectors that are not mapped contiguously, and for cases where the numbers of submission queues, completion queues and cores all differ.  These changes still won't provide 100% of the source core interrupt steering functionality, but performance is better than if we don't try at all.  Most of the changes are in the initialization path; there is no change to the IO path.

Also in this patch is the removal of the #defines for the CHATHAM prototype hardware.

Unit Tests:
Tested the following on Windows 7 and Windows 8 based systems.
Booted from a system with more CPU cores than MSI vectors.
IO stress on a setup with fewer IO queues than CPU cores and MSI vectors
Ran SCSI compliance tests
Ran SDStress
Ran IOmeter
Hibernate
Format (quick and slow) of MBR and GPT
Install/Uninstall, Enable/Disable


Thanks!
Carolyn

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/e4803fb8/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: IntelCPUPatch_v2_04232014.zip
Type: application/x-zip-compressed
Size: 172408 bytes
Desc: IntelCPUPatch_v2_04232014.zip
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/e4803fb8/attachment.bin>

From Rick.Knoblaugh at lsi.com  Wed Apr 23 10:50:36 2014
From: Rick.Knoblaugh at lsi.com (Knoblaugh, Rick)
Date: Wed, 23 Apr 2014 17:50:36 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <B3A485AFDDB1DD4598621E85E8EB67A83AB3575B@FMSMSX105.amr.corp.intel.com>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>
	<13527_1397685728_534EFDE0_13527_9074_1_B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C89A259@BBYEXM01.pmc-sierra.internal>
	<B3A485AFDDB1DD4598621E85E8EB67A83AB3575B@FMSMSX105.amr.corp.intel.com>
Message-ID: <d316f93a821f48a98837ece3438b6c2c@DM2PR07MB285.namprd07.prod.outlook.com>

Hi Carolyn,
                           Thanks for catching that. I should hopefully be done with approval by EOD.

Greatly appreciate it.

               -Rick

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Foster, Carolyn D
Sent: Wednesday, April 23, 2014 10:47 AM
To: Alex Chang; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] NVMe OFA patch for CPU Learning mode

Hello Alex and Rick, we found an issue with this patch: the number of MSI messages granted could be 0, in which case the number of IO queues would also be set to 0.  I have made a small change to fix this in the attached zip file.
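
A guard of the kind described above might look like the following. This is a hedged sketch only: the actual change is in the attached zip and is not shown in this thread. The function name pick_io_queue_count and its parameters are hypothetical, and the policy (ignore the message count when no messages were granted, and never return fewer than one I/O queue) is an assumption made for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: choose how many I/O queues to create.
 *   cores            - number of active CPU cores (must be > 0)
 *   msi_granted      - MSI messages the OS granted (may be 0)
 *   queues_supported - I/O queues the controller allows
 */
unsigned pick_io_queue_count(unsigned cores, unsigned msi_granted,
                             unsigned queues_supported)
{
    unsigned n = cores;

    assert(cores > 0);

    /* Only let the message count limit the queues if any were granted. */
    if (msi_granted > 0 && msi_granted < n)
        n = msi_granted;

    if (queues_supported < n)
        n = queues_supported;

    /* Assumption: always keep at least one I/O queue. */
    if (n == 0)
        n = 1;

    return n;
}
```

For example, with 8 cores, 0 messages granted and 4 queues supported, the sketch returns 4 rather than 0.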

The password is intel123

Thanks,
Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Tuesday, April 22, 2014 7:53 PM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Rick,

I have finished reviewing and testing the patch. When you have done so as well, please let us know your approval.

Thanks,
Alex

From: Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
Sent: Wednesday, April 16, 2014 3:00 PM
To: Alex Chang; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode


My apologies, I sent out the wrong version, thank you Alex.  I have attached the correct rebased version.

The password is still intel123

Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Wednesday, April 16, 2014 11:31 AM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Carolyn,

Did you re-base the sources before adding your changes? The patch you sent out does not seem to include what I added in Patch#24.

Thanks,
Alex


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/2cc40a76/attachment.html>

From judy.brock at ssi.samsung.com  Wed Apr 23 14:16:13 2014
From: judy.brock at ssi.samsung.com (Judy Brock-SSI)
Date: Wed, 23 Apr 2014 21:16:13 +0000
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <CAFXU466k+7hOOHO3tGGpz-m=_zDhDeXHLHf_y1eCavC=vZYYog@mail.gmail.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
	<2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
	<CAFXU465Yj3y7tcDtTmFi1Zr6n7SGti=Nu107FUH5ZArDo4hyLw@mail.gmail.com>
	<36E8D38D6B771A4BBDB1C0D800158A516B61B577@SSIEXCH-MB3.ssi.samsung.com>
	<CAFXU466k+7hOOHO3tGGpz-m=_zDhDeXHLHf_y1eCavC=vZYYog@mail.gmail.com>
Message-ID: <36E8D38D6B771A4BBDB1C0D800158A516B61B798@SSIEXCH-MB3.ssi.samsung.com>

Hi Ken,

Thanks so much for fixing the links – it’s all working now.

To date, I have not had a need to write to the repos so I don’t need an account – I only browse/checkout. If that changes, I will request an account from you.

Thanks again,

Judy

From: ken.l.strandberg at gmail.com [mailto:ken.l.strandberg at gmail.com] On Behalf Of Ken Strandberg
Sent: Wednesday, April 23, 2014 7:16 AM
To: Judy Brock-SSI
Cc: Ken Strandberg; Freyensee, James P; nvmewin at openfabrics.org; ewg at openfabrics.org; Steve Wise; kens at flatbed.openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such

Hi Judy,

Wow. I am sincerely sorry I didn't fix the links on the OFA web page. The SVN (Windows and NVMEwin) pages I missed completely. They still pointed to the old server. I've fixed them now.

Use an SVN client, like Tortoise SVN, to browse, check out, and update the repos. Through a web browser you can only read/check out. Have you accessed the repos before? Do you have an account for writing to them, assuming you're a contributor? If you don't, let me know and I'll set up your account. If you continue to have problems, please give me a call at 775.690.6575.

Ken

On Wed, Apr 23, 2014 at 1:19 AM, Judy Brock-SSI <judy.brock at ssi.samsung.com> wrote:
Hello,

I am wondering what the correct link is to the Windows driver SVN repo – the link on nvmexpress.org is still broken, as is the one referred to on the following page: https://www.openfabrics.org/index.php/developer-tools/nvme-windows-development.html

I’ve tried inserting index.php as directed below into the old link but that doesn’t seem to work either.

Thanks,
Judy

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Ken Strandberg
Sent: Saturday, April 12, 2014 8:45 AM
To: Freyensee, James P
Cc: nvmewin at openfabrics.org; ewg at openfabrics.org; Steve Wise; kens at flatbed.openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such

With the recent server migration, URLs to the OFA site need /index.php/ prefixing the URI. www.openfabrics.org/URI should be changed to www.openfabrics.org/index.php/URI. I've sent a request to info at nvmexpress.org to update their link.
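
The rewrite described above can be sketched as a small string helper. This is a minimal illustration, assuming the rule is exactly "insert /index.php in front of the path"; the function name add_index_php is hypothetical, and other messages in this thread report that some links (the SVN and download URLs) did not work even with the prefix, so the rule may not apply to every path on the server.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy url into out with "/index.php" inserted in
 * front of the path. URLs that already carry the prefix are copied
 * unchanged. Returns 0 on success, -1 if out is too small.
 */
int add_index_php(const char *url, char *out, size_t out_len)
{
    static const char prefix[] = "/index.php";
    const size_t plen = sizeof prefix - 1;
    const char *host;
    const char *path;
    int n;

    assert(url != NULL && out != NULL);

    /* Skip an optional scheme such as "http://". */
    host = strstr(url, "://");
    host = (host != NULL) ? host + 3 : url;

    /* The path starts at the first '/' after the host name, if any. */
    path = strchr(host, '/');
    if (path == NULL)
        path = host + strlen(host);

    if (strncmp(path, prefix, plen) == 0 &&
        (path[plen] == '/' || path[plen] == '\0')) {
        n = snprintf(out, out_len, "%s", url);          /* already prefixed */
    } else {
        n = snprintf(out, out_len, "%.*s%s%s",
                     (int)(path - url), url, prefix, path);
    }
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Given "www.openfabrics.org/URI", the helper writes "www.openfabrics.org/index.php/URI" into the output buffer.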

On Fri, Apr 11, 2014 at 10:10 AM, Freyensee, James P <james.p.freyensee at intel.com> wrote:
Is the new NVMe Windows Driver site fully functional yet?  From the main NVM Express website:

http://www.nvmexpress.org/products/

The "Windows Driver" link is broken.

Thanks!

-----Original Message-----
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Steve Wise
Sent: Monday, March 31, 2014 8:43 AM
To: kens at flatbed.openfabrics.org; nvmewin at openfabrics.org; ewg at openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such
FYI: This URL isn't working:

t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
--2014-03-31 11:00:39--  http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
Resolving www.openfabrics.org... 69.55.231.74
Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-03-31 11:00:40 ERROR 404: Not Found.


> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org
> [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of
> kens at flatbed.openfabrics.org
> Sent: Monday, March 31, 2014 10:13 AM
> To: nvmewin at openfabrics.org; ewg at openfabrics.org
> Subject: [ewg] links and such
>
> We are migrating all web service to hardware. Some links and urls are
> not yet working, but I am diligently trying to solve the issues. The
> web site, lists server, and mail server are running. Bugs are at
> bugs.openfabrics.org/bugzilla/. The git daemon is running, but the web
> interface is not yet up. SVN is available through a client at
> svn://flatbed.openfabrics.org. The web interface is not up yet. My
> goal is to have them running today.
>
> Thanks for your patience. And thanks to Vladimir for help in getting
> the git daemon running.
>
> Ken

_______________________________________________
nvmewin mailing list
nvmewin at lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/nvmewin


--

Ken Strandberg
Webmanager/SysAdmin
OpenFabrics Alliance
kens at openfabrics.org
www.openfabrics.org


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/cf7b53d4/attachment.html>

From Rick.Knoblaugh at lsi.com  Wed Apr 23 15:53:53 2014
From: Rick.Knoblaugh at lsi.com (Knoblaugh, Rick)
Date: Wed, 23 Apr 2014 22:53:53 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <E1729D5DBAB9E948BA87B76FDFA1298A0C89A259@BBYEXM01.pmc-sierra.internal>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>
	<13527_1397685728_534EFDE0_13527_9074_1_B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C89A259@BBYEXM01.pmc-sierra.internal>
Message-ID: <2cabccce11ad439c920e715a7abef945@DM2PR07MB285.namprd07.prod.outlook.com>

Hi Alex,
                     We approve the patch as well. Thanks.

   -Rick

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Tuesday, April 22, 2014 7:53 PM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] NVMe OFA patch for CPU Learning mode

Hi Rick,

I have finished reviewing and testing the patch. When you have done so as well, please let us know your approval.

Thanks,
Alex

From: Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
Sent: Wednesday, April 16, 2014 3:00 PM
To: Alex Chang; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode


My apologies, I sent out the wrong version, thank you Alex.  I have attached the correct rebased version.

The password is still intel123

Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Wednesday, April 16, 2014 11:31 AM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Carolyn,

Did you re-base the sources before adding your changes? The patch you sent out does not seem to include what I added in Patch#24.

Thanks,
Alex


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/55bef8ac/attachment.html>

From Alex.Chang at pmcs.com  Wed Apr 23 15:56:05 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Wed, 23 Apr 2014 22:56:05 +0000
Subject: [nvmewin] NVMe OFA patch for CPU Learning mode
In-Reply-To: <2cabccce11ad439c920e715a7abef945@DM2PR07MB285.namprd07.prod.outlook.com>
References: <23542_1397603272_534DBBC8_23542_9879_1_B3A485AFDDB1DD4598621E85E8EB67A83AB28A73@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C892952@BBYEXM01.pmc-sierra.internal>
	<13527_1397685728_534EFDE0_13527_9074_1_B3A485AFDDB1DD4598621E85E8EB67A83AB290F1@FMSMSX105.amr.corp.intel.com>
	<E1729D5DBAB9E948BA87B76FDFA1298A0C89A259@BBYEXM01.pmc-sierra.internal>
	<2cabccce11ad439c920e715a7abef945@DM2PR07MB285.namprd07.prod.outlook.com>
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C89A317@BBYEXM01.pmc-sierra.internal>

Thanks, Rick.

Alex

From: Knoblaugh, Rick [mailto:Rick.Knoblaugh at lsi.com]
Sent: Wednesday, April 23, 2014 3:54 PM
To: Alex Chang; Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Alex,
                     We approve the patch as well. Thanks.

   -Rick

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Alex Chang
Sent: Tuesday, April 22, 2014 7:53 PM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: Re: [nvmewin] NVMe OFA patch for CPU Learning mode

Hi Rick,

I have finished reviewing and testing the patch. When you have done so as well, please let us know your approval.

Thanks,
Alex

From: Foster, Carolyn D [mailto:carolyn.d.foster at intel.com]
Sent: Wednesday, April 16, 2014 3:00 PM
To: Alex Chang; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode


My apologies, I sent out the wrong version, thank you Alex.  I have attached the correct rebased version.

The password is still intel123

Carolyn

From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Wednesday, April 16, 2014 11:31 AM
To: Foster, Carolyn D; nvmewin at lists.openfabrics.org
Subject: RE: NVMe OFA patch for CPU Learning mode

Hi Carolyn,

Did you re-base the sources before adding your changes? The patch you sent out does not seem to include what I added in Patch#24.

Thanks,
Alex


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/8bfbe205/attachment.html>

From Alex.Chang at pmcs.com  Wed Apr 23 16:14:47 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Wed, 23 Apr 2014 23:14:47 +0000
Subject: [nvmewin] NVMe Windows DB Is LOCKED - Pushing Patch From Intel For
 CPU-MSI Map Learning Fixes
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C89A335@BBYEXM01.pmc-sierra.internal>

Locking NVMe Windows DB.

Thanks,
Alex

nvmewin mailing list
nvmewin at lists.openfabrics.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140423/78738974/attachment.html>

From Alex.Chang at pmcs.com  Thu Apr 24 11:31:39 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Thu, 24 Apr 2014 18:31:39 +0000
Subject: [nvmewin] NVMe Windows DB Is UNLOCKED - Pushing Patch From Intel
 For CPU-MSI Map Learning Fixes
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C89A3E0@BBYEXM01.pmc-sierra.internal>

Hi all,

Thank you for reviewing/testing the patch from Intel.
The patch has been pushed into the source base and a new tag called "Patch#25_CPU_MSI_Map_Learning_Fix" has been created under the tags directory.
Now all the planned patches are in, and I will have a new release out as revision 1.3 soon.
Should you have any questions, please reply to the email listed below.

Thanks,
Alex
nvmewin mailing list
nvmewin at lists.openfabrics.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140424/cd4c7e7b/attachment.html>

From Alex.Chang at pmcs.com  Fri Apr 25 12:09:13 2014
From: Alex.Chang at pmcs.com (Alex Chang)
Date: Fri, 25 Apr 2014 19:09:13 +0000
Subject: [nvmewin]  NVMe Windows Driver Released As Revision 1.3
Message-ID: <E1729D5DBAB9E948BA87B76FDFA1298A0C89A46D@BBYEXM01.pmc-sierra.internal>

Hi all,

I am pleased to announce that the new release is ready for the community. Thank you all for your effort and contributions. You will find the entire package under the releases\revision_1.3 directory after updating from the repository. As with the previous release, there is a readme.txt that details what the release includes, the changes added, etc. If you're new to the driver, please refer to the readme.txt first.
Under the installations directory, both 32- and 64-bit installation packages are included for your convenience. You will also find Visual Studio 2012 solution/project files under the nvme directory; with these files, you can enter the pre-configured development environment for your target operating system(s).

To summarize the changes this release includes:
- Hibernation support.
- NUMA Group support.
- SRB Extension support.
- Proper NVMe Reset handling/checking.
- CPU-MSI map learning rework.
- Logical core enumeration rework.
- 32-bit driver support for Windows 7/8.
- Treating LBA Range Type as optional in driver initialization.
- Reset handling rework.
- PRP list construction fix for driver-initiated requests.
- FreeQList access synchronization fix.
- Removal of CHATHAM-related code.

Should you have any questions, please reply to the email listed below.

Thanks,
Alex
nvmewin mailing list
nvmewin at lists.openfabrics.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20140425/330acc31/attachment.html>

From james.p.freyensee at intel.com  Fri Apr 25 12:29:07 2014
From: james.p.freyensee at intel.com (Freyensee, James P)
Date: Fri, 25 Apr 2014 19:29:07 +0000
Subject: [nvmewin] [ewg] links and such
In-Reply-To: <CAFXU466k+7hOOHO3tGGpz-m=_zDhDeXHLHf_y1eCavC=vZYYog@mail.gmail.com>
References: <1396278777.57364@flatbed.openfabrics.org>
	<001501cf4cf7$da2a44b0$8e7ece10$@opengridcomputing.com>
	<2D98093777D3FD46A36253F35FE9D693997C7A58@ORSMSX109.amr.corp.intel.com>
	<CAFXU465Yj3y7tcDtTmFi1Zr6n7SGti=Nu107FUH5ZArDo4hyLw@mail.gmail.com>
	<36E8D38D6B771A4BBDB1C0D800158A516B61B577@SSIEXCH-MB3.ssi.samsung.com>
	<CAFXU466k+7hOOHO3tGGpz-m=_zDhDeXHLHf_y1eCavC=vZYYog@mail.gmail.com>
Message-ID: <2D98093777D3FD46A36253F35FE9D6939A1AA278@ORSMSX115.amr.corp.intel.com>

Ken:

I have a couple questions.


1. On this website, http://www.nvmexpress.org/products/, the Windows driver link is still broken.  It points to this URL, which is broken:

https://www.openfabrics.org/resources/developer-tools/nvme-windows-development.html


2. SVN question.  The page https://www.openfabrics.org/index.php/developer-tools/windows-tools.html says:

The repository can be browsed with a Web browser at https://www.openfabrics.org/svnrepo/ofw. This is a view/checkout-only repository with the browser.

That link behaves differently than it did before the website/repo migration.  Before, I could use a web browser and it would show me the same directories I would see if I had pulled the code to my computer.  I definitely did not see the SVN directories “conf”, “db”, “gen1”, etc. that I see now when I use my web browser to browse the directory structure and look for single files.

Is this expected behavior?  It didn’t work this way before.

Thanks,
Jay


From: ken.l.strandberg at gmail.com [mailto:ken.l.strandberg at gmail.com] On Behalf Of Ken Strandberg
Sent: Wednesday, April 23, 2014 7:16 AM
To: Judy Brock-SSI
Cc: Ken Strandberg; Freyensee, James P; nvmewin at openfabrics.org; ewg at openfabrics.org; Steve Wise; kens at flatbed.openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such

Hi Judy,

Wow. I am sincerely sorry I didn't fix the links on the OFA web page. The SVN (Windows and NVMEwin) pages I missed completely. They still pointed to the old server. I've fixed them now.

Use an SVN client, like Tortoise SVN, to browse, check out, and update the repos. Through a web browser you can only read/check out. Have you accessed the repos before? Do you have an account for writing to them, assuming you're a contributor? If you don't, let me know and I'll set up your account. If you continue to have problems, please give me a call at 775.690.6575.

Ken

On Wed, Apr 23, 2014 at 1:19 AM, Judy Brock-SSI <judy.brock at ssi.samsung.com> wrote:
Hello,

I am wondering what the correct link is to the Windows driver SVN repo – the link on nvmexpress.org is still broken, as is the one referred to on the following page: https://www.openfabrics.org/index.php/developer-tools/nvme-windows-development.html

I’ve tried inserting index.php as directed below into the old link but that doesn’t seem to work either.

Thanks,
Judy

From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Ken Strandberg
Sent: Saturday, April 12, 2014 8:45 AM
To: Freyensee, James P
Cc: nvmewin at openfabrics.org; ewg at openfabrics.org; Steve Wise; kens at flatbed.openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such

With the recent server migration, URLs to the OFA site need /index.php/ prefixing the URI. www.openfabrics.org/URI should be changed to www.openfabrics.org/index.php/URI. I've sent a request to info at nvmexpress.org to update their link.
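
As an illustration of the prefixing rule described above, here is a minimal sketch in Python. The function name add_index_php is hypothetical, and the sketch assumes the rule applies to ordinary site pages only; other messages in this thread suggest that repository and download links may not follow it.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the path prefix named in the message above.
PREFIX = "/index.php"


def add_index_php(url: str) -> str:
    """Insert /index.php ahead of the URL path unless it is already there."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Leave URLs that already carry the prefix untouched.
    if path == PREFIX or path.startswith(PREFIX + "/"):
        return url
    return urlunsplit(parts._replace(path=PREFIX + path))


if __name__ == "__main__":
    print(add_index_php(
        "https://www.openfabrics.org/developer-tools/windows-tools.html"))
```

The function is idempotent, so running it over a list of links that mixes old and already-updated URLs should be safe.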

On Fri, Apr 11, 2014 at 10:10 AM, Freyensee, James P <james.p.freyensee at intel.com> wrote:
Is the new NVMe Windows Driver site fully functional yet?  From the main NVM Express website:

http://www.nvmexpress.org/products/

The "Windows Driver" link is broken.

Thanks!

-----Original Message-----
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Steve Wise
Sent: Monday, March 31, 2014 8:43 AM
To: kens at flatbed.openfabrics.org; nvmewin at openfabrics.org; ewg at openfabrics.org
Subject: Re: [nvmewin] [ewg] links and such

FYI: This URL isn't working:

t4:~ # wget www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
--2014-03-31 11:00:39--  http://www.openfabrics.org/downloads/OFED/ofed-3.12-daily/latest.tgz
Resolving www.openfabrics.org... 69.55.231.74
Connecting to www.openfabrics.org|69.55.231.74|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-03-31 11:00:40 ERROR 404: Not Found.


> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org
> [mailto:ewg-bounces at lists.openfabrics.org] On Behalf Of
> kens at flatbed.openfabrics.org
> Sent: Monday, March 31, 2014 10:13 AM
> To: nvmewin at openfabrics.org; ewg at openfabrics.org
> Subject: [ewg] links and such
>
> We are migrating all web service to hardware. Some links and urls are
> not yet working, but I am diligently trying to solve the issues. The web
> site, lists server, and mail server are running.
> Bugs are at bugs.openfabrics.org/bugzilla/. The git daemon is running,
> but the web interface is not yet up. SVN is available through a client at
> svn://flatbed.openfabrics.org. The web interface is not up yet. My goal
> is to have them running today.
>
> Thanks for your patience. And thanks to Vladimir for help in getting
> the git daemon running.
>
> Ken

_______________________________________________
nvmewin mailing list
nvmewin at lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/nvmewin


--

Ken Strandberg
Webmanager/SysAdmin
OpenFabrics Alliance
kens at openfabrics.org
www.openfabrics.org

