[nvmewin] review comments for patch from Google

Iuliu Rus iuliur at google.com
Thu Jun 16 14:10:01 PDT 2016


We already have a pool of locks - the DPC locks, which is what I'm using in
my latest patch. You cannot create spinlocks directly in a Storport miniport
driver; the only options are the StartIo lock, the DPC locks, the
interrupt lock, or one of the MSI locks.

On Thu, Jun 16, 2016 at 2:04 PM, Robles, Raymond C <
raymond.c.robles at intel.com> wrote:

> Hello,
>
>
>
> Here are my thoughts…
>
>
>
> In short, I agree with Thomas’s assessment.
>
>
>
> Concurrent channels allows Storport to call StartIo concurrently… and with
> no locks. Therefore the locking of critical sections is now the
> responsibility of the miniport.
>
>
>
> Any solution proposed should protect critical resources (cmd id pool, cmd
> entries, doorbells) with a lock that is granular enough for our driver, but
> not at the visibility of the OS or Storport. Acquiring the StartIo lock to
> protect resources in our driver only is too broad of a lock. My suggestion
> would be to create a pool of locks (one for every SQ/CQ pair) that is
> maintained within our driver. Again, crashdump mode is a non-issue because
> it is single threaded.
>
>
>
> Thank you…
>
> Ray
>
>
>
> *From:* Thomas Freeman [mailto:thomas.freeman at hgst.com]
> *Sent:* Thursday, June 16, 2016 1:51 PM
> *To:* Iuliu Rus <iuliur at google.com>
> *Cc:* Robles, Raymond C <raymond.c.robles at intel.com>; nvmewin <
> nvmewin at lists.openfabrics.org>
> *Subject:* RE: [nvmewin] review comments for patch from Google
>
>
>
> Hi Iuliu,
>
> Here is my view; I hope someone else can chime in to verify or correct what
> I'm saying.
>
>
>
> The purpose of the concurrent channels change is to prevent Storport from
> acquiring the StartIo lock before calling StartIo. This allows multiple
> simultaneous calls to StartIo. If you require that ProcessIo acquire the
> StartIo lock, you are replicating what Storport used to do, which defeats
> the point of concurrent channels.
>
>
>
> Here is my suggestion for what needs to be changed
>
> 1. There is no need for StartIo Locks - remove the requests to acquire them
>
> 2. Acquire the Dpc locks in the following situations
>
>    -ProcessIo, before calling GetCmdEntry - if there is queue sharing
>
>    -ProcessIo, before calling IssueCmd - always
>
>    -IoCompletionRoutine, before calling NVMeCompleteCmd - if there is
> queue sharing
>
>
>
>    -recoveryDpcRoutine - get all of the Dpc locks
>
>
>
>
>
> Here is my reasoning
>
> -GetCmdEntry and NVMeCompleteCmd both access the FreeQList. With queue
> sharing, these could run simultaneously and have a collision when they
> update the list.
>
> -IssueCmd. With queue sharing, different CPUs could attempt to put
> entries on the queue simultaneously. But, to make RecoveryDpcRoutine work,
> this must be protected by the dpcLock whether or not there is queue
> sharing.
>
>
>
> I think this will also address the deadlock problem I reported.
>
>
>
> Regards,
>
> *Tom Freeman*
>
> *Software Engineer, Device Manager and Driver Development*
>
> Western Digital Corporation
>
> e.  Thomas.freeman at hgst.com
>
> o.  +1-507-322-2311
>
>
>
>
>
> *From:* Iuliu Rus [mailto:iuliur at google.com <iuliur at google.com>]
> *Sent:* Thursday, June 16, 2016 2:42 PM
> *To:* Thomas Freeman <thomas.freeman at hgst.com>
> *Cc:* Robles, Raymond C <raymond.c.robles at intel.com>; nvmewin <
> nvmewin at lists.openfabrics.org>
> *Subject:* Re: [nvmewin] review comments for patch from Google
>
>
>
> Here's my revised patch for concurrent channels. The synchronization rules
> are now as follows:
>
> 1. Init is single threaded, so it doesn't matter whether AcquireLock is
> TRUE or FALSE.
>
> 2. Anything called from the DPC has to call ProcessIo with FALSE, since
> the DPC spinlock is already acquired.
>
> 3. Anything called from StartIo has to call ProcessIo with TRUE, since the
> StartIo lock is no longer automatically acquired.
>
>
>
> Thomas, I think the bug that you found is because ProcessIo was
> incorrectly called with TRUE instead of FALSE.
>
> Let me know your thoughts on this.
>
>
>
> On Thu, Jun 16, 2016 at 10:41 AM, Iuliu Rus <iuliur at google.com> wrote:
>
> Guys, am I OK continuing with the patch on the current code base, or do you
> want me to wait until you pull in some more changes?
>
>
>
> On Wed, Jun 15, 2016 at 8:37 AM, Thomas Freeman <thomas.freeman at hgst.com>
> wrote:
>
> Ray,
>
> It appears that the latest patch file from Iuliu is applied to revision
> 137. I don’t see Suman’s changes in that revision.
>
>
>
> As an example, the use of StorPortRequestTimer is not in nvmeStat.c.
>
>
>
> I thought there were some newer revisions than 137, but 137 is the latest
> I’m able to download.
>
> 137 is also the revision that appears when I go to
> https://svn.openfabrics.org/svnrepo/nvmewin/
>
>
>
> SVN is giving me some troubles, so I can’t tell for sure if this is my
> environment or a server issue.
>
>
>
> *Tom Freeman*
>
> *Software Engineer, Device Manager and Driver Development*
>
> Western Digital Corporation
>
> e.  Thomas.freeman at hgst.com
>
> o.  +1-507-322-2311
>
>
>
>
>
> *From:* nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] *On Behalf
> Of *Robles, Raymond C
> *Sent:* Friday, June 10, 2016 12:23 PM
> *To:* Iuliu Rus <iuliur at google.com>; suman.p at samsung.com
> *Cc:* nvmewin <nvmewin at lists.openfabrics.org>
>
>
> *Subject:* Re: [nvmewin] review comments for patch from Google
>
>
>
> Hi Iuliu,
>
>
>
> If you send out the updated patch with the protection in place for both
> list entries and doorbells, I can run it on an Intel NVMe SSD (P3700) and
> see if there is any performance impact (up or down).
>
>
>
> Thank you…
>
> Ray
>
>
>
> *From:* Iuliu Rus [mailto:iuliur at google.com <iuliur at google.com>]
> *Sent:* Friday, June 10, 2016 10:17 AM
> *To:* suman.p at samsung.com
> *Cc:* Robles, Raymond C <raymond.c.robles at intel.com>; Judy Brock <
> judy.brock at samsung.com>; nvmewin <nvmewin at lists.openfabrics.org>
> *Subject:* Re: RE: review comments for patch from Google
>
>
>
> We are running this in a virtual machine, and we do see a perf difference
> with concurrent channels. We did not observe any performance drop with
> synchronized access to the list entry.
>
>
>
> On Fri, Jun 10, 2016 at 7:15 AM, SUMAN PRAKASH B <suman.p at samsung.com>
> wrote:
>
> Hi Iuliu,
>
>
>
> As you have the implementation in place, can you let us know if there is
> any performance improvement when STOR_PERF_CONCURRENT_CHANNELS is set and
> the number of cores and number of queues do not match?
>
> Also, is there any drop in performance if the list entry and doorbell
> are protected after setting STOR_PERF_CONCURRENT_CHANNELS?
>
>
>
> Perhaps a bit off topic:
>
> Even in the case when #cores and #queues are the same and
> STOR_PERF_CONCURRENT_CHANNELS is set, we did not observe any performance
> gain. Maybe we are hitting the max performance of the device even without
> setting STOR_PERF_CONCURRENT_CHANNELS. Can you please let us know your
> observations?
>
>
>
> Thanks,
>
> Suman
>
>
>
> ------- *Original Message* -------
>
> *Sender* : Robles, Raymond C<raymond.c.robles at intel.com>
>
> *Date* : Jun 10, 2016 08:30 (GMT+05:30)
>
> *Title* : RE: review comments for patch from Google
>
>
>
> Hi Judy,
>
>
>
> Thanks for the additional info. I agree with the assessment. My statement
> was directed at the general direction of the solution… and the fact that I
> think option 1 is not as robust as option 2.
>
>
>
> Thank you…
>
> Ray
>
>
>
> *From:* Judy Brock [mailto:judy.brock at samsung.com]
> *Sent:* Thursday, June 09, 2016 5:14 PM
> *To:* Robles, Raymond C <raymond.c.robles at intel.com>; Iuliu Rus <
> iuliur at google.com>; suman.p at samsung.com
> *Cc:* nvmewin at lists.openfabrics.org
> *Subject:* RE: review comments for patch from Google
>
>
>
> Hi Ray et al,
>
> [Suman wrote] >> Protect the listentry and doorbell before access/update
> >> Do not set STOR_PERF_CONCURRENT_CHANNELS when the number of cores and
> queues are not the same.
>
>
>
> [Ray wrote] >> In the case Suman points out, concurrent channels should
> still be enabled even if the number of cores and queues do not match. This
> is especially true for large core count systems (>128 logical CPU cores)
> where devices may only support 32 queues or fewer.
>
>
>
> I think Suman was saying we have two options:
>
> 1.       Either don't enable it at all if #cores != #queues, or
>
> 2.       Protect both the list entries and doorbell writes before any
> accesses/updates if #cores != #queues.
>
>
>
> So if we want to enable concurrent channels no matter what, we will need
> to check in the performance path whether #cores and #queues match, to
> decide whether we need locks or not. That is probably better than option
> 1) above if this is a configuration that can come up frequently.
>
>
>
> Thanks,
>
> Judy
>
>
>
>
>
> *From:* nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org
> <nvmewin-bounces at lists.openfabrics.org>] *On Behalf Of *Robles, Raymond C
> *Sent:* Thursday, June 09, 2016 2:32 PM
> *To:* Iuliu Rus; suman.p at samsung.com
> *Cc:* nvmewin at lists.openfabrics.org
> *Subject:* Re: [nvmewin] review comments for patch from Google
>
>
>
> Hi Iuliu/Suman,
>
>
>
> I’m of the opinion that any solution provided should be as robust as
> possible. In the case Suman points out, concurrent channels should still be
> enabled even if the number of cores and queues do not match. This is
> especially true for large core count systems (>128 logical CPU cores)
> where devices may only support 32 queues or fewer.
>
>
>
> Thank you…
>
> Ray
>
>
>
> *From:* Iuliu Rus [mailto:iuliur at google.com]
> *Sent:* Thursday, June 09, 2016 2:09 PM
> *To:* suman.p at samsung.com
> *Cc:* Robles, Raymond C <raymond.c.robles at intel.com>;
> nvmewin at lists.openfabrics.org
> *Subject:* Re: review comments for patch from Google
>
>
>
> Which one do you prefer? Our Google driver already has the synchronization
> in place for a different reason; or I can follow your second suggestion.
>
>
>
> On Thu, Jun 9, 2016 at 7:39 AM, SUMAN PRAKASH B <suman.p at samsung.com>
> wrote:
>
> Hi Iuliu,
>
>
>
> Please find my review comments below:
>
>
>
> 1. If the number of cores and the number of queues supported by the device
> are not the same, then setting STOR_PERF_CONCURRENT_CHANNELS will corrupt
> the submission queue. For example, if the device has 4 queues and is
> connected to a target PC which has 8 logical processors, and if
> STOR_PERF_CONCURRENT_CHANNELS is set, all 8 logical processors will get
> StartIo calls concurrently. As per the OFA driver NUMA design, IO Queue 1
> will be shared by core 0 and core 4. There is a possibility that core 0 and
> core 4 will get StartIo() calls concurrently and will access/update the
> listentry and the queue 1 doorbell register at the same time, which will
> lead to listentry and doorbell register corruption.
> This problem can be addressed in the following ways:
> a. Protect the listentry and doorbell before access/update.
> b. Do not set STOR_PERF_CONCURRENT_CHANNELS when the number of cores and
> queues are not the same.
>
>
>
> 2. We can use pRMT->NumActiveCores, which is updated in NVMeEnumNumaCores(),
> instead of using KeQueryActiveProcessorCountEx/KeQueryActiveProcessorCount.
>
>
>
> Please let us know your opinion.
>
>
>
> Thanks,
> Suman
>
>
>
> ------- *Original Message* -------
>
> *Sender* : Robles, Raymond C<raymond.c.robles at intel.com>
>
> *Date* : Jun 09, 2016 03:56 (GMT+05:30)
>
> *Title* : Re: [nvmewin] Happy New Year... and status update
>
>
>
> Thank you Iuliu!
>
>
>
> Reviewing companies, please try to target the review completion for 2
> weeks from today (6/22/16).
>
>
>
> Thank you!
>
> Ray
>
>
>
> *From:* Iuliu Rus [mailto:iuliur at google.com]
> *Sent:* Wednesday, June 08, 2016 12:26 PM
> *To:* Robles, Raymond C <raymond.c.robles at intel.com>
> *Cc:* nvmewin at lists.openfabrics.org
> *Subject:* Re: Happy New Year... and status update
>
>
>
> I attached our patch.
>
>
>
> On Tue, Jun 7, 2016 at 2:37 PM, Iuliu Rus <iuliur at google.com> wrote:
>
> Let me see if I can get it done by tomorrow EOD...
>
>
>
> On Tue, Jun 7, 2016 at 2:31 PM, Robles, Raymond C <
> raymond.c.robles at intel.com> wrote:
>
> Hi Iuliu,
>
>
>
> That’s great… thank you. Yes, the process is still the same (using SVN
> patches).
>
>
>
> Please let me know if you have any questions/comments. My only question
> for you is an ETA for the patch to be submitted.
>
>
>
> Thank you…
>
> Ray
>
>
>
> *From:* Iuliu Rus [mailto:iuliur at google.com]
> *Sent:* Tuesday, June 07, 2016 2:28 PM
> *To:* Robles, Raymond C <raymond.c.robles at intel.com>
> *Cc:* nvmewin at lists.openfabrics.org
> *Subject:* Re: Happy New Year... and status update
>
>
>
> Yes we can send that to you. Is the process still the same?
>
>
>
> On Tue, Jun 7, 2016 at 1:25 PM, Robles, Raymond C <
> raymond.c.robles at intel.com> wrote:
>
> Hello Iuliu,
>
>
>
> As the OFA community is closing in on its mid-year release, we have only one
> more patch that was targeted… concurrent channels. Will Google still be
> submitting the concurrent channels patch?
>
>
>
> Thank you…
>
> Ray
>
>
>
> *From:* nvmewin-bounces at lists.openfabrics.org [mailto:
> nvmewin-bounces at lists.openfabrics.org] *On Behalf Of *Robles, Raymond C
> *Sent:* Thursday, January 14, 2016 4:55 PM
> *To:* 'nvmewin at lists.openfabrics.org' <nvmewin at lists.openfabrics.org>
> *Subject:* [nvmewin] Happy New Year... and status update
>
>
>
> Hello and Happy New Year,
>
>
>
> I hope everyone had a great holiday season and is off to a great start to
> the new year!
>
>
>
> As communicated last month, the patch from HGST for the SCSI
> multi-initiator changes has been approved and pushed. The holidays slowed
> down much of the progress on the OFA driver, and several patches did not
> get pushed prior to the end of the year. The patches remaining to be
> pushed are as follows…
>
>
>
> ·         Namespace Management (Intel)
>
> ·         Perf Opts (Samsung)
>
> ·         Win 8.1 Timers (Samsung)
>
> ·         EOL Read Only (Samsung)
>
> ·         Concurrent channels (Google)
>
>
>
> The namespace management patch will be sent out for review tomorrow (look
> for the patch email from Carolyn)… stay tuned!!!
>
>
>
> However, once this patch is resolved, we as a community will have to make
> a decision on an official release strategy. The patch process and cadence
> was significantly slower in 2015, which leaves us with a few options.
>
>
>
> 1.       Release what is in the trunk today (or after the namespace
> management patch)… and call that the 2015 release (albeit later than
> expected)
>
> 2.       In lieu of an official 2015 release, we push the remaining
> patches listed above… and then release in ~Q2 of 2016. Basically, skip the
> 2015 release and go right into the mid-2016 release.
>
> 3.       Remove the concept of “official releases” from the OFA Windows
> NVMe driver and just allow people, companies, and users to pull from the
> OFA trunk as necessary.
>
>
>
> For #3 above, my thoughts are that because the OFA driver is not a
> production driver… but a reference and enabling driver, it should/could be
> managed as just that… a reference baseline driver that any potential user
> can go and grab the source, or contribute a patch. Nothing more… nothing
> less.
>
>
>
> For the release decision, I’ll be happy to call a meeting… but we can also
> handle it via email… just let me know. Feedback from all is welcome… but I
> would request mandatory feedback from the 4 reviewing companies: Samsung,
> HGST, PMC-Sierra, and Intel. How would you like to proceed?
>
>
>
> Thanks,
>
> Ray
>
>
>
>
> *Raymond C. Robles*
>
> Non-Volatile Memory Solutions Group
>
> Intel Corporation
>
> Office: 480-554-2600
>
> Mobile: 480-399-0645
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Western Digital Corporation (and its subsidiaries) E-mail Confidentiality
> Notice & Disclaimer:
>
> This e-mail and any files transmitted with it may contain confidential or
> legally privileged information of WDC and/or its affiliates, and are
> intended solely for the use of the individual or entity to which they are
> addressed. If you are not the intended recipient, any disclosure, copying,
> distribution or any action taken or omitted to be taken in reliance on it,
> is prohibited. If you have received this e-mail in error, please notify the
> sender immediately and delete the e-mail in its entirety from your system.
>
>
>
> _______________________________________________
> nvmewin mailing list
> nvmewin at lists.openfabrics.org
> http://lists.openfabrics.org/mailman/listinfo/nvmewin
>
>
>
>
>

