From ken at kenstrandberg.com Mon Jan 2 18:13:34 2012
From: Ken Strandberg (ken at kenstrandberg.com)
Date: Mon, 2 Jan 2012 18:13:34 -0800
Subject: [nvmewin] test
Message-ID: <006e01ccc9bd$4c7c05a0$e57410e0$@kenstrandberg.com>

If you get this, please reply.

Ken Strandberg
Catlow Communications
S&S Business Services, Inc.
Technical Communications for Marketing
Writing for Web, Print, Film/Video
775.690.6575 (v)
775.313.9617 (f)
ken at kenstrandberg.com
ken at catlowcommunications.com
www.catlowcommunications.com

From paul.e.luse at intel.com Mon Jan 2 18:17:34 2012
From: Luse, Paul E (paul.e.luse at intel.com)
Date: Tue, 3 Jan 2012 02:17:34 +0000
Subject: [nvmewin] test
In-Reply-To: <006e01ccc9bd$4c7c05a0$e57410e0$@kenstrandberg.com>
Message-ID: <82C9F782B054C94B9FC04A331649C77A014623@FMSMSX106.amr.corp.intel.com>

Got it
From paul.e.luse at intel.com Mon Jan 16 17:58:45 2012
From: Luse, Paul E (paul.e.luse at intel.com)
Date: Tue, 17 Jan 2012 01:58:45 +0000
Subject: [nvmewin] Windows NVMe Driver Source Posted & Logistics for first work group meeting
Message-ID: <82C9F782B054C94B9FC04A331649C77A022BA8@FMSMSX106.amr.corp.intel.com>

Hello Everyone,

Our initial source code is now available in our SVN repo at svn://sofa.openfabrics.org/nvmewin. No binary has been posted yet, as one of the first orders of business at this first meeting is to decide what criteria we're comfortable with prior to posting a binary. The full core team that developed the driver will be on the call to provide information on what's been tested and what hasn't, etc.

Here's our first agenda; looking forward to talking with all of you!

- Coding style/guidelines that we're following
- Patch and review process
- First release & subsequent release beat rate

Thanks,
Paul

Thursday, January 19, 2012, 04:00 PM US Arizona Time
916-356-2663, 8-356-2663, Bridge: 93, Passcode: 7693020
Live Meeting: https://webjoin.intel.com/?passcode=7693020

From kevin at nvelo.com Mon Jan 16 17:50:22 2012
From: Kevin Silver (kevin at nvelo.com)
Date: Mon, 16 Jan 2012 17:50:22 -0800
Subject: [nvmewin] Please SUBSCRIBE me to this email list
Message-ID:

Thanks!

Kevin Silver
VP Business Development
NVELO, Inc.
5201 Great America Parkway
Santa Clara, CA 95054
650.283.3488

From paul.e.luse at intel.com Thu Jan 19 09:20:43 2012
From: Luse, Paul E (paul.e.luse at intel.com)
Date: Thu, 19 Jan 2012 17:20:43 +0000
Subject: [nvmewin] Reminder: First meeting is this afternoon
Message-ID: <82C9F782B054C94B9FC04A331649C77A02BD0A@FMSMSX106.amr.corp.intel.com>

Here's our first agenda; looking forward to talking with all of you!

- Discuss/decide SVN SCM strategy
- Coding style/guidelines that we're following (attached)
- Patch and review process
- First release & subsequent release beat rate

Thanks,
Paul

PS: Note the start time is in "AZ time"; that's Mountain time, so 3:00 Pacific.

Thursday, January 19, 2012, 04:00 PM US Arizona Time
916-356-2663, 8-356-2663, Bridge: 93, Passcode: 7693020
Live Meeting: https://webjoin.intel.com/?passcode=7693020

____________________________________
Paul Luse
Sr. Staff Engineer
PCG Server Software Engineering
Desk: 480.554.3688, Mobile: 480.334.4630

[Attachment: Coding_guidelines.docx, application/vnd.openxmlformats-officedocument.wordprocessingml.document, 76721 bytes]

From Alex.Chang at idt.com Thu Jan 19 10:40:14 2012
From: Chang, Alex (Alex.Chang at idt.com)
Date: Thu, 19 Jan 2012 10:40:14 -0800
Subject: [nvmewin] ProcessIO Function
Message-ID: <45C2596E6A608A46B0CA10A43A91FE1602F8386A@CORPEXCH1.na.ads.idt.com>

Hi all,

I tested Format NVM last night and it basically works fine. However, when I revisited the state machine, there is a potential problem related to the ProcessIo function, which completes the request back to Storport in error cases. The return value of ProcessIo only indicates whether the command was issued to the controller or not; it does NOT tell whether it completed the request. In SNTI codes, Ray re-uses the Srb Extension to issue command(s), and if ProcessIo fails, an ASSERTION is called. Should I do the same if it fails when re-fetching Identify structures after Format NVM succeeds? The question is that we don't want to complete the same request twice. We should talk about this later.

Thanks,
Alex

From paul.e.luse at intel.com Thu Jan 19 11:05:55 2012
From: Luse, Paul E (paul.e.luse at intel.com)
Date: Thu, 19 Jan 2012 19:05:55 +0000
Subject: [nvmewin] ProcessIO Function
In-Reply-To: <45C2596E6A608A46B0CA10A43A91FE1602F8386A@CORPEXCH1.na.ads.idt.com>
Message-ID: <82C9F782B054C94B9FC04A331649C77A02C838@FMSMSX106.amr.corp.intel.com>

Hi Alex-

I'm not totally sure I understand what you are trying to say, but I think I do. What does "In SNTI codes" mean? I assume you're saying that there's some inconsistency in what some callers assume about the return value from ProcessIo() and, if so, good catch! Clearly we cannot complete the same IO twice :-) I haven't seen your Format NVM code yet, but if any of the sub-commands you need to issue as part of handling Format NVM fails, then you should fail the whole command gracefully. Also, how does any of this relate to the init state machine?

We may not have time to talk about this in the meeting, so if you could reply, identify specific cases where you see callers of ProcessIo doing the wrong thing, and propose a fix, that would be great. I don't believe we want the caller to have to complete the IO if ProcessIo fails; ProcessIo should do that, though just looking at the code it doesn't consistently do so. We should scrub it so that callers know that if ProcessIo fails the IO has been completed, and if it succeeds the IO has been issued. We should also fix ProcessIo to complete all failures with appropriate error codes.

Thanks,
Paul

From Alex.Chang at idt.com Thu Jan 19 11:26:50 2012
From: Chang, Alex (Alex.Chang at idt.com)
Date: Thu, 19 Jan 2012 11:26:50 -0800
Subject: [nvmewin] ProcessIO Function
In-Reply-To: <82C9F782B054C94B9FC04A331649C77A02C838@FMSMSX106.amr.corp.intel.com>
Message-ID: <45C2596E6A608A46B0CA10A43A91FE1602F838C2@CORPEXCH1.na.ads.idt.com>

Hi Paul,

Sorry, I did not mean the init state machine; it's the new Format NVM state machine. After the Format NVM command completes successfully, Identify commands are issued to re-fetch the structures. We all call ProcessIo to issue the commands, and its return type is BOOLEAN, which only indicates whether the command was issued or not. In some error cases, such as failure to get a CmdID, ProcessIo calls StorPortNotification to complete the request, and callers have no way to tell whether the request was completed. To avoid completing the request twice, I'd like to suggest adding one extra parameter to ProcessIo, like:

    BOOLEAN ProcessIo(
        __in PNVME_DEVICE_EXTENSION pAdapterExtension,
        __in PNVME_SRB_EXTENSION pSrbExtension,
        __in NVME_QUEUE_TYPE QueueType,
        __out BOOLEAN *RequestCompleted
    )

Thanks,
Alex
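[Editor's note] A minimal, compilable sketch of the out-parameter idea Alex proposes. The types, field names, and function bodies below are illustrative stand-ins, not the actual nvmewin driver code; a counter stands in for StorPortNotification.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the driver types; names follow the proposal in the thread. */
typedef struct { int unused; } NVME_DEVICE_EXTENSION, *PNVME_DEVICE_EXTENSION;
typedef struct { bool hasCmdId; } NVME_SRB_EXTENSION, *PNVME_SRB_EXTENSION;
typedef enum { AdminQueue, IoQueue } NVME_QUEUE_TYPE;

static int storportCompletions; /* stands in for StorPortNotification calls */

/* Sketch of the proposed interface: on a failure path ProcessIo completes
 * the request itself and reports that through *pRequestCompleted, so the
 * caller knows not to complete the same request a second time. */
bool ProcessIo(PNVME_DEVICE_EXTENSION pAdapterExtension,
               PNVME_SRB_EXTENSION pSrbExtension,
               NVME_QUEUE_TYPE queueType,
               bool *pRequestCompleted)
{
    (void)pAdapterExtension; (void)queueType;
    *pRequestCompleted = false;

    if (!pSrbExtension->hasCmdId) {   /* e.g. failure to get a CmdID */
        storportCompletions++;        /* ProcessIo completes the request */
        *pRequestCompleted = true;
        return false;
    }
    return true;                      /* command issued; IO is in flight */
}
```

A caller such as the Format NVM state machine would then skip its own completion whenever *pRequestCompleted comes back TRUE, which removes the double-completion hazard Alex describes.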
From paul.e.luse at intel.com Thu Jan 19 11:45:35 2012
From: Luse, Paul E (paul.e.luse at intel.com)
Date: Thu, 19 Jan 2012 19:45:35 +0000
Subject: [nvmewin] ProcessIO Function
In-Reply-To: <45C2596E6A608A46B0CA10A43A91FE1602F838C2@CORPEXCH1.na.ads.idt.com>
Message-ID: <82C9F782B054C94B9FC04A331649C77A02CA63@FMSMSX106.amr.corp.intel.com>

Ahh, OK, thanks. I'd prefer we simply have it by rule:

- If ProcessIo fails, then the IO is completed to Storport (if it had an SRB) with the appropriate status.
- If ProcessIo succeeds, then the IO is in flight.

If we add this flag, then we also have to change the return value of ProcessIo so the caller knows why it failed and can return the IO, if needed, to Storport. There's no point in having every caller implement code for 'return status busy' when it can be done in one central location where the error happens. Make sense?
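[Editor's note] The rule Paul prefers can be sketched as an invariant rather than an extra flag. Everything below is an illustrative stand-in (the SRB_STATUS values mirror srb.h but a counter substitutes for the real Storport completion), not the actual driver routine.

```c
#include <assert.h>
#include <stdbool.h>

#define SRB_STATUS_SUCCESS 0x01   /* values mirror srb.h, used here */
#define SRB_STATUS_ERROR   0x04   /* purely for illustration        */

static int completionsToStorport;

static void CompleteToStorport(int srbStatus)
{
    (void)srbStatus;
    completionsToStorport++;      /* stands in for StorPortNotification */
}

/* Invariant version of ProcessIo: FALSE always means "the IO was already
 * completed back to Storport with an error status" (when there was an SRB);
 * TRUE always means "the IO is in flight". Callers never complete the IO
 * themselves on the failure path. */
bool ProcessIoInvariant(bool hasSrb, bool issueOk)
{
    if (!issueOk) {
        if (hasSrb)
            CompleteToStorport(SRB_STATUS_ERROR);
        return false;
    }
    return true;
}
```

With the invariant in place, the error handling lives in one central location, as Paul suggests, instead of every caller carrying its own 'return status busy' logic.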
Thx,
Paul

From Alex.Chang at idt.com Thu Jan 19 11:57:43 2012
From: Chang, Alex (Alex.Chang at idt.com)
Date: Thu, 19 Jan 2012 11:57:43 -0800
Subject: [nvmewin] ProcessIO Function
In-Reply-To: <82C9F782B054C94B9FC04A331649C77A02CA63@FMSMSX106.amr.corp.intel.com>
Message-ID: <45C2596E6A608A46B0CA10A43A91FE1602F83907@CORPEXCH1.na.ads.idt.com>

Yes, I agree we should make it as simple as possible. However, there are two places in ProcessIo that return FALSE without calling StorPortNotification to complete the request:

- Failure calling StorPortGetCurrentProcessorNumber.
- Failure calling NVMeMapCore2Queue.

Should we complete the request when either one happens?

Thanks,
Alex

From paul.e.luse at intel.com Thu Jan 19 12:14:43 2012
From: Luse, Paul E (paul.e.luse at intel.com)
Date: Thu, 19 Jan 2012 20:14:43 +0000
Subject: [nvmewin] ProcessIO Function
In-Reply-To: <45C2596E6A608A46B0CA10A43A91FE1602F83907@CORPEXCH1.na.ads.idt.com>
Message-ID: <82C9F782B054C94B9FC04A331649C77A02CAE1@FMSMSX106.amr.corp.intel.com>

Yes, that's what I meant below in my last sentence: we need to complete to Storport in both of those cases and use SRB_STATUS_ERROR for them. If you want to implement this, that'd be great (maybe just have one Storport completion at the bottom of the function and set the error in earlier places; to avoid goto, please use try/finally in this routine). Or, if you want to focus on the Format NVM work, I'll clean this up as well as check the callers for mis-use of ProcessIo failures.

Thx,
Paul
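[Editor's note] The "one completion at the bottom" structure Paul asks for can be sketched as follows. The real routine would use MSVC's __try/__finally (SEH) per his note; plain structured flow is used here so the sketch stays portable, and a counter stands in for the Storport completion. The failure sites named are the two from the thread plus the CmdID case; all bodies are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

#define SRB_STATUS_SUCCESS 0x01
#define SRB_STATUS_ERROR   0x04   /* value from srb.h, used illustratively */

static int completions;
static int lastStatus;

static void CompleteToStorport(int srbStatus)
{
    completions++;                 /* stands in for StorPortNotification */
    lastStatus = srbStatus;
}

/* Single-exit sketch: every failure path sets an error status, and one
 * completion call at the bottom handles all of them, so no path can
 * return FALSE without the request having been completed. */
bool ProcessIoSingleExit(bool coreOk, bool queueOk, bool cmdIdOk)
{
    int status = SRB_STATUS_SUCCESS;

    if (!coreOk)
        status = SRB_STATUS_ERROR; /* StorPortGetCurrentProcessorNumber failed */
    else if (!queueOk)
        status = SRB_STATUS_ERROR; /* NVMeMapCore2Queue failed */
    else if (!cmdIdOk)
        status = SRB_STATUS_ERROR; /* no free command ID */

    if (status != SRB_STATUS_SUCCESS) {
        CompleteToStorport(status); /* the one completion point */
        return false;
    }
    return true;                    /* issued to the controller */
}
```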
I'd prefer we simply have it by rule: - If process IO fails then the IO is complete to storport (if it had an SRB) with the appropriate status - If process IO succeeds then the IO is in flight If we add this flag then we also have to change the return value for process IO so the caller know why it failed so it can return the IO, if needed to storport. No point in having every caller implement code for 'return status busy' when it can be done in one central location where the error happens. Make sense? Thx Paul From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, January 19, 2012 12:27 PM To: Luse, Paul E; nvmewin at lists.openfabrics.org Subject: RE: ProcessIO Function Hi Paul, Sorry, I did not mean init state machine. It's the new Format NVM state machine. After Format NVM command completes successfully, Identify commands are being issued to re-fetch the structures. We all call ProcessIo to issue the commands and its return type is BOOLEAN, which only indicates whether the command had been issued or not. In some error cases, such as failure on getting CmdID, ProcessIo calls StorPortNotification to complete the request and callers have no way to tell if the request had been completed or not. To avoid complete the request twice, I'd like to suggest adding one extra parameter in ProcessIO like: BOOLEAN ProcessIo( __in PNVME_DEVICE_EXTENSION pAdapterExtension, __in PNVME_SRB_EXTENSION pSrbExtension, __in NVME_QUEUE_TYPE QueueType, __out RequestCompleted ) Thanks, Alex ________________________________ From: Luse, Paul E [mailto:paul.e.luse at intel.com] Sent: Thursday, January 19, 2012 11:06 AM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: ProcessIO Function Hi Alex- I'm not totally sure I understand what you are trying to say but I think I do. What does "In SNTI codes" mean? I assume you're saying that there's some inconsistency with what some callers are assuming about the return value from ProcessIO() and, if so, good catch! 
Clearly we cannot complete the same IO twice J Haven't seen your formatNVM code yet but if any of the sub-commands you need to issue as art of handling formatNVM fails then you should fail the whole command gracefully. Also, how does any of this relate to init state machine? We may not have time to talk about this in the meeting so if you could reply and identify specific cases where you see callers of ProcessIO doing the wrong thing and propose a fix that would be great. I don't believe we want the caller to have to complete the IO if processIO fails, procession should do that however just looking at the code doesn't consistently do that. We should scrub it so that callers know that if procession fails that the IO has been completed and if it completes the IO has been issued. We should also fix procession to complete all failures with appropriate error codes. Thanks Paul From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, January 19, 2012 11:40 AM To: Luse, Paul E; nvmewin at lists.openfabrics.org Subject: ProcessIO Function Hi all I tested Format NVM last night and it basically works fine. However, when I re-visited the state machine, there is potential problem related to ProcessIo function, which completes the request back to Storport in error cases. The return value of ProcessIo only indicates whether the command had been issued to controller or not. It does NOT tell if it completes the request. In SNTI codes, Ray re-uses the Srb Extension to issue command(s). If ProcessIO fails, ASSERTION is called. Should I do the same if it fails when re-fetching Identify structures after Format NVM succeeds? Question is we don't want to complete the same request twice. We should talk about this later. Thanks, Alex -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Alex.Chang at idt.com Thu Jan 19 12:26:20 2012 From: Alex.Chang at idt.com (Chang, Alex) Date: Thu, 19 Jan 2012 12:26:20 -0800 Subject: [nvmewin] ProcessIO Function In-Reply-To: <82C9F782B054C94B9FC04A331649C77A02CAE1@FMSMSX106.amr.corp.intel.com> References: <82C9F782B054C94B9FC04A331649C77A02BD0A@FMSMSX106.amr.corp.intel.com> <45C2596E6A608A46B0CA10A43A91FE1602F8386A@CORPEXCH1.na.ads.idt.com> <82C9F782B054C94B9FC04A331649C77A02C838@FMSMSX106.amr.corp.intel.com> <45C2596E6A608A46B0CA10A43A91FE1602F838C2@CORPEXCH1.na.ads.idt.com> <82C9F782B054C94B9FC04A331649C77A02CA63@FMSMSX106.amr.corp.intel.com> <45C2596E6A608A46B0CA10A43A91FE1602F83907@CORPEXCH1.na.ads.idt.com> <82C9F782B054C94B9FC04A331649C77A02CAE1@FMSMSX106.amr.corp.intel.com> Message-ID: <45C2596E6A608A46B0CA10A43A91FE1602F83928@CORPEXCH1.na.ads.idt.com> Thanks a lot, Paul. I am in the middle of wrapping Format NVM codes up. Could you please modify it when you have chance? Alex ________________________________ From: Luse, Paul E [mailto:paul.e.luse at intel.com] Sent: Thursday, January 19, 2012 12:15 PM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: ProcessIO Function Yes, that's what I meant below in my last sentence, we need to complete to storport in both of those cases and use SRB_STATUS_ERROR for those cases. If you want to implement this that'd be great (maybe just have one storport completion at the bottom of the function and set the error in earlier places; to avoid goto please use try/finally in this routine. Or if you want to focus on the formatNVM I'll clean this up as well as check the callers for mis-use of procession failures. Thx Paul From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, January 19, 2012 12:58 PM To: Luse, Paul E; nvmewin at lists.openfabrics.org Subject: RE: ProcessIO Function Yes, I'd agree make it as simple as possible. 
However, there are two places in ProcessIo returning FALSE without calling StorPortNotification to complete the request: Failure on calling storportGetCurrentProcessorNumber. Failure on calling NVMeMapCore2Queue. We should complete the request when either one happens? Thanks, Alex ________________________________ From: Luse, Paul E [mailto:paul.e.luse at intel.com] Sent: Thursday, January 19, 2012 11:46 AM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: ProcessIO Function Ahh, OK thanks. I'd prefer we simply have it by rule: - If process IO fails then the IO is complete to storport (if it had an SRB) with the appropriate status - If process IO succeeds then the IO is in flight If we add this flag then we also have to change the return value for process IO so the caller know why it failed so it can return the IO, if needed to storport. No point in having every caller implement code for 'return status busy' when it can be done in one central location where the error happens. Make sense? Thx Paul From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, January 19, 2012 12:27 PM To: Luse, Paul E; nvmewin at lists.openfabrics.org Subject: RE: ProcessIO Function Hi Paul, Sorry, I did not mean init state machine. It's the new Format NVM state machine. After Format NVM command completes successfully, Identify commands are being issued to re-fetch the structures. We all call ProcessIo to issue the commands and its return type is BOOLEAN, which only indicates whether the command had been issued or not. In some error cases, such as failure on getting CmdID, ProcessIo calls StorPortNotification to complete the request and callers have no way to tell if the request had been completed or not. 
To avoid complete the request twice, I'd like to suggest adding one extra parameter in ProcessIO like: BOOLEAN ProcessIo( __in PNVME_DEVICE_EXTENSION pAdapterExtension, __in PNVME_SRB_EXTENSION pSrbExtension, __in NVME_QUEUE_TYPE QueueType, __out RequestCompleted ) Thanks, Alex ________________________________ From: Luse, Paul E [mailto:paul.e.luse at intel.com] Sent: Thursday, January 19, 2012 11:06 AM To: Chang, Alex; nvmewin at lists.openfabrics.org Subject: RE: ProcessIO Function Hi Alex- I'm not totally sure I understand what you are trying to say but I think I do. What does "In SNTI codes" mean? I assume you're saying that there's some inconsistency with what some callers are assuming about the return value from ProcessIO() and, if so, good catch! Clearly we cannot complete the same IO twice :-) Haven't seen your formatNVM code yet but if any of the sub-commands you need to issue as art of handling formatNVM fails then you should fail the whole command gracefully. Also, how does any of this relate to init state machine? We may not have time to talk about this in the meeting so if you could reply and identify specific cases where you see callers of ProcessIO doing the wrong thing and propose a fix that would be great. I don't believe we want the caller to have to complete the IO if processIO fails, procession should do that however just looking at the code doesn't consistently do that. We should scrub it so that callers know that if procession fails that the IO has been completed and if it completes the IO has been issued. We should also fix procession to complete all failures with appropriate error codes. Thanks Paul From: Chang, Alex [mailto:Alex.Chang at idt.com] Sent: Thursday, January 19, 2012 11:40 AM To: Luse, Paul E; nvmewin at lists.openfabrics.org Subject: ProcessIO Function Hi all I tested Format NVM last night and it basically works fine. 
However, when I re-visited the state machine, there is a potential problem related to the ProcessIo function, which completes the request back to Storport in error cases. The return value of ProcessIo only indicates whether the command had been issued to the controller or not. It does NOT tell if it completed the request. In the SNTI code, Ray re-uses the Srb Extension to issue command(s). If ProcessIO fails, ASSERTION is called. Should I do the same if it fails when re-fetching Identify structures after Format NVM succeeds? The question is we don't want to complete the same request twice. We should talk about this later. Thanks, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.e.luse at intel.com Fri Jan 20 08:42:47 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Fri, 20 Jan 2012 16:42:47 +0000 Subject: [nvmewin] Notes from NVMe Windows Working Group Call 1/19 Message-ID: <82C9F782B054C94B9FC04A331649C77A02E0A7@FMSMSX106.amr.corp.intel.com> All- Thanks for a great first call! Working Group Rules/Guidelines: Everything we state wrt operational procedures/rules/guidelines is subject to change based on "the common sense" rule. As we're just starting out, we have to lay down a few guidelines, but we reserve the right to course correct as we determine the effectiveness of those guidelines. This policy applies across the board, so if anyone, at any time, recognizes an issue with how we're conducting the group that they believe is either inefficient or unfair in any way, they simply need to bring it to my attention and we'll resolve it as a team and do what makes sense. Notes from call 1/19: - Paul to send http access URL for SVN.
Website has already been updated, see https://www.openfabrics.org/resources/developer-tools/nvme-windows-development.html - SVN branching strategy o Latest is always on the trunk o Branch when we release, branches will be used to maintain the releases o For any experimental work that we'd like checked into the database (collaboration for example), we'll create a sand box branch from the trunk - Releases o Definition: A posted binary (in SVN) following confirmation of the release criteria defined below. Will include INF as well as release notes. o First release: Will be considered after check-in of the final support for format NVM (Alex, IDT) is reviewed and committed o Subsequent Releases: Will be calendar based, Jan/Jun or as needed to support bug-fixes and/or member special requests. o Criteria: Release candidate needs to pass our test scripts, described next, by at least 3 companies using the QEMU device emulation platform (latest at the time). In the event that a company has their own HW and sees issues that are not seen on QEMU but believed to be driver related, they can potentially hold up the release provided they can make a logical case with hard data explaining what the driver issue is and why it's not specific to their hardware. o Test Scripts: TBD, we have a good idea of what tools/scripts we want to use but we'll trial run them as part of our first few patches and send out specifics later. We have 2 patches in the near future (one from IDT and one from Intel) so we'll get those done and then post details and tools required. - Patches o Process: Submitter needs to base their changes on the latest (and re-base/re-test prior to sending their patch). They send the patch to the email list via TBD tool/format (we'll send details once we work them out). 
Some review will happen over the reflector; the maintainer will send a message out that the db is locked when they're ready to apply the patch, which will be once at least one member from each company on the review panel has approved (can be via email or con call if needed). Once the patch is applied, the maintainer will send an email out. o Patch contents: Code changes, short summary for SVN log, more verbose write-up for release notes, confirmation of testing. o Testing: Same situation as test scripts for full releases, we'll send details after we run through the process a few times and work out the kinks - Bug Reporting o We'll use Bugzilla, details coming soon... - Meeting Logistics o For now we'll go "as needed", next one in probably 2-3 weeks. One goal we have is to keep the overhead extremely light so don't expect a ton of meetings, but since we're just starting up we will likely have multiple meetings for the next quarter or so. From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Luse, Paul E Sent: Thursday, January 19, 2012 10:21 AM To: nvmewin at lists.openfabrics.org Subject: [nvmewin] Reminder: First meeting is this afternoon Here's our first agenda, looking forward to talking with all of you! - Discuss/decide SVN SCM strategy - Coding style/guidelines that we're following (attached) - Patch and review process - First release & subsequent release beat rate Thanks, Paul PS: Note the start time is in "AZ time", that's Mountain time so 3:00 Pacific Thursday, January 19, 2012, 04:00 PM US Arizona Time 916-356-2663, 8-356-2663, Bridge: 93, Passcode: 7693020 Live Meeting: https://webjoin.intel.com/?passcode=7693020 ____________________________________ Paul Luse Sr. Staff Engineer PCG Server Software Engineering Desk: 480.554.3688, Mobile: 480.334.4630 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From paul.e.luse at intel.com Fri Jan 20 13:03:35 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Fri, 20 Jan 2012 21:03:35 +0000 Subject: [nvmewin] patch review request Message-ID: <82C9F782B054C94B9FC04A331649C77A02E992@FMSMSX106.amr.corp.intel.com> So attached isn't a patch, it's the changed files as we don't have the patch process down yet (compare to the only version in SVN at the moment). Below are the changes (pretty small) and notes for SVN and the release notes. Tested on both Chatham and QEMU w/data integrity test as well as stress and SCSI compliance tests. nvmeStd.c (for svn log: assert changes & changes to support APIC logical mode) line 1236: changes in debug code to avoid asserts that we don't care about; also made this conditional on Chatham so qemu isn't so annoying NVMeIsrMsix(): changes to correctly support 'logical mode'. When we examine the MSI address as part of init, we determine if we know the physical mapping of cores/vectors or not, and if we don't we go to a 'shared mode'. The 'shared mode' wasn't working correctly (never tested) as it was missing some code. I was able to exercise this mode by using Server 8. I believe we can have a better solution when in 'logical mode': instead of going into shared mode we do a bit more work up front to figure out the mappings, but will leave that for another day.... Line 2803: Chatham only change Nvmepowermgmt.c (for svn log: chatham specific changes) Line 97: chatham only change NVMeAdapterControlPowerDown90: chatham only changes Nvmeinit.c (for svn log: assert changes) NVMeNormalShutdown(): a few chatham only changes Sources (for svn log: added qemu compile switch) -added qemu compile switch For release notes: - Misc changes for Chatham and changes to the ISR to support APIC logical mode. ____________________________________ Paul Luse Sr.
Staff Engineer PCG Server Software Engineering Desk: 480.554.3688, Mobile: 480.334.4630 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PENDING.zip Type: application/x-zip-compressed Size: 46354 bytes Desc: PENDING.zip URL: From Alex.Chang at idt.com Fri Jan 20 16:56:29 2012 From: Alex.Chang at idt.com (Chang, Alex) Date: Fri, 20 Jan 2012 16:56:29 -0800 Subject: [nvmewin] patch review request In-Reply-To: <82C9F782B054C94B9FC04A331649C77A02E992@FMSMSX106.amr.corp.intel.com> References: <82C9F782B054C94B9FC04A331649C77A02E992@FMSMSX106.amr.corp.intel.com> Message-ID: <45C2596E6A608A46B0CA10A43A91FE1602F83F0A@CORPEXCH1.na.ads.idt.com> Attached please find the implementation of handling the Format NVM command via IOCTL Pass Through: - nvmeinit.c: Removed redundant code in MSI enumeration. - nvmestd.c : Main implementation of Format NVM - nvmesnti.c : Includes blocking IO for the target namespace while it's under formatting. - nvmestd.h : Added definition of FORMAT_NVM_INFO structure. - nvmeioctl.h: Added two more error codes, associated with Format NVM, that can be returned in ReturnCode of SRB_IO_CONTROL. Thanks, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: FormatNVM.zip Type: application/x-zip-compressed Size: 90346 bytes Desc: FormatNVM.zip URL: From paul.e.luse at intel.com Mon Jan 23 10:54:57 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Mon, 23 Jan 2012 18:54:57 +0000 Subject: [nvmewin] headsup: performance issue with current driver Message-ID: <82C9F782B054C94B9FC04A331649C77A0308A8@FMSMSX106.amr.corp.intel.com> Just starting to investigate, but with > 3 threads our Iometer performance gets erratic.
This is using the Chatham HW, and I do have another driver (internal to Intel) that runs the board and doesn't have the same issue. Note that the other driver also uses DPCs for completion, however it does not attempt to implement NUMA via decoding the MSI address; standard storport perf optimizations are used. I've been told by a few other teams who have worked on optimizing storport miniports for NUMA that they tried various 'brute force' methods and time after time reverted back to recommended methods from MSDN. I'll start getting to the bottom of this and will likely schedule a call later this week to discuss findings for those interested. Thx Paul ____________________________________ Paul Luse Sr. Staff Engineer PCG Server Software Engineering Desk: 480.554.3688, Mobile: 480.334.4630 -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.c.robles at intel.com Mon Jan 23 14:35:16 2012 From: raymond.c.robles at intel.com (Robles, Raymond C) Date: Mon, 23 Jan 2012 22:35:16 +0000 Subject: [nvmewin] Update on patch review request for "logical mode" support Message-ID: <49158E750348AA499168FD41D8898360048209@FMSMSX105.amr.corp.intel.com> All, Paul is in the process of updating his changes for the logical mode support in the driver. So please *disregard* his previous patch review request. He will send another patch review request out as soon as he completes the updates he is currently making and then we can continue with the review process. One quick item I did want to address was the patch request review process. As we agreed upon in our first official meeting, the review process will take place over the reflector for anyone wishing to review changes. However, the only requirement is that we get feedback from one representative from the 3 core dev companies (IDT, LSI, Intel).
One additional item I'd like to point out is that if there is currently a patch review change in progress, then please refrain from sending out a second (or third, etc...) patch review request. This is for a couple of reasons: 1) If a patch request is already outstanding, then any new patches *must* wait so that their code base is rebased with the newly accepted patch (once reviewed and checked in) prior to submitting the next patch request. This step cannot be done until the current outstanding patch has been reviewed and checked in. This is to prevent any issues that may arise from not having the latest code (which may affect new changes) to work with for any changes that need to be pushed into the baseline. 2) The person sending the subsequent patch review request would have to rebase their code against the new baseline anyway and then send out a second patch review request with the latest code (plus their own changes). So for now, please wait on Paul's next patch review request to be sent out. Once that is reviewed and checked in, Alex (IDT) can then rebase his changes and then re-submit his patch review request to the reflector for review. Thanks, Ray [Description: cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles Platform Solutions Group | DCSG | IAG Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 1756 bytes Desc: image001.png URL: From Alex.Chang at idt.com Mon Jan 23 15:28:22 2012 From: Alex.Chang at idt.com (Chang, Alex) Date: Mon, 23 Jan 2012 15:28:22 -0800 Subject: [nvmewin] IOCTL PT Document In-Reply-To: <49158E750348AA499168FD41D8898360048209@FMSMSX105.amr.corp.intel.com> References: <49158E750348AA499168FD41D8898360048209@FMSMSX105.amr.corp.intel.com> Message-ID: <45C2596E6A608A46B0CA10A43A91FE1602FC5130@CORPEXCH1.na.ads.idt.com> Per Paul's request, I am attaching the document, PT_IOCTL.doc, which we had reviewed a while ago. Please upload it to the doc directory. Thanks, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PT_IOCTL.doc Type: application/msword Size: 75776 bytes Desc: PT_IOCTL.doc URL: From raymond.c.robles at intel.com Tue Jan 24 14:09:10 2012 From: raymond.c.robles at intel.com (Robles, Raymond C) Date: Tue, 24 Jan 2012 22:09:10 +0000 Subject: [nvmewin] Test Email Message-ID: <49158E750348AA499168FD41D8898360048598@FMSMSX105.amr.corp.intel.com> Testing my subscription. [Description: cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles Platform Solutions Group | DCSG | IAG Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: image001.png Type: image/png Size: 1756 bytes Desc: image001.png URL: From paul.e.luse at intel.com Tue Jan 24 15:23:46 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Tue, 24 Jan 2012 23:23:46 +0000 Subject: [nvmewin] update on strange perf issue Message-ID: <82C9F782B054C94B9FC04A331649C77A031892@FMSMSX106.amr.corp.intel.com> Here's the update: - If we use the storport DPC optimizations, our performance becomes erratic - If we don't use them, we end up in the same failing scenario that we had with ISR completions: o We get 'stuck' in a mode where we only get IOs on one core and therefore only complete IOs on one core; that's not sufficient for operation as we end up with a DPC watchdog timeout - I noticed the core that ends up 'stuck' is the same every time, core 0. - If I change our core mapping such that we never complete an IO on the same core that it came in on, we have no issues and peak IOPs in a 4K read case does not change, however our CPU util goes from 20% to 60%, saying a lot for our core/vector matching! - If I try in full shared mode (already coded, just hacked it to pretend that we only got one message from the OS) we still fail under heavy IO - So taking note that we share the MsgId/completion core between the admin queue and queue pair 1, I isolated the admin queue by moving incoming core 0 IO requests to queue pair 2 so it looks like the table below, and things work fine and our CPU util stays at 20%. Not clear what is causing this; I've been working with our HW/FW folks to determine if there's some interaction there and so far no leads from that direction. It feels like either a bug in our stuff or some strange behavior with the OS and core 0. Obviously the former seems more plausible, but I'm not able to pinpoint anything other than the data below (and the workaround of not sharing completions with the admin queue seems to solve it).
- DOESN'T WORK:

Core | SQ/CQ #   | Msg #
 0   | 1 + admin | 8
 1   | 2         | 1
 2   | 3         | 2
 3   | 4         | 3
 4   | 5         | 4
 5   | 6         | 5
 6   | 7         | 6
 7   | 8         | 7

- WORKS:

Core | SQ/CQ #     | Msg #
 0   | 2 (+ admin) | 1 (8)
 1   | 2           | 1
 2   | 3           | 2
 3   | 4           | 3
 4   | 5           | 4
 5   | 6           | 5
 6   | 7           | 6
 7   | 8           | 7

____________________________________ Paul Luse Sr. Staff Engineer PCG Server Software Engineering Desk: 480.554.3688, Mobile: 480.334.4630 -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.e.luse at intel.com Tue Jan 24 16:08:50 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Wed, 25 Jan 2012 00:08:50 +0000 Subject: [nvmewin] update on strange perf issue Message-ID: <82C9F782B054C94B9FC04A331649C77A03190F@FMSMSX106.amr.corp.intel.com> One other possibility I didn't mention... as this seems to be associated with sharing with the admin queue, it could still easily be a HW/FW issue with Chatham. I'm running overnight with the non-admin queue sharing hack and if it passes I will continue with my patch check-in, but leave this hack in there conditional on Chatham until someone else gets some HW running at speed that we can use to collect more data. From: Luse, Paul E Sent: Tuesday, January 24, 2012 4:24 PM To: nvmewin at lists.openfabrics.org Subject: update on strange perf issue Here's the update: - If we use the storport DPC optimizations, our performance becomes erratic - If we don't use them, we end up in the same failing scenario that we had with ISR completions: o Get 'stuck' in a mode where we only get IOs on one core and therefore only complete IOs on one core however that's not sufficient for operation as we end up with a DPC watchdog timeout - I noticed the core that ends up 'stuck' every time is the same, core 0. - If I change our core mapping such that we never complete an IO on the same core that it came in on, we have no issues and peak IOPs in a 4K read case does not change however our CPU util goes from 20% to 60% saying a lot for our core/vector matching!
- If I try in full shared mode (already coded, just hacked it to pretend that we only got one message from the OS) we still fail under heavy IO - So taking note that we share the MsgId/completion core between the admin queue and queue pair 1, I isolated the admin queue by moving incoming core 0 IO requests to queue pair 2 so it looks like the table below and things work fine and our CPU util stays at 20%. Not clear what is causing this and I've been working with our HW/FW folks to determine if there's some interaction there and so far no leads from that direction. It almost feels like obviously a bug in our stuff or some strange behavior with the OS and core 0. Obviously the former seems more plausible but I'm not able to pinpoint anything other than the data below (and the workaround of not sharing completions with the admin queue seems to solve it).

- DOESN'T WORK:

Core | SQ/CQ #   | Msg #
 0   | 1 + admin | 8
 1   | 2         | 1
 2   | 3         | 2
 3   | 4         | 3
 4   | 5         | 4
 5   | 6         | 5
 6   | 7         | 6
 7   | 8         | 7

- WORKS:

Core | SQ/CQ #     | Msg #
 0   | 2 (+ admin) | 1 (8)
 1   | 2           | 1
 2   | 3           | 2
 3   | 4           | 3
 4   | 5           | 4
 5   | 6           | 5
 6   | 7           | 6
 7   | 8           | 7

____________________________________ Paul Luse Sr. Staff Engineer PCG Server Software Engineering Desk: 480.554.3688, Mobile: 480.334.4630 -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.e.luse at intel.com Wed Jan 25 09:27:24 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Wed, 25 Jan 2012 17:27:24 +0000 Subject: [nvmewin] Update on patch review request for "logical mode" support In-Reply-To: <49158E750348AA499168FD41D8898360048209@FMSMSX105.amr.corp.intel.com> References: <49158E750348AA499168FD41D8898360048209@FMSMSX105.amr.corp.intel.com> Message-ID: <82C9F782B054C94B9FC04A331649C77A03208A@FMSMSX106.amr.corp.intel.com> OK, so I'm just about ready with mine but to be fair I think since I had to essentially retract mine, Alex's should have been immediately considered next so he doesn't have to rebase after waiting for me.
As I'm the one who gave up 'the lock', I should have to rebase after him. So, Alex's is on deck and I'll go next. I'll set up a meeting for next Tue (assuming Alex can make it) so he can walk us through his code changes, and potentially we can cover mine during the meeting as well. Thx Paul From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Monday, January 23, 2012 3:35 PM To: NVMe Open Source Mailing List (mailto:nvmewin at lists.openfabrics.org); nvmewin at lists.openfabrics.org; Robles, Raymond C Subject: [nvmewin] Update on patch review request for "logical mode" support All, Paul is in the process for updating his changes for the logical mode support in the driver. So please *disregard* his previous patch review request. He will send another patch review request out as soon as he completes the updates he is currently making and then we can continue with the review process. One quick item I did want to address was the patch request review process. As we agreed upon in our first official meeting, the review process will take place over the reflector for anyone wishing to review changes. However, the only requirement is that we get feedback from one representative from the 3 core dev companies (IDT, LSI, Intel). One additional item I'd like to point out is that if there is currently a patch review change in progress, then please refrain from sending out a second (or third, etc...) patch review request. This is for a couple of reasons: 1) If a patch request is already outstanding, then any new patches *must* wait so that their code base is rebased with the newly accepted patch (once reviewed and check in) prior to submitting the next patch request. This step cannot be done until the current outstanding patch has been reviewed and check in.
This is to prevent any issues that may arise from not having the latest code (which may affect new changes) to work with for any change that want to be pushed into the baseline. 2) The person sending the subsequent patch review request, would have to rebase their code against the new baseline anyways and then send out a second patch review request with the latest code (plus their own changes). So for now, please wait on Paul's next patch review request to be sent out. Once that is reviewed and checked in, Alex (IDT) can then rebase his changes and then re-submit his patch review request to the reflector for review. Thanks, Ray [Description: cid:image001.png at 01CB3870.4BB88E70] Raymond C. Robles Platform Solutions Group | DCSG | IAG Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 1756 bytes Desc: image001.png URL: From Alex.Chang at idt.com Wed Jan 25 09:37:18 2012 From: Alex.Chang at idt.com (Chang, Alex) Date: Wed, 25 Jan 2012 09:37:18 -0800 Subject: [nvmewin] Update on patch review request for "logical mode"support In-Reply-To: <82C9F782B054C94B9FC04A331649C77A03208A@FMSMSX106.amr.corp.intel.com> References: <49158E750348AA499168FD41D8898360048209@FMSMSX105.amr.corp.intel.com> <82C9F782B054C94B9FC04A331649C77A03208A@FMSMSX106.amr.corp.intel.com> Message-ID: <45C2596E6A608A46B0CA10A43A91FE1602FC56BD@CORPEXCH1.na.ads.idt.com> Thanks a lot, Paul. I will be available next Tuesday. 
Alex ________________________________ From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Luse, Paul E Sent: Wednesday, January 25, 2012 9:27 AM To: Robles, Raymond C; NVMe Open Source Mailing List (mailto:nvmewin at lists.openfabrics.org); nvmewin at lists.openfabrics.org; Robles, Raymond C Subject: Re: [nvmewin] Update on patch review request for "logical mode" support OK, so I'm just about ready with mine but to be fair I think since I had to essentially retract mine, Alex's should have been immediately considered next so he doesn't have to rebase after waiting for me. As I'm the one who gave up 'the lock' I should have to rebase after him So, Alex's is on deck and I'll go next. I'll setup a meeting for next Tue (assuming Alex can make it) so we can walk us through his code changes and potentially we can cover mine during the meeting as well. Thx Paul From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Robles, Raymond C Sent: Monday, January 23, 2012 3:35 PM To: NVMe Open Source Mailing List (mailto:nvmewin at lists.openfabrics.org); nvmewin at lists.openfabrics.org; Robles, Raymond C Subject: [nvmewin] Update on patch review request for "logical mode" support All, Paul is in the process for updating his changes for the logical mode support in the driver. So please *disregard* his previous patch review request. He will send another patch review request out as soon as he completes the updates he is currently making and then we can continue with the review process. One quick item I did want to address was the patch request review process. As we agreed upon in our first official meeting, the review process will take place over the reflector for anyone wishing to review changes. However, the only requirement is that we get feedback from one representative from the 3 core dev companies (IDT, LSI, Intel).
One additional item I'd like to point out is that if there is currently a patch review change in progress, then please refrain from sending out a second (or third, etc...) patch review request. This is for a couple of reasons: 1) If a patch request is already outstanding, then any new patches *must* wait so that their code base is rebased with the newly accepted patch (once reviewed and check in) prior to submitting the next patch request. This step cannot be done until the current outstanding patch has been reviewed and check in. This is to prevent any issues that may arise from not having the latest code (which may affect new changes) to work with for any change that want to be pushed into the baseline. 2) The person sending the subsequent patch review request, would have to rebase their code against the new baseline anyways and then send out a second patch review request with the latest code (plus their own changes). So for now, please wait on Paul's next patch review request to be sent out. Once that is reviewed and checked in, Alex (IDT) can then rebase his changes and then re-submit his patch review request to the reflector for review. Thanks, Ray Raymond C. Robles Platform Solutions Group | DCSG | IAG Intel Corporation Desk: 480.554.2600 Mobile: 480.399.0645 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 1756 bytes Desc: image001.png URL: From paul.e.luse at intel.com Wed Jan 25 09:44:39 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Wed, 25 Jan 2012 17:44:39 +0000 Subject: [nvmewin] NVMe Workgroup Meeting Message-ID: <82C9F782B054C94B9FC04A331649C77A032269@FMSMSX106.amr.corp.intel.com> Agenda: -Opens -Patch process clarification/updates -Review Alex's patch - format NVM pass through handler -Review Paul's patch - INT and core mapping changes for chatham (provided there's time) Tuesday, January 31, 2012, 02:00 PM US Arizona Time 916-356-2663, 8-356-2663, Bridge: 92, Passcode: 8326347 Live Meeting: https://webjoin.intel.com/?passcode=8326347 Speed dialer: inteldialer://92,8326347 | Learn more -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/calendar Size: 1910 bytes Desc: not available URL: From paul.e.luse at intel.com Wed Jan 25 09:54:39 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Wed, 25 Jan 2012 17:54:39 +0000 Subject: [nvmewin] quick update on perf investigation Message-ID: <82C9F782B054C94B9FC04A331649C77A03232B@FMSMSX106.amr.corp.intel.com> As a follow-up, I ran overnight w/no issues with the one change where I dedicate core 0 completions to admin only (still conditional on Chatham). Smells like a hw/fw problem at this point, but anyway I added a compile switch to allow us to choose between ISR completions and DPC completions, and sure enough things work fine now with ISR completions (no watchdog issues, which is what led to the DPC implementation). I think it makes sense to leave them both in there (default is ISR completions for now) so folks can profile on their own HW and decide which is right for them.
I'll try to run some xperf traces before the Tue meeting and bring some data describing the difference; from the pure Iometer perspective, using ISR our CPU util goes down from ~20 to about ~18 on average. I suspect it's because we're not masking MSIX ints; therefore when using DPCs our ISR gets called a zillion times where the DPC doesn't get scheduled because there's one outstanding already. That overhead of course goes away, but it's not clear what the system level cost is of spending more time at DIRQL vs more time attempting to schedule DPCs. Thx Paul ____________________________________ Paul Luse Sr. Staff Engineer PCG Server Software Engineering Desk: 480.554.3688, Mobile: 480.334.4630 -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.e.luse at intel.com Thu Jan 26 15:58:15 2012 From: paul.e.luse at intel.com (Luse, Paul E) Date: Thu, 26 Jan 2012 23:58:15 +0000 Subject: [nvmewin] finally root caused - strange DPC watchdog issue Message-ID: <82C9F782B054C94B9FC04A331649C77A033980@FMSMSX106.amr.corp.intel.com> So it's not Chatham and it's not some complex mapping issue or crazy corner case... we were failing to call StorPortSetDeviceQueueDepth() per lun. I knew about this but incorrectly recalled that the default was 254; it's actually 20. When I set it to 254 the odd imbalance in CPU load (and the watchdog issues) goes away. Apparently having an optimal miniport and a heavy IO load with a low storport queue depth exposes this problem in storport, as when I mix up the submit/msix cores and leave the queue depth at 20 it goes away. I'll prepare my patch accordingly and send it out tomorrow.... Thx Paul ____________________________________ Paul Luse Sr. Staff Engineer PCG Server Software Engineering Desk: 480.554.3688, Mobile: 480.334.4630 -------------- next part -------------- An HTML attachment was scrubbed...
From Alex.Chang at idt.com Fri Jan 27 16:03:15 2012
From: Alex.Chang at idt.com (Chang, Alex)
Date: Fri, 27 Jan 2012 16:03:15 -0800
Subject: [nvmewin] Patch Review Request
In-Reply-To: <82C9F782B054C94B9FC04A331649C77A033980@FMSMSX106.amr.corp.intel.com>
References: <82C9F782B054C94B9FC04A331649C77A033980@FMSMSX106.amr.corp.intel.com>
Message-ID: <45C2596E6A608A46B0CA10A43A91FE1602FC6068@CORPEXCH1.na.ads.idt.com>

Hi all,

I am re-sending the request. Please disregard the previous one I sent last week. Attached please find the implementation of handling the Format NVM command and Namespace Hot Add/Remove via IOCTL Pass Through:

- nvmeinit.c: Removed redundant code in MSI enumeration.
- nvmestd.c: Main implementation of Format NVM.
- nvmesnti.c:
  a. Includes blocking IO for the target namespace while it's being formatted, plus the namespace add-back logic.
  b. Removed support for Read/Write(16) due to a bugcheck (0x19) when running SCSI Compliance 2.0.
- nvmestd.h: Added definition of the FORMAT_NVM_INFO structure.
- nvmeioctl.h:
  a. Added three more error codes, associated with Format NVM, that can be returned in ReturnCode of SRB_IO_CONTROL.
  b. Added two IOCTL codes, NVME_HOT_ADD_NAMESPACE and NVME_HOT_REMOVE_NAMESPACE.

Now, Format NVM can be done in two ways:

1. A single Format NVM command via IOCTL Pass Through request. The steps the driver takes to complete the request are:
   a. Removes the target namespace(s) first
   b. Issues the Format NVM command
   c. Re-fetches the Identify Controller structure
   d. Re-fetches the Identify Namespace structure(s)
   e. Adds back the formatted namespace(s)
2. With NVME_HOT_ADD/REMOVE_NAMESPACE IOCTL calls and a Format NVM command via IOCTL Pass Through request:
   a. Issue an NVME_HOT_REMOVE_NAMESPACE IOCTL request to remove the target namespace(s)
   b. Issue the Format NVM command via IOCTL Pass Through request, which covers steps b, c and d of the first method
   c. Issue an NVME_HOT_ADD_NAMESPACE IOCTL request to add back the formatted namespace(s)

Thanks,
Alex

Name: FormatNVM.zip Type: application/x-zip-compressed Size: 91566 bytes Desc: FormatNVM.zip URL:

From paul.e.luse at intel.com Tue Jan 31 08:55:12 2012
From: paul.e.luse at intel.com (Luse, Paul E)
Date: Tue, 31 Jan 2012 16:55:12 +0000
Subject: [nvmewin] NVMe Workgroup Meeting
In-Reply-To: <82C9F782B054C94B9FC04A331649C77A032286@FMSMSX106.amr.corp.intel.com>
References: <82C9F782B054C94B9FC04A331649C77A032286@FMSMSX106.amr.corp.intel.com>
Message-ID: <82C9F782B054C94B9FC04A331649C77A037AB4@FMSMSX106.amr.corp.intel.com>

FYI, I'll be on a plane for this afternoon's meeting, so Ray will be running things today. Thanks!

Paul

-----Original Appointment-----
From: Luse, Paul E
Sent: Wednesday, January 25, 2012 10:45 AM
To: Luse, Paul E; nvmewin at lists.openfabrics.org
Cc: Chang, Alex; Patel, Arpit; Knoblaugh, Rick; Robles, Raymond C; Delsey, Josephine A
Subject: NVMe Workgroup Meeting
When: Tuesday, January 31, 2012 2:00 PM-4:00 PM (UTC-07:00) Arizona.
Where: See body of notice

Agenda:
- Opens
- Patch process clarification/updates
- Review Alex's patch - Format NVM pass-through handler
- Review Paul's patch - INT and core mapping changes for Chatham (provided there's time)

Tuesday, January 31, 2012, 02:00 PM US Arizona Time
916-356-2663, 8-356-2663, Bridge: 92, Passcode: 8326347
Live Meeting: https://webjoin.intel.com/?passcode=8326347
Speed dialer: inteldialer://92,8326347
From raymond.c.robles at intel.com Tue Jan 31 14:17:00 2012
From: raymond.c.robles at intel.com (Robles, Raymond C)
Date: Tue, 31 Jan 2012 22:17:00 +0000
Subject: [nvmewin] NVMe Workgroup Meeting
Message-ID: <49158E750348AA499168FD41D889836004A733@FMSMSX105.amr.corp.intel.com>

Meeting Minutes (1/31/12):

Attendees:
- Alex (IDT)
- Kwok (IDT)
- Arpit (LSI)
- Nathan (WD)
- Ray (Intel)
- Steven Shrader

Opens:
- No opens from the group

Patch review process update:
- If there is an outstanding patch review request, any subsequent companies wishing to submit a patch request shall wait until the first outstanding patch has been integrated. NOTE: the submitter of the patch request should be ready to integrate at any time once the patch request is submitted.
- Patch review requests that require debug or any additional work (because of error) will go back in the queue, and the next person in line will get to go.

Review of IDT's code modifications for the Format NVM pass-through IOCTL:
- A few minor modifications; Alex took notes.
- Intel requests that all Format NVM PT IOCTL handling be done in StartIo; other than that, we are fine with the changes.
- LSI is fine with all changes.
- Alex will send out an updated patch request... no need for a second review.

Thanks,
Ray

-----Original Appointment-----
From: Luse, Paul E [mailto:paul.e.luse at intel.com]
Sent: Wednesday, January 25, 2012 10:45 AM
To: Luse, Paul E; nvmewin at lists.openfabrics.org
Subject: [nvmewin] NVMe Workgroup Meeting
When: Tuesday, January 31, 2012 2:00 PM-4:00 PM (UTC-07:00) Arizona.
Where: See body of notice

Agenda:
- Opens
- Patch process clarification/updates
- Review Alex's patch - Format NVM pass-through handler
- Review Paul's patch - INT and core mapping changes for Chatham (provided there's time)

Tuesday, January 31, 2012, 02:00 PM US Arizona Time
916-356-2663, 8-356-2663, Bridge: 92, Passcode: 8326347
Live Meeting: https://webjoin.intel.com/?passcode=8326347
Speed dialer: inteldialer://92,8326347