[nvmewin] ***UNCHECKED*** latest patch

Luse, Paul E paul.e.luse at intel.com
Fri Jul 13 10:06:35 PDT 2012

Here's the latest patch (cumulative with the last one); the password is intel123.  Again, there are many minor changes to keep things clean and consistent with our coding guidelines, so don't let the number of changes frighten you.  I'm going to ask for a faster turnaround on this one; that's part of why I had to put a patch on top of a patch.  Right now we're moving quickly on my end (making updates), so if I wait two weeks I'm just piling up more stuff for next time.  So, review board folks, please make an effort to provide feedback before the end of next week.  I'll schedule a meeting to cover Q&A for Friday sometime.

Wrt preparing for create/delete NS, there's actually very little that ended up in here.  We've done our own implementation at Intel (since it's not defined in the spec), and really all of the complexity will remain in our private branch, which is good news for the OFA driver.  It's a shame that the spec doesn't cover NS management, but when it does we'll tackle it.  For now, here's the high level of what changed in this patch, and then below are the details:

Preparing for create/delete:
- report LUNs was changed to handle the case where LUN IDs might not be sequential.  This does no harm for the driver without create/delete NS, but may or may not be needed depending on how create/delete NS is implemented
- during enum, and in a few other places, the lunExt array is indexed not by a sequential count but by a count of the number of namespaces discovered that are not hidden (in other words, we're not putting hidden NSs in the lunExt array)
- changed the init state machine to do the get NS and get/set features together before moving on to the next NS
- fixed a timeout issue in passive init
- fixed a missing startio sync lock when DPC routines are trying to do IO
- removed an un-needed MSI spinlock in the DPC handler
- removed extra CAP register from the devExt
- fixed small math bug in findAdapter
- addition of history buffers (inline trace) for debug
- perf opt for the case where the device has fewer queues than there are cores in the machine
- the format NVM command was not tested and is likely slightly broken right now.  We'll be updating that code to leverage some of the new mechanisms added here in the next few weeks, and, Alex, we may be consulting with you on this one.
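The enum/indexing change above (only non-hidden namespaces get lunExt slots, interrogated one NS at a time) can be sketched as below.  This is a minimal illustration, not the driver's actual code; the names (NS_INFO, BuildLunMap, lunToNsid) are all made up for the example.

```c
#include <stdbool.h>

/* Illustrative sketch: each namespace is fully interrogated (ID NS,
 * then get/set features) before the next NSID is touched; only the
 * visible ones get a LUN slot, and the slot index is a running count
 * of visible namespaces, not the NSID itself. */

#define MAX_NAMESPACES 16

typedef struct {
    bool exists;    /* namespace reported by the controller      */
    bool visible;   /* not hidden; learned from get features     */
} NS_INFO;

/* Build the LUN map; returns the number of LUN slots populated. */
static int BuildLunMap(const NS_INFO ns[], int numNs, int lunToNsid[])
{
    int visibleLuns = 0;
    for (int nsid = 0; nsid < numNs; nsid++) {
        /* In the real state machine, ID NS and the get/set features
         * for this NSID complete here before the next NSID starts;
         * we model the combined result with ns[]. */
        if (ns[nsid].exists && ns[nsid].visible)
            lunToNsid[visibleLuns++] = nsid;
    }
    return visibleLuns;
}
```

The point of the per-NS ordering is that by the time a slot is assigned, the driver already knows whether the namespace is hidden, so hidden namespaces never occupy lunExt entries.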

- changed the return value of NVMeMapCore2Queue to be consistent with the other functions called there
- misc variable name changes (per coding guidelines)
- we had previously stored an extra copy of the CAP register in the devExt; it's not needed because the controller registers are already stored, so there are updates wherever that copy was used
- in NVMeSetFeaturesCompletion:
               - some name changes
               - removed all of the code where we processed range types, so that for now we only pay attention to visibility and readOnly (not range type) and we never issue a setFeatures
               - there's also an init state machine change in here that will be needed by anyone wanting to support create/delete NS in the future.  Before, we did all of the ID NS commands and then all of the gets, which isn't enough info to build our map: after ID NS you don't know whether you need to save the NS or not, and if the create/delete scheme results in holes (it may or may not; it depends on the scheme the spec chooses), then you're in trouble.  The solution is to have the machine perform ID NS, then the get/set for that NS, and then move on to the next ID NS, and so on.  Small change, long explanation :)  There are also changes to support this in the InitCallback
- update to the learning mode callback not to consider a read error during learning mode as fatal; let the driver attempt to continue in a less-than-optimal mode and let a host-driven read result in driver failure
- new parameter added to the processing calls, covered later
- small change(s) where the LunExt table was indexed by NSID; changed to index it by a new devExt variable that counts the number of visible namespaces.  Again, this is needed in the event that holes are introduced in the NSID map in the future
- in NVMeAllocIoQueues() we used to drop to one QP in the event that there were more cores than the device supported QPs.  It was a small change to make this condition less performance limiting; see the code for details, as it's fairly straightforward.  This makes us more flexible and also shortens the routine a bit.  Also fixed a bug here where some double-buffer memory wasn't being freed in the error case
- moved NVMeIssueCmd() from this file to the io file (makes a lot more sense over there)

- NVMeIssueCmd moved to here.  Also, a new parameter was added to determine whether the startio spinlock is needed or not.  Before, we had six or so callers from DPCs that didn't sync with StartIo, which was a bug.  Callers from a DPC needing to send IO must synchronize with StartIo
- implemented a try/finally clause in NVMeIssueCmd simply because, mechanically, it made sense given the new StartIo lock being taken at the top and the number of exits we have in the middle of the routine.
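The shape of the NVMeIssueCmd change can be sketched as below.  The real driver uses StorPortAcquireSpinLock and an SEH __try/__finally block; this standalone sketch substitutes a plain flag and a goto-cleanup pattern so it compiles anywhere, and every name in it is illustrative.

```c
#include <stdbool.h>

/* Stand-in for the StartIo spinlock state. */
static bool g_startIoLockHeld = false;

/* Sketch of the new "acquire lock" parameter: DPC-context callers pass
 * true so the command issue is synchronized with StartIo; the StartIo
 * path itself already holds the lock and passes false.  The cleanup
 * label plays the role of __finally: however the routine exits, the
 * lock is released exactly when it was taken. */
static int IssueCmdSketch(bool acquireLock, bool queueFull)
{
    int status = 0;                 /* 0 = success, -1 = busy */

    if (acquireLock)
        g_startIoLockHeld = true;   /* StorPortAcquireSpinLock(...) in the driver */

    if (queueFull) {                /* one of several mid-routine exits */
        status = -1;
        goto cleanup;
    }

    /* ... build the submission entry, ring the doorbell, etc. ... */

cleanup:
    if (acquireLock)
        g_startIoLockHeld = false;  /* StorPortReleaseSpinLock(...) in the driver */
    return status;
}
```

With multiple early exits in the middle of the routine, centralizing the release this way is exactly why the try/finally construct was worth adding.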

- in SntiTranslateInquiry we don't check the exposeNamespace element of the lunExt anymore, as no hidden namespaces will ever be there anyway
- SntiTranslateReportLuns: changes to support potential holes in the LunExt table in the event that they are introduced as part of a create/delete NS scheme.  Instead of listing how many we have sequentially, we now look through the entire list and include only those that exist in the return map
- GetLunExtension now checks the slot status of the LUN being looked at.  New statuses introduced: ONLINE, OFFLINE, and FREE.  Online and free should be clear; OFFLINE can be used for things like format, where we need to make sure that LUN ID doesn't get "taken" by a potential addition while we're using it, so it's sort of like a reservation on the LUN ID
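The slot states and the hole-tolerant report-LUNs scan described above might look roughly like this.  A minimal sketch only; the enum values, table size, and function names are assumptions, not the patch's actual identifiers.

```c
#define MAX_LUNS 8

/* ONLINE and FREE are the normal states; OFFLINE reserves a LUN ID
 * (e.g. during Format NVM) so a concurrent namespace creation can't
 * take it out from under us. */
typedef enum { LUN_FREE = 0, LUN_ONLINE, LUN_OFFLINE } LUN_SLOT_STATUS;

/* Walk the entire table and report only the slots that are in use.
 * Counting sequentially from zero breaks as soon as create/delete NS
 * leaves a hole in the middle of the table. */
static int ReportLuns(const LUN_SLOT_STATUS slots[], int lunList[])
{
    int reported = 0;
    for (int lun = 0; lun < MAX_LUNS; lun++) {
        if (slots[lun] == LUN_ONLINE)
            lunList[reported++] = lun;
    }
    return reported;   /* number of entries written to lunList */
}
```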

- besides name changes and a few prints, the only change here is in one of the criteria for learning mode.  Since we're allowing a core/QP mismatch now, the check to see if queues == 1 isn't valid anymore; we need to disable learning if the queue count is less than the number of active cores (learning makes no sense in that case, since queues are being shared by multiple cores)
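The revised criterion is a one-liner, sketched below with an illustrative name (the driver's actual predicate may be spelled differently):

```c
#include <stdbool.h>

/* Learning mode only makes sense when every active core can own its
 * own IO queue; once queues are shared across cores, per-core queue
 * learning has nothing to learn.  The old "numIoQueues == 1" test was
 * just the degenerate case of this. */
static bool LearningModeEnabled(int numIoQueues, int numActiveCores)
{
    return numIoQueues >= numActiveCores;
}
```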

- introduction of HISTORY_BUFFERS.  For debug only, and can be extremely handy.  Self-explanatory: basically an inline trace capability.
- NVMeFindAdapter: removed the extra CAP copy in the devExt; fixed an operator bug (^ was being used as a power operator, but in C it's bitwise XOR)
- NVMePassiveInitialize: the gross timeout value here didn't account for the fact that the wait on RDY in the state machine itself has a variable timeout based on what the card reports in the CAP register.  The passive-init timeout has to be longer than that for sure, so it was changed accordingly
- IoCompletionDpcRoutine: there must have been a misunderstanding in the original coding of this (I recall there was some confusion), but we do not need to take the MSI lock here; it raised us back to DIRQL!  The MSI lock is taken by Storport for us when it calls the ISR and released when the ISR returns.  The APIs to acquire it are for cases where we need to sync with the ISR; they're not meant to be called from the DPC as a regular thing.  We are protected by the DPC spinlock: Windows will synchronize DPCs running on the same core, and taking the lock synchronizes our DPC cross-core.  Taking the MSI lock was just a performance impact with no value added.
- fixed a bug in RecoveryDpcRoutine; we were missing a call to unfreeze the Storport queues
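The HISTORY_BUFFERS idea above, an inline trace ring that hot paths write into and a debugger inspects, can be sketched as follows.  The record layout and names are illustrative assumptions, not the patch's actual structures.

```c
#define HISTORY_DEPTH 64

/* One trace record: an event tag plus an event-specific payload. */
typedef struct {
    unsigned tag;        /* event id, e.g. a four-character code */
    unsigned long data;  /* event-specific payload               */
} HISTORY_ENTRY;

/* Fixed-size ring; when full, the oldest entries are overwritten,
 * so the buffer always holds the most recent HISTORY_DEPTH events. */
typedef struct {
    HISTORY_ENTRY ring[HISTORY_DEPTH];
    unsigned next;       /* monotonically increasing write index */
} HISTORY_BUF;

/* Cheap enough to sprinkle through hot paths when the debug build
 * enables it; a debugger dumps the ring after the fact. */
static void HistoryAdd(HISTORY_BUF *h, unsigned tag, unsigned long data)
{
    HISTORY_ENTRY *e = &h->ring[h->next % HISTORY_DEPTH];
    e->tag = tag;
    e->data = data;
    h->next++;
}
```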
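The findAdapter operator bug mentioned above is worth a two-line illustration, since it's an easy trap in C (this is the general pattern, not the driver's exact expression):

```c
/* In C, ^ is bitwise XOR, not exponentiation, so an expression like
 * (2 ^ n) does not compute a power of two; a left shift does. */
static unsigned PowerOfTwoWrong(unsigned n) { return 2 ^ n;   /* XOR! */ }
static unsigned PowerOfTwoRight(unsigned n) { return 1u << n; }
```

For n = 4, the XOR version yields 6 (0b010 ^ 0b100) rather than 16.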
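The passive-init timeout fix follows from the spec: CAP.TO reports the worst-case controller-ready time in 500 ms units, so the gross passive-init timeout must exceed that wait plus some margin.  A sketch, with an assumed 2-second margin (the driver's actual margin may differ):

```c
/* CAP.TO is the Timeout field of the NVMe CAP register: the worst-case
 * time to wait for CSTS.RDY, expressed in 500 ms units.  The gross
 * passive-init timeout has to be strictly longer than that RDY wait. */
static unsigned PassiveInitTimeoutMs(unsigned capToField)
{
    unsigned readyWaitMs = capToField * 500u;  /* spec-defined units   */
    return readyWaitMs + 2000u;                /* margin on top of RDY */
}
```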

Paul Luse
Sr. Staff Engineer
PCG Server Software Engineering
Desk: 480.554.3688, Mobile: 480.334.4630

-------------- next part --------------
A non-text attachment was scrubbed...
Name: source.zip
Type: application/x-zip-compressed
Size: 162984 bytes
Desc: source.zip
URL: <http://lists.openfabrics.org/pipermail/nvmewin/attachments/20120713/af7afc6c/attachment.bin>
