[nvmewin] Patch with changes for disk Read only support
Thomas Freeman
thomas.freeman at hgst.com
Wed Apr 20 09:29:55 PDT 2016
Hi Suman,
After reviewing the code, I have a few questions/comments:
1. For mode sense with 0x3f, you set the WP in the response header. Shouldn't WP also be set if any of those pages (0x8, 0xa, 0x1a, 0x1c) are individually requested?
2. snti.c:8242
Lun = pLunExt->namespaceId - 1;
pDevExt->pLunExtensionTable[Lun]->IsNamespaceReadOnly = TRUE;
These 2 lines can be replaced with
pLunExt->IsNamespaceReadOnly = TRUE;
Also, the original code is not a reliable way to determine the LUN id.
Here is an example where this doesn't work. The device has attached namespaces 1, 3 & 4 and existing namespaces 1, 2, 3 & 4. LUNs 0-3 will correspond to namespaces 1, 3, 4, 2. For namespace 3, the calculation NSID-1=lun will incorrectly give you a LUN id of 2 (the correct LUN id is 1).
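To illustrate the mismatch, here is a minimal standalone sketch. LunExt, g_lunTable, FindLunByNsid and MarkReadOnly are hypothetical stand-ins for illustration only, not the driver's actual nvme_lun_extension or table; only the namespace ordering is taken from the example above.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LUNS 4

/* Hypothetical, simplified stand-in for the driver's LUN extension. */
typedef struct {
    uint32_t namespaceId;
    int      IsNamespaceReadOnly;
} LunExt;

/* LUNs 0-3 correspond to namespaces 1, 3, 4, 2, as in the example above. */
static LunExt g_lunTable[MAX_LUNS] = {
    { 1, 0 }, { 3, 0 }, { 4, 0 }, { 2, 0 }
};

/* Search the table for a namespace ID; returns the LUN index, or -1 if absent. */
static int FindLunByNsid(const LunExt *table, size_t count, uint32_t nsid)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].namespaceId == nsid) {
            return (int)i;
        }
    }
    return -1;
}

/* Set the flag through the extension pointer already in hand; no index math. */
static void MarkReadOnly(LunExt *pLunExt)
{
    pLunExt->IsNamespaceReadOnly = 1;
}
```

With this table, FindLunByNsid(g_lunTable, MAX_LUNS, 3) yields LUN 1, while NSID-1 yields 2, which is the slot for namespace 4. Writing through the pointer avoids the lookup entirely.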
3. Along with the new member, IsNamespaceReadOnly, the nvme_lun_extension also has ReadOnly. It seems like the setting of WP should take into account the value of both members.
4. If the NVMe command Get Log Page fails (SCT != Generic_command_status || SC != Successful completion), the buffer pSrbExt->pDatBuffer is not freed. This corresponds to the allocation at snti.c:6530.
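A rough sketch of the cleanup I have in mind is below. It is standalone C, not driver code: SrbExt, pLogBuffer and CompleteGetLogPage are hypothetical names, and malloc/free stand in for whatever allocator the driver uses at snti.c:6530.

```c
#include <stdint.h>
#include <stdlib.h>

#define SCT_GENERIC_COMMAND_STATUS 0x0
#define SC_SUCCESSFUL_COMPLETION   0x0

/* Hypothetical, simplified stand-in for the SRB extension. */
typedef struct {
    void *pLogBuffer;   /* buffer allocated for the Get Log Page data */
} SrbExt;

/*
 * Returns 1 when the log page data can be translated, 0 on failure.
 * On failure the buffer is released here so that it does not leak.
 */
static int CompleteGetLogPage(SrbExt *pSrbExt, uint8_t sct, uint8_t sc)
{
    if (sct != SCT_GENERIC_COMMAND_STATUS || sc != SC_SUCCESSFUL_COMPLETION) {
        free(pSrbExt->pLogBuffer);
        pSrbExt->pLogBuffer = NULL;
        return 0;
    }
    return 1;
}
```

Clearing the pointer after the free keeps a later cleanup path from releasing the same buffer twice.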
5. snti.c:6539: I think the following can be eliminated. The same copy occurs during SntiTranslateModeSenseResponse - snti.c:8272.
if (GET_DATA_BUFFER(pSrb) != NULL) {
StorPortCopyMemory((PVOID)GET_DATA_BUFFER(pSrb),
(PVOID)(pSrbExt->modeSenseBuf), GET_DATA_LENGTH(pSrb));
}
Let me know if you have questions,
Tom Freeman
Software Engineer, Device Manager and Driver Development
Western Digital Corporation
e. Thomas.freeman at hgst.com
o. +1-507-322-2311
From: nvmewin [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of SUMAN PRAKASH B
Sent: Tuesday, April 19, 2016 8:45 AM
To: nvmewin at lists.openfabrics.org
Cc: sukka.kim at samsung.com; prakash.v at samsung.com; anshul at samsung.com; MANOJ THAPLIYAL <m.thapliyal at samsung.com>; tru.nguyen at ssi.samsung.com
Subject: [nvmewin] Patch with changes for disk Read only support
Hi all,
This patch includes changes for supporting NVMe Disk read only mode.
I have written a detailed overview of the changes in the attached doc file (the contents are also copied below), and the attached zip file contains the source code.
Password is samsungnvme
Please let me know if you have any questions.
Thanks,
Suman
******************
NVMe Disk End of Life support:
Whenever an NVMe disk exhausts its P/E cycles, the disk becomes Read only (reaches End Of Life). In this case, the user should be able to read the data from the disk for backup or migration purposes. To achieve this, the driver should inform the kernel that the disk has become read only. If the driver does not inform the kernel, the disk will be unusable from Windows.
The device has to be detected as Read only in the following 2 scenarios -
a. Detection during device hot plug
When a Read only device is hot inserted, the kernel should be able to enumerate the device as Read only and alert the user accordingly. When the SSD is hot inserted, as part of disk initialization process, a SCSI mode sense command with page code 'Return all pages' (0x3f) is requested by the kernel. The mode page has a mode parameter header, which has a WP bit in the 'Device specific Parameter' field which indicates if the device is Write Protected for some reason. We can make use of this field to report to the kernel that the device has become Read only.
When the miniport driver receives this request, the NVM Express command Get log page is built with log identifier 'SMART / Health Information' (0x2) and sent to the device. The SMART data has a 'Critical Warning' field in which a bit 'MediaInReadOnlyMode' is set whenever the media becomes Read only. So if the device returns SMART data with this bit set, the miniport driver sets the Device specific parameter - WP bit in the mode parameter header and completes the command.
When the WP bit is set in the mode parameter header, the kernel will understand that the device is Write protected and hence the kernel will not send any more write requests.
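The translation described above can be sketched roughly as follows. This is a standalone illustration, not the patch code: ModeParameterHeader6, ApplyWriteProtect and the two macros are hypothetical names. The bit positions assume Critical Warning bit 3 for read-only media and bit 7 of the Device specific parameter for WP on a direct-access device.

```c
#include <stdint.h>

#define SMART_CW_MEDIA_READ_ONLY 0x08u  /* Critical Warning, bit 3 */
#define MODE_DSP_WRITE_PROTECT   0x80u  /* Device specific parameter, bit 7 */

/* Hypothetical layout of the 4-byte MODE SENSE(6) parameter header. */
typedef struct {
    uint8_t ModeDataLength;
    uint8_t MediumType;
    uint8_t DeviceSpecificParameter;
    uint8_t BlockDescriptorLength;
} ModeParameterHeader6;

/* Set or clear WP from the SMART Critical Warning byte; other bits are kept. */
static void ApplyWriteProtect(ModeParameterHeader6 *hdr, uint8_t criticalWarning)
{
    if (criticalWarning & SMART_CW_MEDIA_READ_ONLY) {
        hdr->DeviceSpecificParameter |= MODE_DSP_WRITE_PROTECT;
    } else {
        hdr->DeviceSpecificParameter &= (uint8_t)~MODE_DSP_WRITE_PROTECT;
    }
}
```

Only the WP bit is touched, so any other Device specific parameter bits the driver has already filled in survive the update.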
b. Detection during run time
When the device is in use and its write endurance is exhausted, it becomes Read only, and the kernel has to immediately report to the user that the device has become write protected. To achieve this, whenever the device receives an NVMe Write request after it has become Read only, the device sets SCT to Command Specific Status and SC to 'Attempted Write to Read Only Range' in its response to the write command. For this status, the following sense data is returned for the corresponding SCSI write command.
Sense key - SCSI_SENSE_DATA_PROTECT, ASC - SCSI_ADSENSE_WRITE_PROTECT and ASCQ - SCSI_ADSENSE_NO_SENSE.
With this sense data, the kernel will understand that the device is in a Write protected state, and the Mode sense command with mode page 'Return all pages' will be sent to the device. Again using the NVM Express Get log page - SMART command, the miniport driver will set the mode sense 'Device specific parameter' accordingly.
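As a rough sketch of the status mapping described above (standalone C, not the code in SntiMapCompletionStatus): SenseTriple, MapWriteStatus and the macro names are hypothetical. The numeric values are assumptions taken from the NVMe and SCSI definitions: SCT 0x1 and SC 0x82 on the NVMe side, sense key 0x07, ASC 0x27 and ASCQ 0x00 on the SCSI side.

```c
#include <stdint.h>

#define NVME_SCT_COMMAND_SPECIFIC 0x1
#define NVME_SC_WRITE_TO_RO_RANGE 0x82  /* Attempted Write to Read Only Range */

#define SENSE_KEY_DATA_PROTECT    0x07  /* SCSI_SENSE_DATA_PROTECT */
#define ASC_WRITE_PROTECT         0x27  /* SCSI_ADSENSE_WRITE_PROTECT */
#define ASCQ_NO_SENSE             0x00  /* SCSI_ADSENSE_NO_SENSE */

/* Hypothetical container for the three sense values. */
typedef struct {
    uint8_t senseKey;
    uint8_t asc;
    uint8_t ascq;
} SenseTriple;

/* Returns 1 and fills *out when the NVMe status means "write protected". */
static int MapWriteStatus(uint8_t sct, uint8_t sc, SenseTriple *out)
{
    if (sct == NVME_SCT_COMMAND_SPECIFIC && sc == NVME_SC_WRITE_TO_RO_RANGE) {
        out->senseKey = SENSE_KEY_DATA_PROTECT;
        out->asc      = ASC_WRITE_PROTECT;
        out->ascq     = ASCQ_NO_SENSE;
        return 1;
    }
    return 0;   /* any other status is left to the existing mapping */
}
```

Any status other than this pair falls through untouched, so the rest of the completion mapping is unaffected.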
Code changes:
1. In SntiReturnAllModePages(), build get log page for SMART/health information and send to device.
2. In SntiTranslateModeSenseResponse(), for mode page MODE_SENSE_RETURN_ALL, set the Write protect bit in the device specific parameter of the mode header based on the media in read only mode bit (bit 3) in the critical warning field returned in the SMART/health log page.
3. The checking for volatile write cache is moved from SntiReturnAllModePages() to SntiTranslateModeSenseResponse() after successful completion of get log page command.
We have tested the following:
a. On a Read only NVMe SSD, install OFA driver with these changes. In the disk management tool, the status of disk is shown as Read Only. Please find attached "DiskMgmt.jpg" (sometimes requires a system restart after driver installation).
b. Hot insert a RO NVMe SSD and observe status as Read Only in disk management tool.
c. On an NVMe SSD with a low percentage of available spare (for example 10%), run the Iometer tool with write commands. When available spare reaches 0%, the error count in Iometer starts increasing (i.e. write commands fail with the sense data explained in the sections above), and the status becomes Read Only in the disk management tool.
d. After the disk becomes RO, when we try to copy files to the RO drive, Windows shows the message "The disk is write protected". Please find attached "FileCopy.jpg"
Note:
a. As per NVMe spec 1.2, section 5.10.1.2, "There is not namespace specific information defined in the SMART / Health log page in this revision, thus the global log page and namespaces specific log page contain identical information". So when testing with multiple namespaces, when 1 namespace becomes RO, all the namespaces will become RO. The spec would have to define separate SMART / Health data per namespace.
b. For testing, if no NVMe SSD in RO state is available, the following changes can be made in the driver to test this feature:
1. In SntiTranslateModeSenseResponse(), hardcode pNvmeLogPage->CriticalWarning.MediaInReadOnlyMode to 1, before checking for the value. This can also be done per namespace.
2. In SntiMapCompletionStatus(), for the NVMe write command, hardcode statusCodeType to COMMAND_SPECIFIC_ERRORS and statusCode to 0x82. This can also be done per namespace.