[nvmewin] Potential BSOD if DataTransferLength < Allocation Length
Mike Berhan
mikeb at bustrace.com
Wed Jun 25 15:59:09 PDT 2014
Alex,
I do a lot of boundary testing which is why I come up with these esoteric
bugs. Regarding the issues cited, I use this type of code to determine the
maximum amount of data to transfer.
ULONG nMaxTransferLen = (pDataBuffer ?
min(nDataTransferLength,nCdbAllocationLength) : 0);
That covers a NULL pointer as well as the DataTransferLength < CDB
Allocation length issue.
I then determine the amount of data I would like to ideally transfer in an
nDesiredTransferLen variable. For example, let's say 16 bytes is the actual
length of the Report LUNs structure to return. I then determine the actual
amount of data I'm going to transfer by performing this comparison:
ULONG nTransferLength = min(nDesiredTransferLen, nMaxTransferLen);
Now I know the exact number of bytes I'm going to transfer. I'll only work
within that memory space and that is the value returned in
DataTransferLength. The last thing to do is accurately report
SRB_STATUS_SUCCESS or SRB_STATUS_DATA_OVERRUN depending on the inbound and
outbound data lengths.
Mike
-----Original Message-----
From: Alex Chang [mailto:Alex.Chang at pmcs.com]
Sent: Wednesday, June 25, 2014 4:24 PM
To: Mike Berhan; nvmewin at lists.openfabrics.org
Subject: RE: [nvmewin] Potential BSOD if DataTransferLength < Allocation
Length
Hi Mike,
You did bring up couple of good points. From my experience, the
DataTransferLength is always equal or greater than the allocLength of CDB
and I am sure it's up to upper layer software when they package the SRB. In
order to avoid the potential BSOD, I'd agree to do sanity check on them.
As for the 2nd one, with the sanity check in place, it should be fine to
zero out the allocated buffer in the length specified by SRB. Please let me
know what you think.
Thanks,
Alex
-----Original Message-----
From: nvmewin-bounces at lists.openfabrics.org
[mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Mike Berhan
Sent: Wednesday, June 25, 2014 2:00 PM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] Potential BSOD if DataTransferLength < Allocation Length
I recently joined this reflector. Advanced apologies if this has already
been covered.
I've been perusing the latest NVMe OFA source code. The NVMe Win source
indicates a potential BSOD on a poorly formatted SPTI/SRB/CDB. For the sake
of discussion, I'll limit this to the Report LUNs CDB but it's applicable to
a variety of other CDBs the driver handles. Here is a sample SRB/CDB to
consider:
_SCSI_REQUEST_BLOCK
<snip>
ULONG DataTransferLength 00000100h
<snip>
PVOID DataBuffer 0000006C`801D0A50h
<snip>
UCHAR Cdb[16] A0 00 00 00 00 00 00 00 10 00 00 00 00 00 00
00
The key items to consider are the DataTransferLength at 100h and the CDB
Allocation Length at 1000h. This is likely a bug in the calling software as
the DataTransferLength should be greater than or equal to the CDB allocation
length. In this example, this probably doesn't cause a problem as the
return data is likely less than 100h bytes anyway so the caller would never
see any failure. However, when executed on the NVMe driver, several
problems can occur.
In reviewing the source code, the Report LUNs CDB gets routed to the
SntiTranslateReportLuns() function. The code retrieves information about
the command as you see here:
pResponseBuffer = (PUCHAR)GET_DATA_BUFFER(pSrb);
allocLength = GET_REPORT_LUNS_ALLOC_LENGTH(pSrb);
selectReport = GET_U8_FROM_CDB(pSrb, REPORT_LUNS_SELECT_REPORT_OFFSET);
>From our example above, the following values are extracted from the SRB/CDB:
pResponseBuffer = 0000006C`801D0A50h
allocLength = 1000h
selectReport = 00h (i.e. RESTRICTED_LUNS_RETURNED)
The following code then gets executed:
if ((selectReport != ALL_LUNS_RETURNED) &&
(selectReport != ALL_WELL_KNOWN_LUNS_RETURNED) &&
(selectReport != RESTRICTED_LUNS_RETURNED)) {
<snip>
} else {
<snip>
memset(pResponseBuffer, 0, allocLength); <<<--- BSOD!?!?!?
<snip>
}
Several potential problems are seen in walking through the source code.
Problem #1 - Driver overrunning data buffer
The miniport driver received a 100h byte buffer. However, because the CDB
allocation length was set to 1000h, the miniport driver is going to overflow
the buffer and zero out 1000h bytes of data. At best you're looking at a
BSOD. At worst memory is corrupted. It's also not checking for a NULL SRB
Data buffer pointer or the SRB data direction flags.
Problem #2 - Miniport writing to buffer when it shouldn't
A storage device should only alter the data buffer that has actual data.
For example, if I send down a request with a 256 byte buffer, but the device
only returns 16 bytes, the remainder of the buffer (i.e. the tail 240 bytes)
should not be altered. The DataTransferLength field in the SRB would be
returned as 16 indicating that only 16 bytes were transferred.
In the sample cited above, the miniport driver is attempting to zero out the
entire data buffer passed in when it should only zero out the portion of the
data buffer that it will actually return data in (max DataTransferLength).
Not that this is a particularly critical issue but it is worth noting and
would make the driver more robust.
Mike
_______________________________________________
nvmewin mailing list
nvmewin at lists.openfabrics.org
http://lists.openfabrics.org/mailman/listinfo/nvmewin
More information about the nvmewin
mailing list