[nvmewin] Potential BSOD if DataTransferLength < Allocation Length

Alex Chang Alex.Chang at pmcs.com
Wed Jun 25 15:23:43 PDT 2014

Hi Mike,

You did bring up couple of good points. From my experience, the DataTransferLength is always equal or greater than the allocLength of CDB and I am sure it's up to upper layer software when they package the SRB. In order to avoid the potential BSOD, I'd agree to do sanity check on them.
As for the 2nd one, with the sanity check in place, it should be fine to zero out the allocated buffer in the length specified by SRB. Please let me know what you think.


-----Original Message-----
From: nvmewin-bounces at lists.openfabrics.org [mailto:nvmewin-bounces at lists.openfabrics.org] On Behalf Of Mike Berhan
Sent: Wednesday, June 25, 2014 2:00 PM
To: nvmewin at lists.openfabrics.org
Subject: [nvmewin] Potential BSOD if DataTransferLength < Allocation Length

I recently joined this reflector.  Advanced apologies if this has already been covered.

I've been perusing the latest NVMe OFA source code.  The NVMe Win source indicates a potential BSOD on a poorly formatted SPTI/SRB/CDB.  For the sake of discussion, I'll limit this to the Report LUNs CDB but it's applicable to a variety of other CDBs the driver handles.  Here is a sample SRB/CDB to

    ULONG DataTransferLength  00000100h
    PVOID DataBuffer          0000006C`801D0A50h
    UCHAR Cdb[16]             A0 00 00 00 00 00 00 00 10 00 00 00 00 00 00

The key items to consider are the DataTransferLength at 100h and the CDB Allocation Length at 1000h.  This is likely a bug in the calling software as the DataTransferLength should be greater than or equal to the CDB allocation length.  In this example, this probably doesn't cause a problem as the return data is likely less than 100h bytes anyway so the caller would never see any failure.  However, when executed on the NVMe driver, several problems can occur.

In reviewing the source code, the Report LUNs CDB gets routed to the
SntiTranslateReportLuns() function.  The code retrieves information about the command as you see here:

    pResponseBuffer = (PUCHAR)GET_DATA_BUFFER(pSrb);
    allocLength = GET_REPORT_LUNS_ALLOC_LENGTH(pSrb);

>From our example above, the following values are extracted from the SRB/CDB:

    pResponseBuffer = 0000006C`801D0A50h
    allocLength = 1000h
    selectReport = 00h (i.e. RESTRICTED_LUNS_RETURNED)

The following code then gets executed:

    if ((selectReport != ALL_LUNS_RETURNED)            &&
        (selectReport != ALL_WELL_KNOWN_LUNS_RETURNED) &&
        (selectReport != RESTRICTED_LUNS_RETURNED)) {
    } else {
        memset(pResponseBuffer, 0, allocLength);   <<<--- BSOD!?!?!?

Several potential problems are seen in walking through the source code.

Problem #1 - Driver overrunning data buffer

The miniport driver received a 100h byte buffer.  However, because the CDB allocation length was set to 1000h, the miniport driver is going to overflow the buffer and zero out 1000h bytes of data.  At best you're looking at a BSOD.  At worst memory is corrupted.  It's also not checking for a NULL SRB Data buffer pointer or the SRB data direction flags.

Problem #2 - Miniport writing to buffer when it shouldn't

A storage device should only alter the data buffer that has actual data.
For example, if I send down a request with a 256 byte buffer, but the device only returns 16 bytes, the remainder of the buffer (i.e. the tail 240 bytes) should not be altered.  The DataTransferLength field in the SRB would be returned as 16 indicating that only 16 bytes were transferred.

In the sample cited above, the miniport driver is attempting to zero out the entire data buffer passed in when it should only zero out the portion of the data buffer that it will actually return data in (max DataTransferLength).
Not that this is a particularly critical issue but it is worth noting and would make the driver more robust.


nvmewin mailing list
nvmewin at lists.openfabrics.org

More information about the nvmewin mailing list