[ofa-general] Re: [PATCH 2.6.30.4] Fix for NULL pointer dereference by SRP initiator triggered by a SCSI reset after the SRP connection has been closed

Bart Van Assche bart.vanassche at gmail.com
Tue Aug 4 00:48:22 PDT 2009


On Mon, Aug 3, 2009 at 10:36 PM, Roland Dreier <rdreier at cisco.com> wrote:
>
>  > Issuing a SCSI reset command on an SRP initiator after the SRP connection has
>  > been closed triggers a NULL pointer dereference. The patch below fixes this
>  > NULL pointer dereference.
>  >
>  > See also http://bugzilla.kernel.org/show_bug.cgi?id=13893.
>
> Thanks for debugging this... a couple of questions:
>
>  > +    BUG_ON(!req->scmnd->device);
>
> Why BUG_ON() here?  Can we return failure or something, rather than
> crashing the whole system?

The function srp_send_tsk_mgmt() contains a.o. the following
statement: "tsk_mgmt->lun = cpu_to_be64((u64) req->scmnd->device->lun
<< 48);". This is the statement that triggered the NULL pointer
dereference. Whether or not a BUG_ON() is appropriate here depends on
which of the following two alternatives is preferred: should the
caller guarantee that req->scmnd->device != NULL or should
srp_send_tsk_mgmt() should handle the condition req->scmnd->device ==
NULL itself ?

>  > +    if (!req->scmnd->device)
>  > +            return FAILED;
>
> How do we end up in srp_reset_device() with req->scmnd->device == NULL?
> Presumably req->scmnd should match scmnd if I am understanding the code
> properly -- and then scmnd->device == NULL??

Good question. I did not yet analyze why this happens. But before I
started developing a patch I had first verified that scmnd->device is
NULL at that point by inserting the statement WARN_ON(!scmnd->device).

A clue might be that without the above patch the BUG message on the
initiator system is triggered just after the "SRP reset_device called"
message has been logged.

Bart.



More information about the general mailing list