[ofa-general] Re: [PATCH 2.6.30.4] Fix for NULL pointer dereference by SRP initiator triggered by a SCSI reset after the SRP connection has been closed
Bart Van Assche
bart.vanassche at gmail.com
Tue Aug 4 00:48:22 PDT 2009
On Mon, Aug 3, 2009 at 10:36 PM, Roland Dreier <rdreier at cisco.com> wrote:
>
> > Issuing a SCSI reset command on an SRP initiator after the SRP connection has
> > been closed triggers a NULL pointer dereference. The patch below fixes this
> > NULL pointer dereference.
> >
> > See also http://bugzilla.kernel.org/show_bug.cgi?id=13893.
>
> Thanks for debugging this... a couple of questions:
>
> > + BUG_ON(!req->scmnd->device);
>
> Why BUG_ON() here? Can we return failure or something, rather than
> crashing the whole system?
The function srp_send_tsk_mgmt() contains a.o. the following
statement: "tsk_mgmt->lun = cpu_to_be64((u64) req->scmnd->device->lun
<< 48);". This is the statement that triggered the NULL pointer
dereference. Whether or not a BUG_ON() is appropriate here depends on
which of the following two alternatives is preferred: should the
caller guarantee that req->scmnd->device != NULL or should
srp_send_tsk_mgmt() should handle the condition req->scmnd->device ==
NULL itself ?
> > + if (!req->scmnd->device)
> > + return FAILED;
>
> How do we end up in srp_reset_device() with req->scmnd->device == NULL?
> Presumably req->scmnd should match scmnd if I am understanding the code
> properly -- and then scmnd->device == NULL??
Good question. I did not yet analyze why this happens. But before I
started developing a patch I had first verified that scmnd->device is
NULL at that point by inserting the statement WARN_ON(!scmnd->device).
A clue might be that without the above patch the BUG message on the
initiator system is triggered just after the "SRP reset_device called"
message has been logged.
Bart.
More information about the general
mailing list