[openib-general][patch review] srp: fmr implementation
Vu Pham
vuhuong at mellanox.com
Wed Apr 19 11:56:34 PDT 2006
Roland Dreier wrote:
> > > And what if you comment out the line
> > > .eh_device_reset_handler = srp_reset_device,
> > > does that fix it?
>
> > No
>
> Now I'm really confused.
>
Me too.
> It seems we lose the connection to the target (BTW -- do you know why
> the connection is getting killed)?
I reported the error in my original email responding to
your fmr patch. On an ia64 system with a PCI-X HCA I got the
async event IB_EVENT_QP_ACCESS_ERR at the initiator (and I
got a CQE with IB_COMPLETION_STATUS_REMOTE_ACCESS_ERROR
status at my target).
I still do not have an IB analyzer trace (as you suggested).
>
> So the SCSI midlayer times out commands and tries to abort them. But
> we have no connection so the abort fails. The SCSI command shouldn't
> get freed now (at least if I'm understanding scsi_error.c correctly).
>
> Then we have no .eh_device_reset_handler so everything should fall
> through to calling our .eh_host_reset_handler without freeing any SCSI
> commands. And then we crash on a use-after-free of a SCSI command.
>
> So where is that command getting freed on us??
>
The SCSI command passed to the error handlers
(.eh_abort_handler, .eh_host_reset_handler...) is not the
same as the SCSI command from req->scmnd that hits the
use-after-free.
There is some glitch where the SCSI command from req->scmnd
has already been freed by the SCSI midlayer; however, the
request is still in our pending request queue.
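
To make the suspected state concrete, here is a minimal user-space
sketch, not the ib_srp code. Everything in it is a hypothetical
stand-in: struct cmd, struct request, struct req_queue and the
helper names are invented for illustration, and the "freed" flag
only models the midlayer having released a command (nothing is
actually freed, so the sketch itself is safe to run). It shows how
a request left on the pending queue keeps a stale scmnd pointer
that a later reset pass would touch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a SCSI command owned by the midlayer. */
struct cmd {
    int  tag;
    bool freed;              /* models "midlayer already released this" */
};

/* Hypothetical stand-in for a driver request with a scmnd back-pointer. */
struct request {
    struct cmd     *scmnd;   /* mirrors the req->scmnd idea */
    struct request *next;
};

/* Hypothetical pending request queue (singly linked list). */
struct req_queue {
    struct request *head;
};

/* Put a request on the pending queue. */
static void queue_add(struct req_queue *q, struct request *r)
{
    assert(q != NULL && r != NULL);
    r->next = q->head;
    q->head = r;
}

/*
 * Unlink the request that refers to command c and clear its pointer.
 * Returns true if such a request was found on the queue.
 */
static bool queue_remove_cmd(struct req_queue *q, const struct cmd *c)
{
    for (struct request **pp = &q->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->scmnd == c) {
            struct request *r = *pp;
            *pp = r->next;
            r->next = NULL;
            r->scmnd = NULL;
            return true;
        }
    }
    return false;
}

/*
 * Count queued requests whose command is already marked freed. Each one
 * is a place where a pass over the queue would dereference a stale
 * pointer.
 */
static int count_stale(const struct req_queue *q)
{
    int n = 0;
    for (const struct request *r = q->head; r != NULL; r = r->next) {
        if (r->scmnd != NULL && r->scmnd->freed)
            n++;
    }
    return n;
}
```

If a command is marked freed without calling queue_remove_cmd()
first, count_stale() reports it; if the request is unlinked before
the command is released, the queue stays clean. Whether the real
driver is missing such an unlink on some path is exactly the open
question here, so treat this as an illustration of the symptom
only.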
Vu