[openib-general][patch review] srp: fmr implementation,
Vu Pham
vuhuong at mellanox.com
Mon May 8 09:50:08 PDT 2006
Roland Dreier wrote:
> Vu> scsi_eh_try_stu or scsi_eh_try_tur get timeout, scsi midlayer
> Vu> tried to abort stu or tur command as well. Since we delay to
> Vu> clean in srp_reset_device(), srp's request queue is still not
> Vu> empty. This stu or tur command is freed by scsi midlayer. The
> Vu> next srp_host_reset() will try to clean srp's request queue
> Vu> with "old" request referencing to freed scsi command.
>
> This is where I get confused. We should be flushing out the command
> queue in srp_host_reset(), so the loop in scsi_error.c after resetting
> the host:
>
> rtn = scsi_try_host_reset(scmd);
> if (rtn == SUCCESS) {
> list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
> if (!scsi_device_online(scmd->device) ||
> (!scsi_eh_try_stu(scmd) && !scsi_eh_tur(scmd)) ||
> !scsi_eh_tur(scmd))
> scsi_eh_finish_cmd(scmd, done_q);
> }
>
> should not find any commands still queued.
>
Have you read scsi_eh_try_stu(scmnd) and scsi_eh_tur(scmnd)?
These functions use the same scmnd and reformat it with new
cdb and call srp_queuecommand() which uses new req and put
this new req in request queue for this same scmnd with
different cdb
> Your previous patch can't be the right fix. I think there are two
> things wrong with the changes below to srp_reset_device():
>
> - You changed srp_abort to remove the request from SRP's queue, but
> then you look it up and use it again in srp_reset_device, which
> seems risky at best.
I aggree. In stead of looking for and using the same req's
buffer to send task management packet, I need new buffer to
send this task management then.
> - If srp_reset_device() succeeds, you don't flush all matching
> commands, so this will definitely leave some stale commands in
> SRP's queue.
>
I already guarantee to flush all commands in srp_abort.
Vu
More information about the general
mailing list