[openib-general][patch review] srp: fmr implementation,
Vu Pham
vuhuong at mellanox.com
Mon May 8 09:19:32 PDT 2006
Roland Dreier wrote:
> > 1st scsi_try_host_reset() --> srp_host_reset() -->
> > srp_reconnect_target() return SUCCESS. Then scsi_eh_try_stu() or
> > scsi_eh_tur() is called right after
> >
> > scsi_eh_try_stu or scsi_eh_tur --> scsi_send_eh_cmnd() -->
> > srp_queuecommand()
>
> But after srp_reconnect_target(), both SRP's and the midlayer's queue
> of pending commands should be completely empty, since I put
>
> list_for_each_entry(req, &target->req_queue, list) {
> req->scmnd->result = DID_RESET << 16;
> req->scmnd->scsi_done(req->scmnd);
> srp_unmap_data(req->scmnd, target, req);
> }
>
> and
>
> INIT_LIST_HEAD(&target->free_reqs);
> INIT_LIST_HEAD(&target->req_queue);
> for (i = 0; i < SRP_SQ_SIZE; ++i)
> list_add_tail(&target->req_ring[i].list, &target->free_reqs);
>
> in there. Why doesn't that work to kill all the pending commands?
That works fine and kills all the pending commands. However,
right after srp_host_reset() returns, the scsi error handler
queues and sends the STU or TUR command in the error handling
flow of scsi_eh_host_reset().

Please re-read scsi_eh_host_reset() and
scsi_try_host_reset() in scsi_error.c. Here is the logic:

scsi_eh_host_reset() --> scsi_try_host_reset() -->
srp_host_reset() --- all pending commands are killed.
srp_host_reset() returns SUCCESS, so scsi_try_host_reset()
returns SUCCESS.

static int scsi_eh_host_reset(struct list_head *work_q,
                              struct list_head *done_q)
{
        ...
        rtn = scsi_try_host_reset(scmd);
        if (rtn == SUCCESS) {
                list_for_each_entry_safe(scmd, next, work_q,
                                         eh_entry) {
                        if (!scsi_device_online(scmd->device) ||
                            (!scsi_eh_try_stu(scmd) &&
                             !scsi_eh_tur(scmd)) ||
                            !scsi_eh_tur(scmd))
        ...
}

Since rtn == SUCCESS, scsi_eh_host_reset() calls
scsi_eh_try_stu() or scsi_eh_tur(), which call
scsi_send_eh_cmnd() --> srp_queuecommand(). Now srp's
request queue is not empty anymore.

When scsi_eh_try_stu() or scsi_eh_tur() times out, the scsi
midlayer tries to abort the STU or TUR command as well. Since we
defer the cleanup in srp_reset_device(), srp's request queue is
still not empty. The STU or TUR command is then freed by the scsi
midlayer. The next srp_host_reset() will try to clean srp's
request queue, which still holds the "old" request referencing
the freed scsi command.
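
To make the sequence above concrete, here is a minimal user-space
sketch of it. It is not the ib_srp code: every model_* name,
MODEL_SQ_SIZE and MODEL_DID_RESET are hypothetical stand-ins, and a
"live" flag takes the place of actually freeing memory so the model
can count the stale entry instead of dereferencing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SQ_SIZE   4     /* hypothetical stand-in for SRP_SQ_SIZE */
#define MODEL_DID_RESET 0x08  /* stand-in for the midlayer's DID_RESET */

/* Stand-in for struct scsi_cmnd; "live" is false once it has been
 * completed or freed by the midlayer. */
struct model_cmnd {
    bool live;
    int result;
};

/* Stand-in for an SRP request; scmnd == NULL means the slot is free. */
struct model_req {
    struct model_cmnd *scmnd;
};

struct model_target {
    struct model_req req_ring[MODEL_SQ_SIZE];
};

/* Models srp_queuecommand(): park the command in the first free slot. */
static int model_queuecommand(struct model_target *t, struct model_cmnd *c)
{
    for (size_t i = 0; i < MODEL_SQ_SIZE; ++i) {
        if (t->req_ring[i].scmnd == NULL) {
            t->req_ring[i].scmnd = c;
            return 0;
        }
    }
    return -1; /* ring full */
}

/* Models the midlayer giving up on a timed-out EH command: the command
 * is released, but the driver's request slot is left untouched. */
static void model_midlayer_free(struct model_cmnd *c)
{
    c->live = false;
}

/* Models the cleanup loop run on reconnect: complete every queued
 * request and empty the ring. Returns how many slots pointed at a
 * command that was already gone. */
static int model_reconnect(struct model_target *t)
{
    int stale = 0;

    for (size_t i = 0; i < MODEL_SQ_SIZE; ++i) {
        struct model_cmnd *c = t->req_ring[i].scmnd;

        if (c == NULL)
            continue;
        if (c->live) {
            c->result = MODEL_DID_RESET << 16;
            c->live = false;   /* completion hands it back */
        } else {
            ++stale;           /* a real driver would touch freed memory */
        }
        t->req_ring[i].scmnd = NULL;
    }
    return stale;
}

/* Replays the two host resets; returns the stale count seen by the
 * second one. */
static int model_replay(void)
{
    struct model_target t = {0};
    struct model_cmnd io = { .live = true }, tur = { .live = true };

    model_queuecommand(&t, &io);
    assert(model_reconnect(&t) == 0); /* 1st reset: queue is clean      */
    model_queuecommand(&t, &tur);     /* EH sends TUR right afterwards  */
    model_midlayer_free(&tur);        /* TUR times out and is freed     */
    return model_reconnect(&t);       /* 2nd reset walks a stale entry  */
}
```

In this model the first reset finds nothing stale, while the second
one walks a slot whose command the midlayer has already given up on,
which is the "old" request described above.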

If you still have questions, I can call you, or you can give me
a call at (408) 916-0006.

Vu