[openib-general] [PATCH] SRP: Avoid a potential race on target->req_queue

Tue Jun 13 07:57:47 PDT 2006

Hi Roland,

There is a potential race between srp_reconnect_target and srp_reset_device
when they access target->req_queue.
These functions can execute at the same time because srp_reconnect_target is
called from srp_reconnect_work, which is scheduled by srp_completion, while
srp_reset_device is called from the SCSI layer.

The race exists because srp_reconnect_target does not hold host_lock while
accessing target->req_queue. It assumes that since the state is CONNECTING, no
other function will access target->req_queue (this holds for srp_reset_host,
for example).

There are two possible solutions:
1) Change srp_reset_device: after locking host_lock, it checks the
   state. Only if the state is LIVE does it execute the loop that accesses
   target->req_queue.
2) Change srp_reconnect_target: before executing the loop that accesses
   target->req_queue, it takes host_lock, and it releases the lock after
   the loop.
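
To make the first option concrete, here is a rough userspace sketch of the
idea. It is not code from ib_srp.c: every model_* name is hypothetical, a
pthread mutex stands in for host_lock, and a plain linked list stands in for
target->req_queue. It only illustrates checking the state under the lock
before walking the queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace model only; the model_* names are hypothetical, not driver code. */
enum model_state { MODEL_LIVE, MODEL_CONNECTING };

struct model_req {
	int reset;			/* set to 1 once the request was reset */
	struct model_req *next;
};

struct model_target {
	pthread_mutex_t host_lock;	/* stands in for scsi_host->host_lock */
	enum model_state state;
	struct model_req *req_queue;	/* stands in for target->req_queue */
};

/*
 * Option 1: take the lock, then walk the queue only if the target is LIVE.
 * Returns the number of requests that were reset.
 */
static int model_reset_device(struct model_target *t)
{
	struct model_req *r;
	int n = 0;

	pthread_mutex_lock(&t->host_lock);
	if (t->state == MODEL_LIVE) {
		for (r = t->req_queue; r != NULL; r = r->next) {
			r->reset = 1;
			n++;
		}
	}
	pthread_mutex_unlock(&t->host_lock);

	return n;
}
```

In this model the reset path leaves the queue alone while the state is
CONNECTING, so the reconnect path can keep walking it without the lock. This
only works if the state is also changed to CONNECTING under the same lock.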

I'm sending a patch for the second solution. If you prefer the first, I have
another patch for it (it is a bit longer).
Which solution do you like better?

Signed-off-by: Ishai Rabinovitz <ishai at mellanox.co.il>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===================================================================

--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c	2006-06-13 02:24:22.000000000 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c	2006-06-13 02:26:07.000000000 +0300
@@ -641,8 +641,10 @@ static int srp_reconnect_target(struct s
 	while (ib_poll_cq(target->cq, 1, &wc) > 0)
 		; /* nothing */
 
+	spin_lock_irq(target->scsi_host->host_lock);
 	list_for_each_entry_safe(req, tmp, &target->req_queue, list)
 		srp_reset_req(target, req);
+	spin_unlock_irq(target->scsi_host->host_lock);
 
 	target->rx_head	 = 0;
 	target->tx_head	 = 0;
-- 
Ishai Rabinovitz