[openib-general] [PATCH] RFC: mthca handling of signals

Michael S. Tsirkin mst at mellanox.co.il
Mon Jan 23 07:54:36 PST 2006


We have run into the following problem: if a task receives
a signal while it is, for example, destroying a resource
(which can happen because the relevant file was closed),
mthca can bail out while trying to take a command
interface semaphore, without issuing the command that
tells hardware that the resource is being destroyed.

As a result we see messages like the following (-4 is -EINTR):
 ib_mthca 0000:04:00.0: HW2SW_CQ failed (-4)

In this case, hardware could access the resource after the memory has been
freed, possibly causing memory corruption.

A simple solution is to replace down_interruptible() with down() in
command interface activation.

A more elegant, but much bigger, change would involve making
resource allocation commands interruptible, while keeping
resource cleanup commands uninterruptible.
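
To illustrate that second option, here is a rough userspace sketch. It is
not mthca code: it uses a POSIX semaphore as a stand-in for the kernel
semaphore, and the names cmd_sem_acquire(), resource_alloc_cmd() and
resource_cleanup_cmd() are hypothetical. The only point is the split
between a wait that a signal may abort (allocation) and a wait that is
retried until it succeeds (cleanup).

```c
#include <errno.h>
#include <semaphore.h>
#include <stdbool.h>

/*
 * Hypothetical helper: take the command semaphore.
 * If 'interruptible' is true, a signal aborts the wait with -EINTR.
 * Otherwise the wait is restarted until the semaphore is acquired.
 */
static int cmd_sem_acquire(sem_t *sem, bool interruptible)
{
	while (sem_wait(sem) != 0) {
		if (errno != EINTR)
			return -errno;	/* unexpected failure */
		if (interruptible)
			return -EINTR;	/* caller may safely give up */
		/* uninterruptible: ignore the signal and wait again */
	}
	return 0;
}

/* Allocation path: nothing has been handed to hardware yet, so bailing out is safe. */
static int resource_alloc_cmd(sem_t *sem)
{
	int err = cmd_sem_acquire(sem, true);

	if (err)
		return err;
	/* the allocation command would be posted here */
	sem_post(sem);
	return 0;
}

/* Cleanup path: the command must be issued, so the wait is never abandoned on a signal. */
static int resource_cleanup_cmd(sem_t *sem)
{
	int err = cmd_sem_acquire(sem, false);

	if (err)
		return err;
	/* the "resource is being destroyed" command would be posted here */
	sem_post(sem);
	return 0;
}
```

With this split, only resource_alloc_cmd() can return -EINTR; the cleanup
path always reaches the point where the destroy command is posted. The
cost is that every caller has to be classified as one or the other, which
is why this is the bigger change.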

---

It's not safe to cancel a command upon a signal.

Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>

Index: openib/drivers/infiniband/hw/mthca/mthca_cmd.c
===================================================================
--- openib/drivers/infiniband/hw/mthca/mthca_cmd.c	(revision 5121)
+++ openib/drivers/infiniband/hw/mthca/mthca_cmd.c	(working copy)
@@ -199,8 +199,7 @@ static int mthca_cmd_post(struct mthca_d
 {
 	int err = 0;
 
-	if (down_interruptible(&dev->cmd.hcr_sem))
-		return -EINTR;
+	down(&dev->cmd.hcr_sem);
 
 	if (event) {
 		unsigned long end = jiffies + GO_BIT_TIMEOUT;
@@ -255,8 +254,7 @@ static int mthca_cmd_poll(struct mthca_d
 	int err = 0;
 	unsigned long end;
 
-	if (down_interruptible(&dev->cmd.poll_sem))
-		return -EINTR;
+	down(&dev->cmd.poll_sem);
 
 	err = mthca_cmd_post(dev, in_param,
 			     out_param ? *out_param : 0,
@@ -333,8 +331,7 @@ static int mthca_cmd_wait(struct mthca_d
 	int err = 0;
 	struct mthca_cmd_context *context;
 
-	if (down_interruptible(&dev->cmd.event_sem))
-		return -EINTR;
+	down(&dev->cmd.event_sem);
 
 	spin_lock(&dev->cmd.context_lock);
 	BUG_ON(dev->cmd.free_head < 0);

-- 
MST


