[openib-general] Re: Re: outstanding patches
Michael S. Tsirkin
mst at mellanox.co.il
Wed Mar 8 08:37:45 PST 2006
Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: Re: outstanding patches
>
> Roland Dreier wrote:
> > > MAD:
> > > mad_port_close.patch
> >
> >BTW, I think I saw the crash that this fixes while testing the
> >previous patches.
>
> I just haven't had time yet to create a patch for this. The one that was
> submitted ends up scanning all ports on every completion. I need to spend
> more time finding a way to handle this, and verify that we don't have a
> larger issue that affects all kernel users.
OK, here's a patch that fixes this race without a full port scan, plus a small cleanup.
---
Fix an oopsable race debugged by Eli Cohen <eli at mellanox.co.il>:
ib_mad_port_close removes the port from port_list and flushes port_priv->wq
before destroying the special QPs, but a completion event can still arrive
and queue new work on that workqueue after the flush. Close the race by
having the completion handler queue work only while the port is still on
port_list, checked under ib_mad_port_list_lock.
This patch also removes a redundant flush_workqueue(): destroy_workqueue()
already performs a flush.
Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
Index: linux-2.6.15/drivers/infiniband/core/mad.c
===================================================================
--- linux-2.6.15.orig/drivers/infiniband/core/mad.c 2006-03-08 14:38:23.000000000 +0200
+++ linux-2.6.15/drivers/infiniband/core/mad.c 2006-03-08 21:13:08.000000000 +0200
@@ -2410,8 +2410,12 @@ static void timeout_sends(void *data)
static void ib_mad_thread_completion_handler(struct ib_cq *cq, void *arg)
{
struct ib_mad_port_private *port_priv = cq->cq_context;
+ unsigned long flags;
- queue_work(port_priv->wq, &port_priv->work);
+ spin_lock_irqsave(&ib_mad_port_list_lock, flags);
+ if (!list_empty(&port_priv->port_list))
+ queue_work(port_priv->wq, &port_priv->work);
+ spin_unlock_irqrestore(&ib_mad_port_list_lock, flags);
}
/*
@@ -2723,18 +2727,23 @@ static int ib_mad_port_open(struct ib_de
}
INIT_WORK(&port_priv->work, ib_mad_completion_handler, port_priv);
+ spin_lock_irqsave(&ib_mad_port_list_lock, flags);
+ list_add_tail(&port_priv->port_list, &ib_mad_port_list);
+ spin_unlock_irqrestore(&ib_mad_port_list_lock, flags);
+
ret = ib_mad_port_start(port_priv);
if (ret) {
printk(KERN_ERR PFX "Couldn't start port\n");
goto error9;
}
- spin_lock_irqsave(&ib_mad_port_list_lock, flags);
- list_add_tail(&port_priv->port_list, &ib_mad_port_list);
- spin_unlock_irqrestore(&ib_mad_port_list_lock, flags);
return 0;
error9:
+ spin_lock_irqsave(&ib_mad_port_list_lock, flags);
+ list_del_init(&port_priv->port_list);
+ spin_unlock_irqrestore(&ib_mad_port_list_lock, flags);
+
destroy_workqueue(port_priv->wq);
error8:
destroy_mad_qp(&port_priv->qp_info[1]);
@@ -2771,11 +2780,9 @@ static int ib_mad_port_close(struct ib_d
printk(KERN_ERR PFX "Port %d not found\n", port_num);
return -ENODEV;
}
- list_del(&port_priv->port_list);
+ list_del_init(&port_priv->port_list);
spin_unlock_irqrestore(&ib_mad_port_list_lock, flags);
- /* Stop processing completions. */
- flush_workqueue(port_priv->wq);
destroy_workqueue(port_priv->wq);
destroy_mad_qp(&port_priv->qp_info[1]);
destroy_mad_qp(&port_priv->qp_info[0]);
--
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies