[ofa-general] [PATCH] ipoib/cm: make stale task actually run once in a while

Michael S. Tsirkin mst at dev.mellanox.co.il
Mon May 7 13:03:15 PDT 2007


While any passive connection is actively receiving, the stale task never runs:
every 4 RX CQEs we call queue_delayed_work again, which delays it by about
10 minutes.  As a result, on a noisy system with failing ports, we slowly run
out of resources, which slows connection setup down and eventually makes it fail.

What we actually want is to start the stale task when the first
passive connection is added, and to rerun it every 10 minutes as long
as there are outstanding passive connections.

As a happy side effect, this removes some code from the RX data path.

Signed-off-by: Michael S. Tsirkin <mst at dev.mellanox.co.il>

---

Scott, I think this might address bugs 541 and 465: slow IPoIB CM HA failover
and eventual IPoIB HA failure. Could you test this please?
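
For illustration only, the intended scheduling policy can be sketched as a
small userspace model. This is not driver code: struct cm_model, the model_*
helpers and STALE_PERIOD are hypothetical names, time is a plain counter in
minutes, and the "armed" flag is an assumed stand-in for the delayed work
being pending. The task is armed when the list goes from empty to non-empty,
re-arms itself while connections remain, and RX traffic never touches it.

```c
#include <assert.h>
#include <stdbool.h>

#define STALE_PERIOD 10 /* minutes; hypothetical stand-in for IPOIB_CM_RX_DELAY */

/* Toy model of the passive-connection list and its stale task. */
struct cm_model {
	int  passive;   /* passive connections currently on the list */
	bool armed;     /* assumed stand-in for "stale task is pending" */
	long deadline;  /* simulated time at which the task fires */
	int  runs;      /* number of times the stale task has run */
};

/* New connection: arm the task only when the list was empty. */
static void model_add_conn(struct cm_model *m, long now)
{
	if (m->passive == 0 && !m->armed) {
		m->armed = true;
		m->deadline = now + STALE_PERIOD;
	}
	m->passive++;
}

/* RX completion: under the new policy it leaves the task alone. */
static void model_rx(struct cm_model *m)
{
	(void)m;
}

/* Stale task: drop 'reaped' connections, re-arm if any remain. */
static void model_stale_task(struct cm_model *m, long now, int reaped)
{
	assert(reaped >= 0 && reaped <= m->passive);
	m->passive -= reaped;
	m->runs++;
	m->armed = m->passive > 0;
	if (m->armed)
		m->deadline = now + STALE_PERIOD;
}

/* Advance simulated time to 'now', running the task whenever it is due
 * and reaping at most 'reap_per_run' connections per run. */
static void model_advance(struct cm_model *m, long now, int reap_per_run)
{
	while (m->armed && m->deadline <= now) {
		int n = reap_per_run < m->passive ? reap_per_run : m->passive;

		model_stale_task(m, m->deadline, n);
	}
}
```

In this model, calling model_rx any number of times between model_add_conn
and model_advance does not move the deadline, so the task still fires one
period after the first connection was added.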

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 2b242a4..b77e8d7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -258,10 +258,11 @@ static int ipoib_cm_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *even
 	cm_id->context = p;
 	p->jiffies = jiffies;
 	spin_lock_irqsave(&priv->lock, flags);
+	if (list_empty(&priv->cm.passive_ids))
+		queue_delayed_work(ipoib_workqueue,
+				   &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
 	list_add(&p->list, &priv->cm.passive_ids);
 	spin_unlock_irqrestore(&priv->lock, flags);
-	queue_delayed_work(ipoib_workqueue,
-			   &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
 	return 0;
 
 err_rep:
@@ -380,8 +381,6 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
 			if (!list_empty(&p->list))
 				list_move(&p->list, &priv->cm.passive_ids);
 			spin_unlock_irqrestore(&priv->lock, flags);
-			queue_delayed_work(ipoib_workqueue,
-					   &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
 		}
 	}
 
@@ -1104,6 +1103,10 @@ static void ipoib_cm_stale_task(struct work_struct *work)
 		kfree(p);
 		spin_lock_irqsave(&priv->lock, flags);
 	}
+
+	if (!list_empty(&priv->cm.passive_ids))
+		queue_delayed_work(ipoib_workqueue,
+				   &priv->cm.stale_task, IPOIB_CM_RX_DELAY);
 	spin_unlock_irqrestore(&priv->lock, flags);
 }
 
-- 
MST


