[ewg] Re: Slow failover of IPoIB ipoibtools/bonding (bug 541)

Michael S. Tsirkin mst at dev.mellanox.co.il
Sat Apr 21 23:16:10 PDT 2007


> 10-second port failover test has been running with IPoIB UD ipoibtools
> HA for over 8 hours, and there have been very few slow failovers:
> 
> $ grep seconds screenlog.7 | wc -l
> 29705
> 
> $ grep seconds screenlog.7 | fgrep -v "over 1." | fgrep -v "over 2."
> Interim result:   45.29 10^6bits/s over 53.21 seconds
> Interim result:  299.37 10^6bits/s over 7.34 seconds
> Interim result:  406.76 10^6bits/s over 5.84 seconds
> Interim result:  614.00 10^6bits/s over 3.91 seconds
> Interim result:  579.55 10^6bits/s over 4.06 seconds
> Interim result:  239.60 10^6bits/s over 10.19 seconds
> 
> Scott 

So it seems we are timing out the connection instead of getting RARP
from the remote and tearing it down ourselves.
I wonder whether the following (untested) patch improves things for you.
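
To make the intent of the patch easier to see: it moves the "has the remote
GID changed?" check ahead of the connected-mode branch, so a stale neighbour
is dropped and looked up again in both the CM and UD cases. Below is a rough
user-space sketch of that ordering, not the driver code. The names
neigh_cache, xmit_action, gid_is_stale() and choose_action() are made up for
illustration, and the CM state is collapsed into a single flag. It relies
only on the IPoIB hardware address layout (4 bytes of flags and QPN followed
by the 16-byte GID), which is why the patch compares at ha + 4.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define IPOIB_HW_ADDR_LEN 20	/* 4 bytes flags + QPN, then 16-byte GID */
#define IPOIB_GID_OFFSET   4
#define IB_GID_LEN        16

/* Hypothetical stand-in for the state cached per neighbour. */
struct neigh_cache {
	uint8_t dgid[IB_GID_LEN];	/* GID the cached address handle points at */
	bool    has_ah;			/* models neigh->ah != NULL */
	bool    connected;		/* models a usable CM connection (simplified) */
};

enum xmit_action {
	XMIT_RELOOKUP,	/* drop the cached neighbour, start a new path lookup */
	XMIT_CM,	/* send over the connected-mode connection */
	XMIT_UD,	/* send as a datagram using the cached address handle */
	XMIT_QUEUE	/* nothing usable yet */
};

/* True when the GID in the current hardware address differs from the cache. */
bool gid_is_stale(const struct neigh_cache *n, const uint8_t *ha)
{
	return memcmp(n->dgid, ha + IPOIB_GID_OFFSET, IB_GID_LEN) != 0;
}

/* Ordering after the patch: the staleness check runs before the CM branch. */
enum xmit_action choose_action(const struct neigh_cache *n, const uint8_t *ha)
{
	if (gid_is_stale(n, ha) && n->has_ah)
		return XMIT_RELOOKUP;
	if (n->connected)
		return XMIT_CM;
	if (n->has_ah)
		return XMIT_UD;
	return XMIT_QUEUE;
}
```

Before the patch, the equivalent of the first test was only reached in the UD
branch, so a connected neighbour kept using the old connection until it timed
out.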

-- 
MST
-------------- next part --------------
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index f2a40ae..a6d0594 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -682,6 +682,25 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		}
 
 		neigh = *to_ipoib_neigh(skb->dst->neighbour);
+		if (unlikely(memcmp(&neigh->dgid.raw,
+				    skb->dst->neighbour->ha + 4,
+				    sizeof(union ib_gid))) &&
+		    likely(neigh->ah)) {
+			spin_lock(&priv->lock);
+			/*
+			 * It's safe to call ipoib_put_ah() inside
+			 * priv->lock here, because we know that
+			 * path->ah will always hold one more reference,
+			 * so ipoib_put_ah() will never do more than
+			 * decrement the ref count.
+			 */
+			ipoib_put_ah(neigh->ah);
+			list_del(&neigh->list);
+			ipoib_neigh_free(dev, neigh);
+			spin_unlock(&priv->lock);
+			ipoib_path_lookup(skb, dev);
+			goto out;
+		}
 
 		if (ipoib_cm_get(neigh)) {
 			if (ipoib_cm_up(neigh)) {
@@ -689,25 +708,6 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				goto out;
 			}
 		} else if (neigh->ah) {
-			if (unlikely(memcmp(&neigh->dgid.raw,
-					    skb->dst->neighbour->ha + 4,
-					    sizeof(union ib_gid)))) {
-				spin_lock(&priv->lock);
-				/*
-				 * It's safe to call ipoib_put_ah() inside
-				 * priv->lock here, because we know that
-				 * path->ah will always hold one more reference,
-				 * so ipoib_put_ah() will never do more than
-				 * decrement the ref count.
-				 */
-				ipoib_put_ah(neigh->ah);
-				list_del(&neigh->list);
-				ipoib_neigh_free(dev, neigh);
-				spin_unlock(&priv->lock);
-				ipoib_path_lookup(skb, dev);
-				goto out;
-			}
-
 			ipoib_send(dev, skb, neigh->ah, IPOIB_QPN(skb->dst->neighbour->ha));
 			goto out;
 		}
