[ofa-general] Re: [PATCH V3] IB/ipoib: refresh paths instead of fushing them on SM change event

Roland Dreier rdreier at cisco.com
Tue Jul 1 13:57:14 PDT 2008


thanks, applied.  a couple of notes:

 > In most cases, SM failover doesn't cause LID change so traffic won't stop. In
 > the rare cases of LID change, the remote host (the one that hadn't changed
 > its LID) will lose connectivity until paths are refreshed. This is no worse than
 > the current state. In fact, preventing the device from going down saves packets
 > that otherwise would be lost.

is it really true that this is *always* no worse?  there is the case
where LIDs *did* change on an SM change event, and this patch waits for
an ARP probe before discovering the LID change, while the old code
immediately would look up the path again.  But that seems like a less
common case, and I do believe that this patch makes things better in the
more common cases.

I also changed the path_rec_completion() handling of old ahs, based on
your code.  I think the below is a little simpler, let me know if it
still looks OK to you:

@@ -393,6 +410,7 @@ static void path_rec_completion(int status,
 	struct net_device *dev = path->dev;
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ipoib_ah *ah = NULL;
+	struct ipoib_ah *old_ah;
 	struct ipoib_neigh *neigh, *tn;
 	struct sk_buff_head skqueue;
 	struct sk_buff *skb;
@@ -416,6 +434,7 @@ static void path_rec_completion(int status,
 
 	spin_lock_irqsave(&priv->lock, flags);
 
+	old_ah   = path->ah;
 	path->ah = ah;
 
 	if (ah) {
@@ -428,6 +447,17 @@ static void path_rec_completion(int status,
 			__skb_queue_tail(&skqueue, skb);
 
 		list_for_each_entry_safe(neigh, tn, &path->neigh_list, list) {
+			if (neigh->ah) {
+				WARN_ON(neigh->ah != old_ah);
+				/*
+				 * Dropping the ah reference inside
+				 * priv->lock is safe here, because we
+				 * will hold one more reference from
+				 * the original value of path->ah (ie
+				 * old_ah).
+				 */
+				ipoib_put_ah(neigh->ah);
+			}
 			kref_get(&path->ah->ref);
 			neigh->ah = path->ah;
 			memcpy(&neigh->dgid.raw, &path->pathrec.dgid.raw,
@@ -450,6 +480,7 @@ static void path_rec_completion(int status,
 			while ((skb = __skb_dequeue(&neigh->queue)))
 				__skb_queue_tail(&skqueue, skb);
 		}
+		path->valid = 1;
 	}
 
 	path->query = NULL;
@@ -457,6 +488,9 @@ static void path_rec_completion(int status,
 
 	spin_unlock_irqrestore(&priv->lock, flags);
 
+	if (old_ah)
+		ipoib_put_ah(old_ah);
+
 	while ((skb = __skb_dequeue(&skqueue))) {
 		skb->dev = dev;
 		if (dev_queue_xmit(skb))



More information about the general mailing list