[ewg] [PATCH] IPOIB/CM fixes for issues seen in OFED-1.3

Pradeep Satyanarayana pradeeps at linux.vnet.ibm.com
Mon Feb 11 13:20:13 PST 2008


The following patch incorporates fixes for several issues:
1. Failure to destroy the IPoIB rx QP (https://bugs.openfabrics.org/show_bug.cgi?id=906).
This fixes the usecnt issue and allows the QP to be destroyed.
2. Change retry_count and rnr_retry_count from 0 to 3. This helps
interoperability between ehca and mthca.
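3. While looking through the code, I found an error introduced by the split CQ
patch in ipoib_poll(): every send completion was passed to
ipoib_cm_handle_tx_wc(). This undoes the change, so that only completions
with IPOIB_OP_CM set in wr_id go to the CM handler and the rest go to
ipoib_ib_handle_tx_wc().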
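
For illustration only, the dispatch that item 3 restores can be sketched
outside the kernel as below. This is not driver code: the SKETCH_ names and
the bit positions are assumptions modeled on IPOIB_OP_RECV and IPOIB_OP_CM,
and the receive-side test on the RECV bit is assumed as well, since the hunk
shows only the CM check.

```c
#include <stdint.h>

/*
 * Assumed flag bits, modeled on the driver's IPOIB_OP_RECV and IPOIB_OP_CM.
 * The SKETCH_ names and the bit positions are illustrative only.
 */
#define SKETCH_OP_RECV (UINT64_C(1) << 31)
#define SKETCH_OP_CM   (UINT64_C(1) << 30)

/* One value per handler that ipoib_poll() can call for a completion. */
enum sketch_handler {
	SKETCH_CM_RX,	/* stands for ipoib_cm_handle_rx_wc() */
	SKETCH_IB_RX,	/* stands for ipoib_ib_handle_rx_wc() */
	SKETCH_CM_TX,	/* stands for ipoib_cm_handle_tx_wc() */
	SKETCH_IB_TX	/* stands for ipoib_ib_handle_tx_wc() */
};

/* Pick a handler from the flag bits carried in the work request ID. */
static enum sketch_handler sketch_dispatch(uint64_t wr_id)
{
	if (wr_id & SKETCH_OP_RECV)
		return (wr_id & SKETCH_OP_CM) ? SKETCH_CM_RX : SKETCH_IB_RX;

	/* Send completion: without the CM check, all of these reach the CM handler. */
	return (wr_id & SKETCH_OP_CM) ? SKETCH_CM_TX : SKETCH_IB_TX;
}
```

Under these assumptions, a wr_id with neither flag bit set maps to
SKETCH_IB_TX, which is the case that was being routed to the CM handler.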

Please include this in the OFED-1.3 rc5 build. The patch was tested against
today's build with ehca and mthca on ppc64 machines. I ran tests with network
traffic and with module loads and unloads, and saw no issues.


Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
---

--- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2008-02-11 14:28:47.000000000 -0500
+++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2008-02-11 15:05:48.000000000 -0500
@@ -881,11 +881,11 @@ void ipoib_cm_dev_stop(struct net_device
 			ipoib_warn(priv, "RX drain timing out\n");
 
 			/*
-			 * assume the HW is wedged and just free up everything.
+			 * assume errors and move to rx_reap list.
 			 */
-			list_splice_init(&priv->cm.rx_flush_list, &list);
-			list_splice_init(&priv->cm.rx_error_list, &list);
-			list_splice_init(&priv->cm.rx_drain_list, &list);
+			list_splice_init(&priv->cm.rx_flush_list, &priv->cm.rx_reap_list);
+			list_splice_init(&priv->cm.rx_error_list, &priv->cm.rx_reap_list);
+			list_splice_init(&priv->cm.rx_drain_list, &priv->cm.rx_reap_list);
 			break;
 		}
 		spin_unlock_irq(&priv->lock);
@@ -1016,8 +1016,8 @@ static int ipoib_cm_send_req(struct net_
 	req.responder_resources	      = 4;
 	req.remote_cm_response_timeout = 20;
 	req.local_cm_response_timeout  = 20;
-	req.retry_count 	      = 0; /* RFC draft warns against retries */
-	req.rnr_retry_count 	      = 0; /* RFC draft warns against retries */
+	req.retry_count 	      = 3;
+	req.rnr_retry_count 	      = 3;
 	req.max_cm_retries 	      = 15;
 	req.srq 	              = ipoib_cm_has_srq(dev);
 	return ib_send_cm_req(id, &req);
--- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c	2008-02-11 14:28:47.000000000 -0500
+++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c	2008-02-11 14:49:24.000000000 -0500
@@ -405,8 +405,12 @@ poll_more:
 					ipoib_cm_handle_rx_wc(dev, wc);
 				else
 					ipoib_ib_handle_rx_wc(dev, wc);
-			} else
-                                ipoib_cm_handle_tx_wc(priv->dev, wc);
+			} else {
+				if (wc->wr_id & IPOIB_OP_CM)
+					ipoib_cm_handle_tx_wc(priv->dev, wc);
+				else
+					ipoib_ib_handle_tx_wc(dev, wc);
+			}
 		}
 
 		if (n != t)



