[ofa-general] Bug in send gather mode if we switch to CM?

Roland Dreier rdreier at cisco.com
Tue Mar 11 18:37:49 PDT 2008


Hi Eli,

It seems that the current 2.6.25-rc IPoIB code has a bug introduced
with the send gather support, namely that the post_send() in
ipoib_cm.c assumes that num_sge is 1, but it might not be if gather
support is ever used.  Since we don't have the checksum offload stuff
yet, this is impossible to trigger, but it is a trap waiting to hit us
when we merge the checksum offload patches.

Does the patch below look correct to you?  I have tentatively queued
it up for 2.6.25.

 - R.

commit 4200406b8fbbf309f4fffb339bd16c4553ae0c30
Author: Roland Dreier <rolandd at cisco.com>
Date:   Tue Mar 11 18:35:20 2008 -0700

    IPoIB/cm: Set tx_wr.num_sge in connected mode post_send()
    
    Commit 7143740d ("IPoIB: Add send gather support") made it possible
    for tx_wr.num_sge to be != 1 -- this happens if send gather support is
    enabled.  However, the code in the connected mode post_send() function
    assumes the old invariant, namely that tx_wr.num_sge is always 1.  Fix
    this by explicitly setting tx_wr.num_sge to 1 in the CM post_send().
    
    Signed-off-by: Roland Dreier <rolandd at cisco.com>

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 52b1beb..4e8d028 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -637,6 +637,7 @@ static inline int post_send(struct ipoib_dev_priv *priv,
 	priv->tx_sge[0].addr          = addr;
 	priv->tx_sge[0].length        = len;
 
+	priv->tx_wr.num_sge	= 1;
 	priv->tx_wr.wr_id	= wr_id | IPOIB_OP_CM;
 
 	return ib_post_send(tx->qp, &priv->tx_wr, &bad_wr);



More information about the general mailing list