[ewg] Re: [PATCH] IPOIB/CM fixes for issues seen in OFED-1.3

Eli Cohen eli at dev.mellanox.co.il
Mon Feb 11 23:47:58 PST 2008


Pradeep,

Could you resend this as separate patches, one for each issue fixed?

Thanks.

Pradeep Satyanarayana wrote:
> The following patch incorporates fixes for several issues:
> 1. Failure to destroy the ipoib rx QP (https://bugs.openfabrics.org/show_bug.cgi?id=906)
> This fixes the usecnt issue and allows the qp to be destroyed.
> 2. Change retry counts to small values. This helps interoperability
> between ehca and mthca.
> 3. While looking through the code, I found an error introduced by the split cq
> patch in ipoib_poll(). This undoes that change.
> 
> Please include this in the OFED-1.3 rc5 build. This patch was tested on today's build
> on ehca and mthca on ppc64 machines. I ran tests with network traffic
> as well as module loads and unloads, and saw no issues.
> 
> 
> Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
> ---
> 
> --- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2008-02-11 14:28:47.000000000 -0500
> +++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2008-02-11 15:05:48.000000000 -0500
> @@ -881,11 +881,11 @@ void ipoib_cm_dev_stop(struct net_device
>  			ipoib_warn(priv, "RX drain timing out\n");
>  
>  			/*
> -			 * assume the HW is wedged and just free up everything.
> +			 * assume errors and move to rx_reap list.
>  			 */
> -			list_splice_init(&priv->cm.rx_flush_list, &list);
> -			list_splice_init(&priv->cm.rx_error_list, &list);
> -			list_splice_init(&priv->cm.rx_drain_list, &list);
> +			list_splice_init(&priv->cm.rx_flush_list, &priv->cm.rx_reap_list);
> +			list_splice_init(&priv->cm.rx_error_list, &priv->cm.rx_reap_list);
> +			list_splice_init(&priv->cm.rx_drain_list, &priv->cm.rx_reap_list);
>  			break;
>  		}
>  		spin_unlock_irq(&priv->lock);
> @@ -1016,8 +1016,8 @@ static int ipoib_cm_send_req(struct net_
>  	req.responder_resources	      = 4;
>  	req.remote_cm_response_timeout = 20;
>  	req.local_cm_response_timeout  = 20;
> -	req.retry_count 	      = 0; /* RFC draft warns against retries */
> -	req.rnr_retry_count 	      = 0; /* RFC draft warns against retries */
> +	req.retry_count 	      = 3;
> +	req.rnr_retry_count 	      = 3;
>  	req.max_cm_retries 	      = 15;
>  	req.srq 	              = ipoib_cm_has_srq(dev);
>  	return ib_send_cm_req(id, &req);
> --- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c	2008-02-11 14:28:47.000000000 -0500
> +++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c	2008-02-11 14:49:24.000000000 -0500
> @@ -405,8 +405,12 @@ poll_more:
>  					ipoib_cm_handle_rx_wc(dev, wc);
>  				else
>  					ipoib_ib_handle_rx_wc(dev, wc);
> -			} else
> -                                ipoib_cm_handle_tx_wc(priv->dev, wc);
> +			} else {
> +				if (wc->wr_id & IPOIB_OP_CM)
> +                                	ipoib_cm_handle_tx_wc(priv->dev, wc);
> +				else
> +					ipoib_ib_handle_tx_wc(dev, wc);
> +			}
>  		}
>  
>  		if (n != t)
> 
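
For reference, the dispatch that the ipoib_ib.c hunk restores can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the driver code: the SKETCH_* names, the enum, and the bit positions are hypothetical stand-ins, and the outer receive test is assumed because the hunk does not show it. The point is that a completion without the receive bit still has to be split between the CM and non-CM send handlers.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's wr_id flag bits (values assumed). */
#define SKETCH_OP_RECV (UINT64_C(1) << 31)
#define SKETCH_OP_CM   (UINT64_C(1) << 30)

/* Which handler a work completion would be routed to. */
enum sketch_handler {
	SKETCH_CM_RX,	/* stands in for ipoib_cm_handle_rx_wc() */
	SKETCH_IB_RX,	/* stands in for ipoib_ib_handle_rx_wc() */
	SKETCH_CM_TX,	/* stands in for ipoib_cm_handle_tx_wc() */
	SKETCH_IB_TX	/* stands in for ipoib_ib_handle_tx_wc() */
};

/*
 * Route a completion by the flag bits in its wr_id. Before the fix,
 * every non-receive completion went to the CM send handler; here the
 * CM bit is checked on the send side as well.
 */
static enum sketch_handler sketch_dispatch(uint64_t wr_id)
{
	if (wr_id & SKETCH_OP_RECV)
		return (wr_id & SKETCH_OP_CM) ? SKETCH_CM_RX : SKETCH_IB_RX;

	return (wr_id & SKETCH_OP_CM) ? SKETCH_CM_TX : SKETCH_IB_TX;
}
```

With these assumed bits, a wr_id with neither flag set is routed to the non-CM send handler, which is the case the pre-patch code sent to the CM handler.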



