[ofa-general] Re: [PATCH] IPOIB/CM Increase retry counts for OFED-1.3

Tziporet Koren tziporet at dev.mellanox.co.il
Thu Feb 14 06:57:21 PST 2008


Pradeep Satyanarayana wrote:
> This patch changes the retry counts to small values. This helps interoperability
> between ehca and mthca. Without this patch I saw "send completion errors".
>
> Or Gerlitz has started a thread on the general mailing list and the complete
> discussion will be available there. This is the second part of the patch
> submitted yesterday and is split up as per Eli's request.
>
> Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
> ---
>
> --- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2008-02-12 17:46:03.000000000 -0500
> +++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2008-02-12 17:46:58.000000000 -0500
> @@ -1016,8 +1016,8 @@ static int ipoib_cm_send_req(struct net_
>  	req.responder_resources	      = 4;
>  	req.remote_cm_response_timeout = 20;
>  	req.local_cm_response_timeout  = 20;
> -	req.retry_count 	      = 0; /* RFC draft warns against retries */
> -	req.rnr_retry_count 	      = 0; /* RFC draft warns against retries */
> +	req.retry_count 	      = 3;
> +	req.rnr_retry_count 	      = 3;
>  	req.max_cm_retries 	      = 15;
>  	req.srq 	              = ipoib_cm_has_srq(dev);
>  	return ib_send_cm_req(id, &req);
>
>
>   

I would like to see Roland's response to this patch.
I think it's enough to increase only the rnr_retry_count, since the
retry_count is for cases where packets are dropped by a very busy subnet.
The case you want to solve should be covered by the rnr_retry_count.
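
To make the suggestion concrete, here is a minimal sketch of what the
assignments would look like with only the RNR retry count raised. The
struct cm_req_sketch and the helper fill_req_rnr_only() are hypothetical
stand-ins for illustration, not the kernel's CM request structure; only
the fields visible in the quoted hunk are modeled, their types are
assumed, and the value 3 is simply the one used in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the CM request parameters. Only the fields
 * shown in the quoted hunk are modeled; the field types are assumptions. */
struct cm_req_sketch {
	uint8_t responder_resources;
	uint8_t remote_cm_response_timeout;
	uint8_t local_cm_response_timeout;
	uint8_t retry_count;
	uint8_t rnr_retry_count;
	uint8_t max_cm_retries;
};

/* Fill the request as suggested in the reply: leave the transport
 * retry count at 0 and raise only the RNR retry count. */
static void fill_req_rnr_only(struct cm_req_sketch *req)
{
	req->responder_resources        = 4;
	req->remote_cm_response_timeout = 20;
	req->local_cm_response_timeout  = 20;
	req->retry_count                = 0;  /* unchanged from the pre-patch code */
	req->rnr_retry_count            = 3;  /* value taken from the patch */
	req->max_cm_retries             = 15;
}
```

Compared with the patch as posted, the only difference is that
retry_count stays at its original value of 0.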

Regarding UC: we may consider this in the future, since ConnectX
supports UC with SRQ, and indeed it would be the best fit for IPoIB CM.

Tziporet


