[ofa-general] Re: [PATCH] IPOIB/CM Increase retry counts for OFED-1.3
Pradeep Satyanarayana
pradeeps at linux.vnet.ibm.com
Thu Feb 14 09:03:28 PST 2008
Tziporet Koren wrote:
> Pradeep Satyanarayana wrote:
>> This patch changes the retry counts from 0 to small nonzero values.
>> This helps interoperability between ehca and mthca. Without this patch
>> I had seen "send completion errors".
>>
>> Or Gerlitz has started a thread on the general mailing list and the
>> complete discussion will be available there. This is the second part
>> of the patch submitted yesterday and is split up as per Eli's request.
>>
>> Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
>> ---
>>
>> --- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-12 17:46:03.000000000 -0500
>> +++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-12 17:46:58.000000000 -0500
>> @@ -1016,8 +1016,8 @@ static int ipoib_cm_send_req(struct net_
>>          req.responder_resources = 4;
>>          req.remote_cm_response_timeout = 20;
>>          req.local_cm_response_timeout = 20;
>> -        req.retry_count = 0; /* RFC draft warns against retries */
>> -        req.rnr_retry_count = 0; /* RFC draft warns against retries */
>> +        req.retry_count = 3;
>> +        req.rnr_retry_count = 3;
>>          req.max_cm_retries = 15;
>>          req.srq = ipoib_cm_has_srq(dev);
>>          return ib_send_cm_req(id, &req);
>>
>>
>>
>
> I would like to see Roland's response to this patch.
> I think it's enough to increase only the rnr_retry_count, since the
> retry_count is for cases where packets are dropped by a very busy subnet.
> The case you want to solve should be covered by the rnr_retry_count.
>
The case that I saw was the other way round. The sender saw "send completion
errors" and that was solved by changing the retry_count. The rnr_retry_count
was added to cover any other corner cases I had not seen (Table 78 in the spec).
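
To make the distinction concrete, here is a minimal sketch in plain C. It
is not the ipoib or verbs code; model_send() and enum send_outcome are
hypothetical names made up for illustration. It assumes the two counters
are independent 3-bit budgets, that retry_count covers missing ACKs, that
rnr_retry_count covers RNR NAKs, and that an rnr_retry_count of 7 means
"retry indefinitely". The order of the two checks is a simplification.

```c
#include <assert.h>

/* Illustrative outcomes, loosely mirroring the work-completion statuses
 * IB_WC_RETRY_EXC_ERR and IB_WC_RNR_RETRY_EXC_ERR. */
enum send_outcome {
    SEND_OK,
    SEND_RETRY_EXCEEDED,     /* transport retry budget exhausted */
    SEND_RNR_RETRY_EXCEEDED  /* RNR NAK retry budget exhausted */
};

/*
 * Hypothetical model of one send: before it can succeed it runs into
 * `ack_timeouts` missing ACKs and `rnr_naks` "receiver not ready" NAKs.
 * Each kind of failure is charged against its own counter.
 */
enum send_outcome model_send(unsigned retry_count, unsigned rnr_retry_count,
                             unsigned ack_timeouts, unsigned rnr_naks)
{
    /* Both fields are 3 bits wide in the connection request. */
    assert(retry_count <= 7 && rnr_retry_count <= 7);

    if (ack_timeouts > retry_count)
        return SEND_RETRY_EXCEEDED;

    /* Assumption: 7 means retry indefinitely on RNR NAK. */
    if (rnr_retry_count != 7 && rnr_naks > rnr_retry_count)
        return SEND_RNR_RETRY_EXCEEDED;

    return SEND_OK;
}
```

In this model, model_send(0, 3, 1, 0) still fails with
SEND_RETRY_EXCEEDED: raising only rnr_retry_count does not help when the
failure is a missing ACK, which matches what I observed.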
Pradeep