[ofa-general] IPOIB/CM increase retry counts
Or Gerlitz
ogerlitz at voltaire.com
Tue Feb 12 00:08:08 PST 2008
Pradeep Satyanarayana wrote:
> I have seen sporadic errors while running the HCAs in connected mode.
> These errors appear to be related to the speeds of the different HCAs.
> Increasing the retry counts solves the problem.
Hi Pradeep,
I see that tonight you sent this patch (posted to the mailing list in
Dec 2007 and never discussed) for inclusion in OFED 1.3.
I think more details on the problem are needed here; from the three
lines above it seems to be more of a workaround than a solution. What is
the problem here?
> I looked at the RFC with regard to its warnings about retries. The warning
> is there to make sure that the IB timeouts do not interfere with TCP timeouts.
> The TCP timeouts are so much larger than the IB timeouts (even with
> non-zero values) that we are nowhere close to interfering with TCP
> timeouts.
IP provides an "unreliable datagram service" to upper layers, hence I
don't really see a point in implementing it over a reliable HW transport.
This was discussed on the list, and suggestions on how to move to IPoIB/CM
over a UC transport were made, but there is no implementation yet.
Having said all that, I don't think we want to have --any RNR retries--.
As for retries, I am open to hearing what others think.
Or.
>
> Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
> ---
>
> --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2007-12-21 16:06:49.000000000 -0500
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2007-12-21 16:07:28.000000000 -0500
> @@ -990,8 +990,8 @@ static int ipoib_cm_send_req(struct net_
>  	req.responder_resources = 4;
>  	req.remote_cm_response_timeout = 20;
>  	req.local_cm_response_timeout = 20;
> -	req.retry_count = 0; /* RFC draft warns against retries */
> -	req.rnr_retry_count = 0; /* RFC draft warns against retries */
> +	req.retry_count = 3;
> +	req.rnr_retry_count = 3;
>  	req.max_cm_retries = 15;
>  	req.srq = ipoib_cm_has_srq(dev);
>  	return ib_send_cm_req(id, &req);
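
To put the timeout comparison from the quoted description into numbers, here
is a rough sketch of the arithmetic, not code from ipoib_cm.c. The helper
names are made up for this example and are not part of the kernel IB API.
The only fact relied on is the IB Local ACK Timeout encoding (4.096 us * 2^n,
with n = 0 meaning no timeout). The result is a nominal figure; an HCA may
wait longer in practice. RNR retries are paced by the responder's RNR NAK
timer and are not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Base unit of the IB Local ACK Timeout: 4.096 us, expressed in ns. */
#define IB_ACK_TIMEOUT_BASE_NS 4096ULL

/*
 * Nominal ACK timeout for a 5-bit Local ACK Timeout exponent:
 * 4.096 us * 2^exponent. An exponent of 0 means "no timeout"
 * (wait forever); this sketch reports that case as 0.
 * Hypothetical helper, for illustration only.
 */
static inline uint64_t ib_ack_timeout_ns(unsigned int exponent)
{
	assert(exponent <= 31);		/* 5-bit field */
	if (exponent == 0)
		return 0;
	return IB_ACK_TIMEOUT_BASE_NS << exponent;
}

/*
 * Nominal time before the requester gives up on one packet:
 * the first attempt plus retry_count retransmissions, each
 * waiting one ACK timeout. Hypothetical helper, for illustration only.
 */
static inline uint64_t ib_retry_budget_ns(unsigned int exponent,
					  unsigned int retry_count)
{
	assert(retry_count <= 7);	/* 3-bit field */
	return (uint64_t)(retry_count + 1) * ib_ack_timeout_ns(exponent);
}
```

With an exponent of 14 (picked only as an example), ib_retry_budget_ns(14, 0)
is 67108864 ns (about 67 ms) and ib_retry_budget_ns(14, 3) is 268435456 ns
(about 268 ms). Plugging in the exponent actually negotiated on a given
fabric would show how close retry_count = 3 gets to the TCP timers in use.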