[openib-general] [PATCH] IB/cma: add rdma_establish
Michael S. Tsirkin
mst at mellanox.co.il
Tue Sep 12 22:23:41 PDT 2006
Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [PATCH] IB/cma: add rdma_establish
>
> Michael S. Tsirkin wrote:
> >>>As a side note, reasons for frequent loss of RTU must be investigated.
> >>
> >>A lost RTU shouldn't be any more likely than a lost REQ or REP. Is the RTU
> >>never showing up?
> >
> >
> > Seems like that. I know for sure I do accept after REP but remote side never
> > gets ESTABLISHED.
>
> I looked at the code, then ran some tests. The REP is retried until an RTU is
> received, or its number of retries is exhausted. By modifying the IB CM, I was
> able to force RTU drops. Using madeye, I could see that the REP would be
> retried, resulting in the RTU being resent. After 4 drops, I had the code
> receive the RTU, which allowed the test to proceed.
>
> A couple things to look at in OFED would be the setting of max cm retries and
> the cm timeout.
>
> - Sean
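For reference, the retry behaviour described above can be modelled roughly as follows. This is only a toy sketch of the handshake logic, not the IB CM code: run_handshake, rtu_drops and max_retries are hypothetical names made up for illustration, and the model assumes every REP reaches the active side and is answered with exactly one RTU.

```python
def run_handshake(rtu_drops, max_retries):
    """Toy model: the passive side resends REP until an RTU arrives.

    rtu_drops:   number of RTUs lost before one gets through (hypothetical knob)
    max_retries: REP retransmissions allowed after the initial send
    Returns (established, reps_sent).
    """
    reps_sent = 0
    for attempt in range(max_retries + 1):  # initial REP plus retries
        reps_sent += 1                      # passive side (re)sends the REP
        # The active side answers each REP with an RTU; the first
        # rtu_drops of those RTUs are lost on the way back.
        if attempt >= rtu_drops:
            return True, reps_sent          # RTU received: ESTABLISHED
    return False, reps_sent                 # retries exhausted, no RTU seen
```

With four dropped RTUs, as in the test described above, the model connects on the fifth REP as long as at least four retries are allowed; with fewer retries the passive side never sees an RTU.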
OFED uses the CMA from the upstream kernel. If the default parameters there
are inappropriate, maybe we should fix them?
BTW, how about the idea of exporting max cm retries in a
transport-independent header?
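To get a feel for how the two parameters interact, here is a rough back-of-the-envelope sketch. It relies on the IB convention that a timeout is encoded as an exponent n meaning 4.096 usec * 2^n. The function names are hypothetical, and the worst-case estimate is a simplification that assumes each attempt waits one full timeout and ignores packet lifetime and service time.

```python
IB_TIMEOUT_BASE_SECONDS = 4.096e-6  # IB timeouts are 4.096 usec * 2^exponent


def cm_timeout_seconds(exponent):
    """Convert an encoded IB timeout exponent into seconds."""
    return IB_TIMEOUT_BASE_SECONDS * (2 ** exponent)


def worst_case_wait_seconds(exponent, max_retries):
    """Simplified upper bound: the initial send plus every retry times out."""
    return cm_timeout_seconds(exponent) * (max_retries + 1)
```

For example, an exponent of 20 with 15 retries (values picked purely for illustration) gives about 4.3 seconds per attempt and roughly 69 seconds before the connection attempt is abandoned.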
--
MST