[openib-general] [PATCH for-2.6.18] Re: [PATCH] IB/cma: add rdma_establish

Michael S. Tsirkin mst at mellanox.co.il
Wed Sep 13 12:30:16 PDT 2006


Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [PATCH for-2.6.18] Re: [PATCH] IB/cma: add rdma_establish
> 
> Michael S. Tsirkin wrote:
> > What I think we need for 2.6.18 is the following. Pls comment.
> > 
> > 
> > IB/cma: increase the retry count in CMA from 3 to maximum 15.
> > 3 seems low - we see connections failing under stress - and in any case looks
> > like an arbitrary number. 15 is the max value allowed by spec.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
> 
> Dropping 3 packets in a row seems likely only under stress testing, so I'm not 
> sure that this is worthy of a change to 2.6.18 at this point (we're at rc7). 

I don't really understand. The fix is a one-liner.
The problem is observed in practice, under stress.
Who *wants* systems that fall apart under stress?

It seems that with a retry count of 3, the chances of losing
one out of 3 packets would be close to 100% if the loss rate is about 10%.
Cranking it up to 15, you need a loss rate above 50% to get close to a 100%
chance of losing a connection request.
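
A rough way to put numbers on the estimate above is the sketch below. It
assumes each transmission is lost independently with the same probability,
and that a request fails only when every attempt is lost. The function
names and the request counts are illustrative, not taken from the CMA code,
and whether a retry count of 3 means 3 or 4 transmissions in total is left
to the caller.

```python
def p_all_lost(loss_rate: float, attempts: int) -> float:
    """Probability that all `attempts` independent transmissions are lost."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be in [0, 1]")
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    return loss_rate ** attempts


def p_any_failure(loss_rate: float, attempts: int, requests: int) -> float:
    """Probability that at least one of `requests` independent connection
    requests exhausts all of its attempts (assumed model, see lead-in)."""
    if requests < 0:
        raise ValueError("requests must be >= 0")
    return 1.0 - (1.0 - p_all_lost(loss_rate, attempts)) ** requests


if __name__ == "__main__":
    # Hypothetical stress scenarios: vary loss rate, attempts, and load.
    for loss_rate, attempts in ((0.10, 3), (0.10, 15), (0.50, 15)):
        for requests in (1, 1000, 10000):
            print(
                f"loss={loss_rate:.2f} attempts={attempts:2d} "
                f"requests={requests:5d} "
                f"P(some request fails)={p_any_failure(loss_rate, attempts, requests):.6f}"
            )
```

Under this model the per-request failure probability falls geometrically
with the number of attempts, so the benefit of a higher retry count shows
up mostly when many requests are in flight.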

Losing a DREP is also bad - as it leaves stale connections around
munching up resources.

So why aren't we fixing this?

-- 
MST



