[Openib-windows] RE: Connection rate of WSD

Tue Jun 6 06:39:51 PDT 2006

Hi Fab,

Here are the results of my testing of the fix that you have suggested
(using RNR):
[For background, here are some baseline results:
	100 Mb Ethernet: 1670 CPS (connections per second)
	IPoIB: 1950 CPS
]
1) The patch that you sent for MTHCA works well and will be applied to
OpenIB today by Leonid.
2) The patch to the mt23108: the patch seems good, but we don't have the
resources to test it. Feel free to do the necessary tests and check it
in.
3) The maximum connection rate that I was able to achieve was 50 CPS (in
all cases).
4) Using the fix that I suggested, the connection rate is down to 33
CPS.
5) To my understanding, setting the retry count to 7 is not good enough:
the remote side might never post the buffers, and the sender would then
never time out, so the retry count should be set to 6.
6) Here are some results that I got from different timeout values:
RNR Time Constant		CPS
0x6				50
0x8				50
0x10				45
0x12				39
0x14				32
0x18				20
From my experience, setting the value to 6 means that we get RNR NACK
exhausted errors even in an unstable system.
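
For reference, here is a minimal sketch that converts the values in the
table above to delays in milliseconds. It assumes that "RNR Time Constant"
is the 5-bit minimum RNR NAK timer encoding from the InfiniBand
specification; that is an assumption, and the delay table should be
checked against the specification before relying on it. Under this
assumption 0x18 is 40.96 ms, which agrees with the "40ms RNR retry"
mentioned in the quoted message below. The names used here are
illustrative only.

```python
# Assumed mapping from the 5-bit min RNR NAK timer encoding to a delay
# in milliseconds (verify against the InfiniBand specification).
RNR_NAK_TIMER_MS = [
    655.36, 0.01, 0.02, 0.03, 0.04, 0.06, 0.08, 0.12,
    0.16, 0.24, 0.32, 0.48, 0.64, 0.96, 1.28, 1.92,
    2.56, 3.84, 5.12, 7.68, 10.24, 15.36, 20.48, 30.72,
    40.96, 61.44, 81.92, 122.88, 163.84, 245.76, 327.68, 491.52,
]

# Measured connection rates from the table above (encoded value -> CPS).
MEASURED_CPS = {0x6: 50, 0x8: 50, 0x10: 45, 0x12: 39, 0x14: 32, 0x18: 20}


def rnr_delay_ms(encoded):
    """Return the assumed RNR NAK delay in ms for a 5-bit encoded value."""
    if not 0 <= encoded <= 31:
        raise ValueError("encoded RNR timer must be in the range 0..31")
    return RNR_NAK_TIMER_MS[encoded]


def table_rows():
    """Return (encoded value, delay in ms, measured CPS), sorted by value."""
    return [(v, rnr_delay_ms(v), cps) for v, cps in sorted(MEASURED_CPS.items())]


if __name__ == "__main__":
    for value, delay, cps in table_rows():
        print(f"0x{value:02x}  {delay:7.2f} ms  {cps} CPS")
```

Read this way, the connection rate only starts to drop once the assumed
delay reaches the millisecond range (0x10 and above).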

As a result, I suggest setting the value somewhere between 0x8 and 0x10.
I wasn't able to create a test that loads the system so much that 0x8
wasn't good enough but 0x10 was.
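
To make the trade-off concrete, here is a rough sketch of the shortest
time a sender waits before it gives up with an RNR retry exhausted error.
It assumes that each retry waits at least the NAK timer delay, that a
retry count of 7 means "retry forever", and that 0x8 and 0x10 correspond
to 0.16 ms and 2.56 ms; all three are assumptions to be checked against
the InfiniBand specification, and the function name is illustrative only.

```python
RNR_RETRY_INFINITE = 7  # assumed: a retry count of 7 means "retry forever"

# Assumed delays in ms for the two candidate timer values.
CANDIDATE_DELAY_MS = {0x8: 0.16, 0x10: 2.56}


def min_wait_before_exhausted_ms(timer_ms, rnr_retry):
    """Lower bound on how long the sender keeps retrying, in ms.

    Each retry is assumed to wait at least timer_ms, so the bound is
    rnr_retry * timer_ms; a retry count of 7 never gives up.
    """
    if not 0 <= rnr_retry <= 7:
        raise ValueError("rnr_retry must be in the range 0..7")
    if rnr_retry == RNR_RETRY_INFINITE:
        return float("inf")
    return rnr_retry * timer_ms


if __name__ == "__main__":
    for value, delay in sorted(CANDIDATE_DELAY_MS.items()):
        bound = min_wait_before_exhausted_ms(delay, 6)
        print(f"0x{value:02x}: at least {bound:.2f} ms before giving up")
```

With a retry count of 6 this gives roughly 1 ms for 0x8 and roughly 15 ms
for 0x10, so the larger value leaves the receiver much more time to post
its buffers.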

So please pick a value and check it in.

To my understanding, the fact that our connection rate is so low means
that we must continue to analyze the problem, with a goal of 2000 CPS.
Once we are near this target, we will come back to this.

Thanks
Tzachi

> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com] 
> On Behalf Of Fabian Tillier
> Sent: Monday, June 05, 2006 10:28 PM
> To: Tzachi Dar
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] RE: Connection rate of WSD
> 
> Hi Tzachi,
> 
> On 6/5/06, Tzachi Dar <tzachid at mellanox.co.il> wrote:
> > Hi Fab,
> > 1) Please see below my answers. I still don't see how playing with 
> > the rnr timeout will solve the problem.
> 
> RNR handles the case where a send is sent before a matching 
> receive is posted.  This is exactly the situation we're 
> trying to handle.
> 
> Is there some problem with RNR handling in the HCAs?  The RNR 
> situation should not be that common, happening only during 
> connection establishment on a very busy system.  The 
> inefficiency of RNR on the wire is worth the savings in extra 
> complexity in the code.
> 
> I've asked about RNR handling in the HCA several times now 
> without getting any answer - is RNR broken in the current HCA 
> implementations?
> 
> > 2) The way we see it there are two possible answers. A - play with the
> > cm. This will slow connection establishment, but gives some more
> > freedom (the CM is in software). Please also note that as for timeouts
> > the CM has another message MRA (more processing required) which gives
> > us exactly the freedom to do what we want. We answer that we received
> > the request and are still thinking what to do. So this is a timeout free
> > solution.
> > As for the other solution: posting the first receive. This has the
> > advantage that we follow the WSD spec. As for the latency introduced:
> > I believe that we can add another variable that will tell if the first
> > buffer was already received correctly. On the buffer complete side the
> > first action will be to check if the first receive was already handled.
> > Only if not, it will take the lock and do the complex thing. As a
> > result I believe that the latency introduced on most of the buffers
> > will only be an if statement latency, which is quite small.
> >
> > What do you think?
> 
> I would rather not add code if we can avoid it.  RNR should 
> work for this, unless RNR is broken in the HCAs.
> 
> Maybe I don't understand - you say that a 40ms RNR retry is 
> too long, yet you follow with saying that it may be seconds 
> before the switch posts its receive.  If it's seconds, the 
> RNR retry should be just fine.
> 
> Anyhow, as you pointed out, sending an MRA doesn't help the 
> connection rate, so it's not really any better than using 
> RNR.  I'm still considering the buffering thing, but need to 
> find a solution that will be streamlined and clean.  I'm 
> wary of making significant changes at this point, since the 
> RNR solution is functional.
> 
> Did you test with the updated code to see if the connection 
> delay is reduced?  What was the outcome?
> 
> Thanks,
> 
> - Fab
>