[Openib-windows] RNR NACK issues in WSD

Thu Jul 6 12:21:01 PDT 2006

Hi Fab,

Generally speaking, setting the RNR NACK retry count to infinite doesn't
seem to be the right solution. The main reason is probably that a buggy
application or a malicious user could cause our connect message to wait
forever.

Still, for the short term this is the best that we can do, so please
check this in. We will return to this once there is more time.

Thanks
Tzachi

> -----Original Message-----
> From: openib-windows-bounces at openib.org 
> [mailto:openib-windows-bounces at openib.org] On Behalf Of Fab Tillier
> Sent: Thursday, July 06, 2006 4:54 PM
> To: openib-windows at openib.org
> Subject: [Openib-windows] RNR NACK issues in WSD
> 
> Hi Folks,
> 
> I've been thinking more about the RNR NACK handling in WSD.  
> A quick background on the problem - the WSD switch is 
> responsible for posting receives, but there is no mechanism 
> for it to pre-post receives.  This means that the server side 
> must accept connections before posting receives.
> Because the client side sends the hello message, there is a 
> race between the hello message being sent and receives being posted.
> 
> There are two ways of handling this, one is for the WSD 
> provider to allocate, register, and pre-post the first 
> receive, and then buffer it to handle the race condition.  
> The other way is to use IB's RNR NACK mechanisms to retry the 
> hello message until the receives are posted by the WSD switch.
> 
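
To make the first option concrete, here is a minimal sketch of the
buffering it would need. It is an illustration only, not the WSD provider's
code: the class name, method names, and callback shape are all hypothetical,
and a lock stands in for whatever synchronization the provider would
really use.

```python
import threading


class FirstReceiveBuffer:
    """Hypothetical helper: holds the hello message if it arrives before
    the switch has posted its first receive, and hands it over afterwards."""

    def __init__(self):
        self._lock = threading.Lock()
        self._buffered = None     # hello payload that arrived early
        self._switch_recv = None  # delivery callback for the switch's receive

    def on_prepost_completion(self, payload):
        """Called when the provider's pre-posted receive completes."""
        with self._lock:
            deliver = self._switch_recv
            if deliver is None:
                # Switch has not posted yet: keep the message for later.
                self._buffered = payload
                return
            self._switch_recv = None
        deliver(payload)  # invoke outside the lock

    def on_switch_post_recv(self, deliver):
        """Called when the switch posts its first receive."""
        with self._lock:
            payload = self._buffered
            if payload is None:
                # Hello has not arrived yet: remember where to deliver it.
                self._switch_recv = deliver
                return
            self._buffered = None
        deliver(payload)  # invoke outside the lock
```

Whichever side arrives second performs the delivery, so the message is
handed over exactly once in either order. The sketch assumes a payload is
never None and covers only the first receive.
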
> The first solution is fairly complex compared to the second, 
> as it requires proper synchronization to handle races between 
> completions of the pre-posted receive and the switch posting 
> its first receive.  The RNR NACK solution has some drawbacks 
> too - the retry timeout affects connection rate, and the 
> retry limit affects behavior in corner cases where the WSD 
> switch thread responsible for posting receives is hung.
> 
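
The two RNR drawbacks above can be seen in a toy timing model. This is a
sketch under simplifying assumptions, not a description of the hardware: it
ignores wire latency, treats the RNR timer as a plain millisecond value
rather than the encoded field, and the function name is made up for
illustration. It does rely on the IB convention that an RNR retry count of
7 means "retry indefinitely".

```python
RNR_RETRY_INFINITE = 7  # IB encoding: a retry count of 7 means retry forever


def hello_delivery_time(recv_posted_at_ms, rnr_timer_ms, rnr_retry):
    """Return the time (ms) at which the hello message is accepted.

    The first attempt goes out at t=0. Every RNR NACK makes the sender wait
    rnr_timer_ms before retrying. Returns None if the retry limit is hit
    first, and float("inf") if the receive is never posted (None) while
    retries are infinite.
    """
    infinite = rnr_retry == RNR_RETRY_INFINITE
    if recv_posted_at_ms is None:
        return float("inf") if infinite else None

    # Number of retries needed until an attempt lands at or after the post
    # time (integer ceiling division).
    retries_needed = -(-recv_posted_at_ms // rnr_timer_ms)
    if not infinite and retries_needed > rnr_retry:
        return None  # connection fails: retries exhausted
    return retries_needed * rnr_timer_ms


# Receive posted 25 ms after the hello was first sent:
print(hello_delivery_time(25, 10, RNR_RETRY_INFINITE))   # 30
print(hello_delivery_time(25, 10, 2))                    # None
print(hello_delivery_time(25, 100, RNR_RETRY_INFINITE))  # 100
```

In this model a longer timer delays the connection even when the receive
is posted early, a small retry limit drops the connection if the switch
is slow, and infinite retries never complete if the receive is never
posted.
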
> While I think that buffering the first receive would be 
> ideal, it is complicated and I would like to avoid it in the 
> short term - at least until we get through WHQL because of 
> the scope of the change.  I also would like to avoid 
> increasing the RNR timeout as that has an undesirable effect 
> on connection rate.
> 
> The only remaining workaround is to set the RNR retry to 
> infinite.  I think this is low risk, and here's why.  The 
> risk with infinite RNR retries is that a hang of the thread 
> that posts the receives could leave the connection in an 
> infinite retry scenario.  However, this can only happen in two cases.
> First, if the internal switch thread is hung - not if any of 
> the application threads are hung.  Second, if the application 
> is stopped on a breakpoint (all threads are suspended).
> 
> There is no issue if the application crashes or the 
> application's own threads are hung.  If the application is 
> stopped on a breakpoint, having infinite RNR retries is 
> actually the right thing to do, as the user is actively 
> debugging.  I believe the chance of the internal WSD switch's 
> thread being hung is so unlikely that using infinite RNR 
> retries is an acceptable short term solution that 
> accomplishes two things:
> 1. it maintains a higher connection rate
> 2. it allows WSD to ride through very high CPU loads and 
>    connection demands.
> 
> I'd like to get feedback on this, keeping in mind that the 
> goal at this point is to maximize connection rates and 
> connectivity in busy systems, and to get through WHQL tests.
> 
> Thanks,
> 
> - Fab