[Openib-windows] RNR NACK issues in WSD

Fab Tillier ftillier at silverstorm.com
Thu Jul 6 06:54:25 PDT 2006


Hi Folks,

I've been thinking more about the RNR NACK handling in WSD.  A quick
background on the problem - the WSD switch is responsible for posting
receives, but there is no mechanism for it to pre-post receives.  This means
that the server side must accept connections before posting receives.
Because the client side sends the hello message, there is a race between the
hello message being sent and receives being posted.

There are two ways of handling this.  One is for the WSD provider to
allocate, register, and pre-post the first receive, and then buffer it to
handle the race condition.  The other is to use IB's RNR NACK mechanism
to retry the hello message until the receives are posted by the WSD switch.

The first solution is fairly complex compared to the second, as it requires
proper synchronization to handle races between completions of the pre-posted
receive and the switch posting its first receive.  The RNR NACK solution has
some drawbacks too - the retry timeout affects connection rate, and the
retry limit affects behavior in corner cases where the WSD switch thread
responsible for posting receives is hung.
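
To make the RNR NACK trade-off concrete, here is a rough model of how the
retry timeout and retry limit interact.  It is only a sketch: the function
name and parameters are made up for illustration, the wait between retries is
assumed to be constant, and wire and processing time are ignored.  The one
hard fact it relies on is that IB encodes the RNR retry count in 3 bits, with
the value 7 meaning "retry forever".

```c
#include <stdint.h>

/* IB encodes the RNR retry count in 3 bits; 7 means "retry forever". */
#define RNR_RETRY_INFINITE 7u

/*
 * Hypothetical model, not provider code.
 *
 * rnr_timer_us  : wait between an RNR NACK and the next retry (assumed constant)
 * rnr_retry     : retry limit, 0..6, or RNR_RETRY_INFINITE
 * post_delay_us : time after the first send at which the switch posts the receive
 *
 * Returns the delay in microseconds until the hello message is delivered,
 * or -1 if the retry limit runs out first (the connection would fail).
 */
int64_t hello_delivery_delay_us(uint32_t rnr_timer_us,
                                uint32_t rnr_retry,
                                uint64_t post_delay_us)
{
    uint64_t retries_needed;

    if (post_delay_us == 0)
        return 0;               /* receive already posted, no NACK at all */
    if (rnr_timer_us == 0)
        return -1;              /* the model needs a nonzero retry wait */

    /* The first retry that lands at or after the receive is posted succeeds. */
    retries_needed = (post_delay_us + rnr_timer_us - 1) / rnr_timer_us;

    if (rnr_retry != RNR_RETRY_INFINITE && retries_needed > rnr_retry)
        return -1;              /* retry limit exhausted before the post */

    return (int64_t)(retries_needed * rnr_timer_us);
}
```

In this model a longer timer rounds every connection up to a larger multiple
of the timer, which is the connection rate cost, while a small finite retry
limit fails any connection whose receive is posted later than
rnr_retry * rnr_timer_us.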

While I think that buffering the first receive would be ideal, it is
complicated and I would like to avoid it in the short term - at least until
we get through WHQL because of the scope of the change.  I also would like
to avoid increasing the RNR timeout as that has an undesirable effect on
connection rate.

The only remaining workaround is to set the RNR retry to infinite.  I think
this is low risk, and here's why.  The risk with infinite RNR retries is
that a hang of the thread that posts the receives could leave the connection
in an infinite retry scenario.  However, this can only happen in two cases.
First, if the internal switch thread is hung - not if any of the application
threads are hung.  Second, if the application is stopped on a breakpoint
(all threads are suspended).
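
For reference, the change itself is small.  The sketch below uses a
hypothetical parameter structure and helper names (they are not the actual
provider or IBAL definitions) to show the idea: set the 3-bit RNR retry count
to 7, which IB defines as infinite, and leave the RNR timeout alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* IB defines the 3-bit RNR retry count value 7 as "retry indefinitely". */
enum { WSD_RNR_RETRY_INFINITE = 7 };

/* Hypothetical stand-in for the QP parameters the provider would set. */
struct wsd_qp_retry_params {
    uint8_t rnr_retry_cnt;    /* 0..6 = finite retries, 7 = infinite */
    uint8_t rnr_nak_timeout;  /* encoded RNR timeout, left unchanged here */
};

/* Switch to infinite RNR retries without touching the RNR timeout. */
void wsd_set_infinite_rnr_retry(struct wsd_qp_retry_params *params)
{
    params->rnr_retry_cnt = WSD_RNR_RETRY_INFINITE;
}

/* True if the hello message will be retried until a receive is posted. */
bool wsd_rnr_retry_is_infinite(const struct wsd_qp_retry_params *params)
{
    return params->rnr_retry_cnt == WSD_RNR_RETRY_INFINITE;
}
```

Keeping the timeout field out of the helper is deliberate: the point of the
workaround is to avoid paying for a longer RNR timeout on every connection.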

There is no issue if the application crashes or the application's own
threads are hung.  If the application is stopped on a breakpoint, having
infinite RNR retries is actually the right thing to do, as the user is
actively debugging.  I believe a hang of the internal WSD switch's
thread is so unlikely that using infinite RNR retries is an
acceptable short term solution that accomplishes two things:
1. it maintains a higher connection rate
2. it allows WSD to ride through very high CPU loads and connection demands.

I'd like to get feedback on this, keeping in mind that the goal at this
point is to maximize connection rates and connectivity in busy systems,
and to get through WHQL tests.

Thanks,

- Fab




