[openib-general] [ANNOUNCE] Contribute RDS (ReliableDatagramSockets) to OpenIB

Tue Nov 8 13:08:13 PST 2005

At 12:33 PM 11/8/2005, Ranjit Pandit wrote:
> > Mike wrote:
> >  - RDS does not solve a set of failure models.  For example, if a RNIC / HCA
> > were to fail, then one cannot simply replay the operations on another RNIC /
> > HCA without extracting state, etc. and providing some end-to-end sync of
> > what was really sent / received by the application.  Yes, one can recover
> > from cable or switch port failure by using APM style recovery but that is
> > only one class of faults.  The harder faults either result in the end node
> > being cast out of the cluster or see silent data corruption unless
> > additional steps are taken to transparently recover - again app writers
> > don't want to solve the hard problems; they want that done for them.
>
>The current reference implementation of RDS solves the HCA failure case as well.
>Since applications don't need to keep connection states, it's easier
>to handle cases like HCA and intermediate path failures.
>As far as the application is concerned, every sendmsg 'could' result in a
>new connection setup in the driver.
>If the current path fails, RDS reestablishes a connection, if
>available, on a different port or a different HCA, and replays the
>failed messages.
>Using APM is not useful because it doesn't provide failover across HCAs.
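
The failover-and-replay behaviour described in the quoted text can be sketched roughly as below. This is only an illustration under stated assumptions, not the RDS reference implementation: Path, PathDown and FailoverSender are hypothetical names, and a "path" stands in for a connection on some port of some HCA.

```python
class PathDown(Exception):
    """Raised when a message is handed to a path that has failed."""


class Path:
    """Hypothetical stand-in for a connection over one HCA port."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.sent = []  # messages this path accepted

    def transmit(self, msg):
        if not self.up:
            raise PathDown(self.name)
        self.sent.append(msg)


class FailoverSender:
    """Keeps no per-connection state visible to the caller of sendmsg()."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.current = 0
        self.pending = []  # messages not yet accepted by any path

    def sendmsg(self, msg):
        self.pending.append(msg)
        self._flush()

    def _flush(self):
        while self.pending:
            try:
                self.paths[self.current].transmit(self.pending[0])
                self.pending.pop(0)
            except PathDown:
                # Pick another usable path; the failed message is replayed there.
                self._fail_over()

    def _fail_over(self):
        for index, path in enumerate(self.paths):
            if path.up:
                self.current = index
                return
        raise ConnectionError("no usable path left")
```

Note that the sketch only tracks what the local side handed to a path. It says nothing about what the remote application actually consumed.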

I think others may disagree about whether RDS solves the problem.  You have
no way of knowing whether something was received into the other node's
coherency domain without some intermediary or the application's involvement
to confirm that the data arrived.  As such, you might see many hardware
level acks occur and not know there is a real failure.  If an application
takes any action assuming that send complete means the data is delivered,
then it is subject to silent data corruption.  Hence, RDS can replay to its
heart's content, but until there is an application or middleware level of
acknowledgement, you have not solved the fault domain issues.  Some may be
happy with this as they just cast out the endnode from the cluster /
database, but others see the loss of a server as a big deal and so may not
be happy to see this occur.  It really comes down to whether you believe
losing a server is worthwhile just for a local failure event which is not
fatal to the rest of the server.
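
To illustrate the gap between a hardware level ack and an application level acknowledgement, here is a minimal sketch. All names in it (Receiver, Sender, adapter_failed, and so on) are hypothetical and chosen for the example; it models the argument above and is not taken from RDS or any verbs API.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical peer: data can reach the adapter without reaching the app."""

    adapter_buffer: list = field(default_factory=list)
    delivered: dict = field(default_factory=dict)  # seq -> payload seen by the app
    adapter_failed: bool = False

    def hardware_receive(self, seq, payload):
        # The adapter acks as soon as it holds the message.
        self.adapter_buffer.append((seq, payload))
        return True  # hardware level ack

    def drain_to_application(self):
        # If the adapter fails first, buffered messages never reach the app.
        if self.adapter_failed:
            self.adapter_buffer.clear()
            return []
        acked = []
        for seq, payload in self.adapter_buffer:
            self.delivered.setdefault(seq, payload)  # replayed duplicates are ignored
            acked.append(seq)
        self.adapter_buffer.clear()
        return acked  # application level acks


@dataclass
class Sender:
    next_seq: int = 0
    unacked: dict = field(default_factory=dict)  # seq -> payload awaiting an app ack

    def send(self, receiver, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        # A True return here is "send complete"; it proves nothing end to end.
        receiver.hardware_receive(seq, payload)
        return seq

    def on_app_acks(self, seqs):
        for seq in seqs:
            self.unacked.pop(seq, None)

    def replay(self, receiver):
        for seq, payload in sorted(self.unacked.items()):
            receiver.hardware_receive(seq, payload)
```

In this model a message sent just before the adapter fails is hardware-acked yet stays in unacked, so the sender knows to replay it once a healthy path exists. A sender that dropped its copy on send complete would have lost it silently.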

APM's value is the ability to recover from link failure.  It has the same 
value for any other ULP in that it recovers transparently to the ULP.

Mike 