[openib-general] [ANNOUNCE] Contribute RDS (ReliableDatagramSockets) to OpenIB

Michael Krause krause at cup.hp.com
Wed Nov 9 13:57:06 PST 2005


At 01:24 PM 11/9/2005, Greg Lindahl wrote:
>On Wed, Nov 09, 2005 at 12:18:28PM -0800, Michael Krause wrote:
>
> > So, things like HCA failure are not transparent and one cannot simply
> > replay the operations since you don't know what was really seen by the
> > other side unless the application performs the resync itself.
>
>I think you are over-stating the case. On the remote end, the kernel
>piece of RDS knows what it presented to the remote application, ditto
>on the local end. If only an HCA fails, and not the sending and
>receiving kernels or applications, that knowledge is not lost.
>
>Perhaps you were assuming that RDS would be implemented only in
>firmware on the HCA, and there is no kernel piece that knows what's
>going on. I hadn't seen that stated by anyone, and of course there are
>several existing and contemplated OpenIB devices that are considerably
>different from the usual offload engine. You could also choose to
>implement RDS using an offload engine and still keep enough state in
>the kernel to recover.

I hadn't assumed anything. I'm simply trying to understand the assertions
concerning availability and recovery. What you indicate above is that RDS
will implement a resync of the two sides of the association to determine
what has been successfully sent. It will then retransmit what has not,
transparently to the application. This then implies that the reliability of
the underlying interconnect isn't as critical per se, since the end-to-end
RDS protocol will assure that data is delivered to the RDS components in the
face of hardware failures. Correct?
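
To make the question concrete, here is a minimal sketch of the kind of
resync I understand is being described. It is purely illustrative: the
names (Sender, Receiver, Path, resync) are hypothetical and do not come
from RDS or OpenIB, and it assumes per-association sequence numbers and a
sender that keeps every message until the peer confirms it.

```python
from collections import OrderedDict


class Receiver:
    """Hypothetical receiving side: remembers what it handed to the app."""

    def __init__(self):
        self.last_delivered = -1  # highest sequence number given to the app
        self.app_queue = []       # what the application has actually seen

    def deliver(self, seq, payload):
        # Accept only the next expected message; duplicates and gaps are dropped.
        if seq == self.last_delivered + 1:
            self.app_queue.append(payload)
            self.last_delivered = seq


class Path:
    """Hypothetical path (e.g. one HCA); loses everything once it is down."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.up = True

    def deliver(self, seq, payload):
        if self.up:
            self.receiver.deliver(seq, payload)


class Sender:
    """Hypothetical sending side: keeps messages until the peer confirms them."""

    def __init__(self):
        self.next_seq = 0
        self.unacked = OrderedDict()  # seq -> payload, in send order

    def send(self, payload, path):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        path.deliver(seq, payload)  # may be silently lost if the path failed
        return seq

    def resync(self, last_delivered, path):
        # Drop what the peer says it already delivered, replay the rest in order.
        for seq in [s for s in self.unacked if s <= last_delivered]:
            del self.unacked[seq]
        for seq, payload in self.unacked.items():
            path.deliver(seq, payload)


def demo():
    rx = Receiver()
    primary = Path(rx)
    tx = Sender()
    tx.send("a", primary)
    primary.up = False            # the HCA fails here
    tx.send("b", primary)         # lost
    tx.send("c", primary)         # lost
    backup = Path(rx)             # failover to a second path
    tx.resync(rx.last_delivered, backup)
    return rx.app_queue
```

In this sketch the application on either side never sees the failure;
whether RDS intends to provide exactly this is the question.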

Mike 