[openib-general] [ANNOUNCE] Contribute RDS(ReliableDatagramSockets) to OpenIB

Michael Krause krause at cup.hp.com
Fri Nov 11 16:03:27 PST 2005


At 01:02 PM 11/11/2005, Ranjit Pandit wrote:
>On 11/11/05, Michael Krause <krause at cup.hp.com> wrote:
> > Please clarify the following which was in the document provided by Oracle.
> >
> > On page 3 of the RDS document, under the section "RDP Interface", the 2nd
> > and 3rd paragraphs state:
> >
> >    * RDP does not guarantee that a datagram is delivered to the remote
> > application.
> >    * It is up to the RDP client to deal with datagrams lost due to
> > transport failure or remote application failure.
> >
> > The HCA is still a fault domain with RDS - it does not address flushing
> > data out of the HCA fault domain, nor does it sound like it ensures that
> > CQE loss is recoverable.
> >
> > I do believe RDS will replay all of the sendmsg's that it believes are
> > pending, but it has no way to determine if already sent sendmsgs were
> > actually successfully delivered to the remote application unless it
> > provides some level of resync of the outstanding sends not completed from
> > an application's perspective as well as any state updated via RDMA
> > operations which may occur without an explicit send operation to flush to
> > a known state.  I'm still trying to ascertain whether RDS completely
> > recovers from HCA failure (assuming there is another HCA / path available)
> > between the two endnodes.
>
>RDS will replay the sends that are completed in error by the HCA,
>which typically would happen if the current path fails or the remote
>node/HCA dies.

Does this mean that the receiving RDS entity is responsible for dealing
with duplicates?  A Send completion error does not mean that the receiving
endnode did not receive the data, for either IB or iWARP; it only indicates
that the Send operation failed, which could be just the loss of the ACK
after the Send completed on the receiver.  Such a scenario implies that RDS
would have to comprehend which buffers have actually been consumed before
retransmission, i.e. perform a resync; otherwise the application layer could
receive duplicate data, which can cause corruption or other problems
depending on the application.  Tolerance will vary by application, so the
ULP must present consistent semantics if it is to support a broader set of
applications than perhaps the initially targeted one.
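
To make the duplicate concern concrete, here is a minimal sketch in Python
of one way a receiver could suppress a replayed datagram.  It assumes each
datagram carries a per-connection sequence number; that assumption, and the
names DedupReceiver and on_datagram, are hypothetical and are not taken from
the RDS document.

```python
class DedupReceiver:
    """Hypothetical receiver that drops replayed datagrams by sequence number."""

    def __init__(self):
        self.next_expected = 0   # lowest sequence number not yet delivered
        self.delivered = []      # payloads handed to the application, in order

    def on_datagram(self, seq, payload):
        """Return True if the payload was delivered, False if it was a duplicate."""
        if seq < self.next_expected:
            # Already consumed: this is a replay, e.g. after a lost ACK.
            return False
        if seq > self.next_expected:
            # Something was lost in between; dedup alone cannot fix that.
            raise ValueError("gap in sequence; a resync is needed")
        self.delivered.append(payload)
        self.next_expected += 1
        return True


rx = DedupReceiver()
rx.on_datagram(0, b"a")
rx.on_datagram(1, b"b")
# The ACK for seq 1 is lost; the sender sees a completion error and replays it.
replayed = rx.on_datagram(1, b"b")
```

Without some state of this kind on the receive side, the replayed datagram
would reach the application a second time.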

>In case of a catastrophic error on the local HCA, subsequent sends will
>fail (for a certain time (session_time_wait)) as if there were no
>alternate path available at that time. On getting an error the application
>should discard any sends unacknowledged by its peer and take corrective
>action.

Does "unacknowledged by the peer" mean at the interconnect level or the
application level?  Again, how is the receive buffer management handled?
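
The distinction matters for what the sender may discard.  As a rough sketch
in Python, assuming a hypothetical application-level acknowledgement
separate from the interconnect completion (the names PendingSends,
transport_completed and app_acked are made up for illustration):

```python
class PendingSends:
    """Hypothetical sender-side record of sends awaiting an application-level ack."""

    def __init__(self):
        self.pending = {}  # seq -> payload, kept until the peer application acks

    def sent(self, seq, payload):
        self.pending[seq] = payload

    def transport_completed(self, seq):
        # An interconnect-level completion does not release the send: at most
        # it says the data reached the remote HCA, not the remote application.
        return seq in self.pending

    def app_acked(self, seq):
        # Only an acknowledgement from the peer application releases the send.
        self.pending.pop(seq, None)

    def to_resync(self):
        # After a failure, these are the sends whose fate is unknown.
        return sorted(self.pending)


tx = PendingSends()
tx.sent(0, b"a")
tx.sent(1, b"b")
tx.transport_completed(0)
tx.transport_completed(1)
tx.app_acked(0)
unknown = tx.to_resync()
```

Here seq 1 completed at the interconnect level but was never acknowledged
by the application, so it is the one send that would need a resync.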

>After the time_wait is over, subsequent sends will initiate a brand new 
>connection which could use the alternate HCA ( if the path is available).

This is understood.

Mike 