[openfabrics-ewg] [openib-general] Minutes for January 15, 2007 teleconference about OFED 1.2 development progress toward code freeze

Caitlin Bestler caitlinb at broadcom.com
Mon Jan 22 16:22:18 PST 2007


openib-general-bounces at openib.org wrote:
>>> When RDMA is used, a message is transferred from card A (in node
>>> A) to card B (in node B), card B delivers the message to the user
>>> buffer, and sends ACK to card A, but the ACK is lost due to a switch
>>> failure. So the process on node A sees this transfer fail, but the
>>> process on node B checks the memory and gets the message (success).
>>>
>>> If send/recv (SRQ) is used, is it possible that the process on
>>> node A gets a failure, but the process on node B successfully gets
>>> the message?
>> 
>> Yes, of course, for exactly the same reason you describe above (lost
>> ACK).
> 
> Thanks. So it is NOT possible that sender gets success, but
> receiver gets failure, right ?
> 

The fact that the message was acknowledged by card B at
best means that the message is fully in the receiver's memory.
Card B cannot tell the Sender that the receiver lived long
enough to process or even notice the message.

The only thing that the Sender should infer from a send completion
is that the connection is still alive and that it no longer needs
to maintain its copy of the message. It also knows that the
Receiver will not receive any later message before it receives
this one.

If there is an application-specific reason to know more, then
it needs to rely on an application layer message from the
receiver. For most applications, such a response is a natural
part of the application layer protocol anyway.
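
To make the distinction concrete, here is a minimal sketch of an
application-layer confirmation. It is not verbs code: a local socket
pair stands in for the RDMA connection, and the names
send_and_confirm, receiver and the "DONE:" reply format are
hypothetical, chosen only for illustration. The point it shows is that
a successful send tells the Sender nothing about processing; only the
reply written by the receiving application does.

```python
import socket
import threading


def receiver(sock, processed):
    """Receive one request, process it, then confirm at the application layer."""
    data = sock.recv(1024)  # tiny single message; no framing, for brevity
    if not data:
        return
    msg_id, _, payload = data.partition(b":")
    processed.append(payload.decode())  # the application has consumed the message
    sock.sendall(b"DONE:" + msg_id)     # hypothetical application-layer reply


def send_and_confirm(sock, msg_id, payload, timeout=1.0):
    """Return True only if the peer application confirms it processed msg_id."""
    sock.sendall(msg_id + b":" + payload)
    # Getting this far is loosely analogous to a send completion: the
    # transport accepted the data, but nothing is known about processing.
    sock.settimeout(timeout)
    try:
        reply = sock.recv(1024)
    except OSError:  # covers a timeout as well as a reset connection
        return False
    return reply == b"DONE:" + msg_id


def demo():
    """Run one confirmed exchange over a local socket pair."""
    a, b = socket.socketpair()
    processed = []
    worker = threading.Thread(target=receiver, args=(b, processed))
    worker.start()
    ok = send_and_confirm(a, b"1", b"hello")
    worker.join()
    a.close()
    b.close()
    return ok, processed
```

If the receiver thread is never started, the sendall() call still
succeeds because the data is buffered by the transport, but
send_and_confirm() returns False once the timeout expires. That is the
case a send completion alone cannot distinguish.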

 




