[ofa-general] RE: How fast to get RDMA_CM_EVENT_DISCONNECTED ?

Tang, Changqing changquing.tang at hp.com
Wed Apr 11 15:49:34 PDT 2007


 

> -----Original Message-----
> From: Sean Hefty [mailto:mshefty at ichips.intel.com] 
> Sent: Wednesday, April 11, 2007 4:50 PM
> To: Tang, Changqing
> Cc: general at lists.openfabrics.org
> Subject: Re: How fast to get RDMA_CM_EVENT_DISCONNECTED ?
> 
> > A question about rdmacm library. I use rdma_connect/accept to wire
> > the IB connection between A and B. Somehow the IB connection is broken
> > by either process B dies, or a bad cable. If process A just receives
> > messages from process B, can process A get a
> > RDMA_CM_EVENT_DISCONNECTED event ? if yes, how fast A can get such
> > event ?
> 
> If the process B dies, the kernel IB CM on B's system will 
> automatically disconnect.  Process A should get this fairly 
> close to when process B dies.
> 
> I'm not as sure about the timing for a bad cable.
> 
> Slightly off topic, but how do you handle flow control 
> between process A and B if process A only receives?

Yes. Internally in A, if the number of receives exceeds the low-water
mark (4), an ACK is sent back. I assume the ACK has not been triggered
at that moment. When A is trying to receive a message from B and the
message never shows up, A actually sends a heartbeat back to B. However,
it takes several seconds for this heartbeat to complete with an error
(we configure the timeout to ~1 sec and the retry count to 7).
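
To put a rough number on "several seconds", here is a small sketch of
the arithmetic. It assumes the ~1 sec timeout corresponds to an IB local
ACK timeout exponent of 18 (4.096 us * 2^18, about 1.07 sec), which is my
guess at the nearest encodable value, and that the send is attempted
once plus retry-count retries. The function names are made up for
illustration, and real hardware may wait longer than the configured
value, so treat the result as an estimate only.

```c
#include <assert.h>

/* Local ACK timeout in microseconds for a 5-bit IB timeout exponent:
 * 4.096 us * 2^exponent. Exponent 0 means "no timeout", so it is
 * rejected here. (Hypothetical helper, not part of any library.) */
static double ack_timeout_us(unsigned exponent)
{
    assert(exponent >= 1 && exponent <= 31);
    return 4.096 * (double)(1ULL << exponent);
}

/* Rough time in seconds until a send completes in error: the first
 * attempt plus retry_cnt retries, each waiting one ACK timeout.
 * (Hypothetical helper; ignores any extra delay added by the HCA.) */
static double est_detect_seconds(unsigned exponent, unsigned retry_cnt)
{
    return (retry_cnt + 1) * ack_timeout_us(exponent) / 1e6;
}
```

Calling est_detect_seconds(18, 7) gives roughly 8.6 seconds, which is
in line with the several-second delay we see for the heartbeat.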

Several seconds to detect a connection failure is not acceptable for us,
so if I use rdmacm, I want to know whether I can detect the connection
failure faster than with the heartbeat message.

Again, if there is a cable issue, is a DISCONNECT event still
generated eventually?


--CQ


> 
> - Sean
> 


