[ofa-general] RE: How fast to get RDMA_CM_EVENT_DISCONNECTED ?
Tang, Changqing
changquing.tang at hp.com
Wed Apr 11 15:49:34 PDT 2007
> -----Original Message-----
> From: Sean Hefty [mailto:mshefty at ichips.intel.com]
> Sent: Wednesday, April 11, 2007 4:50 PM
> To: Tang, Changqing
> Cc: general at lists.openfabrics.org
> Subject: Re: How fast to get RDMA_CM_EVENT_DISCONNECTED ?
>
> > A question about rdmacm library. I use rdma_connect/accept to wire
> > the IB connection between A and B. Somehow the IB connection is broken
> > by either process B dies, or a bad cable. If process A just receives
> > messages from process B, can process A get a
> > RDMA_CM_EVENT_DISCONNECTED event ? if yes, how fast A can get such
> > event ?
>
> If the process B dies, the kernel IB CM on B's system will
> automatically disconnect. Process A should get this fairly
> close to when process B dies.
>
> I'm not as sure about the timing for a bad cable.
>
> Slightly off topic, but how do you handle flow control
> between process A and B if process A only receives?
Yes. Internally in A, if the number of receives exceeds the low-water
mark (4), an ACK is sent back. I assume the ACK has not been triggered
at that moment. When A is trying to receive a message from B and the
message never shows up, A actually sends a heartbeat back to B. However,
it takes several seconds for this heartbeat to complete with an error
(we configure the timeout to ~1 sec and the retry count to 7).

Taking several seconds to detect a connection failure is not acceptable
for us, so if I use rdmacm, I want to know whether I can detect the
connection failure faster than with the heartbeat message.

Again, if there is a cable issue, is a DISCONNECT event still
generated eventually?
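
For reference, the "several seconds" figure for the heartbeat can be estimated with a rough sketch like the one below. It assumes the ~1 sec timeout corresponds to the InfiniBand local ACK timeout encoding (4.096 us * 2^exponent) with an exponent of 18, which is my guess at the setting and not something stated above. The function names are hypothetical helpers, not part of libibverbs or librdmacm, and real hardware may wait longer than the nominal value, so treat the result as a lower-bound estimate.

```c
#include <assert.h>

/* Nominal local ACK timeout for a given 5-bit exponent:
 * 4.096 microseconds * 2^exponent. An exponent of 0 disables
 * the timeout, so it is reported here as 0 seconds. */
static double ack_timeout_seconds(unsigned exponent)
{
    assert(exponent <= 31);            /* the field is 5 bits wide */
    if (exponent == 0)
        return 0.0;
    return 4.096e-6 * (double)(1ULL << exponent);
}

/* Rough time until a send completes with a retry-exceeded error:
 * the first transmission plus retry_cnt retries, each waiting
 * one nominal timeout before giving up. */
static double detection_time_seconds(unsigned exponent, unsigned retry_cnt)
{
    return ack_timeout_seconds(exponent) * (double)(retry_cnt + 1);
}
```

With an exponent of 18 and a retry count of 7, detection_time_seconds(18, 7) comes to roughly 8.6 seconds, which is in line with the delay we see before the heartbeat fails.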
--CQ
>
> - Sean
>