[openib-general] Question about QP's in timewait state and CM stale conn rejects

Arlin Davis ardavis at ichips.intel.com
Wed Aug 16 14:42:36 PDT 2006


We are running into connection reject issues (IB_CM_REJ_STALE_CONN) with 
our application under heavy load and lots of connections.

We occasionally get a reject because the QP is in a timewait state 
left over from a prior connection. It appears that the CM keeps track of 
QPs in timewait state on both sides of the connection, 
independently of the verbs layer, even after the QP has been destroyed 
at the verbs level. I can actually create a new QP via verbs and its 
QP number could still be on the CM timewait queue, waiting for the timer 
to pop and remove it. If this is the case, my attempts to connect using 
this QP will fail with a reject.
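
To make the suspected behavior concrete, here is a minimal sketch in C of
a timewait table keyed by QP number. It is only a model of the situation
described above, not the actual CM code: the names tw_table, tw_enter,
tw_expire and tw_is_stale, the fixed table size, and the tick-based clock
are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only; these are not the real ib_cm structures. */
#define TW_MAX 16

struct tw_entry {
    uint32_t qpn;        /* QP number the model still remembers */
    uint64_t expires_at; /* tick at which timewait ends */
    bool     in_use;
};

struct tw_table {
    struct tw_entry slot[TW_MAX];
};

/* Record that qpn entered timewait at 'now' for 'duration' ticks. */
static bool tw_enter(struct tw_table *t, uint32_t qpn,
                     uint64_t now, uint64_t duration)
{
    assert(t != NULL);
    for (size_t i = 0; i < TW_MAX; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i].qpn = qpn;
            t->slot[i].expires_at = now + duration;
            t->slot[i].in_use = true;
            return true;
        }
    }
    return false; /* table full */
}

/* Drop every entry whose timer has popped by 'now'. */
static void tw_expire(struct tw_table *t, uint64_t now)
{
    assert(t != NULL);
    for (size_t i = 0; i < TW_MAX; i++) {
        if (t->slot[i].in_use && t->slot[i].expires_at <= now)
            t->slot[i].in_use = false;
    }
}

/*
 * Would a connect attempt using qpn at 'now' be treated as stale?
 * Note that the table knows nothing about whether the verbs-level QP
 * was destroyed and re-created in the meantime; it matches on the
 * number alone.
 */
static bool tw_is_stale(struct tw_table *t, uint32_t qpn, uint64_t now)
{
    assert(t != NULL);
    tw_expire(t, now);
    for (size_t i = 0; i < TW_MAX; i++) {
        if (t->slot[i].in_use && t->slot[i].qpn == qpn)
            return true;
    }
    return false;
}
```

In this model, if QP number 0x404 enters timewait at tick 100 for 50
ticks, a new QP that happens to get number 0x404 is reported stale at
tick 120 and accepted at tick 150, regardless of what the verbs layer
did with the old QP in between.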

How can a consumer know for sure that the new QP will not be in a 
timewait state according to the CM? Does it make sense to push the 
timewait functionality down into verbs? If not, is there a way for the 
CM to hold a reference to the QP until the timewait expires?

-arlin



