[openib-general] Question about QP's in timewait state and CM stale conn rejects
Arlin Davis
ardavis at ichips.intel.com
Wed Aug 16 14:42:36 PDT 2006
We are running into connection rejects (IB_CM_REJ_STALE_CONN) in
our application under heavy load with many connections.
We occasionally get a reject because the QP is in timewait state,
left over from a prior connection. It appears that the CM keeps track of
QPs in timewait state on both sides of the connection,
independently of the verbs layer, even after the QP has been destroyed
at the verbs level. I can actually create a new QP via verbs and it
could still be on the CM timewait queue waiting for the timer to pop and
be removed. If this is the case, my attempts to connect using this QP
will fail with a reject.
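To make the sequence above concrete, here is a toy model in Python. It is only a sketch of the behaviour as I understand it, not the real ib_cm or verbs code: the class names, method names, the QP number 0x100 and the 5 second timewait are all made up for illustration, and the real CM matches on more than the QP number alone.

```python
# Toy model only. ToyCM and ToyVerbs are hypothetical names and do not
# correspond to the real ib_cm or verbs interfaces.
REJ_STALE_CONN = "IB_CM_REJ_STALE_CONN"
ACCEPTED = "ACCEPTED"


class ToyCM:
    """Tracks QP numbers in timewait, independently of the verbs layer."""

    def __init__(self, timewait_secs):
        self.timewait_secs = timewait_secs
        self.timewait = {}  # QP number -> time at which timewait expires

    def disconnect(self, qpn, now):
        # On disconnect the QP number goes onto the timewait queue.
        self.timewait[qpn] = now + self.timewait_secs

    def connect(self, qpn, now):
        expiry = self.timewait.get(qpn)
        if expiry is not None and now < expiry:
            return REJ_STALE_CONN  # timer has not popped yet
        self.timewait.pop(qpn, None)  # expired entry, drop it
        return ACCEPTED


class ToyVerbs:
    """Hands out QP numbers with no knowledge of the CM timewait queue."""

    def __init__(self):
        self.free = [0x100]  # a single QP number, to force reuse

    def create_qp(self):
        return self.free.pop()

    def destroy_qp(self, qpn):
        self.free.append(qpn)  # immediately available again


def demo():
    cm, verbs = ToyCM(timewait_secs=5.0), ToyVerbs()
    qpn = verbs.create_qp()
    first = cm.connect(qpn, now=0.0)
    cm.disconnect(qpn, now=1.0)
    verbs.destroy_qp(qpn)              # gone at the verbs level
    new_qpn = verbs.create_qp()        # same number handed out again
    during = cm.connect(new_qpn, now=2.0)  # still inside timewait
    after = cm.connect(new_qpn, now=6.0)   # timewait has expired
    return qpn, new_qpn, first, during, after
```

In this model demo() gets the same QP number back from create_qp(), the connect attempt made during timewait is rejected, and the one made after the timer expires goes through.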
How can a consumer know for sure that the new QP will not be in a
timewait state according to the CM? Does it make sense to push the
timewait functionality down into verbs? If not, is there a way for the
CM to hold a reference to the QP until the timewait expires?
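As a rough illustration of the last option, the sketch below shows one way a reference held by the CM could keep a QP number out of circulation until timewait expires. This is an assumption about how such a scheme might look, not a description of any existing code: ToyQPNAllocator, get() and put() are hypothetical names.

```python
# Hypothetical sketch: a QP number returns to the free pool only when
# every holder (the consumer and, during timewait, the CM) has dropped
# its reference.
class ToyQPNAllocator:
    def __init__(self, qpns):
        self.free = list(qpns)
        self.refs = {}  # QP number -> outstanding reference count

    def create_qp(self):
        qpn = self.free.pop(0)
        self.refs[qpn] = 1  # the consumer's own reference
        return qpn

    def get(self, qpn):
        # Taken by the CM when the QP enters timewait.
        self.refs[qpn] += 1

    def put(self, qpn):
        # Dropped by the consumer on destroy, and by the CM when the
        # timewait timer pops.
        self.refs[qpn] -= 1
        if self.refs[qpn] == 0:
            del self.refs[qpn]
            self.free.append(qpn)


def demo_refs():
    alloc = ToyQPNAllocator([0x100, 0x101])
    old = alloc.create_qp()
    alloc.get(old)            # CM: connection enters timewait
    alloc.put(old)            # consumer destroys the QP
    new = alloc.create_qp()   # cannot be the number still in timewait
    alloc.put(old)            # CM: timewait expired, number is free again
    return old, new, list(alloc.free)
```

With this scheme the consumer never sees a QP number that the CM still considers to be in timewait, at the cost of that number staying unavailable for the length of the timewait period.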
-arlin