[openib-general] Question about QP's in timewait state and CM stale conn rejects
Or Gerlitz
ogerlitz at voltaire.com
Sun Aug 20 04:53:48 PDT 2006
This email appears in the archive but does not seem to have been distributed
to the subscribers, so I am reposting it.
Or Gerlitz wrote:
> Arlin Davis wrote:
>> We are running into connection reject issues (IB_CM_REJ_STALE_CONN)
>> with our application under heavy load and lots of connections.
>>
>> We occasionally get a reject based on the QP being in timewait state
>> leftover from a prior connection. It appears that the CM keeps track
>> of the QP's in timewait state on both sides of the connection,
>
> How did you verify that? The CM generates a REJ with IB_CM_REJ_STALE_CONN
> in two flows on the passive side (i.e. rejecting a REQ) and in one flow on
> the active side (i.e. rejecting a REP).
>
>> How can a consumer know for sure that the new QP will not be in a
>> timewait state according to the CM? Does it make sense to push the
>> timewait functionality down into verbs? If not, is there a way for the
>> CM to hold a reference to the QP until the timewait expires?
>
> Just to emphasize what Sean has pointed out, you are asking how a CM
> consumer can know that a **local** QPN is not in the timewait state
> according to the **remote** CM. Since the issue is with the remote CM,
> it seems to me that pushing timewait down into verbs is not the right
> direction to look in.
>
> Or.
>
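
To make the local/remote distinction concrete, here is a toy model of the
situation described above. It is only a sketch and not the ib_cm code: the
class ToyRemoteCM, its methods, the peer name and the 4-second timewait value
are all hypothetical, and the real CM matches connections on more than a
(peer, QPN) pair. It assumes the remote CM remembers the QPNs of recently
closed connections and rejects a REQ that reuses one of them too early.

```python
from dataclasses import dataclass, field

REJ_STALE_CONN = "IB_CM_REJ_STALE_CONN"  # reject reason discussed in this thread
ACCEPT = "REP"                            # stand-in for "REQ accepted, REP sent"


@dataclass
class ToyRemoteCM:
    """Hypothetical model of the timewait bookkeeping kept by the *remote* CM."""
    timewait_secs: float
    # (peer, qpn) -> time at which the timewait entry expires
    _timewait: dict = field(default_factory=dict)

    def connection_closed(self, peer, qpn, now):
        # The remote CM starts tracking the peer's QPN when the connection ends.
        self._timewait[(peer, qpn)] = now + self.timewait_secs

    def handle_req(self, peer, qpn, now):
        expiry = self._timewait.get((peer, qpn))
        if expiry is not None and now < expiry:
            # The peer's QPN is still in timewait from the remote CM's view.
            return REJ_STALE_CONN
        # Entry absent or expired: forget it and accept the request.
        self._timewait.pop((peer, qpn), None)
        return ACCEPT


cm = ToyRemoteCM(timewait_secs=4.0)
cm.connection_closed("hostA", 0x11, now=0.0)

first = cm.handle_req("hostA", 0x11, now=1.0)   # same QPN reused too early
second = cm.handle_req("hostA", 0x12, now=1.0)  # a different QPN
third = cm.handle_req("hostA", 0x11, now=5.0)   # same QPN after timewait expired
```

In this model the table lives entirely on the remote side, so nothing the
local verbs layer tracks about its own QP can tell the consumer whether the
first request will be rejected.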