[openib-general] Question about QP's in timewait state and CM stale conn rejects

Or Gerlitz ogerlitz at voltaire.com
Sun Aug 20 04:53:48 PDT 2006


This email appears in the archive but seems not to have been distributed 
to the subscribers, so I am reposting it.

Or Gerlitz wrote:
> Arlin Davis wrote:
>> We are running into connection reject issues (IB_CM_REJ_STALE_CONN) 
>> with our application under heavy load and lots of connections.
>>
>> We occasionally get a reject based on the QP being in timewait state 
>> leftover from a prior connection. It appears that the CM keeps track 
>> of the QP's in timewait state on both sides of the connection, 
> 
> How did you verify that? The CM generates a REJ with IB_CM_REJ_STALE_CONN 
> in two flows on the passive side (i.e. rejecting a REQ) and one flow on 
> the active side (i.e. rejecting a REP).
> 
>> How can a consumer know for sure that the new QP will not be in a 
>> timewait state according to the CM? Does it make sense to push the 
>> timewait functionality down into verbs? If not, is there a way for the 
>> CM to hold a reference to the QP until the timewait expires?
> 
> Just to emphasize what Sean has pointed out, you are asking how a CM 
> consumer can know that a **local** QPN is not in the timewait state 
> according to the **remote** CM. Since the issue is with the remote CM, 
> it seems to me that pushing timewait down into verbs is not the correct 
> direction to look in.
> 
> Or.
> 
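To illustrate the local-QPN versus remote-CM point, here is a toy model in 
Python. It is only a sketch under stated assumptions, not the openib CM code: 
the class RemoteCM, its methods, the string return values, and the explicit 
"now" clock argument are all hypothetical names chosen for illustration. It 
assumes the passive side remembers (peer node, peer QPN) pairs for a fixed 
timewait period and rejects a REQ that reuses such a pair.

```python
# Toy model only. RemoteCM and its methods are hypothetical names,
# not part of the openib CM API.

class RemoteCM:
    """Passive-side CM that remembers peer QPNs while they are in timewait."""

    def __init__(self, timewait_secs):
        self.timewait_secs = timewait_secs
        # (peer_id, peer_qpn) -> time at which the timewait period ends
        self._timewait = {}

    def disconnect(self, peer_id, peer_qpn, now):
        # On teardown, the peer's QPN enters timewait on THIS side.
        self._timewait[(peer_id, peer_qpn)] = now + self.timewait_secs

    def handle_req(self, peer_id, peer_qpn, now):
        # A REQ reusing a QPN still in timewait here is treated as stale.
        key = (peer_id, peer_qpn)
        expiry = self._timewait.get(key)
        if expiry is not None and now < expiry:
            return "REJ_STALE_CONN"
        # Timewait has expired (or never existed): forget it and accept.
        self._timewait.pop(key, None)
        return "REP"


passive = RemoteCM(timewait_secs=5.0)
passive.disconnect("nodeA", 0x404, now=0.0)

early_reuse = passive.handle_req("nodeA", 0x404, now=1.0)   # same QPN, too soon
other_qpn = passive.handle_req("nodeA", 0x405, now=1.0)     # different QPN
late_reuse = passive.handle_req("nodeA", 0x404, now=6.0)    # after timewait
```

In this model the active side has no view of the passive side's table, so a 
freshly created QP that happens to get a recently used QPN can be rejected 
even though nothing on the local node considers that QPN to be in timewait.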





