[openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

Sean Hefty mshefty at ichips.intel.com
Tue Aug 29 14:02:43 PDT 2006


Michael S. Tsirkin wrote:
>>If we completely ignore timewait, what conditions are required to have a problem 
>>occur?
> 
> Outstanding packets with PSNs and QP numbers that coincide between the 2 connections.
> Look for "Stale packet" in IB spec.

From what I can tell, a QP will receive an incoming packet incorrectly if the 
SLID and PSN match those of its current connection, which agrees with your 
statement.  Stale packets could cause this, but so can misconfigured QPs.  (I'm 
just trying to understand how large the problem is, and how much of it timewait 
solves.)
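
To make the condition above concrete, here is a minimal sketch in C of the match I am describing.  The struct layouts and the function accepted_as_current() are hypothetical and heavily simplified; they are not taken from the spec or from any driver, and they only model the "QP number, SLID and PSN all match" case.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PSN_MASK 0xFFFFFFu   /* PSNs are 24 bits wide */

/* Hypothetical, simplified view of the state a connected QP checks against. */
struct conn_state {
    uint32_t qpn;           /* local QP number */
    uint16_t remote_lid;    /* LID of the peer on the current connection */
    uint32_t expected_psn;  /* next PSN the QP expects to receive */
};

/* Hypothetical, simplified view of the header fields of an incoming packet. */
struct pkt_hdr {
    uint32_t dest_qpn;
    uint16_t slid;
    uint32_t psn;
};

/*
 * Returns true when the packet is indistinguishable from one belonging to
 * the current connection.  Nothing here can tell a stale packet from an old
 * connection apart from a packet sent by a misconfigured QP.
 */
static bool accepted_as_current(const struct conn_state *c,
                                const struct pkt_hdr *p)
{
    return p->dest_qpn == c->qpn &&
           p->slid == c->remote_lid &&
           (p->psn & SKETCH_PSN_MASK) == (c->expected_psn & SKETCH_PSN_MASK);
}
```

The point of the sketch is only that the check has no notion of which connection a packet came from, which is why both stale packets and misconfigured QPs can trigger it.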

> Hmm. We can ask user not to post sends if he rejects the REP.
> Then there won't be stale packets. But is there anything in spec that
> forbids this?

See page 690 of the spec.  It implies that the QP should go to RTS only if an 
RTU is sent.

Note that if the DREQ is lost, it's possible for the remote side to initiate a 
send after the local QP has exited timewait, which seems to defeat its purpose 
in this case.

> Maybe an extra call is better than assuming things beyond spec
> requirements?

I'm still trying to determine who provides the timewait duration.  Verbs allows 
users to connect QPs without going through the CM, and several apps do this. 
Timewait provides only partial protection against this problem, so maybe we 
restrict it to handling only the most common case, which is after the QP has 
transitioned to RTS.

Another alternative to solving this problem is to select a starting PSN value 
that makes it likely stale packets will be discarded.  Can the lower level driver 
be of any assistance here?  I.e. would it know the last PSN value received on a QP?
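
As a rough illustration of the PSN idea, here is a minimal sketch in C.  It assumes PSNs are 24-bit values compared modulo 2^24, and that a receiver treats a PSN falling behind its expected PSN as a duplicate rather than as new data.  The helpers pick_start_psn() and psn_is_behind() are hypothetical, not existing driver or verbs calls, and the gap value is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define PSN_MASK 0xFFFFFFu   /* PSNs are 24 bits wide */
#define PSN_HALF 0x800000u   /* half of the 24-bit PSN space */

/* Signed distance from a to b in 24-bit modular space, in [-2^23, 2^23). */
static int32_t psn_delta(uint32_t a, uint32_t b)
{
    uint32_t d = (b - a) & PSN_MASK;

    return (d & PSN_HALF) ? (int32_t)d - 0x1000000 : (int32_t)d;
}

/*
 * Hypothetical helper: given the last PSN seen on the old connection, pick
 * a starting PSN for the new one that lies 'gap' PSNs past it, so that
 * packets left over from the old connection fall behind the new start.
 */
static uint32_t pick_start_psn(uint32_t last_psn, uint32_t gap)
{
    assert(gap < PSN_HALF);  /* keep old PSNs within the "behind" half */
    return (last_psn + 1 + gap) & PSN_MASK;
}

/* Nonzero if psn lies behind expected in modular order. */
static int psn_is_behind(uint32_t psn, uint32_t expected)
{
    return psn_delta(psn, expected) > 0;
}
```

For example, with a last PSN of 0xFFFFFE and a gap of 0x1000, the new starting PSN wraps to 0x000FFF, and psn_is_behind(0xFFFFFE, 0x000FFF) is nonzero.  This only works if something can report the last PSN in the first place, which is the question for the lower level driver.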

- Sean



