[ofa-general] RE: CM goes to timewait state without waiting for disconnect reply

Amir Vadai amirv at mellanox.co.il
Thu Apr 17 12:49:41 PDT 2008


I understand - I'll make sure the flow you described will be used.

Thanks a lot,
- Amir.


-----Original Message-----
From: Sean Hefty [mailto:sean.hefty at intel.com] 
Sent: Thursday, April 17, 2008 21:52
To: Amir Vadai
Cc: general at lists.openfabrics.org; Oren Duer
Subject: RE: CM goes to timewait state without waiting for disconnect reply

>What I see is that a connection request comes from the client to 
>the server, and the server replies with a reject. The reason for the reject 
>is that a timewait structure already exists for this QPN. That is 
>because the client thinks the connection is closed and reuses the QPN, 
>but the server has not finished cleaning up the connection.

This is an unavoidable situation.  There's no coordination between the timewait states on different systems, so it's always possible for one to re-connect before the other system has exited timewait.

However, in your case, the problem is that the client is trying to re-use the QPN without knowing whether it has exited the local timewait state.  Instead, have the client issue a DREQ, and then wait for the local timewait state to exit before trying to re-use the QPN.

This would then be the sequence:

client		server
sends DREQ
			enters timewait
			sends DREP
enters timewait
exits timewait
destroy cm_id
new connection

Your hope at this point is that the server exits timewait before the client does, which, while likely, is not guaranteed.
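
The sequence above can be sketched as a small state model. This is only a toy simulation in Python, not the CM API: the Endpoint class, the State names, and every method here are hypothetical and exist only to show the ordering. It assumes the DREQ and DREP are delivered immediately and that timewait exit is signalled by an explicit call.

```python
from enum import Enum, auto


class State(Enum):
    ESTABLISHED = auto()
    DREQ_SENT = auto()
    TIMEWAIT = auto()
    IDLE = auto()  # local timewait has exited; the QPN may be reused


class Endpoint:
    """Toy model of one side of a connection (hypothetical, not a CM API)."""

    def __init__(self, name, log):
        self.name = name
        self.log = log  # shared list recording (endpoint, state) transitions
        self.state = State.ESTABLISHED

    def _enter(self, state):
        self.state = state
        self.log.append((self.name, state.name))

    def send_dreq(self, peer):
        # Active side starts the disconnect, then waits for the reply.
        self._enter(State.DREQ_SENT)
        peer.on_dreq(self)

    def on_dreq(self, peer):
        # Passive side enters timewait first, then answers with a DREP.
        self._enter(State.TIMEWAIT)
        peer.on_drep()

    def on_drep(self):
        # Active side enters timewait only once the DREP has arrived.
        self._enter(State.TIMEWAIT)

    def timewait_exit(self):
        # Stands in for the local timewait timer expiring.
        self._enter(State.IDLE)

    def can_reuse_qpn(self):
        # Reuse is gated on the LOCAL timewait exit only.
        return self.state is State.IDLE


log = []
client = Endpoint("client", log)
server = Endpoint("server", log)

client.send_dreq(server)
reuse_before_exit = client.can_reuse_qpn()   # still in timewait
client.timewait_exit()
reuse_after_exit = client.can_reuse_qpn()    # local timewait has exited
```

Note that in this model the server is still in timewait when the client becomes free to reuse the QPN, because nothing ties the two timers together. The server normally exits first only because it entered timewait first.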

- Sean



