[ofa-general] RE: CM goes to timewait state without waiting for disconnect reply

Amir Vadai amirv at mellanox.co.il
Thu Apr 17 11:41:03 PDT 2008


There are some problems that I hope are related to this.

But the one I know for sure is this:
I have a very busy SDP server with lots of connections coming up and going down,
and a client with many threads that open and close connections.

What I see is that a connection request arrives at the server from the client,
and the server replies with a reject. The reason for the reject is that a timewait structure
already exists for this QPN. That happens because the client thinks the connection is closed and reuses the QPN, but the server has not finished cleaning up the connection.
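
To make the race concrete, here is a minimal user-space sketch of it. It is a toy model, not the ib_cm code: the table layout and every name in it (tw_table, tw_add, tw_remove, server_handle_req) are hypothetical, and the only behaviour taken from the report above is that the server rejects a request whose QPN still has a timewait entry.

```c
#include <stdbool.h>
#include <stdint.h>

#define TW_SLOTS 8

/* Hypothetical server-side table of remote QPNs that are still in timewait. */
struct tw_table {
    uint32_t qpn[TW_SLOTS];
    bool     used[TW_SLOTS];
};

enum req_result { REQ_ACCEPTED, REQ_REJECTED };

/* Return true if the given remote QPN still has a timewait entry. */
static bool tw_contains(const struct tw_table *t, uint32_t qpn)
{
    for (int i = 0; i < TW_SLOTS; i++)
        if (t->used[i] && t->qpn[i] == qpn)
            return true;
    return false;
}

/* Record a QPN as being in timewait; returns false if the table is full. */
static bool tw_add(struct tw_table *t, uint32_t qpn)
{
    for (int i = 0; i < TW_SLOTS; i++) {
        if (!t->used[i]) {
            t->used[i] = true;
            t->qpn[i] = qpn;
            return true;
        }
    }
    return false;
}

/* Drop the timewait entry once the server has finished its cleanup. */
static void tw_remove(struct tw_table *t, uint32_t qpn)
{
    for (int i = 0; i < TW_SLOTS; i++)
        if (t->used[i] && t->qpn[i] == qpn)
            t->used[i] = false;
}

/* A new connection request is rejected while its QPN is still in timewait. */
static enum req_result server_handle_req(const struct tw_table *t,
                                         uint32_t remote_qpn)
{
    return tw_contains(t, remote_qpn) ? REQ_REJECTED : REQ_ACCEPTED;
}
```

In this model, a client that reuses QPN 0x41 between tw_add() and tw_remove() on the server gets REQ_REJECTED, which corresponds to the reject I see.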

The bottom line: I get a reject on SDP socket open.

- Amir 

-----Original Message-----
From: Sean Hefty [mailto:sean.hefty at intel.com] 
Sent: Thursday, April 17, 2008 21:34
To: Amir Vadai
Cc: general at lists.openfabrics.org; Oren Duer
Subject: RE: CM goes to timewait state without waiting for disconnect reply

>When the client closes the connection it calls ib_destroy_cm_id(), which
>calls cm_destroy_id().
>In my scenario this happens when the CM is in state "Established". In this
>state, ib_send_cm_dreq() is called.
>This function sends a DREQ and changes the state to "DREQ sent".
>After that the function returns and the switch is tried again; this time
>we're in state "DREQ sent".
>There the state is changed to "TimeWait".

Yes - this will result in transitioning into timewait immediately after sending the DREQ.  By destroying the cm_id, the user has indicated that they do not want to wait for a DREP, nor do they care about when timewait has exited.

If a DREQ is received while the cm_id is in timewait, it will generate a DREP in response.  DREP messages received while in timewait are simply dropped.
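
The destroy path and the timewait handling described above can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the kernel source: the type and function names (toy_cm_id, toy_destroy_cm_id, toy_timewait_recv) are hypothetical, and only the transitions themselves (Established, then DREQ sent, then TimeWait, with a DREQ answered and a DREP dropped in timewait) come from this thread.

```c
/* States and messages named in the thread; everything else is hypothetical. */
enum cm_state  { CM_ESTABLISHED, CM_DREQ_SENT, CM_TIMEWAIT };
enum cm_msg    { MSG_DREQ, MSG_DREP };
enum cm_action { ACT_NONE, ACT_SEND_DREP, ACT_DROP };

struct toy_cm_id {
    enum cm_state state;
    int dreqs_sent;          /* counts DREQs sent by the destroy path */
};

/*
 * Model of destroying a cm_id: the switch is retried until the id
 * reaches timewait, so no DREP is ever waited for.
 */
static void toy_destroy_cm_id(struct toy_cm_id *id)
{
    while (id->state != CM_TIMEWAIT) {
        switch (id->state) {
        case CM_ESTABLISHED:
            id->dreqs_sent++;             /* send the DREQ */
            id->state = CM_DREQ_SENT;
            break;
        case CM_DREQ_SENT:
            id->state = CM_TIMEWAIT;      /* move on without a DREP */
            break;
        case CM_TIMEWAIT:
            break;
        }
    }
}

/* Handling of disconnect messages that arrive while in timewait. */
static enum cm_action toy_timewait_recv(const struct toy_cm_id *id,
                                        enum cm_msg msg)
{
    if (id->state != CM_TIMEWAIT)
        return ACT_NONE;                  /* outside the scope of this sketch */
    return msg == MSG_DREQ ? ACT_SEND_DREP : ACT_DROP;
}
```

Calling toy_destroy_cm_id() on an id in CM_ESTABLISHED sends exactly one DREQ and leaves the id in CM_TIMEWAIT in a single call, which is the immediate transition discussed here.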

What exactly is the problem that you're seeing?

- Sean



