[ofa-general] Re: [PATCH] ib/cm: fix stale connection detection
Michael S. Tsirkin
mst at dev.mellanox.co.il
Tue May 22 00:59:52 PDT 2007
The ib_cm can incorrectly detect a stale connection (a new connection
request for a QPN that is already connected) as a duplicate connection
request. Separate the handling of potential duplicate REQs from stale
connections.
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
Acked-by: Michael S. Tsirkin <mst at dev.mellanox.co.il>
> ---
>
> Can you let me know if this fixes the issues for you? I reworked the
> code only to detect the stale connection properly.
Yes, this has fixed the issue for me.
I have not seen any timeouts yet: netperf seems to recover in at most 15 sec,
where previously it needed up to 2 minutes.
The patch looks obvious enough for 2.6.22, safe enough in that it replaces a
timeout with a reject, and it addresses a real problem. Sean? Roland? What do
you think?
> More work is needed
> to force the local QP into timewait if that is needed.
Yes, this is needed: IPoIB has its own stale
connection detection logic, so it will, after several minutes
of inactivity, clean out the connection; however, if the
number of QPs is increased, this timeout might become too long:
and handling this only at the ULP level is wrong anyway.
In practice I don't think we have seen this yet,
but the spec is quite explicit about this point:
When a CM receives such a REQ/REP it shall abort the connection establishment by
issuing REJ to the REQ/REP. It shall then issue DREQ, with “DREQ:remote QPN”
set to the remote QPN from the REQ/REP, until DREP is received or Max Retries
is exceeded, and place the local QP in the timeWait state.
I agree this is 2.6.23 material, however.
Something that I think would be very useful for 2.6.22 already: could you please
document which portions of chapter 12 are not currently implemented in cm.c, and
put this in some file in kernel tree? This way people will be able to figure
out whether something that they need is missing, and contribute.
> This would
> likely require adding a new CM event to report that a stale connection
> was detected on the QP.
Yes, this looks like a reasonable way to do this.
> Also, I left the duplicate request handling
> as it was, since that should go in as a separate patch.
Could you please describe what is missing currently?
Is the missing handling likely to cause timeouts?
I hope we have reduced the chance of duplicate request misdetections
with the local id patch sufficiently, and fixing this can
wait till 2.6.23.
> drivers/infiniband/core/cm.c | 25 ++++++++++++++-----------
> 1 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index eff591d..c53d486 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1295,26 +1295,29 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
req_msg = (struct cm_req_msg *)work->mad_recv_wc->recv_buf.mad;
- /* Check for duplicate REQ and stale connections. */
+ /* Check for possible duplicate REQ. */
spin_lock_irqsave(&cm.lock, flags);
timewait_info = cm_insert_remote_id(cm_id_priv->timewait_info);
- if (!timewait_info)
- timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
-
if (timewait_info) {
cur_cm_id_priv = cm_get_id(timewait_info->work.local_id,
timewait_info->work.remote_id);
- cm_cleanup_timewait(cm_id_priv->timewait_info);
spin_unlock_irqrestore(&cm.lock, flags);
if (cur_cm_id_priv) {
cm_dup_req_handler(work, cur_cm_id_priv);
cm_deref_id(cur_cm_id_priv);
- } else
- cm_issue_rej(work->port, work->mad_recv_wc,
- IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
- NULL, 0);
- listen_cm_id_priv = NULL;
- goto out;
+ }
+ return NULL;
+ }
+
+ /* Check for stale connections. */
+ timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
+ if (timewait_info) {
+ cm_cleanup_timewait(cm_id_priv->timewait_info);
+ spin_unlock_irqrestore(&cm.lock, flags);
+ cm_issue_rej(work->port, work->mad_recv_wc,
+ IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
+ NULL, 0);
+ return NULL;
}
/* Find matching listen request. */
--
MST
More information about the general
mailing list