[ofa-general] IB/cm: bug in stale connection detection logic?
Michael S. Tsirkin
mst at dev.mellanox.co.il
Mon May 21 13:40:41 PDT 2007
> Quoting Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [ofa-general] IB/cm: bug in stale connection detection logic?
>
> >Why is an extra call to cm_get_id required to detect a duplicate?
> >Shouldn't we just
> >
> >     timewait_info = cm_insert_remote_id(cm_id_priv->timewait_info);
> >     if (timewait_info) {
> >         /* handle duplicate */
> >         return;
> >     }
> >
> >     timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
> >     if (timewait_info) {
> >         /* handle stale */
> >         return;
> >     }
> >
> >     /* not a duplicate and not a stale connection */
>
> After looking at this more, I think we want something structured closer
> to what's listed above, with the duplicate handling enhanced to check
> that the QPN in the potential duplicate REQ matches what's already
> associated with the remote ID.
Yes, that's what I thought too.
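
For illustration only, here is a rough, self-contained sketch of the ordering
discussed above: look up the remote ID first and compare QPNs, then look up
the remote QPN. It is not the ib_cm code. The names conn_table, conn_entry
and classify_req are hypothetical, the flat array stands in for the real
lookup structures, and any other key fields are ignored. Treating a matching
remote ID with a different QPN as stale is an assumption; the thread only
says the duplicate check should compare QPNs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of an incoming REQ (names are illustrative). */
enum req_class { REQ_NEW, REQ_DUPLICATE, REQ_STALE, REQ_TABLE_FULL };

struct conn_entry {
    uint32_t remote_id;   /* remote communication ID */
    uint32_t remote_qpn;  /* remote QP number tied to that ID */
    int in_use;
};

#define MAX_CONNS 16

/* Toy stand-in for the CM lookup structures: a small flat array. */
struct conn_table {
    struct conn_entry entries[MAX_CONNS];
};

static struct conn_entry *find_by_remote_id(struct conn_table *t, uint32_t id)
{
    for (size_t i = 0; i < MAX_CONNS; i++)
        if (t->entries[i].in_use && t->entries[i].remote_id == id)
            return &t->entries[i];
    return NULL;
}

static struct conn_entry *find_by_remote_qpn(struct conn_table *t, uint32_t qpn)
{
    for (size_t i = 0; i < MAX_CONNS; i++)
        if (t->entries[i].in_use && t->entries[i].remote_qpn == qpn)
            return &t->entries[i];
    return NULL;
}

/*
 * Classify a REQ, and record it if it is new.
 *  - known remote ID, same QPN      -> duplicate REQ
 *  - known remote ID, different QPN -> stale (ASSUMPTION, see lead-in)
 *  - unknown remote ID, known QPN   -> stale connection, reject at once
 *  - otherwise                      -> new connection
 */
enum req_class classify_req(struct conn_table *t,
                            uint32_t remote_id, uint32_t remote_qpn)
{
    struct conn_entry *e;

    assert(t != NULL);

    e = find_by_remote_id(t, remote_id);
    if (e)
        return e->remote_qpn == remote_qpn ? REQ_DUPLICATE : REQ_STALE;

    if (find_by_remote_qpn(t, remote_qpn))
        return REQ_STALE;

    /* Not a duplicate and not stale: remember the new connection. */
    for (size_t i = 0; i < MAX_CONNS; i++) {
        if (!t->entries[i].in_use) {
            t->entries[i].remote_id = remote_id;
            t->entries[i].remote_qpn = remote_qpn;
            t->entries[i].in_use = 1;
            return REQ_NEW;
        }
    }
    return REQ_TABLE_FULL;
}
```

In this sketch a stale REQ is classified on arrival, so the caller could
send a reject right away instead of letting the request time out.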
> Did you run into an actual problem with the current code? It seems like
> the only issue is that a possible stale request would time out, rather
> than be immediately rejected. If so, I will queue up a patch for 2.6.23.
Exactly. This is a serious problem for IPoIB CM since packet drop rates
and recovery times go up radically: sockets get closed, etc.
With a reject we would just retry connecting on the next packet.
Could you please post a patch? Let's discuss whether it's appropriate
for 2.6.22 separately.
--
MST