[openib-general] [PATCH] ib_cm: randomize starting local comm IDs

Michael S. Tsirkin mst at mellanox.co.il
Wed Aug 23 07:26:20 PDT 2006


Quoting r. Or Gerlitz <or.gerlitz at gmail.com>:
> Subject: Re: [PATCH] ib_cm: randomize starting local comm IDs
> 
> On 8/23/06, Michael S. Tsirkin <mst at mellanox.co.il> wrote:
> > Quoting r. Or Gerlitz <or.gerlitz at gmail.com>:
> 
> > > So the CM at the target side rejects the first REQ after the client
> > > reboot with STALE reason (and delivers a disconnect event to the
> > > ULP). The second REQ is processed fine and a new connection is
> > > established.
> 
> > Hmm. Might this still be a concern for users such as SDP
> > which don't retry connections?
> 
> I don't know if "this" in your email refers to the quote above

I am speaking more or less about this quote from your message:
> > Without the patch, since the REQ had the same <local_id,qpn> as that of
> > an existing connection, it was just silently dropped and a target reboot
> > was a must to let the initiator reconnect!

the spec says:
> > 	If a CM receives a REQ/REP as described above, if the REQ/REP has the
> > 	same Local Communication ID and Remote Communication ID as are present
> > 	in the existing connection and if the REQ/REP arrives within the window
> > 	of time during which the active/passive side could be legally
> > 	retransmitting REQ/REP, the CM should treat the REQ/REP as a retry and
> > 	not initiate stale connection processing as described above.

so I am wondering why it is not sufficient to wait for
the window of time as described above to expire?
Is something broken in CM that this patch is papering over?
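
To make the question concrete, here is a rough sketch of the rule as I read
the spec text quoted above. This is not the ib_cm code: the names conn_entry,
cm_msg, classify_msg and retry_deadline_ms are all made up for illustration,
and I am assuming the "window of time" can be reduced to a single deadline
kept per connection.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, for illustration only; not the ib_cm structures. */
enum msg_action { MSG_NEW, MSG_RETRY, MSG_STALE };

struct conn_entry {
    uint32_t peer_local_id;      /* Local Communication ID the peer sent  */
    uint32_t peer_remote_id;     /* Remote Communication ID the peer sent */
    uint32_t peer_qpn;           /* peer QPN of the existing connection   */
    uint64_t retry_deadline_ms;  /* assumed end of the legal retry window */
};

struct cm_msg {
    uint32_t local_id;
    uint32_t remote_id;
    uint32_t qpn;
};

/* Decide how to treat a REQ/REP, given the connection (if any) it hits. */
enum msg_action classify_msg(const struct conn_entry *conn,
                             const struct cm_msg *msg, uint64_t now_ms)
{
    /* No existing connection on this QPN: an ordinary new request. */
    if (conn == NULL || conn->peer_qpn != msg->qpn)
        return MSG_NEW;

    bool same_ids = conn->peer_local_id == msg->local_id &&
                    conn->peer_remote_id == msg->remote_id;
    bool in_window = now_ms <= conn->retry_deadline_ms;

    /* Same IDs inside the window: a retransmission, not a stale peer. */
    if (same_ids && in_window)
        return MSG_RETRY;

    /* Otherwise the existing connection is stale. */
    return MSG_STALE;
}
```

In this sketch a rebooted client that reuses the same local comm ID is taken
for a retry only until retry_deadline_ms passes; after that the same REQ gets
stale connection processing, with or without randomized IDs.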

> but what
> I describe there is what is stated in the IB spec, ch. 12, re stale connections.

I know. I am just wondering aloud whether this is relevant for SDP.
Why won't the window expire as described above?

> So you need to either rely on the SDP consumer to reconnect or when
> getting a STALE reject attempt to reconnect from SDP.
 
I'm not sure SDP needs to do anything - the port is busy, after all.
Retrying seems to be against the SDP spec.

-- 
MST
