[openib-general] Re: Re: Re: [PATCH] change Mellanox SDP workaround to a module parameter

Michael S. Tsirkin mst at mellanox.co.il
Tue Feb 21 13:36:10 PST 2006


Quoting r. Hal Rosenstock <halr at voltaire.com>:
> Subject: Re: Re: Re: [PATCH] change Mellanox SDP workaround to a module parameter
> 
> On Tue, 2006-02-21 at 16:09, Michael S. Tsirkin wrote:
> > Quoting Hal Rosenstock <halr at voltaire.com>:
> > > > Assuming the spec says as it is, then:
> > > > 1. CMA needs to be modified to retry the connection if it's rejected because
> > > >    of lower MTU.
> > > > 2. SDP/SRP protocols specs need a clarification: e.g. current SDP spec
> > > >    says the connection should be closed when we get a REJ.
> > > 
> > > Can you be specific about the spec citations for SDP and SRP for REJ
> > > handling ? Isn't it more the retry strategy once the connection is
> > > REJected ? Is that in those specs ?
> > 
> > This is not explicitly explained in the spec. I think Dror discussed the use of
> > REJ/retry to get the MTU in his mail in sufficient detail.
> 
> Sorry for being dense but this is what Dror wrote:
> "The SWG defined a generic mechanism which uses REJ to indicate that 
> the passive side does not accept certain REQ fields, and allows the
> passive side to indicate an alternative value. Indirection is also
> supported through the same protocol. It also allows the active side,
> following the REJ, to use an alternate value, other than the one
> suggested by the passive side, i.e. passive side only has a veto
> capability."

I think the problems that need resolution are:
1. Some places in the spec require the connection to be terminated after REJ.
The spec does not seem to describe what Dror says anywhere.

For example, see SDP spec:

A4.5.1.2 ABORTING CONNECTION SETUP
CA4-43: When a CM REJ MAD is received by either the connecting or accepting
peer the connection setup shall be aborted.
If a CM REJ MAD is sent for an SDP-specific error, the reject reason code
value shall be 28 (Consumer Reject -- 12.6.7.2 Rejection Reason on page 665).
An SDP implementation is expected to cleanup any resources associated with
an aborted connection.

2. The implementation is still missing: does it belong in the CM,
or should it live at a higher level, like the CMA?

3. Is there some solution for backward compatibility?
There does not seem to be a way to figure out whether sending REJ
makes sense, that is, whether the remote side will retry the connection
with a better MTU value or will simply give up.
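
To make points 1 and 3 concrete, here is a rough sketch of the REJ/retry
exchange as Dror describes it. It is a toy model in Python, not the CM or
CMA code: PassiveSide, Reply, connect and the retry_on_rej flag are all
made-up names, and the 2048 starting MTU is only an assumed example value.
It also counts CM messages, assuming a plain REQ/REP/RTU handshake.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# All names here are hypothetical; this is not the ib_cm or rdma_cm API.


@dataclass
class Reply:
    accepted: bool                       # True models a REP, False a REJ
    suggested_mtu: Optional[int] = None  # alternative value carried in the REJ


class PassiveSide:
    """Accepts a REQ only if its MTU does not exceed max_mtu (veto only)."""

    def __init__(self, max_mtu: int):
        self.max_mtu = max_mtu

    def handle_req(self, mtu: int) -> Reply:
        if mtu <= self.max_mtu:
            return Reply(accepted=True)
        # Reject, but tell the active side which value would be accepted.
        return Reply(accepted=False, suggested_mtu=self.max_mtu)


def connect(passive: PassiveSide, mtu: int, retry_on_rej: bool,
            max_attempts: int = 3) -> Tuple[Optional[int], int]:
    """Return (negotiated MTU or None, number of CM messages exchanged)."""
    messages = 0
    for _ in range(max_attempts):
        reply = passive.handle_req(mtu)
        messages += 2                    # one REQ plus one REP or REJ
        if reply.accepted:
            messages += 1                # RTU completes the handshake
            return mtu, messages
        if not retry_on_rej or reply.suggested_mtu is None:
            # An active side that aborts on REJ, as CA4-43 reads.
            return None, messages
        # The active side is free to pick another value; this sketch
        # simply takes the passive side's suggestion.
        mtu = reply.suggested_mtu
    return None, messages
```

In this model, connect(PassiveSide(1024), 2048, retry_on_rej=True) ends up
at 1024 after 5 messages, against 3 when the first REQ is already
acceptable. With retry_on_rej=False the connection never comes up, and the
passive side has no way to know that in advance, which is the backward
compatibility problem in point 3.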

> So the only issue here is the inefficiency in terms of the back and
> forth of CM messages to get to the 1K MTU connection. How important is
> connection rate for SDP and SRP ? If not, can't we live with how things
> are ?

SDP implements sockets, and that's a very wide field, so everything is important.
AFAIK, connection rate is very important for some socket applications, and does
not matter at all for others.

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


