[openib-general] Re: Re: Re: [PATCH] change Mellanox SDP workaround to a module parameter

Hal Rosenstock halr at voltaire.com
Tue Feb 21 13:43:32 PST 2006


On Tue, 2006-02-21 at 16:36, Michael S. Tsirkin wrote:
> Quoting r. Hal Rosenstock <halr at voltaire.com>:
> > Subject: Re: Re: Re: [PATCH] change Mellanox SDP workaround to a module parameter
> > 
> > On Tue, 2006-02-21 at 16:09, Michael S. Tsirkin wrote:
> > > Quoting Hal Rosenstock <halr at voltaire.com>:
> > > > > Assuming the spec says as it is, then:
> > > > > 1. CMA needs to be modified to retry the connection if it's rejected because
> > > > >    of lower MTU.
> > > > > 2. SDP/SRP protocols specs need a clarification: e.g. current SDP spec
> > > > >    says the connection should be closed when we get a REJ.
> > > > 
> > > > Can you be specific about the spec citations for SDP and SRP for REJ
> > > > handling ? Isn't it more the retry strategy once the connection is
> > > > REJected ? Is that in those specs ?
> > > 
> > > This is not explicitly explained in spec. I think Dror discussed the use of
> > > REJ/retry to get the MTU in his mail in sufficient detail.
> > 
> > Sorry for being dense but this is what Dror wrote:
> > "The SWG defined a generic mechanism which uses REJ to indicate that 
> > the passive side does not accept a certain REQ fields, and allows the
> > passive side to indicate an alternative value. Indirection is also
> > supported through the same protocol. It also allows the active side,
> > following the REJ, to use an alternate value, other than the one
> > suggested by the passive side, i.e. passive side only has a veto
> > capability."
> 
> I think problems that need resolution are:
> 1. Some places in the spec require the connection to be terminated after REJ.
> The spec does not seem to describe what Dror says anywhere.
> 
> For example, see SDP spec:
> 
> A4.5.1.2 ABORTING CONNECTION SETUP
> CA4-43: When a CM REJ MAD is received by either the connecting or accepting
> peer the connection setup shall be aborted.
> If a CM REJ MAD is sent for an SDP-specific error, the reject reason code
> value shall be 28 (Consumer Reject -- 12.6.7.2 Rejection Reason on page 665).
> An SDP implementation is expected to cleanup any resources associated with
> an aborted connection.

Yes, that was what I saw too. It was unclear to me what the
ramifications of aborting the connection are. It didn't say it couldn't
be retried. Also, doesn't it leave open what would be done with
Rejection Reason 26 ?

> 2. The implementation is still missing: does it belong as part of CM,
> or should it be a higher level thing like CMA?

Yes, this comes after deciding how to deal with any standards issues with
the ULPs.

> 3. Is there some solution for backward compatibility?
> There does not seem to exist a way to figure out whether
> sending REJ makes sense since the remote side will retry the connection
> with a better MTU value, or not.

If consumer reject is the only REJ reason and the format of the ARI
gives no clue, I agree that the active peer has no basis for deciding
what to do.

> > So the only issue here is the inefficiency in terms of the back and
> > forth of CM messages to get to the 1K MTU connection. How important is
> > connection rate for SDP and SRP ? If it is not important, can't we live
> > with how things are ?
> 
> SDP implements sockets, and that's a very wide field, so everything is important.
> AFAIK, connection rate is very important for some socket applications, and does
> not matter at all for others.

It sounds like better CM handling is important for connection rate.
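
As a rough illustration of the cost, here is a toy Python model of the
REJ/retry exchange discussed above. It is only a sketch, not CM code: the
class and function names are made up, the "suggested MTU in the ARI" field
is a hypothetical payload, and reason 26 is assumed to mean Invalid Path MTU
(28 is Consumer Reject, per the SDP text quoted above). The active side
here simply takes the suggested value, although per Dror's description it
could pick a different one.

```python
from dataclasses import dataclass
from typing import Optional

REJ_INVALID_PATH_MTU = 26  # assumption: what reason 26 stands for
REJ_CONSUMER_REJECT = 28   # from the SDP text quoted above (CA4-43)


@dataclass
class Rej:
    reason: int
    suggested_mtu: Optional[int] = None  # hypothetical ARI content


class PassiveSide:
    """Toy passive peer that vetoes any REQ whose MTU exceeds max_mtu."""

    def __init__(self, max_mtu, reason=REJ_INVALID_PATH_MTU):
        self.max_mtu = max_mtu
        self.reason = reason

    def handle_req(self, mtu):
        if mtu <= self.max_mtu:
            return None  # stands in for a REP (accepted)
        if self.reason == REJ_INVALID_PATH_MTU:
            return Rej(self.reason, suggested_mtu=self.max_mtu)
        return Rej(self.reason)  # consumer reject, no hint for the peer


def connect(passive, mtu, max_attempts=3):
    """Return (negotiated MTU or None, number of REQs sent)."""
    for attempt in range(1, max_attempts + 1):
        rej = passive.handle_req(mtu)
        if rej is None:
            return mtu, attempt
        if rej.reason != REJ_INVALID_PATH_MTU or rej.suggested_mtu is None:
            return None, attempt  # nothing to go on, so abort
        mtu = rej.suggested_mtu  # retry with the value the peer suggested
    return None, max_attempts
```

With a passive side limited to 1K, connect(PassiveSide(1024), 2048) needs
two REQs to settle on 1024, which is the extra round trip in question. If
the passive side can only send a consumer reject, the active side gets no
hint and gives up after one REQ.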

-- Hal



