[openib-general] Problem is routing CM REQ

Tue Feb 13 15:10:27 PST 2007

At 02:02 PM 2/13/2007, Jason Gunthorpe wrote:
>On Tue, Feb 13, 2007 at 12:49:57PM -0800, Michael Krause wrote:
>
> > >Translated into a network with routers this means that for a RC flow
> > >to successfully work both the *forward* and *reverse* direction must
> > >traverse the same router *LID* not just *port* on both subnets.
> >
> > That is a given since the LID = path and the same path must be used to
> > ensure strong ordering is maintained.
>
>I think you are missing what I'm saying. IB within a subnet has the
>path selected by the DLID only.

Path selection itself is a policy decision outside the scope of the
specification. That appears to be your main concern: the specification
does not state "take these N parameters and apply the following algorithm
to identify a path". The address vector can comprise many fields,
including a LID range. The actual DLID is selected at a higher level,
since a variety of policies or constraints can be imposed for a given
data flow. I agree that packet switching within a subnet is via the DLID.

>So the construction process for a QP is to choose two end port LIDs, reverse
>them on one side and then query the SA for the forward and reverse SL. 
>That gives you a pair of workable QPs.

SL, LID, etc. are all uploaded into the management database for the SM / SA
to access, and much richer information, well beyond what the IBTA
specified, can be loaded as well to provide additional interpretations /
information to guide path selection. A query can return multiple records
if multi-path has been configured. Higher-level policy is then used to
construct the CM messages, which communicate the preferred path. The
existing content of the CM messages for establishment across subnets
should be sufficient to work independently of how the actual routing is
accomplished.
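
To make the "query returns multiple records, policy picks one" step
concrete, here is a minimal sketch. It is not an SA client: PathRecord,
select_path and the preferred_sl parameter are hypothetical names chosen
for illustration, and the records are hand-written rather than returned
by a real query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathRecord:
    # Hypothetical subset of the fields a path record query might return.
    slid: int
    dlid: int
    sl: int


def select_path(records, preferred_sl=None):
    """Pick one record: the first whose SL matches, else the first returned."""
    if not records:
        raise LookupError("no path records returned")
    if preferred_sl is not None:
        for rec in records:
            if rec.sl == preferred_sl:
                return rec
    return records[0]


# Two records for the same destination, as multi-path might produce.
records = [PathRecord(slid=3, dlid=1, sl=0), PathRecord(slid=4, dlid=1, sl=2)]
chosen = select_path(records, preferred_sl=2)
```

The chosen record is what would then be placed in the CM messages; the
policy function is the only part that differs between deployments.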

>This same procedure doesn't work for routers.
>Consider a case where a router port has LID 1 and an end port has
>LIDs 3,4.
>The end port establishes two RC QPs:
>  #1: SLID=3, DLID=1
>  #2: SLID=4, DLID=1
>Both have the same DGID - how is the router expected to know that QP
>#1 requires one set of LIDs and QP #2 requires a different set?

For all intents and purposes, within a local subnet, a router Port is
treated the same as a CA Port. If there are multiple paths between a router
Port and a given CA Port, i.e. multiple LIDs are configured, then the router
is supposed to query the SM / SA database, obtain the appropriate records,
and make a decision that remains valid for the lifetime of the data
flow. The purpose of the TClass is to enable a local mapping to SL, which
can also be used as input into LID selection. The flow label is left open
in its value and was expected to be used much like it is in IP. People
considered encoding it or, at a minimum, using it as an input parameter to
identify the associated LID for the flow, but that was not agreed to since
the router vendors at the time wanted it left largely opaque.
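
As an illustration of one possible local policy (not something the
specification mandates), the sketch below maps TClass to SL through a
local table and uses the flow label as an input to LID selection. The
table contents, the function names and the modulo rule are all
assumptions made up for the example.

```python
# Hypothetical local TClass -> SL table configured by the subnet administrator.
TCLASS_TO_SL = {0: 0, 8: 1}


def sl_for_tclass(tclass, default_sl=0):
    """Map the GRH TClass to a local SL, falling back to a default."""
    return TCLASS_TO_SL.get(tclass, default_sl)


def dlid_for_flow(dest_lids, flow_label):
    """Choose one of the destination port's LIDs from the flow label.

    The same flow label always yields the same LID, so every packet of
    a given flow takes the same path within the subnet.
    """
    if not dest_lids:
        raise ValueError("destination port has no LIDs")
    return dest_lids[flow_label % len(dest_lids)]


# End port with LIDs 3 and 4, as in the example quoted above.
lid_a = dlid_for_flow([3, 4], flow_label=1)
lid_b = dlid_for_flow([3, 4], flow_label=2)
```

A router that treats the flow label as opaque could substitute any other
deterministic input here; the only requirement is that the result does
not change during the lifetime of the flow.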

>Section 19.2.4.1 seems to make it explicit to me that this is a valid 
>situation.

Yes, 19.2.4.1 supports multi-path within a given subnet.

>To have this work the router must use the flow label to identify the
>correct DLID. SA/CM must be enhanced in some way to let the two sides
>exchange flow labels.

That is a policy decision or something for a TBD router protocol 
specification.   It is not required to use the Flow Label.

>This problem is worse if you have multiple independent redundant
>routers on your subnet, or LMC != 0. Then you now have the problem of
>SLID matching as well as DLID matching.

It is no worse due to the existence of multi-path. There are many
variables involved in creating a viable router protocol specification,
which is, in part, why the IBTA chose not to complete that work.

>Strong ordering is maintained in all cases because the routers always
>make consistent choices for the LRH.DLID on a session by session
>basis.

Agreed. The router is responsible for ensuring a consistent path is used
for a given flow. That does not preclude multi-path, nor does it make
multi-path more complicated as a result.
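
A sketch of how a router could keep its choice consistent per flow while
still spreading flows over multiple paths. This is an assumption about
one possible implementation, not a description of any real router: the
FlowTable class, its round-robin assignment and the (SGID, DGID, flow
label) key are all hypothetical.

```python
class FlowTable:
    """Pin each flow to one DLID the first time it is seen."""

    def __init__(self, candidate_dlids):
        self._candidates = list(candidate_dlids)
        if not self._candidates:
            raise ValueError("at least one candidate DLID is required")
        self._pinned = {}
        self._next = 0

    def dlid_for(self, sgid, dgid, flow_label):
        key = (sgid, dgid, flow_label)
        if key not in self._pinned:
            # New flow: assign the next candidate in round-robin order.
            index = self._next % len(self._candidates)
            self._pinned[key] = self._candidates[index]
            self._next += 1
        # Known flow: always return the DLID chosen earlier.
        return self._pinned[key]

    def release(self, sgid, dgid, flow_label):
        """Forget a flow once it has been torn down."""
        self._pinned.pop((sgid, dgid, flow_label), None)


table = FlowTable([3, 4])
first = table.dlid_for("fe80::1", "fe80::2", flow_label=1)
second = table.dlid_for("fe80::1", "fe80::2", flow_label=2)
again = table.dlid_for("fe80::1", "fe80::2", flow_label=1)
```

Here the two flows land on different LIDs, yet a repeated lookup for the
first flow returns the same LID as before, which is what preserves
ordering.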

> > >That is why I said previously that the QP matching rules are a
> > >mistake. The best way to solve this is to change C9-54 to only be in
> > >effect if the GRH is not present.
> >
> > I disagree.  We were very explicit in how and why we constructed those
> > rules.
>
>Do you know of a solution then?
>
>If C9-54 is a very deliberate design then it must be that the CM
>specification in Chapter 12 is not designed to handle the
>ramifications of C9-54.
>
>I just can't see how to fit both CM and C9-54 together into a workable
>solution.

You are arguing about a router protocol problem that does not exist, or
perhaps I just don't get it. We did progress the router specification, or
at least the operating models behind it, sufficiently to validate that both
Chapter 9 and Chapter 12 worked as specified (as well as Chapters 8 and
19). Yes, there are implementation issues within a router for it to
perform the appropriate queries on the SM / SA to identify the preferred
path for a flow within a given subnet. That makes this a local subnet
policy issue, orthogonal to the compliance statements in the Volume 1
specification. If you believe the specification is faulty, then it would
be best to take this to the IBTA and have an official review done by the
workgroup teams involved with these chapters.

People are free to implement what they choose, which for routers is
completely open since there isn't a specification. For the compliance
statements, though, assuming interoperability is desirable, the validation
tree in the specification should be used, and any software implemented on
top of such hardware should take that into account.

For the most part, what you've described is an argument about the policy
to select a path, and that is a router domain problem, not a packet
validation problem. Within the router domain, that is pure policy, just
like in the IP world. As long as it results in a given flow consistently
using the same data path, all is good. At the end of the day, the router
implementations will decide their policy for determining the optimal path,
and I doubt there will be a one-size-fits-all agreement on the formula
used to construct that decision (although if the SM/SA only returns one
path for a given flow, the decision is rather easy).

Mike