[openib-general] Problem is routing CM REQ

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Sun Feb 11 15:09:35 PST 2007


On Fri, Feb 09, 2007 at 06:08:34PM -0800, Sean Hefty wrote:
> >So basically what you are saying is that the TClass and FlowLabel act
> >as some kind of global dis-ambiguation that lets all SAs know that the
> >tuple <SGID,DGID,TClass,FlowLabel> MUST be matched with <LRH_A,LRH_B>
> >on each side.
> 
> Sort of...  My reasoning is that if you look at a packet traveling
> from the source QP to the destination QP, and examine the packet in
> some intermediate subnet (say between two routers), then the only
> information that it carries is the <SGID, DGID, TClass, FlowLabel>
> tuple.  This information must be sufficient to direct the routing at
> the endpoints.

Ah, I think I missed the key step in your scheme. You plan to query
the local SM for SGID=remote DGID=local? (i.e. reversed from 'normal';
I was thinking only about the SGID=local DGID=remote query direction.)

Yes, I agree this works in the simple cases. Quite well in fact...
The reversed direction of the PR query is very much aligned with the
idea that the GRH is only a destination affecting thing.

Let me try to outline to you what I think you are proposing.
This is the diagram I am thinking of:

   SA                                                      SA'
Node1 --> (LID 1) Router A -------  Router A' (LID A) ---> Node2
      |-> (LID 2) Router A                              |
      |-> (LID 3) Router B -------  Router B' (LID B) --|

Router A and Router B are independent redundant devices, not a route
cloud of some sort. B -> A' is not a possible path.

So your idea is to do:
  PR0: Node 1 asks SA for Node1 -> Node2 reversible path.
       SA returns SLID=Node1 DLID=1, FlowLabel=Magic Reversible
       indicator. This path is used for CM GMPs, or for the
       normal non-routed CM.
  PR1: Detecting a routed situation from PR0,
       Node 1 asks SA for Node2 -> Node1. SA returns SLID=1
       DLID=Node1 and a GRH that configures Router A to use SLID=1.
       You reverse the local LIDs from that path to get the QP
       configuration.
  PR2: Node 1 asks SA' for Node1 -> Node2. SA' returns SLID=A
       DLID=Node2.
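
The PR1 reversal step can be sketched in a few lines of Python. This is
only an illustration of the idea as I understand it: PathRecord and
local_qp_lids are hypothetical names, not any real SA or verbs API, and
only the fields discussed here are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathRecord:
    """Hypothetical, stripped-down path record (not a real SA structure)."""
    sgid: str
    dgid: str
    slid: object
    dlid: object


def local_qp_lids(reverse_pr):
    """Given a remote -> local PR, swap the LIDs to get the local QP's
    (source LID, destination LID) pair."""
    return reverse_pr.dlid, reverse_pr.slid


# PR1 from the outline above: Node2 -> Node1, SLID=1, DLID=Node1.
pr1 = PathRecord(sgid="Node2", dgid="Node1", slid=1, dlid="Node1")
qp_slid, qp_dlid = local_qp_lids(pr1)
```

With the PR1 values above, the local QP ends up sending from Node1's LID
to router LID 1.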

OK. But what if:
  PR1: Node 1 asks SA for Node2 -> Node1. SA returns SLID=3
       DLID=Node1
  PR2: Node 1 asks SA' for Node1 -> Node2. SA' returns SLID=A
       DLID=Node2.

Now the LIDs don't match and the QP won't work. SA' has no idea that
SA picked Router B.
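
To make the mismatch concrete, here is a toy Python model of the diagram.
Everything in it is an assumption made for illustration: the LINKS and
ROUTER_LIDS tables just encode the picture above, and qp_works models
the QP SLID check as "the router LID my QP is configured with must be a
LID of the router the packet actually arrives through".

```python
# Which far-side router each router connects to (B -> A' is not possible).
LINKS = {"A": "A'", "B": "B'", "A'": "A", "B'": "B"}

# LIDs each router answers to on its own subnet, taken from the diagram.
ROUTER_LIDS = {"A": {1, 2}, "B": {3}, "A'": {"A"}, "B'": {"B"}}


def router_for_lid(lid):
    """Return the name of the router that owns the given LID."""
    for router, lids in ROUTER_LIDS.items():
        if lid in lids:
            return router
    raise KeyError(lid)


def qp_works(node1_router_lid, node2_router_lid):
    """True if both SLID checks pass when routers do NOT fake the SLID.

    node1_router_lid: router LID configured in Node1's QP (from PR1).
    node2_router_lid: router LID configured in Node2's QP (from PR2).
    """
    # Node1 -> Node2: packet enters the router Node1 addressed and
    # arrives at Node2 through that router's peer.
    arrives_via = LINKS[router_for_lid(node1_router_lid)]
    forward_ok = node2_router_lid in ROUTER_LIDS[arrives_via]
    # Node2 -> Node1: same reasoning in the other direction.
    arrives_via = LINKS[router_for_lid(node2_router_lid)]
    reverse_ok = node1_router_lid in ROUTER_LIDS[arrives_via]
    return forward_ok and reverse_ok
```

In this model qp_works(1, "A") holds, while qp_works(3, "A") (SA picked
Router B, SA' picked Router A') fails in both directions.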

> It shouldn't need information about the paths used by packets on the
> remote subnet.  If a subnet has multiple routers into it, they can
> forward packets to the correct router if needed.  (Could the routers
> just forward to the end node and insert the expected SLID?)

Right, this is a good way to solve the problem. Going with the
example above, SA' returns a GRH that configures Router B' to use
SLID=A and the GRH SA returned configures Router A to use SLID=3.
Router B' and A both are faking the SLID in the LRH.

This effectively defeats the QP SLID check and everything works :>
[Like I said before, this check seems to be a misfeature]
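
A rough Python sketch of what the SLID faking might look like inside a
router. The GrhKey and Router names, the table layout, and the flow
label value 7 are all made up for illustration; how the SA would
actually program such a table is exactly the unspecified part.

```python
from typing import NamedTuple


class GrhKey(NamedTuple):
    """The only information a packet carries between subnets."""
    sgid: str
    dgid: str
    tclass: int
    flow_label: int


class Router:
    """Hypothetical router with a GRH -> LRH SLID translation table."""

    def __init__(self, own_lid):
        self.own_lid = own_lid
        self.slid_table = {}  # GrhKey -> SLID to write into the outgoing LRH

    def program(self, key, slid):
        """Install a translation entry (done by the SA in this sketch)."""
        self.slid_table[key] = slid

    def outgoing_slid(self, key):
        """SLID placed in the LRH when forwarding onto the local subnet.

        Falls back to the router's own LID when no entry matches.
        """
        return self.slid_table.get(key, self.own_lid)


# The failing example above: Node1's QP expects SLID=3, Node2's expects SLID=A.
to_node2 = GrhKey("Node1", "Node2", 0, 7)
to_node1 = GrhKey("Node2", "Node1", 0, 7)

router_b_prime = Router(own_lid="B")
router_b_prime.program(to_node2, "A")  # B' pretends to be A'

router_a = Router(own_lid=1)
router_a.program(to_node1, 3)          # A pretends to be B
```

With those two entries both endpoints see the SLID their QP was
configured with, even though the packets took the "wrong" routers.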

I can think of the following downsides:
 1) Re-reading Michael Krause's email makes me think that defeating
    the QP SLID check is contrary to the spirit of IBA
 2) Routers now require a GRH->LRH translation table size that
    is proportional to all the router LIDs in the subnet, not
    just its own LIDs. [Smart selection of the Flow Label could
    mitigate this growth though]
 3) The reverse PR query method requires 3 PR queries for the simple
    case and as many as 5 if you want non-reversible paths.
 4) Some means of remote SA communication needs to be decided
    pre-standardization :< (I agree that a magic GID seems best)

But... It is the SLID faking that solves the multiple-router-path
problem, not the reverse PR. Do you think something like that could
be standardized?

I guess the big question I have is if IBA chooses to standardize some
other method, how much chance is there that it would also make this
unsupportable? I.e. by preventing the remote SA communication mechanism
or by defining a reverse PR to mean something else? I could easily
imagine the reverse PR being defined as a way to ask the local SA
about the *remote* LIDs.

[Actually, if you define it that way and use a MultiPathRecord query
 then there is enough information to return working LIDs
 for both subnets. The SAs would have to communicate between
 themselves and the routers using a new protocol, but that is
 doable. This does require that a PR be defined so that the LIDs are
 relative to the subnet of the SGID - not to the local subnet!]

> I'm still trying to find a solution that doesn't violate the
> architecture as defined.  I don't see why my idea wouldn't work yet.
> It just requires some unspecified coordination between the local SA
> and local routers.

I'd also very much like to not have to change the passive side to make
this work.

But this has turned into such a complex problem that it seems really hard
to predict what will pass through to standardization. That is the
main benefit I see of the small change to the passive side. No matter
what is standardized it can be accommodated in the resulting
standard, whereas defining a PR with SGID==offsubnet to mean one thing
or another seems much more risky.

Jason



