[openib-general] Problem is routing CM REQ

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Mon Feb 12 16:10:45 PST 2007


On Mon, Feb 12, 2007 at 03:31:15PM -0800, Michael Krause wrote:

> TClass is intended to communicate the end-to-end QoS desired.   TClass is 
> then mapped to an SL that is local to each subnet.   A flow label is 
> intended to be much the same as in the IP world and is left, in essence, to 
> routers to manage.    An endnode look up should be to find the address 
> vector to the remote.   A look up may return multiple vectors.   The SLID 
> would correspond to each local subnet router port that acts as a first-hop 
> destination to the remote subnet.    I don't see why the router protocol 
> would not simply enable all paths on the local subnet to a given remote 
> subnet to be acquired.  All of the work is kept local to the SA / SM in the 
> source subnet when determining a remote path to take.   Why is there any 
> need to define more than just this?  Define a router protocol to 
> communicate each subnet's prefix, TClass, etc. and apply KISS.   A 
> management entity that wanted to manage out each subnet provides router 
> management in terms of route selection, etc. can be constructed by using 
> the existing protocols / tools combined with a new router protocol which 
> only does DGID to next hop SLID mapping.

All of this complexity is due to the RC QP requirement that the SLID
of an incoming LRH match the DLID programmed into the QP.

Translated into a network with routers, this means that for an RC flow
to work, both the *forward* and *reverse* directions must traverse the
same router *LID*, not just the same *port*, on both subnets.
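
A minimal sketch of the check described above, in Python for
illustration only. The class names, the function rc_accepts, and the
LID values are hypothetical and not taken from any real verbs API; the
sketch models only the SLID/DLID comparison and nothing else a real
QP validates.

```python
from dataclasses import dataclass


@dataclass
class RcQp:
    # LID this QP sends to; for a routed flow this is the LID of the
    # local router port chosen for the forward direction.
    dlid: int


@dataclass
class Packet:
    # SLID taken from the LRH of the incoming packet.
    slid: int


def rc_accepts(qp: RcQp, pkt: Packet) -> bool:
    """Model of the RC rule: the incoming SLID must equal the QP's DLID."""
    return pkt.slid == qp.dlid


# Two hypothetical router LIDs on the local subnet.
ROUTER_LID_A = 0x10
ROUTER_LID_B = 0x11

qp = RcQp(dlid=ROUTER_LID_A)  # forward path goes out via router LID A

# Reverse traffic arriving from the same router LID passes the check.
same_lid = rc_accepts(qp, Packet(slid=ROUTER_LID_A))

# Reverse traffic arriving from a different router LID fails the check,
# even though it belongs to the same end-to-end flow.
other_lid = rc_accepts(qp, Packet(slid=ROUTER_LID_B))
```

Under these assumptions the flow only works when the reverse path is
forced back through router LID A.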

Please see the little ASCII diagram I drew in a prior email to
understand my concern.

There is no such restriction in a real IP network. It would be akin to
having a host match the source MAC address in the Ethernet frame to
double check that it came from the router port it is sending outgoing
packets to. That is why simple one-sided solutions from IP land don't
work here.

Things work exactly the way you outline today for UD. They don't work
at all for the general case of RC. Get rid of the QP requirement and
things work the way you outline for RC too. Keep it in and you must
use the FlowLabel to force the flows onto the right router LID.

That is why I said previously that the QP matching rules are a
mistake. The best way to solve this is to change C9-54 to only be in
effect if the GRH is not present.
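
To illustrate the suggested change, here is a rough Python sketch that
compares the current matching rule with one that applies only when no
GRH is present. All names here (Packet, has_grh, accepts_current,
accepts_proposed) are hypothetical, and the sketch deliberately does
not model any GID checks that would still apply to packets carrying a
GRH.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    slid: int      # SLID from the incoming LRH
    has_grh: bool  # True if the packet carries a GRH (routed traffic)


def accepts_current(qp_dlid: int, pkt: Packet) -> bool:
    """Rule as it stands: the SLID must always match the QP's DLID."""
    return pkt.slid == qp_dlid


def accepts_proposed(qp_dlid: int, pkt: Packet) -> bool:
    """Suggested rule: enforce the SLID match only when there is no GRH."""
    if pkt.has_grh:
        return True
    return pkt.slid == qp_dlid


QP_DLID = 0x10  # hypothetical router LID used for the forward path

# Routed packet returning via a different router LID.
routed = Packet(slid=0x11, has_grh=True)
# Local-subnet packet with no GRH and a mismatched SLID.
local = Packet(slid=0x11, has_grh=False)

routed_current = accepts_current(QP_DLID, routed)
routed_proposed = accepts_proposed(QP_DLID, routed)
local_proposed = accepts_proposed(QP_DLID, local)
```

In this model the routed packet is dropped under the current rule and
accepted under the proposed one, while unrouted traffic is checked
exactly as before.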

CM also introduces the much smaller problem of getting the LIDs to the
passive side - but that cannot be solved without a broad solution to
the RC QP SLID matching problem.

Jason



