[openib-general] Problem is routing CM REQ was: Use a GRH when appropriate for unicast packets

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Thu Feb 8 11:08:09 PST 2007


On Thu, Feb 08, 2007 at 10:23:11AM -0800, Sean Hefty wrote:
> >>The active side clearly cannot learn what the SLID of the passive
> >>side's router should be.
> >>
> >>We don't want to have the routers snoop and alter CM GMPs.
> >>
> >>The passive side cannot use information from the LRH to get the router
> >>LID since the LRH may not be reversible.
> >>
> >>The only option seems to be to have the passive side do a path record
> >>query on a SGID in the CM REQ...
> >>
> >>This is a spec problem unfortunately.
> >
> >
> >Yes and I would expect that this would be changed.
> 
> Looking at the problem more, I think that the issue extends to the remote 
> port LID as well.  My expectation with a local path record query is that 
> the SLID is the local port, and the DLID is the local router.  This should 
> be sufficient for one-way UD traffic, but for connected traffic we still 
> need to discover the remote router and remote port LIDs.

Hum, you mean to meet the LID validation rules of 9.6.1.5? That is a
huge PITA..

[IMHO, 9.6.1.5 C9-54 is a mistake, if there is a GRH then the LRH.SLID
 should not be validated against the QP context since it makes it
 extra hard for multipath routing and QoS to work...]

Here is one thought on how to do this:
To meet this rule each side of the CM must take the SLID from
the incoming LRH as the DLID for the connection. This SLID will be
one of the SLIDs for the local router. The other side doesn't need to
know what it is. The passive side will get the router SLID from the
REQ and the active side gets it from the ACK.
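
To make the rule concrete, here is a minimal Python sketch. `Lrh`, `QpContext`, `context_from_cm_message` and `passes_slid_check` are hypothetical names for illustration, not part of any real stack, and the check is a simplified reading of C9-54 (the incoming LRH.SLID must equal the DLID held in the QP context). The LID values are made up.

```python
from dataclasses import dataclass


@dataclass
class Lrh:
    """Hypothetical model of the two LID fields of a Local Route Header."""
    slid: int
    dlid: int


@dataclass
class QpContext:
    """Hypothetical per-connection state; dlid is where this side sends."""
    local_lid: int
    dlid: int


def context_from_cm_message(local_lid: int, incoming: Lrh) -> QpContext:
    # Take the SLID of the incoming CM message (the REQ on the passive
    # side, the ACK on the active side) as the DLID for the connection.
    return QpContext(local_lid=local_lid, dlid=incoming.slid)


def passes_slid_check(ctx: QpContext, incoming: Lrh) -> bool:
    # Simplified reading of C9-54: the packet's SLID must match the
    # DLID stored in the QP context.
    return incoming.slid == ctx.dlid


# Passive side: the REQ arrives from one of the local router's LIDs.
req = Lrh(slid=0x20, dlid=0x05)
passive = context_from_cm_message(local_lid=0x05, incoming=req)
```

Later packets that the router forwards with the same SLID pass the check. A packet forwarded with a different router LID would fail it, which is why the router has to keep its SLID stable for the life of the flow.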

The passive side is easy, it just path record queries the DGID and
requests the DLID == the incoming LRH.SLID.

The nasty problem is with the active side - CMA will select a router
LID it uses as the DLID and the router may select a different LID for
it to use as the SLID when it processes the ACK. By C9-54 they have to
be the same :< So the active side might have to do another path record
query to move its DLID and SL to match the router's chosen
SLID. Double suck :P
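
A rough Python sketch of that active-side fix-up follows. `reconcile_active_side` and `fake_requery` are hypothetical; the latter stands in for a real path record query constrained to a given DLID, and the LID-to-SL table is invented for the example.

```python
from typing import Callable, Tuple

# (dlid, sl) pair describing the local hop to the router.
LocalPath = Tuple[int, int]


def reconcile_active_side(
    chosen: LocalPath,
    ack_slid: int,
    requery: Callable[[int], LocalPath],
) -> Tuple[LocalPath, bool]:
    """Return the path to use and whether a second query was needed."""
    dlid, _sl = chosen
    if dlid == ack_slid:
        # The router used the LID we picked; nothing more to do.
        return chosen, False
    # The router answered from a different LID, so move the DLID (and
    # whatever SL goes with it) to match the router's choice.
    return requery(ack_slid), True


def fake_requery(dlid: int) -> LocalPath:
    # Stand-in for a path record query with the DLID pinned.
    sl_for_lid = {0x20: 0, 0x21: 3}
    return dlid, sl_for_lid[dlid]
```

Calling `reconcile_active_side((0x20, 0), 0x21, fake_requery)` models the bad case: the ACK came from 0x21, so the second query is issued and the connection ends up with DLID 0x21.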

Overarching all of this is some mechanism where the SM and all the
routers collaborate to keep the router SLID the same for the duration
of every RC flow. (One simple way would be to have the SM encode the
SLID it wants the router to pick in the Flow Label or TClass..)

Suck.

Another idea would be to encode the local router SLID in the flow
label and have the CM exchange and use asymmetric flow labels.. That
would move control over SL selection into the endpoints and remove the
possible second path record query from the active side - but I haven't
looked if CM can exchange flow labels in the ACK..
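
Either flow label variant relies on a 16-bit LID fitting into the 20-bit GRH Flow Label. The bit layout below (LID in the low 16 bits, 4 spare bits on top) is purely an assumption for illustration; nothing in the spec or in this thread defines such an encoding, and the function names are hypothetical.

```python
FLOW_LABEL_BITS = 20  # width of the GRH Flow Label field
LID_BITS = 16         # width of an LRH LID
SPARE_BITS = FLOW_LABEL_BITS - LID_BITS


def encode_flow_label(router_slid: int, spare: int = 0) -> int:
    """Pack a router SLID into the low bits of a flow label (assumed layout)."""
    if not 0 <= router_slid < (1 << LID_BITS):
        raise ValueError("LID does not fit in 16 bits")
    if not 0 <= spare < (1 << SPARE_BITS):
        raise ValueError("spare value does not fit in 4 bits")
    return (spare << LID_BITS) | router_slid


def decode_router_slid(flow_label: int) -> int:
    """Recover the router SLID from a flow label built by encode_flow_label."""
    return flow_label & ((1 << LID_BITS) - 1)


label = encode_flow_label(0x0021, spare=0x3)
```

With asymmetric labels each direction would carry its own value, so each endpoint could name the SLID it expects from its local router without the other side needing to know it.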

> I think that we need a way for the local node to query the remote SA to 
> obtain this information.  Or we need a new path record for routable paths 
> that includes this information.

Being able to query doesn't really help matters since you still can't
tell the router what SLID to use.. The main idea is that the router
LID is only useful to the endpoint on the same subnet, so there is no
reason to make the non-local side fetch it.

Jason



