[openib-general] Problem is routing CM REQ

Mon Feb 12 15:54:04 PST 2007

On Mon, Feb 12, 2007 at 02:47:42PM -0800, Sean Hefty wrote:

> Maybe it would help if we can agree on a set of expectations.  These are 
> what I am thinking:
> 
> 1. An SA should be able to respond to a valid PR query if at least one of 
> the GIDs in the path record is local.
> 
> 2. The LIDs in a PR are relative to the SA's subnet that returned the 
> record.
> 
> 3. An IB router should not failover transparently to QPs sending traffic 
> through that router.

OK to these

> 4. A PR from the local SA with reversible=1 indicates that data sent from 
> the remote GID to the local GID using the PR TC and FL will route locally 
> using the specified LID pair.  This holds whether the PR SGID is local or 
> remote.

> 5. A PR from a remote SA with reversible=1 indicates that data sent from 
> the local GID to the remote GID using the PR TC and FL will route remotely 
> using the specified LID pair.  This holds whether the PR SGID is local or 
> remote.

I can't think how to actually implement these restrictions in the
general case without SLID spoofing and the general method I outlined
in my prior email.

*Especially* reversible - which by definition requires the FL and TC
to be the same in both directions of the path!
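
A minimal sketch of what reversible implies, assuming a simplified path
record with only the fields discussed in this thread. The PathRecord
class and reverse_path helper are hypothetical names for illustration,
not part of any SA or verbs API:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PathRecord:
    """Simplified, hypothetical path record with only the fields discussed here."""
    sgid: str
    dgid: str
    slid: int
    dlid: int
    traffic_class: int  # TC
    flow_label: int     # FL
    reversible: bool


def reverse_path(pr: PathRecord) -> PathRecord:
    """Build the return path: swap the GIDs and LIDs, keep TC and FL unchanged."""
    if not pr.reversible:
        raise ValueError("path record is not marked reversible")
    return replace(pr, sgid=pr.dgid, dgid=pr.sgid, slid=pr.dlid, dlid=pr.slid)
```

The return direction reuses the same TC and FL, so whatever routes the
forward flow has to route the swapped flow with identical GRH fields.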

> 6. A PR with reversible=0 is relative to SA's subnet.  The SGID->DGID data 
> flow over the PR TC and FL indicates the SLID->DLID mapping for that subnet.

Think about this - it is backwards for the UD case. You have specified
that the SGID->DGID direction uses the returned SLID/DLID which are
ensured by the flowlabel in the GRH. But the local side only controls
what it sends. How does this GRH get to the remote side? In UD the
returned GRH from the PR controls the selection of LID on the DGID's
subnet. That is how it must be.

QPs have a specific definition of where the GRH comes from, and for a
local PR query with SGID=myself the GRH programmed into the QP must come
from that query. This is necessary for UD and I don't think it can be
changed around.

Plus, in the multi-router path, the GRH alone does not contain the
information to know which physical router port the flow exits
from (see prior diagram), so the SLID spoofing is the only way to
fix things up if the PR queries are left unchanged.

> The use of reversible between subnets is what's concerning me.  It may be 
> that an SA could not return any paths as reversible between two subnets 
> without using some trick like what you mentioned.

I really don't see how it can work any other way right now.

> These add a requirement on the SA that they must be aware of the routes 
> packets take between two GIDs using a given TC and FL, but I don't believe 
> that this necessarily forces SA to SA communication.  The SA may only need 
> to exchange information with a router...?

The major problem is that there are multiple router paths that a given
GRH can take, and they are only fully disambiguated by the router LID
at the sender.
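
A minimal sketch of that ambiguity, assuming a hypothetical routing
table in which two routers both carry the same flow. The table contents,
port names and the exit_ports helper are made up for illustration:

```python
# Each entry: (sgid, dgid, tc, fl, router_lid, exit_port).
# Hypothetical data: the same GRH fields are reachable through two routers.
ROUTES = [
    ("gidA", "gidB", 0, 7, 10, "router1-port2"),
    ("gidA", "gidB", 0, 7, 11, "router2-port2"),
]


def exit_ports(table, grh, router_lid=None):
    """Return the set of router ports a flow could exit from.

    grh is the (sgid, dgid, tc, fl) tuple carried in the packet.
    router_lid is the router LID the sender addressed, if known.
    """
    ports = set()
    for sgid, dgid, tc, fl, lid, port in table:
        if (sgid, dgid, tc, fl) != grh:
            continue
        if router_lid is not None and lid != router_lid:
            continue
        ports.add(port)
    return ports


grh = ("gidA", "gidB", 0, 7)
ambiguous = exit_ports(ROUTES, grh)               # both router ports match
resolved = exit_ports(ROUTES, grh, router_lid=10)  # only router 1 matches
```

With only the GRH fields the lookup returns both ports; adding the
router LID chosen at the sender narrows it to one.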

> > - Routers do the SLID spoofing you outlined.
> 
> I'm not sure this is something that we do want now.  APM should really 
> handle path failover.

It has absolutely nothing to do with failover. This is necessary to
make multiple router paths work at all. It is necessary for reversible
to work with multiple routers at all. 

> >There is a lot of complex work on the router and SA side to make this
> >kind of topology work, but it is critical that the clients use path
> >queries that can provide enough data to the SA and return enough data
> >to the client to support this.
> 
> I'm still deciding if the existing path record attribute is sufficient.

I'm of the opinion that it isn't a good fit. Look at how tortured
things are just because the PR does not have enough information
to let the SA answer in the best way.

Jason