[openib-general] Problem is routing CM REQ

Wed Feb 14 14:15:50 PST 2007

At 01:36 PM 2/14/2007, Sean Hefty wrote:
>Assume that the active and passive sides of a connection request are on 
>different subnets and:
>
>Active side - LID 1
>Active side router - LID 2
>Passive side - LID 93
>Passive side router - LID 94
>
>What values are you suggesting are used for:
>
>Active side QP - DLID
>Passive side QP - DLID
>CM REQ Primary Local Port LID

Subnet A is:
QP Port LID 1
Router A Port LID 2

Subnet B is:
QP Port LID 93
Router B Port LID 94

Process steps:

- Router A populates SM / SA A with the GID prefix it can route.  SM / SA
A will already have configured the router Port with the appropriate local
route information and hence have assigned it LID 2.

- CM associated with Port LID 1 queries the SM / SA to identify a path to a 
GID Prefix.   SM / SA returns a path record indicating a global route, i.e. 
one that requires a GRH, is available and provides the CM with the 
information targeting router Port LID 2.

- CM creates a REQ and populates the global information to identify the
remote endnode.  The LRH generated targets router Port LID 2.  The GRH is
generated to target the remote subnet so the router knows how to forward
the packet.

- Router A receives the packet and examines the GRH.   Via its router 
protocol, it has previously identified what router Port will lead to the 
next hop on the path to the destination endnode.

- If the destination endnode is on a subnet directly attached to the
router, say subnet B, then the router generates an LRH that targets QP
Port LID 93 and emits it on router Port LID 94.

- QP in subnet B receives the CM REQ and validates the LRH.  Given these
messages are via UD service and not RC / UC, the validation rules for the
LRH are different.  The CM agent processes the request and returns an
appropriate response by filling in a GRH that swaps the SGID and DGID, so
the addresses are basically reflected back.  The response is sent from QP
Port LID 93 and targets router Port LID 94.

- Router B Port LID 94 receives the response.  It parses the GRH and
determines the next hop port.  In this example, the response goes out
router A Port LID 2 and targets QP Port LID 1.  The LRH is generated using
these fields.  Again, since CM is targeting a UD QP, the LRH validation
rules are different.

- Once the connection is established, the QP on subnet A will send packets
to the QP on subnet B using a GRH that is processed by the router.  Each QP
uses an LRH that targets the router port locally attached to its subnet.
The router is responsible for generating an LRH to forward to the next
hop.  These packets are now in a RC / UC data flow so the LRH validation is
per the sections cited in this e-mail string.
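
To make the LID values in the steps above concrete, here is a rough Python
sketch of the REQ and its response crossing the two subnets.  It is only an
illustration: the names (Packet, endnode_send, router_forward, cm_reply),
the GID prefix values and the "qp-a" / "qp-b" identifiers are made up, the
two router ports are modelled as a single forwarding step, and using the
router port LID as the SLID on the far side is my assumption.

```python
from dataclasses import dataclass, replace

# Hypothetical GID prefixes for the two subnets.
SUBNET_A, SUBNET_B = 0xA, 0xB

@dataclass(frozen=True)
class Packet:
    slid: int    # LRH source LID, only meaningful on the current subnet
    dlid: int    # LRH destination LID, only meaningful on the current subnet
    sgid: tuple  # GRH source GID, modelled as (prefix, endnode id)
    dgid: tuple  # GRH destination GID, modelled as (prefix, endnode id)

# What each subnet's SA would hand back for a remote GID prefix:
# the LID of the locally attached router port.
ROUTER_LID = {SUBNET_A: 2, SUBNET_B: 94}

# What the router protocol knows about endnodes on attached subnets.
ENDNODE_LID = {(SUBNET_A, "qp-a"): 1, (SUBNET_B, "qp-b"): 93}

def endnode_send(sgid, dgid):
    """Endnode: LRH targets the local router port, GRH targets the remote endnode."""
    return Packet(slid=ENDNODE_LID[sgid], dlid=ROUTER_LID[sgid[0]],
                  sgid=sgid, dgid=dgid)

def router_forward(pkt):
    """Router: GRH is left alone, a fresh LRH is built for the outgoing subnet."""
    out_subnet = pkt.dgid[0]
    return replace(pkt, slid=ROUTER_LID[out_subnet], dlid=ENDNODE_LID[pkt.dgid])

def cm_reply(received):
    """Passive CM: reflect the GRH addresses, target the local router port."""
    return endnode_send(sgid=received.dgid, dgid=received.sgid)

req = endnode_send(sgid=(SUBNET_A, "qp-a"), dgid=(SUBNET_B, "qp-b"))
req_on_b = router_forward(req)   # LRH is now 94 -> 93
rep = cm_reply(req_on_b)         # LRH is 93 -> 94, GRH addresses swapped
rep_on_a = router_forward(rep)   # LRH is now 2 -> 1

for name, p in [("REQ on A", req), ("REQ on B", req_on_b),
                ("REP on B", rep), ("REP on A", rep_on_a)]:
    print(f"{name}: SLID {p.slid} -> DLID {p.dlid}")
```

Note that neither endnode ever handles a LID from the other subnet; only
the GIDs travel end to end.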

In all cases, the router protocol is responsible for generating an LRH
that will work within each subnet.  There is no exchange of subnet local
information between the subnets.  Each subnet's SM/SA only tracks what is
local to it as well as which GID prefixes can be routed via a given LID.  If
multiple LIDs can route to a given GID prefix, multiple path records are
returned.  The specifications do not say which one to choose, so any policy
can be used.  If the router protocol communicates a "cost" for a given path
as an indication of its appropriateness for a given workload, then this
should be communicated to the CM agent.
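
Since the choice among multiple path records is left to policy, here is one
possible policy sketched in Python, purely as an example.  The PathRecord
class below is a simplified stand-in, not the SA PathRecord layout, and the
"cost" field is hypothetical: it stands for whatever the router protocol
might report, if anything.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PathRecord:
    dgid_prefix: int            # GID prefix this path can reach
    dlid: int                   # local router port LID to target
    cost: Optional[int] = None  # hypothetical hint from the router protocol

def choose_path(records, dgid_prefix):
    """Pick the lowest-cost record for a prefix; records without a cost
    rank last, and ties keep the order in which the SA returned them."""
    candidates = [r for r in records if r.dgid_prefix == dgid_prefix]
    if not candidates:
        raise LookupError(f"no path to GID prefix {dgid_prefix:#x}")
    return min(candidates,
               key=lambda r: (r.cost is None, r.cost if r.cost is not None else 0))

# Two router ports (LID 2 and a made-up LID 3) both reach prefix 0xB.
records = [PathRecord(0xB, dlid=2, cost=10), PathRecord(0xB, dlid=3, cost=5)]
best = choose_path(records, 0xB)
print(best.dlid)
```

Any other policy (round robin, first returned, per-workload) would fit the
same slot; the point is only that the CM agent needs the cost to be
visible in order to use it.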

Mike