[ofa-general] Combined DR path with empty DR path, what is the expected behavior?

Ira Weiny weiny2 at llnl.gov
Mon Aug 24 18:52:06 PDT 2009


If I send a combined DR path with a start LID but an empty (0 length) DR
path, what is the expected behavior?

I know the same destination could be reached with plain LID routing, but I
don't see anywhere in the specification that says this is an error.  I do,
however, seem to have 2 different behaviors on 2 different switches.  For
example:

I have Switch A (Lid 1) and Switch B (Lid 7).  I attempt to query PortInfo of
Port 1 of each switch using the LID followed by an empty DR path.

17:55:22 > ./smpquery -c portinfo 1 0 1
ibwarn: [21005] mad_rpc: _do_madrpc failed; dport (Lid 1)
./smpquery: iberror: failed: operation portinfo: port info query failed


17:55:31 > ./smpquery -c portinfo 7 0 1
# Port info: Lid 7 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0x0000000000000000
...
<normal output snipped>

Detecting this special case in libibmad and turning the packet into a LID
routed one succeeds, but I wonder if the failure is an error in the SMI.  I
also notice the same failure on the HCA I am running from (Lid 2).

17:57:42 > ./smpquery -c portinfo 2 0 1
ibwarn: [21008] mad_rpc: _do_madrpc failed; dport (Lid 2)
./smpquery: iberror: failed: operation portinfo: port info query failed
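
The special-case detection mentioned above can be sketched roughly as follows.
This is a minimal illustration in Python, not the actual libibmad code: the
names Dest and choose_routing and the mode strings are hypothetical, and the
DR path is modeled simply as the list of exit ports that follow the leading 0.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dest:
    """Hypothetical destination: an optional start LID plus DR hops."""
    lid: int = 0                                   # 0 means no start LID given
    hops: List[int] = field(default_factory=list)  # exit ports after the leading 0


def choose_routing(dest: Dest) -> str:
    """Pick a routing mode for a request (illustrative only)."""
    if dest.lid and not dest.hops:
        # Start LID with an empty DR path: fall back to plain LID routing.
        return "lid"
    if dest.lid:
        # Start LID followed by a non-empty DR path.
        return "combined"
    # No start LID: pure directed route (an empty path is the local loopback).
    return "dr"
```

With this model, the failing "-c portinfo 1 0 1" request maps to
choose_routing(Dest(lid=1)), which selects LID routing instead of sending a
combined packet with HopCount 0.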

Running with a simple DR path works, I guess because this is the loopback case
mentioned on page 805.

17:58:16 > ./smpquery -D portinfo 0 1
# Port info: DR path slid 65535; dlid 65535; 0 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0x2007000000000000
...
<snip>

I guess that the comment "Since each part may be empty, there are eight
combinations, although only four are really useful:" on line 36 Page 805 can
be interpreted to mean that only those 4 combinations need to be supported.
Is this true?
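
The count of eight is consistent with a route made of three parts, each either
present or empty (2^3 = 8).  The sketch below only enumerates them, under that
assumption; the part names are illustrative shorthand rather than the spec's
wording, and it makes no claim about which four the spec considers useful.

```python
from itertools import product

# Assumed three-part model of a combined route; names are illustrative.
PARTS = ("lid_part_before_dr", "dr_part", "lid_part_after_dr")


def all_combinations():
    """Return every present/empty combination of the three parts."""
    return [dict(zip(PARTS, present))
            for present in product((False, True), repeat=len(PARTS))]


# The case asked about here: a start LID with an empty DR part.
CASE_IN_QUESTION = {
    "lid_part_before_dr": True,
    "dr_part": False,
    "lid_part_after_dr": False,
}
```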

On the other hand, I think that strictly this should be supported.  Item 4 of
C14-9 (line 24 page 810) requires the SMI to handle the packet if the
HopPointer equals HopCount + 1, which it does in my case (HopCount == 0,
HopPointer == 1).  Then, after processing, the SMI should return the packet as
specified in C14-13 item 3 on line 9 page 812.
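
The condition cited from C14-9 item 4 amounts to the check below.  This is only
a sketch of that single condition as quoted here, not of the full SMI
processing rules; the function name is hypothetical.

```python
def smi_must_handle(hop_count: int, hop_pointer: int) -> bool:
    """True when HopPointer == HopCount + 1 (the condition cited from C14-9 item 4)."""
    return hop_pointer == hop_count + 1


# Empty DR path as described in this message: HopCount == 0, HopPointer == 1.
EMPTY_PATH_HANDLED = smi_must_handle(hop_count=0, hop_pointer=1)
```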

Am I wrong?  In the end it does not matter as I have to make the software work
for all the hardware I have; so I will change the software.  However, I wonder
where exactly the spec falls on this, because I think it will influence where
the fix resides.  If the spec does not allow this then I think it is fine to
have libibmad return an error since the user specified an invalid combined DR
path.  However, if this should be legal I think libibmad should work around
the bad hardware out there.

Thoughts?
Ira

-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2 at llnl.gov


