[ofa-general] Combined DR path with empty DR path, what is the expected behavior?
Ira Weiny
weiny2 at llnl.gov
Mon Aug 24 18:52:06 PDT 2009
If I send a combined DR path with a start LID but an empty (0 length) DR
path, what is the expected behavior?
I know this could be specified with LID routing, but I don't see anywhere in
the specification that says this is an error. I do, however, seem to have 2
different behaviors on 2 different switches. For example:
I have Switch A (Lid 1) and Switch B (Lid 7). I attempt to query PortInfo of
Port 1 of each switch using the LID followed by an empty DR path.
17:55:22 > ./smpquery -c portinfo 1 0 1
ibwarn: [21005] mad_rpc: _do_madrpc failed; dport (Lid 1)
./smpquery: iberror: failed: operation portinfo: port info query failed
17:55:31 > ./smpquery -c portinfo 7 0 1
# Port info: Lid 7 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0x0000000000000000
...
<normal output snipped>
Detecting this special case in libibmad and turning the packet into a LID
routed one succeeds, but I wonder if the failure is an error in the SMI. I
also notice the same query fails on the HCA I am running from (LID 2).
17:57:42 > ./smpquery -c portinfo 2 0 1
ibwarn: [21008] mad_rpc: _do_madrpc failed; dport (Lid 2)
./smpquery: iberror: failed: operation portinfo: port info query failed
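The workaround described above (detect a combined route whose DR part is
empty and send it LID routed instead) could be sketched roughly as follows.
This is only an illustration of the decision, not libibmad code: the Route
class, its fields, and choose_routing() are hypothetical names, and the DR
part is modelled as the list of hops after the leading 0.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Route:
    """Hypothetical route description; not a libibmad structure."""
    lid: Optional[int] = None                          # start LID, None if absent
    dr_hops: List[int] = field(default_factory=list)   # hops after the leading 0


def choose_routing(route: Route) -> str:
    """Pick how to send the SMP for a given route.

    A start LID with a zero-length DR part is downgraded to plain LID
    routing, which is the special case discussed in this message.
    """
    if route.lid is not None and not route.dr_hops:
        return "lid"        # combined route with empty DR part
    if route.lid is not None:
        return "combined"   # start LID followed by a real DR part
    return "dr"             # pure directed route (includes the loopback case)


if __name__ == "__main__":
    # "smpquery -c portinfo 1 0 1": start LID 1, empty DR part
    print(choose_routing(Route(lid=1)))
    # "smpquery -D portinfo 0 1": no LID, empty DR part
    print(choose_routing(Route()))
```

Whether this downgrade belongs in the library or should instead be reported
as a user error is exactly the open question below.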
Running with a simple DR path works, I guess because this is the loopback case
mentioned on page 805.
17:58:16 > ./smpquery -D portinfo 0 1
# Port info: DR path slid 65535; dlid 65535; 0 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0x2007000000000000
...
<snip>
I guess that the comment "Since each part may be empty, there are eight
combinations, although only four are really useful:" on line 36 Page 805 can
be interpreted to mean that only those 4 combinations need to be supported.
Is this true?
On the other hand, I think that strictly this should be supported. Item 4 of
C14-9 (line 24 page 810) requires the SMI to handle the packet if the
HopPointer equals HopCount + 1, which it does in my case (HopCount == 0,
HopPointer == 1). Then, after processing, the SMI should return the packet as
specified in C14-13 item 3 on line 9 page 812.
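To make the arithmetic explicit, here is a minimal sketch of the one
condition I am citing from C14-9 item 4. It models only that comparison as I
read it; the function name is made up, and the other items of C14-9 and
C14-13 are not represented.

```python
def smi_should_handle(hop_count: int, hop_pointer: int) -> bool:
    """Condition cited from C14-9 item 4: HopPointer == HopCount + 1.

    Hypothetical helper; it covers only this single comparison and none
    of the other cases in the compliance statement.
    """
    return hop_pointer == hop_count + 1


if __name__ == "__main__":
    # The empty DR part case from this message: HopCount == 0, HopPointer == 1
    print(smi_should_handle(hop_count=0, hop_pointer=1))
```

With HopCount == 0 and HopPointer == 1 the comparison holds, which is why I
read the spec as requiring the SMI to process the packet.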
Am I wrong? In the end it does not matter as I have to make the software work
for all the hardware I have; so I will change the software. However, I wonder
where exactly the spec falls on this, because I think it will influence where
the fix resides. If the spec does not allow this then I think it is fine to
have libibmad return an error since the user specified an invalid combined DR
path. However, if this should be legal I think libibmad should work around
the bad hardware out there.
Thoughts?
Ira
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2 at llnl.gov