[ofa-general] Receiving 24-byte remnant header on bad umad_recv?

Joseph L Greathouse jlgreath at us.ibm.com
Wed Jul 30 11:43:59 PDT 2008


Hi there,

I am building a program that sometimes sends MADs from device A to device
B, and device B should (by design) silently ignore these packets.

Device A is using OFED 1.3.1's libibumad-1.1.7, and I am sending these MADs
using a regular umad_send() method. When B responds, the subsequent
umad_recv() works just fine.  However, when B is supposed to silently
ignore the MAD, device A's subsequent umad_recv() receives a strange
packet rather than timing out.  It receives a 24-byte packet that looks
like a MAD Header. All of its fields except for TID are the same as the
original send's MAD Header.
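
The symptom described above can be recognised with a small check, sketched here in C. This is only an illustration under stated assumptions: the helper name looks_like_remnant() and the macros are hypothetical and not part of libibumad, and the caller is assumed to already hold the raw bytes of the header that was sent, the raw bytes that came back, and the received length. It relies on the common MAD header being 24 bytes long with the TransactionID in bytes 8 through 15. A match only says the packet has the shape described here; it does not by itself explain where the packet came from.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAD_HDR_LEN 24   /* size of the common MAD header */
#define MAD_TID_OFF 8    /* TransactionID occupies bytes 8..15 */
#define MAD_TID_LEN 8

/* Hypothetical helper: returns 1 if `rcv` has the shape described above,
 * i.e. exactly 24 bytes whose fields, TID aside, match the sent header. */
static int looks_like_remnant(const uint8_t *sent_hdr,
                              const uint8_t *rcv, size_t rcv_len)
{
    if (rcv_len != MAD_HDR_LEN)
        return 0;
    /* Bytes before the TID: versions, class, method, status, class-specific. */
    if (memcmp(sent_hdr, rcv, MAD_TID_OFF) != 0)
        return 0;
    /* Bytes after the TID: attribute ID, reserved, attribute modifier. */
    return memcmp(sent_hdr + MAD_TID_OFF + MAD_TID_LEN,
                  rcv + MAD_TID_OFF + MAD_TID_LEN,
                  MAD_HDR_LEN - MAD_TID_OFF - MAD_TID_LEN) == 0;
}
```

The TID bytes are skipped on purpose, since that is the one field reported to differ from the original send.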

I have verified that, at the very least, device B is *not* sending out this
24-byte packet.  In fact, this 24-byte malformed packet appears even when
doing a similar test from device A to device C (where device C is from
another manufacturer).

Even using umad_poll() to wait for an actual response results in this
malformed packet. umad_poll() returns 0, and the next umad_recv()
picks up the 24-byte remnant header.

Is this the expected behavior in OFED when a MAD send fails to receive a
response?  Does receiving a 24-byte packet using umad_recv() guarantee that
the response was never received, or could this arise in other situations as
well?

Any help you could offer would be appreciated.
-Joe Greathouse



