[openib-general] Re: [PATCH] ib_mad: prevent duplicate outstanding MAD transactions with same TID

Sean Hefty sean.hefty at intel.com
Wed Feb 22 23:41:01 PST 2006


>The issue is complex, and two-fold:

I still need to consider this in more detail to see if there isn't some simpler
solution that we're overlooking.

>A
>--
>1. We should PREVENT sending a new, identical duplicate request MAD
>while the previous MAD has not yet timed out (but allow RMPP ACK/NACK
>packets, which have the identical TID/GID/class as the original request
>packet).

I'm uncertain about making this a requirement for kernel components, which we
should be able to trust to behave correctly.

Also of concern is that the TID may be 0 for management classes that do not make
use of it.  It is also permissible to have multiple outstanding MADs containing
the same TID/GID/class.  For example, a series of RMPP segments would have this,
as would SNMP tunneling.
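
To make the concern concrete, here is a rough userspace sketch (not the ib_mad
code) of a naive duplicate check keyed on TID/GID/class.  The names mad_key,
mad_key_equal and naive_is_duplicate are hypothetical and exist only for this
illustration.  Under these assumptions, two segments of the same RMPP transfer
carry the same key, so the check would report the second as a duplicate; the
same happens for any class that leaves the TID at 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key for illustration only; not a structure from ib_mad. */
struct mad_key {
	uint64_t tid;        /* transaction ID; may be 0 for some classes */
	uint8_t  gid[16];    /* remote GID */
	uint8_t  mgmt_class; /* management class */
};

/* Compare field by field so struct padding never affects the result. */
static bool mad_key_equal(const struct mad_key *a, const struct mad_key *b)
{
	return a->tid == b->tid &&
	       a->mgmt_class == b->mgmt_class &&
	       memcmp(a->gid, b->gid, sizeof(a->gid)) == 0;
}

/* Naive check: any outstanding MAD with the same key counts as a duplicate. */
static bool naive_is_duplicate(const struct mad_key *outstanding, size_t n,
			       const struct mad_key *candidate)
{
	for (size_t i = 0; i < n; i++)
		if (mad_key_equal(&outstanding[i], candidate))
			return true;
	return false;
}
```

A check of this shape would need at least an exemption for RMPP segments and
for classes that do not use the TID before it could be applied on the send
path.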

>2. Similarly, we should PREVENT sending a new duplicate RMPP mad from
>sender side (usually an RMPP response) while the previous RMPP session
>is still in progress.

IMO, as long as nothing catastrophic happens on the responder side, we may be
fine here.  Duplicate responses should only come from seeing duplicate requests,
so I would place the burden of the fix on whoever sends the duplicate requests.

>When SENDING:
>	If RESPONSE bit of method is set:
>		Need to check TID/GID/class of all responses in list to verify
>		that this is not a duplicate.

See above -- I believe that this check would disallow valid transfers.

>	Otherwise:
>		Need to check TID/class of all requests in list.
>
>	NOTE:  Currently, struct ib_mad_send_wr_private holds only the address
>		handle pointer, NOT the address handle attributes.  We need the
>		AH attribute data to check GID, LID, and grh.  To extract this
>		info we can either add it to the private struct

This data could also be added to struct ib_ah.

>When RECEIVING:
>	If RESPONSE bit is set:
>		Need to check TID/class against outstanding requests.
>	Otherwise:
>		Need to check TID/GID/class against outstanding responses (RMPP).
>		GID is important here, because responder may have several
>		RMPP sessions active with same TID, but involving different
>		destination hosts.

What specific error do you see in the receive path?  Responses should match with
requests based on TID alone, since we control setting the TID.  I can see where
a duplicate request may be received for a response that is currently in
transfer, but that seems like a narrow window.  The duplicate request could just
as easily come before or after the response is sent, which would need to be
handled by the ULP anyway.  I don't see that this optimization is worth it.
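
As a minimal sketch of what I mean by matching on TID alone, assuming the sender
assigns a unique TID to every outstanding request: the struct pending_req and
the function find_request below are hypothetical stand-ins for illustration,
not the actual ib_mad structures or lookup code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an outstanding request; not an ib_mad struct. */
struct pending_req {
	uint64_t tid;             /* TID chosen by the sender for this request */
	struct pending_req *next; /* next outstanding request, or NULL */
};

/* Return the outstanding request whose TID matches the response, or NULL. */
static struct pending_req *find_request(struct pending_req *head,
					uint64_t resp_tid)
{
	for (struct pending_req *r = head; r != NULL; r = r->next)
		if (r->tid == resp_tid)
			return r;
	return NULL;
}
```

No GID or class comparison is needed here as long as the uniqueness assumption
on the TID holds; a response with an unknown TID simply finds no match.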

- Sean



