[openib-general] Re: [PATCH] ib_mad: prevent duplicate outstanding MAD transactions with same TID

Jack Morgenstein jackm at mellanox.co.il
Thu Feb 23 08:25:19 PST 2006


On Thursday 23 February 2006 18:14, Jack Morgenstein wrote:
> On Thursday 23 February 2006 09:41, Sean Hefty wrote:
> > What specific error do you see in the receive path?
>
> SA Host, Host1, Host2.
>
> Host1 and Host2 have simultaneous GET_TABLE query responses (both with same
> TID) in flight with the SA Host.
>
> Host1 sends an RMPP abort to the SA.  The SA Host receives the abort and
> does abort_send(), searching on the TID alone.  The wrong session gets
> aborted.
>
> - Jack
>
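To make the scenario above concrete, here is a minimal stand-alone sketch. It
is not the ib_mad code: struct send_session, find_by_tid() and
find_by_tid_and_lid() are hypothetical names, and using the peer's LID as the
extra key is an assumption made only for illustration. It shows how a TID-only
lookup can return Host2's session when the abort actually came from Host1:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an outstanding RMPP send on the SA host. */
struct send_session {
	uint64_t tid;       /* transaction ID chosen by the requester */
	uint16_t peer_lid;  /* LID of the host the response is going to */
	int aborted;
};

/* TID-only lookup: returns the first session with a matching TID. */
static struct send_session *find_by_tid(struct send_session *s, size_t n,
					uint64_t tid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (s[i].tid == tid)
			return &s[i];
	return NULL;
}

/* Lookup keyed on TID plus the peer's LID (assumed disambiguator). */
static struct send_session *find_by_tid_and_lid(struct send_session *s,
						size_t n, uint64_t tid,
						uint16_t lid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (s[i].tid == tid && s[i].peer_lid == lid)
			return &s[i];
	return NULL;
}
```

If the table holds two sessions with the same TID, the first going to LID 2
and the second to LID 1, an abort arriving from LID 1 makes find_by_tid()
return the LID 2 session, while find_by_tid_and_lid() returns the intended
one.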
Regarding RMPP abort processing, I see that there is a problem:  the code 
assumes that all aborts are received by the responder:

static void process_rmpp_abort(struct ib_mad_agent_private *agent,
			       struct ib_mad_recv_wc *mad_recv_wc)
{
	struct ib_rmpp_mad *rmpp_mad;

	rmpp_mad = (struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad;

	if (rmpp_mad->rmpp_hdr.rmpp_status < IB_MGMT_RMPP_STATUS_ABORT_MIN ||
	    rmpp_mad->rmpp_hdr.rmpp_status > IB_MGMT_RMPP_STATUS_ABORT_MAX) {
		abort_send(agent, rmpp_mad->mad_hdr.tid,
			   IB_MGMT_RMPP_STATUS_BAD_STATUS);
		nack_recv(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_BAD_STATUS);
	} else
		/* This branch is taken when the abort status is a valid one. */
		abort_send(agent, rmpp_mad->mad_hdr.tid,
			   rmpp_mad->rmpp_hdr.rmpp_status);
}

However, the responder may also send abort messages to the requester (RMPP
status codes 122 and 123, for example, can only be sent by the responder --
see SPEC page 774).

These aborts should result in the RMPP receive session being terminated, and 
have no connection with an RMPP send session.  I'm thinking about a fix.
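
Purely as an illustration of the behaviour described above, and not as a
proposed patch: in the sketch below, struct rmpp_txn, enum abort_action and
handle_abort() are hypothetical names, and checking the receive side before
the send side is an assumption made for the example. The point is only that
an abort arriving at a host with an active receive session should end that
session rather than be treated as aimed at a send:

```c
#include <stdbool.h>

/* Hypothetical per-transaction state; not the ib_mad data structures. */
struct rmpp_txn {
	bool recv_active;  /* we are receiving RMPP segments for this TID */
	bool send_active;  /* we are sending RMPP segments for this TID */
};

enum abort_action { ABORT_NOTHING, ABORT_RECV, ABORT_SEND };

/*
 * Decide which session an incoming abort terminates.  Checking the
 * receive side first is an assumption made for this sketch.
 */
static enum abort_action handle_abort(struct rmpp_txn *txn)
{
	if (txn->recv_active) {
		txn->recv_active = false;
		return ABORT_RECV;
	}
	if (txn->send_active) {
		txn->send_active = false;
		return ABORT_SEND;
	}
	return ABORT_NOTHING;
}
```

A requester that is in the middle of receiving a response would get
ABORT_RECV here, whereas the current code only ever looks for a send to
abort.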

- Jack


