[openib-general] Re: [PATCH] ib_mad: prevent duplicate outstanding MAD transactions with same TID
Jack Morgenstein
jackm at mellanox.co.il
Thu Feb 23 08:25:19 PST 2006
On Thursday 23 February 2006 18:14, Jack Morgenstein wrote:
> On Thursday 23 February 2006 09:41, Sean Hefty wrote:
> > What specific error do you see in the receive path?
>
> SA Host, Host1, Host2.
>
> Host1 and Host2 have simultaneous GET_TABLE query responses (both with same
> TID) in flight with the SA Host.
>
> Host1 sends an RMPP abort to the SA. The SA Host receives the abort and
> does abort_send(), searching on the TID alone. The wrong session gets
> aborted.
>
> - Jack
>
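To make the scenario above concrete, here is a minimal sketch of the two lookups. It is not the ib_mad code: struct send_session, peer_lid, find_by_tid() and find_by_tid_and_peer() are hypothetical names, and using the peer's LID as the extra key is only an assumption about what could distinguish the two hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one outstanding RMPP send session. */
struct send_session {
    uint64_t tid;      /* transaction ID taken from the MAD header */
    uint16_t peer_lid; /* assumed key: LID of the host being answered */
};

/* Lookup on the TID alone: returns the first session carrying that TID,
 * whichever host it belongs to. */
static struct send_session *find_by_tid(struct send_session *s, size_t n,
                                        uint64_t tid)
{
    assert(s != NULL);
    for (size_t i = 0; i < n; i++)
        if (s[i].tid == tid)
            return &s[i];
    return NULL;
}

/* Lookup on the TID plus the host that sent the abort. */
static struct send_session *find_by_tid_and_peer(struct send_session *s,
                                                 size_t n, uint64_t tid,
                                                 uint16_t peer_lid)
{
    assert(s != NULL);
    for (size_t i = 0; i < n; i++)
        if (s[i].tid == tid && s[i].peer_lid == peer_lid)
            return &s[i];
    return NULL;
}
```

With two sessions that share a TID, the first lookup always lands on whichever session happens to come first in the table, while the second one can tell the hosts apart.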
Regarding RMPP abort processing, I see that there is a problem: the code
assumes that all aborts are received by the responder:
static void process_rmpp_abort(struct ib_mad_agent_private *agent,
                               struct ib_mad_recv_wc *mad_recv_wc)
{
        struct ib_rmpp_mad *rmpp_mad;

        rmpp_mad = (struct ib_rmpp_mad *)mad_recv_wc->recv_buf.mad;

        if (rmpp_mad->rmpp_hdr.rmpp_status < IB_MGMT_RMPP_STATUS_ABORT_MIN ||
            rmpp_mad->rmpp_hdr.rmpp_status > IB_MGMT_RMPP_STATUS_ABORT_MAX) {
                abort_send(agent, rmpp_mad->mad_hdr.tid,
                           IB_MGMT_RMPP_STATUS_BAD_STATUS);
                nack_recv(agent, mad_recv_wc, IB_MGMT_RMPP_STATUS_BAD_STATUS);
        } else
>>>>>>>>>> This is performed if the abort status is a valid one
                abort_send(agent, rmpp_mad->mad_hdr.tid,
                           rmpp_mad->rmpp_hdr.rmpp_status);
}
However, there are abort messages which the responder may send to the
requester as well (RMPP status codes 122, 123 for example, which can only be
sent by the responder -- see SPEC page 774).
These aborts should result in the RMPP receive session being terminated, and
have no connection with an RMPP send session. I'm thinking about a fix.
- Jack