[openib-general] Re: [PATCH] ib_mad: prevent duplicateoutstanding MADtransactions with same TID

Jack Morgenstein jackm at mellanox.co.il
Mon Feb 27 05:34:21 PST 2006


On Friday 24 February 2006 19:52, Sean Hefty wrote:
> I was thinking about how we can reduce some of the inefficiencies. 
> Currently, we only track if a request is waiting for a response or not.  We
> can add a new state indicating that a response is in progress, which would
> be set when the first segment of a response is received.  This would be
> used to suppress duplicate requests.
There is still a race condition here.  The duplicate request could go out 
while the first response segment is in transit (particularly true for 
requests which generate huge responses!).

This fix is OK, but does not absolve the responding side from checking as 
well.

>
> On the receive side, I was considering adding an API that the user would
> invoke to indicate that a response was being generated.  The MAD layer
> would queue this information, and a received request would be checked
> against this queue to determine if it were a duplicate.  When the response
> is sent, the queued information would be removed.  I think that we may be
> able to use such an API to support dual-sided RMPP as well.
There is still a race here -- between user indicating that a response will be
generated, and a new request arriving.  Not serious, though since presumably
a duplicate request will only be issued after a significant timeout (seconds), 
and this API would be invoked immediately.
This does demand changes in user code, which checking at the mad-send
time does not.

This is also more medium term, and will certainly not be in time for the 
upcoming release.

I recommend that we use the mad duplicate RMPP send patch now (since it -- or 
something like it -- will still be needed when we do handling at the 
requester side, and at the receive side of the responder).  This fix is 
admittedly incomplete, since it not as efficient as I would like (e.g., the 
duplicate request is still processed, and is thrown out only after all the 
processing is complete) -- but it does fix the problem.

The Tid/GID issue is really a separate issue, and should be treated as such.
I'm including the patch here, to save looking for it.
Jack
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Prevent issuing multiple MAD transactions with the same TID.
Could happen if duplicate requests are posted.

Signed-off-by: Jack Morgenstein <jackm at mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>

Index: latest/drivers/infiniband/core/mad.c
===================================================================
--- latest.orig/drivers/infiniband/core/mad.c	2006-01-16 18:19:55.000000000 
+0200
+++ latest/drivers/infiniband/core/mad.c	2006-01-19 11:41:42.000000000 +0200
@@ -907,6 +907,20 @@
 	return ret;
 }
 
+static inline int is_rmpp_data(struct ib_mad *mad)
+{
+	/* check if has rmpp header */
+	if (mad->mad_hdr.mgmt_class != IB_MGMT_CLASS_SUBN_ADM &&
+	    (mad->mad_hdr.mgmt_class < IB_MGMT_CLASS_VENDOR_RANGE2_START ||
+	     mad->mad_hdr.mgmt_class > IB_MGMT_CLASS_VENDOR_RANGE2_END))
+		return 0;
+
+	return ((ib_get_rmpp_flags(&((struct ib_rmpp_mad *)mad)->rmpp_hdr) &
+		 IB_MGMT_RMPP_FLAG_ACTIVE) &&
+		((struct ib_rmpp_mad *)mad)->rmpp_hdr.rmpp_type ==
+		IB_MGMT_RMPP_TYPE_DATA);
+}
+
 /*
  * ib_post_send_mad - Posts MAD(s) to the send queue of the QP associated
  *  with the registered client
@@ -964,6 +979,13 @@
 		/* Reference MAD agent until send completes */
 		atomic_inc(&mad_agent_priv->refcount);
 		spin_lock_irqsave(&mad_agent_priv->lock, flags);
+		if (is_rmpp_data(send_buf->mad) &&
+		    ib_find_send_mad(mad_agent_priv, mad_send_wr->tid)) {
+			/* Duplicate send request */
+			spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+			atomic_dec(&mad_agent_priv->refcount);
+			return -EBUSY;
+		}
 		list_add_tail(&mad_send_wr->agent_list,
 			      &mad_agent_priv->send_list);
 		spin_unlock_irqrestore(&mad_agent_priv->lock, flags);



More information about the general mailing list