[ewg] [PATCH] Handling busy responses from the SA
Mike Heinz
michael.heinz at qlogic.com
Fri Jun 4 11:30:45 PDT 2010
The purpose of this patch is to cause the ib_mad driver to discard busy responses from the SA, effectively causing busy responses to become time outs.
This ensures that naïve IB applications cannot overwhelm the SA with queries, which could happen when a cluster is being rebooted, or when a large HPC application is started.
Note that this patch directly changes the same code affected by the mad user rmpp patch - it cannot be successfully applied without that patch.
Signed-Off-By: Michael Heinz <michael.heinz at qlogic.com>
----
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index efca783..05f2930 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -1815,9 +1815,20 @@ static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv,
*/
/* Complete corresponding request */
if (ib_response_mad(mad_recv_wc->recv_buf.mad)) {
+ u16 busy = __be16_to_cpu(mad_recv_wc->recv_buf.mad->mad_hdr.status) &
+ IB_MGMT_MAD_STATUS_BUSY;
+
spin_lock_irqsave(&mad_agent_priv->lock, flags);
mad_send_wr = ib_find_send_mad(mad_agent_priv, mad_recv_wc);
if (mad_send_wr) {
+ if (busy && mad_send_wr->retries_left) {
+ /* Just let the query timeout and have it requeued later */
+ spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+ ib_free_recv_mad(mad_recv_wc);
+ deref_mad_agent(mad_agent_priv);
+ printk(KERN_NOTICE PFX "Response returned with MAD_STATUS_BUSY\n");
+ return;
+ }
ib_mark_mad_done(mad_send_wr);
spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h index 2651e93..e9dc4cc 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -77,6 +77,15 @@
#define IB_MGMT_MAX_METHODS 128
+/* MAD Status field bit masks */
+#define IB_MGMT_MAD_STATUS_SUCCESS 0x0000
+#define IB_MGMT_MAD_STATUS_BUSY 0x0001
+#define IB_MGMT_MAD_STATUS_REDIRECT_REQD 0x0002
+#define IB_MGMT_MAD_STATUS_BAD_VERERSION 0x0004
+#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD 0x0008
+#define IB_MGMT_MAD_STATUS_UNSUPPORTED_METHOD_ATTRIB 0x000c
+#define IB_MGMT_MAD_STATUS_INVALID_ATTRIB_VALUE 0x001c
+
/* RMPP information */
#define IB_MGMT_RMPP_VERSION 1
#define IB_MGMT_RMPP_PASSTHRU 255
More information about the ewg
mailing list