[ofa-general] [PATCH] for-2.6.25: rdma/cm: do not issue MRA if user rejects connection request

Sean Hefty sean.hefty at intel.com
Wed Feb 13 14:33:53 PST 2008


There's an undesirable interaction between issuing MRAs to
increase connection timeouts and the listen backlog.

When the rdma_cm receives a connection request, it queues an MRA
with the ib_cm.  (The ib_cm will send an MRA if it receives a
duplicate REQ.)  The rdma_cm will then create a new rdma_cm_id and
give that to the user, which in this case is the rdma_user_cm.

If the listen backlog maintained in the rdma_user_cm is full, the
rdma_user_cm destroys the rdma_cm_id, which in turn destroys the
ib_cm_id.  The ib_cm_id generates a REJ because its state has
already changed from REQ received to MRA sent.

Defer queuing the MRA until after the user of the rdma_cm has
examined the connection request.

Signed-off-by: Sean Hefty <sean.hefty at intel.com>
---
This problem was detected while debugging an MPI application running
over uDAPL.

This patch is also available at:

	git://git.openfabrics.org/~shefty/rdma-dev.git for-roland
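
The ordering problem described in the changelog can be illustrated
with a small user-space model.  This is only a sketch, not kernel
code: the toy_* names are hypothetical, and the rule that a destroy
sends a REJ only once the MRA has been queued is an assumption taken
from the description above, not from the ib_cm source.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the ib_cm state machine. */
enum toy_state { TOY_REQ_RCVD, TOY_MRA_SENT };

struct toy_conn {
	enum toy_state state;
	bool rej_sent;
};

static void toy_req_received(struct toy_conn *c)
{
	c->state = TOY_REQ_RCVD;
	c->rej_sent = false;
}

/* Queuing the MRA moves the connection out of the REQ received state. */
static void toy_queue_mra(struct toy_conn *c)
{
	assert(c->state == TOY_REQ_RCVD);
	c->state = TOY_MRA_SENT;
}

/*
 * Assumption from the changelog: destroying the id produces a REJ
 * only when the state has already advanced to MRA sent.
 */
static void toy_destroy(struct toy_conn *c)
{
	if (c->state == TOY_MRA_SENT)
		c->rej_sent = true;
}

/* Old ordering: queue the MRA first, then let the user see the request. */
static bool toy_old_order(bool backlog_full)
{
	struct toy_conn c;

	toy_req_received(&c);
	toy_queue_mra(&c);
	if (backlog_full)
		toy_destroy(&c);
	return c.rej_sent;
}

/* New ordering: queue the MRA only after the user keeps the request. */
static bool toy_new_order(bool backlog_full)
{
	struct toy_conn c;

	toy_req_received(&c);
	if (backlog_full)
		toy_destroy(&c);
	else
		toy_queue_mra(&c);
	return c.rej_sent;
}
```

With a full backlog, toy_old_order(true) reports a REJ while
toy_new_order(true) does not; with room in the backlog, neither
ordering rejects.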

 drivers/infiniband/core/cma.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 0751697..98e1b38 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1100,7 +1100,6 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
 		event.param.ud.private_data_len =
 				IB_CM_SIDR_REQ_PRIVATE_DATA_SIZE - offset;
 	} else {
-		ib_send_cm_mra(cm_id, CMA_CM_MRA_SETTING, NULL, 0);
 		conn_id = cma_new_conn_id(&listen_id->id, ib_event);
 		cma_set_req_event_data(&event, &ib_event->param.req_rcvd,
 				       ib_event->private_data, offset);
@@ -1122,8 +1121,18 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
 	cm_id->cm_handler = cma_ib_handler;
 
 	ret = conn_id->id.event_handler(&conn_id->id, &event);
-	if (!ret)
+	if (!ret) {
+		/*
+		 * Acquire mutex to prevent user executing rdma_destroy_id()
+		 * while we're accessing the cm_id.
+		 */
+		mutex_lock(&lock);
+		if (cma_comp(conn_id, CMA_CONNECT) &&
+		    !cma_is_ud_ps(conn_id->id.ps))
+			ib_send_cm_mra(cm_id, CMA_CM_MRA_SETTING, NULL, 0);
+		mutex_unlock(&lock);
 		goto out;
+	}
 
 	/* Destroy the CM ID by returning a non-zero value. */
 	conn_id->cm_id.ib = NULL;





