[openib-general] [CMA] re-connect procedure?

Tue Mar 21 13:02:28 PST 2006

How should CMA consumers re-connect after a disconnect?

The current implementation of rdma_connect() requires the rdma_cm_id 
to be in the CMA_ROUTE_RESOLVED state. As a result, the following:

 rdma_resolve_addr()
 rdma_resolve_route()
 rdma_connect()

 [receive disconnect]

 rdma_connect() <-- fails rdma_cm_id state check 

fails.

From my perspective, it seems reasonable to allow the consumer to 
re-connect without calling rdma_resolve_addr() and 
rdma_resolve_route().

If those functions are always prerequisites for a connect call, the 
consumer will need to clean up and reallocate all of its resources 
every time the connection is lost.

In some protocols, a disconnect does not represent a catastrophic 
failure and therefore does not warrant a complete cleanup. For 
example, NFS clients and servers close idle connections. When the 
client has additional operations to send, it reconnects to the server.

Would something as simple as this solve the problem? I'm debugging the 
re-connect path in my code now, so it is only partially tested.
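
To make the difference between the two checks concrete, here is a small 
user-space sketch of the state test only. It is not the code in cma.c: 
struct toy_id, toy_comp(), toy_comp_exch(), check_current(), 
check_proposed() and toy_demo() are hypothetical stand-ins, there is no 
locking, and CMA_DISCONNECT is included only as an example of "some other 
state". It also assumes the id is still in CMA_CONNECT when the consumer 
retries, which is what the proposed cma_comp() test relies on.

```c
#include <assert.h>
#include <errno.h>

/* Toy subset of the CMA states (assumption: not the full enum in cma.c). */
enum cma_state { CMA_ROUTE_RESOLVED, CMA_CONNECT, CMA_DISCONNECT };

/* Hypothetical stand-in for struct rdma_id_private; state only. */
struct toy_id {
	enum cma_state state;
};

/* Stand-in for cma_comp(): nonzero if the id is currently in state comp. */
static int toy_comp(const struct toy_id *id, enum cma_state comp)
{
	return id->state == comp;
}

/* Stand-in for cma_comp_exch(): move to exch only if currently in comp. */
static int toy_comp_exch(struct toy_id *id, enum cma_state comp,
			 enum cma_state exch)
{
	if (id->state != comp)
		return 0;
	id->state = exch;
	return 1;
}

/* State check as it stands today: only ROUTE_RESOLVED may connect. */
static int check_current(struct toy_id *id)
{
	if (!toy_comp_exch(id, CMA_ROUTE_RESOLVED, CMA_CONNECT))
		return -EINVAL;
	return 0;
}

/* State check with the proposed change: CONNECT is accepted as well. */
static int check_proposed(struct toy_id *id)
{
	if (!toy_comp(id, CMA_CONNECT) &&
	    !toy_comp_exch(id, CMA_ROUTE_RESOLVED, CMA_CONNECT))
		return -EINVAL;
	return 0;
}

/* Walk through the sequence from the top of this mail. */
void toy_demo(void)
{
	struct toy_id id = { CMA_ROUTE_RESOLVED };

	assert(check_current(&id) == 0);       /* first connect passes   */
	assert(check_current(&id) == -EINVAL); /* retry fails the check  */
	assert(check_proposed(&id) == 0);      /* retry passes the check */
}
```

In this model an id in any other state (for example CMA_DISCONNECT) is 
still rejected by both checks, so the change only helps if the id remains 
in CMA_CONNECT after the disconnect is received.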

Signed-off-by: James Lentini <jlentini at netapp.com>

Index: core/cma.c
===================================================================

--- core/cma.c	(revision 5938)
+++ core/cma.c	(working copy)
@@ -1443,7 +1443,8 @@ int rdma_connect(struct rdma_cm_id *id, 
 	int ret;
 
 	id_priv = container_of(id, struct rdma_id_private, id);
-	if (!cma_comp_exch(id_priv, CMA_ROUTE_RESOLVED, CMA_CONNECT))
+	if (!cma_comp(id_priv, CMA_CONNECT) &&
+	    !cma_comp_exch(id_priv, CMA_ROUTE_RESOLVED, CMA_CONNECT))
 		return -EINVAL;
 
 	if (!id->qp) {