[ewg] rdma retry number

Sunkyoung Shin Sunkyoung.Shin at falconstor.com
Wed Oct 10 15:54:10 PDT 2007


Hello,

During a failover test, we found that iSCSI over iSER reconnected to the
iSCSI target only after 100 seconds, due to the default max timeout
(8 sec) and retry count (15). The max timeout is adjustable through the
max_timeout module parameter of ib_cm.ko, but the retry count is not.
Can we add the retry count as a module parameter of rdma_cm.ko? The
patch below does this; it is based on OFED-1.2-20070626-0917.



diff -Naur ofa_kernel-1.2.orig/drivers/infiniband/core/cma.c ofa_kernel-1.2/drivers/infiniband/core/cma.c
--- ofa_kernel-1.2.orig/drivers/infiniband/core/cma.c   2007-06-26 12:17:47.000000000 -0400
+++ ofa_kernel-1.2/drivers/infiniband/core/cma.c    2007-10-10 18:41:09.000000000 -0400
@@ -53,6 +53,10 @@
 #define CMA_CM_RESPONSE_TIMEOUT 20
 #define CMA_MAX_CM_RETRIES 15
 
+static int cma_max_cm_retries = CMA_MAX_CM_RETRIES;
+module_param_named(cma_max_cm_retries, cma_max_cm_retries, int, 0644);
+MODULE_PARM_DESC(cma_max_cm_retries, "Maximum number of CM retries (default 15)");
+
 static void cma_add_one(struct ib_device *device);
 static void cma_remove_one(struct ib_device *device);
 
@@ -1985,7 +1989,7 @@
    req.service_id = cma_get_service_id(id_priv->id.ps,
                        &route->addr.dst_addr);
    req.timeout_ms = 1 << (CMA_CM_RESPONSE_TIMEOUT - 8); 
-   req.max_cm_retries = CMA_MAX_CM_RETRIES;
+   req.max_cm_retries = cma_max_cm_retries;
 
    ret = ib_send_cm_sidr_req(id_priv->cm_id.ib, &req);
    if (ret) {
@@ -2045,7 +2049,7 @@
    req.rnr_retry_count = conn_param->rnr_retry_count;
    req.remote_cm_response_timeout = CMA_CM_RESPONSE_TIMEOUT;
    req.local_cm_response_timeout = CMA_CM_RESPONSE_TIMEOUT;
-   req.max_cm_retries = CMA_MAX_CM_RETRIES;
+   req.max_cm_retries = cma_max_cm_retries;
    req.srq = id_priv->srq ? 1 : 0;
 
    ret = ib_send_cm_req(id_priv->cm_id.ib, &req); 
 



Sunkyoung Shin
FalconStor Software, Inc.



