[openib-general] [PATCH 2/5] 2.6.19 rdma_cm: fix device removal race
Sean Hefty
sean.hefty at intel.com
Fri Sep 29 11:51:49 PDT 2006
The race is as follows:
A process : cma_process_remove() calls cma_remove_id_dev(),
which sets id state to CMA_DEVICE_REMOVAL and
calls wait_event(dev_remove).
B process : cma_req_handler() had incremented dev_remove,
and calls cma_acquire_ib_dev() and on failure
calls cma_release_remove(), which does a
wake_up of cma_process_remove(). Then
cma_req_handler() calls rdma_destroy_id();
A Process : cma_remove_id_dev() gets woken and checks the
state of id, and since it is still (wrongly)
CMA_DEVICE_REMOVAL, it calls notify_user(id)
and if that fails, the caller - cma_process_remove()
calls rdma_destroy_id(id). Two processes can
call rdma_destroy_id(), resulting in one
de-referencing kfreed id_priv.
Fix is for process B to set CMA_DESTROYING in cma_req_handler()
so that process A will return instead of doing a rdma_destroy_id().
Signed-off-by: Krishna Kumar <krkumar2 at in.ibm.com>
Signed-off-by: Sean Hefty <sean.hefty at intel.com>
---
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 69bb089..f383a4f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -932,6 +932,7 @@ static int cma_req_handler(struct ib_cm_
mutex_unlock(&lock);
if (ret) {
ret = -ENODEV;
+ cma_exch(conn_id, CMA_DESTROYING);
cma_release_remove(conn_id);
rdma_destroy_id(&conn_id->id);
goto out;
More information about the general
mailing list