[ofa-general] hotplug event handle question
Michael S. Tsirkin
mst at dev.mellanox.co.il
Wed Apr 25 21:05:53 PDT 2007
> Quoting Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [ofa-general] hotplug event handle question
>
> > if (!cma_comp(id_priv, CMA_CONNECT) &&
> > !cma_comp(id_priv, CMA_DISCONNECT))
> > return -EINVAL;
>
> This check only ensures that we have a valid underlying cm_id (cm_id.ib or
> cm_id.iw) and are bound to a device, with the underlying cm's providing the
> synchronization that we need. To allow rdma_disconnect() to be called
> after the device has been removed will likely take a slight re-working of
> the states. (I'm more concerned about userspace clients calling
> rdma_disconnect at the wrong time and crashing the kernel than a misuse
> from kernel clients.) I don't think that the device removal code should
> transition the QP into the error state underneath the user, so fixing
> rdma_disconnect seems like the way to go.
I think the problem is that cma_remove_id_dev overrides the current state,
losing state information in the process. Why do we need CMA_DEVICE_REMOVAL
at all? Everything seems to work fine just by forwarding RDMA_CM_EVENT_DEVICE_REMOVAL
to user, without touching state.
>
> I will work on a fix for this. In the meantime, the alternatives are
> either to remove the check or have the ULP transitioning QP into the error
> state.
>
>
> Steve, I don't see where the iwarp code transitions the cm_id to
> CMA_DISCONNECT. Does iwarp keep the cm_id in the CMA_CONNECT state until
> it is destroyed?
Don't lose state on RDMA_CM_EVENT_DEVICE_REMOVAL.
Without this patch, rdma_disconnect won't move the QP to error.
Signed-off-by: Michael S. Tsirkin <mst at dev.mellanox.co.il>
---
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index fde92ce..9f37eac 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -92,7 +92,6 @@ enum cma_state {
CMA_DISCONNECT,
CMA_ADDR_BOUND,
CMA_LISTEN,
- CMA_DEVICE_REMOVAL,
CMA_DESTROYING
};
@@ -2689,20 +2688,13 @@ static void cma_add_one(struct ib_device *device)
static int cma_remove_id_dev(struct rdma_id_private *id_priv)
{
struct rdma_cm_event event;
- enum cma_state state;
- /* Record that we want to remove the device */
- state = cma_exch(id_priv, CMA_DEVICE_REMOVAL);
- if (state == CMA_DESTROYING)
+ if (cma_comp(id_priv, CMA_DESTROYING))
return 0;
- cma_cancel_operation(id_priv, state);
+ cma_cancel_operation(id_priv, id_priv->state);
wait_event(id_priv->wait_remove, !atomic_read(&id_priv->dev_remove));
- /* Check for destruction from another callback. */
- if (!cma_comp(id_priv, CMA_DEVICE_REMOVAL))
- return 0;
-
memset(&event, 0, sizeof event);
event.event = RDMA_CM_EVENT_DEVICE_REMOVAL;
return id_priv->id.event_handler(&id_priv->id, &event);
--
MST
More information about the general
mailing list