[ofa-general] [PATCH] rdma/cm: add locking around QP accesses
Or Gerlitz
ogerlitz at voltaire.com
Tue Sep 25 02:57:03 PDT 2007
Sean Hefty wrote:
> If a user allocates a QP on an rdma_cm_id, the rdma_cm will automatically
> transition the QP through its states (RTR, RTS, error, etc.). While the
> QP state transitions are occurring, the QP itself must remain valid.
> Provide locking around the QP pointer to prevent its destruction while
> accessing the pointer.
>
> This fixes an issue reported by Olaf Kirch from Oracle that resulted in
> a system crash:
>
> "An incoming connection arrives and we decide to tear down the nascent
> connection. The remote ends decides to do the same. We start to shut
> down the connection, and call rdma_destroy_qp on our cm_id. ... Now
> apparently a 'connect reject' message comes in from the other host,
> and cma_ib_handler() is called with an event of IB_CM_REJ_RECEIVED.
> It calls cma_modify_qp_err, which for some odd reason tries to modify
> the exact same QP we just destroyed."
Hi Sean, Rick,
In iscsi/iser, the approach we took to destroying an <ID,QP> pair (both
the ID and the QP are created, destroyed, and state-managed through the
rdma-cm) is:
A) call rdma_disconnect to make sure the QP has transitioned to the error state
B) collect the completions/flushes associated with all the WRs posted to the QP
C) make sure a disconnected event was received
We call rdma_destroy_qp only when B && C hold.
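
To make the ordering concrete, here is a minimal sketch of the bookkeeping
behind steps A-C. It is not the actual iSER code: the struct and all function
names are hypothetical, the real rdma_disconnect/rdma_destroy_qp calls appear
only as comments, and it assumes the caller serializes all of these calls
(real code would need a lock or run them from a single context).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-connection state; not an rdma-cm or iSER structure. */
struct teardown_state {
    int  wrs_outstanding;   /* WRs posted to the QP, not yet completed/flushed */
    bool disconnect_called; /* step A was issued */
    bool disconnected_seen; /* step C: disconnected event delivered */
    bool qp_destroyed;      /* QP has already been torn down */
};

/* Call whenever a WR is posted to the QP. */
static void on_post_wr(struct teardown_state *t)
{
    t->wrs_outstanding++;
}

/* Step A: real code would call rdma_disconnect(id) here. */
static void start_teardown(struct teardown_state *t)
{
    t->disconnect_called = true;
}

/* Step B: call for every completion, including flush-error completions. */
static void on_completion(struct teardown_state *t)
{
    assert(t->wrs_outstanding > 0);
    t->wrs_outstanding--;
}

/* Step C: call from the CM event handler on the disconnected event. */
static void on_disconnected_event(struct teardown_state *t)
{
    t->disconnected_seen = true;
}

/*
 * Destroy the QP only once A was issued and both B and C hold.
 * Returns true if this call performed the destruction.
 */
static bool try_destroy_qp(struct teardown_state *t)
{
    if (t->qp_destroyed || !t->disconnect_called)
        return false;
    if (t->wrs_outstanding > 0 || !t->disconnected_seen)
        return false;
    /* Real code would call rdma_destroy_qp(id) here. */
    t->qp_destroyed = true;
    return true;
}
```

In this sketch try_destroy_qp would be called after each completion and after
the disconnected event, so whichever of B or C finishes last triggers the
destroy, and a second call is a no-op.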
What is your take on this approach?
Or.