[ofa-general] [PATCH] rdma/cm: add locking around QP accesses

Or Gerlitz ogerlitz at voltaire.com
Tue Sep 25 02:57:03 PDT 2007


Sean Hefty wrote:
> If a user allocates a QP on an rdma_cm_id, the rdma_cm will automatically
> transition the QP through its states (RTR, RTS, error, etc.)  While the
> QP state transitions are occurring, the QP itself must remain valid.
> Provide locking around the QP pointer to prevent its destruction while
> accessing the pointer.
> 
> This fixes an issue reported by Olaf Kirch from Oracle that resulted in
> a system crash:
> 
> "An incoming connection arrives and we decide to tear down the nascent
>  connection.  The remote ends decides to do the same.  We start to shut
>  down the connection, and call rdma_destroy_qp on our cm_id. ... Now
>  apparently a 'connect reject' message comes in from the other host,
>  and cma_ib_handler() is called with an event of IB_CM_REJ_RECEIVED.
>  It calls cma_modify_qp_err, which for some odd reason tries to modify
>  the exact same QP we just destroyed."

Hi Sean, Rick,

In iscsi/iser, the approach we took wrt destruction of an <ID,QP> pair
(both the ID and the QP are created/destroyed through, and state-managed
by, the rdma-cm) is:

A) call rdma_disconnect to make sure the QP was transitioned to error
B) get the completions/flushes associated with all the WRs posted to the QP
C) make sure a disconnected event was received

Call rdma_destroy_qp only when B && C hold.
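
To make the ordering concrete, here is a rough, self-contained sketch of the
gating logic. It is only a model, not the actual iser code: the struct, its
fields and the helper names are all made up for illustration, and the real
rdma_disconnect/rdma_destroy_qp calls appear only as comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-connection bookkeeping (illustration only). */
struct conn_teardown {
	int  posted_wrs;        /* WRs posted and not yet completed/flushed */
	bool disconnect_called; /* step A was issued */
	bool disconnected_seen; /* step C: disconnected event was delivered */
	bool qp_destroyed;      /* the QP has already been destroyed */
};

/* Condition B: every posted WR has produced a completion or a flush. */
static bool all_wrs_reaped(const struct conn_teardown *c)
{
	return c->posted_wrs == 0;
}

/* Destroy the QP only when B && C hold; returns true if this call did it. */
static bool try_destroy_qp(struct conn_teardown *c)
{
	if (c->qp_destroyed)
		return false;
	if (!all_wrs_reaped(c) || !c->disconnected_seen)
		return false;
	/* Real code would call rdma_destroy_qp() on the cm_id here. */
	c->qp_destroyed = true;
	return true;
}

/* Step A: real code would call rdma_disconnect() on the cm_id here. */
static void start_teardown(struct conn_teardown *c)
{
	c->disconnect_called = true;
}

/* Called for each completion or flush reaped from the CQ. */
static bool on_completion(struct conn_teardown *c)
{
	assert(c->posted_wrs > 0);
	c->posted_wrs--;
	return try_destroy_qp(c);
}

/* Called when the disconnected event is delivered by the rdma-cm. */
static bool on_disconnected_event(struct conn_teardown *c)
{
	c->disconnected_seen = true;
	return try_destroy_qp(c);
}
```

Whichever of B or C is satisfied last triggers the destroy, so the order in
which the final flush and the disconnected event arrive does not matter. A
real implementation would also need to serialize these handlers, which the
sketch leaves out.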

What is your take on this approach?

Or.




