[openib-general] [PATCH] 2.6.20 rdma_ucm: fix reporting events with invalid user context

Or Gerlitz or.gerlitz at gmail.com
Mon Jan 8 12:13:25 PST 2007


On 1/5/07, Sean Hefty <sean.hefty at intel.com> wrote:
> There's a problem with how rdma cm events are reported to userspace that can
> lead to application crashes.
>
> When a new connection request arrives, a context for the connection is allocated
> in the kernel.  The connection event is then reported to userspace.  The
> userspace library retrieves the event and allocates its own context for the
> connection.  The userspace context is associated with the kernel's context when
> accepting.  This allows the kernel to give userspace context with other events.

> A problem occurs if a second event for the same connection occurs before the
> user has had a chance to call accept.  The userspace context has not yet been
> set, which causes the librdmacm to crash.   (This has been seen when the app
> takes too long to call accept, resulting in the remote side timing out and
> rejecting the connection.)

Assuming that events are reported in order (correct?), is the scenario
that the userspace consumer called rdma_get_cm_event, got a connection
request, and then, before calling rdma_accept, called rdma_get_cm_event
again and got the connection reject?

Or is it that there are two threads in userspace: one calls
rdma_get_cm_event, handles some events itself and hands others to a
second thread. In that case it got the connection request and passed it
to the other thread, then got the connection reject and tried to act on
it before the other thread had called rdma_accept?
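
To make the first scenario concrete, here is a minimal sketch in C of how
I read the description. It is only a model: the types and functions
(kernel_ctx, cm_event, report_event, accept_conn,
second_event_has_context) are made up for illustration and are not the
rdma_ucm or librdmacm code. It assumes the kernel copies whatever user
context it currently holds into each event, and that the user context is
only set at accept time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the real rdma_ucm / librdmacm structures. */
enum ev_type { EV_CONNECT_REQUEST, EV_REJECTED, EV_DISCONNECTED };

struct user_ctx { int id; };   /* library-side per-connection state */

struct kernel_ctx {
    uint64_t uid;              /* userspace context handle, 0 until accept */
};

struct cm_event {
    enum ev_type type;
    uint64_t uid;              /* snapshot of kernel_ctx.uid at report time */
};

/* Kernel side: report an event, attaching whatever user context is known. */
struct cm_event report_event(const struct kernel_ctx *k, enum ev_type type)
{
    struct cm_event ev = { type, k->uid };
    return ev;
}

/* Accept path: the point where the user context gets associated. */
void accept_conn(struct kernel_ctx *k, struct user_ctx *u)
{
    assert(u != NULL);
    k->uid = (uint64_t)(uintptr_t)u;
}

/*
 * Single-threaded sequence: a connection request, then a second event that
 * arrives either before or after accept. Returns 1 if the second event
 * carries a usable user context, 0 if it does not.
 */
int second_event_has_context(int accept_first)
{
    struct kernel_ctx k = { 0 };
    struct user_ctx u = { 1 };

    /* The request is reported before any user context exists. */
    struct cm_event req = report_event(&k, EV_CONNECT_REQUEST);
    (void)req;

    if (accept_first)
        accept_conn(&k, &u);

    /* A reject if the remote side timed out first, otherwise a later event. */
    struct cm_event second =
        report_event(&k, accept_first ? EV_DISCONNECTED : EV_REJECTED);
    return second.uid != 0;
}
```

In this model second_event_has_context(0) returns 0, i.e. the reject
reaches userspace with no user context attached, while
second_event_has_context(1) returns 1. If the real flow matches this,
the ordering alone would be enough to hit the problem, with no second
thread needed.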

Or.



