[openib-general] Re: CMA deadlock
Michael S. Tsirkin
mst at mellanox.co.il
Mon Apr 3 11:46:22 PDT 2006
Quoting r. Sean Hefty <sean.hefty at intel.com>:
> Subject: RE: CMA deadlock
>
> > A ULP requests address resolution; on success requests route resolution;
> > route resolution succeeds; inside the callback ULP requests rdma_connect.
> > Now, a failure (e.g. out of memory) occurs at ULP level and so it decides to
> > destroy the ID. To this end it returns failure code from the route callback.
>
> I didn't consider this possibility. The only solution I can see at the moment
> is to schedule route resolution to a separate thread, as you suggested.
OK. I gather you'll fix it then?
> >And it seems that, if the user callback returns failure, the CMA actually
> >calls rdma_destroy_id which in turn may call ib_destroy_cm_id from inside the
> >CM callback. I think this might deadlock in a similar way. Again, bouncing
> >the CM event to the rdma WQ will solve this I think.
>
> This should be handled by the code. See the comment near the bottom of the
> cma_ib_handler() routine.
I don't really understand the comment.
	if (ret) {
		/* Destroy the CM ID by returning a non-zero value. */
		conn_id->cm_id.ib = NULL;
		cma_exch(conn_id, CMA_DESTROYING);
		cma_release_remove(conn_id);
		rdma_destroy_id(&conn_id->id);
	}
We seem to be calling rdma_destroy_id, which seems to be calling
ib_destroy_cm_id directly. No?
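
To make the concern concrete, here is a rough user-space sketch of the
pattern under discussion. It is not the CMA or CM code: every toy_* name is
hypothetical, and it assumes that destroying an ID has to wait for any
running handler to finish. Under that assumption a destroy issued from inside
the handler can never complete, so the sketch returns -EDEADLK where real
code would hang. Bouncing the destroy to a work queue lets the handler return
first.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a CM ID; not the real ib_cm_id. */
struct toy_id {
	int in_handler;	/* nonzero while the event handler is running */
	int destroyed;	/* set once the ID has been torn down */
};

/* One deferred work item in a simple FIFO "work queue". */
struct toy_work {
	void (*func)(struct toy_id *id);
	struct toy_id *id;
	struct toy_work *next;
};

static struct toy_work *wq_head, *wq_tail;

/* Queue func(id) to run later, outside of any handler. */
static int toy_queue_work(void (*func)(struct toy_id *), struct toy_id *id)
{
	struct toy_work *w;

	assert(func != NULL);
	w = malloc(sizeof(*w));
	if (!w)
		return -ENOMEM;
	w->func = func;
	w->id = id;
	w->next = NULL;
	if (wq_tail)
		wq_tail->next = w;
	else
		wq_head = w;
	wq_tail = w;
	return 0;
}

/* Run all queued work items in FIFO order. */
static void toy_run_workqueue(void)
{
	while (wq_head) {
		struct toy_work *w = wq_head;

		wq_head = w->next;
		if (!wq_head)
			wq_tail = NULL;
		w->func(w->id);
		free(w);
	}
}

/*
 * Assumption: destroy must wait for a running handler to complete.
 * Real code would block here forever; the sketch reports it instead.
 */
static int toy_destroy_id(struct toy_id *id)
{
	if (id->in_handler)
		return -EDEADLK;
	id->destroyed = 1;
	return 0;
}

static void toy_destroy_work(struct toy_id *id)
{
	toy_destroy_id(id);
}

/* Handler that destroys the ID directly: the problematic case. */
static int toy_handler_direct(struct toy_id *id)
{
	int ret;

	id->in_handler = 1;
	ret = toy_destroy_id(id);
	id->in_handler = 0;
	return ret;
}

/* Handler that bounces the destroy to the work queue instead. */
static int toy_handler_deferred(struct toy_id *id)
{
	int ret;

	id->in_handler = 1;
	ret = toy_queue_work(toy_destroy_work, id);
	id->in_handler = 0;
	return ret;
}
```

With these definitions, toy_handler_direct() fails with -EDEADLK and leaves
the ID alive, while toy_handler_deferred() returns 0 and the ID is destroyed
only once toy_run_workqueue() runs after the handler has returned. Whether
the real rdma_destroy_id path waits in this way is exactly the open question
above.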
--
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies