How to destroy IB resources (was Re: [ofa-general] Help - RDMA event files remain open after acknowledging them)

Nitin Mehrotra nmehrotra at riorey.com
Fri Aug 14 12:02:21 PDT 2009


Sean,

Thanks for your reply. It turns out the problem is the file created by the ibv_create_comp_channel() call. I do call the matching destroy for each create, but the destroy is failing with error 16 (EBUSY, device or resource busy), and I had missed that.

This brings me to another newbie question that I haven't been able to solve completely: how to cleanly and successfully destroy all IB resources. Since this is a new subject, I have changed the thread subject accordingly.

I init IB as follows:

- ibv_create_comp_channel()
- make ccc_fd non-blocking
- ibv_create_cq()
- ibv_req_notify_cq()
- ibv_alloc_pd()
- ibv_create_qp()
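
For reference, a minimal sketch of the "make ccc_fd non-blocking" step, assuming ccc_fd is the fd member of the struct ibv_comp_channel returned by ibv_create_comp_channel(). The helper name set_nonblocking() is made up for illustration; only fcntl() is used.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a file descriptor into non-blocking mode while preserving its
 * other status flags.
 * Returns 0 on success, -1 on failure (errno is set by fcntl). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

It would be called as set_nonblocking(channel->fd). Once the fd is non-blocking, ibv_get_cq_event() fails with errno set to EAGAIN whenever no completion event is queued, so EAGAIN by itself means "nothing pending" rather than a fault.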

I shut down in the reverse order:
- drain_cq()
- ibv_destroy_qp()
- ibv_dealloc_pd()
- ibv_destroy_cq()
- ibv_destroy_comp_channel()
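
Since each parent object reports EBUSY while a child still refers to it, it helps to stop at the first destroy call that fails, because every later EBUSY is probably just a consequence of that one. A rough sketch of that pattern follows. The names teardown_step, run_teardown() and the two stub callbacks are hypothetical and stand in for thin wrappers around ibv_destroy_qp(), ibv_dealloc_pd(), ibv_destroy_cq() and ibv_destroy_comp_channel(), so that the sketch compiles without libibverbs. It assumes each wrapper returns 0 on success or an errno value on failure, which is how the libibverbs man pages describe the destroy calls.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* One teardown step: a label, a callback and the object to destroy.
 * The callback returns 0 on success or an errno value on failure. */
struct teardown_step {
    const char *name;
    int (*destroy)(void *resource);
    void *resource;
};

/* Stand-ins so the sketch builds without libibverbs (hypothetical). */
int stub_destroy_ok(void *resource)   { (void)resource; return 0; }
int stub_destroy_busy(void *resource) { (void)resource; return EBUSY; }

/* Run the steps in order and stop at the first failure.
 * Returns 0 if every step succeeded, otherwise the failing step's
 * errno value. *failed receives the failing index, or -1 if none. */
int run_teardown(const struct teardown_step *steps, int count, int *failed)
{
    if (failed)
        *failed = -1;
    for (int i = 0; i < count; i++) {
        int rc = steps[i].destroy(steps[i].resource);
        if (rc != 0) {
            fprintf(stderr, "%s failed: %s\n", steps[i].name, strerror(rc));
            if (failed)
                *failed = i;
            return rc;
        }
    }
    return 0;
}
```

With the steps listed in the order QP, PD, CQ, completion channel, the first name printed identifies the object that is still in use. For example, if the PD step is the first to fail, something other than the QP (such as a memory region that is still registered) may still be attached to the PD.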

My drain_cq() function is:
loop:
    - ibv_get_cq_event()
    - ibv_ack_cq_events() for all pending unacknowledged events, if any
    - ibv_req_notify_cq()
    - ibv_poll_cq()
until either ibv_get_cq_event() returns an error, ibv_poll_cq() returns 0 completions, or I have looped the depth of the CQ.
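
One way to avoid spinning on a non-blocking channel is to wait on its fd with poll() before calling ibv_get_cq_event(). A minimal sketch of such a wait follows. The helper name wait_readable() is made up for illustration, and it assumes the fd passed in is the completion channel's fd.

```c
#include <errno.h>
#include <poll.h>

/* Wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, and -1 if poll() fails or the
 * descriptor reports an error or hangup without readable data.
 * A call interrupted by a signal is retried with the full timeout. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);

    if (rc <= 0)
        return rc;                      /* 0 = timeout, -1 = poll error */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

In a drain loop, a return of 1 would be followed by ibv_get_cq_event(), ibv_ack_cq_events() and ibv_poll_cq(), while a return of 0 means no event arrived within the timeout, so the loop can end without counting iterations against the CQ depth.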

It works, after a fashion. Without the drain function, ibv_destroy_cq() hangs. Now, however, ibv_get_cq_event() returns EAGAIN continuously, so I exit after I have looped the depth of the CQ, and then ibv_dealloc_pd(), ibv_destroy_cq() and ibv_destroy_comp_channel() all return EBUSY. This leaves the file open in the system.

I guess my question is: what's the best way to destroy IB resources? (Perhaps even: what's the best way to init them in the first place?)

Thanks,

Nitin
----- Original Message -----
From: "Sean Hefty" <sean.hefty at intel.com>
To: "Nitin Mehrotra" <nmehrotra at riorey.com>, general at lists.openfabrics.org
Sent: Friday, August 14, 2009 12:34:15 PM GMT -05:00 US/Canada Eastern
Subject: RE: [ofa-general] Help - RDMA event files remain open after acknowledging them

>What am I doing wrong? Is there something more I need to do than calling
>rdma_ack_cm_event after every rdma_get_cm_event to get these event files to be
>closed? As an fyi, I have even tried closing the rdma_id and destroying the
>event channel when the connection fails to force the event files to be closed
>without success.

The following calls result in opening files to the kernel:

ibv_create_comp_channel() - used to report cq events
rdma_create_event_channel() - used to report rdma cm events

Be sure that there are corresponding calls to:

ibv_destroy_comp_channel()
rdma_destroy_event_channel()

These are the calls that close the opened files.

- Sean



