[openib-general] Completion callback/teardown race

Eric Barton eeb at bartonsoftware.com
Tue Sep 19 11:14:28 PDT 2006


Hi,

I create one CQ per QP, used only for receive completions.  When I tear down
the QP, I call rdma_disconnect(), change the QP state to IB_QPS_ERR and then
wait for all currently posted receives to complete.

This has worked just fine for me, but I've had a bug report from a site using
this software (possibly with HCAs I've not tested with) that another completion
callback can happen after all the posted receives have completed.

I supplied a debug/workaround patch that checks the CQ in this situation.  It
confirms that all posted receives have completed and that the CQ is in fact
empty.

Is this a bug, or an unavoidable race between arming the callback and polling
the CQ?

All the CQ callback does is wake a thread to poll the queue.  The thread keeps
polling completions out of the CQ until it is empty.  Then it calls
ib_req_notify_cq(cq, IB_CQ_NEXT_COMP) and ib_poll_cq() one more time.

If this last call to ib_poll_cq() finds something, it repeats the whole process
- but can I be guaranteed another CQ callback in this case or is it
indeterminate?
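
For concreteness, the drain / re-arm / re-poll loop described above can be
sketched as a small userspace model.  This is only an illustration under
stated assumptions, not the actual driver code: struct mock_cq,
mock_poll_cq(), mock_req_notify_cq(), mock_hw_complete() and
poll_thread_pass() are hypothetical stand-ins, and the "fire the callback once
per arm" behaviour is an assumed simplification of IB_CQ_NEXT_COMP semantics.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a completion queue; NOT the kernel's struct ib_cq. */
struct mock_cq {
    int  pending;    /* completions waiting to be polled */
    bool armed;      /* true once a notification has been requested */
    int  callbacks;  /* how many times the completion callback has fired */
};

/* Stand-in for ib_poll_cq(): reap up to max entries, return how many. */
static int mock_poll_cq(struct mock_cq *cq, int max)
{
    int n = cq->pending < max ? cq->pending : max;

    assert(max > 0);
    cq->pending -= n;
    return n;
}

/* Stand-in for ib_req_notify_cq(cq, IB_CQ_NEXT_COMP). */
static void mock_req_notify_cq(struct mock_cq *cq)
{
    cq->armed = true;
}

/* Assumed hardware model: a new completion fires the callback only if the
 * CQ is armed, and firing disarms it. */
static void mock_hw_complete(struct mock_cq *cq)
{
    cq->pending++;
    if (cq->armed) {
        cq->armed = false;
        cq->callbacks++;
    }
}

/* One wakeup of the polling thread.  before_arm / after_arm inject
 * completions into the two windows around the re-arm on the first pass.
 * Returns the total number of completions reaped. */
static int poll_thread_pass(struct mock_cq *cq, int before_arm, int after_arm)
{
    int reaped = 0, n;

    for (;;) {
        /* Drain until the CQ reports empty. */
        while ((n = mock_poll_cq(cq, 16)) > 0)
            reaped += n;

        /* Window 1: arrivals after the drain but before arming. */
        for (; before_arm > 0; before_arm--)
            mock_hw_complete(cq);

        mock_req_notify_cq(cq);

        /* Window 2: arrivals after arming but before the final poll. */
        for (; after_arm > 0; after_arm--)
            mock_hw_complete(cq);

        /* Final poll: anything found means repeat the whole process. */
        n = mock_poll_cq(cq, 16);
        if (n == 0)
            break;
        reaped += n;
    }
    return reaped;
}
```

In this toy model, a completion that lands in window 1 raises no callback but
is still caught by the final poll.  One that lands in window 2 raises a
callback and is also reaped by the final poll, so the thread's next wakeup
finds an empty CQ.  Whether real HCAs behave this way is exactly the question
being asked; the model only shows that the pattern permits it.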

-- 

                Cheers,
                        Eric





