[openib-general] Completion callback/teardown race
Eric Barton
eeb at bartonsoftware.com
Tue Sep 19 11:14:28 PDT 2006
Hi,
I create one CQ dedicated to receive completions for each of my QPs. When I
tear down a QP, I call rdma_disconnect(), change the QP state to IB_QPS_ERR and
then wait for all currently posted receives to complete.
This has worked just fine for me, but I've had a bug report from a site using
this software (possibly with HCAs I've not tested with) that another completion
callback can happen after all the posted receives have completed.
I supplied a debug/workaround patch that checks the CQ in this situation. It
confirms that all posted receives have completed and that the CQ is in fact
empty.
Is this a bug, or an unavoidable race between arming the callback and polling
the CQ?
All the CQ callback does is wake a thread to poll the queue. The thread keeps
polling completions off the CQ until it is empty. Then it calls
ib_req_notify_cq(cq, IB_CQ_NEXT_COMP) and ib_poll_cq() one more time.
If this last call to ib_poll_cq() finds something, it repeats the whole process
- but can I be guaranteed another CQ callback in this case or is it
indeterminate?
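
To make the question concrete, the loop described above can be sketched as a
small user-space model. This is only an illustration under stated assumptions:
struct mock_cq, mock_poll(), mock_arm(), mock_complete() and drain_and_rearm()
are hypothetical stand-ins, not the kernel verbs API, and the model assumes the
callback fires at most once per arm request. It says nothing about what any
particular HCA actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CQ; NOT the kernel's struct ib_cq. */
struct mock_cq {
    int  pending;    /* completions waiting to be polled */
    bool armed;      /* set by mock_arm(), cleared when the callback fires */
    int  callbacks;  /* how many times the callback has been raised */
};

/* Models ib_poll_cq() for one entry: 1 if a completion was reaped, else 0. */
static int mock_poll(struct mock_cq *cq)
{
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/* Models ib_req_notify_cq(cq, IB_CQ_NEXT_COMP). */
static void mock_arm(struct mock_cq *cq)
{
    cq->armed = true;
}

/* Models the HCA queueing a completion. Assumption: the callback is raised
 * only if the CQ is armed, and raising it disarms the CQ. */
static void mock_complete(struct mock_cq *cq)
{
    cq->pending++;
    if (cq->armed) {
        cq->armed = false;
        cq->callbacks++;
    }
}

/* The poll thread's loop: drain, arm, poll once more, and repeat the whole
 * process if that last poll found something. before_arm and after_arm inject
 * completions into the two race windows. Returns the number reaped. */
static int drain_and_rearm(struct mock_cq *cq, int before_arm, int after_arm)
{
    int reaped = 0;

    assert(cq != NULL);
    for (;;) {
        while (mock_poll(cq))
            reaped++;

        /* Window 1: arrives after the drain but before arming. No callback
         * is raised, so only the extra poll below can catch it. */
        for (; before_arm > 0; before_arm--)
            mock_complete(cq);

        mock_arm(cq);

        /* Window 2: arrives after arming but before the extra poll. A
         * callback is raised AND the extra poll below consumes the entry. */
        for (; after_arm > 0; after_arm--)
            mock_complete(cq);

        if (!mock_poll(cq))
            break;      /* CQ empty and still armed: safe to sleep */
        reaped++;
    }
    return reaped;
}
```

Starting with 2 pending completions and injecting 1 into each window, this
model reaps all 4 entries, leaves the CQ armed, and records 1 callback that was
raised for an entry the thread had already consumed. In the model, that
callback would wake the thread to find an empty CQ. Whether real hardware
behaves the same way is exactly what I'm asking.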
--
Cheers,
Eric