[ofw] CQ poll and rearm semantics

Jan Bottorff jbottorff at xsigo.com
Thu Mar 8 17:12:23 PST 2007


Hi,

I've been debugging an IB kernel driver and see that we sometimes get a
stuck send operation. I believe the send actually completes, but we
never get a CQ completion callback. I've been trying to track down the
CORRECT programming semantics for CQ polling and rearming.

Looking in the Windows IB stack, I see some cases where, in the
completion callback routine, the CQ is rearmed BEFORE the CQ entries are
polled (as in the base MAD processing code). In other places (such as the
IPoIB driver), it polls first, in a loop until no CQ entries are
returned, and then rearms the CQ. I also found a 2003 document from
Intel called the IB verb implementers guide (at
infiniband.sourceforge.net/HWDrivers/HCA_DDK/VIG_SF.pdf), and it very
clearly states in section 8.3 that you need to use what look like
edge-triggered interrupt semantics to handle the race condition between
polling and rearming the CQ. Assuming the Intel document is correct, the
IB stack may be getting stuck completions on occasion.
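
To make the suspected race concrete, here is a minimal sketch of a mock CQ
in C, assuming edge-triggered semantics as the Intel guide appears to
describe: a rearm only produces a callback for completions that arrive
AFTER the rearm. Everything named mock_* (and the race hook) is
hypothetical and exists only for this illustration; it is not the IBAL
API and not how any real HCA driver is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_CQ_DEPTH 16

/* Hypothetical model of an edge-triggered CQ (illustration only). */
typedef struct mock_cq {
    int    entries[MOCK_CQ_DEPTH]; /* queued work request ids */
    size_t head;                   /* next entry the consumer polls */
    size_t tail;                   /* next slot the "hardware" fills */
    bool   armed;                  /* set by rearm, cleared when an event fires */
    int    events;                 /* callbacks the hardware would have raised */
} mock_cq_t;

typedef void (*race_hook_t)(mock_cq_t *);

/* Hardware side: queue a completion; raise an event only if armed. */
static void mock_hw_complete(mock_cq_t *cq, int wr_id)
{
    assert(cq->tail - cq->head < MOCK_CQ_DEPTH);
    cq->entries[cq->tail % MOCK_CQ_DEPTH] = wr_id;
    cq->tail++;
    if (cq->armed) {          /* edge: one event per arm */
        cq->armed = false;
        cq->events++;
    }
}

/* Consumer side: fetch one completion; false when the CQ is empty. */
static bool mock_poll(mock_cq_t *cq, int *wr_id)
{
    if (cq->head == cq->tail)
        return false;
    *wr_id = cq->entries[cq->head % MOCK_CQ_DEPTH];
    cq->head++;
    return true;
}

static void mock_rearm(mock_cq_t *cq) { cq->armed = true; }

/* Poll until empty; returns the number of completions processed. */
static int drain(mock_cq_t *cq)
{
    int wr_id, n = 0;
    while (mock_poll(cq, &wr_id))
        n++;
    return n;
}

/* Simulates a completion landing between the last poll and the rearm. */
static void late_completion(mock_cq_t *cq) { mock_hw_complete(cq, 99); }

/* Ordering 1: poll until empty, then rearm. A completion that arrives in
 * the window before the rearm is left queued with no event pending. */
static int poll_then_rearm(mock_cq_t *cq, race_hook_t hook)
{
    int n = drain(cq);
    if (hook)
        hook(cq);
    mock_rearm(cq);
    return n;
}

/* Ordering 2: poll until empty, rearm, then poll again so that anything
 * that slipped in before the rearm is still picked up. */
static int poll_rearm_poll(mock_cq_t *cq, race_hook_t hook)
{
    int n = drain(cq);
    if (hook)
        hook(cq);
    mock_rearm(cq);
    return n + drain(cq);
}
```

In this model, calling poll_then_rearm(&cq, late_completion) after one
completion has fired leaves one entry stranded on the CQ until some later
completion raises a new event, which matches the stuck-send symptom. With
poll_rearm_poll the second poll pass picks it up. Whether real hardware
behaves like this model is exactly the open question.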

Can anybody give a definitive answer on whether the CQ trigger has edge
or level semantics, and what I need to do to ensure CQ entries are
always processed without delay? The docs for ib_rearm_cq seem to say
something different from the docs for ib_rearm_n_cq, so the docs aren't
much help either.

- Jan
