[PATCH] Re: [openib-general] Re: IPoIB Failure CQ overrun

Roland Dreier roland at topspin.com
Mon Dec 20 09:55:34 PST 2004


From adding some more dumping of CQ state, what _may_ be happening is
that under rare conditions the HCA's CQ consumer index (CI) gets
incremented by one too many.  Then when the CQ is completely empty, it
looks full to the HW, and we get an overrun on the next CQE.
(I saw it happen after ~300K increments of the CQ's CI, ~160K of which
were increments by more than 1.)
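
To make the "empty looks full" effect concrete, here is a toy model in
Python.  It is only a sketch under assumptions: the ring size, the
ToyCQ class, the glitch flag, and the full test "(PI + 1) mod size ==
CI" are all made up for illustration and are not taken from the HCA or
the driver.

```python
class ToyCQ:
    """Toy model of the hardware's view of a CQ ring (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.pi = 0  # producer index: where the HW writes the next CQE
        self.ci = 0  # consumer index as the HW believes it to be

    def looks_empty(self):
        return self.pi == self.ci

    def looks_full(self):
        # Assumed convention: one slot is always left unused.
        return (self.pi + 1) % self.size == self.ci

    def hw_post_cqe(self):
        if self.looks_full():
            raise OverflowError("CQ overrun")
        self.pi = (self.pi + 1) % self.size

    def doorbell_inc_ci(self, n, glitch=False):
        # glitch=True models the suspected extra increment of 1.
        self.ci = (self.ci + n + (1 if glitch else 0)) % self.size


cq = ToyCQ(8)
for _ in range(3):
    cq.hw_post_cqe()
# The driver has polled all 3 CQEs and rings one doorbell for 3,
# but the HW ends up with CI advanced by 4.
cq.doorbell_inc_ci(3, glitch=True)
print(cq.looks_empty(), cq.looks_full())
```

In this model the driver believes the CQ is empty, while the HW sees CI
one slot ahead of PI, which is exactly its "full" condition, so the next
hw_post_cqe() raises the overrun.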

I didn't see how the driver could be doing this, since the HCA ended
up with a CI that was one more than the total of the increments that
the driver did.  Also, converting all of the increment-CI doorbells to
only increment by 1 fixes the problem, which is more evidence of a
firmware glitch.
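
The shape of that workaround can be sketched as follows.  This is
Python pseudocode for illustration, not the actual driver change;
inc_ci_one_at_a_time and the ring_doorbell callback are hypothetical
names.

```python
def inc_ci_one_at_a_time(ring_doorbell, n):
    """Advance the CI by n using n doorbells of 1 each, not one doorbell of n."""
    for _ in range(n):
        ring_doorbell(1)


# Usage: record what would be written to the doorbell.
rung = []
inc_ci_one_at_a_time(rung.append, 3)
print(rung)  # [1, 1, 1]
```

The cost is more doorbell writes per poll, in exchange for never
issuing an increment larger than 1.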

Thanks,
  Roland


