[PATCH] Re: [openib-general] Re: IPoIB Failure CQ overrun
Michael S. Tsirkin
mst at mellanox.co.il
Tue Dec 21 01:46:12 PST 2004
I'm a bit ill; I expect to work on it tomorrow.
Could you post the patch with these dumps?
> -----Original Message-----
> From: Roland Dreier [mailto:roland at topspin.com]
> Sent: Mon, December 20, 2004 7:56 PM
> To: Michael S. Tsirkin
> Cc: openib-general at openib.org
> Subject: Re: [PATCH] Re: [openib-general] Re: IPoIB Failure CQ overrun
>
>
> From adding some more dumping of CQ state, what _may_ be happening is
> that under rare conditions the HCA's CQ consumer index gets
> incremented by 1 too many. Then when the CQ is completely empty it
> will look full to the HW and we'll get an overrun for the next CQE.
> (I saw it happen after ~300K increments of the CQ's CI, ~160K of which
> were increments by more than 1.)
>
> I didn't see how the driver could be doing this, since the HCA ended
> up with a CI that was one more than the number of increments that the
> driver did. Also, converting all of the increment CI dbells to only
> increment by 1 fixes the problem, which is more evidence of a
> FW glitch.
>
> Thanks,
> Roland
>
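
To make the suspected failure mode concrete, here is a toy model of it in C.
It is only a sketch under assumptions: the ring size, the struct, the function
names and the "glitch" parameter are all hypothetical, and the full test
(one slot kept free, indices compared modulo the ring size) is a common ring
buffer convention, not something taken from the driver or the firmware.

```c
#include <stdbool.h>
#include <stdint.h>

#define CQ_SIZE 8u /* hypothetical ring size, must be a power of two */

/* Toy model of a CQ; field names are illustrative, not driver names. */
struct cq_model {
    uint32_t hw_pi; /* producer index: next slot the HCA would write */
    uint32_t hw_ci; /* consumer index as the HCA believes it to be */
    uint32_t sw_ci; /* consumer index as the driver counts it */
};

/* Assumed full test: the slot after PI is the one CI points at. */
static bool cq_looks_full(const struct cq_model *cq)
{
    return ((cq->hw_pi + 1) & (CQ_SIZE - 1)) == (cq->hw_ci & (CQ_SIZE - 1));
}

/* HCA tries to write one CQE; returns false on (apparent) overrun. */
static bool cq_post_cqe(struct cq_model *cq)
{
    if (cq_looks_full(cq))
        return false;
    cq->hw_pi++;
    return true;
}

/*
 * Driver consumes n CQEs and rings one increment-CI doorbell for n.
 * 'glitch' models the suspected extra increment on the HCA side.
 */
static void cq_consume(struct cq_model *cq, uint32_t n, uint32_t glitch)
{
    cq->sw_ci += n;
    cq->hw_ci += n + glitch;
}

/*
 * Post 3 CQEs, consume all 3 with a single doorbell, then try to post
 * one more. Returns true if that next CQE is reported as an overrun.
 */
static bool simulate_overrun(uint32_t glitch)
{
    struct cq_model cq = { 0, 0, 0 };

    for (int i = 0; i < 3; i++)
        cq_post_cqe(&cq);
    cq_consume(&cq, 3, glitch);

    /* Here sw_ci == hw_pi, so the driver sees an empty CQ. */
    return !cq_post_cqe(&cq);
}
```

In this model simulate_overrun(1) reports an overrun on a CQ that the driver
sees as empty, because the HCA's CI has ended up one ahead of its PI, while
simulate_overrun(0) does not.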