[PATCH] Re: [openib-general] Re: IPoIB Failure CQ overrun

Michael S. Tsirkin mst at mellanox.co.il
Tue Dec 21 01:46:12 PST 2004


I'm a bit ill; I expect to work on it tomorrow.
Could you post the patch with these dumps?


> -----Original Message-----
> From: Roland Dreier [mailto:roland at topspin.com]
> Sent: Mon, December 20, 2004 7:56 PM
> To: Michael S. Tsirkin
> Cc: openib-general at openib.org
> Subject: Re: [PATCH] Re: [openib-general] Re: IPoIB Failure CQ overrun
> 
> 
> From adding some more dumping of CQ state, what _may_ be happening is
> that under rare conditions the HCA's CQ consumer index gets
> incremented by 1 too many.  Then when the CQ is completely empty it
> will look full to the HW and we'll get an overrun for the next CQE.
> (I saw it happen after ~300K increments of the CQ's CI, ~160K of which
> were increments by more than 1.)
> 
> I didn't see how the driver could be doing this, since the HCA ended
> up with a CI that was one more than the number of increments that the
> driver did.  Also, converting all of the increment-CI doorbells to only
> increment by 1 fixes the problem, which is more evidence of a FW glitch.
> 
> Thanks,
>   Roland
> 
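
For anyone following along, here is a rough toy model of the failure mode
Roland describes above. It is only a sketch: the class, the queue size, and
the "one slot kept free" full/empty convention are assumptions made for
illustration, not taken from the driver or the firmware. It shows why a
consumer index that is one too far ahead makes a truly empty queue look full.

```python
# Toy model only: names and the full/empty convention are assumptions,
# not taken from the driver or the HCA firmware.

CQ_SIZE = 8  # hypothetical number of CQE slots


class ToyCQ:
    """Ring buffer with a producer index and the hardware's view of the CI."""

    def __init__(self, size=CQ_SIZE):
        self.size = size
        self.pi = 0      # producer index: next slot the hardware writes
        self.hw_ci = 0   # consumer index as the hardware believes it to be

    def looks_full(self):
        # Assumed convention: one slot is kept free, so the queue is "full"
        # when advancing the producer index would collide with the CI.
        return (self.pi + 1) % self.size == self.hw_ci

    def post_cqe(self):
        """Hardware writes one completion, or reports an overrun if full."""
        if self.looks_full():
            return "overrun"
        self.pi = (self.pi + 1) % self.size
        return "ok"

    def ring_doorbell(self, increment):
        """Driver tells the hardware how many CQEs it has consumed."""
        self.hw_ci = (self.hw_ci + increment) % self.size


def run(extra):
    """Post 3 CQEs, consume all 3, and let the hardware over-count by `extra`."""
    cq = ToyCQ()
    for _ in range(3):
        cq.post_cqe()
    # The driver asked for an increment of 3; `extra` models the suspected glitch.
    cq.ring_doorbell(3 + extra)
    # The queue is truly empty here, so the next CQE should always fit.
    return cq.post_cqe()


if __name__ == "__main__":
    print(run(extra=0))
    print(run(extra=1))
```

With no extra increment the next CQE is accepted; with the CI one too far
ahead, the very next CQE on an empty queue is flagged as an overrun, which
matches the symptom in the quoted report.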