[ofa-general] mlx4_core CQ overrun

Shirley Ma xma at us.ibm.com
Thu Jun 12 18:10:08 PDT 2008





"Roland Dreier" <roland.list at gmail.com> wrote on 06/12/2008 05:47:07 PM:

> > Has anybody seen an mlx4_core CQ overrun before? The test is based on
> > OFED-1.3. FW version is 2.3.0. Please let me know if any more info is needed.
>
> Yes, I've seen CQ overrun -- when a CQ is overrun...

Thanks for your prompt response. So is it not possible that this is a driver
or FW bug? We will recheck our test.

> > c955mgrs1:~ # dsh -av "grep 'CQ overrun' /var/log/messages" | sort
> > dsh: c955c2s1.ppd.pok.ibm.com Host is not responding. No command will be issued to this host
> > c955c1s11.ppd.pok.ibm.com: Jun 10 07:18:15 c955c1s11 kernel: mlx4_core 0003:01:00.0: CQ overrun on CQN 000098
>
> What test are you running to get this error?  My first guess would be
> a bug in the test that overruns a CQ.
>
>  - R.

It is a vendor-specific MPI stress test.
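
For rechecking the test, the condition usually behind a CQ overrun can be
sketched with a toy model: a completion arrives at a CQ that has no free
entry, typically because the CQ was created smaller than the number of
completions that can be outstanding on the queues attached to it, or because
the consumer polls too slowly. The sketch below is only an illustration under
those assumptions. ToyCQ, run, and all of the sizes are hypothetical names and
numbers; none of them come from the mlx4 driver, the verbs API, or the MPI
test in question.

```python
from collections import deque


class ToyCQ:
    """Toy model of a completion queue with a fixed number of entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()
        self.overrun = False

    def push_completion(self, wr_id):
        # Producer side: a completion arriving at a full CQ is an overrun.
        if len(self.entries) >= self.capacity:
            self.overrun = True
            return False
        self.entries.append(wr_id)
        return True

    def poll(self, max_entries):
        # Consumer side: drain up to max_entries completions.
        polled = []
        while self.entries and len(polled) < max_entries:
            polled.append(self.entries.popleft())
        return polled


def run(cq_capacity, queue_depths, poll_every, poll_batch):
    """Complete every posted work request, polling only every
    `poll_every` completions. Returns True if the CQ overran."""
    cq = ToyCQ(cq_capacity)
    for wr_id in range(sum(queue_depths)):
        cq.push_completion(wr_id)
        if (wr_id + 1) % poll_every == 0:
            cq.poll(poll_batch)
    return cq.overrun


# Two work queues of depth 64 share one CQ (hypothetical sizes).
depths = [64, 64]
undersized = run(cq_capacity=64, queue_depths=depths,
                 poll_every=128, poll_batch=128)
safe = run(cq_capacity=sum(depths), queue_depths=depths,
           poll_every=128, poll_batch=128)
print("CQ of 64 entries overran:", undersized)
print("CQ sized to the sum of queue depths overran:", safe)
```

In a real verbs application the corresponding check would be that the size
requested when the CQ is created covers the send and receive queue depths of
every QP that reports completions to that CQ.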

Thanks
Shirley