[ofa-general] proper way to recover from poll CQ failed error
Dotan Barak
dotanb at dev.mellanox.co.il
Thu Mar 13 01:05:28 PDT 2008
Hi.
The fact that ibv_poll_cq failed indicates that something bad happened.
Usually this failure should not cause any wider problem; only the process
that hit the error should be affected by it.
I personally think that the ib_* performance tools are a better way to
check the performance of your subnet.
I would be glad if you could answer the following questions:
Is this error consistent?
Can you please send me the output of ibv_devinfo from your machines?
Did you see any error messages in /var/log/messages when you hit
this error?
thanks
Dotan
Murray Smigel wrote:
> Hi,
> I am running OFED-3.0 using ConnectX adapters in a two machine direct
> connect mode.
> Most of the various pingpong tests seem ok, but when I run
> ibv_srq_pingpong -s 500 -n 1000
>
> I get "poll CQ failed -2" when I start up the client side. Smaller
> values of -s worked fine.
> Once this happens, no other pingpong tests seem to work.
> I have then unloaded all the ib_*, mlx4_* and iw_* modules, reloaded
> them, and things still
> fail. I have to reboot the machines to get things back.
>
> 1) Is there a cleaner way to recover from this situation?
> 2) Is the initial failure an indication that something else is wrong?
> 3) Is the -s 1 latency I see with ibv_rc_pingpong of ~7 microseconds
> reasonable?
>
> Thanks,
> murray smigel