[ofa-general] proper way to recover from poll CQ failed error
Murray Smigel
murray at tradeworx.com
Wed Mar 12 11:05:37 PDT 2008
Hi,
I am running OFED-3.0 with ConnectX adapters on two machines connected
directly to each other.
Most of the various pingpong tests seem ok, but when I run
ibv_srq_pingpong -s 500 -n 1000
I get "poll CQ failed -2" when I start up the client side. Smaller
values of -s work fine.
Once this happens, no other pingpong tests seem to work.
I then unloaded all the ib_*, mlx4_* and iw_* modules and reloaded them,
but things still fail. I have to reboot the machines to recover.
1) Is there a cleaner way to recover from this situation?
2) Is the initial failure an indication that something else is wrong?
3) Is the ~7 microsecond latency I see with ibv_rc_pingpong -s 1
reasonable?
Thanks,
murray smigel