[ofa-general] proper way to recover from poll CQ failed error

Murray Smigel murray at tradeworx.com
Wed Mar 12 11:05:37 PDT 2008


Hi,
I am running OFED-3.0 using ConnectX adapters in a two-machine
direct-connect configuration.
Most of the pingpong tests seem OK, but when I run
ibv_srq_pingpong -s 500 -n 1000

I get "poll CQ failed -2" when I start up the client side. Smaller
values of -s work fine.
Once this happens, no other pingpong tests work.

I then unloaded all the ib_*, mlx4_*, and iw_* modules and reloaded
them, but things still fail. I have to reboot the machines to recover.

1) Is there a cleaner way to recover from this situation?
2) Is the initial failure an indication that something else is wrong?
3) Is the latency of ~7 microseconds that I see with ibv_rc_pingpong -s 1
reasonable?

Thanks,
murray smigel




