[ewg] [perkinjo at cse.ohio-state.edu: ibv_open_xrc_domain error]
Jonathan Perkins
perkinjo at cse.ohio-state.edu
Fri Mar 8 06:41:18 PST 2013
I initially sent the forwarded message below to linux-rdma, but perhaps
it's better to ask this question here.
----- Forwarded message from Jonathan Perkins <perkinjo at cse.ohio-state.edu> -----
Date: Wed, 6 Mar 2013 14:29:50 -0500
From: Jonathan Perkins <perkinjo at cse.ohio-state.edu>
To: linux-rdma at vger.kernel.org
Subject: ibv_open_xrc_domain error
User-Agent: Mutt/1.5.21 (2010-09-15)
Dear list:
Recently we have experienced failures when calling ibv_open_xrc_domain,
which returns an invalid parameter error code. The failures started to
appear randomly after upgrading the kernel to 2.6.32-279.19.1.el6.x86_64,
and recovering seems to require rebooting the node. Whenever this happens
we notice that /var/log/messages contains many of the following messages:
Feb 26 20:56:07 magny4 kernel: mlx4_core 0000:02:00.0: mlx4_eq_int: MLX4_EVENT_TYPE_SRQ_LIMIT
Does anyone have any idea of what may be going wrong or how to debug
this issue?
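To help correlate these kernel messages with the time of each failure, a
minimal sketch like the following could count the SRQ_LIMIT lines per host,
PCI device, and hour. The syslog layout is assumed from the single sample
line quoted in this message, and the names count_srq_limit_events and report
are hypothetical helpers, not part of OFED or any existing tool.

```python
import re
from collections import Counter

# Pattern for the mlx4_core SRQ_LIMIT line. The layout
# "Mon DD HH:MM:SS host kernel: mlx4_core <pci>: mlx4_eq_int: ..."
# is an assumption based on the one sample line in this message.
SRQ_LIMIT_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+kernel:\s+mlx4_core\s+(?P<pci>\S+):\s+"
    r"mlx4_eq_int:\s+MLX4_EVENT_TYPE_SRQ_LIMIT"
)


def count_srq_limit_events(lines):
    """Count SRQ_LIMIT messages per (host, PCI device, hour bucket)."""
    counts = Counter()
    for line in lines:
        m = SRQ_LIMIT_RE.match(line)
        if m is None:
            continue  # not an SRQ_LIMIT line, or a different log layout
        hour_bucket = f"{m.group('month')} {m.group('day')} {m.group('time')[:2]}h"
        counts[(m.group("host"), m.group("pci"), hour_bucket)] += 1
    return counts


def report(path="/var/log/messages"):
    """Print one summary line per (host, device, hour) found in the log."""
    with open(path, errors="replace") as fh:
        for (host, pci, hour), n in sorted(count_srq_limit_events(fh).items()):
            print(f"{host} {pci} {hour}: {n}")
```

Calling report() on an affected node (reading the log usually needs root)
would show whether the messages cluster around the time
ibv_open_xrc_domain starts failing or are spread over a longer period.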
Also, we've noticed that there is no user-space XRC support in OFED-3.5.
Will this support be added back in a future release?
Below is some information about our setup.
OFED-1.5.4.1
RHEL6 (2.6.32-279.19.1.el6.x86_64)
We use many different platforms; here are two of them that show the
error.
Platform A:
CPU: AMD Magny-Cours
HCA: Mellanox ConnectX VPI MT26428
Platform B:
CPU: Intel Kentsfield
HCA: Mellanox ConnectX VPI MT25418
--
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo
----- End forwarded message -----
--
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo