[ofa-general] perfquery causes kernel to be stuck in ib_unregister_mad_agent() function
Jean-Francois.Neyroud
Jean-Francois.Neyroud at bull.net
Tue Apr 29 01:17:38 PDT 2008
If I attempt to query the performance counters on all nodes of a cluster
(40 nodes) at the same time, perfquery causes the kernel to get stuck in the
ib_unregister_mad_agent() function.
It is impossible to interrupt perfquery with CTRL-C or CTRL-Z; it is stuck in the kernel.
# pgrep perfquery
27578
# cat /proc/27578/wchan
ib_unregister_mad_agent
I see this problem with OFED-1.2.5 and 1.3, with both mthca and ConnectX HCAs;
it has not been tested with other HCAs or OFED versions.
Reproducer with 2 nodes and no switch:
# for i in `seq 1 100`; do perfquery >/dev/null 2>&1 & done
# pgrep perfquery | while read pid; do echo "$pid: `cat /proc/$pid/wchan`"; echo; done | dshbak -c
----------------
[14936,14938-15029]
----------------
0
----------------
----------------
----------------
14937
----------------
flush_cpu_workqueue
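For anyone without dshbak, roughly the same grouping by wait channel can be sketched with standard tools. This is only a sketch: the function name wchan_summary and the PROC_ROOT variable are made up here for illustration (PROC_ROOT just lets the function be pointed at a directory other than /proc); neither is part of OFED or the diagnostic tools.

```shell
# Count processes per kernel wait channel.
# Reads PIDs on stdin, one per line, and prints "<wchan> <count>" lines.
# PROC_ROOT is a hypothetical override used only for illustration; it defaults to /proc.
wchan_summary() {
    proc_root="${PROC_ROOT:-/proc}"
    while read -r pid; do
        # A process may have exited since the PID list was built, so skip missing entries.
        if [ -r "$proc_root/$pid/wchan" ]; then
            wchan=$(cat "$proc_root/$pid/wchan")
            # Treat an empty wchan like "0" (not waiting in the kernel).
            printf '%s\n' "${wchan:-0}"
        fi
    done | sort | uniq -c | awk '{ print $2, $1 }'
}
```

It would be used as `pgrep perfquery | wchan_summary`, which prints one line per wait channel followed by the number of perfquery processes currently in it.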
Has anyone seen this problem?
Jean-Francois.