[ofa-general] IB performance stats (revisited)
Mark Seger
Mark.Seger at hp.com
Wed Jun 27 06:17:36 PDT 2007
I posted something about this some time last year, but now I actually
have some data to present.
My problem with IB is that there is no efficient way to get
time-oriented performance numbers for all types of IB traffic. As far
as I know, nothing available covers all types of traffic, such as MPI.
This is further complicated because IB counters do not wrap; as a
result, when the counters are 32-bit integers they end up latching in <30
seconds under load. The only way I am aware of to do what I want
is to run perfquery AND then clear the counters after each
request, which by definition prevents anyone else from using the
counters, including multiple instances of my program.
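To put a rough number on the latching problem, here is a back-of-the-envelope sketch in Python. It assumes the data counters are 32 bits wide and tick once per 4 octets, which is my understanding of the IB port data counters. The function name and the traffic rates are made up for illustration and are not measurements.

```python
COUNTER_BITS = 32      # assumption: 32-bit counter that sticks at its maximum
OCTETS_PER_COUNT = 4   # assumption: data counters tick once per 4 octets


def seconds_to_saturate(bytes_per_sec, bits=COUNTER_BITS, unit=OCTETS_PER_COUNT):
    """Seconds for a counter starting at zero to reach its maximum value."""
    max_count = (1 << bits) - 1
    return max_count * unit / bytes_per_sec


if __name__ == "__main__":
    # Illustrative sustained rates in bytes per second (hypothetical).
    for label, rate in (("1 GB/s", 1e9), ("2 GB/s", 2e9)):
        print(f"{label}: counter latches after {seconds_to_saturate(rate):.1f} s")
```

Under these assumptions a sustained 1 GB/s fills the counter in well under 30 seconds, so any sampling scheme that does not clear the counters has to cope with latched values.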
To give people a better idea of what I'm talking about, below is an
extract from a utility I've written called 'collectl' which has been in
use on HP systems for about 4 years and which we've now Open Sourced at
http://sourceforge.net/projects/collectl [shameless plug]. In the
following sample I've requested cpu, network and IB stats (there are
actually a whole lot of other things you can examine and you can learn
more at http://collectl.sourceforge.net/index.html). Anyhow, what
you're seeing below is a sample taken every second. At first there is
no IB traffic. Then I start a 'netperf' and you can see the IB stats
jump. A few seconds later I do a 'ping -f -s50000' to the ib interface
and you can now see an increase in the network traffic.
#        <--------CPU--------><-----------Network----------><----------InfiniBand---------->
#Time    cpu sys  inter  ctxsw netKBi pkt-in  netKBo pkt-out   KBin pktIn  KBOut pktOut Errs
08:48:19   0   0   1046    137      0      4       0       2      0     0      0      0    0
08:48:20   2   2  18659    170      0     10       0       5    925 10767  80478  41636    0
08:48:21  14  14  92368   1882      0      9       1      10   3403 39599 463892 235588    0
08:48:22  14  14  92167   2243      0      8       0       4   3186 37081 471246 238743    0
08:48:23  12  12  92131   2382      0      3       0       2   4456 37323 470766 238488    0
08:48:24  13  13  91708   2691      7    106      12     104   7300 38542 466580 236450    0
08:48:25  14  14  91675   2763     11    175      20     175   7434 38417 463952 235146    0
08:48:26  13  13  91712   2716     11    174      20     175   7486 38464 465195 235767    0
08:48:27  14  14  91755   2742     11    171      19     171   7502 38656 465079 235720    0
08:48:28  13  13  90131   2126     12    178      20     179   8257 44080 424930 217067    0
08:48:29  13  13  89974   2389     13    191      22     191   7801 37094 457082 231523    0
Here's another display option, where you can see just the IPoIB traffic
along with the other network stats:
# NETWORK STATISTICS (/sec)
#        Num Name   InPck  InErr OutPck OutErr   Mult   ICmp   OCmp    IKB    OKB
09:04:51   0 lo:        0      0      0      0      0      0      0      0      0
09:04:51   1 eth0:     23      0      4      0      0      0      0      1      0
09:04:51   2 eth1:      0      0      0      0      0      0      0      0      0
09:04:51   3 ib0:     900      0    900      0      0      0      0   1775   1779
09:04:51   4 sit0:      0      0      0      0      0      0      0      0      0
09:04:52   0 lo:        0      0      0      0      0      0      0      0      0
09:04:52   1 eth0:    127      0    126      0      0      0      0      8     15
09:04:52   2 eth1:      0      0      0      0      0      0      0      0      0
09:04:52   3 ib0:    2275      0   2275      0      0      0      0   4488   4497
09:04:52   4 sit0:      0      0      0      0      0      0      0      0      0
While this is a relatively light-weight operation (collectl uses <0.1%
of the cpu), I still do have to call perfquery every second and that
does generate a little overhead. Furthermore, since I'm continuously
resetting the counters, multiple instances of my tool, or any other tool
that relies on these counters, won't work correctly!
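To illustrate why read-and-clear breaks down with more than one reader, here is a small Python simulation. It is only a sketch: ResetOnReadCounter and simulate are hypothetical names, and the counter is a plain in-memory stand-in, not a real HCA or a call to perfquery.

```python
class ResetOnReadCounter:
    """Hypothetical stand-in for an HCA counter that is cleared on every read."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n

    def read_and_clear(self):
        v, self.value = self.value, 0
        return v


def simulate(traffic_per_tick, ticks):
    """Two tools poll the same counter; return (seen_by_a, seen_by_b, true_total)."""
    counter = ResetOnReadCounter()
    seen_a = seen_b = total = 0
    for _ in range(ticks):
        first_half = traffic_per_tick // 2
        counter.add(first_half)                     # traffic before tool A polls
        seen_a += counter.read_and_clear()
        counter.add(traffic_per_tick - first_half)  # traffic before tool B polls
        seen_b += counter.read_and_clear()
        total += traffic_per_tick
    return seen_a, seen_b, total


if __name__ == "__main__":
    a, b, total = simulate(traffic_per_tick=100, ticks=4)
    print(f"tool A saw {a}, tool B saw {b}, actual traffic {total}")
```

Each tool only sees whatever arrived since the other one last cleared the counter, so neither reports the real total.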
One solution that had been implemented in the Voltaire stack worked
quite well: a loadable module that read/cleared the HCA
counters but exported them as wrapping counters in /proc. That way
utilities could access the counters in /proc without stepping on each
other's toes. While this is still not the best solution, as long as the
counters don't wrap in the HCA, read/clear is the only way to do what I'm
trying to do, unless of course someone has a better solution. I also
realize that with 64-bit counters this becomes a non-issue, but I'm trying
to solve the more general case.
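As a sketch of the wrapping-counter idea, the Python below shows a single owner folding each read-and-cleared delta into a running total that wraps, while any number of readers compute their own deltas from it. I don't know how the Voltaire module was actually written; the names WrappingCounter and delta_since and the 64-bit width are assumptions for illustration only.

```python
MASK64 = (1 << 64) - 1  # assumption: exported totals are 64 bits wide


class WrappingCounter:
    """Running total fed by read-and-cleared deltas; wraps instead of latching."""

    def __init__(self, start=0):
        self.total = start & MASK64

    def absorb(self, delta):
        # Only the single owner (the module, in the scheme above) calls this.
        self.total = (self.total + delta) & MASK64


def delta_since(prev, curr, mask=MASK64):
    """Change between two samples of a wrapping counter, correct across one wrap."""
    return (curr - prev) & mask


if __name__ == "__main__":
    exported = WrappingCounter(start=MASK64 - 5)  # start near the top to force a wrap
    reader_a = reader_b = exported.total          # each reader keeps its own snapshot
    exported.absorb(10)
    print(delta_since(reader_a, exported.total), delta_since(reader_b, exported.total))
```

Because readers never clear anything, each one just remembers its previous sample and subtracts, so they no longer interfere with each other.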
comments? flames? 8-)
-mark