[openib-general] performance counters in /sys
Mark Seger
Mark.Seger at hp.com
Thu May 19 13:11:48 PDT 2005
I've looked through the archives for more details on this topic and
while there were some interesting discussions during Sept 2004, none of
them really seemed to touch on this specific topic - programmatic access
to performance counters. If this has already been discussed elsewhere a
pointer would be appreciated.
The current location for the IB counters is:
/sys/class/infiniband/mthca0/ports/1/counters
and within that directory there are 16 counters. If I want to
programmatically access all the counters for all the hcas, this means I
have to go to 2 directories (one per port) per hca and then open/read 16
files in each. If I want to do that with any kind of frequency, say
once every second or so, that's a lot of extra work that I'd like to cut
down on.
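
To make the cost concrete, here is a minimal Python sketch of what a collector has to do today: one open/read per counter file, for every port of every hca. The function name read_all_counters and the root parameter are illustrative assumptions, and the sketch simply skips any file that cannot be read or does not contain a plain integer.

```python
import glob
import os


def read_all_counters(root="/sys/class/infiniband"):
    """Return {(hca, port): {counter_name: value}}, reading one file per counter."""
    result = {}
    # Matches e.g. /sys/class/infiniband/mthca0/ports/1/counters
    pattern = os.path.join(root, "*", "ports", "*", "counters")
    for counters_dir in sorted(glob.glob(pattern)):
        parts = counters_dir.split(os.sep)
        hca, port = parts[-4], parts[-2]
        values = {}
        for name in sorted(os.listdir(counters_dir)):
            try:
                # One open/read/close per counter: 16 of these per port.
                with open(os.path.join(counters_dir, name)) as f:
                    values[name] = int(f.read().strip())
            except (OSError, ValueError):
                continue  # unreadable or non-numeric entry; skipped in this sketch
        result[(hca, port)] = values
    return result


if __name__ == "__main__":
    for (hca, port), values in read_all_counters().items():
        print(hca, port, len(values), "counters")
```

With two ports per hca and 16 counters per port, that is 32 file opens per hca on every sampling interval.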
I believe there are probably several ways people may want to
look at the data. For a human interface, something in the style of
/proc/meminfo would at least reduce the number of reads by a
factor of 16, which is certainly better. However, programmatically
I would still need to match each line in the file with the
particular item of data I'm interested in. That means comparing
each line's label to known values, which adds overhead,
since I potentially have to do over 100 compares for each set of 16
counters.
Better yet, a format such as /proc/net/dev shows one line per network
interface followed by all the counters, space separated for human
consumption. This can be argued to be better, but as the number of
counters grows, scaling becomes an issue: when cat'ing the file,
lines wrap and readability becomes difficult if not impossible.
A yet more compact format is that of /proc/pid/stat, in which all the
fields are simply separated by a single space, with no attempt to make
the output readable by anything other than a program. Personally that
would be my vote.
And finally there's the notion of having a version number associated
with the data, as is done with /proc/slabinfo, so that if the format
changes the consumer will know what to do with the new one.
So based on all that I'd ask the question of what people would think of
adding a new structure, perhaps under the directory
/sys/class/infiniband (maintaining the old for those who feel it is more
suitable for human consumption) that looks something like:
Vx.y
hca0 1 off ctr1 ctr2 ctr3 ...
hca0 2 on ctr1 ctr2 ctr3 ...
hca1 1 on ctr1 ctr2 ctr3 ...
hca1 2 off ctr1 ctr2 ctr3 ...
in which 2 ports are on and 2 are off, and each line ends with the 16
counters associated with that port.
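
As a rough illustration of how a consumer might read the layout above, here is a minimal Python sketch. It assumes the first line is "V" followed by major.minor, and that every later line is hca name, port number, on/off, then integer counters. The names PortSample and parse_counters are hypothetical; nothing here is an existing interface.

```python
from typing import NamedTuple


class PortSample(NamedTuple):
    hca: str
    port: int
    active: bool
    counters: list


def parse_counters(text):
    """Parse the proposed format into ((major, minor), [PortSample, ...])."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("V"):
        raise ValueError("missing version line")
    # Version line, e.g. "V1.0"; lets the consumer detect format changes.
    major, minor = (int(p) for p in lines[0][1:].split("."))
    samples = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line; skipped in this sketch
        hca, port, state = fields[:3]
        # Counters are positional, so no label comparisons are needed.
        counters = [int(v) for v in fields[3:]]
        samples.append(PortSample(hca, int(port), state == "on", counters))
    return (major, minor), samples
```

One read and one split per line replaces 16 opens per port, and the version line tells the consumer which positional layout to expect.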
The only other thing that could be useful would be an extra field for
the protocol, such that for a given interface/port, I could see the
traffic counters for each type of protocol that one might choose to
support, such as mpi, portals, etc.
Personally, I think one size (programmatic vs human readable) doesn't
fit all, although one could argue the need for both and I certainly
wouldn't object. However, if I had to pick only one I'd pick
programmatic, since counters can be extremely difficult to interpret when
they're continually increasing and I for one can't do subtraction of
large numbers that quickly in my head. 8-)
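
The subtraction is trivial for a program. A minimal Python sketch, assuming two samples of the same port's counters taken interval_s seconds apart, with no counter wrapping or being reset in between (the name counter_rates is hypothetical):

```python
def counter_rates(prev, curr, interval_s):
    """Per-second rate for each counter between two samples of one port."""
    # Assumes prev and curr list the same counters in the same order.
    return [(c - p) / interval_s for p, c in zip(prev, curr)]
```

For example, counter_rates([100, 200], [150, 260], 2.0) gives [25.0, 30.0].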
Comments? Opinions?
-mark