[openib-general] performance counters in /sys

Mark Seger Mark.Seger at hp.com
Thu May 19 13:11:48 PDT 2005


I've looked through the archives for more details on this topic and 
while there were some interesting discussions during Sept 2004, none of 
them really seemed to touch on this specific topic - programmatic access 
to performance counters.  If this has already been discussed elsewhere a 
pointer would be appreciated.

The current location for the IB counters is:

/sys/class/infiniband/mthca0/ports/1/counters

and within that directory there are 16 counters.  If I want to 
programmatically access all the counters for all the HCAs, I have to 
visit 2 directories (one per port) per HCA and then open and read 16 
files in each.  If I want to do that with any kind of frequency, say 
once every second or so, that's a lot of extra work that I'd like to cut 
down on.
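
To make the cost concrete, here is a rough sketch of what a collector 
has to do with the current layout.  The function name and the shape of 
the returned dictionary are my own illustration, not an existing tool, 
and it assumes every file under counters/ holds a single integer.

```python
import glob
import os


def read_ib_counters(root="/sys/class/infiniband"):
    """Return {(hca, port): {counter_name: value}}, one file read per counter.

    Hypothetical helper: assumes the <hca>/ports/<port>/counters/<name>
    layout described above, with one integer per file.
    """
    result = {}
    pattern = os.path.join(root, "*", "ports", "*", "counters", "*")
    for path in sorted(glob.glob(pattern)):
        parts = path.split(os.sep)
        # .../<hca>/ports/<port>/counters/<name>
        hca, port, name = parts[-5], parts[-3], parts[-1]
        try:
            with open(path) as f:
                value = int(f.read().strip())
        except (OSError, ValueError):
            continue  # skip unreadable or non-numeric entries
        result.setdefault((hca, port), {})[name] = value
    return result
```

Every sample costs one open/read/close per counter, so 16 per port, and 
that repeats for every port of every HCA on each polling interval.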

I believe there are probably several scenarios for how people may want 
to look at the data.  For a human interface, I would think something in 
the style of /proc/meminfo would at least reduce the number of reads by 
a factor of 16, which is certainly better.  However, programmatically I 
still need to match each line in the file with the particular item of 
data I'm interested in, which means comparing each line's label to 
known values.  That takes some extra overhead, since I potentially have 
to do over 100 compares for each set of 16 counters.

Better yet, a format such as /proc/net/dev shows one line per network 
interface followed by all the counters, space separated for human 
consumption.  This can be argued to be better, but as the number of 
counters grows, scaling becomes an issue: when cat'ing the file you get 
into word wrap, and readability becomes difficult if not impossible.

A yet more compact format is that of /proc/pid/stat, in which all the 
fields are simply separated by a single space, with no attempt to make 
the output readable by anything other than a program.  Personally that 
would be my vote.

And finally there's the notion of having a version number associated 
with the data in case it changes so the consumer will know what to do 
with the new format as is done with /proc/slabinfo.

So based on all that I'd ask the question of what people would think of 
adding a new structure, perhaps under the directory 
/sys/class/infiniband (maintaining the old for those who feel it is more 
suitable for human consumption) that looks something like:

Vx.y
hca0 1 off ctr1 ctr2 ctr3 ...
hca0 2 on ctr1 ctr2 ctr3 ...
hca1 1 on ctr1 ctr2 ctr3 ...
hca1 2 off ctr1 ctr2 ctr3 ...

in which 2 ports are on and 2 off, each followed by the 16 counters 
associated with it.
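
As a sketch of how a consumer might read such a file, assuming exactly 
the layout shown above (a "V" version line, then one line per port with 
HCA name, port number, on/off state and integer counters).  The function 
name and return shape are hypothetical, and the file itself does not 
exist today.

```python
def parse_counter_file(text):
    """Parse the proposed single-file format into (version, ports).

    ports maps (hca, port_number) -> (is_on, [counter values]).
    Hypothetical parser for the layout proposed above.
    """
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("V"):
        raise ValueError("missing version line")
    version = lines[0][1:]  # e.g. "1.0" from "V1.0"
    ports = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        hca, port, state = fields[:3]
        counters = [int(x) for x in fields[3:]]
        ports[(hca, int(port))] = (state == "on", counters)
    return version, ports
```

One read and one split per line picks up every counter by position, with 
no label compares, and the version line lets the consumer decide whether 
it knows the field order before trusting the positions.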

The only other thing that could be useful would be an extra field for 
the protocol, such that for a given interface/port, I could see the 
traffic counters for each type of protocol that one might choose to 
support, such as mpi, portals, etc.

Personally, I think one size (programmatic vs human readable) doesn't 
fit all, although one could argue the need for both and I certainly 
wouldn't object.  However, if I had to pick only one I'd pick 
programmatic, since counters can be extremely difficult to interpret 
when they're continually increasing, and I for one can't do subtraction 
of large numbers that quickly in my head.  8-)
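
The subtraction is the part a program does trivially.  A minimal sketch, 
assuming two samples taken as {name: value} dictionaries a known number 
of seconds apart; the names are illustrative, and it deliberately 
ignores any counter wrap or reset between samples.

```python
def counter_rates(prev, curr, interval_s):
    """Return per-second rates for counters present in both samples.

    Hypothetical helper: plain subtraction, no handling of counter
    wrap or reset between the two samples.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return {
        name: (curr[name] - prev[name]) / interval_s
        for name in curr
        if name in prev
    }
```

For example, counter_rates({"ctr1": 1000000000}, {"ctr1": 1000004096}, 
2.0) gives {"ctr1": 2048.0}.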

comments?  opinions?

-mark



