[openib-general] sysfs exposure of port counters useless?

Hal Rosenstock halr at voltaire.com
Tue Oct 17 20:49:13 PDT 2006


On Tue, 2006-10-17 at 20:18, Scott Weitzenkamp (sweitzen) wrote:
> I agree the 32-bit byte and packet counters are useless as they get
> pegged in a few seconds on a busy IB network.  I thought there was an
> effort in IBTA to fix this.

The fix, at least in terms of the spec, has been there for a while.
PortCountersExtended (64-bit data and packet counters) is in the 1.2 spec,
but not all hardware/PMAs support it (it is optional).

> For IB counters in a Cisco switch, we read and reset the 32-bit counters
> once per second and keep 64-bit counters internally.
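
A minimal sketch of the read-and-reset strategy described above, in Python.
This is not Cisco's code: the class name and the read_and_reset callable
are hypothetical placeholders for whatever actually queries and clears the
PMA counter. A pegged sample is still added, but it is flagged so the
total can be treated as a lower bound.

```python
import time

PEG_32 = 0xFFFFFFFF           # 32-bit PMA counters saturate ("peg") at all 1s
MASK_64 = 0xFFFFFFFFFFFFFFFF  # keep the software total within 64 bits


class Counter64:
    """Software 64-bit total built from a read-and-reset 32-bit counter."""

    def __init__(self):
        self.total = 0
        self.pegged_samples = 0  # samples in which the hardware counter had pegged

    def add_sample(self, raw):
        """Fold one 32-bit reading (taken just before a reset) into the total."""
        if not 0 <= raw <= PEG_32:
            raise ValueError("not a 32-bit counter value")
        if raw == PEG_32:
            # Pegged: the true count is at least this much, so the
            # accumulated total is only a lower bound from here on.
            self.pegged_samples += 1
        self.total = (self.total + raw) & MASK_64
        return self.total


def poll(read_and_reset, counter, samples, interval=1.0):
    """Call the hypothetical read_and_reset() once per interval and accumulate."""
    for _ in range(samples):
        counter.add_sample(read_and_reset())
        time.sleep(interval)
    return counter.total
```

Anything counted between the read and the reset is lost unless the two
happen atomically, so this is an approximation at best.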

32-bit byte counters can be pegged in only about 16 seconds on a 4x SDR
link, and there are 4x DDR links now (8 seconds) and 12x links (5 seconds),
so that strategy is inaccurate on busy networks.
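
For anyone who wants to check the arithmetic, here is a rough sketch in
Python. It assumes the data counters count 32-bit (4 byte) blocks, that an
SDR lane carries 2 Gb/s of data after 8b/10b encoding (2.5 Gb/s signaling)
and a DDR lane 4 Gb/s, and that the link runs flat out at full wire rate.
The function name is just for illustration.

```python
COUNTER_MAX = 2**32 - 1   # a 32-bit counter pegs at all 1s
BYTES_PER_UNIT = 4        # data counters count 32-bit blocks

# Assumed per-lane data rate in Gb/s after 8b/10b encoding.
LANE_DATA_RATE_GBPS = {"SDR": 2.0, "DDR": 4.0}


def seconds_to_peg(width, speed):
    """Seconds for a data counter to peg on a fully loaded link.

    width is the lane count (4 for 4x, 12 for 12x); speed is "SDR" or "DDR".
    """
    bytes_per_sec = width * LANE_DATA_RATE_GBPS[speed] * 1e9 / 8
    return COUNTER_MAX * BYTES_PER_UNIT / bytes_per_sec


for width, speed in [(4, "SDR"), (4, "DDR"), (12, "SDR")]:
    print(f"{width}x {speed}: {seconds_to_peg(width, speed):.1f} s")
```

Under these assumptions the results land slightly above the rounded 16, 8
and 5 second figures quoted above (12x is taken as SDR here).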

> This would be possible in OF too, right?

This is part of a performance manager (which is part of fabric
management) and is not standardized (specific to each fabric management
offering). Most offer this manager as part of their solution.

OpenSM will be adding a performance manager in the not too distant future.
An RFC will initially be published on this list, and I look forward to
comments since this seems to be an area of interest.

-- Hal

> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>  
> 
> > -----Original Message-----
> > From: openib-general-bounces at openib.org 
> > [mailto:openib-general-bounces at openib.org] On Behalf Of Michael Newton
> > Sent: Tuesday, October 17, 2006 5:10 PM
> > To: Hal Rosenstock
> > Cc: openib-general at openib.org
> > Subject: Re: [openib-general] sysfs exposure of port counters useless?
> > 
> > On Tue, 17 Oct 2006, Hal Rosenstock wrote:
> > > On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote:
> > > > > From: Michael Newton
> > > > > Sent: Tuesday, October 17, 2006 3:02 AM
> > > > > To: openib-general at openib.org
> > > > > Subject: [openib-general] sysfs exposure of port counters useless?
> > > > >
> > > > >
> > > > > These are 32 bit counters. The rcv/xmit_data counters count 32-bit
> > > > > blocks. Also, these counts do not wrap: they peg at all 1s.
> > > > > At infiniband speeds, these counts can peg out very quickly indeed,
> > > > > to the point they can really only be of use if they can be reset each
> > > > > time they're read. Now if anyone who wants to use them has to go to
> > > > > the CLI to reset them, and there's little point in reading them
> > > > > without reset, why would anyone read them via sysfs? So why have them?
> > > > >
> > > >
> > > > We have found that while your comment is true for the data movement
> > > > counters, the error counters should not peg quickly, hence it is valid
> > 
> > it's true I overstated the case just a little;) .. yes error counters
> > should be fine and it's mainly the data counters that are problematic
> > (tho now I'm not sure I haven't seen the packet counters freeze when the
> > data ones peg out)..
> > 
> > > > to read them without resetting.  However it is also useful to have an
> > > > ability to reset them.  Of course if there are other CLI commands which
> > > > do this easily, the sysfs info is of less value.
> > >
> > > There are diag tools for this.
> > 
> > that's where we started.. the point I'm making is that exposing the data
> > counters in sysfs is of little use, because if you have to go to other
> > tools to reset, why wouldn't you use them to read as well?
> > 
> > I was looking at exposing infiniband stats via PCP
> > (http://oss.sgi.com/projects/pcp/). This would be useful for folk doing IB
> > performance testing. It's very easy to just feed in the sysfs values..
> > unfortunately they turn out to be of little value. Life would be so much
> > easier if there were 64 bit counters available. Instead I will probably
> > need to have an additional daemon to construct them.