[ofw] Programatically checking Infiniband performance counters

Hal Rosenstock hrosenstock at xsigo.com
Tue May 6 05:05:45 PDT 2008


On Tue, 2008-05-06 at 14:37 +0300, Moshe Haim wrote:
> > > Looking at the ib_port_counters_t structure I see several fields.
> > > 
> > > The port_xmit_data looks like the transferred data in bytes divided
> > > by 4 - however, it is only 32 bits so it fills up very fast (after
> > > 16GB).
> >
> > Yes; so either the polling needs to account for this or use optional
> > extended 64 bit counters if they are supported.
> 
> Could you please elaborate on the extended 64 bit counters? 

See IBA 1.2.1, vol. 1, section 16.1 (Performance Management),
specifically 16.1.4.11.

> How can I tell if they are supported and how can I access them?

They're attribute ID 0x001D (PortCountersExtended) rather than 0x0012
(PortCounters). There are a number of ways to tell this but I suspect
they're not supported on any Windows supported HCAs I'm aware of so this
may not be of much use :-(
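
To make the 16GB figure concrete, here is a minimal sketch in C of the
arithmetic on the 32 bit PortXmitData value. The attribute IDs are the ones
cited above; the helper names are hypothetical and are not part of WinOF or
vstat.

```c
#include <assert.h>
#include <stdint.h>

/* Performance Management attribute IDs, as cited in this thread. */
#define PM_ATTR_PORT_COUNTERS      0x0012u  /* PortCounters */
#define PM_ATTR_PORT_COUNTERS_EXT  0x001Du  /* PortCountersExtended */

/* Hypothetical helper: PortXmitData counts 4-octet units, so scale by 4. */
static uint64_t xmit_data_to_bytes(uint32_t port_xmit_data)
{
    return (uint64_t)port_xmit_data * 4u;
}

/*
 * Hypothetical helper: seconds until a 32 bit data counter that starts
 * at zero reaches its maximum at a steady rate of bytes_per_sec.
 */
static double seconds_to_limit(double bytes_per_sec)
{
    assert(bytes_per_sec > 0.0);
    return (double)xmit_data_to_bytes(UINT32_MAX) / bytes_per_sec;
}
```

For example, seconds_to_limit(1e9) gives the time to reach the limit at a
steady 1 GB/s, which is a rough upper bound on how long a poller can wait
between reads of the 32 bit counter.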

> > > 2. How can I reset the counters to 0? I tried using ib_local_mad
> > > with setting the port_xmit_pkts/port_xmit_data to 0 and use
> > > mad->method = IB_MAD_METHOD_SET but that doesn't seem to work...
> > > I need this since I expect to run for long periods of time and even
> > > the port_xmit_pkts will hit its limit.
> >
> > Did you set PortSelect and CounterSelect ?
> 
> I am basically using the code inside vstat_main.c in the function
> vstat_get_counters - so PortSelect and CounterSelect are the same
> values (portnum and 0xFF respectively), I only changed the mad_in->method
> from IB_MAD_METHOD_GET to IB_MAD_METHOD_SET.

CounterSelect is 16 bits wide rather than 8, so if you want to reset all
the counters, it should be 0xffff. Also, what's the value of portnum?
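
A minimal sketch of what that looks like in the attribute data, assuming the
PortCounters layout from the IBA spec (one reserved byte, PortSelect, then a
16 bit big-endian CounterSelect). The helper name and the bit macros are
hypothetical, not taken from vstat_main.c, and the bit positions are my
reading of the spec's PortCounters table, so please verify them there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed CounterSelect bit positions for the four data counters. */
#define PM_SEL_XMIT_DATA  (1u << 12)
#define PM_SEL_RCV_DATA   (1u << 13)
#define PM_SEL_XMIT_PKTS  (1u << 14)
#define PM_SEL_RCV_PKTS   (1u << 15)
#define PM_SEL_ALL        0xFFFFu

/*
 * Hypothetical helper: prepare PortCounters attribute data for a Set that
 * writes zero into every selected counter. `attr` points at the start of
 * the attribute data inside the MAD, `len` is its size in bytes.
 */
static void pm_fill_reset(uint8_t *attr, size_t len,
                          uint8_t port, uint16_t counter_select)
{
    assert(attr != NULL && len >= 4);
    memset(attr, 0, len);                      /* new counter values: 0 */
    attr[1] = port;                            /* PortSelect */
    attr[2] = (uint8_t)(counter_select >> 8);  /* CounterSelect, high byte */
    attr[3] = (uint8_t)(counter_select & 0xFFu);
}
```

Under that assumed bit layout, a CounterSelect of 0xFF only selects bits 0
through 7, so the data and packet counters would be left untouched even if
the Set succeeded.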

> I get the following error: ib_local_mad failed with status = 42.

I'm not familiar with the Windows code so I have no clue what status 42
means.

> > > Last but not least, is there any plan to add performance counters in
> > > windows for WinOF in the future?
> > 
> > In general, there can only be one performance manager owning a node's
> > performance counters (assuming they are going to do resets as above).
> > PerfMgrs come bundled in a number of different SMs. A more recent
> > OpenSM than what is current in Windows has this as do most if not
> > all vendor SMs.
> 
> I have WinOF 1.0.1 installed with the OpenSM service started. 
> So if I understand correctly the version that I have does not manage the
> performance counters.

Not AFAIK.

> According to what you say I also understand that future versions of the
> OpenSM will have Performance Manager that manages the performance
> counters - am I correct? 

It depends on what the Windows community and Windows OpenSM maintainer
decide relative to the PerfMgr. It could include this as the OpenSM
common base currently supports this.

> If so in which version of WinOF is this planned for?

I don't know the WinOF plan to update OpenSM. You'll have to ask others
on this list.

> Also, when talking about bundled performance managers - do they
> automatically reset the counters if they hit the limit,

They attempt to reset them some threshold prior to hitting the limits.

> do I interact with them in order to get/set values for the performance counters?

The PerfMgr exports the accumulated values for those counters on a
per-node basis.

-- Hal

> Thanks,
> Moshe.
> 
> -----Original Message-----
> From: Hal Rosenstock [mailto:hrosenstock at xsigo.com] 
> Sent: Sunday, May 04, 2008 7:17 PM
> To: Moshe Haim
> Cc: Tzachi Dar; ofw at lists.openfabrics.org
> Subject: RE: [ofw] Programatically checking Infiniband performance
> counters
> 
> Moshe,
> 
> On Sun, 2008-05-04 at 15:46 +0300, Moshe Haim wrote:
> > Thanks for the quick reply.
> > 
> >  
> > 
> > I have a few additional questions:
> > 
> > Looking at the ib_port_counters_t structure I see several fields.
> > 
> > The port_xmit_data looks like the transferred data in bytes divided by
> > 4 - however, it is only 32 bits so it fills up very fast (after 16GB).
> 
> Yes; so either the polling needs to account for this or use optional
> extended 64 bit counters if they are supported.
> 
> > The port_xmit_pkts looks like transferred data in 1KB packets
> 
> Just a packet count; nothing related to packet size.
> 
> >  - so it fills up more slowly:
> 
> Yes.
> 
> > 1. Can I rely on the packet size being 1KB - if not where can I find
> > the packet size definition. 
> 
> No; the packets are all different sizes depending on the protocols being
> used.
> 
> > 2. How can I reset the counters to 0? I tried using ib_local_mad with
> > setting the port_xmit_pkts/port_xmit_data to 0 and use mad->method =
> > IB_MAD_METHOD_SET but that doesn't seem to work...
> > I need this since I expect to run for long periods of time and even
> > the port_xmit_pkts will hit its limit.
> 
> Did you set PortSelect and CounterSelect ?
> 
> > Last but not least, is there any plan to add performance counters in
> > windows for WinOF in the future?
> 
> In general, there can only be one performance manager owning a node's
> performance counters (assuming they are going to do resets as above).
> PerfMgrs come bundled in a number of different SMs. A more recent OpenSM
> than what is current in Windows has this as do most if not all vendor
> SMs.
> 
> -- Hal
> 
> > Thanks,
> > 
> > Moshe.
> > 
> >  
> > 
> >                                    
> > ______________________________________________________________________
> > 
> > From: Tzachi Dar [mailto:tzachid at mellanox.co.il] 
> > Sent: Sunday, April 27, 2008 5:51 PM
> > To: Moshe Haim; ofw at lists.openfabrics.org
> > Subject: RE: [ofw] Programatically checking Infiniband performance
> > counters
> > 
> > 
> >  
> > 
> > Using performance counters currently only allows one to get WSD data.
> > Since you are working on XP that won't help you. You can use the same
> > way that vstat is doing to get that data.
> > 
> > 
> > In other words, you need to implement the function
> > vstat_get_counters(), which uses a local MAD to get that data.
> > 
> > 
> >  
> > 
> > 
> > You can find this function in the file vstat_main.c (lines 343 - 393)
> > 
> > 
> >  
> > 
> > 
> > Thanks
> > 
> > 
> > Tzachi
> > 
> > 
> >          
> >         
> >                                        
> >         ______________________________________________________________
> >         
> >         From: ofw-bounces at lists.openfabrics.org
> >         [mailto:ofw-bounces at lists.openfabrics.org]
> >         On Behalf Of Moshe Haim
> >         Sent: Sunday, April 27, 2008 5:09 PM
> >         To: ofw at lists.openfabrics.org
> >         Subject: [ofw] Programatically checking Infiniband performance
> >         counters
> >         
> >         Hi,
> >         
> >          
> >         
> >         I am working with Infiniband on Windows XP using the
> >         WinOF_1.0.1 package.
> >         
> >         I need to be able to check the Infiniband performance counters
> >         in order to estimate traffic over the Infiniband.
> >         
> >          
> >         
> >         I have seen that using vstat -c the counter information
> >         appears.
> >         
> >         How can I retrieve it not using vstat.exe? Are there any
> >         performance counters I can check (for instance select them in
> >         perfmon.exe?), or any other APIs?
> >         
> >          
> >         
> >         Thanks,
> >         
> >         Moshe.
> >         
> >          
> >         
> > _______________________________________________
> > ofw mailing list
> > ofw at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw



