[ofw] Programatically checking Infiniband performance counters

Ira Weiny weiny2 at llnl.gov
Fri May 16 13:42:55 PDT 2008


I did not follow this thread completely, but I just submitted a patch to Sasha
for the master and OFED 1.3 branches that includes a Performance Manager HOWTO.
He has not accepted the patch yet, so if you want a copy I can send it to you
directly.

Furthermore, there is nothing I know of which would prevent the Performance
Manager from running on a Windows box.  I am pretty sure all "compatibility"
libs are in use.  However, the PerfMgr is still experimental so YMMV.  :-D

Hope this helps,
Ira


On Tue, 06 May 2008 05:05:45 -0700
Hal Rosenstock <hrosenstock at xsigo.com> wrote:

> On Tue, 2008-05-06 at 14:37 +0300, Moshe Haim wrote:
> > > > Looking at the ib_port_counters_t structure I see several fields.
> > > > 
> > > > The port_xmit_data field looks like the transferred data in bytes divided
> > > > by 4 - however, it is only 32 bit so it fills up very fast (after 16GB).
> > >
> > > Yes; so either the polling needs to account for this or use optional
> > > extended 64 bit counters if they are supported.
> > 
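The 16GB figure above follows from a 32-bit counter that counts 4-byte
units. A minimal sketch of that arithmetic in C, assuming the unit size
described in the thread; the helper names are hypothetical and are not
part of IBAL or any WinOF header:

```c
#include <assert.h>
#include <stdint.h>

/* port_xmit_data counts 4-byte units, as described in the thread. */
#define XMIT_DATA_UNIT_BYTES 4u

/* Bytes represented by a raw 32-bit port_xmit_data value. */
uint64_t xmit_data_to_bytes(uint32_t raw)
{
    return (uint64_t)raw * XMIT_DATA_UNIT_BYTES;
}

/* Seconds until a zeroed 32-bit counter reaches its maximum at a steady
 * transmit rate given in bytes per second (hypothetical helper). */
double seconds_to_fill(double bytes_per_second)
{
    assert(bytes_per_second > 0.0);
    return (double)xmit_data_to_bytes(UINT32_MAX) / bytes_per_second;
}
```

At a sustained 1e9 bytes per second this gives roughly 17 seconds, so
any polling interval would need to be comfortably shorter than that at
such rates.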
> > Could you please elaborate on the extended 64 bit counters? 
> 
> See IBA 1.2.1 vol 1, 16.1 Performance Management, specifically 16.1.4.11.
> 
> > How can I tell if they are supported and how can I access them?
> 
> They're attribute ID 0x001D (PortCountersExtended) rather than 0x0012
> (PortCounters). There are a number of ways to tell this but I suspect
> they're not supported on any Windows supported HCAs I'm aware of so this
> may not be of much use :-(
> 
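The two attribute IDs Hal mentions can be captured as constants. The
sketch below only chooses between them; how support for the extended
counters is detected is not shown, and the function names are
hypothetical rather than taken from any WinOF or OpenSM header:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Attribute IDs as quoted in the thread. */
#define PM_ATTR_PORT_COUNTERS     0x0012u
#define PM_ATTR_PORT_COUNTERS_EXT 0x001Du

/* Pick the attribute to query. The caller decides whether the extended
 * counters are supported; that detection is outside this sketch. */
uint16_t pm_pick_counters_attr(bool ext_supported)
{
    return ext_supported ? PM_ATTR_PORT_COUNTERS_EXT : PM_ATTR_PORT_COUNTERS;
}

/* Width in bits of the data counters behind each attribute:
 * 32 for PortCounters, 64 for PortCountersExtended. */
unsigned pm_data_counter_bits(uint16_t attr_id)
{
    assert(attr_id == PM_ATTR_PORT_COUNTERS ||
           attr_id == PM_ATTR_PORT_COUNTERS_EXT);
    return attr_id == PM_ATTR_PORT_COUNTERS_EXT ? 64u : 32u;
}
```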
> > > > 2. How can I reset the counters to 0? I tried using ib_local_mad with
> > > > setting the port_xmit_pkts/port_xmit_data to 0 and use mad->method =
> > > > IB_MAD_METHOD_SET but that doesn't seem to work...
> > > > I need this since I expect to run for long periods of time and even
> > > > the port_xmit_pkts will hit its limit.
> > >
> > > Did you set PortSelect and CounterSelect ?
> > 
> > I am basically using the code inside vstat_main.c in the function
> > vstat_get_counters - so PortSelect and CounterSelect are the same
> > values (portnum and 0xFF respectively); I only changed mad_in->method
> > from IB_MAD_METHOD_GET to IB_MAD_METHOD_SET.
> 
> CounterSelect is 16 rather than 8 bits so if you want to reset all the
> counters, it should be 0xffff. Also, what's the value of portnum ?
> 
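To make the 16-bit CounterSelect point concrete, here is a rough sketch
that writes PortSelect and CounterSelect into the start of a raw
attribute buffer in network (big-endian) byte order. The byte layout
(one reserved byte, PortSelect, then the two CounterSelect bytes) is my
assumption about the PortCounters attribute and should be checked
against the spec; the function name is hypothetical and this is not the
vstat or IBAL code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* From the thread: CounterSelect is 16 bits, so "all counters" is 0xffff. */
#define COUNTER_SELECT_ALL 0xFFFFu

/* ASSUMED layout of the first four attribute bytes:
 *   [0] reserved, [1] PortSelect, [2..3] CounterSelect (big-endian). */
void pm_fill_reset_header(uint8_t *attr, size_t len,
                          uint8_t port_select, uint16_t counter_select)
{
    assert(attr != NULL && len >= 4);
    attr[0] = 0;
    attr[1] = port_select;
    attr[2] = (uint8_t)(counter_select >> 8);    /* high byte first */
    attr[3] = (uint8_t)(counter_select & 0xFFu); /* then low byte */
}
```

With 0xFF only half of the sixteen select bits end up set, whereas
COUNTER_SELECT_ALL sets them all. Which individual counter each bit
selects is defined by the spec and is not modelled here.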
> > I get the following error: ib_local_mad failed with status = 42.
> 
> I'm not familiar with the Windows code so I have no clue what status 42
> means.
> 
> > > > Last but not least, is there any plan to add performance counters in
> > > > windows for WinOF in the future?
> > > 
> > > In general, there can only be one performance manager owning a node's
> > > performance counters (assuming they are going to do resets as above).
> > > > PerfMgrs come bundled in a number of different SMs. A more recent
> > > > OpenSM than what is current in Windows has this, as do most if not all
> > > > vendor SMs.
> > 
> > I have WinOF 1.0.1 installed with the OpenSM service started. 
> > So if I understand correctly the version that I have does not manage the
> > performance counters.
> 
> Not AFAIK.
> 
> > According to what you say I also understand that future versions of
> > OpenSM will have a Performance Manager that manages the performance
> > counters - am I correct?
> 
> It depends on what the Windows community and Windows OpenSM maintainer
> decide relative to the PerfMgr. It could include this as the OpenSM
> common base currently supports this.
> 
> > If so in which version of WinOF is this planned for?
> 
> I don't know the WinOF plan to update OpenSM. You'll have to ask others
> on this list.
> 
> > Also, when talking about bundled performance managers - do they
> > automatically reset the counters if they hit the limit,
> 
> They attempt to reset them at some threshold before the limits are hit.
> 
> > do I interact with them in order to get/set values for the performance counters?
> 
> It exports the accumulated values for those counters on a node basis.
> 
> -- Hal
> 
> > Thanks,
> > Moshe.
> > 
> > -----Original Message-----
> > From: Hal Rosenstock [mailto:hrosenstock at xsigo.com] 
> > Sent: Sunday, May 04, 2008 7:17 PM
> > To: Moshe Haim
> > Cc: Tzachi Dar; ofw at lists.openfabrics.org
> > Subject: RE: [ofw] Programatically checking Infiniband performance
> > counters
> > 
> > Moshe,
> > 
> > On Sun, 2008-05-04 at 15:46 +0300, Moshe Haim wrote:
> > > Thanks for the quick reply.
> > > 
> > >  
> > > 
> > > I have a few additional questions:
> > > 
> > > Looking at the ib_port_counters_t structure I see several fields.
> > > 
> > > The port_xmit_data field looks like the transferred data in bytes divided
> > > by 4 - however, it is only 32 bit so it fills up very fast (after 16GB).
> > 
> > Yes; so either the polling needs to account for this or use optional
> > extended 64 bit counters if they are supported.
> > 
> > > The port_xmit_pkts looks like transferred data in 1KB packets
> > 
> > Just a packet count; nothing related to packet size.
> > 
> > >  - so it fills up more slowly:
> > 
> > Yes.
> > 
> > > 1. Can I rely on the packet size being 1KB - if not where can I find
> > > the packet size definition. 
> > 
> > No; the packets are all different sizes depending on the protocols being
> > used.
> > 
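Since packet sizes vary, the most the two counters can give is an
average over a polling interval. A small sketch, assuming the 4-byte
unit for port_xmit_data discussed above; the function name is
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Average bytes per transmitted packet over one polling interval.
 * xmit_data_delta is in 4-byte units; xmit_pkts_delta is a plain
 * packet count with no size information of its own. */
double avg_bytes_per_packet(uint64_t xmit_data_delta, uint64_t xmit_pkts_delta)
{
    assert(xmit_pkts_delta > 0);
    return (double)(xmit_data_delta * 4u) / (double)xmit_pkts_delta;
}
```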
> > > 2. How can I reset the counters to 0? I tried using ib_local_mad with
> > > setting the port_xmit_pkts/port_xmit_data to 0 and use mad->method =
> > > IB_MAD_METHOD_SET but that doesn't seem to work...
> > > I need this since I expect to run for long periods of time and even
> > > the port_xmit_pkts will hit its limit.
> > 
> > Did you set PortSelect and CounterSelect ?
> > 
> > > Last but not least, is there any plan to add performance counters in
> > > windows for WinOF in the future?
> > 
> > In general, there can only be one performance manager owning a node's
> > performance counters (assuming they are going to do resets as above).
> > PerfMgrs come bundled in a number of different SMs. A more recent OpenSM
> > than what is current in Windows has this as do most if not all vendor
> > SMs.
> > 
> > -- Hal
> > 
> > > Thanks,
> > > 
> > > Moshe.
> > > 
> > >  
> > > 
> > >                                    
> > > ______________________________________________________________________
> > > 
> > > From: Tzachi Dar [mailto:tzachid at mellanox.co.il] 
> > > Sent: Sunday, April 27, 2008 5:51 PM
> > > To: Moshe Haim; ofw at lists.openfabrics.org
> > > Subject: RE: [ofw] Programatically checking Infiniband performance
> > > counters
> > > 
> > > 
> > >  
> > > 
> > > Using performance counters currently only allows one to get WSD data.
> > > Since you are working on XP that won't help you. You can use the same
> > > way that vstat is doing to get that data.
> > > 
> > > 
> > > In other words, you need to implement the function vstat_get_counters
> > > () that is using a local mad in order to get that data.
> > > 
> > > 
> > >  
> > > 
> > > 
> > > You can find this function in the file vstat_main.c (lines 343 - 393)
> > > 
> > > 
> > >  
> > > 
> > > 
> > > Thanks
> > > 
> > > 
> > > Tzachi
> > > 
> > > 
> > >          
> > >         
> > >                                        
> > >         ______________________________________________________________
> > >         
> > > >         From: ofw-bounces at lists.openfabrics.org
> > > >         [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Moshe Haim
> > >         Sent: Sunday, April 27, 2008 5:09 PM
> > >         To: ofw at lists.openfabrics.org
> > >         Subject: [ofw] Programatically checking Infiniband performance
> > >         counters
> > >         
> > >         Hi,
> > >         
> > >          
> > >         
> > >         I am working with Infiniband on Windows XP using the
> > >         WinOF_1.0.1 package.
> > >         
> > >         I need to be able to check the Infiniband performance counters
> > >         in order to estimate traffic over the Infiniband.
> > >         
> > >          
> > >         
> > >         I have seen that using vstat -c the counter information
> > >         appears.
> > >         
> > >         How can I retrieve it not using vstat.exe? Are there any
> > >         performance counters I can check (for instance select them in
> > >         perfmon.exe?), or any other APIs?
> > >         
> > >          
> > >         
> > >         Thanks,
> > >         
> > >         Moshe.
> > >         
> > >          
> > >         
> > > _______________________________________________
> > > ofw mailing list
> > > ofw at lists.openfabrics.org
> > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
> 


