[ofw] Programmatically checking Infiniband performance counters

Tzachi Dar tzachid at mellanox.co.il
Wed May 21 07:02:18 PDT 2008


Hi Moshe.

Creating performance counters for Windows is on our to-do list, but it
is not currently scheduled for any upcoming release.

If you can contribute such a component, we will be happy to check it in.

Thanks
Tzachi

> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org 
> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Moshe Haim
> Sent: Sunday, May 18, 2008 4:55 PM
> To: ofw at lists.openfabrics.org
> Subject: RE: [ofw] Programatically checking Infiniband 
> performance counters
> 
> Basically, what I would like to ultimately achieve is being 
> able to do some load balancing in my software - taking the IB 
> traffic into account as well as other performance information.
> 
> When I install WinOF on a machine and I install the IPoIB 
> component, I am able to use Windows Performance Monitor 
> to see the traffic over the IPoIB "adapter".
> So I was thinking that it would be good if I was able to do 
> the same thing for "lower level protocol" traffic as well - in 
> my specific use case uDAPL (through Intel MPI).
> 
> This is why I wanted to take a look at the performance 
> counters of the IB.
> 
> Taking the counter size limit into account, and the fact that 
> there should be only one maintainer for the counters 
> (resetting etc.) - I was wondering if there is any plan to 
> actually write a performance counter DLL for Windows?
> Then it would be possible to watch the performance the same way 
> I see counters for the IPoIB.
> 
> Is this something that is planned for some version of WinOF?
> 
> If not, is it possible to do such a thing? 
> I'd be more than glad to help write such a thing - 
> as I think other people may need it as well.
> 
> Thanks,
> Moshe.
> 
> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org
> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Ira Weiny
> Sent: Friday, May 16, 2008 11:43 PM
> To: Hal Rosenstock
> Cc: Moshe Haim; ofw at lists.openfabrics.org
> Subject: Re: [ofw] Programatically checking Infiniband 
> performance counters
> 
> I did not follow this thread completely but I just submitted 
> a patch to Sasha for the master and OFED 1.3 branches which 
> includes a Performance Manager HOWTO.
> He has not accepted the patch yet, so if you want a copy I 
> can send it directly.
> 
> Furthermore, there is nothing I know of which would prevent 
> the Performance Manager from running on a Windows box.  I am 
> pretty sure all "compatibility" libs are in use.  However, the
> PerfMgr is still experimental so YMMV. :-D
> 
> Hope this helps,
> Ira
> 
> 
> On Tue, 06 May 2008 05:05:45 -0700
> Hal Rosenstock <hrosenstock at xsigo.com> wrote:
> 
> > On Tue, 2008-05-06 at 14:37 +0300, Moshe Haim wrote:
> > > > > Looking at the ib_port_counters_t structure I see several fields.
> > > > > 
> > > > > The port_xmit_data looks like the transferred data in bytes
> > > > > divided by 4 - however, it is only 32 bits so it fills up very
> > > > > fast (after 16GB).
> > > >
> > > > Yes; so either the polling needs to account for this or use
> > > > optional extended 64 bit counters if they are supported.
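
A minimal sketch of the arithmetic behind the 16GB figure quoted above. It assumes the counter value has already been converted from network to host byte order. The helper names are made up for illustration and are not part of the WinOF API. As far as I know, the 32-bit PortCounters values stop at their maximum instead of rolling over, so a reading of 0xFFFFFFFF only tells you "at least this much".

```c
#include <assert.h>
#include <stdint.h>

/* Largest value a 32-bit counter can hold. */
#define COUNTER32_MAX 0xFFFFFFFFu

/* port_xmit_data counts 4-byte units, so multiply by 4 to get bytes.
 * Hypothetical helper; the caller passes a host-order value. */
static uint64_t xmit_data_to_bytes(uint32_t port_xmit_data)
{
    return (uint64_t)port_xmit_data * 4u;
}

/* Assumption: the counter sticks at its maximum, so a reading equal to
 * COUNTER32_MAX can no longer be trusted as an exact count. */
static int counter_is_pegged(uint32_t value)
{
    return value == COUNTER32_MAX;
}

/* Worked numbers for the arithmetic above. */
static void check_examples(void)
{
    /* 0x40000000 units of 4 bytes is exactly 4 GiB. */
    assert(xmit_data_to_bytes(0x40000000u) == 4294967296ULL);
    /* The full range is 4 bytes short of 16 GiB. */
    assert(xmit_data_to_bytes(COUNTER32_MAX) == 17179869180ULL);
    assert(counter_is_pegged(COUNTER32_MAX));
    assert(!counter_is_pegged(0x40000000u));
}
```

A poller built on this would have to read (and reset) the counter often enough that it never reaches the pegged value between two samples.
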
> > > 
> > > Could you please elaborate on the extended 64 bit counters? 
> > 
> > See IBA 1.2.1 vol 1 16.1 Performance Management,
> > specifically 16.1.4.11
> > 
> > > How can I tell if they are supported and how can I access them?
> > 
> > They're attribute ID 0x001D (PortCountersExtended) rather than
> > 0x0012 (PortCounters). There are a number of ways to tell this but
> > I suspect they're not supported on any Windows supported HCAs I'm
> > aware of so this may not be of much use :-(
> > 
> > > > > 2. How can I reset the counters to 0? I tried using ib_local_mad
> > > > > with setting the port_xmit_pkts/port_xmit_data to 0 and use
> > > > > mad->method = IB_MAD_METHOD_SET but that doesn't seem to work...
> > > > > I need this since I expect to run for long periods of time and
> > > > > even the port_xmit_pkts will hit its limit.
> > > >
> > > > Did you set PortSelect and CounterSelect ?
> > > 
> > > I am basically using the code inside vstat_main.c in the function 
> > > vstat_get_counters - so PortSelect and CounterSelect are the same 
> > > values (portnum and 0xFF respectively), I only changed the
> > > mad_in->method from IB_MAD_METHOD_GET to IB_MAD_METHOD_SET.
> > 
> > CounterSelect is 16 rather than 8 bits so if you want to reset all
> > the counters, it should be 0xffff. Also, what's the value of portnum ?
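
To illustrate the point about the CounterSelect width, here is a small sketch. It assumes each bit of the 16-bit mask selects one counter, and that the field travels in network (big-endian) byte order like other MAD fields. The macro and function names are invented for this example and do not come from the WinOF headers. Which bit maps to which counter is not shown here; that is defined in the spec section referenced above.

```c
#include <assert.h>
#include <stdint.h>

/* Attribute IDs as quoted in this thread. */
#define ATTR_PORT_COUNTERS      0x0012u
#define ATTR_PORT_COUNTERS_EXT  0x001Du

/* Select every counter in a 16-bit CounterSelect field. */
#define COUNTER_SELECT_ALL      0xFFFFu

/* Count how many counters a CounterSelect value selects,
 * assuming one bit per counter. */
static unsigned selected_count(uint16_t counter_select)
{
    unsigned n = 0;
    while (counter_select != 0) {
        n += counter_select & 1u;
        counter_select >>= 1;
    }
    return n;
}

/* Store a 16-bit value as two bytes, most significant byte first. */
static void put_be16(uint8_t out[2], uint16_t v)
{
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)(v & 0xFFu);
}

/* Worked numbers: 0xFF selects only half of the 16 possible counters. */
static void check_examples(void)
{
    uint8_t wire[2];

    assert(selected_count(0x00FFu) == 8);
    assert(selected_count(COUNTER_SELECT_ALL) == 16);

    put_be16(wire, 0x00FFu);
    assert(wire[0] == 0x00 && wire[1] == 0xFF);
}
```

With 0xFF, half of the selectable counters are left out of the Set, which may explain why some counters appear not to reset.
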
> > 
> > > I get the following error: ib_local_mad failed with status = 42.
> > 
> > I'm not familiar with the Windows code so I have no clue what status
> > 42 means.
> > 
> > > > > Last but not least, is there any plan to add performance
> > > > > counters in windows for WinOF in the future?
> > > > 
> > > > In general, there can only be one performance manager owning a
> > > > node's performance counters (assuming they are going to do resets
> > > > as above). PerfMgrs come bundled in a number of different SMs. A
> > > > more recent OpenSM than what is current in Windows has this as do
> > > > most if not all vendor SMs.
> > > 
> > > I have WinOF 1.0.1 installed with the OpenSM service started. 
> > > So if I understand correctly the version that I have does not
> > > manage the performance counters.
> > 
> > Not AFAIK.
> > 
> > > According to what you say I also understand that future versions
> > > of the OpenSM will have Performance Manager that manages the
> > > performance counters - am I correct?
> > 
> > It depends on what the Windows community and Windows OpenSM
> > maintainer decide relative to the PerfMgr. It could include this as
> > the OpenSM common base currently supports this.
> > 
> > > If so in which version of WinOF is this planned for?
> > 
> > I don't know the WinOF plan to update OpenSM. You'll have to ask
> > others on this list.
> > 
> > > Also, when talking about bundled performance managers - do they 
> > > automatically reset the counters if they hit the limit,
> > 
> > They attempt to reset them some threshold prior to hitting the
> > limits.
> > 
> > > do I interact with them in order to get/set values for the
> > > performance counters?
> > 
> > It exports the accumulated values for those counters on a node
> > basis.
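
The "reset before the limit, export the accumulated value" idea described above could be sketched roughly as follows. This is not the OpenSM PerfMgr code; the type, the function names and the 75% threshold are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical threshold: ask for a reset at 75% of the 32-bit range. */
#define RESET_THRESHOLD 0xC0000000u

/* Software-side 64-bit view of one 32-bit hardware counter. */
typedef struct {
    uint64_t total; /* readings folded in at earlier resets */
    uint32_t last;  /* most recent raw reading since the last reset */
} accum_counter_t;

/* Record a new raw reading. Returns 1 when the caller should reset the
 * hardware counter, 0 otherwise. */
static int accum_update(accum_counter_t *c, uint32_t raw)
{
    /* Between resets the hardware counter can only grow. */
    assert(raw >= c->last);
    c->last = raw;
    return raw >= RESET_THRESHOLD;
}

/* Call after the hardware counter has been reset to 0. Traffic counted
 * between the last reading and the reset is lost in this sketch. */
static void accum_on_reset(accum_counter_t *c)
{
    c->total += c->last;
    c->last = 0;
}

/* Accumulated value to export to consumers. */
static uint64_t accum_value(const accum_counter_t *c)
{
    return c->total + c->last;
}
```

Because only one component folds readings into the total at each reset, this also shows why a single owner per node's counters is needed: a second party resetting the hardware would silently drop counts from the first party's total.
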
> > 
> > -- Hal
> > 
> > > Thanks,
> > > Moshe.
> > > 
> > > -----Original Message-----
> > > From: Hal Rosenstock [mailto:hrosenstock at xsigo.com]
> > > Sent: Sunday, May 04, 2008 7:17 PM
> > > To: Moshe Haim
> > > Cc: Tzachi Dar; ofw at lists.openfabrics.org
> > > Subject: RE: [ofw] Programatically checking Infiniband 
> performance 
> > > counters
> > > 
> > > Moshe,
> > > 
> > > On Sun, 2008-05-04 at 15:46 +0300, Moshe Haim wrote:
> > > > Thanks for the quick reply.
> > > > 
> > > > I have a few additional questions:
> > > > 
> > > > Looking at the ib_port_counters_t structure I see several fields.
> > > > 
> > > > The port_xmit_data looks like the transferred data in bytes divided
> > > > by 4 - however, it is only 32 bits so it fills up very fast (after
> > > > 16GB).
> > > 
> > > Yes; so either the polling needs to account for this or use
> > > optional extended 64 bit counters if they are supported.
> > > 
> > > > The port_xmit_pkts looks like transferred data in 1KB packets
> > > 
> > > Just a packet count; nothing related to packet size.
> > > 
> > > >  - so it fills up more slowly:
> > > 
> > > Yes.
> > > 
> > > > 1. Can I rely on the packet size being 1KB - if not where can I
> > > > find the packet size definition. 
> > > 
> > > No; the packets are all different sizes depending on the protocols
> > > being used.
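
Since the packet counter says nothing about packet size, the closest thing available is an average derived from the two counters together. A rough sketch, assuming both values were read in the same sample, are in host byte order, and have not reached their maximum; the function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Average transmitted bytes per packet over one sampling interval.
 * port_xmit_data is in 4-byte units; port_xmit_pkts is a plain count.
 * Returns 0.0 when no packets were counted. */
static double avg_packet_bytes(uint32_t port_xmit_data,
                               uint32_t port_xmit_pkts)
{
    if (port_xmit_pkts == 0)
        return 0.0;
    return ((double)port_xmit_data * 4.0) / (double)port_xmit_pkts;
}

/* Worked numbers: 512 units of 4 bytes over 2 packets is 1024 bytes each. */
static void check_examples(void)
{
    assert(avg_packet_bytes(512u, 2u) == 1024.0);
    assert(avg_packet_bytes(10u, 0u) == 0.0);
}
```

This is only an average over the interval; individual packets can still be any size the protocol in use allows.
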
> > > 
> > > > 2. How can I reset the counters to 0? I tried using ib_local_mad
> > > > with setting the port_xmit_pkts/port_xmit_data to 0 and use
> > > > mad->method = IB_MAD_METHOD_SET but that doesn't seem to work...
> > > > I need this since I expect to run for long periods of time and
> > > > even the port_xmit_pkts will hit its limit.
> > > 
> > > Did you set PortSelect and CounterSelect ?
> > > 
> > > > Last but not least, is there any plan to add performance counters
> > > > in windows for WinOF in the future?
> > > 
> > > In general, there can only be one performance manager owning a
> > > node's performance counters (assuming they are going to do resets
> > > as above). PerfMgrs come bundled in a number of different SMs. A
> > > more recent OpenSM than what is current in Windows has this as do
> > > most if not all vendor SMs.
> > > 
> > > -- Hal
> > > 
> > > > Thanks,
> > > > 
> > > > Moshe.
> > > > 
> > > > 
> > > > From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
> > > > Sent: Sunday, April 27, 2008 5:51 PM
> > > > To: Moshe Haim; ofw at lists.openfabrics.org
> > > > Subject: RE: [ofw] Programatically checking Infiniband 
> performance 
> > > > counters
> > > > 
> > > > Using performance counters currently only allows one to get WSD
> > > > data. Since you are working on XP that won't help you. You can get
> > > > that data the same way that vstat does.
> > > > 
> > > > In other words, you need to implement the function
> > > > vstat_get_counters(), which uses a local MAD to get that data.
> > > > 
> > > > You can find this function in the file vstat_main.c (lines 343 -
> > > > 393).
> > > > 
> > > > Thanks
> > > > Tzachi
> > > > 
> > > >         From: ofw-bounces at lists.openfabrics.org
> > > >         [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Moshe Haim
> > > >         Sent: Sunday, April 27, 2008 5:09 PM
> > > >         To: ofw at lists.openfabrics.org
> > > >         Subject: [ofw] Programatically checking Infiniband performance
> > > >         counters
> > > >         
> > > >         Hi,
> > > >         
> > > >         I am working with Infiniband on Windows XP using the
> > > >         WinOF_1.0.1 package.
> > > >         
> > > >         I need to be able to check the Infiniband performance counters
> > > >         in order to estimate traffic over the Infiniband.
> > > >         
> > > >         I have seen that the counter information appears when using
> > > >         vstat -c.
> > > >         
> > > >         How can I retrieve it without using vstat.exe? Are there any
> > > >         performance counters I can check (for instance, by selecting
> > > >         them in perfmon.exe), or any other APIs?
> > > >         
> > > >         Thanks,
> > > >         Moshe.
> > > >         
> > > > _______________________________________________
> > > > ofw mailing list
> > > > ofw at lists.openfabrics.org
> > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
> > 
> 
