[ofw] Programmatically checking Infiniband performance counters

Moshe Haim Moshe-ha at orbotech.com
Sun May 18 06:54:36 PDT 2008


Ultimately, what I would like to achieve is load balancing in my
software that takes the IB traffic into account along with other
performance information.

When I install WinOF on a machine together with the IPoIB component, I
can use Windows Performance Monitor to see the traffic over the IPoIB
"adapter". It would be good to be able to do the same for "lower level
protocol" traffic as well; in my specific use case that is uDAPL
(through Intel MPI).

This is why I wanted to take a look at the IB performance counters.

Given the counter size limit, and the fact that there should be only
one maintainer for the counters (resetting etc.), I was wondering
whether there is any plan to write a performance counters DLL for
Windows, so that these counters could be watched the same way I watch
the counters for IPoIB.

Is this something that is planned for some version of WinOF?

If not, is it possible to do such a thing?
I'd be more than glad to help write it, as I think other people may
need it as well.

Thanks,
Moshe.

-----Original Message-----
From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Ira Weiny
Sent: Friday, May 16, 2008 11:43 PM
To: Hal Rosenstock
Cc: Moshe Haim; ofw at lists.openfabrics.org
Subject: Re: [ofw] Programmatically checking Infiniband performance counters

I did not follow this thread completely, but I just submitted a patch to
Sasha for the master and OFED 1.3 branches which includes a Performance
Manager HOWTO. He has not accepted the patch yet, so if you want a copy
I can send it directly.

Furthermore, there is nothing I know of which would prevent the
Performance Manager from running on a Windows box.  I am pretty sure all
"compatibility" libs are in use.  However, the PerfMgr is still
experimental so YMMV. :-D

Hope this helps,
Ira


On Tue, 06 May 2008 05:05:45 -0700
Hal Rosenstock <hrosenstock at xsigo.com> wrote:

> On Tue, 2008-05-06 at 14:37 +0300, Moshe Haim wrote:
> > > > Looking at the ib_port_counters_t structure I see several fields.
> > > >
> > > > The port_xmit_data looks like the data transferred, in bytes
> > > > divided by 4 - however, it is only 32 bit so it fills up very
> > > > fast (after 16GB).
> > >
> > > Yes; so either the polling needs to account for this or use
> > > optional extended 64 bit counters if they are supported.
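To make the "fills up after 16GB" figure concrete, here is a minimal
sketch in Python (chosen only for illustration; the WinOF code discussed
in this thread is C). It assumes, as stated above, that port_xmit_data
is a 32-bit counter in units of 4 bytes. The function names and the
example rate are hypothetical, and the thread does not say whether the
counter sticks at its maximum or wraps, so the sketch only computes the
capacity.

```python
# Sketch only: arithmetic on a 32-bit counter that counts 4-byte units,
# as described for port_xmit_data in this thread.

COUNTER_BITS = 32          # width of port_xmit_data per the thread
BYTES_PER_UNIT = 4         # counter value = bytes / 4 per the thread
COUNTER_MAX = (1 << COUNTER_BITS) - 1


def counter_to_bytes(raw_value):
    """Convert a raw counter reading to bytes."""
    return raw_value * BYTES_PER_UNIT


def seconds_until_full(bytes_per_second):
    """Time for a counter starting at 0 to reach its maximum value
    at a constant (hypothetical) transmit rate."""
    return counter_to_bytes(COUNTER_MAX) / bytes_per_second


if __name__ == "__main__":
    capacity = counter_to_bytes(COUNTER_MAX)
    print("capacity: %.2f GiB" % (capacity / 2**30))
    # Hypothetical sustained rate of 1 GiB/s
    print("full after %.1f s at 1 GiB/s" % seconds_until_full(2**30))
```

At a sustained 1 GiB/s the counter would be exhausted in roughly 16
seconds, so the polling interval has to be well below that if the
32-bit counter is the only source.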
> > 
> > Could you please elaborate on the extended 64 bit counters? 
> 
> See IBA 1.2.1 vol 1 16.1 Performance Management specifically 16.1.4.11
> 
> > How can I tell if they are supported and how can I access them?
> 
> They're attribute ID 0x001D (PortCountersExtended) rather than 0x0012
> (PortCounters). There are a number of ways to tell this but I suspect
> they're not supported on any Windows supported HCAs I'm aware of so
> this may not be of much use :-(
> 
> > > > 2. How can I reset the counters to 0? I tried using ib_local_mad
> > > > with setting the port_xmit_pkts/port_xmit_data to 0 and use
> > > > mad->method = IB_MAD_METHOD_SET but that doesn't seem to work...
> > > > I need this since I expect to run for long periods of time and
> > > > even the port_xmit_pkts will hit its limit.
> > >
> > > Did you set PortSelect and CounterSelect ?
> > 
> > I am basically using the code inside vstat_main.c in the function
> > vstat_get_counters - so PortSelect and CounterSelect are the same
> > values (portnum and 0xFF respectively), I only changed the
> > mad_in->method from IB_MAD_METHOD_GET to IB_MAD_METHOD_SET.
> 
> CounterSelect is 16 rather than 8 bits so if you want to reset all the
> counters, it should be 0xffff. Also, what's the value of portnum ?
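The difference between 0xFF and 0xFFFF can be sketched as follows. This
is an illustration in Python, not the WinOF code. Two things here are
assumptions to be checked against the IBA section cited above and
against ib_types.h: the layout of the first four bytes (one reserved
byte, PortSelect, then a 16-bit big-endian CounterSelect), and the
mapping of CounterSelect bits 12 to 15 to the data and packet counters.
The port_rcv_* names are assumed by analogy with the port_xmit_* fields
mentioned in this thread.

```python
import struct

# ASSUMPTION: CounterSelect bit i selects the i-th counter of the
# PortCounters attribute, with the data/packet counters in bits 12-15.
# The port_rcv_* names are assumed by analogy with port_xmit_*.
COUNTER_SELECT_BITS = {
    12: "port_xmit_data",
    13: "port_rcv_data",
    14: "port_xmit_pkts",
    15: "port_rcv_pkts",
}


def selected_data_counters(counter_select):
    """Names of the data/packet counters selected by the mask."""
    return [name for bit, name in sorted(COUNTER_SELECT_BITS.items())
            if counter_select & (1 << bit)]


def pack_select_fields(port_num, counter_select):
    """Pack reserved byte, PortSelect and CounterSelect in network
    (big-endian) byte order. ASSUMED layout of the first 4 bytes."""
    return struct.pack(">BBH", 0, port_num, counter_select)


if __name__ == "__main__":
    for mask in (0xFF, 0xFFFF):
        print(hex(mask), selected_data_counters(mask),
              pack_select_fields(1, mask).hex())
```

Under these assumptions a mask of 0xFF selects none of the four data and
packet counters, which would be consistent with port_xmit_data and
port_xmit_pkts not being reset. It does not by itself explain the
status 42 error.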
> 
> > I get the following error: ib_local_mad failed with status = 42.
> 
> I'm not familiar with the Windows code so I have no clue what status
> 42 means.
> 
> > > > Last but not least, is there any plan to add performance
> > > > counters in windows for WinOF in the future?
> > >
> > > In general, there can only be one performance manager owning a
> > > node's performance counters (assuming they are going to do resets
> > > as above). PerfMgrs come bundled in a number of different SMs. A
> > > more recent OpenSM than what is current in Windows has this as do
> > > most if not all vendor SMs.
> > 
> > I have WinOF 1.0.1 installed with the OpenSM service started.
> > So if I understand correctly the version that I have does not manage
> > the performance counters.
> 
> Not AFAIK.
> 
> > According to what you say I also understand that future versions of
> > the OpenSM will have Performance Manager that manages the
> > performance counters - am I correct?
> 
> It depends on what the Windows community and Windows OpenSM maintainer
> decide relative to the PerfMgr. It could include this as the OpenSM
> common base currently supports this.
> 
> > If so in which version of WinOF is this planned for?
> 
> I don't know the WinOF plan to update OpenSM. You'll have to ask
> others on this list.
> 
> > Also, when talking about bundled performance managers - do they
> > automatically reset the counters if they hit the limit,
> 
> They attempt to reset them some threshold prior to hitting the limits.
> 
> > do I interact with them in order to get/set values for the
> > performance counters?
> 
> It exports the accumulated values for those counters on a node basis.
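The behaviour described above (reset some threshold before the limit,
and export accumulated values) can be sketched as below. This is not
OpenSM's PerfMgr code. FakePort stands in for a real port, and the
class names, the 75% threshold, the read/reset methods and the
assumption that the counter sticks at its maximum are all hypothetical.

```python
COUNTER_MAX = 0xFFFFFFFF  # 32-bit counter limit


class FakePort:
    """Hypothetical stand-in for a port with one 32-bit counter that
    sticks at its maximum (assumed saturating behaviour)."""

    def __init__(self):
        self.counter = 0

    def traffic(self, units):
        self.counter = min(self.counter + units, COUNTER_MAX)

    def read(self):
        return self.counter

    def reset(self):
        self.counter = 0


class Accumulator:
    """Keeps an unbounded software total across hardware resets."""

    def __init__(self, port, threshold=0.75):
        self.port = port
        self.limit = int(COUNTER_MAX * threshold)
        self.total = 0       # units folded in at previous resets
        self.last = 0        # most recent raw reading

    def poll(self):
        self.last = self.port.read()
        if self.last >= self.limit:
            # Fold the reading into the total, then reset the hardware.
            # On real hardware, traffic between read and reset is lost.
            self.total += self.last
            self.port.reset()
            self.last = 0
        return self.total + self.last


if __name__ == "__main__":
    port = FakePort()
    acc = Accumulator(port)
    for _ in range(5):
        port.traffic(0x50000000)
        print(acc.poll())
```

Because only one owner performs the resets, every other consumer reads
the accumulated total instead of the raw hardware counter, which is the
reason given earlier in the thread for having a single performance
manager per node.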
> 
> -- Hal
> 
> > Thanks,
> > Moshe.
> > 
> > -----Original Message-----
> > From: Hal Rosenstock [mailto:hrosenstock at xsigo.com] 
> > Sent: Sunday, May 04, 2008 7:17 PM
> > To: Moshe Haim
> > Cc: Tzachi Dar; ofw at lists.openfabrics.org
> > Subject: RE: [ofw] Programmatically checking Infiniband performance
> > counters
> > 
> > Moshe,
> > 
> > On Sun, 2008-05-04 at 15:46 +0300, Moshe Haim wrote:
> > > Thanks for the quick reply.
> > > 
> > >  
> > > 
> > > I have a few additional questions:
> > > 
> > > Looking at the ib_port_counters_t structure I see several fields.
> > >
> > > The port_xmit_data looks like the data transferred, in bytes
> > > divided by 4 - however, it is only 32 bit so it fills up very fast
> > > (after 16GB).
> >
> > Yes; so either the polling needs to account for this or use optional
> > extended 64 bit counters if they are supported.
> > 
> > > The port_xmit_pkts looks like transferred data in 1KB packets
> > 
> > Just a packet count; nothing related to packet size.
> > 
> > >  - so it fills up more slowly:
> > 
> > Yes.
> > 
> > > 1. Can I rely on the packet size being 1KB - if not where can I
> > > find the packet size definition?
> >
> > No; the packets are all different sizes depending on the protocols
> > being used.
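Although the packet size is not fixed, an average over a polling
interval can be estimated from the two counters. A minimal sketch in
Python, assuming port_xmit_data counts 4-byte units as discussed above;
the function name and the sample readings are hypothetical.

```python
def average_packet_bytes(xmit_data_delta, xmit_pkts_delta):
    """Average bytes per packet over an interval, from the change in
    port_xmit_data (4-byte units) and port_xmit_pkts (packets)."""
    if xmit_pkts_delta == 0:
        return 0.0  # no packets sent in this interval
    return xmit_data_delta * 4 / xmit_pkts_delta


if __name__ == "__main__":
    # Hypothetical deltas between two polls
    print(average_packet_bytes(512000, 1000))
```

The result is only meaningful if neither counter reached its limit
during the interval.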
> > 
> > > 2. How can I reset the counters to 0? I tried using ib_local_mad
> > > with setting the port_xmit_pkts/port_xmit_data to 0 and use
> > > mad->method = IB_MAD_METHOD_SET but that doesn't seem to work...
> > > I need this since I expect to run for long periods of time and
> > > even the port_xmit_pkts will hit its limit.
> > 
> > Did you set PortSelect and CounterSelect ?
> > 
> > > Last but not least, is there any plan to add performance counters
> > > in windows for WinOF in the future?
> >
> > In general, there can only be one performance manager owning a
> > node's performance counters (assuming they are going to do resets as
> > above). PerfMgrs come bundled in a number of different SMs. A more
> > recent OpenSM than what is current in Windows has this as do most if
> > not all vendor SMs.
> > 
> > -- Hal
> > 
> > > Thanks,
> > > 
> > > Moshe.
> > > 
> > > ________________________________
> > >
> > > From: Tzachi Dar [mailto:tzachid at mellanox.co.il] 
> > > Sent: Sunday, April 27, 2008 5:51 PM
> > > To: Moshe Haim; ofw at lists.openfabrics.org
> > > Subject: RE: [ofw] Programmatically checking Infiniband performance
> > > counters
> > >
> > > Using performance counters currently only allows one to get WSD
> > > data. Since you are working on XP, that won't help you. You can get
> > > that data the same way vstat does.
> > >
> > > In other words, you need to implement a function like
> > > vstat_get_counters(), which uses a local MAD to get that data.
> > >
> > > You can find this function in the file vstat_main.c (lines 343 -
> > > 393).
> > >
> > > Thanks
> > >
> > > Tzachi
> > >
> > >         ________________________________
> > >
> > >         From: ofw-bounces at lists.openfabrics.org
> > >         [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of
> > >         Moshe Haim
> > >         Sent: Sunday, April 27, 2008 5:09 PM
> > >         To: ofw at lists.openfabrics.org
> > >         Subject: [ofw] Programmatically checking Infiniband
> > >         performance counters
> > >
> > >         Hi,
> > >
> > >         I am working with Infiniband on Windows XP using the
> > >         WinOF_1.0.1 package.
> > >
> > >         I need to be able to check the Infiniband performance
> > >         counters in order to estimate traffic over the Infiniband.
> > >
> > >         I have seen that using vstat -c the counter information
> > >         appears.
> > >
> > >         How can I retrieve it without using vstat.exe? Are there any
> > >         performance counters I can check (for instance, by selecting
> > >         them in perfmon.exe), or any other APIs?
> > >
> > >         Thanks,
> > >
> > >         Moshe.
> > >
> > > _______________________________________________
> > > ofw mailing list
> > > ofw at lists.openfabrics.org
> > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
> 