[ofa-general] IB performance stats (revisited)

Eitan Zahavi eitan at mellanox.co.il
Wed Jul 11 08:30:04 PDT 2007


Hi Mark,

I wish I had a fabric large enough to be worth testing collectl on...

I did the math for how much data would be collected for a 10K-node
cluster. It is ~7MB for each iteration: 
10K nodes 
* 6 switch ports per node (3-level fabric * 2 ports on each link) 
* [32 bytes (data/pkts tx/rx) + 22 bytes (err counters) + 64 bytes (cong
counters)] = 118 bytes per port

Seems reasonable - but it adds up to a large amount of data over a day,
assuming a collection every second:
24*60*60 * 118 * 10000 * 6 = 6.11712e+11 bytes of storage (~612 GB)
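
As a sanity check, here is a minimal Python sketch of the arithmetic
above (the constants simply restate the counter sizes assumed in this
mail; they are not taken from any spec):

    # Back-of-the-envelope storage estimate for per-second IB counter
    # collection on a 10K-node, 3-level fabric.
    NODES = 10000             # compute nodes in the cluster
    PORTS_PER_NODE = 6        # 3 levels * 2 ports on each link
    BYTES_PER_PORT = 32 + 22 + 64  # data/pkt + error + congestion counters
    SECONDS_PER_DAY = 24 * 60 * 60

    per_iteration = NODES * PORTS_PER_NODE * BYTES_PER_PORT
    per_day = per_iteration * SECONDS_PER_DAY

    print("per iteration: %.2f MB" % (per_iteration / 1e6))  # ~7.08 MB
    print("per day:       %.0f GB" % (per_day / 1e9))        # ~612 GB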

Eitan Zahavi
Senior Engineering Director, Software Architect
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL

 

> -----Original Message-----
> From: Mark Seger [mailto:Mark.Seger at hp.com] 
> Sent: Wednesday, July 11, 2007 5:51 PM
> To: Eitan Zahavi
> Cc: Hal Rosenstock; Ira Weiny; general at lists.openfabrics.org; 
> Ed.Finn at FMR.COM
> Subject: Re: [ofa-general] IB performance stats (revisited)
> 
> 
> 
> Eitan Zahavi wrote:
> 
> >Hi Mark,
> >
> >I published an RFC and later had discussions regarding the 
> >distribution of query ownership of switch counters.
> >Making this ownership purely dynamic, semi-dynamic, or even 
> >static is an implementation tradeoff.
> >However, it can be shown that the maximum number of switches a 
> >single compute node would be responsible for is <= the number of 
> >switch levels (a toy sketch of such an assignment appears after 
> >the quoted message below), so there is no problem getting 
> >counters every second...
> >
> >The issue is: what do you do with the volume of data collected?
> >This is only relevant if monitoring is run in "profiling mode"; 
> >otherwise only link health errors should be reported.
> >  
> >
> I typically use IB performance data for 
> system/application diagnostics.  I run a tool I wrote (see 
> http://sourceforge.net/projects/collectl/) as a service on 
> most systems, and it gathers hundreds of performance 
> metrics/counters covering CPU load, memory, network, 
> InfiniBand, disk, and more.  The philosophy here is that 
> if something goes wrong, it may be too late to then run a 
> diagnostic.  Rather, you need to have already collected the 
> data, especially if the problem is intermittent.  When 
> there is no need to look at the data, it simply gets purged 
> after a week.
> 
> There have been situations where someone reports that a batch 
> program they ran the other day was really slow even though they 
> didn't change anything.  Pulling up a monitoring log and seeing 
> what the system was doing at the time of the run might reveal 
> that their network was saturated and that their MPI job was 
> therefore impacted.  You can't very well turn on diagnostics 
> and rerun the application, because system conditions have 
> probably changed.
> 
> Does that help?  Why don't you try installing collectl and 
> see what it does...
> 
> -mark
> 
> 
> 
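
For the ownership scheme discussed in the quoted message, here is a
hypothetical Python sketch (toy topology and invented switch names,
not the actual algorithm from the RFC) of a static assignment in
which each compute node claims any still-unclaimed switch on its own
up-path, so no node ends up owning more than one switch per fabric
level:

    from collections import defaultdict

    LEVELS = 3   # fabric levels (assumed)
    RADIX = 4    # children per switch at each level (toy value)
    NODES = 16   # compute nodes (toy value)

    def up_path(node):
        """Switches crossed on this node's up-path, one per level."""
        path, idx = [], node
        for level in range(LEVELS):
            idx //= RADIX
            path.append("sw-L%d-%d" % (level + 1, idx))
        return path

    ownership = defaultdict(list)
    claimed = set()
    for node in range(NODES):
        for switch in up_path(node):
            if switch not in claimed:  # first node on the path owns it
                claimed.add(switch)
                ownership[node].append(switch)

    # No node polls more than one switch per level.
    assert all(len(owned) <= LEVELS for owned in ownership.values())

Since every switch lies on at least one node's up-path, the whole
fabric is covered while each node queries at most LEVELS switches
per iteration.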


