[ofa-general] Re: [PATCH V2] OpenSM: Add a Performance Manager HOWTO to the docs and the dist

Ira Weiny weiny2 at llnl.gov
Fri May 16 13:35:02 PDT 2008


On Fri, 16 May 2008 12:52:09 -0700
Hal Rosenstock <hrosenstock at xsigo.com> wrote:

> On Thu, 2008-05-15 at 13:27 -0700, Ira Weiny wrote:
> > I decided to write a little HOWTO to help people to set it up.
> 
> Nice writeup :-)
> 
> > 5) Can be run in a standby SM
> 
> I thought it was changed so that it can run in a standalone mode without
> SM. Am I confusing this with something else ?
> 

I think you are right; I should have said standalone.  However, can't it also
work in a standby SM?

yea, from the patch which Sasha applied:

   opensm/perfmgr: PerfMgr for SM standby and inactive states

Here is an updated patch with the correction.

Ira


From 9be13c3da4d34ad0a736ced4c9e3bb5e13a24bb6 Mon Sep 17 00:00:00 2001
From: Ira K. Weiny <weiny2 at llnl.gov>
Date: Thu, 15 May 2008 08:19:17 -0700
Subject: [PATCH] Add a Performance Manager HOWTO to the docs and the dist


Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
---
 opensm/Makefile.am                       |    3 +-
 opensm/doc/performance-manager-HOWTO.txt |  153 ++++++++++++++++++++++++++++++
 opensm/opensm.spec.in                    |    2 +-
 3 files changed, 156 insertions(+), 2 deletions(-)
 create mode 100644 opensm/doc/performance-manager-HOWTO.txt

diff --git a/opensm/Makefile.am b/opensm/Makefile.am
index 3811963..4c79f49 100644
--- a/opensm/Makefile.am
+++ b/opensm/Makefile.am
@@ -24,8 +24,9 @@ endif
 man_MANS = man/opensm.8 man/osmtest.8
 
 various_scripts = $(wildcard scripts/*)
+docs = doc/performance-manager-HOWTO.txt
 
-EXTRA_DIST = autogen.sh opensm.spec $(various_scripts) $(man_MANS)
+EXTRA_DIST = autogen.sh opensm.spec $(various_scripts) $(man_MANS) $(docs)
 
 dist-hook: $(EXTRA_DIST)
 	if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \
diff --git a/opensm/doc/performance-manager-HOWTO.txt b/opensm/doc/performance-manager-HOWTO.txt
new file mode 100644
index 0000000..f0380f3
--- /dev/null
+++ b/opensm/doc/performance-manager-HOWTO.txt
@@ -0,0 +1,153 @@
+OpenSM Performance Manager HOWTO
+================================
+
+Introduction
+============
+
+OpenSM now includes a performance manager which collects Port counters from
+the subnet and stores them internally in OpenSM.
+
+Some of the features of the performance manager are:
+
+	1) Collect port data and error counters per v1.2 spec and store in
+	   64-bit internal counts.
+	2) Automatic reset of counters when they reach approximately 3/4 full.
+	   (While not guaranteeing that counts will not be missed this does
+	   keep counts incrementing as best as possible given the current
+	   hardware limitations.)
+	3) Basic warnings in the OpenSM log on "critical" errors like symbol
+	   errors.
+	4) Automatically detects "outside" resets of counters and adjusts to
+	   continue collecting data.
+	5) Can be run when OpenSM is in standby or inactive states.
+
+Known issues are:
+
+	1) Data counters will be lost on high data rate links.  Sweeping the
+	   fabric fast enough for a DDR link is not practical.
+	2) Default partition support only.
+
+
+Setup and Usage
+===============
+
+Using the Performance Manager consists of 3 steps:
+
+	1) compiling in support for the perfmgr (optionally: the console
+	   socket as well)
+	2) enabling the perfmgr and console in opensm.opts
+	3) retrieving data which has been collected.
+	   3a) using console to "dump data"
+	   3b) using a plugin module to store the data to your own
+	       "database"
+
+Step 1: Compile in support for the Performance Manager
+------------------------------------------------------
+
+Because of the performance manager's experimental status, it is not enabled at
+compile time by default.  (This will hopefully soon change as more people use
+it and confirm that it does not break things...  ;-)  The configure option is
+"--enable-perf-mgr".
+
+At this time it is really best to enable the console socket option as well.
+OpenSM can be run in an "interactive" mode.  But with the console socket option
+turned on one can also make a connection to a running OpenSM.  The console
+option is "--enable-console-socket".  This option requires the use of
+tcp_wrappers to ensure security.  Please be aware of your configuration for
+tcp_wrappers as the commands presented in the console can affect the operation
+of your subnet.
+
+The following configure line includes turning on the performance manager as
+well as the console:
+
+	./configure --enable-perf-mgr --enable-console-socket
+
+
+Step 2: Enable the perfmgr and console in opensm.opts
+-----------------------------------------------------
+
+Turning the Performance Manager on is pretty easy: set the following options in
+the opensm.opts config file.  (Default location is
+/var/cache/opensm/opensm.opts)
+
+	# Turn it all on.
+	perfmgr TRUE
+
+	# sweep time in seconds
+	perfmgr_sweep_time_s 180
+
+	# Dump file to dump the events to
+	event_db_dump_file /var/log/opensm_port_counters.log
+
+Also enable the console socket and configure the port for it to listen to if
+desired.
+
+	# console [off|local|socket]
+	console socket
+
+	# Telnet port for console (default 10000)
+	console_port 10000
+
+As noted above you also need to set up tcp_wrappers to prevent unauthorized
+users from connecting to the console.[*]
+
+	[*] As an alternative you can use the loopback mode but I noticed when
+	writing this (OpenSM v3.1.10; OFED 1.3) that there are some bugs in
+	specifying the loopback mode in the opensm.opts file.  Look for this to
+	be fixed in newer versions.
+
+	[**] Also you could use "local" but this is only useful if you run
+	OpenSM in the foreground of a terminal.  As OpenSM is usually started
+	as a daemon I left this out as an option.
+
+Step 3: retrieve data which has been collected
+----------------------------------------------
+
+Step 3a: Using console dump function
+------------------------------------
+
+The console command "perfmgr dump_counters" will dump counters to the file
+specified in the opensm.opts file.  In the example above this is
+"/var/log/opensm_port_counters.log".
+
+Example output is below:
+
+<snip>
+"SW1 wopr ISR9024D (MLX4 FW)" 0x8f10400411f56 port 1 (Since Mon May 12 13:27:14 2008)
+     symbol_err_cnt       : 0
+     link_err_recover     : 0
+     link_downed          : 0
+     rcv_err              : 0
+     rcv_rem_phys_err     : 0
+     rcv_switch_relay_err : 2
+     xmit_discards        : 0
+     xmit_constraint_err  : 0
+     rcv_constraint_err   : 0
+     link_integrity_err   : 0
+     buf_overrun_err      : 0
+     vl15_dropped         : 0
+     xmit_data            : 470435
+     rcv_data             : 405956
+     xmit_pkts            : 8954
+     rcv_pkts             : 6900
+     unicast_xmit_pkts    : 0
+     unicast_rcv_pkts     : 0
+     multicast_xmit_pkts  : 0
+     multicast_rcv_pkts   : 0
+</snip>
+
+
+Step 3b: Using a plugin module
+------------------------------
+
+If you want a more automated method of retrieving the data, OpenSM provides a
+plugin interface to extend OpenSM.  The header file is osm_event_plugin.h.
+The functions you register with this interface will be called when data is
+collected.  You can then use that data as appropriate.
+
+An example plugin can be configured at compile time using the
+"--enable-default-event-plugin" option on the configure line.  This plugin is
+very simple.  It logs "events" received from the performance manager to a log
+file.  I don't recommend using this directly but rather use it as a template to
+create your own plugin.
+
diff --git a/opensm/opensm.spec.in b/opensm/opensm.spec.in
index feabfef..c36d6f2 100644
--- a/opensm/opensm.spec.in
+++ b/opensm/opensm.spec.in
@@ -125,7 +125,7 @@ fi
 %{_sbindir}/opensm
 %{_sbindir}/osmtest
 %{_mandir}/man8/*
-%doc AUTHORS COPYING README
+%doc AUTHORS COPYING README doc/performance-manager-HOWTO.txt
 %{_sysconfdir}/init.d/opensmd
 %{_sbindir}/sldd.sh
 %config(noreplace) @OPENSM_CONFIG_DIR@/opensm.conf
-- 
1.5.1
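
The dump file described in Step 3a of the HOWTO is plain text, so it is easy
to post-process.  Below is a minimal sketch in Python.  It assumes the file
looks exactly like the example output in the HOWTO: a quoted node description,
the GUID in hex, "port N", a "(Since ...)" stamp, then one "name : value" line
per counter.  The function names are made up for illustration, and the rule
that any counter not ending in "_data" or "_pkts" is an error counter is my
own assumption, not something OpenSM defines.

```python
import re

# Header line, as in the HOWTO example:
#   "<node description>" <GUID in hex> port <N> (Since <timestamp>)
HEADER_RE = re.compile(
    r'^"(?P<desc>.*)"\s+(?P<guid>0x[0-9a-fA-F]+)\s+port\s+(?P<port>\d+)'
    r'\s+\(Since (?P<since>.*)\)\s*$'
)
# Counter line, as in the HOWTO example: "     <name>   : <value>"
COUNTER_RE = re.compile(r'^\s+(?P<name>\w+)\s*:\s*(?P<value>\d+)\s*$')


def parse_dump(text):
    """Return one dict per port found in a counters dump."""
    records = []
    current = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = {
                "node": header.group("desc"),
                "guid": int(header.group("guid"), 16),
                "port": int(header.group("port")),
                "since": header.group("since"),
                "counters": {},
            }
            records.append(current)
            continue
        counter = COUNTER_RE.match(line)
        if counter and current is not None:
            name = counter.group("name")
            current["counters"][name] = int(counter.group("value"))
    return records


def nonzero_errors(records):
    """List (node, port, counter, value) for non-zero error counters.

    Assumption: counters ending in _data or _pkts are traffic counters,
    everything else is treated as an error counter.
    """
    found = []
    for rec in records:
        for name, value in rec["counters"].items():
            if value and not name.endswith(("_data", "_pkts")):
                found.append((rec["node"], rec["port"], name, value))
    return found


# A shortened copy of the example output from the HOWTO.
SAMPLE = '''\
"SW1 wopr ISR9024D (MLX4 FW)" 0x8f10400411f56 port 1 (Since Mon May 12 13:27:14 2008)
     symbol_err_cnt       : 0
     rcv_switch_relay_err : 2
     xmit_data            : 470435
     rcv_pkts             : 6900
'''

if __name__ == "__main__":
    for node, port, name, value in nonzero_errors(parse_dump(SAMPLE)):
        print(f"{node} port {port}: {name} = {value}")
```

To run it against a real dump, read the file named by event_db_dump_file and
pass its contents to parse_dump().  If the real file differs from the example
in the HOWTO, the two regular expressions are the place to adjust.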

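Known issue 1 in the HOWTO (data counters lost on high data rate links) can be
checked with some quick arithmetic.  The sketch below assumes, from my reading
of the PortCounters attribute, that the port data counters are 32 bits wide and
count in units of four octets, and that a 4x DDR link carries about 16 Gbit/s
of data (20 Gbit/s signalling with 8b/10b encoding).  Please check those
numbers against your copy of the spec; the function name is made up.

```python
COUNTER_BITS = 32      # assumed width of the PortXmitData/PortRcvData counters
OCTETS_PER_COUNT = 4   # assumed: data counters count in units of 4 octets


def seconds_to_fill(data_rate_gbps, fraction=1.0):
    """Seconds for a saturated link to fill `fraction` of a data counter."""
    counts_per_second = data_rate_gbps * 1e9 / 8 / OCTETS_PER_COUNT
    return fraction * (2 ** COUNTER_BITS - 1) / counts_per_second


if __name__ == "__main__":
    for name, rate in (("4x SDR", 8), ("4x DDR", 16)):
        full = seconds_to_fill(rate)
        reset_point = seconds_to_fill(rate, 0.75)
        print(f"{name}: full in {full:.1f} s, 3/4 full in {reset_point:.1f} s")
```

Under these assumptions a saturated 4x DDR link fills a data counter in well
under ten seconds, far less than the 180 second perfmgr_sweep_time_s used in
the HOWTO example, which is why sweeping fast enough is not practical.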



