[ofa-general] Re: [PATCH V2] OpenSM: Add a Performance Manager HOWTO to the docs and the dist
Ira Weiny
weiny2 at llnl.gov
Fri May 16 13:35:02 PDT 2008
On Fri, 16 May 2008 12:52:09 -0700
Hal Rosenstock <hrosenstock at xsigo.com> wrote:
> On Thu, 2008-05-15 at 13:27 -0700, Ira Weiny wrote:
> > I decided to write a little HOWTO to help people to set it up.
>
> Nice writeup :-)
>
> > 5) Can be run in a standby SM
>
> I thought it was changed so that it can run in a standalone mode without
> SM. Am I confusing this with something else ?
>
I think you are right; I should have said standalone. However, can't it also
work in a standby SM?
Yes, from the patch which Sasha applied:
opensm/perfmgr: PerfMgr for SM standby and inactive states
Here is an updated patch with the correction.
Ira
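P.S. A rough back-of-the-envelope check on the first known issue listed in the
HOWTO below (data counters lost on high data rate links). This sketch assumes
that the port data counters are 32 bits wide and count in units of four
octets, and that a 4x DDR link carries about 16 Gbit/s of data after 8b/10b
encoding. The function name and its parameters are made up for illustration
and are not part of OpenSM.

```python
def seconds_to_fill(data_rate_gbps, counter_bits=32, octets_per_count=4):
    """Seconds for a data counter to go from zero to full at line rate.

    Assumptions (not taken from the patch): the counter is `counter_bits`
    wide and each count represents `octets_per_count` octets.
    """
    counts_per_second = data_rate_gbps * 1e9 / 8 / octets_per_count
    return (2 ** counter_bits) / counts_per_second


if __name__ == "__main__":
    # Data rates after 8b/10b encoding (assumed): 4x SDR = 8, 4x DDR = 16.
    for name, rate in (("4x SDR", 8), ("4x DDR", 16)):
        print(f"{name}: {seconds_to_fill(rate):.1f} s to fill")
```

Under those assumptions a saturated 4x DDR link fills the counter in under
ten seconds, far shorter than the 180 second sweep used in the example
configuration, which is why a sweep fast enough to keep up is not practical.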
From 9be13c3da4d34ad0a736ced4c9e3bb5e13a24bb6 Mon Sep 17 00:00:00 2001
From: Ira K. Weiny <weiny2 at llnl.gov>
Date: Thu, 15 May 2008 08:19:17 -0700
Subject: [PATCH] Add a Performance Manager HOWTO to the docs and the dist
Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
---
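Note (not part of the commit): features 2 and 4 in the HOWTO (automatic reset
at roughly 3/4 full, and detection of "outside" resets) can be illustrated
with a minimal sketch. This is not the OpenSM implementation. The class name,
the default 32-bit width, and the rule "a value lower than the last one means
somebody else reset the counter" are assumptions made for illustration only.

```python
class PortCounter:
    """Accumulate a narrow hardware counter into one wide running total."""

    def __init__(self, width_bits=32):
        self.max_raw = (1 << width_bits) - 1
        # Ask for a reset once the hardware value is about 3/4 of full scale.
        self.reset_threshold = self.max_raw // 4 * 3
        self.total = 0      # Python ints are unbounded; stands in for 64 bits
        self.last_raw = 0   # hardware value seen on the previous sweep

    def update(self, raw):
        """Fold one sweep's reading into the total.

        Returns True when the caller should reset the hardware counter.
        """
        if raw < self.last_raw:
            # The counter went backwards, so something outside reset it.
            # Whatever is there now has accumulated since that reset.
            delta = raw
        else:
            delta = raw - self.last_raw
        self.total += delta
        self.last_raw = raw
        if raw >= self.reset_threshold:
            # Assume the caller clears the hardware counter right away.
            self.last_raw = 0
            return True
        return False
```

Counts that arrive between an outside reset and the previous sweep are still
lost in this scheme, which matches the caveat in the HOWTO that missed counts
cannot be ruled out.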
opensm/Makefile.am | 3 +-
opensm/doc/performance-manager-HOWTO.txt | 153 ++++++++++++++++++++++++++++++
opensm/opensm.spec.in | 2 +-
3 files changed, 156 insertions(+), 2 deletions(-)
create mode 100644 opensm/doc/performance-manager-HOWTO.txt
diff --git a/opensm/Makefile.am b/opensm/Makefile.am
index 3811963..4c79f49 100644
--- a/opensm/Makefile.am
+++ b/opensm/Makefile.am
@@ -24,8 +24,9 @@ endif
man_MANS = man/opensm.8 man/osmtest.8
various_scripts = $(wildcard scripts/*)
+docs = doc/performance-manager-HOWTO.txt
-EXTRA_DIST = autogen.sh opensm.spec $(various_scripts) $(man_MANS)
+EXTRA_DIST = autogen.sh opensm.spec $(various_scripts) $(man_MANS) $(docs)
dist-hook: $(EXTRA_DIST)
if [ -x $(top_srcdir)/../gen_chlog.sh ] ; then \
diff --git a/opensm/doc/performance-manager-HOWTO.txt b/opensm/doc/performance-manager-HOWTO.txt
new file mode 100644
index 0000000..f0380f3
--- /dev/null
+++ b/opensm/doc/performance-manager-HOWTO.txt
@@ -0,0 +1,153 @@
+OpenSM Performance Manager HOWTO
+================================
+
+Introduction
+============
+
+OpenSM now includes a performance manager which collects Port counters from
+the subnet and stores them internally in OpenSM.
+
+Some of the features of the performance manager are:
+
+ 1) Collect port data and error counters per v1.2 spec and store in
+ 64-bit internal counters.
+ 2) Automatic reset of counters when they reach approximately 3/4 full.
+ (While this does not guarantee that counts will not be missed, it
+ keeps counts incrementing as best as possible given the current
+ hardware limitations.)
+ 3) Basic warnings in the OpenSM log on "critical" errors like symbol
+ errors.
+ 4) Automatically detects "outside" resets of counters and adjusts to
+ continue collecting data.
+ 5) Can be run when OpenSM is in standby or inactive states.
+
+Known issues are:
+
+ 1) Data counters will be lost on high data rate links. Sweeping the
+ fabric fast enough for a DDR link is not practical.
+ 2) Default partition support only.
+
+
+Setup and Usage
+===============
+
+Using the Performance Manager consists of 3 steps:
+
+ 1) compiling in support for the perfmgr (Optionally: the console
+ socket as well)
+ 2) enabling the perfmgr and console in opensm.opts
+ 3) retrieving data which has been collected.
+ 3a) using console to "dump data"
+ 3b) using a plugin module to store the data to your own
+ "database"
+
+Step 1: Compile in support for the Performance Manager
+------------------------------------------------------
+
+Because of the performance manager's experimental status, it is not enabled at
+compile time by default. (This will hopefully soon change as more people use
+it and confirm that it does not break things... ;-) The configure option is
+"--enable-perf-mgr".
+
+At this time it is really best to enable the console socket option as well.
+OpenSM can be run in an "interactive" mode. But with the console socket option
+turned on one can also make a connection to a running OpenSM. The console
+option is "--enable-console-socket". This option requires the use of
+tcp_wrappers to ensure security. Please be aware of your configuration for
+tcp_wrappers as the commands presented in the console can affect the operation
+of your subnet.
+
+The following configure line includes turning on the performance manager as
+well as the console:
+
+ ./configure --enable-perf-mgr --enable-console-socket
+
+
+Step 2: Enable the perfmgr and console in opensm.opts
+-----------------------------------------------------
+
+Turning the Performance Manager on is pretty easy: set the following options in
+the opensm.opts config file. (Default location is
+/var/cache/opensm/opensm.opts)
+
+ # Turn it all on.
+ perfmgr TRUE
+
+ # sweep time in seconds
+ perfmgr_sweep_time_s 180
+
+ # Dump file to dump the events to
+ event_db_dump_file /var/log/opensm_port_counters.log
+
+Also enable the console socket and configure the port for it to listen to if
+desired.
+
+ # console [off|local|socket]
+ console socket
+
+ # Telnet port for console (default 10000)
+ console_port 10000
+
+As noted above you also need to set up tcp_wrappers to prevent unauthorized
+users from connecting to the console.[*]
+
+ [*] As an alternative you can use the loopback mode, but I noticed when
+ writing this (OpenSM v3.1.10; OFED 1.3) that there are some bugs in
+ specifying the loopback mode in the opensm.opts file. Look for this to
+ be fixed in newer versions.
+
+ [**] Also you could use "local" but this is only useful if you run
+ OpenSM in the foreground of a terminal. As OpenSM is usually started
+ as a daemon I left this out as an option.
+
+Step 3: retrieve data which has been collected
+----------------------------------------------
+
+Step 3a: Using console dump function
+------------------------------------
+
+The console command "perfmgr dump_counters" will dump counters to the file
+specified in the opensm.opts file. In the example above, that file is
+"/var/log/opensm_port_counters.log".
+
+Example output is below:
+
+<snip>
+"SW1 wopr ISR9024D (MLX4 FW)" 0x8f10400411f56 port 1 (Since Mon May 12 13:27:14 2008)
+ symbol_err_cnt : 0
+ link_err_recover : 0
+ link_downed : 0
+ rcv_err : 0
+ rcv_rem_phys_err : 0
+ rcv_switch_relay_err : 2
+ xmit_discards : 0
+ xmit_constraint_err : 0
+ rcv_constraint_err : 0
+ link_integrity_err : 0
+ buf_overrun_err : 0
+ vl15_dropped : 0
+ xmit_data : 470435
+ rcv_data : 405956
+ xmit_pkts : 8954
+ rcv_pkts : 6900
+ unicast_xmit_pkts : 0
+ unicast_rcv_pkts : 0
+ multicast_xmit_pkts : 0
+ multicast_rcv_pkts : 0
+</snip>
+
+
+Step 3b: Using a plugin module
+------------------------------
+
+If you want a more automated method of retrieving the data OpenSM provides a
+plugin interface to extend OpenSM. The header file is osm_event_plugin.h.
+The functions you register with this interface will be called when data is
+collected. You can then use that data as appropriate.
+
+An example plugin can be configured at compile time using the
+"--enable-default-event-plugin" option on the configure line. This plugin is
+very simple. It logs "events" received from the performance manager to a log
+file. I don't recommend using this directly but rather use it as a template to
+create your own plugin.
+
diff --git a/opensm/opensm.spec.in b/opensm/opensm.spec.in
index feabfef..c36d6f2 100644
--- a/opensm/opensm.spec.in
+++ b/opensm/opensm.spec.in
@@ -125,7 +125,7 @@ fi
%{_sbindir}/opensm
%{_sbindir}/osmtest
%{_mandir}/man8/*
-%doc AUTHORS COPYING README
+%doc AUTHORS COPYING README doc/performance-manager-HOWTO.txt
%{_sysconfdir}/init.d/opensmd
%{_sbindir}/sldd.sh
%config(noreplace) @OPENSM_CONFIG_DIR@/opensm.conf
--
1.5.1
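
For anyone who wants to post-process the dump file from Step 3a of the HOWTO,
here is a minimal parsing sketch. It assumes the file looks exactly like the
example output in the HOWTO: one quoted node description, GUID and port number
per header line, followed by "name : value" lines. The function name and the
shape of the returned dictionary are hypothetical and not part of OpenSM, and
the real file format may differ between versions.

```python
import re

# Header line: "<node description>" <guid> port <n> (Since <timestamp>)
HEADER_RE = re.compile(
    r'^"(?P<node>.*)"\s+(?P<guid>0x[0-9a-fA-F]+)\s+port\s+(?P<port>\d+)'
    r'\s+\(Since (?P<since>.*)\)\s*$'
)
# Counter line: <name> : <decimal value>
COUNTER_RE = re.compile(r'^\s*(?P<name>\w+)\s*:\s*(?P<value>\d+)\s*$')


def parse_dump(text):
    """Return {(guid, port): {"node": ..., "since": ..., "counters": {...}}}."""
    ports = {}
    current = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = {
                "node": header["node"],
                "since": header["since"],
                "counters": {},
            }
            ports[(int(header["guid"], 16), int(header["port"]))] = current
            continue
        counter = COUNTER_RE.match(line)
        if counter and current is not None:
            current["counters"][counter["name"]] = int(counter["value"])
    return ports


# A shortened copy of the example output from the HOWTO.
SAMPLE = '''"SW1 wopr ISR9024D (MLX4 FW)" 0x8f10400411f56 port 1 (Since Mon May 12 13:27:14 2008)
     symbol_err_cnt       : 0
     rcv_switch_relay_err : 2
     xmit_data            : 470435
     rcv_data             : 405956
'''

if __name__ == "__main__":
    for (guid, port), info in parse_dump(SAMPLE).items():
        print(hex(guid), port, info["counters"]["xmit_data"])
```

Keying on (GUID, port) rather than the node description avoids collisions
when several switches share the same description string.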