[ofa-general] [PATCH] opensm/man: Adding QoS-related info to opensm man pages

Yevgeny Kliteynik kliteyn at dev.mellanox.co.il
Tue Mar 25 17:47:08 PDT 2008


Hi Sasha,

I've added QoS related info to opensm man pages: enhanced
existing part (that was talking about VL arbitration) and
added description of QoS manager in accordance with QoS annex.

Please apply to ofed_1_3 and master.

Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
---
 opensm/man/opensm.8.in |  501 +++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 457 insertions(+), 44 deletions(-)

diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in
index 5322ab7..1d9c5b7 100644
--- a/opensm/man/opensm.8.in
+++ b/opensm/man/opensm.8.in
@@ -35,7 +35,8 @@ to initialize the InfiniBand hardware (at least one per each
 InfiniBand subnet).

 opensm also now contains an experimental version of a performance
-manager as well.
+manager and an experimental version QoS manager (in accordance with
+IBA QoS Annex).

 opensm defaults were designed to meet the common case usage on clusters with up to a few hundred nodes. Thus, in this default mode, opensm will scan the IB
 fabric, initialize it, and sweep occasionally for changes.
@@ -433,51 +434,463 @@ partition manager:

  Default=0x7fff,ipoib:ALL=full;

-.SH QOS CONFIGURATION
+.SH QUALITY OF SERVICE
 .PP
-There are a set of QoS related low-level configuration parameters.
-All these parameter names are prefixed by "qos_" string. Here is a full
-list of these parameters:
-
- qos_max_vls    - The maximum number of VLs that will be on the subnet
- qos_high_limit - The limit of High Priority component of VL
-                  Arbitration table (IBA 7.6.9)
- qos_vlarb_low  - Low priority VL Arbitration table (IBA 7.6.9)
-                  template
- qos_vlarb_high - High priority VL Arbitration table (IBA 7.6.9)
-                  template
-                  Both VL arbitration templates are pairs of
-                  VL and weight
- qos_sl2vl      - SL2VL Mapping table (IBA 7.6.6) template. It is
-                  a list of VLs corresponding to SLs 0-15 (Note
-                  that VL15 used here means drop this SL)
-
-Typical default values (hard-coded in OpenSM initialization) are:
-
- qos_max_vls=15
- qos_high_limit=0
- qos_vlarb_low=0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
- qos_vlarb_high=0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
- qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
-
-The syntax is compatible with rest of OpenSM configuration options and
-values may be stored in OpenSM config file (cached options file).
-
-In addition to the above, we may define separate QoS configuration
-parameters sets for various target types. As targets, we currently support
-CAs, routers, switch external ports, and switch's enhanced port 0. The
-names of such specialized parameters are prefixed by "qos_<type>_"
-string. Here is a full list of the currently supported sets:
-
- qos_ca_  - QoS configuration parameters set for CAs.
- qos_rtr_ - parameters set for routers.
- qos_sw0_ - parameters set for switches' port 0.
- qos_swe_ - parameters set for switches' external ports.
+OpenSM QoS support comprises of two parts:

-Examples:
- qos_sw0_max_vls=2
- qos_ca_sl2vl=0,1,2,3,5,5,5,12,12,0,
- qos_swe_high_limit=0
+  1. \fBQoS manager in accordance with IBA QoS Annex\fP (experimental)
+.P
+  2. \fBSL2VL and VL Arbitration tables configuration\fP
+.P
+.SS QoS Manager (experimental)
+.PP
+When Quality of Service in OpenSM is enabled (-Q or --qos), OpenSM looks
+for QoS Policy file. The default name of this file is
+\fB\%@CONF_DIR@/@QOS_POLICY_FILE@\fP. The default may be changed by using
+-Y or --qos_policy_file option with OpenSM.
+
+During fabric initialization and at every heavy sweep OpenSM parses the
+QoS policy file, applies its settings to the discovered fabric elements,
+and enforces the provided policy on client requests. The overall flow for
+such requests is as follows:
+ - The request is matched against the defined matching rules such that
+   the QoS Level definition is found.
+ - Given the QoS Level, path(s) search is performed with the given
+   restrictions imposed by that level.
+
+There are two ways to define QoS policy:
+ - \fBFull\fP: the full policy file syntax provides the administrator various
+   ways to match a PathRecord/MultiPathRecord (PR/MPR) request, and to
+   enforce various QoS constraints on the requested PR/MPR.
+ - \fBSimplified\fP: the simplified policy file syntax enables the administrator
+   match PR/MPR requests by various ULPs and applications running on top of
+   these ULPs.
+
+While the full policy syntax is very flexible, in many cases the simplified
+policy definition would be sufficient.
+.PP
+.B Full QoS Policy File
+.PP
+QoS policy file has the following sections:
+
+.B I)
+Port Groups (denoted by port-groups).
+This section defines zero or more port groups that can be referred later by
+matching rules (see below). Port group lists ports by:
+  - Port GUID
+  - Port name, which is a combination of NodeDescription and IB port number
+  - PKey, which means that all the ports in the subnet that belong to
+    partition with a given PKey belong to this port group
+  - Partition name, which means that all the ports in the subnet that belong
+    to partition with a given name belong to this port group
+  - Node type, where possible node types are: CA, SWITCH, ROUTER, ALL, and
+    SELF (SM's port).
+
+.B II)
+QoS Setup (denoted by qos-setup).
+This section describes how to set up SL2VL and VL Arbitration tables on
+various nodes in the fabric.
+However, this is not supported in OFED 1.3.
+SL2VL and VLArb tables should be configured in the OpenSM options file.
+
+.B III)
+QoS Levels (denoted by qos-levels).
+Each QoS Level defines Service Level (SL) and a few optional fields:
+  - MTU limit
+  - Rate limit
+  - PKey
+  - Packet lifetime
+
+When path(s) search is performed, it is done with regards to restriction that
+these QoS Level parameters impose.
+One QoS level that is mandatory to define is a DEFAULT QoS level. It is
+applied to a PR/MPR query that does not match any existing match rule.
+Similar to any other QoS Level, it can also be explicitly referred by any
+match rule.
+
+.B IV)
+QoS Matching Rules (denoted by qos-match-rules).
+Each PathRecord/MultiPathRecord query that OpenSM receives is matched against
+the set of matching rules. Rules are scanned in order of appearance in the QoS
+policy file such as the first match takes precedence.
+Each rule has a name of QoS level that will be applied to the matching query.
+A default QoS level is applied to a query that did not match any rule.
+Queries can be matched by:
+ - Source port group (whether a source port is a member of a specified group)
+ - Destination port group (same as above, only for destination port)
+ - PKey
+ - QoS class
+ - Service ID
+
+To match a certain matching rule, PR/MPR query has to match ALL the rule's
+criteria. However, not all the fields of the PR/MPR query have to appear in
+the matching rule.
+For instance, if the rule has a single criterion - Service ID, it will match
+any query that has this Service ID, disregarding rest of the query fields.
+However, if a certain query has only Service ID (which means that this is the
+only bit in the PR/MPR component mask that is on), it will not match any rule
+that has other matching criteria besides Service ID.
+.PP
+.B Simplified QoS Policy Definition
+.PP
+Simplified QoS policy definition comprises of a single section denoted by
+qos-ulps. Similar to the full QoS policy, it has a list of match rules and
+their QoS Level, but in this case a match rule has only one criterion - its
+goal is to match a certain ULP (or a certain application on top of this ULP)
+PR/MPR request, and QoS Level has only one constraint - Service Level (SL).
+The simplified policy section may appear in the policy file in combine with
+the full policy, or as a stand-alone policy definition.
+See more details and list of match rule criteria below.
+.PP
+.B Policy File Syntax Guidelines
+.PP
+Empty lines are ignored.
+Leading and trailing blanks, as well as empty lines, are ignored, so the
+indentation in the example is just for better readability.
+Comments are started with the pound sign (#) and terminated by EOL.
+Any keyword should be the first non-blank in the line, unless it's a comment.
+Keywords that denote section/subsection start have matching closing keywords.
+Having a QoS Level named "DEFAULT" is a must - it is applied to PR/MPR
+requests that didn't match any of the matching rules.
+Any section/subsection of the policy file is optional.
+
+.PP
+.B Examples of Full Policy File
+.PP
+As mentioned earlier, any section of the policy file is optional, and
+the only mandatory part of the policy file is a default QoS Level.
+Here's an example of the shortest policy file:
+
+    qos-levels
+        qos-level
+            name: DEFAULT
+            sl: 0
+        end-qos-level
+    end-qos-levels
+
+Port groups section is missing because there are no match rules, which means
+that port groups are not referred anywhere, and there is no need defining
+them. And since this policy file doesn't have any matching rules, PR/MPR query
+won't match any rule, and OpenSM will enforce default QoS level.
+Essentially, the above example is equivalent to not having QoS policy file
+at all.
+
+The following example shows all the possible options and keywords in the
+policy file and their syntax:
+
+    #
+    # See the comments in the following example.
+    # They explain different keywords and their meaning.
+    #
+    port-groups
+        port-group
+            name: Storage
+            # "use" is just a description that is used for logging
+            #  Other than that, it is just a comment
+            use: SRP Targets
+            port-guid: 0x10000000000001, 0x10000000000005-0x1000000000FFFA
+            port-guid: 0x1000000000FFFF
+        end-port-group
+
+        port-group
+            name: Virtual Servers
+            # The syntax of the port name is as follows:
+            #   "node_description/Pnum".
+            # node_description is compared to the NodeDescription of the node,
+            # and "Pnum" is a port number on that node.
+            port-name: vs1 HCA-1/P1, vs2 HCA-1/P1
+        end-port-group
+
+        # using partitions defined in the partition policy
+        port-group
+            name: Partitions
+            partition: Part1
+            pkey: 0x1234
+        end-port-group
+
+        # using node types: CA, ROUTER, SWITCH, SELF (for node that runs SM)
+        # or ALL (for all the nodes in the subnet)
+        port-group
+            name: CAs and SM
+            node-type: CA, SELF
+        end-port-group
+
+    end-port-groups
+
+    qos-setup
+        # This section of the policy file describes how to set up SL2VL and VL
+        # Arbitration tables on various nodes in the fabric.
+        # However, this is not supported in OFED 1.3 - the section is parsed
+        # and ignored. SL2VL and VLArb tables should be configured in the
+        # OpenSM options file (by default - /var/cache/opensm/opensm.opts).
+    end-qos-setup
+
+    qos-levels
+
+        # Having a QoS Level named "DEFAULT" is a must - it is applied to
+        # PR/MPR requests that didn't match any of the matching rules.
+        qos-level
+            name: DEFAULT
+            use: default QoS Level
+            sl: 0
+        end-qos-level
+
+        # the whole set: SL, MTU-Limit, Rate-Limit, PKey, Packet Lifetime
+        qos-level
+            name: WholeSet
+            sl: 1
+            mtu-limit: 4
+            rate-limit: 5
+            pkey: 0x1234
+            packet-life: 8
+        end-qos-level
+
+    end-qos-levels
+
+    # Match rules are scanned in order of their apperance in the policy file.
+    # First matched rule takes precedence.
+    qos-match-rules
+
+        # matching by single criteria: QoS class
+        qos-match-rule
+            use: by QoS class
+            qos-class: 7-9,11
+            # Name of qos-level to apply to the matching PR/MPR
+            qos-level-name: WholeSet
+        end-qos-match-rule
+
+        # show matching by destination group and service id
+        qos-match-rule
+            use: Storage targets
+            destination: Storage
+            service-id: 0x10000000000001, 0x10000000000008-0x10000000000FFF
+            qos-level-name: WholeSet
+        end-qos-match-rule
+
+        qos-match-rule
+            source: Storage
+            use: match by source group only
+            qos-level-name: DEFAULT
+        end-qos-match-rule
+
+        qos-match-rule
+            use: match by all parameters
+            qos-class: 7-9,11
+            source: Virtual Servers
+            destination: Storage
+            service-id: 0x0000000000010000-0x000000000001FFFF
+            pkey: 0x0F00-0x0FFF
+            qos-level-name: WholeSet
+        end-qos-match-rule
+
+    end-qos-match-rules
+
+.PP
+.B Simplified QoS Policy - Details and Examples
+.PP
+Simplified QoS policy match rules are tailored for matching ULPs (or
+some application on top of a ULP) PR/MPR requests. It has a list of
+per-ULP (or per-application) match rules and the SL that should be
+enforced on the matched PR/MPR query.
+
+Match rules include:
+ - Default match rule that is applied to PR/MPR query that didn't
+   match any of the other match rules
+ - SDP
+ - SDP application with a specific target TCP/IP port range
+ - SRP with a specific target IB port GUID
+ - RDS
+ - iSER
+ - iSER application with a specific target TCP/IP port range
+ - IPoIB with a default PKey
+ - IPoIB with a specific PKey
+ - any ULP/application with a specific Service ID in the PR/MPR query
+ - any ULP/application with a specific PKey in the PR/MPR query
+ - any ULP/application with a specific target IB port GUID in the PR/MPR query
+
+Since any section of the policy file is optional, as long as basic rules
+of the file are kept (such as no referring to nonexisting port group,
+having default QoS Level, etc), the simplified policy section (qos-ulps)
+can serve as a complete QoS policy file.
+The shortest policy file in this case would be as follows:
+
+    qos-ulps
+        default  : 0 #default SL
+    end-qos-ulps
+
+It is equivalent to not having policy file at all.
+
+Below is an example of simplified QoS policy with all the possible keywords:
+
+    qos-ulps
+        default                       : 0 # default SL
+        sdp, port-num 30000           : 0 # SL for application running on top
+                                          # of SDP when a destination
+                                          # TCP/IPport is 30000
+        sdp, port-num 10000-20000     : 0
+        sdp                           : 1 # default SL for any other
+                                          # application running on top of SDP
+        rds                           : 2 # SL for RDS traffic
+        iser, port-num 900            : 0 # SL for iSER with a specific target
+                                          # port
+        iser                          : 3 # default SL for iSER
+        ipoib, pkey 0x0001            : 0 # SL for IPoIB on partition with
+                                          # pkey 0x0001
+        ipoib                         : 4 # default IPoIB partition,
+                                          # pkey=0x7FFF
+        any, service-id 0x6234        : 6 # match any PR/MPR query with a
+                                          # specific Service ID
+        any, pkey 0x0ABC              : 6 # match any PR/MPR query with a
+                                          # specific PKey
+        srp, target-port-guid 0x1234  : 5 # SRP when SRP Target is located on
+                                          # a specified IB port GUID
+        any, target-port-guid 0x0ABC-0xFFFFF : 6 # match any PR/MPR query with
+                                          # a specific target port GUID
+    end-qos-ulps
+
+
+Similar to the full policy definition, matching of PR/MPR queries is done in
+order of appearance in the QoS policy file such as the first match takes
+precedence, except for the "default" rule, which is applied only if the query
+didn't match any other rule.
+
+All other sections of the QoS policy file take precedence over the qos-ulps
+section. That is, if a policy file has both qos-match-rules and qos-ulps
+sections, then any query is matched first against the rules in the
+qos-match-rules section, and only if there was no match, the query is matched
+against the rules in qos-ulps section.
+
+Note that some of these match rules may overlap, so in order to use the
+simplified QoS definition effectively, it is important to understand how each
+of the ULPs is matched:
+
+.B  IPoIB:
+PR query is matched by PKey. Default PKey for IPoIB partition is 0x7fff, so
+the following three match rules are equivalent:
+
+    ipoib              : <SL>
+    ipoib, pkey 0x7fff : <SL>
+    any,   pkey 0x7fff : <SL>
+
+.I Note
+: For OFED 1.3, IPoIB partition SL configuration should be done through
+partition configuration file only.
+
+\fBSDP\fP: PR query is matched by Service ID. The Service-ID for SDP is
+0x000000000001PPPP, where PPPP are 4 hex digits holding the remote TCP/IP
+Port Number to connect to. The following two match rules are equivalent:
+
+    sdp                                                   : <SL>
+    any, service-id 0x0000000000010000-0x000000000001ffff : <SL>
+
+\fBRDS\fP: Similar to SDP, RDS PR query is matched by Service ID. The
+Service ID for RDS is 0x000000000106PPPP, where PPPP are 4 hex digits
+holding the remote TCP/IP Port Number to connect to. Default port number
+for RDS is 0x48CA, which makes a default Service-ID 0x00000000010648CA.
+The following two match rules are equivalent:
+
+    rds                                : <SL>
+    any, service-id 0x00000000010648CA : <SL>
+
+\fBiSER\fP: Similar to RDS, iSER query is matched by Service ID, where the
+Service ID is also 0x000000000106PPPP. Default port number for iSER is 0x035C,
+which makes a default Service-ID 0x000000000106035C.
+The following two match rules are equivalent:
+
+    iser                               : <SL>
+    any, service-id 0x000000000106035C : <SL>
+
+\fBSRP\fP: Service ID for SRP varies from storage vendor to vendor, thus SRP query is
+matched by the target IB port GUID. The following two match rules are
+equivalent:
+
+    srp, target-port-guid 0x1234  : <SL>
+    any, target-port-guid 0x1234  : <SL>
+
+Note that any of the above ULPs might contain target port GUID in the PR
+query, so in order for these queries not to be recognized by the QoS manager
+as SRP, the SRP match rule (or any match rule that refers to the target port
+guid only) should be placed at the end of the qos-ulps match rules.
+
+\fBMPI\fP: SL for MPI is manually configured by MPI admin. OpenSM is not
+forcing any SL on the MPI traffic, and that's why it is the only ULP that
+did not appear in the qos-ulps section.
+
+
+.SS SL2VL Mapping and VL Arbitration
+.PP
+
+OpenSM cached options file has a set of QoS related configuration
+parameters, that are used to configure SL2VL mapping and VL arbitration
+on IB ports. These parameters are:
+  - Max VLs: the maximum number of VLs that will be on the subnet.
+  - High limit: the limit of High Priority component of VL Arbitration
+    table (IBA 7.6.9).
+  - VLArb low table: Low priority VL Arbitration table (IBA 7.6.9) template.
+  - VLArb high table: High priority VL Arbitration table (IBA 7.6.9) template.
+  - SL2VL: SL2VL Mapping table (IBA 7.6.6) template. It is a list of VLs
+    corresponding to SLs 0-15 (Note that VL15 used here means drop this SL).
+
+There are separate QoS configuration parameters sets for various target
+types: CAs, routers, switch external ports, and switch's enhanced port 0.
+The names of such parameters are prefixed by "qos_<type>_" string.
+Here is a full list of the currently supported sets:
+
+    qos_ca_  - QoS configuration parameters set for CAs.
+    qos_rtr_ - parameters set for routers.
+    qos_sw0_ - parameters set for switches' port 0.
+    qos_swe_ - parameters set for switches' external ports.
+
+Here's the example of typical default values for all the ports in the
+subnet (hard-coded in OpenSM initialization):
+
+    qos_max_vls=15
+    qos_high_limit=0
+    qos_vlarb_high=0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
+    qos_vlarb_low=0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
+    qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
+
+
+VL arbitration tables (both high and low) are lists of VL/Weight pairs.
+Each list entry contains a VL number (values from 0-14), and a weighting value
+(values 0-255), indicating the number of 64 byte units (credits) which may be
+transmitted from that VL when its turn in the arbitration occurs. A weight
+of 0 indicates that this entry should be skipped. If a list entry is
+programmed for VL15 or for a VL that is not supported or is not currently
+configured by the port, the port may either skip that entry or send from any
+supported VL for that entry.
+
+Note, that the same VLs may be listed multiple times in the High or Low
+priority arbitration tables, and, further, it can be listed in both tables.
+
+The limit of high-priority VLArb table (qos_<type>_high_limit) indicates the
+number of high-priority packets that can be transmitted without an opportunity
+to send a low-priority packet. Specifically, the number of bytes that can be
+sent is high_limit times 4K bytes.
+
+A high_limit value of 255 indicates that the byte limit is unbounded.
+Note: if the 255 value is used, the low priority VLs may be starved.
+A value of 0 indicates that only a single packet from the high-priority table
+may be sent before an opportunity is given to the low-priority table.
+
+Keep in mind that ports usually transmit packets of size equal to MTU.
+For instance, for 4KB MTU a single packet will require 64 credits, so in order
+to achieve effective VL arbitration for packets of 4KB MTU, the weighting
+values for each VL should be multiples of 64.
+
+Below is an example of SL2VL and VL Arbitration configuration on subnet:
+
+    qos_max_vls=15
+    qos_high_limit=6
+    qos_vlarb_high=0:4
+    qos_vlarb_low=0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64
+    qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
+
+In this example, there are 8 VLs configured on subnet: VL0 to VL7. VL0 is
+defined as a high priority VL, and it is limited to 6 x 4KB = 24KB in a single
+transmission burst. Such configuration would suilt VL that needs low latency
+and uses small MTU when transmitting packets. Rest of VLs are defined as low
+priority VLs with different weights, while VL4 is effectively turned off.

 .SH PREFIX ROUTES
 .PP
-- 
1.5.1.4




More information about the general mailing list