[Fwd: [ofa-general] [PATCH] opensm/man: Adding QoS-related info to opensm man pages]
Hal Rosenstock
hrosenstock at xsigo.com
Tue May 6 07:45:25 PDT 2008
Hi Yevgeny,
On Tue, 2008-05-06 at 17:34 +0300, Yevgeny Kliteynik wrote:
> Hi Hal,
>
> This is the mail that I was talking about (QoS info for OpenSM man page).
> Sasha has reviewed it, and posted his answer to the mailing list.
I must have missed that. What was the date of that post ?
See below for some additional comments.
-- Hal
>
> -- Yevgeny
>
>
> -------- Original Message --------
> Subject: [ofa-general] [PATCH] opensm/man: Adding QoS-related info to opensm man pages
> Date: Wed, 26 Mar 2008 02:47:08 +0200
> From: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> To: Sasha Khapyorsky <sashak at voltaire.com>
> CC: OpenIB <general at lists.openfabrics.org>
>
> Hi Sasha,
>
> I've added QoS related info to opensm man pages: enhanced
> existing part (that was talking about VL arbitration) and
> added description of QoS manager in accordance with QoS annex.
>
> Please apply to ofed_1_3 and master.
>
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
> opensm/man/opensm.8.in | 501 +++++++++++++++++++++++++++++++++++++++++++-----
> 1 files changed, 457 insertions(+), 44 deletions(-)
>
> diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in
> index 5322ab7..1d9c5b7 100644
> --- a/opensm/man/opensm.8.in
> +++ b/opensm/man/opensm.8.in
> @@ -35,7 +35,8 @@ to initialize the InfiniBand hardware (at least one per each
> InfiniBand subnet).
>
> opensm also now contains an experimental version of a performance
> -manager as well.
> +manager and an experimental version of a QoS manager (in accordance with
> +IBA QoS Annex).
Minor tweak as I think the performance manager is no longer being
indicated as experimental:
opensm also now contains a performance manager as well as an
experimental QoS manager (in accordance with IBTA 1.2.1 QoS Annex).
> opensm defaults were designed to meet the common case usage on clusters with up to a few hundred nodes. Thus, in this default mode, opensm will scan the IB
> fabric, initialize it, and sweep occasionally for changes.
> @@ -433,51 +434,463 @@ partition manager:
>
> Default=0x7fff,ipoib:ALL=full;
>
> -.SH QOS CONFIGURATION
> +.SH QUALITY OF SERVICE
> .PP
> -There are a set of QoS related low-level configuration parameters.
> -All these parameter names are prefixed by "qos_" string. Here is a full
> -list of these parameters:
> -
> - qos_max_vls - The maximum number of VLs that will be on the subnet
> - qos_high_limit - The limit of High Priority component of VL
> - Arbitration table (IBA 7.6.9)
> - qos_vlarb_low - Low priority VL Arbitration table (IBA 7.6.9)
> - template
> - qos_vlarb_high - High priority VL Arbitration table (IBA 7.6.9)
> - template
> - Both VL arbitration templates are pairs of
> - VL and weight
> - qos_sl2vl - SL2VL Mapping table (IBA 7.6.6) template. It is
> - a list of VLs corresponding to SLs 0-15 (Note
> - that VL15 used here means drop this SL)
> -
> -Typical default values (hard-coded in OpenSM initialization) are:
> -
> - qos_max_vls=15
> - qos_high_limit=0
> - qos_vlarb_low=0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
> - qos_vlarb_high=0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
> - qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
> -
> -The syntax is compatible with rest of OpenSM configuration options and
> -values may be stored in OpenSM config file (cached options file).
> -
> -In addition to the above, we may define separate QoS configuration
> -parameters sets for various target types. As targets, we currently support
> -CAs, routers, switch external ports, and switch's enhanced port 0. The
> -names of such specialized parameters are prefixed by "qos_<type>_"
> -string. Here is a full list of the currently supported sets:
> -
> - qos_ca_ - QoS configuration parameters set for CAs.
> - qos_rtr_ - parameters set for routers.
> - qos_sw0_ - parameters set for switches' port 0.
> - qos_swe_ - parameters set for switches' external ports.
> +OpenSM QoS support comprises two parts:
>
> -Examples:
> - qos_sw0_max_vls=2
> - qos_ca_sl2vl=0,1,2,3,5,5,5,12,12,0,
> - qos_swe_high_limit=0
> + 1. \fBQoS manager in accordance with IBA QoS Annex\fP (experimental)
> +.P
> + 2. \fBSL2VL and VL Arbitration tables configuration\fP
> +.P
> +.SS QoS Manager (experimental)
> +.PP
> +When Quality of Service in OpenSM is enabled (-Q or --qos), OpenSM looks
> +for the QoS Policy file. The default name of this file is
> +\fB\%@CONF_DIR@/@QOS_POLICY_FILE@\fP. The default may be changed by using
> +the -Y or --qos_policy_file option with OpenSM.
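> +
> +For example, to run OpenSM with QoS enabled and a non-default policy file
> +location (the path shown below is only an illustration):
> +
> + opensm -Q -Y /etc/opensm/qos-policy.conf   # illustrative path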
This is essentially the QoS management doc cast into the man page with
the older primitive QoS tacked on. Should the annex support just refer
to the doc and leave the older description present ? Or something else ?
Maybe that was addressed in Sasha's response.
> +
> +During fabric initialization and at every heavy sweep, OpenSM parses the
> +QoS policy file, applies its settings to the discovered fabric elements,
> +and enforces the provided policy on client requests. The overall flow for
> +such requests is as follows:
> + - The request is matched against the defined matching rules to find the
> + corresponding QoS Level definition.
> + - Given the QoS Level, path(s) search is performed with the restrictions
> + imposed by that level.
> +
> +There are two ways to define QoS policy:
> + - \fBFull\fP: the full policy file syntax provides the administrator with
> + various ways to match a PathRecord/MultiPathRecord (PR/MPR) request, and
> + to enforce various QoS constraints on the requested PR/MPR.
> + - \fBSimplified\fP: the simplified policy file syntax enables the
> + administrator to match PR/MPR requests by various ULPs and applications
> + running on top of these ULPs.
> +
> +While the full policy syntax is very flexible, in many cases the simplified
> +policy definition would be sufficient.
> +.PP
> +.B Full QoS Policy File
> +.PP
> +The QoS policy file has the following sections:
> +
> +.B I)
> +Port Groups (denoted by port-groups).
> +This section defines zero or more port groups that can be referred to later
> +by matching rules (see below). A port group lists ports by:
> + - Port GUID
> + - Port name, which is a combination of NodeDescription and IB port number
> + - PKey, which means that all the ports in the subnet that belong to a
> + partition with a given PKey belong to this port group
> + - Partition name, which means that all the ports in the subnet that belong
> + to a partition with a given name belong to this port group
> + - Node type, where possible node types are: CA, SWITCH, ROUTER, ALL, and
> + SELF (SM's port).
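> +
> +For instance, a port group that holds all the switches in the subnet could
> +be defined as follows (the group name is arbitrary):
> +
> + port-group
> + # the group name is arbitrary
> + name: Switches
> + node-type: SWITCH
> + end-port-group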
> +
> +.B II)
> +QoS Setup (denoted by qos-setup).
> +This section describes how to set up SL2VL and VL Arbitration tables on
> +various nodes in the fabric.
> +However, this is not supported in OFED 1.3.
> +SL2VL and VLArb tables should be configured in the OpenSM options file.
> +
> +.B III)
> +QoS Levels (denoted by qos-levels).
> +Each QoS Level defines a Service Level (SL) and a few optional fields:
> + - MTU limit
> + - Rate limit
> + - PKey
> + - Packet lifetime
> +
> +When path(s) search is performed, it is done with regard to the restrictions
> +that these QoS Level parameters impose.
> +The one QoS Level that is mandatory to define is the DEFAULT QoS Level. It is
> +applied to a PR/MPR query that does not match any existing match rule.
> +Like any other QoS Level, it can also be explicitly referred to by any
> +match rule.
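> +
> +For instance, a QoS Level that assigns SL 2 and limits matching paths to a
> +2K MTU could look as follows (the name and values are illustrative, and
> +mtu-limit is assumed to use the IBA MTU encoding, where 4 denotes 2048 bytes):
> +
> + qos-level
> + # the name and values below are illustrative;
> + # mtu-limit is assumed to use the IBA MTU encoding (4 = 2048 bytes)
> + name: LowLatency
> + sl: 2
> + mtu-limit: 4
> + end-qos-level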
> +
> +.B IV)
> +QoS Matching Rules (denoted by qos-match-rules).
> +Each PathRecord/MultiPathRecord query that OpenSM receives is matched against
> +the set of matching rules. Rules are scanned in order of appearance in the QoS
> +policy file, such that the first match takes precedence.
> +Each rule names the QoS Level that will be applied to the matching query.
> +A default QoS level is applied to a query that did not match any rule.
> +Queries can be matched by:
> + - Source port group (whether a source port is a member of a specified group)
> + - Destination port group (same as above, only for destination port)
> + - PKey
> + - QoS class
> + - Service ID
> +
> +To match a certain matching rule, a PR/MPR query has to match ALL of the
> +rule's criteria. However, not all the fields of the PR/MPR query have to
> +appear in the matching rule.
> +For instance, if the rule has a single criterion - Service ID - it will match
> +any query that has this Service ID, disregarding the rest of the query fields.
> +However, if a certain query has only Service ID (which means that this is the
> +only bit in the PR/MPR component mask that is on), it will not match any rule
> +that has other matching criteria besides Service ID.
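> +
> +For instance, consider the following rule (the values here are illustrative):
> +
> + qos-match-rule
> + # the values below are illustrative
> + use: illustrates the matching semantics
> + service-id: 0x2345
> + pkey: 0x1234
> + qos-level-name: DEFAULT
> + end-qos-match-rule
> +
> +A query whose component mask includes both Service ID 0x2345 and PKey 0x1234
> +matches this rule regardless of any other components it carries, while a
> +query that has only the Service ID component set does not match it, because
> +the rule also requires a PKey.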
> +.PP
> +.B Simplified QoS Policy Definition
> +.PP
> +The simplified QoS policy definition comprises a single section denoted by
> +qos-ulps. Similar to the full QoS policy, it has a list of match rules and
> +their QoS Level, but in this case a match rule has only one criterion - its
> +goal is to match a certain ULP (or a certain application on top of this ULP)
> +PR/MPR request, and the QoS Level has only one constraint - Service Level (SL).
> +The simplified policy section may appear in the policy file in combination
> +with the full policy, or as a stand-alone policy definition.
> +See more details and list of match rule criteria below.
> +.PP
> +.B Policy File Syntax Guidelines
> +.PP
> +Leading and trailing blanks, as well as empty lines, are ignored, so the
> +indentation in the examples below is just for better readability.
> +Comments are started with the pound sign (#) and terminated by EOL.
> +Any keyword should be the first non-blank in the line, unless it's a comment.
> +Keywords that denote section/subsection start have matching closing keywords.
> +Having a QoS Level named "DEFAULT" is a must - it is applied to PR/MPR
> +requests that didn't match any of the matching rules.
> +Any section/subsection of the policy file is optional.
> +
> +.PP
> +.B Examples of Full Policy File
> +.PP
> +As mentioned earlier, any section of the policy file is optional, and
> +the only mandatory part of the policy file is a default QoS Level.
> +Here's an example of the shortest policy file:
> +
> + qos-levels
> + qos-level
> + name: DEFAULT
> + sl: 0
> + end-qos-level
> + end-qos-levels
> +
> +The port groups section is missing because there are no match rules, which
> +means that port groups are not referred to anywhere, and there is no need to
> +define them. And since this policy file doesn't have any matching rules, no
> +PR/MPR query will match a rule, and OpenSM will enforce the default QoS level.
> +Essentially, the above example is equivalent to not having a QoS policy file
> +at all.
> +
> +The following example shows all the possible options and keywords in the
> +policy file and their syntax:
> +
> + #
> + # See the comments in the following example.
> + # They explain different keywords and their meaning.
> + #
> + port-groups
> + port-group
> + name: Storage
> + # "use" is just a description that is used for logging
> + # Other than that, it is just a comment
> + use: SRP Targets
> + port-guid: 0x10000000000001, 0x10000000000005-0x1000000000FFFA
> + port-guid: 0x1000000000FFFF
> + end-port-group
> +
> + port-group
> + name: Virtual Servers
> + # The syntax of the port name is as follows:
> + # "node_description/Pnum".
> + # node_description is compared to the NodeDescription of the node,
> + # and "Pnum" is a port number on that node.
> + port-name: vs1 HCA-1/P1, vs2 HCA-1/P1
> + end-port-group
> +
> + # using partitions defined in the partition policy
> + port-group
> + name: Partitions
> + partition: Part1
> + pkey: 0x1234
> + end-port-group
> +
> + # using node types: CA, ROUTER, SWITCH, SELF (for node that runs SM)
> + # or ALL (for all the nodes in the subnet)
> + port-group
> + name: CAs and SM
> + node-type: CA, SELF
> + end-port-group
> +
> + end-port-groups
> +
> + qos-setup
> + # This section of the policy file describes how to set up SL2VL and VL
> + # Arbitration tables on various nodes in the fabric.
> + # However, this is not supported in OFED 1.3 - the section is parsed
> + # and ignored. SL2VL and VLArb tables should be configured in the
> + # OpenSM options file (by default - /var/cache/opensm/opensm.opts).
> + end-qos-setup
> +
> + qos-levels
> +
> + # Having a QoS Level named "DEFAULT" is a must - it is applied to
> + # PR/MPR requests that didn't match any of the matching rules.
> + qos-level
> + name: DEFAULT
> + use: default QoS Level
> + sl: 0
> + end-qos-level
> +
> + # the whole set: SL, MTU-Limit, Rate-Limit, PKey, Packet Lifetime
> + qos-level
> + name: WholeSet
> + sl: 1
> + mtu-limit: 4
> + rate-limit: 5
> + pkey: 0x1234
> + packet-life: 8
> + end-qos-level
> +
> + end-qos-levels
> +
> + # Match rules are scanned in order of their appearance in the policy file.
> + # First matched rule takes precedence.
> + qos-match-rules
> +
> + # matching by single criteria: QoS class
> + qos-match-rule
> + use: by QoS class
> + qos-class: 7-9,11
> + # Name of qos-level to apply to the matching PR/MPR
> + qos-level-name: WholeSet
> + end-qos-match-rule
> +
> + # show matching by destination group and service id
> + qos-match-rule
> + use: Storage targets
> + destination: Storage
> + service-id: 0x10000000000001, 0x10000000000008-0x10000000000FFF
> + qos-level-name: WholeSet
> + end-qos-match-rule
> +
> + qos-match-rule
> + source: Storage
> + use: match by source group only
> + qos-level-name: DEFAULT
> + end-qos-match-rule
> +
> + qos-match-rule
> + use: match by all parameters
> + qos-class: 7-9,11
> + source: Virtual Servers
> + destination: Storage
> + service-id: 0x0000000000010000-0x000000000001FFFF
> + pkey: 0x0F00-0x0FFF
> + qos-level-name: WholeSet
> + end-qos-match-rule
> +
> + end-qos-match-rules
> +
> +.PP
> +.B Simplified QoS Policy - Details and Examples
> +.PP
> +Simplified QoS policy match rules are tailored for matching PR/MPR requests
> +of ULPs (or of some application running on top of a ULP). The policy lists
> +per-ULP (or per-application) match rules and the SL that should be
> +enforced on the matched PR/MPR query.
> +
> +Match rules include:
> + - Default match rule that is applied to PR/MPR query that didn't
> + match any of the other match rules
> + - SDP
> + - SDP application with a specific target TCP/IP port range
> + - SRP with a specific target IB port GUID
> + - RDS
> + - iSER
> + - iSER application with a specific target TCP/IP port range
> + - IPoIB with a default PKey
> + - IPoIB with a specific PKey
> + - any ULP/application with a specific Service ID in the PR/MPR query
> + - any ULP/application with a specific PKey in the PR/MPR query
> + - any ULP/application with a specific target IB port GUID in the PR/MPR query
> +
> +Since any section of the policy file is optional, as long as the basic rules
> +of the file are kept (such as not referring to a nonexistent port group,
> +having a default QoS Level, etc.), the simplified policy section (qos-ulps)
> +can serve as a complete QoS policy file.
> +The shortest policy file in this case would be as follows:
> +
> + qos-ulps
> + default : 0 #default SL
> + end-qos-ulps
> +
> +It is equivalent to not having a policy file at all.
> +
> +Below is an example of simplified QoS policy with all the possible keywords:
> +
> + qos-ulps
> + default : 0 # default SL
> + sdp, port-num 30000 : 0 # SL for application running on top
> + # of SDP when a destination
> + # TCP/IP port is 30000
> + sdp, port-num 10000-20000 : 0
> + sdp : 1 # default SL for any other
> + # application running on top of SDP
> + rds : 2 # SL for RDS traffic
> + iser, port-num 900 : 0 # SL for iSER with a specific target
> + # port
> + iser : 3 # default SL for iSER
> + ipoib, pkey 0x0001 : 0 # SL for IPoIB on partition with
> + # pkey 0x0001
> + ipoib : 4 # default IPoIB partition,
> + # pkey=0x7FFF
> + any, service-id 0x6234 : 6 # match any PR/MPR query with a
> + # specific Service ID
> + any, pkey 0x0ABC : 6 # match any PR/MPR query with a
> + # specific PKey
> + srp, target-port-guid 0x1234 : 5 # SRP when SRP Target is located on
> + # a specified IB port GUID
> + any, target-port-guid 0x0ABC-0xFFFFF : 6 # match any PR/MPR query with
> + # a specific target port GUID
> + end-qos-ulps
> +
> +
> +Similar to the full policy definition, matching of PR/MPR queries is done in
> +order of appearance in the QoS policy file, such that the first match takes
> +precedence, except for the "default" rule, which is applied only if the query
> +didn't match any other rule.
> +
> +All other sections of the QoS policy file take precedence over the qos-ulps
> +section. That is, if a policy file has both qos-match-rules and qos-ulps
> +sections, then any query is matched first against the rules in the
> +qos-match-rules section, and only if there was no match, the query is matched
> +against the rules in the qos-ulps section.
> +
> +Note that some of these match rules may overlap, so in order to use the
> +simplified QoS definition effectively, it is important to understand how each
> +of the ULPs is matched:
> +
> +.B IPoIB:
> +PR query is matched by PKey. The default PKey for the IPoIB partition is
> +0x7fff, so the following three match rules are equivalent:
> +
> + ipoib : <SL>
> + ipoib, pkey 0x7fff : <SL>
> + any, pkey 0x7fff : <SL>
> +
> +.I Note
> +: For OFED 1.3, IPoIB partition SL configuration should be done through
> +the partition configuration file only.
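> +In that case the SL is given as a flag in the partition definition, along
> +the following lines (a sketch only; the exact flag syntax may vary between
> +OpenSM versions):
> +
> + # the sl value below is illustrative
> + Default=0x7fff, ipoib, sl=1 : ALL=full;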
> +
> +\fBSDP\fP: PR query is matched by Service ID. The Service-ID for SDP is
> +0x000000000001PPPP, where PPPP are 4 hex digits holding the remote TCP/IP
> +Port Number to connect to. The following two match rules are equivalent:
> +
> + sdp : <SL>
> + any, service-id 0x0000000000010000-0x000000000001ffff : <SL>
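> +
> +For instance, to match only SDP connections to target TCP port 30000
> +(30000 decimal is 0x7530), the following two rules are equivalent:
> +
> + sdp, port-num 30000 : <SL>                   # port 30000 = 0x7530
> + any, service-id 0x0000000000017530 : <SL>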
> +
> +\fBRDS\fP: Similar to SDP, RDS PR query is matched by Service ID. The
> +Service ID for RDS is 0x000000000106PPPP, where PPPP are 4 hex digits
> +holding the remote TCP/IP Port Number to connect to. Default port number
> +for RDS is 0x48CA, which makes a default Service-ID 0x00000000010648CA.
> +The following two match rules are equivalent:
> +
> + rds : <SL>
> + any, service-id 0x00000000010648CA : <SL>
> +
> +\fBiSER\fP: Similar to RDS, iSER query is matched by Service ID, where the
> +Service ID is also 0x000000000106PPPP. Default port number for iSER is 0x035C,
> +which makes a default Service-ID 0x000000000106035C.
> +The following two match rules are equivalent:
> +
> + iser : <SL>
> + any, service-id 0x000000000106035C : <SL>
> +
> +\fBSRP\fP: The Service ID for SRP varies from storage vendor to vendor, thus
> +the SRP query is matched by the target IB port GUID. The following two match
> +rules are equivalent:
> +
> + srp, target-port-guid 0x1234 : <SL>
> + any, target-port-guid 0x1234 : <SL>
> +
> +Note that any of the above ULPs might contain a target port GUID in the PR
> +query, so in order for these queries not to be recognized by the QoS manager
> +as SRP, the SRP match rule (or any match rule that refers to the target port
> +GUID only) should be placed at the end of the qos-ulps match rules.
> +
> +\fBMPI\fP: SL for MPI is manually configured by the MPI admin. OpenSM does not
> +enforce any SL on MPI traffic, which is why MPI is the only ULP that does not
> +appear in the qos-ulps section.
> +
> +
> +.SS SL2VL Mapping and VL Arbitration
> +.PP
> +
> +The OpenSM cached options file has a set of QoS-related configuration
> +parameters that are used to configure SL2VL mapping and VL arbitration
> +on IB ports. These parameters are:
> + - Max VLs: the maximum number of VLs that will be on the subnet.
> + - High limit: the limit of High Priority component of VL Arbitration
> + table (IBA 7.6.9).
> + - VLArb low table: Low priority VL Arbitration table (IBA 7.6.9) template.
> + - VLArb high table: High priority VL Arbitration table (IBA 7.6.9) template.
> + - SL2VL: SL2VL Mapping table (IBA 7.6.6) template. It is a list of VLs
> + corresponding to SLs 0-15 (Note that VL15 used here means drop this SL).
> +
> +There are separate QoS configuration parameter sets for various target
> +types: CAs, routers, switch external ports, and switch's enhanced port 0.
> +The names of such parameters are prefixed by "qos_<type>_" string.
> +Here is a full list of the currently supported sets:
> +
> + qos_ca_ - QoS configuration parameters set for CAs.
> + qos_rtr_ - parameters set for routers.
> + qos_sw0_ - parameters set for switches' port 0.
> + qos_swe_ - parameters set for switches' external ports.
> +
> +Here is an example of the typical default values for all the ports in the
> +subnet (hard-coded in OpenSM initialization):
> +
> + qos_max_vls=15
> + qos_high_limit=0
> + qos_vlarb_high=0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
> + qos_vlarb_low=0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
> + qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
> +
> +
> +VL arbitration tables (both high and low) are lists of VL/Weight pairs.
> +Each list entry contains a VL number (values from 0-14), and a weighting value
> +(values 0-255), indicating the number of 64 byte units (credits) which may be
> +transmitted from that VL when its turn in the arbitration occurs. A weight
> +of 0 indicates that this entry should be skipped. If a list entry is
> +programmed for VL15 or for a VL that is not supported or is not currently
> +configured by the port, the port may either skip that entry or send from any
> +supported VL for that entry.
> +
> +Note that the same VL may be listed multiple times in the high or low
> +priority arbitration tables, and, further, it can be listed in both tables.
> +
> +The limit of high-priority VLArb table (qos_<type>_high_limit) indicates the
> +number of high-priority packets that can be transmitted without an opportunity
> +to send a low-priority packet. Specifically, the number of bytes that can be
> +sent is high_limit times 4K bytes.
> +
> +A high_limit value of 255 indicates that the byte limit is unbounded.
> +Note: if the 255 value is used, the low priority VLs may be starved.
> +A value of 0 indicates that only a single packet from the high-priority table
> +may be sent before an opportunity is given to the low-priority table.
> +
> +Keep in mind that ports usually transmit packets of size equal to MTU.
> +For instance, for 4KB MTU a single packet will require 64 credits, so in order
> +to achieve effective VL arbitration for packets of 4KB MTU, the weighting
> +values for each VL should be multiples of 64.
> +
> +Below is an example of SL2VL and VL Arbitration configuration on a subnet:
> +
> + qos_max_vls=15
> + qos_high_limit=6
> + qos_vlarb_high=0:4
> + qos_vlarb_low=0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64
> + qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
> +
> +In this example, there are 8 VLs configured on the subnet: VL0 to VL7. VL0 is
> +defined as a high priority VL, and it is limited to 6 x 4KB = 24KB in a single
> +transmission burst. Such a configuration would suit a VL that needs low latency
> +and uses small MTU when transmitting packets. The rest of the VLs are defined
> +as low priority VLs with different weights, while VL4 is effectively turned off.
>
> .SH PREFIX ROUTES
> .PP