[openib-general] OSM QoS policy file

Hal Rosenstock halr at voltaire.com
Tue Feb 13 07:15:04 PST 2007


Hi Yevgeny,

Sorry for the slow response; I've been consumed getting ready for OFED
1.2 alpha.

On Mon, 2007-02-05 at 07:37, Yevgeny Kliteynik wrote:
> Hi Hal.
> 
> I added osm/doc/qos-policy.txt file with the description of the QoS
> policy file, and an example of such file (with more comments inside).
> I'm sure you'll have questions and corrections regarding this file,
> so for now, to make our work easier, I'm not sending it as patch, 
> but just as text. Please review the file.

Thanks for doing this. This helps but I still do have a number of
questions on it as you expected. See below for specifics.

It would be nice to turn this into a DTD when things get closer to
finalizing so XML configs could readily be validated. Can you do this ? 

I'd also like to see a futures/todo list. I think we've discussed a few
topics which fall into this category.

Thanks.

-- Hal

> Thanks
> 
> -- Yevgeny
> 
> =============================================================
> 
> QoS Policy File
> ===============
> 
> The QoS policy file is divided into 4 sub sections:
> 
>  - Port Group: a set of CAs, Routers or Switches that share 
>    the same settings. A port group might be a partition 
>    defined by the partition manager policy in terms of 
>    GUIDs. Future implementations might provide support 
>    for NodeDescription based definition of port groups.

IMO, should this group be a separate schema on which this (and
partitions and perhaps other things) are based ?

>  - Fabric Setup: 
>    Defines how the SL2VL and VLArb tables should be set up.
>    This policy definition assumes the computation of target 
>    behavior should be performed outside of OpenSM.

Rather than fabric setup, is this better named QoS Setup (which seems
consistent with the tag used below) or QoS Fabric Setup  ? Also, what is
the relation of this group to the port group ?

>  - QoS-Levels Definition:
>    This section defines the possible sets of parameters for 
>    QoS that a client might be mapped to. Each set holds: SL
>    and optionally: Max MTU, Max Rate, Packet Lifetime and 
>    QoS Class.
> 
>  - Matching Rules:
>    A list of rules that match an incoming PathRecord request
>    to a QoS-Level. The rules are processed in order such that 
>    the first match is applied. Each rule is built out of a set
>    of match expressions which should all match for the rule
>    to apply. The matching expressions are defined for the 
>    following fields:
>      - SRC and DST to lists of port groups
>      - Service-ID to a list of Service-ID or Service-ID ranges
>      - QoS Class to a list of QoS Class values or ranges
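
To check that the first-match semantics read the same to everyone, here
is a minimal sketch in Python. It is purely illustrative and not OpenSM
code: the dictionary layout and the names rule_matches() and
find_qos_level() are hypothetical, and it assumes each request has
already been resolved to a single source and destination port group.

```python
# Illustrative sketch only; the data layout and function names are
# assumptions, not OpenSM code.

def rule_matches(rule, request):
    """A rule applies only if every criterion it specifies matches."""
    checks = []
    if "source" in rule:
        checks.append(request["src_group"] in rule["source"])
    if "destination" in rule:
        checks.append(request["dst_group"] in rule["destination"])
    if "service" in rule:
        checks.append(request["service_id"] in rule["service"])
    if "qos_class" in rule:
        checks.append(request["qos_class"] in rule["qos_class"])
    # all([]) is True, so a rule with no criteria matches everything.
    return all(checks)


def find_qos_level(rules, request):
    """Scan the rules in file order and return the first matching level."""
    for rule in rules:
        if rule_matches(rule, request):
            return rule["qos_level_sn"]
    return None  # no rule matched; default handling is left open here


# Rules mirroring the three <qos-match-rule> entries in the example file.
rules = [
    {"qos_level_sn": 1, "qos_class": {7, 8, 9, 11}},
    {"qos_level_sn": 2, "destination": {"Storage"},
     "service": {22} | set(range(4719, 5001))},
    {"qos_level_sn": 3, "source": {"Storage"}},
]

request = {"src_group": "Storage", "dst_group": "Storage",
           "service_id": 22, "qos_class": 0}
print(find_qos_level(rules, request))
```

With these rules, the request falls through rule 1 (QoS Class 0 is not
listed) and is picked up by rule 2, even though rule 3 would also match.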
> 
> 
> Example of the QoS policy file
> ==============================
> 
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <qos-policy>
>     <!-- Port Groups define sets of ports to be used later in the settings -->
>     <port-groups>
>         <!-- using port GUIDs -->
>         <port-group> 
>             <name>Storage</name> 
>             <!-- <use> is just a description that is used for logging.
>                  Other than that, it is just a commentary -->

I would think the name is for logging. use also ?

>             <use>our SRP storage targets</use>
>             <port-guid>0x1000000000000001</port-guid>
>             <port-guid>0x1000000000000002</port-guid>
>         </port-group>
>         <port-group> 
>             <name>Virtual Servers</name> 
>             <use>node desc and IB port #</use>
>             <!-- The syntax of the port name is as follows: "hostname/CA-num/Pnum".
>                  "hostname" and "CA-num" are compared to the first 2 words of 
>                  NodeDescription, and "Pnum" is a port number on that node. -->
>             <port-name>vs1/HCA-1/P1</port-name>
>             <port-name>vs3/HCA-1/P1</port-name>
>             <port-name>vs3/HCA-2/P1</port-name>

Shouldn't this be CA rather than HCA ?

I think this may also cover routers too.

Also, any support for switches ?

>         </port-group>
>         <!-- using partitions defined in the partition policy -->
>         <port-group> 
>             <name>Partition 1</name> 
>             <use>default settings</use>
>             <partition>Part1</partition>

This would correlate to the partition named Part1 in the partition
configuration. Should pkey based port groups be supported as well ? Just
wondering...

The current partition config indicates a set of port GUIDs and whether
they are full or limited members. As mentioned before, I would prefer
that this heads towards a port grouping schema on which both partitions
and QoS and perhaps other things depend.

>         </port-group>
>         <!-- using node types CA|ROUTER|SWITCH -->
>         <port-group> 
>             <name>Routers</name> 
>             <use>all routers</use>
>             <node-type>ROUTER</node-type> 
>         </port-group>  
>     </port-groups>

This grouping is similar to existing QoS support. For switches, there
are external/physical ports and extended switch port 0 which are
different. Base switch port 0 does not support QoS.

>     <qos-setup>
>         <sl2vl-tables>
>             <!-- scope defines the exact devices and in/out ports the tables apply to
>                  if the same port is matching several rules the last one applies -->
>             <sl2vl-scope> 
>                 <group>Part1</group> 
>                 <!-- *see explanation below the policy file example* -->
>                 <from>*</from> 
>                 <!-- *see explanation below the policy file example* -->
>                 <to>*</to> 
>                 <!-- SL2VL table has to have exactly 16 values (one for each SL) -->
>                 <sl2vl-table>0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7</sl2vl-table>
>             </sl2vl-scope>
>             <sl2vl-scope>
>                 <!-- *see explanation below the policy file example* -->
>                 <across-from>Storage1</across-from>
>                 <!-- *see explanation below the policy file example* -->
>                 <across-to>Storage2</across-to>
>                 <sl2vl-table>0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0</sl2vl-table>
>             </sl2vl-scope>
>         </sl2vl-tables>
> 
>         <!-- define all types of VLArb tables. The length of the tables should 
>              match the physically supported tables by their target ports -->
>         <vlarb-tables>
>             <!-- scope defines the exact ports the VLArb tables apply to -->
>             <vlarb-scope> 
>                 <!-- defining VLArb tables on all the ports that belong to 
>                      port group 'Storage', and on all the ports that are connected 
>                      to ports of port group 'Storage' -->
>                 <group>Storage</group>
>                 <!-- "across" means all the ports that are connected to ports 
>                      that belong to the specified port group -->
>                 <across>Storage</across>
>                 <!-- VLArb table holds VL and weight pairs -->
>                 <vlarb-high>0:255,1:127,2:63,3:31,4:15,5:7,6:3,7:1</vlarb-high>
>                 <vlarb-low>8:255,9:127,10:63,11:31,12:15,13:7,14:3</vlarb-low>
>                 <vl-high-limit>10</vl-high-limit>
>             </vlarb-scope>
>         </vlarb-tables>
> 
>     </qos-setup>
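
As a reading aid for the VL:weight pair syntax above, here is a minimal
Python sketch of how such a list might be parsed and sanity checked. It
is not OpenSM code; parse_vlarb() is a hypothetical helper, and the only
limits it enforces are the field widths (4-bit VL, 8-bit weight).

```python
# Illustrative sketch only; parse_vlarb() is a hypothetical helper.

def parse_vlarb(text):
    """Parse "VL:weight,VL:weight,..." into a list of (vl, weight)."""
    entries = []
    for item in text.split(","):
        vl_text, sep, weight_text = item.strip().partition(":")
        if not sep:
            raise ValueError(f"expected VL:weight, got {item!r}")
        vl, weight = int(vl_text), int(weight_text)
        if not 0 <= vl <= 15:
            raise ValueError(f"VL {vl} does not fit in 4 bits")
        if not 0 <= weight <= 255:
            raise ValueError(f"weight {weight} does not fit in 8 bits")
        entries.append((vl, weight))
    return entries


# The high-priority table from the example file.
high = parse_vlarb("0:255,1:127,2:63,3:31,4:15,5:7,6:3,7:1")
print(high)
```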
> 
>     <qos-levels>
>         <!-- the first one is just setting SL -->
>         <qos-level> 
>             <!-- Serial number (unique ID) of the QoS level -->
>             <sn>1</sn> 
>             <use>for the lowest priority comm</use>
>             <sl>16</sl>

SL 16 is not valid.

>             <pkey>123</pkey>

Will this take hex as well as decimal ?
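
If both forms are meant to be accepted, base auto-detection is one way
to do it. A minimal Python sketch follows; it is illustrative only, and
parse_number() and parse_sl() are hypothetical names. The SL check
reflects that SL is a 4-bit field, so only 0-15 are valid.

```python
# Illustrative sketch only; helper names are hypothetical.

def parse_number(text):
    """Accept decimal ("123") or 0x-prefixed hex ("0x7b").

    With base 0, Python infers the base from the prefix. Note that it
    rejects decimals with leading zeros such as "0123".
    """
    return int(text.strip(), 0)


def parse_sl(text):
    """Parse an SL value and reject anything outside 0-15."""
    sl = parse_number(text)
    if not 0 <= sl <= 15:
        raise ValueError(f"SL {sl} is out of range 0-15")
    return sl


print(parse_number("123"), parse_number("0x7b"))
```

With this sketch, the <sl>16</sl> in the example above would be
rejected at parse time rather than silently accepted.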

>             <packet-life>16</packet-life>
>         </qos-level>
>         <!-- the second sets SL and QoS Class -->
>         <qos-level> 
>             <sn>2</sn> 
>             <use>low latency best bandwidth</use>
>             <sl>0</sl> 
>             <qos-class>7</qos-class>
>         </qos-level>
>         <!-- the whole set: SL, QoS Class, MTU-Limit, Rate-Limit, Packet Lifetime -->
>         <qos-level> 
>             <sn>3</sn> 
>             <use>just an example</use>
>             <sl>0</sl> 
>             <qos-class>32</qos-class> 
>             <mtu-limit>1</mtu-limit> 
>             <rate-limit>1</rate-limit>
>             <packet-life>12</packet-life>
>         </qos-level>
>     </qos-levels>
> 
>     <!-- Match rules are scanned in a first-fit manner (like firewall rules table) -->
>     <qos-match-rules>
>         <!-- matching by single criteria: class (list of values and ranges) -->
>         <qos-match-rule> 
>             <qos-level-sn>1</qos-level-sn> <!-- defined in <sn> of <qos-level> -->

Can this be <sn> rather than <qos-level-sn> or can't the keywords be
duplicated ?

>             <use>low latency by class 7-9 or 11</use> <!-- just a description -->
>             <qos-class>7-9,11</qos-class>
>         </qos-match-rule>
>         <!-- show matching by destination group AND service-ids -->
>         <qos-match-rule> 
>             <qos-level-sn>2</qos-level-sn> 
>             <use>Storage targets connection</use>
>             <destination>Storage</destination>

Is destination a port group and used for matching destination GID or LID
on SA PR/MPR lookups ?

>             <service>22,4719-5000</service>
>         </qos-match-rule>
>         <!-- show matching by source group only -->
>         <qos-match-rule> 
>             <qos-level-sn>3</qos-level-sn> 
>             <use>bla bla</use>
>             <source>Storage</source>

Is source a port group and used for matching source GID or LID on SA
PR/MPR lookups ?

>         </qos-match-rule>
>     </qos-match-rules>
> 
> </qos-policy>
> 
> 
> Explanation of some fields
> ==========================
> 
> The meaning of most tags is either intuitive or explained by the 
> comments in the file. One section that deserves a special
> explanation is SL2VL tables definition - <sl2vl-scope>.
> 
> In general, VL is a function of in-port (the port that the packet
> has entered through), out-port (the port that the packet is supposed
> to come out from) and the SL.
> In OpenSM, SL2VL table is defined on every port, where this port is 
> an out-port. Hence, on every port, SL2VL table is defined as function 
> of in-port and SL.
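
To make the "function of in-port and SL on each out-port" description
concrete, here is a minimal Python sketch. It is only a model of the
description above, not OpenSM's data structures; OutPort,
parse_sl2vl_table() and the method names are hypothetical.

```python
# Illustrative model only; class and function names are hypothetical.

NUM_SLS = 16  # one table entry per SL, as the example file requires


def parse_sl2vl_table(text):
    """Parse a comma-separated SL2VL table and check it has 16 entries."""
    vls = [int(v) for v in text.split(",")]
    if len(vls) != NUM_SLS:
        raise ValueError(f"expected {NUM_SLS} entries, got {len(vls)}")
    if any(not 0 <= vl <= 15 for vl in vls):
        raise ValueError("VL values must fit in 4 bits (0-15)")
    return vls


class OutPort:
    """An out-port holding one SL2VL table per in-port."""

    def __init__(self):
        self.sl2vl = {}  # in-port number -> list of 16 VLs indexed by SL

    def set_table(self, in_port, table):
        self.sl2vl[in_port] = list(table)

    def vl_for(self, in_port, sl):
        """VL for a packet that entered on in_port with the given SL."""
        return self.sl2vl[in_port][sl]


port = OutPort()
port.set_table(1, parse_sl2vl_table("0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7"))
print(port.vl_for(in_port=1, sl=15))
```

Using the first table from the example file, a packet with SL 15 that
entered on port 1 is mapped to VL 7.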

Would the syntax work for any SM ?

Are the below tags applicable to more than switches or only switches ?

> <to>n,m</to>

Will it take n-m too (port range) ? Might be more concise for some
configs.
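
If ranges were accepted here, the list-and-range form already used for
<qos-class> and <service> (e.g. "7-9,11") could be reused. A minimal
Python sketch of parsing that form follows. It is illustrative only;
parse_range_list() and in_ranges() are hypothetical names, and the "*"
wildcard is not handled.

```python
# Illustrative sketch only; function names are hypothetical.

def parse_range_list(text):
    """Parse "7-9,11" into inclusive (low, high) pairs."""
    ranges = []
    for item in text.split(","):
        lo_text, sep, hi_text = item.strip().partition("-")
        lo = int(lo_text)
        hi = int(hi_text) if sep else lo  # a single value is a 1-wide range
        if lo > hi:
            raise ValueError(f"descending range {item!r}")
        ranges.append((lo, hi))
    return ranges


def in_ranges(ranges, value):
    """True if value falls inside any of the inclusive ranges."""
    return any(lo <= value <= hi for lo, hi in ranges)


ranges = parse_range_list("7-9,11")
print(ranges, in_ranges(ranges, 8), in_ranges(ranges, 10))
```

Keeping the bounds rather than expanding every value also stays cheap
for wide Service-ID ranges.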

>   This means that of all the ports of the specified port group, define
>   SL2VL tables where to-ports are ports number n and m. Since SL2VL 
>   table is defined per out-port, using <to> effectively means defining
>   SL2VL table on ports n and m.
>   In order to specify that SL2VL table should be defined on all the 
>   ports, an asterisk (*) can be used.
> 
> <from>i,j</from>

Will this take i-j too (port range) ? Might be more concise for some
configs.

>   This means that of all the ports of the specified port group that were
>   not filtered out by the <to> value, define SL2VL table only for entries
>   where from-ports are ports number i and j.
>   In order to specify that SL2VL table should be defined for all the in-ports, 
>   an asterisk (*) can be used.
> 
> To specify that all the SL2VL tables entries should be defined for all 
> the ports of a certain group, use the following:
>     <group>port_group</group> 
>     <from>*</from>
>     <to>*</to>
> 
> <across-to>PortGroupName</across-to>
>   
>   This is a combination of the <across> keyword (which can be found in the
>   VLArb tables definition) and the <to> keyword.
>   <across>PortGroupName</across> means that the ports that we're talking about
>   are all the ports that are connected to ports that belong to PortGroupName.
>   Essentially, <across-to>PortGroupName</across-to> means the following:
>   <to>list_of_all_the_ports_that_are_connected_to_group_PortGroupName</to>
>   
>   Example of usage of <across-to>:
>   A user has a set of 'special' nodes (e.g. storage nodes), and all the
>   traffic to these nodes has to get a specific VL. The solution is to define
>   a port group (e.g. "Storage") that includes all the ports of these nodes,
>   and then to configure SL2VL tables on all the switch ports that are
>   connected to the Storage port group by specifying
>   <across-to>Storage</across-to>.
>   
> <across-from>PortGroupName</across-from>
> 
>   Similar to <across-to>, <across-from> is a combination of the <across>
>   and <from> keywords.
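
For reference, here is how the "across" expansion reads to me, as a
minimal Python sketch. It is an interpretation of the text above, not
OpenSM code: ports are modelled as hypothetical (node, port number)
pairs, and build_links() and across() are made-up helper names.

```python
# Illustrative sketch only; the topology model and names are assumptions.

def build_links(cables):
    """Turn a list of (port_a, port_b) cables into a two-way peer map."""
    links = {}
    for a, b in cables:
        links[a] = b
        links[b] = a
    return links


def across(links, group_ports):
    """Return every port that is cabled to a port in group_ports."""
    return {links[p] for p in group_ports if p in links}


cables = [
    (("stor1", 1), ("sw1", 5)),
    (("stor2", 1), ("sw1", 6)),
    (("sw1", 1), ("sw2", 1)),  # trunk link, not touched by the group
]
storage = {("stor1", 1), ("stor2", 1)}
print(sorted(across(build_links(cables), storage)))
```

Here <across-to>Storage</across-to> would resolve to ports 5 and 6 of
sw1, which is the list one would otherwise have to write out by hand.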

Is omission of these keywords treated as a wildcard (*) ?

After initial read of this, I have the following higher level
questions/thoughts:

How are trunk (switch to switch) links handled by the QoS syntax ?

I also need to think more about the across ramifications. Is it really
simpler to use this syntax than to specify the specific ports in
question ?




