[ofa-general] QoS RFC

Tue Jul 31 09:25:37 PDT 2007

FYI - It is my intention to implement the host side portion of QoS 
support.  (It's one of my path forward objectives.)  I plan on 
implementing the host side as outlined below.  If anyone has any 
comments, I would like to get them as soon as possible.

- Sean

Sean Hefty wrote:
>> 2. Architecture ----------------
> 
> This is a higher level approach to the problem, but I came up with the
> following QoS relationship hierarchy, where '->' means 'maps to'.
> 
> Application Service -> Service ID (or range)
> Service ID -> desired QoS
> QoS, SGID, DGID, PKey -> SGID, DGID, TClass, FlowLabel, PKey
> SGID, DGID, TC, FL, PKey -> SLID, DLID, SL (set if crossing subnets)
> SLID, DLID, SL -> MTU, Rate, VL, PacketLifeTime
> 
> I use these relationships below:
> 
>> 4. IPoIB ---------
>>
>> IPoIB already queries the SA for its broadcast group information. The
>> additional functionality required is for IPoIB to provide the
>> broadcast group SL, MTU, and RATE in every following PathRecord query
>> performed when a new UDAV is needed by IPoIB. We could assign a
>> special Service-ID for IPoIB use, but since all communication on the
>> same IPoIB interface shares the same QoS-Level without the ability to
>> differentiate it by target service, we can ignore it for simplicity.
> 
> Rather than IPoIB specifying SL, MTU, and rate with PR queries, it 
> should specify TClass and FlowLabel.  This is necessary for IPoIB to 
> span IB subnets.
> 
>> 5. CMA features ----------------
>>
>> The CMA interface supports Service-ID through the notion of port
>> space as a prefix to the port_num which is part of the sockaddr
>> provided to rdma_resolve_addr(). What is missing is the explicit
>> request for a QoS-Class that should allow the ULP (like SDP) to
>> propagate a specific request for a class of service. A mechanism for
>> providing the QoS-Class is available in the IPv6 address, so we could
>> use that address field. Another option is to implement a special 
>> connection options API for CMA.
>>
>> The functionality missing from the CMA is the use of the provided
>> QoS-Class and Service-ID in the sent PR/MPR. When a response is
>> obtained, it is an existing requirement for the CMA to use the PR/MPR
>> from the response in setting up the QP address vector.
> 
> I think the RDMA CM needs two solutions, depending on which address 
> family is used.  For IPv6, the existing interface is sufficient, and 
> works for both IB and iWarp.  The RDMA CM only needs to include the TC 
> and FL as part of its PR query.  For IPv4, to remain transport neutral, 
> I think we should add an rdma_set_option() routine to specify the QoS 
> field.  The RDMA CM would include the QoS field for PR query under this 
> condition.
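
As a rough illustration of the IPv6 case above, the TC and FL could be pulled out of the sockaddr_in6 the caller already passes in. This is only a sketch: the helper names are hypothetical (not part of the RDMA CM), and it assumes the Linux convention that sin6_flowinfo carries, in network byte order, an 8-bit traffic class followed by a 20-bit flow label.

```c
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Hypothetical helpers, not an existing RDMA CM interface.
 * Assumption: sin6_flowinfo holds the low 28 bits of the first IPv6
 * header word in network byte order (traffic class, then flow label).
 */
static uint8_t qos_tclass_from_sin6(const struct sockaddr_in6 *sin6)
{
    /* Traffic class occupies bits 27..20 of the host-order value. */
    return (uint8_t)((ntohl(sin6->sin6_flowinfo) >> 20) & 0xff);
}

static uint32_t qos_flow_label_from_sin6(const struct sockaddr_in6 *sin6)
{
    /* Flow label occupies the low 20 bits of the host-order value. */
    return ntohl(sin6->sin6_flowinfo) & 0x000fffff;
}
```

The two values would then be the TClass and FlowLabel fields placed in the PR query; how the query itself is built is left out here.
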
> 
> For IB, this requires changes to the ib_sa to support the new PR 
> extensions.  I don't think we gain anything having the RDMA CM include 
> service IDs as part of the query.
> 
>> 6. SDP -------
>>
>> SDP uses CMA for building its connections. The Service-ID for SDP is
>> 0x000000000001PPPP, where PPPP are 4 hex digits holding the remote
>> TCP/IP Port Number to connect to. SDP might be provided with
>> SO_PRIORITY socket option. In that case the value provided should be
>> sent to the CMA as the TClass option of that connection.
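
The 0x000000000001PPPP layout quoted above can be sketched in a few lines of C. The function and macro names are made up for illustration, and the values are kept in host byte order; conversion to the on-the-wire (big-endian) form is assumed to happen elsewhere.

```c
#include <stdint.h>

/* Prefix taken from the 0x000000000001PPPP layout; name is hypothetical. */
#define SDP_SERVICE_ID_PREFIX 0x0000000000010000ULL

/* Build the SDP Service-ID for a remote TCP/IP port (host byte order). */
static uint64_t sdp_service_id(uint16_t tcp_port)
{
    return SDP_SERVICE_ID_PREFIX | (uint64_t)tcp_port;
}

/* Return 1 if the upper 48 bits match the SDP prefix, 0 otherwise. */
static int is_sdp_service_id(uint64_t service_id)
{
    return (service_id & ~0xffffULL) == SDP_SERVICE_ID_PREFIX;
}

/* Recover the port from the low 16 bits of an SDP Service-ID. */
static uint16_t sdp_port_from_service_id(uint64_t service_id)
{
    return (uint16_t)(service_id & 0xffffULL);
}
```

For example, port 80 (0x50) yields the Service-ID 0x0000000000010050.
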
> 
> SDP would specify the QoS through the IPv6 address or the
> rdma_set_option() routine.
> 
>> 7. SRP -------
>>
>> Current SRP implementation uses its own CM callbacks (not CMA). So
>> SRP should fill in the Service-ID in the PR/MPR by itself and use
>> that information in setting up the QP. The T10 SRP standard defines
>> the SRP Service-ID to be defined by the SRP target I/O Controller
>> (but they should also comply with IBTA Service-ID rules). Anyway,
>> the Service-ID is reported by the I/O Controller in the ServiceEntries 
>> DMA attribute and should be used in the PR/MPR if the
>> SA reports its ability to handle QoS PR/MPRs.
> 
> I agree.
> 
>> 8. iSER --------
>>
>> iSER uses CMA and thus should be very close to SDP.
>> The Service-ID for iSER should be TBD.
> 
> See RDMA CM and SDP.
> 
>> 3.2. PR/MPR query handling: OpenSM should be able to enforce the
>> provided policy on client request. The overall flow for such requests
>> is: first the request is matched against the defined match rules such
>> that the target QoS-Level definition is found. Given the QoS-Level a
>> path(s) search is performed with the given restrictions imposed by
>> that level. The following two sections describe these steps.
> 
> If we use the QoS hierarchy outlined above, I think we can construct 
> some fairly simple tables to guide our PR selection.  The SA may need to 
> construct the tables starting at the bottom and working up, but I 
> *think* it could be done.  And by distributing the tables, we can 
> support a more distributed (a la local SA) operation.
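
To make the "fairly simple tables" idea a little more concrete, here is a minimal sketch of the top two levels of the hierarchy: Service ID range to QoS level, then QoS level to TClass/FlowLabel. All type and function names are hypothetical, and the sketch deliberately ignores the SGID, DGID, and PKey keys that the full mapping would also use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table row: a Service ID range mapped to a QoS level. */
struct qos_sid_rule {
    uint64_t sid_first;
    uint64_t sid_last;   /* inclusive */
    int      qos_level;
};

/* Hypothetical table row: a QoS level mapped to TClass and FlowLabel. */
struct qos_level_entry {
    int      qos_level;
    uint8_t  tclass;
    uint32_t flow_label;
};

/* Return the QoS level of the first matching range, or -1 if none match. */
static int qos_level_for_sid(const struct qos_sid_rule *rules, size_t n,
                             uint64_t sid)
{
    for (size_t i = 0; i < n; i++)
        if (sid >= rules[i].sid_first && sid <= rules[i].sid_last)
            return rules[i].qos_level;
    return -1;
}

/* Return the entry for a QoS level, or NULL if the level is not defined. */
static const struct qos_level_entry *
qos_entry_for_level(const struct qos_level_entry *table, size_t n, int level)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].qos_level == level)
            return &table[i];
    return NULL;
}
```

Linear scans keep the sketch short; tables distributed to a local SA could use the same layout with a faster lookup if the rule count warranted it.
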
> 
> From an administration standpoint, I would be happier seeing something where
> the administrator defines a QoS level in terms of latency or bandwidth 
> requirements and relative priority.  Then, if desired, the administrator 
> could provide more details, such as indicating which nodes would use 
> which services, minimum required MTUs, etc.  It would then be up to the 
> SA to map these requirements to specific TC, FL, SL, VL values.
> 
> In general, though, I'm personally far less concerned with the QoS 
> specification interface to the SA, versus the operation that takes place 
> on the hosts.
> 
> Comments on using this approach on the host side?