[ofa-general] Re: [PATCH] opensm/osm_qos_policy.c : set SL for IPoIB broadcast groups

Yevgeny Kliteynik kliteyn at dev.mellanox.co.il
Mon Mar 24 00:09:51 PDT 2008


Hi Sasha,

Sasha Khapyorsky wrote:
> Hi Yevgeny,
> 
> On 17:45 Sun 23 Mar     , Yevgeny Kliteynik wrote:
>> QoS manager should set the right SL in the IPoIB broadcast groups
>> to enforce the right SL for IPoIB traffic.
>>
>> At start it seemed to me that it won't work in the following scenario:
>>  - IPoIB partition created
>>  - IPoIB mcast group created
>>  - IPoIB connectivity was established between two ports
>>  - SL in the QoS policy file was changed
>>  - SM is doing heavy sweep, and setting new SL in the
>>    partition and mcast group, but the nodes already have
>>    connectivity, so they will continue to transmit on the
>>    previous SL
>> But then I realized that this scenario will be a problem for any
>> communication (not only IPoIB), because as long as clients are not
>> issuing new PR request, they will have the old parameters that won't
>> be affected by the changes in QoS policy.
>>
>> This patch is for ofed_1_3 branch.
> 
> Is it really OFED 1.3 materials and it is not enough (for already
> stabilized version) to have just a warning about SL value mismatch
> between partition and QoS managers?

QoS annex support in OFED 1.3 is defined as "experimental",
or "technology preview", if you wish.
I really can't say that QoS annex support is "stabilized" -
the first user ever tried it (besides be) found a bug :)
In light of the above, I do think that it's very much relevant
for OFED 1.3. W/o this fix one of the main features (SL for IPoIB)
in the QoS policy is broken.

>> Please review and let me know what you think.
>>
>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>> ---
>>  opensm/opensm/osm_qos_policy.c |   39 +++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 39 insertions(+), 0 deletions(-)
>>
>> diff --git a/opensm/opensm/osm_qos_policy.c b/opensm/opensm/osm_qos_policy.c
>> index bde1e7e..4586a77 100644
>> --- a/opensm/opensm/osm_qos_policy.c
>> +++ b/opensm/opensm/osm_qos_policy.c
>> @@ -920,6 +920,13 @@ int osm_qos_policy_validate(osm_qos_policy_t * p_qos_policy,
>>
>>  				if (p_qos_match_rule->p_qos_level->sl_set &&
>>                                      p_prtn->sl != p_qos_match_rule->p_qos_level->sl) {
>> +					uint8_t sl;
>> +					uint32_t flow;
>> +					uint8_t hop;
>> +					cl_qmap_t * p_mlid_tbl;
>> +					osm_mgrp_t * p_mgrp;
>> +					osm_mgrp_t * p_next_mgrp;
>> +
>>  					/* overriding partition's SL */
>>  					osm_log(p_log,
>>  						OSM_LOG_ERROR,
>> @@ -931,6 +938,38 @@ int osm_qos_policy_validate(osm_qos_policy_t * p_qos_policy,
>>  						p_prtn->sl,
>>  						p_qos_match_rule->p_qos_level->sl);
>>  					p_prtn->sl = p_qos_match_rule->p_qos_level->sl;
>> +
>> +					/* If this partition is an IPoIB partition,
>> +					   there should be a matching MCast group.
>> +					   Fix this group's SL too */
>> +
>> +					p_mlid_tbl = &p_qos_policy->p_subn->mgrp_mlid_tbl;
>> +					p_next_mgrp = (osm_mgrp_t *) cl_qmap_head(p_mlid_tbl);
>> +					while (p_next_mgrp != (osm_mgrp_t *)
>> +					       cl_qmap_end(p_mlid_tbl)) {
> 
> Instead of looping here you could just keep osm_mgrp_t reference as
> field in partition structure.

Sure, that would be better. This is what I did initially, but then
reworked the change to be as "local" as possible, because as I mentioned,
this patch was for OFED 1.3 only - for master I intended to do it with
reference in partition structure.

>> +						p_mgrp = p_next_mgrp;
>> +						p_next_mgrp = (osm_mgrp_t *)
>> +						cl_qmap_next(&p_mgrp->map_item);
>> +						if (!p_mgrp->well_known)
>> +							continue;
>> +						if ((cl_ntoh16(p_mgrp->mcmember_rec.pkey) & 0x7fff) !=
>> +						    (cl_ntoh16(pkey) & 0x7fff))
>> +							continue;
>> +						ib_member_get_sl_flow_hop(
>> +							p_mgrp->mcmember_rec.sl_flow_hop,
>> +							&sl, &flow, &hop);
>> +						if (sl != p_prtn->sl) {
>> +							osm_log(p_log, OSM_LOG_DEBUG,
>> +								"osm_qos_policy_validate: "
>> +								"Updating MCGroup (MLID 0x%04x) SL to "
>> +								"match partition SL %u\n",
>> +								cl_hton16(p_mgrp->mcmember_rec.mlid),
>> +								p_prtn->sl);
>> +							p_mgrp->mcmember_rec.sl_flow_hop =
>> +								ib_member_set_sl_flow_hop(
>> +									p_prtn->sl, flow, hop);
>> +						}
>> +					}
>>  				}
>>  			}
>>  		}
> 
> What about to reduce a number of flow nesting? We have a functions in C.

Gee, man.. Completely forgot about that...

Like I said, I reworked the change to be as local as possible (although
it turned out to be quite ugly). Anyway, I'll repost the improved patch.

-- Yevgeny

> Sasha
> 




More information about the general mailing list