[ofa-general] Re: [PATCHv2] osm: up/dn optimization - improved ranking
Yevgeny Kliteynik
kliteyn at dev.mellanox.co.il
Mon May 21 01:17:02 PDT 2007
Hi Sasha,
Sasha Khapyorsky wrote:
> Hi Yevgeny,
>
> On 14:26 Sun 20 May , Yevgeny Kliteynik wrote:
>> Hi Hal,
>>
>> This patch optimizes fabric ranking similar to the fat-tree ranking.
>> All the root switches are marked with rank and added to the BFS list,
>> and only then ranking of rest of the fabric begins.
>> This version of the patch is updated in accordance with Sasha's suggestions.
>>
>> Please apply to master.
>>
>> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
>> ---
>
> Looks fine for me. Nice optimization. Thanks.
>
> I guess there still be issue with max_rank calculation (details are
> below), which affects only log message and for me it is ok to fix it in
> the incremental patch.
>
>> opensm/opensm/osm_ucast_updn.c | 80
>> +++++++++++++++++----------------------
>> 1 files changed, 35 insertions(+), 45 deletions(-)
>>
>> diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c
>> index 5cebd9b..95a0622 100644
>> --- a/opensm/opensm/osm_ucast_updn.c
>> +++ b/opensm/opensm/osm_ucast_updn.c
>
> [snip...]
>
>> @@ -483,7 +474,7 @@ updn_subn_rank(
>> {
>> remote_u = p_remote_physp->p_node->sw->priv;
>> port_guid = p_remote_physp->port_guid;
>> - did_cause_update = __updn_update_rank(remote_u, rank);
>> + did_cause_update = __updn_update_rank(remote_u, u->rank+1);
>>
>> osm_log( p_log, OSM_LOG_DEBUG,
>> "updn_subn_rank: "
>> @@ -492,7 +483,10 @@ updn_subn_rank(
>> remote_u->rank );
>>
>> if (did_cause_update)
>> + {
>> cl_qlist_insert_tail(&list, &remote_u->list);
>> + max_rank = remote_u->rank;
>> + }
>
> I think this still be not accurate. For instance with topology like:
> A <-> B <-> C <-> D <-> E , where roots are A and E we will get
> max_rank= 1, which obviously should be 2.
Not exactly. What you're describing would happen if the scan would be DFS-like,
not BFS. In your example there are two roots: A and E. They both got rank 0 and
entered to the BFS list. Now, starting BFS scan:
- Removing head of the list - A
- Discovering B
- Assigning B with rank 1 -------> updating max_rank
- Adding B to the end of the list
- Removing head of the list - E
- Discovering D
- Assigning D with rank 1 -------> updating max_rank
- Adding D to the end of the list
- Removing head of the list - B
- Discovering C
- Assigning C with rank 2 -------> updating max_rank
- Adding C to the end of the list
- Removing head of the list - D
- Nothing to discover (C has been already discovered)
- Removing head of the list - C
- BFS list is empty
As you can see, the last rank was 2.
I actually was expecting this mail, because I thought of something like this initially :)
> Probably we need something like this instead:
>
> if (did_cause_update)
> cl_qlist_insert_tail(&list, &remote_u->list);
> if (remote_u->rank <= u->rank + 1)
> max_rank = remote_u->rank;
>
> (and after such intervention into rank updating technique we may want to
> remove also __updn_update_rank() function)
Although I can't think of any scenario that would prove me wrong, I do think that
to make the code more "intuitive" we might want to remove the __updn_update_rank()
and do something like this:
if (remote_u->rank > u->rank + 1)
{
remote_u->rank = u->rank + 1;
max_rank = remote_u->rank;
cl_qlist_insert_tail(&list, &remote_u->list);
}
> And again, this nit affects only reported value in the log message (and
> just this log message removing can be option too :)) and doesn't touch
> the optimization itself - good stuff, Yevgeny!
Truth, all this is for the log message only :)
We also might want to remove the message :)
I'm OK with either of the two options.
-- Yevgeny
> Sasha
>
More information about the general
mailing list