[ofa-general] [RFC] Fat-Tree upgrades

Nicolas Morey Chaisemartin nicolas.morey-chaisemartin at ext.bull.net
Mon Mar 2 00:06:08 PST 2009


Nicolas Morey Chaisemartin wrote:

> Another thing we have developed here is better balancing of secondary paths.
> In the current algorithm, secondary down paths (going_down_by_going up)
> are created in port_group order.
> This means that if the primary path didn't reach the whole network
> (because a switch is broken, for example), all the missing routes will
> be created through the first port group, which unbalances the network
> load a lot.
> To solve this, we create the secondary paths in order of port group load.
> The previous patch made us increment the port/port group counters
> when secondary routes towards HCAs are created, so these counters
> are meaningful even when creating secondary routes.
> What our patch does is, at the beginning of the function, sort all the
> port groups from lowest load to highest, pick the first one for the
> primary path, and try secondary paths from the second to the last.
> Once again, this seems to have no effect on regular topologies, but it made
> a real impact on our failover tests.
> 
> 
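
To make the quoted idea concrete, here is a minimal sketch in Python. It is not the OpenSM code: the PortGroup class, its "load" counter and the order_port_groups helper are hypothetical names chosen for illustration. It only shows the ordering step: sort the port groups by load, take the least loaded one for the primary path, and keep the rest, still in load order, as secondary candidates.

```python
from dataclasses import dataclass


@dataclass
class PortGroup:
    """Hypothetical stand-in for a switch's upward port group."""
    name: str   # label of the remote switch (illustrative only)
    load: int   # number of routes already assigned through this group


def order_port_groups(groups):
    """Return (primary, secondaries), both taken in ascending load order.

    sorted() is stable, so groups with equal load keep their original
    (port group) order.
    """
    ordered = sorted(groups, key=lambda g: g.load)
    return ordered[0], ordered[1:]


groups = [PortGroup("up0", 5), PortGroup("up1", 2), PortGroup("up2", 3)]
primary, secondaries = order_port_groups(groups)
print(primary.name)                    # up1
print([g.name for g in secondaries])   # ['up2', 'up0']
```

With plain port_group order, "up0" would have been tried first for every missing route even though it already carries the highest load.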

After some work on this, I managed to achieve even better results.
When both links are equally loaded, I look at which of the remote switches has the least loaded down route.
This way, when all the switches are equivalent for going up, we choose the best one for going down at the next step.
I'm still running some tests, but early results show a much better balance than the previous ones.
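
The tie-break can be sketched roughly as follows, in Python rather than in the OpenSM sources. The dictionary layout and the pick_up_group helper are assumptions made for the example. Each candidate carries its own load plus the loads of the down routes on its remote switch, and the choice is made on the pair (own load, least loaded remote down route).

```python
def pick_up_group(groups):
    """Pick the upward port group to use.

    Each entry is a dict with hypothetical keys:
      "name"              - label of the remote switch
      "load"              - routes already assigned on the up link
      "remote_down_loads" - loads of the down routes on the remote switch

    Candidates are compared on the up-link load first; when that is
    equal, the one whose remote switch has the least loaded down route
    wins.
    """
    return min(
        groups,
        key=lambda g: (g["load"], min(g["remote_down_loads"], default=0)),
    )


candidates = [
    {"name": "sw_a", "load": 4, "remote_down_loads": [3, 6]},
    {"name": "sw_b", "load": 4, "remote_down_loads": [1, 9]},
    {"name": "sw_c", "load": 7, "remote_down_loads": [0, 0]},
]
print(pick_up_group(candidates)["name"])   # sw_b
```

Here sw_a and sw_b are equally loaded going up, so sw_b is chosen because its remote switch offers a down route with load 1. sw_c is never considered ahead of them, since its up link is already more loaded.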

I'd still like to hear comments about this approach and the previous one.
It would be a pity not to share the work we have done on the algorithm.


Nicolas



