[ofa-general] opensm support for toroidal meshes

Sven-Arne Reinemo svenar at simula.no
Mon Dec 1 00:43:02 PST 2008


Hi all,

I just thought I should share some simulation results with you. I ran
some simulations to test Bob's suggested changes, and I see that the
number of VLs required for both 2d and 3d tori is either _reduced_ or
_unchanged_ compared with the existing implementation. Moreover, the port
reordering seems to work very well on tori that are not cabled regularly
with regard to port numbering. As stated by Bob, it makes LASH route them
as if they were regularly cabled, which is optimal for LASH.

Below are the numbers of VLs required for various 2d and 3d tori. All of
them have the same size along every dimension; be aware that the number
of VLs required would be different for a torus with a different size
along each dimension.

Tori  Current  Patch
4x4      2      2
5x5      3      3
6x6      4      3
7x7      3      3
8x8      6      4
9x9      4      4
10x10    9      6
11x11    5      5
12x12    9      4
13x13    7      7
14x14   12      8
15x15   10     10

4x4x4    5      5
5x5x5    5      4
6x6x6   10      6
7x7x7   10     10
8x8x8   12      9
9x9x9   14     14
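
For anyone who wants to slice these numbers differently, here is a small
Python sketch that tabulates them. The data is copied from the table above;
the helper names (summarize, newly_routable) and the vl_limit parameter are
my own illustration and are not part of opensm or of Bob's patch. The limit
of 8 used in the example is only a placeholder, so substitute whatever your
switches actually support.

```python
# VLs required by LASH per torus, copied from the table above:
# name -> (current implementation, with the patch applied)
RESULTS = {
    "4x4": (2, 2), "5x5": (3, 3), "6x6": (4, 3), "7x7": (3, 3),
    "8x8": (6, 4), "9x9": (4, 4), "10x10": (9, 6), "11x11": (5, 5),
    "12x12": (9, 4), "13x13": (7, 7), "14x14": (12, 8), "15x15": (10, 10),
    "4x4x4": (5, 5), "5x5x5": (5, 4), "6x6x6": (10, 6),
    "7x7x7": (10, 10), "8x8x8": (12, 9), "9x9x9": (14, 14),
}


def summarize(results):
    """Split the tori into those needing fewer, the same, or more VLs."""
    improved = [t for t, (cur, new) in results.items() if new < cur]
    unchanged = [t for t, (cur, new) in results.items() if new == cur]
    worse = [t for t, (cur, new) in results.items() if new > cur]
    return improved, unchanged, worse


def newly_routable(results, vl_limit):
    """Tori that exceed vl_limit today but fit within it with the patch.

    vl_limit is a hypothetical hardware limit supplied by the caller.
    """
    return [t for t, (cur, new) in results.items() if new <= vl_limit < cur]


if __name__ == "__main__":
    improved, unchanged, worse = summarize(RESULTS)
    print("fewer VLs with patch:", ", ".join(improved))
    print("unchanged:           ", ", ".join(unchanged))
    print("more VLs with patch: ", ", ".join(worse) or "none")
    # 8 is a placeholder limit for illustration only.
    print("newly routable at 8: ", ", ".join(newly_routable(RESULTS, 8)))
```

With the table as given, the reductions all fall on tori with an even
dimension size, plus 5x5x5; no configuration gets worse.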

Regards,
Sven-Arne

On Mon, 2008-11-10 at 15:47 -0600, Robert Pearson wrote:
> We have been involved in a project to deliver a large system based on a
> toroidal mesh fabric. One of the requirements for this system is to be able
> to guarantee a deadlock free routing of the fabric. The lash routing engine
> in opensm did not work in this case because the required number of VLs for
> the machine as configured was 12, which exceeded the number of VLs supported
> by Mellanox switch ASICs. It turns out that if one has the freedom to
> optimally reorder the port assignments used by lash, then lash can
> successfully route the fabric, but that is impractical in the hardware. The
> attached note describes an algorithm for automatically recognizing when a
> Cartesian mesh fabric is a torus, determining its size and optimally
> reordering the ports in opensm so that lash can generate a route with the
> smallest number of VLs.
> 
> We have implemented a set of changes to opensm that implement this algorithm
> and will submit the changes as patches. This note will help to understand
> the code.



