[ofa-general] Re: [OPENSM PATCH 0/5]: New "guid-routing-order" option for updn routing

Sasha Khapyorsky sashak at voltaire.com
Mon Jun 16 01:24:15 PDT 2008


Hi Al,

On 15:48 Fri 13 Jun     , Al Chu wrote:
> 
> This is a conceptually simple option I've developed for updn routing.
> 
> Currently in updn routing, nodes/guids are routed on switches in a
> seemingly-random order, which I believe is due to internal data
> structure organization (i.e. cl_qmap_apply_func is called on
> port_guid_tbl) as well as how the fabric is scanned (it is logically
> scanned from a port perspective, but it may not be logical from a node
> perspective).  I had a hypothesis that this was leading to increased
> contention in the network for MPI.
> 
> For example, suppose we have 12 uplinks from a leaf switch to a spine
> switch.  If we want to send data from this leaf switch to node[13-24],
> the up links we will send on are pretty random.

Yeah, the issue is known, and the idea is good and useful.

Actually, we discussed this issue with Yiftah some time ago, and his idea
was to have an option to order routing generation by ports connected to
leaf switches with a higher number of active links. Something like:

  foreach switch, sorted by number of active links in descending order
    foreach port connected to the switch
      do_routing()
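
To make the ordering above concrete, here is a minimal sketch in C. The
struct and function names (leaf_switch, build_routing_order) are
hypothetical, simplified stand-ins and not the real OpenSM types; the
sketch only produces the order in which ports would be routed and does
not do any routing itself. Breaking ties by GUID is my own assumption,
added only to keep the order deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a leaf switch; not an OpenSM type. */
struct leaf_switch {
	uint64_t guid;               /* switch node GUID */
	unsigned active_links;       /* number of active links on this switch */
	const uint64_t *port_guids;  /* GUIDs of end ports attached to it */
	size_t num_ports;
};

/* qsort comparator: more active links first; ties broken by lower GUID
 * (assumption, only to make the resulting order deterministic). */
static int cmp_active_links_desc(const void *a, const void *b)
{
	const struct leaf_switch *x = a, *y = b;

	if (x->active_links != y->active_links)
		return x->active_links < y->active_links ? 1 : -1;
	if (x->guid != y->guid)
		return x->guid < y->guid ? -1 : 1;
	return 0;
}

/* Sorts the switches in place, then writes the attached port GUIDs to
 * 'out' in the order they would be routed. Returns the number written
 * (at most out_cap). */
static size_t build_routing_order(struct leaf_switch *sw, size_t num_sw,
				  uint64_t *out, size_t out_cap)
{
	size_t n = 0;

	assert(sw != NULL && out != NULL);
	qsort(sw, num_sw, sizeof(*sw), cmp_active_links_desc);

	for (size_t i = 0; i < num_sw; i++)
		for (size_t j = 0; j < sw[i].num_ports && n < out_cap; j++)
			out[n++] = sw[i].port_guids[j]; /* do_routing() would go here */

	return n;
}
```

The caller would then walk the returned array and generate routes for
each port GUID in that order, instead of in the order the port table
happens to be iterated.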

That is good for most cases, but I also thought that in addition we will
need something more configurable - just like your patch series :).

Now a comment: why do you think it is useful only for Up/Down?

IMO the min hops algorithm could benefit from this feature just as well.
If so, this simplifies the implementation - patches 2, 3 and 4 are not
needed, and the code from patch 5 goes into osm_ucast_mgr.c. What do you
think?

Sasha


