[openib-general] RE: OpenSM Routing Scalability Proposal

Wed May 25 11:45:21 PDT 2005

Hi Hal,

All your points are valid. Especially the ones regarding the incremental
algorithm for routing. Please see below.

Eitan Zahavi
Design Technology Director
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL

> -----Original Message-----
> From: Hal Rosenstock [mailto:halr at voltaire.com]
> Sent: Wednesday, May 25, 2005 5:53 PM
> To: Eitan Zahavi
> Cc: 'openib-general at openib.org'
> Subject: Re: [openib-general] RE: OpenSM Routing Scalability Proposal
> 
> Hi Eitan,
> 
> On Tue, 2005-05-17 at 10:53, Eitan Zahavi wrote:
> > Hi All,
> >
> > This is an updated proposal document for your comments.
> 
> I finally got a chance to read this. Some comments below.
> 
> > The main change is in describing the need for preserving enough data
> > to enable incremental routing algorithm.
> 
> I think incremental can help but presents some new issues.
[EZ] Yes very true. I do not have a full algorithm in place that will cover
all possible cases.
> 
> > So the actual proposal is to implement the algorithm described in
> > section 4.3.
> 
> 4.1 (min hop) and 4.2 (up/down) are already implemented, right ?
[EZ] No they are not. The current implementation uses MinHop tables etc.
> 
> > EZ <<OpenSM Routing.pdf>>
> 
> It seems like there are 2 parts to 4.3:
> 1. Min hop table per leaf switch rather than per LID
> What are the savings for this ? Seems like in terms of memory, this is
> something like a divisor of L times the number of LIDs per HCA port.
[EZ] Yes. 
> Of course, switch port 0s on non leaf switches need to be accomodated.
[EZ] True.
> 
> 2. Incremental routing (5)
> a. Subcase of 5 where there is no other link between 2 adjacent
> switches. Is another way of stating this, examine next hop switches to
> see if there is a path between the 2 original switches and keep
> expanding the depth until 1 is found ? Couldn't this be worse from a
> compute standpoint than rerouting everything depending on the topology
> (the likelihood of another path between the 2 original switches) ?
[EZ] If one knows which ports have changed this will be faster then full
recalc.
> 
> b. 5 asks "How do we support topology changes line moving an HCA from
> one Switch to another?" Also, what about a link moving from one switch
> to another ? It seems that link down is handled, but nothing is done on
> a link up. Doesn't there need to be incremental defined for links being
> added ?
[EZ] Yes. This is not even close to full algorithm.
> 
> c. Also, with incremental routing, it's unclear to me how the paths
> found would compare with the ones which would be determined from the
> full algorithm (from scratch). Also, would there be some point at which
> the full routing would be retriggered ?
[EZ] Good point.
> 
> d. Clearly, there are end node responsibilities here as well (whether
> this is done incrementally or fully or something else).
[EZ] Not sure what you mean.
> 
> 3. Persistency (6)
> a. Full LFT storage (6.1) This presumes that the determination of a
> topology change upon discovery is cheaper computationally than running
> the routing. Has this been proven ? (I hope this is the case).
[EZ] Comparing two graphs is O(Links). 
> b. Root nodes storage (6.2) Are the root nodes determined by the routing
> or supplied to the routing ? Are they different for unicast and
> multicast ?
[EZ] Both is true. If not provided by human extracted using heuristics.
> 
> -- Hal
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20050525/5a7c974a/attachment.html>