<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2654.45">
<TITLE>RE: [openib-general] RE: OpenSM Routing Scalability Proposal</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>Hi Hal,</FONT>
</P>
<P><FONT SIZE=2>All your points are valid. Especially the ones regarding the incremental algorithm for routing. Please see below.</FONT>
</P>
<P><FONT SIZE=2>Eitan Zahavi</FONT>
<BR><FONT SIZE=2>Design Technology Director</FONT>
<BR><FONT SIZE=2>Mellanox Technologies LTD</FONT>
<BR><FONT SIZE=2>Tel:+972-4-9097208</FONT>
<BR><FONT SIZE=2>Fax:+972-4-9593245</FONT>
<BR><FONT SIZE=2>P.O. Box 586 Yokneam 20692 ISRAEL</FONT>
</P>
<BR>
<P><FONT SIZE=2>> -----Original Message-----</FONT>
<BR><FONT SIZE=2>> From: Hal Rosenstock [<A HREF="mailto:halr@voltaire.com">mailto:halr@voltaire.com</A>]</FONT>
<BR><FONT SIZE=2>> Sent: Wednesday, May 25, 2005 5:53 PM</FONT>
<BR><FONT SIZE=2>> To: Eitan Zahavi</FONT>
<BR><FONT SIZE=2>> Cc: 'openib-general@openib.org'</FONT>
<BR><FONT SIZE=2>> Subject: Re: [openib-general] RE: OpenSM Routing Scalability Proposal</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Hi Eitan,</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> On Tue, 2005-05-17 at 10:53, Eitan Zahavi wrote:</FONT>
<BR><FONT SIZE=2>> > Hi All,</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > This is an updated proposal document for your comments.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> I finally got a chance to read this. Some comments below.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> > The main change is in describing the need for preserving enough data</FONT>
<BR><FONT SIZE=2>> > to enable incremental routing algorithm.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> I think incremental can help but presents some new issues.</FONT>
<BR><FONT SIZE=2>[EZ] Yes very true. I do not have a full algorithm in place that will cover all possible cases.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> > So the actual proposal is to implement the algorithm described in</FONT>
<BR><FONT SIZE=2>> > section 4.3.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> 4.1 (min hop) and 4.2 (up/down) are already implemented, right ?</FONT>
<BR><FONT SIZE=2>[EZ] No they are not. The current implementation uses MinHop tables etc.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> > EZ <<OpenSM Routing.pdf>></FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> It seems like there are 2 parts to 4.3:</FONT>
<BR><FONT SIZE=2>> 1. Min hop table per leaf switch rather than per LID</FONT>
<BR><FONT SIZE=2>> What are the savings for this ? Seems like in terms of memory, this is</FONT>
<BR><FONT SIZE=2>> something like a divisor of L times the number of LIDs per HCA port.</FONT>
<BR><FONT SIZE=2>[EZ] Yes. </FONT>
<BR><FONT SIZE=2>> Of course, switch port 0s on non leaf switches need to be accomodated.</FONT>
<BR><FONT SIZE=2>[EZ] True.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> 2. Incremental routing (5)</FONT>
<BR><FONT SIZE=2>> a. Subcase of 5 where there is no other link between 2 adjacent</FONT>
<BR><FONT SIZE=2>> switches. Is another way of stating this, examine next hop switches to</FONT>
<BR><FONT SIZE=2>> see if there is a path between the 2 original switches and keep</FONT>
<BR><FONT SIZE=2>> expanding the depth until 1 is found ? Couldn't this be worse from a</FONT>
<BR><FONT SIZE=2>> compute standpoint than rerouting everything depending on the topology</FONT>
<BR><FONT SIZE=2>> (the likelihood of another path between the 2 original switches) ?</FONT>
<BR><FONT SIZE=2>[EZ] If one knows which ports have changed this will be faster then full recalc.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> b. 5 asks "How do we support topology changes line moving an HCA from</FONT>
<BR><FONT SIZE=2>> one Switch to another?" Also, what about a link moving from one switch</FONT>
<BR><FONT SIZE=2>> to another ? It seems that link down is handled, but nothing is done on</FONT>
<BR><FONT SIZE=2>> a link up. Doesn't there need to be incremental defined for links being</FONT>
<BR><FONT SIZE=2>> added ?</FONT>
<BR><FONT SIZE=2>[EZ] Yes. This is not even close to full algorithm.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> c. Also, with incremental routing, it's unclear to me how the paths</FONT>
<BR><FONT SIZE=2>> found would compare with the ones which would be determined from the</FONT>
<BR><FONT SIZE=2>> full algorithm (from scratch). Also, would there be some point at which</FONT>
<BR><FONT SIZE=2>> the full routing would be retriggered ?</FONT>
<BR><FONT SIZE=2>[EZ] Good point.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> d. Clearly, there are end node responsibilities here as well (whether</FONT>
<BR><FONT SIZE=2>> this is done incrementally or fully or something else).</FONT>
<BR><FONT SIZE=2>[EZ] Not sure what you mean.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> 3. Persistency (6)</FONT>
<BR><FONT SIZE=2>> a. Full LFT storage (6.1) This presumes that the determination of a</FONT>
<BR><FONT SIZE=2>> topology change upon discovery is cheaper computationally than running</FONT>
<BR><FONT SIZE=2>> the routing. Has this been proven ? (I hope this is the case).</FONT>
<BR><FONT SIZE=2>[EZ] Comparing two graphs is O(Links). </FONT>
<BR><FONT SIZE=2>> b. Root nodes storage (6.2) Are the root nodes determined by the routing</FONT>
<BR><FONT SIZE=2>> or supplied to the routing ? Are they different for unicast and</FONT>
<BR><FONT SIZE=2>> multicast ?</FONT>
<BR><FONT SIZE=2>[EZ] Both is true. If not provided by human extracted using heuristics.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> -- Hal</FONT>
</P>
</BODY>
</HTML>