[ofa-general] Re: [PATCH] opensm: preserve base lid routes

Al Chu chu11 at llnl.gov
Thu Jun 12 13:20:06 PDT 2008


Hey Hal,

On Thu, 2008-06-12 at 06:03 -0700, Hal Rosenstock wrote:
> On Thu, 2008-06-12 at 15:11 +0300, Sasha Khapyorsky wrote:
> > On 04:49 Thu 12 Jun     , Hal Rosenstock wrote:
> > > On Thu, 2008-06-12 at 14:33 +0300, Sasha Khapyorsky wrote:
> > > > On 03:59 Thu 12 Jun     , Hal Rosenstock wrote:
> > > > > 
> > > > > Would you elaborate on the motivation behind the requirement to
> > > > > maintain/preserve the base LID routing ?
> > > > 
> > > > I see a couple of advantages:
> > > > 
> > > > 1. Applications which work on base LIDs only will not be affected by LMC
> > > 
> > > Yes; that's the obvious one.
> > > 
> > > > 2. Changing LMC on a fabric will not change routing paths (when LIDs are
> > > > not reassigned)
> > > 
> > > That's when LMC is reduced rather than increased.
> 
> > Also when increased and LIDs are not reassigned.
> 
> Is that possible for anything other than the last base LID assigned ?
> 
> > > > 3. Finally it does better balancing for secondary LIDs ("port
> > > > offsetting")
> > > 
> > > Isn't that accommodated in the patch but separate from the base LID
> > > preservation ?
> > 
> > It is integrated in the patch - balancing for each LID starts from
> > its lower LID's port + 1.
> 
> Understood (mostly) with comment below.
> 
> > Not doing this would be really bad.
> 
> Is the badness disrupting the base LID traffic or something else ?

Just thought I'd comment on this since much discussion between Sasha,
myself, and Yiftah (Sasha's coworker) was off list.

As Sasha stated, the bad balancing was the worst part.  That's what led
me to develop my port-offsetting patch series (reasons for its
development can be seen in that thread).

Yiftah made the case that disrupting base LID traffic was also part of
the problem.  If LMC is changed to > 0, base LID routing changes should
not affect the performance of any user's code.  This was reported
by some other Voltaire user, but I can see the "principle" behind this.
Needless to say, the LMC change to > 0 previously affected routing
greatly due to bad balancing.  But even minor base LID routing changes
could affect users' jobs.
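
For illustration, the idea Sasha describes in the quoted text ("balancing
for each LID starts from its lower LID's port + 1") can be sketched
roughly as below.  This is a simplified sketch, not the code from the
patch: the function name assign_ports, its arguments, and the plain
round-robin over equal-cost ports are all assumptions made for the
example.  The real routing may also weigh port load when choosing.

```python
def assign_ports(candidate_ports, base_port, lmc):
    """Return the egress port for each of the 2**lmc LIDs of one destination.

    Hypothetical helper for illustration only.
    candidate_ports: equal-cost egress ports on one switch (assumed input)
    base_port: port already chosen for the base LID, as with LMC = 0
    lmc: LID mask control value; the destination owns 2**lmc LIDs
    """
    if base_port not in candidate_ports:
        raise ValueError("base_port must be one of candidate_ports")
    start = candidate_ports.index(base_port)
    n = len(candidate_ports)
    # Offset 0 (the base LID) keeps base_port, so its route is preserved.
    # Offset k starts from the port after the one used by offset k - 1,
    # wrapping around the candidate list.
    return [candidate_ports[(start + k) % n] for k in range(1 << lmc)]


if __name__ == "__main__":
    ports = [3, 5, 7, 9]
    for lmc in (0, 1, 2):
        print(lmc, assign_ports(ports, 5, lmc))
```

In this sketch the first entry is the same for every LMC value, which is
the "preserve base LID routes" property, while the secondary LIDs are
spread over the remaining ports.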

Al

> > > Preserving/maintaining base LIDs is a policy decision and perhaps this
> > > should be an option with this as the default. For some, balancing all
> > > paths might be more important than not disrupting base LID traffic.
> > 
> > This is the trick - by preserving base LID traffic and offsetting over
> > other LID ports we get better balancing than before. So right now it is
> > hard for me to see when the proposed option would be useful.
> 
> When all ULPs use all LIDs "equally" and it's not just MPI ?
> 
> > But it probably would be; I think we can add it then.
> 
> Sure; this can be viewed as a future thing.
> 
> -- Hal
> 
> > Sasha
> 
-- 
Albert Chu
chu11 at llnl.gov
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory



