[ofa-general] RE: pkey.sim.tcl
Eitan Zahavi
eitan at mellanox.co.il
Sun Jul 29 02:11:05 PDT 2007
Regarding the test:
Once I know the exact condition that causes a full re-sweep, I will use
it in the test.
In OFED 1.2 it was enough to set one switch ChangeBit to force a full
reconfiguration.
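For illustration only, here is a rough C sketch of the kind of check I
mean - a light sweep noticing the SwitchInfo PortStateChange bit (the
"ChangeBit") and scheduling a full re-sweep. The struct layout, bit
value and names below are made up for the example; this is not the
actual OpenSM code:

/*
 * Hypothetical sketch (illustrative names only, not OpenSM code):
 * detect the SwitchInfo PortStateChange bit and force a full re-sweep.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Minimal stand-in for the SwitchInfo attribute. */
struct switch_info {
        uint8_t life_state;     /* PortStateChange bit assumed at 0x04 */
};

struct subnet {
        bool force_full_resweep;        /* illustrative flag */
};

#define PORT_STATE_CHANGE_BIT 0x04      /* assumed bit position */

/* Called for every SwitchInfo received during a light sweep. */
static void on_switch_info(struct subnet *subn, struct switch_info *si)
{
        if (si->life_state & PORT_STATE_CHANGE_BIT) {
                /* A port on this switch changed state since the last
                 * sweep: schedule a full (heavy) re-sweep and clear the
                 * bit with a SwitchInfo Set (not shown). */
                subn->force_full_resweep = true;
                si->life_state &= (uint8_t)~PORT_STATE_CHANGE_BIT;
        }
}

int main(void)
{
        struct subnet subn = { .force_full_resweep = false };
        struct switch_info si = { .life_state = PORT_STATE_CHANGE_BIT };

        on_switch_info(&subn, &si);
        printf("force_full_resweep = %d\n", subn.force_full_resweep);
        return 0;
}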
Regarding incremental flow in general:
1. Yes - it is good.
2. But we must make sure it is robust enough that we do not lose some
nodes or functionality
under extreme cases of reboots or HW errors.
3. We should have a way to force a full sweep without killing the SM:
As cluster sizes grow there is a growing chance that "soft
errors" will hit the devices.
Most of the device memory is guarded and would be auto-detected if
affected.
However, I think it is wise to allow the user to force a full
reconfiguration without making the SM "go away" (see the sketch below).
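Something along these lines could work - whether it is hooked to a
signal, a console command, or a management MAD is open. This is only a
minimal sketch with made-up names, not the actual OpenSM code:

/*
 * Minimal sketch (illustrative, not OpenSM code): let an administrator
 * force a full reconfiguration by signalling the running SM instead of
 * restarting it.
 */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t force_heavy_sweep = 0;

static void on_sighup(int sig)
{
        (void)sig;
        /* Only set a flag here; the sweeper loop picks it up. */
        force_heavy_sweep = 1;
}

int main(void)
{
        struct sigaction sa;

        sa.sa_handler = on_sighup;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGHUP, &sa, NULL);

        for (;;) {
                if (force_heavy_sweep) {
                        force_heavy_sweep = 0;
                        printf("starting forced full (heavy) sweep\n");
                        /* ... rediscover and reconfigure the fabric ... */
                }
                sleep(1);       /* stand-in for the normal sweep timer */
        }
        return 0;
}

The administrator could then simply send SIGHUP to the running SM
process to get a full reconfiguration without the daemon going down.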
Regarding OpenSM not responding to SA queries during a sweep:
This is due to the fact that there is no "double buffer" for the internal DB,
so whenever the SM starts a sweep the SA sees an "empty" DB.
The solution to that problem may be to keep a "previous" DB available
during sweeps.
I suspect that approach would also enable a fine-grained incremental
capability.
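To show what I mean by a "previous" DB, here is a minimal double-buffer
sketch (the types and function names are invented for the example, they
are not the OpenSM data structures): the SA keeps answering from a
stable snapshot while the sweep rebuilds the shadow copy, and the two
are swapped when the sweep completes:

/*
 * Minimal double-buffer sketch (illustrative names, not OpenSM code).
 */
#include <pthread.h>
#include <stdio.h>

struct subnet_db {
        int num_ports;  /* stand-in for the real SA records */
};

static struct subnet_db db[2];
static struct subnet_db *active = &db[0];       /* SA reads from here */
static struct subnet_db *shadow = &db[1];       /* sweep writes here */
static pthread_mutex_t swap_lock = PTHREAD_MUTEX_INITIALIZER;

/* SA query path: always sees a complete, consistent snapshot. */
static int sa_get_num_ports(void)
{
        pthread_mutex_lock(&swap_lock);
        int n = active->num_ports;
        pthread_mutex_unlock(&swap_lock);
        return n;
}

/* Sweep path: rebuild the shadow copy, then publish it with one swap. */
static void heavy_sweep(int discovered_ports)
{
        shadow->num_ports = discovered_ports;   /* long rebuild goes here */

        pthread_mutex_lock(&swap_lock);
        struct subnet_db *tmp = active;
        active = shadow;
        shadow = tmp;
        pthread_mutex_unlock(&swap_lock);
}

int main(void)
{
        heavy_sweep(1300);
        printf("ports visible to SA: %d\n", sa_get_num_ports());
        return 0;
}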
Eitan
Eitan Zahavi
Senior Engineering Director, Software Architect
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL
> -----Original Message-----
> From: Sasha Khapyorsky [mailto:sashak at voltaire.com]
> Sent: Sunday, July 29, 2007 12:55 AM
> To: Eitan Zahavi
> Cc: Yevgeny Kliteynik; Hal Rosenstock; general at lists.openfabrics.org
> Subject: Re: pkey.sim.tcl
>
> Hi Eitan,
>
> On 07:56 Fri 27 Jul , Eitan Zahavi wrote:
> > >
> > > On 09:26 Thu 26 Jul , Eitan Zahavi wrote:
> > > >
> > > > I am happy you actually use the simulator.
> > > > Please provide more info regarding the failure. You should tar
> > > > and compress the /tmp/ibmgtsim.XXXX of your run.
> > >
> > > I can send this for you if you want, but the failure is trivial.
> > No need if you already know where the bug is...
> > >
> > > Yes, and it is due to (6), where the default Pkey is removed
> > > "externally".
> > > I'm not sure that OpenSM should handle the case when the pkey
> > > table is modified externally by something which is not the SM.
> > >
> >
> > For a few years it just worked fine. So I wonder why this
> > functionality was removed?
> > It is a real BAD case where Pkeys are altered, but I think it would
> > be wise to "refresh" these tables on heavy sweep.
>
> We discussed how and when port table refreshes should be done
> just a few days ago in this thread. My impression was that we
> are "in sync" about this.
>
> > In general it seems OpenSM has lost its "heavy sweep" concept.
> > Now it does not refresh the fabric setup even on a heavy sweep.
>
> Not on each heavy sweep, but it does when it is needed or when
> data could change. I don't think the concept was changed,
> just optimized. Let's just look at the numbers:
>
> $ time ./opensm/opensm -e -f ./osm.log -o ...
> SUBNET UP
> Exiting SM
>
> real 0m7.995s
> user 0m4.488s
> sys 0m6.072s
>
> $ time ./opensm/opensm -e -f ./osm.log -o --qos ...
> SUBNET UP
> Exiting SM
>
> real 0m22.521s
> user 0m10.921s
> sys 0m17.173s
>
>
> These are simulated runs (with ibsim); the fabric is ~1300 nodes.
>
> The difference there is the '--qos' flag, so OpenSM skips SL2VL
> and VLArb updates in the first run and does them in the second -
> sweep times are 8 versus 22 seconds.
>
> > This is assuming "perfect" HW and software, and I really think
> > we should have preserved that capability.
>
> What about an option? Now, with the subn->need_update flag (which
> always enforces updates), it is trivial to implement.
>
> > Note that a "heavy sweep" does not happen unless something
> > changed or a trap was received.
>
> Yes, for example some port was connected/disconnected, some
> node rebooted, etc. OpenSM starts a huge heavy sweep, it takes
> a while, the SA is not responsive most of the time, TCP connections
> over IPoIB time out, applications fail. This is production
> experience... :(
>
> Sasha
>