[ofa-general] Re: [PATCH 3/4] opensm: resweep instead of exit when duplicated guid suspected

Hal Rosenstock hal.rosenstock at gmail.com
Tue Aug 14 07:44:30 PDT 2007


On 8/14/07, Sasha Khapyorsky <sashak at voltaire.com> wrote:
> On 10:50 Mon 13 Aug     , Hal Rosenstock wrote:
> > On 8/13/07, Sasha Khapyorsky <sashak at voltaire.com> wrote:
> > > Hi Hal,
> > >
> > > On 09:19 Mon 13 Aug     , Hal Rosenstock wrote:
> > > > On 8/12/07, Sasha Khapyorsky <sashak at voltaire.com> wrote:
> > > > > Anyway, OpenSM will request a resweep when nodes with a suspected
> > > > > duplicate GUID are on the subnet. And because we cannot be 100% sure
> > > > > that a detected GUID duplication is not some corner case of port moving,
> > > > > I prefer not to exit. Endless (re)discovery and syslog messages should
> > > > > be a good indication if it is indeed this case.
> > > >
> > > > Couldn't there be some duplication state kept per GUID so the messages
> > > > only get logged on change of state to duplicated rather than
> > > > continually spewing into the log ?
> > >
> > > There should be one message per duplicated GUID in the sweep. The sweep
> > > will be repeated and in the case of real duplication the message will
> > > appear again - so it is per sweep. I hope it is not too much.
> >
> > Once per sweep is too much IMO. It still fills the log over time.
>
> Hmm, I cannot find an elegant way to limit this printing.
> When there is real GUID duplication it is a fatal error and the setup must
> be fixed, so it is not something that would let us work normally. Also, I
> guess the case itself is a pretty esoteric one. Do you think it is
> critical?

Critical, no, but important: when it does occur, it fills the log
with these repeated messages, obscuring the important ones.
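
A minimal sketch in C of the per-GUID state idea raised earlier in the
thread: log only when a GUID changes to the duplicated state, not once per
sweep. Everything here is hypothetical and not taken from the OpenSM source:
the struct names, the function names, the fixed-size table, and the message
text are all assumptions made for illustration.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_TRACKED_GUIDS 64 /* hypothetical capacity for this sketch */

/* Hypothetical per-GUID record; not an OpenSM structure. */
struct dup_guid_state {
	uint64_t guid;
	bool in_use;          /* GUID is currently known to be duplicated */
	bool seen_this_sweep; /* duplication observed in the current sweep */
};

struct dup_guid_tracker {
	struct dup_guid_state slots[MAX_TRACKED_GUIDS];
};

/* Call at the start of every sweep. */
void dup_tracker_begin_sweep(struct dup_guid_tracker *t)
{
	for (size_t i = 0; i < MAX_TRACKED_GUIDS; i++)
		t->slots[i].seen_this_sweep = false;
}

/*
 * Call when a sweep suspects a duplicated GUID. Returns true only when
 * the GUID was not already marked as duplicated, i.e. on a state change.
 */
bool dup_tracker_note_duplicate(struct dup_guid_tracker *t, uint64_t guid)
{
	struct dup_guid_state *free_slot = NULL;

	for (size_t i = 0; i < MAX_TRACKED_GUIDS; i++) {
		struct dup_guid_state *s = &t->slots[i];

		if (s->in_use && s->guid == guid) {
			s->seen_this_sweep = true;
			return false; /* already reported, stay quiet */
		}
		if (!s->in_use && free_slot == NULL)
			free_slot = s;
	}
	if (free_slot == NULL)
		return true; /* table full: fall back to logging every time */

	free_slot->guid = guid;
	free_slot->in_use = true;
	free_slot->seen_this_sweep = true;
	return true;
}

/*
 * Call at the end of every sweep. Forgets GUIDs whose duplication was not
 * seen again, so a later recurrence is logged once more. Returns the
 * number of entries cleared.
 */
size_t dup_tracker_end_sweep(struct dup_guid_tracker *t)
{
	size_t cleared = 0;

	for (size_t i = 0; i < MAX_TRACKED_GUIDS; i++) {
		struct dup_guid_state *s = &t->slots[i];

		if (s->in_use && !s->seen_this_sweep) {
			s->in_use = false;
			cleared++;
		}
	}
	return cleared;
}

/* Example caller: emits one line per state change, not one per sweep. */
void report_duplicate(struct dup_guid_tracker *t, uint64_t guid)
{
	if (dup_tracker_note_duplicate(t, guid))
		fprintf(stderr, "duplicate GUID 0x%016" PRIx64 " suspected\n",
			guid);
}
```

With this shape the resweep still happens every time; only the message is
suppressed while the state is unchanged. A duplication that goes away and
later comes back is logged again.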

-- Hal

> Sasha
>


