[ofa-general] IB performance stats (revisited)
Ira Weiny
weiny2 at llnl.gov
Tue Jul 10 09:46:59 PDT 2007
On Thu, 28 Jun 2007 10:24:59 +0300
"Eitan Zahavi" <eitan at mellanox.co.il> wrote:
> > On Wed, 2007-06-27 at 14:23, Eitan Zahavi wrote:
> > > For the second time in recent months I have heard people complain
> > > that the current monitoring solution in OFA is integrated with OpenSM.
> >
> > I must have missed this both times (didn't see this in Mark's
> > post) and the statement itself is somewhat inaccurate as well.
> Private talks - I hope they will speak up for themselves now...
> >
> > > These people do not use OpenSM but do use OFED.
> >
> > I'm not sure I'm following what you mean here.
> >
> > If you mean that some people want to run PerfMgr without the
> > SM/SA aspects (so that they can run a vendor based SM), that
> > is the next thing we are adding to the implementation.
> Exactly. OK when is that coming?
There is very little that ties the current PerfMgr to OpenSM. Basically, it
just gets the current fabric topology from it. As Hal has said, changes are
coming.
>
> >
> > > Another drawback is that
> > > no naming is provided and the reporting uses GUIDs.
> >
> > Naming is provided via NodeDescription.
> This might be good for hosts but is not covering switches ...
It does include switches. However, since on most systems multiple switches
share the same NodeDescription, it is ineffective for telling them apart. I
have queried Voltaire for a way to change the NodeDescription for switches, but
at the time I asked, there was no way to do it. Perhaps there is now? What
about other vendors? This is why ibnetdiscover and other diags have "switch
map" support (a GUID->name mapping that overrides the default NodeDescription).
Nothing would please me more than to be able to remove that in favor of a more
"automatic" solution.
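
To illustrate the switch-map idea, here is a minimal sketch in Python. It is
not the actual ibnetdiscover code. The file format (a hex GUID followed by a
quoted name, one entry per line, "#" for comment lines), the function names,
and the sample GUIDs and names are all assumptions made for illustration.

```python
import re

# Assumed (hypothetical) map-file line:  0x<hex GUID> "<name>"
# Anything after the closing quote is ignored.
_ENTRY = re.compile(r'^\s*(0x[0-9a-fA-F]+)\s+"([^"]*)"')


def parse_switch_map(text):
    """Return a dict mapping node GUID (int) to an operator-chosen name."""
    names = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comment lines
        match = _ENTRY.match(line)
        if match:
            names[int(match.group(1), 16)] = match.group(2)
    return names


def display_name(guid, node_description, switch_map):
    """Prefer the mapped name; fall back to the node's own NodeDescription."""
    return switch_map.get(guid, node_description)


if __name__ == "__main__":
    # Sample data is made up for this example.
    sample = (
        "# core switches\n"
        '0x0008f10400411f56 "core-sw-1"\n'
        '0x0008f10400411a00 "leaf-sw-7"\n'
    )
    smap = parse_switch_map(sample)
    # Two switches reporting the same default description become distinguishable.
    print(display_name(0x0008F10400411F56, "Generic Switch", smap))
    print(display_name(0x0008F10400411A00, "Generic Switch", smap))
    # A node that is not in the map keeps its NodeDescription.
    print(display_name(0x1, "host-a HCA-1", smap))
```

Keying on the GUID means the override keeps working even when every switch
reports an identical NodeDescription, which is the case described above.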
> >
> > > I also can't hold myself from saying again I think you are going to
> > > hit the wall with the concept of doing the PMA from a single node.
> >
> > If you are referring to the fact that the PerfMgr is currently not
> > distributed, that will be done as has been stated before.
> Good. When is it expected? Will it be OFED 1.3?
When Hal first sent out the PerfMgr design I thought we should jump right to
the distributed model as well. But now I am glad we have gone the way we did.
First off, we have something which "works" and from which we can expand.
Second, I have run some tests querying the fabric of our large clusters here
(~500 nodes) and the results were promising for a single node implementation.
I don't recall the exact numbers, as this was a while ago, but the query time
was on the order of <2 sec (I think <1, but I don't want to be misquoted).
For sure, a distributed model offers many advantages and we will get there. But
for many the current single node approach should work just fine.
Thanks,
Ira
>
> Thanks
> >
> > -- Hal
> >
> > > Eitan Zahavi
> > > Senior Engineering Director, Software Architect
> > > Mellanox Technologies LTD
> > > Tel:+972-4-9097208
> > > Fax:+972-4-9593245
> > > P.O. Box 586 Yokneam 20692 ISRAEL
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: general-bounces at lists.openfabrics.org
> > > > [mailto:general-bounces at lists.openfabrics.org] On Behalf Of Hal
> > > > Rosenstock
> > > > Sent: Wednesday, June 27, 2007 8:12 PM
> > > > To: Mark Seger
> > > > Cc: Finn, Ed; general at lists.openfabrics.org
> > > > Subject: Re: [ofa-general] IB performance stats (revisited)
> > > >
> > > > On Wed, 2007-06-27 at 13:07, Mark Seger wrote:
> > > > > >The performance managers deal with the counter stickiness (by
> > > > > >resetting them when they think they need to). They typically
> > > > > >export their data although this is not specified by IBA so it is
> > > > > >in a vendor proprietary manner.
> > > > > >
> > > > > >
> > > > > so I guess these guys are poor citizens as well...
> > > >
> > > > Not sure what you mean.
> > > >
> > > > > the real issue as I see it then means nobody can trust the data if
> > > > > random tools randomly reset the counters. a real shame...
> > > >
> > > > I consider this to be a real rather than random app for this.
> > > > Guess it depends on what one considers random.
> > > >
> > > > -- Hal
> > > >
> > > > > -mark
> > > > >
> > > > >
> > > >
> > > > _______________________________________________
> > > > general mailing list
> > > > general at lists.openfabrics.org
> > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > > >
> > > > To unsubscribe, please visit
> > > > http://openib.org/mailman/listinfo/openib-general
> > > >
> >
> >