[openib-general] Port error rate detection
Hal Rosenstock
halr at voltaire.com
Tue Feb 20 10:42:38 PST 2007
On Tue, 2007-02-20 at 10:25, Steven Carter wrote:
> Hal Rosenstock wrote:
> > On Tue, 2007-02-20 at 09:44, Steven Carter wrote:
> >
> >> Hal Rosenstock wrote:
> >>
> >>> On Mon, 2007-02-19 at 15:53, Steven Carter wrote:
> >>>
> >>>
> >>>> I have a Nagios module that alerts on connectivity, port errors, and
> >>>> speed/width problems. I would like to give it the ability to change the
> >>>> severity of the alert depending on whether errors are just present or if
> >>>> they are increasing faster than a specified rate. The intent is to
> >>>> equip the module to keep the state of the last query and possibly
> >>>> history, but I wanted to make sure that I was not re-inventing the wheel
> >>>> first. Is there an attribute or utility that I am overlooking that will
> >>>> help me do this?
> >>>>
> >>>>
> >>> Not currently (to my knowledge). The rate-thresholding aspect is
> >>> similar to what will be supported in the proposed PerfManager.
> >>>
> >>>
> >> I noticed that in your RFC. How are you planning on presenting the data
> >> to other agents (e.g. Nagios, Openview, MRTG, etc.)? One comment that I
> >> should have made on your RFC is that I wonder if it is necessary to
> >> include the data analysis/reduction part.
> >>
> >
> > I think it is because there is too much data to push up the tree to one
> > manager.
> >
> I agree, but does the data need to be pushed to one node? If you go
> with a distributed approach where information is aggregated per network
> device (switch or group of switches),
The proposal includes a distributed approach.
> then a third-party monitoring
> server can collect and present it in the same way that it does for an
> Ethernet network. That way, you do not need to pass information up to a
> central node. You can just have a third party monitoring application
> collect and present the information. I guess it just depends on how
> much you want to leverage existing monitoring solutions and/or how much
> capability you want inherent in the OFA software.
Third party monitoring agents can hook in at the intermediate nodes in
the collection hierarchy if that is what is desired.
> >> Just having a central location that collects the values and presents them via SNMP is extremely
> >> useful since there is a plethora of monitoring apps (free and
> >> commercial) that do what you are proposing.
> >>
> I should have said 'a location' and not 'a central location'. Since
> most monitoring applications support multiple agents, it is not
> necessary to aggregate the information into one place.
> >
> > In general, this information can be exported via SNMP or whatever the
> > management infrastructure is.
> >
> > BTW, are there SNMP MIBs for all of this information? To my knowledge,
> > some of these were started but never completed. Also, the MIBs were
> > geared at the agents rather than the managers (in the PerfMgt arena).
> >
> There are standard MIBs (e.g. mib-2's ifTable) that can present most of
> the useful information (in/out octets, errors, etc.)
Not most of the useful IB information.
> , but I would suspect that you would have to supplement that with a private MIB as
> most other technologies/vendors have.
Yes, as this may be data out of a non-IBTA-specified manager, it is
likely a private MIB unless one goes for all the agent (PMA) data. There
was a proposed MIB for the PMA at the IETF IPoIB WG.
-- Hal
> Steven.
>
> > -- Hal
> >
> >
> >> That way, a network manager can leverage existing tools currently used for monitoring
> >> Ethernet Nodes, Hosts, etc. You can still include a last change
> >> attribute with each counter so that simple utilities (like the one that
> >> I am writing) can get an idea of how quickly errors are occurring.
> >>
> >
> >
> >> Steven.
> >>
> >>
> >>> -- Hal
> >>>
> >>>
> >>>
> >>>> Thanks,
> >>>>
> >>>> Steven.
> >>>>
> >>>> _______________________________________________
> >>>> openib-general mailing list
> >>>> openib-general at openib.org
> >>>> http://openib.org/mailman/listinfo/openib-general
> >>>>
> >>>> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >
> >
>
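
For illustration only, here is a minimal sketch of the rate check discussed at
the top of the thread: keep the previous sample, compute errors per second
since then, and raise the severity when the rate exceeds a limit. Nothing here
comes from an existing OFA utility. The names Sample, error_rate, severity and
max_rate are hypothetical, the exit codes are the conventional Nagios plugin
ones, and treating a decreasing counter as a reset is an assumption about how
the counters are read and cleared.

```python
from dataclasses import dataclass
from typing import Optional

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


@dataclass
class Sample:
    """One poll of a port error counter (hypothetical structure)."""
    timestamp: float  # seconds, e.g. from time.time()
    errors: int       # cumulative error count read from the port


def error_rate(prev: Sample, curr: Sample) -> float:
    """Errors per second between two polls."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        return 0.0
    delta = curr.errors - prev.errors
    if delta < 0:
        # Assumption: the counter was cleared between polls,
        # so everything in the current value is new.
        delta = curr.errors
    return delta / elapsed


def severity(prev: Optional[Sample], curr: Sample, max_rate: float) -> int:
    """CRITICAL if errors grow faster than max_rate, WARNING if any exist."""
    if prev is not None and error_rate(prev, curr) > max_rate:
        return CRITICAL
    if curr.errors > 0:
        return WARNING
    return OK
```

A module would store the last Sample per port between runs (in a state file,
for example) and pass it back in as prev on the next query.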