[openib-general] Port error rate detection

Hal Rosenstock halr at voltaire.com
Tue Feb 20 06:47:52 PST 2007


On Tue, 2007-02-20 at 09:44, Steven Carter wrote:
> Hal Rosenstock wrote:
> > On Mon, 2007-02-19 at 15:53, Steven Carter wrote:
> >   
> >> I have a Nagios module that alerts on connectivity, port errors, and 
> >> speed/width problems.  I would like to give it the ability to change the 
> >> severity of the alert depending on whether errors are just present or if 
> >> they are increasing faster than a specified rate.  The intent is to 
> >> equip the module to keep the state of the last query and possibly 
> >> history, but I wanted to make sure that I was not re-inventing the wheel 
> >> first.  Is there an attribute or utility that I am overlooking that will 
> >> help me do this?
> >>     
> >
> > Not currently (to my knowledge). The rate-thresholding aspect is
> > similar to what will be supported in the proposed PerfManager.
> >   
> I noticed that in your RFC.  How are you planning on presenting the data 
> to other agents (e.g. Nagios, Openview, MRTG, etc.)?  One comment that I 
> should have made on your RFC is that I wonder if it is necessary to 
> include the data analysis/reduction part.

I think it is necessary, because there is too much data to push up the
tree to one manager.

> Just having a central location that collects the values and presents
> them via SNMP is extremely useful since there are a plethora of
> monitoring apps (free and commercial) that do what you are proposing.

In general, this information can be exported via SNMP or whatever the
management infrastructure is.

BTW, are there SNMP MIBs for all of this information? To my knowledge,
some of these were started but never completed. Also, the MIBs were
geared at the agents rather than the managers (in the PerfMgt arena).

-- Hal

> That way, a network manager can leverage existing tools currently used
> for monitoring Ethernet nodes, hosts, etc.  You can still include a
> last-change attribute with each counter so that simple utilities (like
> the one that I am writing) can get an idea of how quickly errors are
> occurring.

> Steven.
> 
> > -- Hal
> >
> >   
> >> Thanks,
> >>
> >> Steven.
> >>
> >
> >   
> 
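
For reference, the rate check discussed in this thread (keep the previous
counter reading, then raise the severity when errors grow faster than a
specified rate) might look roughly like the sketch below. It is a minimal
Python illustration, not code from the Nagios module or from the proposed
PerfManager. The names Sample, error_rate, and severity, and the max_rate
parameter, are hypothetical. The return values follow the standard Nagios
plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL).

```python
from dataclasses import dataclass
from typing import Optional

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


@dataclass
class Sample:
    """One reading of a port error counter (hypothetical structure)."""
    timestamp: float  # seconds since some fixed epoch
    errors: int       # error counter value at that time


def error_rate(prev: Sample, curr: Sample) -> float:
    """Return errors per second accumulated between two readings."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        return 0.0  # no usable interval, so no rate can be computed
    delta = curr.errors - prev.errors
    if delta < 0:
        # Assumption: a lower value means the counter was cleared
        # between polls, so count only what accumulated since then.
        delta = curr.errors
    return delta / elapsed


def severity(prev: Optional[Sample], curr: Sample, max_rate: float) -> int:
    """Map the current reading to a Nagios status code."""
    if curr.errors == 0:
        return OK
    if prev is not None and error_rate(prev, curr) > max_rate:
        return CRITICAL  # errors are increasing faster than allowed
    return WARNING       # errors are present but not growing too fast
```

The module would persist the last Sample per port between runs (for
example in a small state file) and pass it in as prev on the next query.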




