[openib-general] Port error rate detection

Hal Rosenstock halr at voltaire.com
Tue Feb 20 10:42:38 PST 2007


On Tue, 2007-02-20 at 10:25, Steven Carter wrote:
> Hal Rosenstock wrote:
> > On Tue, 2007-02-20 at 09:44, Steven Carter wrote:
> >   
> >> Hal Rosenstock wrote:
> >>     
> >>> On Mon, 2007-02-19 at 15:53, Steven Carter wrote:
> >>>   
> >>>       
> >>>> I have a Nagios module that alerts on connectivity, port errors, 
> >>>> speed/width problems.  I would like to give it the ability to change the 
> >>>> severity of the alert depending on whether errors are just present or if 
> >>>> they are increasing faster than a specified rate.  The intent is to 
> >>>> equip the module to keep the state of the last query and possibly 
> >>>> history, but I wanted to make sure that I was not re-inventing the wheel 
> >>>> first.  Is there an attribute or utility that I am overlooking that will 
> >>>> help me do this?
> >>>>     
> >>>>         
> >>> Not currently (to my knowledge). The rate-thresholding aspect is
> >>> similar to what will be supported in the proposed PerfManager.
> >>>   
> >>>       
> >> I noticed that in your RFC.  How are you planning on presenting the data 
> >> to other agents (e.g. Nagios, Openview, MRTG, etc.)?  One comment that I 
> >> should have made on your RFC is that I wonder if it is necessary to 
> >> include the data analysis/reduction part.
> >>     
> >
> > I think it is because there is too much data to push up the tree to one
> > manager.
> >   
> I agree, but does the data need to be pushed to one node?  If you go 
> with a distributed approach where information is aggregated per network 
> device (switch or group of switches), 

The proposal includes a distributed approach.

> then a third-party monitoring 
> server can collect and present it in the same way that it does for an 
> Ethernet network.  That way, you do not need to pass information up to a 
> central node.  You can just have a third party monitoring application 
> collect and present the information.  I guess it just depends on how 
> much you want to leverage existing monitoring solutions and/or how much 
> capability you want inherent in the OFA software.

Third-party monitoring agents can hook in at the intermediate nodes in
the collection hierarchy if that is what is desired.

> >> Just having a central location that collects the values and presents
> >> them via SNMP is extremely useful, since there is a plethora of
> >> monitoring apps (free and commercial) that do what you are proposing.
> >>     
> I should have said 'a location' and not 'a central location'.  Since 
> most monitoring applications support multiple agents, it is not 
> necessary to aggregate the information into one place.
> >
> > In general, this information can be exported via SNMP or whatever the
> > management infrastructure is.
> >
> > BTW, are there SNMP MIBs for all of this information? To my knowledge,
> > some of these were started but never completed. Also, the MIBs were
> > geared at the agents rather than the managers (in the PerfMgt arena).
> >   
> There are standard MIBs (e.g. mib-2's ifTable) that can present most of 
> the useful information (in/out octets, errors, etc.)

Not most of the useful IB information.

> , but I would suspect that you would have to supplement that with a private MIB as 
> most other technologies/vendors have.

Yes, as this may be data out of a non-IBTA-specified manager, it is
likely a private MIB unless one goes for all the agent (PMA) data. There
was a proposed MIB for the PMA at the IETF IPoIB WG.
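
For what it's worth, the check described at the top of the thread (keep
the previous sample, compare the increase against a threshold) could be
sketched roughly as below. This is only an illustration in Python: the
names Sample, error_rate and severity, and the choice of WARNING for
"errors present" versus CRITICAL for "errors increasing too fast", are
assumptions of the sketch, not part of any existing utility or of the
PerfManager proposal. It also assumes that a counter which goes
backwards was reset between polls.

```python
from typing import NamedTuple, Optional

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


class Sample(NamedTuple):
    """One poll of a port error counter (hypothetical structure)."""
    timestamp: float  # seconds, e.g. from time.time()
    errors: int       # counter value read at that time


def error_rate(prev: Sample, curr: Sample) -> float:
    """Errors per second between two polls."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr.errors - prev.errors
    if delta < 0:
        # Assumption: the counter was reset between polls, so
        # everything currently in it accumulated since the reset.
        delta = curr.errors
    return delta / elapsed


def severity(prev: Optional[Sample], curr: Sample, max_rate: float) -> int:
    """Map a counter reading to a Nagios severity."""
    if curr.errors == 0:
        return OK
    if prev is None:
        # No history yet: errors are present, rate is unknown.
        return WARNING
    return CRITICAL if error_rate(prev, curr) > max_rate else WARNING
```

A plugin would have to persist the last Sample per port between runs (a
small state file, say) and pass it in as prev on the next poll.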

-- Hal

> Steven.
> 
> > -- Hal
> >
> >   
> >> That way, a network manager can leverage existing tools currently used
> >> for monitoring Ethernet nodes, hosts, etc.  You can still include a
> >> last-change attribute with each counter so that simple utilities (like
> >> the one that I am writing) can get an idea of how quickly errors are
> >> occurring.
> >>     
> >
> >   
> >> Steven.
> >>
> >>     
> >>> -- Hal
> >>>
> >>>   
> >>>       
> >>>> Thanks,
> >>>>
> >>>> Steven.
> >>>   
> >>>       
> >
> >   
> 




