Re: [nocol-users] Root Cause Analysis
On Tue, Jan 18, 2000 at 01:38:16PM -0500, Jonathan A. Zdziarski wrote:
> I think what everyone's goal is in this new NOCOL is not only to have
> dependencies, but ultimately do some root cause analysis. I'm still working
> on the basic architecture of the rulesets, but in the meantime if some of
> you want to contribute your own dependency architecture and any other
> information you'd like, it will help me create a decent rules structure for
> something like this.

Nothing terrifies me more in a meeting than when a customer asks about Root Cause Analysis. The Holy Grail of network monitoring: tell me not what happened to the network but where to fix it!

I think what needs to be done to accomplish this is: intelligent NOC operators. The software can help some, but keeping dependency information for any decent-size network up to date is nigh impossible.

A good journaling / helpdesk system can help here. Given Problem A, in the past, causes have been C, D or E.

Some thresholding rules can help the operator determine what happened when the network goes crazy: 100 alarms in the last five minutes, 3 in the previous five-minute chunk. Well, what were the first few errors of those 100?

Event filtering, sorting and well-trained operators are the solution to RCA, imho.

Opinions?

--
Barry Robison - brobison@deimos.org

Why one contradicts. One often contradicts an opinion when it is really only the way in which it has been presented that is unsympathetic.
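
To make the thresholding idea concrete, here is a rough Python sketch of the "100 alarms in the last five minutes, 3 in the previous chunk" rule. It is not NOCOL code and does not use NOCOL's event format: the `Event` record, the `burst_onset` function, and the `ratio` and `first_n` parameters are all made-up names for illustration, and the 10x ratio is an arbitrary assumption an operator would tune.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical alarm record; not NOCOL's actual event structure."""
    timestamp: float  # seconds since the epoch
    device: str
    message: str


def burst_onset(events, now, window=300, ratio=10, first_n=5):
    """Return the earliest alarms of the current window if it looks like a burst.

    The current window is (now - window, now]; the previous one is the
    window-sized chunk just before it. A burst is assumed when the current
    count is at least `ratio` times the previous count (a quiet previous
    window is treated as one alarm so the ratio stays meaningful).
    Returns an empty list when there is no burst.
    """
    current = [e for e in events if now - window < e.timestamp <= now]
    previous = [e for e in events
                if now - 2 * window < e.timestamp <= now - window]

    if len(current) < ratio * max(len(previous), 1):
        return []

    # The first few alarms of the burst are the operator's best starting point.
    return sorted(current, key=lambda e: e.timestamp)[:first_n]
```

Called with the numbers from the message (3 alarms in the previous five minutes, 100 in the current five), this hands the operator the five earliest alarms of the burst. It only narrows down where to look; deciding which of them is the root cause is still up to the operator.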