

     Re: [snips-users] false/null alert

On Thu, Jun 20, 2002 at 09:27:01AM -0500, peiffer at engineer8 nts.umn.edu wrote:
> Has anyone completed additional debug on hostmon null alerts?
> I too have found problems with hostmon similar to what Russell reported back
> in March.  I have monitors based upon the hostmon script that I am recycling.
> The scripts give null information on reporting monitor, device and variable.

I /think/ mine may be related to the root directory getting close to
full (i.e. DFSpace for "/").  Looking into the logfiles, I noticed that
the null entries are actually getting /logged/ that way and not just
alerted that way, so the problem is probably in whatever records the
event (i.e. hostmon itself in this case?).

> Tue Jun 18 11:16:49 2002 []: DEVICE   VAR  1127 900  LEVEL Critical LOGLEVEL Critical STATE down old
> Tue Jun 18 11:21:49 2002 []: DEVICE   VAR  298 900  LEVEL Info LOGLEVEL Critical STATE up
> Discussions locally suggested that there may be some interaction near
> the polling time.  Dumping various datafiles
> (../bin/display_snips_datafile hostmon-output) reveals no exact match,
> but does indicate some problems with 'unknown' state variables on
> startup.  The only places that I see a null device/agent are hostmon,
> snmpmon, and dhcpmon (recycled hostmon).  The times are the same for
> all 3 monitors.  Could there be a race condition, or deadlock on
> resources common to all of the above?
> Tim Peiffer	peiffer at umn edu
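
For anyone who wants to check their own logs for the same pattern, here
is a rough Python sketch that flags entries with empty sender, device or
variable fields.  The field layout is an assumption inferred only from
the two entries quoted above (timestamp, bracketed sender, then the
DEVICE, VAR and LEVEL keywords); the name null_fields is made up for
illustration and is not part of SNIPS.

```python
import re

# Assumed layout, inferred only from the quoted entries:
#   <timestamp> [<sender>]: DEVICE <device...> VAR <var...> LEVEL ...
LINE_RE = re.compile(
    r"^(?P<stamp>.*?) \[(?P<sender>[^\]]*)\]: "
    r"DEVICE(?P<device>.*?)VAR(?P<var>.*?)LEVEL "
)


def null_fields(line):
    """Return the names of fields that look empty, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    missing = []
    if not m.group("sender").strip():
        missing.append("sender")
    if not m.group("device").strip():
        missing.append("device")
    var_tokens = m.group("var").split()
    # Assumption: a healthy entry names the variable before its numbers,
    # so a bare number in first position means the name is missing.
    if not var_tokens or re.fullmatch(r"-?\d+", var_tokens[0]):
        missing.append("variable")
    return missing
```

Feeding each line of the log through null_fields() and keeping the ones
that return a non-empty list should give the timestamps to compare
against the polling times of the other monitors.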

Personally, I hope to revisit this soon... though with the (in)famous
caveat... "as soon as work calms down enough that I can look at it."  I
have noticed, though, that now that I keep my root drives pretty wide
open and well away from the thresholds, it seems a lot quieter.
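
To test the disk-space theory, one option is to record how full "/" is
whenever a null entry turns up.  The sketch below is only that: the 90%
figure is a made-up placeholder rather than anything from a hostmon
config, and the percentage may differ slightly from what df prints
because of reserved blocks.

```python
import shutil


def percent_used(total, used):
    """Percentage of the filesystem in use, rounded to one decimal place."""
    return round(100.0 * used / total, 1)


def root_headroom(path="/", warn_at=90.0):
    """Return (percent used, True if at or above warn_at) for the given path."""
    # warn_at is a hypothetical figure; substitute your own threshold.
    usage = shutil.disk_usage(path)
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= warn_at


if __name__ == "__main__":
    pct, near = root_headroom()
    print(f"/ is {pct}% used; near threshold: {near}")
```

Run from cron alongside the monitors, this would show whether the null
entries line up with the root filesystem crossing the threshold.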


Russell M. Van Tassell
russell at loosenut com

Seleznick's Theory of Holistic Medicine:
       Ice Cream cures all ills.
