

     Re: [nocol-users] multiple ippingmon programs

Bill Hauck wrote:
> i've got a few questions that i haven't found in the archives or the doc's.
> 1.  can ippingmon be run so that escalations happen differently for different equipment?  (router and switches should be 5 min polls, servers should be 10 min polls.)

I've never tried to run multiple instances of ippingmon, but it should work.
You need to create a separate config file for each interval you want to run.
You also need to use separate output files (the ones that show up in
nocol/data). Finally, because each program creates a pid file based on its
name, copy or create a link to ippingmon under a different name for each
additional instance. Then modify keepalive_monitors (or whatever you use) to
start them like:

ippingmon -o ../data/ippingmon5-output ../etc/ippingmon5.conf
ippingmon10 -o ../data/ippingmon10-output ../etc/ippingmon10.conf
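
A minimal sketch of that setup, written as two small shell helpers. The
function names and the bin directory in the usage note are made up for
illustration and are not part of NOCOL; only the link trick and the command
pattern come from the lines above.

```shell
#!/bin/sh
# Sketch only: the helper names and any directory passed in are
# assumptions, not NOCOL defaults.

# make_instance BIN_DIR NAME
# Creates BIN_DIR/NAME as a symlink to ippingmon so the extra instance
# runs under its own name and therefore gets its own pid file.
make_instance() {
    bin_dir=$1
    name=$2
    if [ ! -e "$bin_dir/ippingmon" ]; then
        echo "ippingmon not found in $bin_dir" >&2
        return 1
    fi
    # -f replaces a stale link left over from an earlier run
    ln -sf ippingmon "$bin_dir/$name"
}

# start_command NAME
# Prints the command line for one instance, following the pattern above:
# a separate output file and a separate config file per instance.
start_command() {
    name=$1
    echo "$name -o ../data/$name-output ../etc/$name.conf"
}
```

For example, after "make_instance /usr/local/nocol/bin ippingmon10" (the
path is a guess at a typical install), "start_command ippingmon10" prints the
second command line shown above, ready to paste into keepalive_monitors.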

> 2.  can ippingmon page me if some equipment goes down and not if other equipment goes down (local and wide area).

ippingmon doesn't page but notifier can. Notifier, as shipped, is a shell
script that sends mail when things are down. It's run from cron periodically.
I had started to hack it to dial a pager but then we got new alphanumeric
pagers that we can send email to. If yours are more conventional, you'll have
to figure out how to dial a modem under programmatic control and incorporate
that into notifier.

To monitor different devices differently, I've simply made multiple copies of
notifier, each for a different purpose. I have one that beeps someone if our
web servers go down and a different one that beeps someone else if the mail
servers go down. A third one beeps us network guys if any of a list of
routers goes down. They run at different intervals depending on the
criticality of the devices being watched. The key to the kingdom is the call
to eventselect. Eventselect pulls certain entries out of nocol: you pick the
criticality, the monitor and how long the device has been critical. $i and $j
select the time frame. The default notifier looks at everything every hour,
but you can be more selective. For example, our web server is monitored with
portmon, so that script calls (basically):

eventselect -v critical -s portmon -f +3600 -t +300 ${DATADIR}/*

I run that output through grep to pick out just the web servers. The script
runs every 5 minutes (cron) because we want to know NOW if the web server
freezes up. A similar one that monitors the SMTP port runs every half hour.
Finally, I run the standard notifier which sends me mail for all devices that
are down for 1 to 2 hours. It's a useful 'first check' when I get in in the
morning.

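As a rough sketch, one of those notifier copies could be boiled down to
something like the following. The eventselect flags are the ones quoted
above; the function names, the DATADIR default, the grep pattern and the
mail recipient are placeholders invented for illustration, and the shipped
notifier script is organized differently.

```shell
#!/bin/sh
# Sketch only: DATADIR default, host pattern and recipient are
# placeholders. The eventselect flags are the ones quoted above.

DATADIR=${DATADIR:-/usr/local/nocol/data}   # assumed location

# select_events MONITOR PATTERN
# Pulls critical events for one monitor out of the NOCOL data files
# and keeps only the lines matching PATTERN (for example, web server names).
select_events() {
    monitor=$1
    pattern=$2
    eventselect -v critical -s "$monitor" -f +3600 -t +300 "$DATADIR"/* |
        grep -E "$pattern"
}

# notify_if_down MONITOR PATTERN RECIPIENT
# Mails the matching events to RECIPIENT; sends nothing when none match.
notify_if_down() {
    # grep exits non-zero when nothing matches, so there is nothing to send
    events=$(select_events "$1" "$2") || return 0
    if [ -n "$events" ]; then
        echo "$events" | mail -s "NOCOL: $1 critical" "$3"
    fi
}
```

A copy that ends with a call such as "notify_if_down portmon 'www'
webteam@example.com" and runs from cron every 5 minutes would cover the web
server case; a second copy with a different monitor, pattern and recipient
covers the mail servers.
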
> 3.  when i "hide from critical view" will the system still page me for the device?

"Hide from critical view" only affects what gets placed on the Critical.html
and User.html web pages. It has no other effect.

     Rick Beebe                                           (203) 785-4566
     Network Engineering Manager                     FAX: (203) 737-4037
     ITS-Med Technology Operations                Richard.Beebe@yale.edu   
     Yale University School of Medicine                                 
     333 Cedar Street, New Haven, CT 06510