     Monitoring the monitors

Hi,

	We just ran into a problem where a machine failed and caused
hostmon to "lock".  The last thing it appears to have been doing was an
rcp of the file from the machine that failed.  We fixed the machine last
Friday and didn't check NOCOL.  Since then, 2 other machines had the
hostmon monitor die, and we didn't know.  Even worse, though, another
machine went down and we didn't find out until we saw other indications.

	I hate to do this, but is there something we can do to monitor the
monitors?  (We'd of course have a monitor monitor). 
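
	To make the idea concrete, here is a rough Python sketch of the kind
of check I have in mind.  It is not anything NOCOL provides: the function
name, the list of files, and the age threshold are all made up for
illustration.  It assumes each monitor rewrites some output file
regularly, so a file that is missing or has not changed in a while
suggests the monitor has died or is hung.

```python
import os
import time


def stale_outputs(paths, max_age_seconds, now=None):
    """Return the paths that are missing or older than max_age_seconds.

    Hypothetical helper: `paths` would be whatever files the monitors
    are expected to refresh; nothing here is a NOCOL interface.
    """
    now = time.time() if now is None else now
    stale = []
    for path in paths:
        try:
            age = now - os.path.getmtime(path)
        except OSError:
            # A missing or unreadable file counts as stale.
            stale.append(path)
            continue
        if age > max_age_seconds:
            stale.append(path)
    return stale
```

	Run from cron, something like this could send mail whenever the
returned list is not empty.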

	On an unrelated note: is there a way to keep "X" previous copies
of hostmon output?  We sometimes don't catch a situation where it would
have been really great to see the hostmon information until after it has
been refreshed.  We'd only want to keep the last "X" rolling copies of
the config per machine.
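
	For the rolling copies, something along these lines is what I am
picturing.  Again, this is only a rough Python sketch under my own
assumptions: the function name and the ".1", ".2" suffix scheme are
hypothetical, and it would have to be called on each per-machine file
just before that file gets refreshed.

```python
import os
import shutil


def keep_rolling_copies(path, copies):
    """Save `path` as path.1, shifting older copies up to path.<copies>.

    Hypothetical helper: the numbered-suffix naming is an assumption,
    not something hostmon does itself.
    """
    if copies < 1 or not os.path.exists(path):
        return
    # Shift path.(copies-1) -> path.copies, ..., path.1 -> path.2.
    # os.replace overwrites the target, so the oldest copy falls off.
    for n in range(copies - 1, 0, -1):
        src = f"{path}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{path}.{n + 1}")
    # Copy (not move) the current file so it stays in place.
    shutil.copy2(path, f"{path}.1")
```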

		Thanks, Tuc/TTSG