Re: Hostmon issues
> I have noclogd set up to pipe its messages to a paging script.

Hopefully being VERY careful what actually gets paged, and what
doesn't. We do a lot of DNS monitoring, and it's amazing what happens
when connectivity to an off-site DNS server goes and we have a couple
hundred domains go Critical.

> What happens is that noclog constantly reports that machines haven't
> posted any HostMonData. If I check netconsole, everything is
> hunky-dory, or at least as good as it gets. Hostmon is set to sleep
> for 15 minutes. Hostmon-client runs every 5 minutes.

Which message do you get? RPCPing or OLDData? Are you checking the
netconsole with the proper log level?

> Also, I previously described problems where hostmon would either
> return one value for all the filesystems I wanted to monitor, or it
> would monitor the same group of variables across all the servers,
> returning a bunch of uninit values. Somebody previously replied and
> said that they devised a scheme whereby they look for the
> DFspace_%used[0-9][0-9] variables, which are dynamically assigned to
> different FS's.
>
> I modified the code so that hostmon-client would return such
> variables; thanks for the tip. Although I get fewer uninit variables,
> I still get some, which is a bother. Has anyone found how to further
> granularize the search for variables so that I can search for
> DFspace_%used03 on machine X, while not on machine Y?

I think something we do prevents this by entering EVERY machine, EVERY
disk. Granted, if 2 disks are in Warning it's ugly, but that RARELY
happens.

> Another issue is that I keep running into problems where portmon
> erroneously reports that hosts or services are down. I've found that
> jumbling entries in the portmon-confg file helps, but doesn't
> entirely fix the problem. Why would this happen? Are there any fixes
> known?

I'd like to hear more about this. We never have a problem like that.

> Finally, nocol-4.2.2beta3 doesn't compile on Solaris 2.6.
> The SNMP utilities complain about a bunch of undefined references.
> I've since given up on that issue because beta3 isn't official yet
> anyway.

Didn't know beta3 was out...

Tuc/TTSG
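
On the question above about checking a given DFspace_%used slot on machine X but not on machine Y, here is a rough sketch of one possible approach. It is not anything NOCOL ships and not the scheme described in the thread: it just keeps a per-host table of the slots you care about and drops the other DFspace readings before they get checked. The host names, the `WANTED_SLOTS` table and both function names are made up for illustration, and it is written in Python purely as an example.

```python
import re

# Hypothetical table: which DFspace_%used slots matter on which host.
# Host names and slot numbers here are invented for the example.
WANTED_SLOTS = {
    "machine-x": {"03"},
    "machine-y": {"01", "02"},
}

# Matches the two-digit slot pattern mentioned in the thread.
DF_VAR = re.compile(r"^DFspace_%used(\d\d)$")


def keep_reading(host, variable):
    """Return True if this (host, variable) pair should be checked."""
    match = DF_VAR.match(variable)
    if match is None:
        # Not a DFspace_%used variable, so leave it alone.
        return True
    # Hosts missing from the table have no wanted slots at all.
    return match.group(1) in WANTED_SLOTS.get(host, set())


def filter_readings(readings):
    """Drop DFspace readings for slots not listed for that host.

    Each reading is a (host, variable, value) tuple.
    """
    return [r for r in readings if keep_reading(r[0], r[1])]
```

With a table like this, a slot that is never listed for a host is simply never looked at there, which may be one way to cut down the remaining uninit values.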