[nocol-users] Hostmon, start/stop scripts, changing configurations and merging old and new configurations.
I recently submitted a number of changes for possible inclusion in NOCOL. One change makes each daemon check the timestamp of its configuration file; if the timestamp changes, the daemon rereads the file.

Currently it seems that when the configuration file has changed, people "kill" the daemon and then either restart it by hand (oh yeah, I also submitted start/stop/restart scripts for just such purposes) or wait and let the keepalive monitor restart it. The keepalive program, however, has no way of knowing whether the daemon stopped because of such a configuration change or because of something more serious, like a bug in the code. And the keepalive has to go through all the daemons to find the one you know you just killed. In contrast, the start/stop/restart code I submitted can "know" that a restart was issued.

The code I wrote to reread the configuration is basically boilerplate that I've used many other times and have been quite happy with -- especially given the existing situation, which is a bit awkward.

A couple of concerns were raised. The first was that this might be too automatic: the daemon might reread the configuration while the file is in an inconsistent, half-modified state. Having used this code quite a lot, I have not found this to be a problem. Most text editors keep their own internal copy of the file while it is being worked on, and they write an "autosave" copy to disk so that even extensive changes survive if, say, the computer or editor session crashes unexpectedly. The file is written under its original name only when a "save" command is issued. So yes, one may have to be more careful and save only when the file is consistent, but I have not found that overly cumbersome. If it is for others, I'd suggest making a copy of the configuration file, modifying that, and copying it back when all of the changes are to one's satisfaction.
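A minimal sketch of the timestamp check described above, written in Python rather than hostmon's perl. This is not the submitted code; the ConfigWatcher class and the parse callback are assumptions made purely for illustration.

```python
import os


class ConfigWatcher:
    """Reread a config file whenever its modification time changes."""

    def __init__(self, path, parse):
        self.path = path
        self.parse = parse      # hypothetical callback: file text -> config object
        self.mtime = None       # timestamp of the version currently loaded
        self.config = None

    def check(self):
        """Call once per work cycle; returns True if the file was reread."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self.mtime:
            return False        # unchanged since the last load
        with open(self.path) as f:
            self.config = self.parse(f.read())
        self.mtime = mtime
        return True
```

A daemon would call check() at the top of each polling pass, so the file is only looked at when the daemon is about to do work anyway.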
The other reason that automatically rereading a configuration file when the timestamp changes is not a problem is that the daemon only checks the timestamp when it is doing work; most of the time the daemon is sleeping. In fact, that the daemon *doesn't* read the configuration file immediately after it has been changed has also been something of an annoyance. Therefore, in the perl code for hostmon, I added the ability to send a HUP signal (as you do for other daemons like inetd or syslogd), which breaks the program out of its sleep (where it spends most of its time) so that the configuration file is then reread.

The second concern was that there is a bit of history or state that one may want to preserve across configuration changes: for example, when a condition first started occurring, IP address changes for a device that keeps the same name, device name changes at the same IP, and so on. Although this is a good thing to add and I hope it does get added, my change doesn't attempt to address this facet. I believe this is an independent problem that needs to be addressed. It's a problem with the existing code as it stands: you restart a daemon, which people have to do now whenever there is a configuration change, and the history of what happened before is lost. If NOCOL daemons are somehow improved so that on startup they consult the existing output data and merge it in before overwriting it, then the code I wrote would probably be able to take advantage of that too.

I should also mention that switching between NOCOL versions 4.3 and 4.4 makes the "merging" problem a tad more difficult, because one also has to consider the different protocol formats in the output file. I suggest, then, that future output formats store the protocol version in the data file.
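As a rough illustration of that suggestion, a data file could begin with a small fixed header carrying a marker and a version number. The sketch below is Python; the marker bytes, the header layout and the function names are invented for illustration and do not describe any actual NOCOL format.

```python
import struct

MAGIC = b"NCOL"                   # hypothetical marker, not a real NOCOL value
HEADER = struct.Struct(">4sH")    # marker + 16-bit protocol version, big-endian


def write_header(f, version):
    """Write the header at the start of a freshly opened binary data file."""
    f.write(HEADER.pack(MAGIC, version))


def read_version(f):
    """Return the protocol version, or None if the file has no header."""
    raw = f.read(HEADER.size)
    if len(raw) < HEADER.size:
        return None
    magic, version = HEADER.unpack(raw)
    return version if magic == MAGIC else None
```

A file for which read_version() returns None can still be treated as one of the older, headerless formats.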
Although between 4.3 and 4.4 one can sort of figure out which format is used by checking whether the file size is a multiple of the record size, this is not a good method -- what happens if the file size is a multiple of *both* record sizes? What happens if in the future there is an output file protocol change that doesn't affect the size?

Finally, I should mention that the hostmon changes I've made now allow the program to write its output to syslog, to a file, or nowhere. There are now command-line switches for many of the things that one previously had to edit the file for, like debugging, the configuration file location and so on. I still consider the hostmon changes experimental, but if folks want to test them out, they can contact me for this code or for the start/stop code, which currently works on System V-ish systems like Solaris or Redhat-like (not Debian) Linux.
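To give a feel for the kind of switches described above, here is a sketch in Python (hostmon itself is perl). The option names --config, --debug, --output and --logfile, and their default values, are hypothetical and are not hostmon's actual switches.

```python
import argparse
import logging
import logging.handlers


def parse_args(argv=None):
    p = argparse.ArgumentParser(description="monitor daemon (illustrative only)")
    # All option names and defaults below are hypothetical.
    p.add_argument("--config", default="hostmon.conf",
                   help="configuration file location")
    p.add_argument("--debug", action="store_true",
                   help="enable debugging output")
    p.add_argument("--output", choices=["syslog", "file", "none"],
                   default="none", help="where to send output")
    p.add_argument("--logfile", default="hostmon.out",
                   help="file to write when --output file is given")
    return p.parse_args(argv)


def make_handler(args):
    """Pick where output goes: syslog, a file, or nowhere."""
    if args.output == "syslog":
        # /dev/log is the usual local syslog socket on Linux (an assumption;
        # other systems may need a different address).
        return logging.handlers.SysLogHandler(address="/dev/log")
    if args.output == "file":
        return logging.FileHandler(args.logfile)
    return logging.NullHandler()
```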