diff -Naur nocol-4.3/INSTALL nocol-4.3.1/INSTALL
--- nocol-4.3/INSTALL	Wed Jan 26 23:56:19 2000
+++ nocol-4.3.1/INSTALL	Tue Mar 21 00:24:06 2000
@@ -1,4 +1,4 @@
-## $Id: INSTALL,v 4.4 2000/01/27 04:55:57 vikas Exp $
+## $Id: INSTALL,v 4.5 2000/03/21 05:23:58 vikas Exp $
 
 INSTALLATION INSTRUCTIONS FOR 'NOCOL' v4.3
 ==========================================
@@ -147,25 +147,31 @@
 ---------
 
 There is a PERL interface for developing additional NOCOL monitors. To use 
-this, you need to have PERL installed on your system (perl is available
-from ftp.uu.net or from ftp.netlabs.com).
+this, you need to have PERL installed on your system.
 
-1.  If using 'hostmon', edit the hostmon server and the clients to set
-    the port & permitted hosts. The client routines do not use nocollib.pl
-    and can be run entirely standalone on remote hosts (just copy over
-    perlnocol/hostmon-osclients to all your Unix clients and start up
-    hostmon-client at boot time by makeing an entry in your /etc/rc.local
-    or equivalent file). As an example, you can do the following on all
-    your Unix hosts:
+1.  If using 'hostmon', you need to run the standalone 'hostmon-client'
+    programs on the machines you want monitored, and run the 'hostmon'
+    process on the 'nocol' server. Check the '@permithosts' line in the
+    'hostmon-client' program to ensure that it allows the nocol host to
+    connect to the hostmon-client processes. Then copy over the entire
+    'perlnocol/hostmon-osclients' directory to all the Unix hosts that
+    you want monitored. These client routines do not use nocollib.pl
+    and do not use any configuration file.
+    Start up hostmon-client at boot time by making an entry in your
+    /etc/rc.local or equivalent file. As an example, you can do the
+    following on each Unix host you want monitored:
 	cd $ROOTDIR/bin
 	rsh host1 mkdir /usr/local/nocol
 	rcp -r hostmon-osclients host1:/usr/local/nocol
 	rlogin host1
 	# Now edit your /etc/rc.local or whatever system startup script
 	# and add the line:
-	#	(cd /usr/local/nocol/hostmon-osclients; ./hostmon-client)
-	# Run this command manully for now since you are not rebooting
+	#   (cd /usr/local/nocol/hostmon-osclients; ./hostmon-client)
+	# Run this command manually for now since you are not rebooting
 	# your machine.
+
+    The 'hostmon' process on the nocol host will be restarted by the
+    'keepalive_monitors' process. Edit the hostmon-confg file.
 
 2.  To use 'snmpmon',  edit and set the thresholds in the snmpmon-confg
     file. List the devices that need to be monitored in the 
diff -Naur nocol-4.3/Makefile.mid nocol-4.3.1/Makefile.mid
--- nocol-4.3/Makefile.mid	Wed Jan 26 23:58:24 2000
+++ nocol-4.3.1/Makefile.mid	Thu Apr  6 15:35:19 2000
@@ -1,4 +1,4 @@
-# $Header: /home/vikas/src/nocol/RCS/Makefile.mid,v 1.3 2000/01/27 04:58:15 vikas Exp $
+# $Header: /home/vikas/src/nocol/RCS/Makefile.mid,v 1.4 2000/04/06 19:35:08 vikas Exp $
 #
 # Makefile for 'nocol'. This file simply calls on other Makefiles in
 # the subdirectories to do all the work. All the definitions are used
@@ -7,7 +7,7 @@
 # To 'make' for only one program, use 
 #	make "SRCS=trapmon" [install|clean]
 #
-REV = "4.3"
+REV = "4.3.1"
 package=@package@
 OS=@OS@
 WHOAMI=@WHOAMI@
diff -Naur nocol-4.3/html/design.html nocol-4.3.1/html/design.html
--- nocol-4.3/html/design.html	Wed Jan 26 23:39:32 2000
+++ nocol-4.3.1/html/design.html	Mon Mar 27 22:08:58 2000
@@ -1,230 +1,237 @@
-<html>
-
-<head>
-<title>The Design of NOCOL</title>
-<meta name="GENERATOR" content="Microsoft FrontPage 3.0">
-</head>
-
-<body bgcolor="#ffffff">
-<div align="center"><center>
-
-<table border="0" cellpadding="0" cellspacing="8" width="98%">
-  <tr>
-    <td align="right" valign="top" width="20%">&nbsp; </td>
-    <td width="15"></td>
-    <td valign="bottom" width="80%"><p align="left"><font size="5" face="Arial,Helvetica">NOCOL-
-    Design and Internals</font><br>
-    <font size="3" face="Arial,Helvetica">vikas@navya.com<br>
-    Jan 19, 2000</font></td>
-  </tr>
-  <tr>
-    <td></td>
-    <td></td>
-    <td>&nbsp; <hr noshade>
-    </td>
-  </tr>
-  <tr>
-    <td valign="top" width="20%"><a href="#overview"><font size="2" face="Arial,Helvetica"><strong>Overview</strong></font></a><font
-    size="3" face="Arial,Helvetica"><br>
-    </font><font size="2" face="Arial,Helvetica">¤ Design principles<br>
-    ¤ Architecture</font><p><a href="#monitors"><strong><font size="2" face="Arial,Helvetica">Monitors</font></strong></a></p>
-    <p><font size="2" face="Arial,Helvetica"><strong><a href="#userInterfaces">User Interfaces</a><br>
-    </strong>¤ Netconsole<br>
-    ¤ WebNocol<br>
-    ¤ tkNocol</font></p>
-    <p><font size="2" face="Arial,Helvetica"><a href="#reporting"><strong>Reporting</strong></a></font></p>
-    <p><a href="#futureWork"><font size="2" face="Arial,Helvetica"><strong>Future Work</strong></font></a><font
-    size="2" face="Arial,Helvetica"><br>
-    </font></td>
-    <td width="15"></td>
-    <td valign="top" width="80%"><h2><a name="overview">Overview</a></h2>
-    <p>NOCOL is a system and network monitoring software which runs on Unix platforms and
-    monitors reachability, ports, routes, system resources, etc. It is modular in design, and
-    allows adding new monitors easily and without impacting other portions of the software- in
-    fact, a large number of the monitors are contributed by various NOCOL users.</p>
-    <p>The basic design principles&nbsp; behind NOCOL are relatively few:<ul>
-      <li>allow multiple users to see the same data being collected instead of requiring each user
-        to start their own set of monitors</li>
-      <li>multiple layers of severity to avoid any false alarms (to stop the NOC operator from
-        ignoring an alarm because it 'usually goes away')</li>
-      <li>incremental data storage (dont store every data sample- only store a datapoint when the
-        severity of an event changes).</li>
-      <li>be able to view the events from a non-graphical interface</li>
-    </ul>
-    <p>It might seem strange, but the initial versions of SunNet Manager, CiscoWorks, etc. had
-    none of the above features when NOCOL was originally written. Most of the commercial
-    packages required pretty extensive hardware to run, and it seemed like they sacrificed a
-    lot in order to present a pretty graphical interface. Nocol in contrast could run on a
-    very low end system, the monitors could be separated from the logging and reporting
-    machine since they communicated over a network, and the datapoints collected were very
-    small in number since they were only recorded when the severity of a device/variable
-    changed. So, the disk space on a machine could vary from 10% to 60% full, and only one
-    entry is logged since this would all be considered 'normal'. If the disk becomes 80% full,
-    &nbsp; a 'warning' message is generated and another datapoint is logged. This simple
-    approach gives amazingly small volumes of data, and yet presents a perfectly comprehensive
-    (though quantized) report on the variables.</p>
-    <p>The architecture of nocol itself is very simple- the monitors poll the devices and
-    assign a threshold to each 'poll' (called an 'event'). These thresholds are user settable
-    and vary from monitor to monitor (in fact, the rest of the software does not care what is
-    being monitored and does not store any intelligence about the variables). All intelligence
-    of the variable being monitored and what conditions are to normal and abnormal is built
-    into the monitor. </p>
-    <p align="center"><img src="images/nocol-arch.gif" width="373" height="487"
-    alt="nocol-arch.gif (8177 bytes)"></p>
-    <p>The monitor would then set:<ul>
-      <li>the current value of the variable (thruput, lost packets)</li>
-      <li>the current threshold (there can be upto 3 thresholds for 3 different levels of
-        severity)</li>
-      <li>timestamp, etc.</li>
-    </ul>
-    <p>and invoke a nocol API function. This writes out the current values, etc. to a realtime
-    data file on disk (this contains the current state of any device/variable) and if the
-    severity has changed, then this also gets logged to a incremental logging daemon
-    (noclogd).</p>
-    <p>All user displays can then display the data from the realtime data directory, whereas
-    the alarm and notification subsystem gets activated by the incremental&nbsp; 'noclogd'
-    process. The noclogd process can filter events based on user defined criteria, and invoke
-    an SMS pager, send email, perhaps even run some automated tests or open a trouble ticket.</p>
-    <p>This simple architecture has proven to work very effectively in this application. The
-    base system has not really changed since the software was initially written, but new
-    monitors, displays, notification software is continually being added without any changes
-    to the core system.</p>
-    <hr noshade>
-    <h2><a name="monitors">The Monitors</a></h2>
-    <p>The monitors collect variable values and compares to see if it exceeds any of the 3
-    thresholds (warning, error, critical- these thresholds are user configurable). This is all
-    done using the nocol library functions, so in effect, all the monitor needs to do is get
-    the value for the variable being monitored and read the thresholds from a config file. The
-    nocol library ensures consistency in the way that the monitoring is processed by the rest
-    of the system.</p>
-    <p>Each monitor is unique in the way that it monitors its respective variable. The DNS
-    monitor needs to make an authoritative DNS query to see if the dns server is configured
-    properly, the Port monitor needs to connect to TCP or UDP ports to ensure that any
-    processes are responding properly, and the SNMP monitor needs to monitor snmp variables
-    using the SNMP protocol. The intelligence about the entity being monitored and how to
-    monitor it lies strictly in the monitor- the rest of the nocol subsystem&nbsp; is just
-    expecting a&nbsp; device name, variable and its value.</p>
-    <p>A fair amount of effort has gone into making the monitors very efficient where possible
-    in order to allow them to scale to a large number of devices. Connectionless (UDP)
-    monitors are specially well suited to using the select() system call so that many devices
-    can be queried at the same time and the monitor then waits for the responses to come in.
-    The other option was to fork multiple processes with a single parent and each process
-    monitors one device. However, the level of scalability that could be achieved with the
-    first method proved to be far more than what could be achieved with the forking method.</p>
-    <p>To emphasize the above, consider 'pinging' 100 devices with 5 packets each, waiting 1
-    second for each response and 1 sec between each packet to the same host. If done serially,
-    this would take at least 500 seconds for each pass. If we fork multiple processes to do it
-    in parallel, this would take about 5 seconds, but we would have to fork a 100 processes.
-    The&nbsp; 'multiping' monitor could send out 1 packet to each of the 100 devices in about
-    10 seconds and then listen for the responses to come in- effectively taking about 15
-    seconds for the entire pass.</p>
-    <p>Building this level of 'multi-tasking' is a lot more difficult in the TCP based
-    services since it would require non-blocking I/O, but it it important to do this for
-    monitors such as 'portmon'. All of these type of monitors (using select()) are limited by
-    the MAXFD value (maximum open file descriptors that can be handled by the select() call).</p>
-    <p>The 'hostmon'&nbsp; monitor is an example of letting the remote hosts being monitored
-    do the local data collection (i.e. distributing the 'time consuming' part to
-    hostmon-clients). The 'hostmon' process running on the nocol host simply takes all these
-    data files and uses them as raw input to the process.</p>
-    <p>In some cases, the monitors do not need any data other than what is in the nocol data
-    structure written to disk (the raw data), whereas in others they need to store ancillary,
-    variable and device specific information in memory. All possible efforts are made to avoid
-    storing unnecessary data in memory and having 'bloated' monitoring processes.</p>
-    <hr noshade>
-    <h2><a name="userInterfaces">User Interfaces</a></h2>
-    <p>The user interfaces need to display the <u>current</u> state of&nbsp; the devices being
-    monitored, and this 'current' data is stored on disk (in the 'data' directory). This
-    allows any number of users and monitors to view the same consistent data, and run only one
-    set of monitors (unlike some other systems which need a separate monitor for each
-    display).</p>
-    <p>The other diversion from traditional network monitoring packages is the displaying of
-    monitored data using text lines and not a map or other graphical interface. The reason
-    this approach was taken is that in practical experience, a network diagram was always done
-    in some 'drawing' tool and the map on the NMS was not updated regularly. Even today, most
-    network/lan diagrams are maintained in a tool such as Visio, and the NMS graphical
-    interface is always a 'second' copy. This and being able to view line based data from any
-    terminal weighed very heavily in favor of a non-graphical user interface.</p>
-    <h4>Netconsole (curses)</h4>
-    <p>Netconsole is a simple Unix 'curses' based TTY interface. It reads the raw data from
-    disk and formats it for displaying on the screen. It has limited intelligence, and its
-    method of setting an alarm is when it sees a change in the number of 'down' items. This is
-    the original user interface and was written to let engineers view the state of the
-    systems/network over a low speed connection (over which X windows, etc. would not be
-    feasible).</p>
-    <h4>WebNocol (Web)</h4>
-    <p>Contributed&nbsp; by Rick Beebe, this is a Web based frontend to the datafiles. It
-    allows running CGI's and troubleshooting listed events and all the other benefits of HTTP.</p>
-    <p><img src="images/webnocol.gif" alt="webnocol.gif (20197 bytes)" width="558"
-    height="503"></p>
-    <p>This web interface automatically refreshes periodically, plays an audio clip if a site
-    changes its severity level, etc. A 'status' message can be displayed next to each event
-    which is inserted by any valid operator. Users are assigned access levels which controls
-    how much information they can view or edit.</p>
-    <h4>tkNocol (Tcl/Tcl)</h4>
-    <p>This is a client-server application contributed by Lydia Leong. The tkNocol application
-    connects to the <em>ndaemon</em> process on the host system, and displays the nocol data
-    in a X-window.</p>
-    <p><img src="images/tkNocol.gif" alt="tkNocol.gif (18782 bytes)" width="533" height="532"></p>
-    <p>This interface needs 'tixwish' on the system. Any number of clients can&nbsp; connect
-    to the simple process (<em>ndaemon</em>) running on the nocol host which sends data to all
-    the clients periodically. Currently there is no access control configured on the ndaemon
-    host, so this should be protected by a firewall, but this interface can be extended to add
-    these features in the daemon if needed.</p>
-    <hr noshade>
-    <h2><a name="reporting">Reporting</a></h2>
-    <p>The monitors in NOCOL generate a 'historical' event (logged to the <em>noclogd</em>
-    logging daemon) only when the severity of a variable <u>changes</u> (i.e. it goes from <em>warning</em>
-    to <em>error</em> or from <em>critical</em> back to <em>info</em>. This is done to reduce
-    the amount of historical data collected and restrict it only to 'relevant' datapoints.
-    This quantized data storage allows a monitor to poll a device or variable as frequently as
-    it likes (30 secs, 10 minutes), but it will generate a logging entry only if the variable
-    crosses one of the thresholds.</p>
-    <p>This approach of classifying the data into 'bins'&nbsp; reduces the quantity of
-    historical data significantly. Even though some granularity is lost, statistical analysis
-    can easily be done on the collected data by using the time interval that a variable
-    remained in a particular level.</p>
-    <p>'noclogd' is similar to the Unix 'syslog' process- it allows piping the log message to
-    an external program or writing to a file&nbsp; based on the monitor name. This forms the
-    basis of&nbsp; invoking SMS scripts to do paging, sending email, automated insertion into
-    trouble ticketing systems, auto problem analysis, etc.- the possibilities are virtually
-    unlimited.</p>
-    <p>Currently this system writes to flat files, but the data can easily be piped to a
-    process that writes into a database. Note that the 'current' data that the monitors write
-    to disk (the raw data which is displayed by the user interfaces) is overwritten in every
-    pass by the monitor. Hence the size of those files is fixed and does not grow over time.</p>
-    <hr noshade>
-    <h2><a name="futureWork">Future Work</a></h2>
-    <p>The package does not interface to any database, and would benefit greatly from storing
-    the raw (monitored) and noclogd historical information in a database such as MySQL, etc.
-    This would allow co-relating the various variables being monitored for any device (e.g.
-    show the state of all variables being monitored for device lan-gw). Graphs and reports
-    could be charted from the historical noclogd database.</p>
-    <p>A Java based user interface along the same lines as tkNocol would allow running the
-    display on any platform and one could build a lot of graphing, reporting functionality
-    into the gui itself.</p>
-    <p>The GUI could also support collapsing all the variables for an event in one line, and
-    only display the variable when it needs to be displayed.</p>
-    <p>Instead of the current <u>line based</u> GUI, it would be good to be able to display a
-    map of the network and the devices. The raw data would have no information on coordinates
-    or drawing, but a separate file could contain all the information necessary to create a
-    graphical display. As an example, the file could contain coordinates of nodes and edges,
-    and hierarchical relationships between the devices- the user interface could read this
-    data and construct the diagram of the network.</p>
-    <p>In order to make this scalable, it would be useful to allow various NOCOL's to interact
-    with each other. This is easily doable using the <em>noclogd</em> daemon, since noclogd
-    can be enhanced to send an event to other noclogd's running on remote hosts. The data can
-    be isolated and&nbsp; referred to using the 'nodename' to prefix the data/events.</p>
-    <hr noshade color="#808080">
-    </td>
-  </tr>
-  <tr>
-    <td><address>
-      <a href="mailto:vikas@navya.com"><small>Vikas Aggarwal</small></a> 
-    </address>
-    </td>
-    <td></td>
-  </tr>
-</table>
-</center></div>
-</body>
-</html>
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<!-- $Header: /home/vikas/src/nocol/html/RCS/design.html,v 1.2 2000/03/28 03:08:51 vikas Exp $ -->
+<html>
+
+<head>
+<title>The Design of NOCOL</title>
+<meta name="GENERATOR" content="Microsoft FrontPage 3.0">
+</head>
+
+<body bgcolor="#ffffff">
+<div align="center"><center>
+
+<table border="0" cellpadding="0" cellspacing="8" width="98%">
+  <tr>
+    <td align="right" valign="top" width="20%">&nbsp; </td>
+    <td width="15"></td>
+    <td valign="bottom" width="80%"><p align="left"><font size="5" face="Arial,Helvetica">NOCOL-
+    Design and Internals</font><br>
+    <font size="3" face="Arial,Helvetica">vikas@navya.com<br>
+    Mar 22, 2000</font></td>
+  </tr>
+  <tr>
+    <td></td>
+    <td></td>
+    <td>&nbsp; <hr noshade>
+    </td>
+  </tr>
+  <tr>
+    <td valign="top" width="20%"><a href="#overview"><font size="2" face="Arial,Helvetica"><strong>Overview</strong></font></a><font
+    size="3" face="Arial,Helvetica"><br>
+    </font><font size="2" face="Arial,Helvetica">¤ Design principles<br>
+    ¤ Architecture</font><p><a href="#monitors"><strong><font size="2" face="Arial,Helvetica">Monitors</font></strong></a></p>
+    <p><font size="2" face="Arial,Helvetica"><strong><a href="#userInterfaces">User Interfaces</a><br>
+    </strong>¤ Netconsole<br>
+    ¤ WebNocol<br>
+    ¤ tkNocol</font></p>
+    <p><font size="2" face="Arial,Helvetica"><a href="#reporting"><strong>Reporting</strong></a></font></p>
+    <p><a href="#futureWork"><font size="2" face="Arial,Helvetica"><strong>Future Work</strong></font></a><font
+    size="2" face="Arial,Helvetica"><br>
+    </font></td>
+    <td width="15"></td>
+    <td valign="top" width="80%"><h2><a name="overview">Overview</a></h2>
+    <p>NOCOL is a system and network monitoring package which runs on Unix platforms and
+    monitors reachability, ports, routes, system resources, etc. It is modular in design, and
+    allows adding new monitors easily and without impacting other portions of the software- in
+    fact, a large number of the monitors are contributed by various NOCOL users.</p>
+    <p>The basic design principles&nbsp; behind NOCOL are relatively few:<ul>
+      <li>allow multiple users to see the same data being collected instead of requiring each user
+        to start their own set of monitors</li>
+      <li>multiple layers of severity to avoid any false alarms (to stop the NOC operator from
+        ignoring an alarm because it 'usually goes away')</li>
+      <li>incremental data storage (don't store every data sample- only store a datapoint when the
+        severity of an event changes).</li>
+      <li>be able to view the events from a non-graphical interface</li>
+    </ul>
+    <p>It might seem strange, but the initial versions of SunNet Manager, CiscoWorks, etc. had
+    none of the above features when NOCOL was originally written. Most of the commercial
+    packages required pretty extensive hardware to run, and it seemed like they sacrificed a
+    lot in order to present a pretty graphical interface. Nocol in contrast could run on a
+    very low end system, the monitors could be separated from the logging and reporting
+    machine since they communicated over a network, and the datapoints collected were very
+    small in number since they were only recorded when the severity of a device/variable
+    changed. So, the disk space on a machine could vary from 10% to 60% full, and only one
+    entry is logged since this would all be considered 'normal'. If the disk becomes 80% full,
+    &nbsp; a 'warning' message is generated and another datapoint is logged. This simple
+    approach gives amazingly small volumes of data, and yet presents a perfectly comprehensive
+    (though quantized) report on the variables.</p>
+    <p>The architecture of nocol itself is very simple- the monitors poll the devices and
+    assign a threshold to each 'poll' (called an 'event'). These thresholds are user settable
+    and vary from monitor to monitor (in fact, the rest of the software does not care what is
+    being monitored and does not store any intelligence about the variables). All intelligence
+    about the variable being monitored and which conditions are normal or abnormal is built
+    into the monitor. </p>
+    <p align="center"><img src="images/nocol-arch.gif" width="373" height="487"
+    alt="nocol-arch.gif (8177 bytes)"></p>
+    <p>The monitor would then set:<ul>
+      <li>the current value of the variable (thruput, lost packets)</li>
+      <li>the current threshold (there can be up to 3 thresholds for 3 different levels of
+        severity)</li>
+      <li>timestamp, etc.</li>
+    </ul>
+    <p>and invoke a nocol API function. This writes out the current values, etc. to a realtime
+    data file on disk (this contains the current state of any device/variable) and if the
+    severity has changed, then this also gets logged to an incremental logging daemon
+    (noclogd).</p>
+    <p>All user displays can then display the data from the realtime data directory, whereas
+    the alarm and notification subsystem gets activated by the incremental&nbsp; 'noclogd'
+    process. The noclogd process can filter events based on user defined criteria, and invoke
+    an SMS pager, send email, perhaps even run some automated tests or open a trouble ticket.</p>
+    <p>An '<strong>event</strong>' is basically a unique tuple of&nbsp; device name + device
+    address + variable name. Each event has a current data value of the variable being
+    monitored, and also a threshold value corresponding to the current severity level. This is
+    best understood by looking at the <em>event</em> data structure in the <em>nocol.h</em> C
+    include file.</p>
+    <p>This simple architecture has proven to work very effectively in this application. The
+    base system has not really changed since the software was initially written, but new
+    monitors, displays and notification software are continually being added without any changes
+    to the core system.</p>
+    <hr noshade>
+    <h2><a name="monitors">The Monitors</a></h2>
+    <p>Each monitor collects variable values and checks whether a value exceeds any of the 3
+    thresholds (warning, error, critical- these thresholds are user configurable). This is all
+    done using the nocol library functions, so in effect, all the monitor needs to do is get
+    the value for the variable being monitored and read the thresholds from a config file. The
+    nocol library ensures consistency in the way that the monitoring is processed by the rest
+    of the system.</p>
+    <p>Each monitor is unique in the way that it monitors its respective variable. The DNS
+    monitor needs to make an authoritative DNS query to see if the dns server is configured
+    properly, the Port monitor needs to connect to TCP or UDP ports to ensure that any
+    processes are responding properly, and the SNMP monitor needs to monitor snmp variables
+    using the SNMP protocol. The intelligence about the entity being monitored and how to
+    monitor it lies strictly in the monitor- the rest of the nocol subsystem&nbsp; is just
+    expecting a&nbsp; device name, variable and its value.</p>
+    <p>A fair amount of effort has gone into making the monitors very efficient where possible
+    in order to allow them to scale to a large number of devices. Connectionless (UDP)
+    monitors are especially well suited to using the select() system call so that many devices
+    can be queried at the same time and the monitor then waits for the responses to come in.
+    The other option was to fork multiple processes with a single parent and each process
+    monitors one device. However, the level of scalability that could be achieved with the
+    first method proved to be far more than what could be achieved with the forking method.</p>
+    <p>To emphasize the above, consider 'pinging' 100 devices with 5 packets each, waiting 1
+    second for each response and 1 sec between each packet to the same host. If done serially,
+    this would take at least 500 seconds for each pass. If we fork multiple processes to do it
+    in parallel, this would take about 5 seconds, but we would have to fork 100 processes.
+    The&nbsp; 'multiping' monitor could send out 1 packet to each of the 100 devices in about
+    10 seconds and then listen for the responses to come in- effectively taking about 15
+    seconds for the entire pass.</p>
+    <p>Building this level of 'multi-tasking' is a lot more difficult in the TCP based
+    services since it would require non-blocking I/O, but it is important to do this for
+    monitors such as 'portmon'. All monitors of this type (using select()) are limited by
+    the MAXFD value (maximum open file descriptors that can be handled by the select() call).</p>
+    <p>The 'hostmon'&nbsp; monitor is an example of letting the remote hosts (that are being
+    monitored) do the local data collection (i.e. distributing the 'time consuming' part to
+    hostmon-clients). The 'hostmon' process runs on the nocol monitoring host and simply takes
+    all these data files and uses them as raw input for processing.</p>
+    <p>In some cases, the monitors do not need any data other than what is in the nocol data
+    structure written to disk (the raw data), whereas in others they need to store ancillary,
+    variable and device specific information in memory. All possible efforts are made to avoid
+    storing unnecessary data in memory and having 'bloated' monitoring processes.</p>
+    <hr noshade>
+    <h2><a name="userInterfaces">User Interfaces</a></h2>
+    <p>The user interfaces need to display the <u>current</u> state of&nbsp; the devices being
+    monitored, and this 'current' data is stored on disk (in the 'data' directory). This
+    allows any number of users and monitors to view the same consistent data, and run only one
+    set of monitors (unlike some other systems which need a separate monitor for each
+    display).</p>
+    <p>The other departure from traditional network monitoring packages is the displaying of
+    monitored data using text lines and not a map or other graphical interface. The reason
+    this approach was taken is that in practical experience, a network diagram was always done
+    in some 'drawing' tool and the map on the NMS was not updated regularly. Even today, most
+    network/lan diagrams are maintained in a tool such as Visio, and the NMS graphical
+    interface is always a 'second' copy. This and being able to view line based data from any
+    terminal weighed very heavily in favor of a non-graphical user interface.</p>
+    <h4>Netconsole (curses)</h4>
+    <p>Netconsole is a simple Unix 'curses' based TTY interface. It reads the raw data from
+    disk and formats it for displaying on the screen. It has limited intelligence, and its
+    method of setting an alarm is when it sees a change in the number of 'down' items. This is
+    the original user interface and was written to let engineers view the state of the
+    systems/network over a low speed connection (over which X windows, etc. would not be
+    feasible).</p>
+    <h4>WebNocol (Web)</h4>
+    <p>Contributed&nbsp; by Rick Beebe, this is a Web based frontend to the datafiles. It
+    allows running CGIs and troubleshooting listed events and all the other benefits of HTTP.</p>
+    <p><img src="images/webnocol.gif" alt="webnocol.gif (20197 bytes)" width="558"
+    height="503"></p>
+    <p>This web interface automatically refreshes periodically, plays an audio clip if a site
+    changes its severity level, etc. A 'status' message can be displayed next to each event
+    which is inserted by any valid operator. Users are assigned access levels which control
+    how much information they can view or edit.</p>
+    <h4>tkNocol (Tcl/Tk)</h4>
+    <p>This is a client-server application contributed by Lydia Leong. The tkNocol application
+    connects to the <em>ndaemon</em> process on the host system, and displays the nocol data
+    in an X window.</p>
+    <p><img src="images/tkNocol.gif" alt="tkNocol.gif (18782 bytes)" width="533" height="532"></p>
+    <p>This interface needs 'tixwish' on the system. Any number of clients can connect
+    to the simple process (<em>ndaemon</em>) running on the nocol host, which sends data to all
+    the clients periodically. Currently there is no access control configured on the ndaemon
+    host, so it should be protected by a firewall; the daemon can be extended to add
+    access control if needed.</p>
+    <hr noshade>
+    <h2><a name="reporting">Reporting</a></h2>
+    <p>The monitors in NOCOL generate an event (logged to the <em>noclogd</em> logging daemon)
+    only when the severity of a variable <u>changes</u> (e.g. it goes from <em>warning</em> to
+    <em>error</em> or from <em>critical</em> back to <em>info</em>). The thresholds for the
+    various severities are defined by the user, and this tends to remove irrelevant data
+    points from the collected data. This threshold triggered event generation allows a
+    monitor to poll a device or variable as frequently as it likes (30 secs, 10 minutes), but
+    it will generate a logging entry only if the variable crosses one of the thresholds.</p>
+    <p>This approach of recording values only when the state changes also reduces the quantity
+    of historical data significantly. Even though some granularity is lost, statistical
+    analysis can easily be done on the collected data by using the time interval that a
+    variable remained in a particular level.</p>
+    <p>'noclogd' is similar to the Unix 'syslog' process: it allows piping the log message to
+    an external program or writing to a file based on the monitor name. This forms the
+    basis for invoking SMS scripts to do paging, sending email, automated insertion into
+    trouble ticketing systems, auto problem analysis, etc. The possibilities are virtually
+    unlimited.</p>
+    <p>Currently this system writes to flat files, but the data can easily be piped to a
+    process that writes into a database. Note that the 'current' data that the monitors write
+    to disk (the raw data which is displayed by the user interfaces) is overwritten in every
+    pass by the monitor. Hence the size of those files is fixed and does not grow over time.</p>
+    <hr noshade>
+    <h2><a name="futureWork">Future Work</a></h2>
+    <p>The package does not interface to any database, and would benefit greatly from storing
+    the raw (monitored) and noclogd historical information in a database such as MySQL.
+    This would allow correlating the various variables being monitored for any device (e.g.
+    show the state of all variables being monitored for device lan-gw). Graphs and reports
+    could be charted from the historical noclogd database.</p>
+    <p>A Java based user interface along the same lines as tkNocol would allow running the
+    display on any platform, and one could build a lot of graphing and reporting functionality
+    into the GUI itself.</p>
+    <p>The GUI could also support collapsing all the variables for an event into one line, and
+    displaying a variable only when it needs to be displayed.</p>
+    <p>Instead of the current <u>line based</u> GUI, it would be good to be able to display a
+    map of the network and the devices. The raw data would have no information on coordinates
+    or drawing, but a separate file could contain all the information necessary to create a
+    graphical display. As an example, the file could contain coordinates of nodes and edges,
+    and hierarchical relationships between the devices- the user interface could read this
+    data and construct the diagram of the network.</p>
+    <p>In order to make this scalable, it would be useful to allow multiple NOCOL installations to interact
+    with each other. This is easily doable using the <em>noclogd</em> daemon, since noclogd
+    can be enhanced to send an event to other noclogd daemons running on remote hosts. The data can
+    be isolated and referred to using the 'nodename' to prefix the data/events.</p>
+    <hr noshade color="#808080">
+    </td>
+  </tr>
+  <tr>
+    <td><address>
+      <a href="mailto:vikas@navya.com"><small>Vikas Aggarwal</small></a> 
+    </address>
+    </td>
+    <td></td>
+  </tr>
+</table>
+</center></div>
+</body>
+</html>
diff -Naur nocol-4.3/html/faq.html nocol-4.3.1/html/faq.html
--- nocol-4.3/html/faq.html	Wed Jan 26 23:51:08 2000
+++ nocol-4.3.1/html/faq.html	Mon Mar 27 22:06:07 2000
@@ -1,159 +1,192 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<!-- $Header: /home/vikas/src/nocol/html/RCS/faq.html,v 1.4 2000/01/27 04:50:25 vikas Exp $ -->
-<html>
-
-<head>
-<title>nocol FAQ</title>
-</head>
-
-<body bgcolor="#ffffff">
-
-<h1>NOCOL : Frequently Asked Questions (FAQ)</h1>
-
-<p align="left"><img src="blue_line.gif" width="552" height="6"></p>
-<i>
-
-<p align="center">Last updated Jan 2000</i> </p>
-
-<ol>
-  <li>General <ul>
-      <li><a href="#gen0">What is nocol ?</a></li>
-      <li><a href="#gen1">How does nocol differ from MRTG ?</a></li>
-      <li><a href="#gen2">Where do I get nocol ?</a></li>
-      <li><a href="#gen3">What about support ?</a></li>
-      <li><a href="#gen4">What is SNIPS ?</a></li>
-      <li><a href="#gen5">Is nocol Y2K compliant ?</a></li>
-    </ul>
-  </li>
-  <li>Installation <ul>
-      <li><a href="#R0">What are the hardware requirements ?</a></li>
-      <li><a href="#R1">Should I run nocol as root ?</a></li>
-      <li><a href="#R2">I am getting lots of messages from <i>keepalive_monitors</i> about
-        restarting daemons</a></li>
-      <li><a href="#R3"><b>multiping</b> gives errors</a></li>
-      <li><a href="#R4">Nothing is being logged to <b>noclogd</b></a></li>
-      <li><a href="R5">Why do all the events clear when I <i>kill -HUP</i> a monitor?</a></li>
-    </ul>
-  </li>
-  <li>Miscellaneous <ul>
-      <li><a href="#misc1">Can nocol handle SNMPv2 ?</a></li>
-      <li><a href="#misc2">How can I page myself when a site goes down ?</a></li>
-      <li><a href="#misc3">How do I get notified when a site comes back up ?</a></li>
-      <li><a href="#misc4">How do I get paged as soon as a site goes down?</a></li>
-      <li><a href="#misc95">Does nocol run on Windows NT ?</a></li>
-      <li><a href="#misc99">Who developed NOCOL ?</a></li>
-    </ul>
-  </li>
-</ol>
-
-<hr>
-<!-- ___________________________________________ -->
-
-<dl>
-  <dt><b><a name="gen0">What is NOCOL ?</a></b></dt>
-  <dd><font color="#AA0000">nocol</font> (Network Operation Center On-Line) is a collection of
-    system and network monitoring agents which have a common viewing interface and logging
-    mechanism. It can be used to monitor your LAN or WAN network devices as well as your Unix
-    systems and services. </dd>
-  <dt><b><a name="gen1">How does NOCOL differ from MRTG ?</a></b></dt>
-  <dd>MRTG is primarily a graphing tool. Nocol is a <em>monitoring</em> package which detects
-    outages or errors on your devices. All data is quantized in nocol and you lose granularity
-    (which might or might not be preferable). <p>The two packages are complements of each
-    other. </p>
-  </dd>
-  <dt><a name="gen2"><b>Where do I get NOCOL</b> </a></dt>
-  <dd>The distribution site is at <a
-    href="http://www.netplex-tech.com/software/nocol/downloads">www.netplex-tech.com</a>. It
-    can also be downloaded via ftp from <a href="ftp://ftp.navya.com/pub/">ftp.navya.com</a> </dd>
-  <dt><a name="gen3"><b>What about support ?</b></a></dt>
-  <dd>NOCOL is freeware, and hence no official support is available. However, it is a popular
-    product and you can send messages to the <b>nocol-users@navya.com</b> mailing list or
-    search the Web using any popular Internet search engine (Altavista, Excite) for your
-    queries. <br>
-    You can also email queries to <a href="mailto:dev@navya.com">nocol-support@navya.com</a>
-    which might grow larger in some distant future.</dd>
-  <dt><a name="gen4"><b>What is SNIPS ?</b> </a></dt>
-  <dd>NOCOL was originally developed in 1991 and released as freeware. Since then, the
-    software has almost completely been rewritten and except for the old curses based 'netmon'
-    interface, not much else remains the same. <br>
-    <font color="#AA00AA"><b>SNIPS</b></font> (System and Network Integrated Polling Software)
-    is the next version of nocol (after v4.3) with many new features such as distributed
-    monitoring for scalability, data graphing, parallel SNMP queries, SNMPv2, MRTG interface
-    for data collection and much more. <p>SNIPS will be announced on the <i>nocol-users</i>
-    mailing list when it is available. </p>
-  </dd>
-  <dt><a name="gen5"><b>Is nocol Y2K compliant ?</b> </a></dt>
-  <dd>Yes. All events are logged to <i>noclogd</i> in the Unix timestamp format, so the
-    timestamps are not effected by the Y2K problem. </dd>
-  <hr width="50%" align="center">
-<!-- ####### -->
-  <dt><a name="R0"><b>What are the hardware requirements ?</b> </a></dt>
-  <dd>Nocol can run very comfortably on any Pentium-100 class Unix machine with 64MB of RAM
-    and monitor several hundred devices. It is very lightweight in design and implementation. </dd>
-  <dt><a name="R1"><b>Should I run nocol as root ?</b></a></dt>
-  <dd><em>NO!</em> You should create a separate user such as 'nocol' or 'snips' and all
-    monitors should be run by this user. The few monitors which require root priveleges (such
-    as pingmon or trapmon) are installed as suid root in the nocol bin/ directory. </dd>
-  <dt><a name="R2"><b>I am getting lots of messages from <tt>keepalive_monitors</tt> about
-    restarting</b></a></dt>
-  <dd>Either your system's <tt>ps</tt> command is not listing the complete program name and
-    keepalive_monitors is trying to restart the program since it thinks its down or else the
-    monitor being restarted cannot write the pid or data file and is dying (incorrect owner
-    and permissions on the <tt>nocol/run</tt> directory). <p>If the monitor is not running,
-    then try running it in debug mode (most monitors will take the <tt>-d</tt> option for
-    running in debug mode). </p>
-  </dd>
-  <dt><a name="R3"><b>multiping gives error <i>socket: Operation not permitted</i></b></a></dt>
-  <dd><b>multiping</b> requires a raw socket, and needs to be installed suid root. You
-    probably did not run <tt>make root</tt> while installing nocol. Check the ownership and
-    permission of this program- it <em>must</em> show mode <tt>-rwsr-x--x</tt> with owner
-    root. If not, do the following: <pre>		chown root multiping
-		chmod 4751 multiping
-	</pre>
-  </dd>
-  <dt><a name="R4"><b>Nothing is being logged to <i>noclogd</i></b></a></dt>
-  <dd>Events are logged ONLY when their state <em>changes</em>. Thus, an event will be logged
-    to noclogd if a site goes from info level to warning level, etc. </dd>
-  <dt>&nbsp;</dt>
-  <dt><a name="R5"><b>Why do all the events clear when I send kill -HUP to a monitor?</b></a></dt>
-  <dd>Currently the monitors restart when getting a HUP signal and do not preserve the
-    existing state information. The feature of just re-reading the config file on getting a
-    HUP signal is yet to be added.</dd>
-  <hr width="50%" align="center">
-<!-- ####### -->
-  <dt><a name="misc1"><b>Can nocol handle SNMPv2 ?</b></a></dt>
-  <dd>NOCOL currently uses the CMU SNMP software which does not implement SNMPv2. This will be
-    implemented in the next version (snips). </dd>
-  <dt><a name="misc2"><b>How can I page myself when a site goes down? </b></a></dt>
-  <dd>Assuming that you have an alphanumeric pager and can page yourself using email or any
-    other perl script, you can page yourself on a particular event by using noclogd and piping
-    the events to a simple script such as <tt>utility/beep_oncall</tt>. <br>
-    In addition to <b>noclogd</b>, you can also run <tt>utility/notifier.pl</tt> to page you.<br>
-    Paging software such as <a href="http://www.qpage.org">qpage</a> can be used to do the
-    actual paging. </dd>
-  <dt><a name="misc3"><b>How do I get notified when a site comes back up ? </b></a></dt>
-  <dd>All monitors in nocol log events to <tt>noclogd</tt> based on the worst of new severity
-    or previous severity of an event. <p>Hence, when a site goes down first, it will be logged
-    at 'warning' level. If it comes back up, it will be marked as up but will be logged at a
-    loglevel of 'warning' since that was the old severity. This mechanism allows you to not
-    only detect when a device goes critical, but also detect when the device comes back up. </p>
-  </dd>
-  <dt><a name="misc4"><b>How do I get paged as soon as a site goes down ?</b></a></dt>
-  <dd>In order to avoid false alarms (and prevent operators from getting into the habit of
-    wait-and-it-will-go-away), NOCOL will escalate any events severity gradually. If you want
-    to get paged or notified as soon as a site or variable changes, you can watch it at the <i>Warning</i>
-    level instead of the <i>Critical</i> level. </dd>
-  <dt><a name="misc95"><b>Does nocol run on windows NT ?</b></a></dt>
-  <dd>Nope. No plans to port it at this time either. </dd>
-  <dt><a name="misc99"><b>Who maintains NOCOL ?</b></a></dt>
-  <dd>This software is currently maintained by <a HREF="mailto:vikas@navya.com">Vikas
-    Aggarwal.</a> Numerous authors have made contributions which have been added to the
-    package. </dd>
-</dl>
-
-<p align="center"><img src="blue_line.gif" alt width="552" height="6"></p>
-
-<p align="center"><a HREF="mailto:vikas@navya.com"><img SRC="images/feedback.jpg"
-BORDER="0" ALT="Feedback" width="76" height="14"></a></p>
-</body>
-</html>
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<!-- $Header: /home/vikas/src/nocol/html/RCS/faq.html,v 1.5 2000/03/28 03:06:05 vikas Exp $ -->
+<html>
+
+<head>
+<title>nocol FAQ</title>
+</head>
+
+<body bgcolor="#ffffff">
+
+<h1>NOCOL : Frequently Asked Questions (FAQ)</h1>
+
+<p align="left"><img src="blue_line.gif" width="552" height="6"></p>
+
+
+<p align="center"><i>Last updated Mar 2000</i></p>
+
+<ol>
+  <li>General <ul>
+      <li><a href="#gen0">What is nocol ?</a></li>
+      <li><a href="#gen1">How does nocol differ from MRTG ?</a></li>
+      <li><a href="#gen2">Where do I get nocol ?</a></li>
+      <li><a href="#gen3">What about support ?</a></li>
+      <li><a href="#gen4">What is SNIPS ?</a></li>
+      <li><a href="#gen5">Is nocol Y2K compliant ?</a></li>
+    </ul>
+  </li>
+  <li>Installation <ul>
+      <li><a href="#R0">What are the hardware requirements ?</a></li>
+      <li><a href="#R1">Should I run nocol as root ?</a></li>
+      <li><a href="#R2">I am getting lots of messages from <i>keepalive_monitors</i> about
+        restarting daemons</a></li>
+      <li><a href="#R3"><b>multiping</b> gives errors</a></li>
+      <li><a href="#R4">Nothing is being logged to <b>noclogd</b></a></li>
+      <li><a href="#R5">Why do all the events clear when I <i>kill -HUP</i> a monitor?</a></li>
+    </ul>
+  </li>
+  <li>Miscellaneous <ul>
+      <li><a href="#misc1">Can nocol handle SNMPv2 ?</a></li>
+      <li><a href="#misc2">How can I page myself when a site goes down ?</a></li>
+      <li><a href="#misc3">How do I get notified when a site comes back up ?</a></li>
+      <li><a href="#misc4">How do I get paged as soon as a site goes down?</a></li>
+      <li><a href="#misc5">Can I set up host or variable dependencies in NOCOL?</a></li>
+      <li><a href="#misc6">Where can I find an SSL web monitor?</a></li>
+      <li><a href="#misc95">Does nocol run on Windows NT ?</a></li>
+      <li><a href="#misc99">Who maintains NOCOL ?</a></li>
+    </ul>
+  </li>
+</ol>
+
+<hr>
+
+<h3>GENERAL</h3>
+
+<dl>
+  <dt><b><a name="gen0">What is NOCOL ?</a></b></dt>
+  <dd><font color="#AA0000">nocol</font> (Network Operation Center On-Line) is a collection of
+    system and network monitoring agents which have a common viewing interface and logging
+    mechanism. It can be used to monitor your LAN or WAN network devices as well as your Unix
+    systems and services.</dd>
+  <dt>&nbsp;</dt>
+  <dt><b><a name="gen1">How does NOCOL differ from MRTG ?</a></b></dt>
+  <dd>MRTG is primarily a graphing tool. Nocol is a <em>monitoring</em> package which detects
+    outages or errors on your devices. All data is quantized in nocol and you lose granularity
+    (which might or might not be preferable). <br>
+    The two packages are complements of each other. </dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="gen2"><b>Where do I get NOCOL ?</b> </a></dt>
+  <dd>The distribution site is at <a
+    href="http://www.netplex-tech.com/software/nocol/downloads">www.netplex-tech.com</a>. It
+    can also be downloaded via ftp from <a href="ftp://ftp.navya.com/pub/">ftp.navya.com</a> </dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="gen3"><b>What about support ?</b></a></dt>
+  <dd>NOCOL is freeware, and hence no official support is available. However, it is a popular
+    product and you can send messages to the <b>nocol-users@navya.com</b> mailing list or
+    search the Web using any popular Internet search engine (Altavista, Excite) for your
+    queries. <br>
+    You can also email queries to <a href="mailto:dev@navya.com">nocol-support@navya.com</a>
+    which might grow larger in some distant future.</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="gen4"><b>What is SNIPS ?</b> </a></dt>
+  <dd>NOCOL was originally developed in 1991 and released as freeware. Since then, the
+    software has almost completely been rewritten and except for the old curses based 'netmon'
+    interface, not much else remains the same. <br>
+    <font color="#AA00AA"><b>SNIPS</b></font> (System and Network Integrated Polling Software)
+    is the next version of nocol (after v4.3) with many new features such as distributed
+    monitoring for scalability, data graphing, parallel SNMP queries, SNMPv2, MRTG interface
+    for data collection and much more. <p>SNIPS will be announced on the <i>nocol-users</i>
+    mailing list when it is available. </p>
+  </dd>
+  <dt><a name="gen5"><b>Is nocol Y2K compliant ?</b> </a></dt>
+  <dd>Yes. All events are logged to <i>noclogd</i> in the Unix timestamp format, so the
+    timestamps are not affected by the Y2K problem. </dd>
+</dl>
+
+<hr width="50%" align="center">
+
+<h3>INSTALLATION</h3>
+
+<dl>
+  <dt><a name="R0"><b>What are the hardware requirements ?</b> </a></dt>
+  <dd>Nocol can run very comfortably on any Pentium-100 class Unix machine with 64MB of RAM
+    and monitor several hundred devices. It is very lightweight in design and implementation.</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="R1"><b>Should I run nocol as root ?</b></a></dt>
+  <dd><em>NO!</em> You should create a separate user such as 'nocol' or 'snips' and all
+    monitors should be run by this user. The few monitors which require root privileges (such
+    as pingmon or trapmon) are installed as suid root in the nocol bin/ directory.</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="R2"><b>I am getting lots of messages from <tt>keepalive_monitors</tt> about
+    restarting</b></a></dt>
+  <dd>Either your system's <tt>ps</tt> command is not listing the complete program name and
+    keepalive_monitors is trying to restart the program since it thinks it's down, or else the
+    monitor being restarted cannot write the pid or data file and is dying (incorrect owner
+    and permissions on the <tt>nocol/run</tt> directory). <p>If the monitor is not running,
+    then try running it in debug mode (most monitors will take the <tt>-d</tt> option for
+    running in debug mode). </p>
+  </dd>
+  <dt><a name="R3"><b>multiping gives error <i>socket: Operation not permitted</i></b></a></dt>
+  <dd><b>multiping</b> requires a raw socket, and needs to be installed suid root. You
+    probably did not run <tt>make root</tt> while installing nocol. Check the ownership and
+    permission of this program- it <em>must</em> show mode <tt>-rwsr-x--x</tt> with owner
+    root. If not, do the following: <pre>		chown root multiping
+
+		chmod 4751 multiping
+
+	</pre>
+  </dd>
+  <dt><a name="R4"><b>Nothing is being logged to <i>noclogd</i></b></a></dt>
+  <dd>Events are logged ONLY when their state <em>changes</em>. Thus, an event will be logged
+    to noclogd if a site goes from info level to warning level, etc. </dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="R5"><b>Why do all the events clear when I send kill -HUP to a monitor?</b></a></dt>
+  <dd>Currently the monitors restart when getting a HUP signal and do not preserve the
+    existing state information. The feature of just re-reading the config file on getting a
+    HUP signal is yet to be added.</dd>
+</dl>
+
+<hr width="50%" align="center">
+
+<h3>MISC</h3>
+
+<dl>
+  <dt><a name="misc1"><b>Can nocol handle SNMPv2 ?</b></a></dt>
+  <dd>NOCOL currently uses the CMU SNMP software which does not implement SNMPv2. This will be
+    implemented in the next version (snips).</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="misc2"><b>How can I page myself when a site goes down? </b></a></dt>
+  <dd>Assuming that you have an alphanumeric pager and can page yourself using email or a
+    perl script, you can page yourself on a particular event by using noclogd and piping
+    the events to a simple script such as <tt>utility/beep_oncall</tt>. <br>
+    In addition to <b>noclogd</b>, you can also run <tt>utility/notifier.pl</tt> to page you.<br>
+    Paging software such as <a href="http://www.qpage.org">qpage</a> can be used to do the
+    actual paging.</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="misc3"><b>How do I get notified when a site comes back up ? </b></a></dt>
+  <dd>All monitors in nocol log events to <tt>noclogd</tt> based on the worse of the new
+    and previous severities of an event. <p>Hence, when a site first goes down, it will be logged
+    at 'warning' level. If it comes back up, it will be marked as up but will be logged at a
+    loglevel of 'warning' since that was the old severity. This mechanism allows you to not
+    only detect when a device goes critical, but also detect when the device comes back up. </p>
+  </dd>
+  <dt><a name="misc4"><b>How do I get paged as soon as a site goes down ?</b></a></dt>
+  <dd>In order to avoid false alarms (and prevent operators from getting into the habit of
+    wait-and-it-will-go-away), NOCOL will escalate any event's severity gradually. If you want
+    to get paged or notified as soon as a site or variable changes, you can watch it at the <i>Warning</i>
+    level instead of the <i>Critical</i> level.</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="misc5"><b>Can I set up host or variable dependencies in NOCOL?</b></a></dt>
+  <dd>The various displays do not handle dependencies at this time and will require code
+    enhancements. It is possible to write a dependency based monitor and tie it into noclogd
+    easily, but this has not been developed yet.</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="misc6"><b>Where can I find an SSL web monitor?</b></a></dt>
+  <dd>This monitor is simple to write but requires linking against an SSL library which might
+    be subject to US export regulations and hence is not available.</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="misc95"><b>Does nocol run on Windows NT ?</b></a></dt>
+  <dd>Nope. No plans to port it at this time either.</dd>
+  <dt>&nbsp;</dt>
+  <dt><a name="misc99"><b>Who maintains NOCOL ?</b></a></dt>
+  <dd>This software is currently maintained by <a HREF="mailto:vikas@navya.com">Vikas
+    Aggarwal</a>. Numerous authors have made contributions which have been added to the
+    package. </dd>
+</dl>
+
+<p align="center"><img src="blue_line.gif" alt="" width="552" height="6"></p>
+
+<p align="center"><a HREF="mailto:vikas@navya.com"><img SRC="images/feedback.jpg"
+BORDER="0" ALT="Feedback" width="76" height="14"></a></p>
+</body>
+</html>
diff -Naur nocol-4.3/html/index.html nocol-4.3.1/html/index.html
--- nocol-4.3/html/index.html	Thu Jan 27 00:10:06 2000
+++ nocol-4.3.1/html/index.html	Mon Mar 27 22:07:13 2000
@@ -1,104 +1,106 @@
-<!-- $Header: /home/vikas/src/nocol/html/RCS/index.html,v 1.4 2000/01/27 05:09:57 vikas Exp $ -->
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-
-<head>
-<meta name="description" content="NOCOL Network Monitoring Home Page">
-<meta name="keywords"
-content="nocol, network operation center on-line,
-     NoCOL, SNIPS, snips, network monitoring, system monitoring, netconsole,
-     multiping, cisco, snmp, monitor, network, snmp, nocol">
-<title>NOCOL SNIPS Network Monitoring &amp; Management Home Page</title>
-</head>
-
-<body bgcolor="#FFFFFF">
-
-<table WIDTH="100%" CELLPADDING="0" CELLSPACING="0" BORDER="0">
-  <tr>
-    <td width="10%" bgcolor="#eeeeee">&nbsp; </td>
-    <td width="1%" bgcolor="#c0c0c0">&nbsp; </td>
-    <td width="1%" bgcolor="#d8d8d8">&nbsp; </td>
-    <td width="5"></td>
-    <td align="left"><img src="images/nocol1.jpg"
-    ALT="&lt;h1&gt;NOCOL Network Monitoring Software&lt;/h1&gt;" width="300" height="127">&nbsp;<p>&nbsp;</td>
-  </tr>
-  <tr>
-    <td bgcolor="#eeeeee">&nbsp; </td>
-    <td bgcolor="#c0c0c0">&nbsp; </td>
-    <td bgcolor="#d8d8d8">&nbsp; </td>
-    <td></td>
-    <td><p align="center"><font color="#a00000" size="4"><b>Current Version 4.3</b> </font></p>
-    <p>NOCOL/SNIPS is a <b>system and network monitoring software </b>that runs on Unix
-    systems and can poll network and system devices. It is capable of monitoring nameservers,
-    web ports, host performance, syslogs, radius servers, BGP peers, etc. New monitors can be
-    added easily (via a C or Perl API). </p>
-    <p>All monitors have a common display and postprocessing interface (logging, notification,
-    etc.) This design allows running only one set of monitors and any number of displays each
-    seeing the same consistent data. </p>
-    <p>False alarms are avoided by escalating events through severity levels- hence if a site
-    is unreachable, the site will be tested 2 more times before finally indicating that it is
-    'critical'. All events are logged, and the operator has the capability to decide which
-    level to view the events at. </p>
-    <p><b>Available monitors are:</b></p>
-    <blockquote>
-      <table border="0" cellspacing="0" cellpadding="1" width="70%" bordercolor="#C0C0C0">
-        <tr>
-          <td>ICMP ping</td>
-          <td>RPC portmapper</td>
-          <td>OSI ping</td>
-        </tr>
-        <tr>
-          <td>Ethernet load</td>
-          <td>TCP ports</td>
-          <td>Nameserver</td>
-        </tr>
-        <tr>
-          <td>Radius server</td>
-          <td>Syslog messages</td>
-          <td>Mailq</td>
-        </tr>
-        <tr>
-          <td>NTP</td>
-          <td>UPS (APC) battery</td>
-          <td>Unix host perf</td>
-        </tr>
-        <tr>
-          <td>BGP peers</td>
-          <td>SNMP variables</td>
-          <td>Data throughput</td>
-        </tr>
-      </table>
-    </blockquote>
-    <p>Click <a href="sample1/index.html">here</a> to see a sample of the web interface. </p>
-    <p>The software is available from <a href="http://www.netplex-tech.com/software/nocol/">http://www.netplex-tech.com/software/nocol</a>
-    or via ftp from <a href="ftp://ftp.navya.com/pub/">ftp.navya.com. </a></p>
-    <p><b>nocol-users@navya.com</b> is a mailing list for general discussion of nocol. Click <a
-    href="mailto:nocol-users-request@navya.com">here </a>to subscribe to this mailing list
-    (send <tt>subscribe</tt> in the BODY of your email). </p>
-    <p>Send bug reports to <a href="mailto:nocol-bugs@navya.com">nocol-bugs@navya.com</a></p>
-    <hr width="50%" noshade align="center">
-    <p><i>The latest versions of these documents can be found <a
-    href="http://www.netplex-tech.com/software/nocol">online</a></i><ul>
-      <li><a href="downloads/">Download</a> </li>
-      <li><a HREF="install.txt">Installation</a> </li>
-      <li><a HREF="opsguide.html">Operations Guide</a> </li>
-      <li><a HREF="design.html">Design &amp; Internals</a> </li>
-      <li><a href="release.html">Release Notes (Change Log)</a><i> </i></li>
-      <li><a HREF="downloads/archives">Mailing List Archives</a> </li>
-      <li><a HREF="bugs.html">Known Bugs &amp; Limitations</a></li>
-      <li><a href="copyright.html">Copyright</a> </li>
-      <li><a HREF="faq.html">Frequently Asked Questions</a> (FAQ) <font color="#AA00FF"><i>PLEASE
-        READ THESE</i></font></li>
-    </ul>
-    <hr>
-    <p align="center"><a HREF="mailto:vikas@navya.com"><img SRC="images/feedback.jpg"
-    BORDER="0" ALT="Feedback" width="76" height="14"></a></p>
-    <p><!-- FONT SIZE=2>Copyright &copy; 1994-1998
-      <A HREF="mailto:vikas@navya.com">Vikas Aggarwal</A> <BR>
-    </FONT --> </p>
-    <a HREF="http://www.netplex-tech.com"><p><img ALIGN="right" SRC="images/logo.jpg"
-    ALT="Netplex Technologies Inc." BORDER="0" width="288" height="50"> </a></td>
-  </tr>
-</table>
-</body>
-</html>
+<!-- $Header: /home/vikas/src/nocol/html/RCS/index.html,v 1.5 2000/03/28 03:07:08 vikas Exp $ -->
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+
+<head>
+<meta name="description" content="NOCOL Network Monitoring Home Page">
+<meta name="keywords"
+content="nocol, network operation center on-line,
+     NoCOL, SNIPS, snips, network monitoring, system monitoring, netconsole,
+     multiping, portmon, cisco, snmp, monitor, network, snmp, nocol">
+<title>NOCOL SNIPS Network Monitoring &amp; Management Home Page</title>
+</head>
+
+<body bgcolor="#FFFFFF">
+
+<table WIDTH="100%" CELLPADDING="0" CELLSPACING="0" BORDER="0">
+  <tr>
+    <td width="10%" bgcolor="#eeeeee">&nbsp; </td>
+    <td width="1%" bgcolor="#666699">&nbsp; </td>
+    <td width="1%" bgcolor="#d8d8d8">&nbsp; </td>
+    <td width="5"></td>
+    <td align="left"><img src="images/nocol1.jpg"
+    ALT="&lt;h1&gt;NOCOL Network Monitoring Software&lt;/h1&gt;" width="300" height="127">&nbsp;<p>&nbsp;</td>
+  </tr>
+  <tr>
+    <td bgcolor="#eeeeee">&nbsp; </td>
+    <td bgcolor="#666699">&nbsp; </td>
+    <td bgcolor="#d8d8d8">&nbsp; </td>
+    <td></td>
+    <td><p align="center"><font color="#a00000" size="4"><b>Current Version 4.3.1</b> </font></p>
+    <p>NOCOL/SNIPS is a <b>system and network monitoring package </b>that runs on Unix
+    systems and can poll network and system devices. It is capable of monitoring nameservers,
+    web ports, host performance, syslogs, radius servers, BGP peers, etc. New monitors can be
+    added easily (via a C or Perl API). </p>
+    <p>All monitors have a common display and postprocessing interface (logging, notification,
+    etc.). This design allows running only one set of monitors and any number of displays, each
+    seeing the same consistent data. </p>
+    <p>False alarms are avoided by escalating events through severity levels- hence if a site
+    is unreachable, the site will be tested 2 more times before finally indicating that it is
+    'critical'. All events are logged, and the operator has the capability to decide which
+    level to view the events at. </p>
+    <p><b>Available monitors are:</b></p>
+    <blockquote>
+      <table border="0" cellspacing="0" cellpadding="1" width="70%" bordercolor="#C0C0C0">
+        <tr>
+          <td>ICMP ping</td>
+          <td>RPC portmapper</td>
+          <td>OSI ping</td>
+        </tr>
+        <tr>
+          <td>Ethernet load</td>
+          <td>TCP ports</td>
+          <td>Nameserver</td>
+        </tr>
+        <tr>
+          <td>Radius server</td>
+          <td>Syslog messages</td>
+          <td>Mailq</td>
+        </tr>
+        <tr>
+          <td>NTP</td>
+          <td>UPS (APC) battery</td>
+          <td>Unix host perf</td>
+        </tr>
+        <tr>
+          <td>BGP peers</td>
+          <td>SNMP variables</td>
+          <td>Data throughput</td>
+        </tr>
+      </table>
+    </blockquote>
+    <p>Click <a href="sample1/index.html">here</a> to see a sample of the web interface. </p>
+    <p>The software is available from <a href="http://www.netplex-tech.com/software/nocol/">http://www.netplex-tech.com/software/nocol</a>
+    or via ftp from <a href="ftp://ftp.navya.com/pub/">ftp.navya.com</a>. </p>
+    <p><b>nocol-users@navya.com</b> is a mailing list for general discussion of nocol. Click <a
+    href="mailto:nocol-users-request@navya.com">here </a>to subscribe to this mailing list
+    (send <tt>subscribe</tt> in the BODY of your email). </p>
+    <p>Send bug reports to <a href="mailto:nocol-bugs@navya.com">nocol-bugs@navya.com</a></p>
+    <hr width="50%" noshade align="center">
+    <p><i>The latest versions of these documents can be found <a
+    href="http://www.netplex-tech.com/software/nocol">online</a></i><ul>
+      <li><a href="downloads/">Download</a> </li>
+      <li><a HREF="install.txt">Installation</a> </li>
+      <li><a HREF="opsguide.html">Operations Guide</a> </li>
+      <li><a HREF="design.html">Design &amp; Internals</a> </li>
+      <li><a href="release.html">Release Notes (Change Log)</a><i> </i></li>
+      <li><a HREF="mail-archives">Mailing List Archives</a> </li>
+      <li><a HREF="bugs.html">Known Bugs &amp; Limitations</a></li>
+      <li><a href="copyright.html">Copyright</a> </li>
+      <li><a HREF="faq.html">Frequently Asked Questions</a> (FAQ) <font color="#AA00FF"><i>PLEASE
+        READ THESE</i></font></li>
+    </ul>
+    <hr>
+    <p align="center"><a HREF="mailto:vikas@navya.com"><img SRC="images/feedback.jpg"
+    BORDER="0" ALT="Feedback" width="76" height="14"></a></p>
+    <p><!-- FONT SIZE=2>Copyright &copy; 1994-1998
+
+      <A HREF="mailto:vikas@navya.com">Vikas Aggarwal</A> <BR>
+
+    </FONT --> </p>
+    <a HREF="http://www.netplex-tech.com"><p><img ALIGN="right" SRC="images/logo.jpg"
+    ALT="Netplex Technologies Inc." BORDER="0" width="288" height="50"> </a></td>
+  </tr>
+</table>
+</body>
+</html>
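
The escalation described in the index.html text above can be sketched in Perl as follows. This is only an illustration, not code from NOCOL: the level names beyond 'info', 'warning' and 'critical', the helper next_level(), and the rule that a successful test drops straight back to 'info' are all assumptions.

```perl
use strict;
use warnings;

# Severity ladder, least to most severe. 'error' is an assumed name for
# the step between 'warning' and 'critical'.
my @LEVELS = ('info', 'warning', 'error', 'critical');

# Hypothetical helper: given the current level index and the result of one
# test, return the new level index.
sub next_level {
    my ($level, $test_ok) = @_;
    return 0 if $test_ok;                                # assumed: recovery resets to info
    return $level < $#LEVELS ? $level + 1 : $#LEVELS;    # one step up per failed test
}

# Three failed tests in a row, then a success.
my $level = 0;
my @history;
for my $ok (0, 0, 0, 1) {
    $level = next_level($level, $ok);
    push @history, $LEVELS[$level];
}
print join(' -> ', @history), "\n";
```

With this ladder a site is reported as 'critical' only after the first failure has been followed by two more failed tests, which matches the behaviour the page describes.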
diff -Naur nocol-4.3/html/opsguide.html nocol-4.3.1/html/opsguide.html
--- nocol-4.3/html/opsguide.html	Wed Jan 26 23:44:52 2000
+++ nocol-4.3.1/html/opsguide.html	Mon Mar 27 22:09:01 2000
@@ -1,214 +1,221 @@
-<html>
-
-<head>
-<title>NOCOL Operations Guide</title>
-<meta name="GENERATOR" content="Microsoft FrontPage 3.0">
-</head>
-
-<body bgcolor="#ffffff">
-
-<table border="0" width="90%">
-  <tr>
-    <td width="10%"></td>
-    <td width="5%"></td>
-    <td width="75%"></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td align="center" colspan="2"><font size="5" face="Arial,Helvetica">NOCOL Operations
-    Guide</font><p><font size="3" face="Arial,Helvetica">Version 4.3</font><br>
-    Last Updated Jan 22, 2000</td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td colspan="2"></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td colspan="2"><strong><font size="4">Contents</font></strong></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td></td>
-    <td><a href="#runningNocol"><font size="2" face="Arial,Helvetica"><strong>Running NOCOL</strong></font></a><ul>
-      <li><font size="2" face="Arial,Helvetica">File locations</font></li>
-      <li><font size="2" face="Arial,Helvetica">Selecting the monitors</font></li>
-      <li><font size="2" face="Arial,Helvetica">Configuration files</font></li>
-      <li><font size="2" face="Arial,Helvetica">noclogd</font></li>
-      <li><font size="2" face="Arial,Helvetica">Routine Maintenance</font></li>
-    </ul>
-    <p><a href="#userInterfaces"><strong><font size="2" face="Arial,Helvetica">User Interfaces</font></strong></a><ul>
-      <li><font size="2" face="Arial,Helvetica">netconsole</font></li>
-      <li><font size="2" face="Arial,Helvetica">webNocol</font></li>
-      <li><font size="2" face="Arial,Helvetica">tkNocol</font></li>
-    </ul>
-    <p><a href="#notifications"><font size="2" face="Arial,Helvetica"><strong>Notifications
-    &amp; Reports</strong></font></a><ul>
-      <li><font size="2" face="Arial,Helvetica">SMS Paging</font></li>
-      <li><font size="2" face="Arial,Helvetica">Email</font></li>
-      <li><font size="2" face="Arial,Helvetica">Reports</font></li>
-    </ul>
-    <p style="background-color: rgb(255,255,0)" align="center"><font size="2"
-    face="Arial,Helvetica">You must read the <a href="install.txt">Installation</a> document
-    prior to reading this Operations guide.</font></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td colspan="2"><hr noshade color="#808080">
-    </td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td colspan="2" bgcolor="#C0C0C0"><font size="4"><a name="runningNocol"><strong>Running
-    NOCOL</strong></a></font></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td></td>
-    <td><h4><font face="Arial,Helvetica">File Locations</font></h4>
-    <p><font size="2" face="Arial,Helvetica">The main directory where nocol gets installed is
-    specified at compile time (default is set to /usr/local/nocol). Under this directory, the
-    following sub-directories exist:</font></p>
-    <blockquote>
-      <table border="0" width="90%" style="font-family: Arial,Helvetica">
-        <tr>
-          <td width="15%" height="20"><small>bin/</small></td>
-          <td width="69%" height="20"><small>All monitors and utility scripts are in this directory.</small></td>
-        </tr>
-        <tr>
-          <td width="15%" height="21"><small>data/</small></td>
-          <td width="69%" height="21"><small>The raw data collected by the monitors</small></td>
-        </tr>
-        <tr>
-          <td width="15%" height="21"><small>etc/</small></td>
-          <td width="69%" height="21"><small>All configuration files, and the snmp MIB file.</small></td>
-        </tr>
-        <tr>
-          <td width="15%" height="21"><small>msgs/</small></td>
-          <td width="69%" height="21"><small>All files in this directory are displayed in the
-          'netconsole'&nbsp; msgs subwindow.</small></td>
-        </tr>
-        <tr>
-          <td width="15%" height="40"><small>run/</small></td>
-          <td width="69%" height="40"><small>The PID files for all the monitors (used to ensure only
-          one copy of a monitor runs at a time), and the error </small></td>
-        </tr>
-      </table>
-    </blockquote>
-    <h4><font face="Arial,Helvetica">Running the Monitors</font></h4>
-    <p><font size="2" face="Arial,Helvetica">Nocol has a large number of independent monitors-
-    all desired monitors should be listed in the <b>keepalive_monitors</b> script (the
-    variable PROGRAMS). This script is run periodically from crontab and ensures that all the
-    desired monitors are running (the <strong>crontab.nocol</strong> file is installed into
-    cron during the installation steps).</font></p>
-    <p><font size="2" face="Arial,Helvetica">Generally the monitors do not need any command
-    line argument- the name and location of the configuration file and the data directory is
-    compiled into the monitors. However, you can always specify an alternate config file or
-    output data file using the '-c' or the '-o' command line options respectively. All
-    monitors also accept the '-d' flag to indicate debug mode, in which case they write
-    verbose error messages to the stderr. If started from keepalive_monitors, these error
-    messages are stored in the <strong>run/xxxx.error</strong>&nbsp; file.</font></p>
-    <h4><font face="Arial,Helvetica">Configuration Files</font></h4>
-    <p><font size="2" face="Arial,Helvetica">The configuration file for each monitor is
-    located in the etc/ directory. Each of these files should be edited for your site. Note
-    that in most monitors, the 'name' of the device is not used by the monitor, but is
-    basically a operator friendly name for the device.</font></p>
-    <p><font size="2" face="Arial,Helvetica">Currently, sending a HUP signal to the monitors
-    does NOT cause them to re-read the configuration file and preserve the existing state of
-    the variables being monitored.</font></p>
-    <h4><font face="Arial,Helvetica">noclogd - the Logging Daemon</font></h4>
-    <p><font size="2" face="Arial,Helvetica">The noclogd daemon listens on port 5354 of the
-    logging host for any events sent by the monitors. The name of the host where noclogd runs
-    is compiled into all the monitors and is not configurable in their config files at this
-    time.</font></p>
-    <p><font size="2" face="Arial,Helvetica">The noclogd process is similar to the Unix
-    'syslog' daemon and the configuration file allows piping the logged events to any external
-    process. To prevent any random host from sending it any messages, the list of allowed IP
-    addresses (which can log to it) is listed in the noclogd configuration file.</font></p>
-    <p><font size="2" face="Arial,Helvetica">Since this process can run external programs, it
-    is used to run the pager notification scripts, etc. This program can be used to log
-    messages to a database, send emails, etc.</font></p>
-    <p><font size="2" face="Arial,Helvetica">It should be noted that an 'event' in nocol is
-    generated <em><strong>only</strong></em> when a value crosses a threshold in any polling
-    interval. Hence, normally you will not see any logging activity in noclogd, but when a
-    device variable changes its state, an event will be logged.</font></p>
-    <h4><font face="Arial,Helvetica">Routine Maintenance</font></h4>
-    <p><font size="2" face="Arial,Helvetica">Routine admin tasks in nocol consist of ensuring
-    that all the monitors are running (done by running <em>keepalive_monitors</em> from cron),
-    &nbsp;and rotating all the log files maintained by noclogd (done by running <em>log-maint</em>
-    periodically from crontab). See the sample </font><font size="2"
-    face="Lucida Console,Courier">nocol.crontab</font><font size="2" face="Arial,Helvetica">
-    for achieving these tasks.</font></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td colspan="2"><hr noshade color="#808080">
-    </td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td colspan="2" bgcolor="#C0C0C0"><strong><font size="4"><a name="userInterfaces">User
-    Interfaces</a></font></strong></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td></td>
-    <td><h4><font face="Arial,Helvetica">Netconsole</font></h4>
-    <p><font size="2" face="Arial,Helvetica">There are three different user interfaces to view
-    the nocol data. The simplest of them all is <strong>netconsole</strong>,&nbsp; which is a
-    non-graphical, curses based tool for displaying the raw data being collected by the
-    monitors. Any user on the system where the monitors are running can run this tool.</font></p>
-    <h4><font face="Arial,Helvetica">WebNocol</font></h4>
-    <p><font size="2" face="Arial,Helvetica">The Web interface for displaying nocol data is
-    divided into two scripts- <strong>genweb.pl</strong> which runs periodically from crontab
-    and generates 4 web pages (one for each severity level). The other program is a CGI script
-    <strong>webnocol.cgi</strong>, which gives added functionality to the user such as
-    troubleshooting, adding notes for an event, hiding a known event, etc. This script has its
-    own built in access control based on the user, but as an alternative the typical .htaccess
-    method can easily be used.</font></p>
-    <h4><font face="Arial,Helvetica">tkNocol</font></h4>
-    <p><font size="2" face="Arial,Helvetica">This is a Tcl/tk based monitor using
-    client-server technology. A simple daemon (called 'ndaemon') runs on the nocol machine
-    listening on TCP port 5005 and all it does is periodically send the nocol raw data to all
-    connected clients. The client displays then parse and format/display this nocol raw data.
-    ndaemon has no access control at this time, so it is important to put a firewall to
-    restrict unauthorized access to ndaemon's TCP port.</font></p>
-    <p><font size="2" face="Arial,Helvetica">Note that none of these interfaces displays
-    historical data from 'noclogd'- they all work directly on the data being collected by the
-    monitors which represents the current state of the network.</font></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td colspan="2"><hr noshade color="#808080">
-    </td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td colspan="2" bgcolor="#C0C0C0"><strong><font size="4"><a name="notifications">Notifications
-    &amp; Reports</a></font></strong></td>
-  </tr>
-  <tr>
-    <td>&nbsp; </td>
-    <td></td>
-    <td><font size="2" face="Arial,Helvetica">A very flexible notification script called&nbsp;
-    '<strong>notifier.pl</strong>'&nbsp; is provided with nocol which has a configuration file
-    describing the type of event and required action. Currently the possible actions are&nbsp;
-    <em>mail</em> and <em>page</em>.</font><p><font size="2" face="Arial,Helvetica">A minimum
-    and maximum age of the event can be defined indicating that the action should be taken
-    (paging or email) only if the age of the event lies between these two values (in seconds).
-    An option exists to allow 'repeat' notification (once every hour) until the age is
-    exceeded.</font></p>
-    <p><font size="2" face="Arial,Helvetica">Currently the only reporting tool for historical
-    analysis is 'logstats' which parses the historical noclogd event logs and generates a
-    simple summary report. This is run by the 'log-maint' script which in turn is run
-    periodically from crontab.</font></td>
-  </tr>
-</table>
-
-<hr>
-
-<address>
-  <a href="mailto:vikas@navya.com"><small>Vikas Aggarwal</small></a> 
-</address>
-</body>
-</html>
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<!-- $Header: /home/vikas/src/nocol/html/RCS/opsguide.html,v 1.2 2000/03/28 03:08:51 vikas Exp $ -->
+<html>
+
+<head>
+<title>NOCOL Operations Guide</title>
+<meta name="GENERATOR" content="Microsoft FrontPage 3.0">
+</head>
+
+<body bgcolor="#ffffff">
+
+<table border="0" width="90%">
+  <tr>
+    <td width="10%"></td>
+    <td width="5%"></td>
+    <td width="75%"></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td align="center" colspan="2"><font size="5" face="Arial,Helvetica">NOCOL Operations
+    Guide</font><p><font size="3" face="Arial,Helvetica">Version 4.3</font><br>
+    Last Updated: Mar 22, 2000</td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td colspan="2"></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td colspan="2"><strong><font size="4">Contents</font></strong></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td></td>
+    <td><a href="#runningNocol"><font size="2" face="Arial,Helvetica"><strong>Running NOCOL</strong></font></a><ul>
+      <li><font size="2" face="Arial,Helvetica">File locations</font></li>
+      <li><font size="2" face="Arial,Helvetica">Selecting the monitors</font></li>
+      <li><font size="2" face="Arial,Helvetica">Configuration files</font></li>
+      <li><font size="2" face="Arial,Helvetica">noclogd</font></li>
+      <li><font size="2" face="Arial,Helvetica">Routine Maintenance</font></li>
+    </ul>
+    <p><a href="#userInterfaces"><strong><font size="2" face="Arial,Helvetica">User Interfaces</font></strong></a><ul>
+      <li><font size="2" face="Arial,Helvetica">netconsole</font></li>
+      <li><font size="2" face="Arial,Helvetica">webNocol</font></li>
+      <li><font size="2" face="Arial,Helvetica">tkNocol</font></li>
+    </ul>
+    <p><a href="#notifications"><font size="2" face="Arial,Helvetica"><strong>Notifications
+    &amp; Reports</strong></font></a><ul>
+      <li><font size="2" face="Arial,Helvetica">SMS Paging</font></li>
+      <li><font size="2" face="Arial,Helvetica">Email</font></li>
+      <li><font size="2" face="Arial,Helvetica">Reports</font></li>
+    </ul>
+    <p style="background-color: rgb(255,255,0)" align="center"><font size="2"
+    face="Arial,Helvetica">You must read the <a href="install.txt">Installation</a> document
+    prior to reading this Operations guide.</font></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td colspan="2"><hr noshade color="#808080">
+    </td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td colspan="2" bgcolor="#C0C0C0"><font size="4"><a name="runningNocol"><strong>Running
+    NOCOL</strong></a></font></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td></td>
+    <td><h4><font face="Arial,Helvetica">File Locations</font></h4>
+    <p><font size="2" face="Arial,Helvetica">The main directory where nocol gets installed is
+    specified at compile time (default is set to /usr/local/nocol). Under this directory, the
+    following sub-directories exist:</font></p>
+    <blockquote>
+      <table border="0" width="90%" style="font-family: Arial,Helvetica">
+        <tr>
+          <td width="15%" height="20"><small>bin/</small></td>
+          <td width="69%" height="20"><small>All monitors and utility scripts are in this directory.</small></td>
+        </tr>
+        <tr>
+          <td width="15%" height="21"><small>data/</small></td>
+          <td width="69%" height="21"><small>The raw data collected by the monitors</small></td>
+        </tr>
+        <tr>
+          <td width="15%" height="21"><small>etc/</small></td>
+          <td width="69%" height="21"><small>All configuration files, and the snmp MIB file.</small></td>
+        </tr>
+        <tr>
+          <td width="15%" height="21"><small>msgs/</small></td>
+          <td width="69%" height="21"><small>All files in this directory are displayed in the
+          'netconsole'&nbsp; msgs subwindow.</small></td>
+        </tr>
+        <tr>
+          <td width="15%" height="40"><small>run/</small></td>
+          <td width="69%" height="40"><small>The PID files for all the monitors (used to ensure only
+          one copy of a monitor runs at a time), and the error files (run/xxxx.error).</small></td>
+        </tr>
+      </table>
+    </blockquote>
+    <h4><font face="Arial,Helvetica">Running the Monitors</font></h4>
+    <p><font size="2" face="Arial,Helvetica">Nocol has a large number of independent monitors.
+    All desired monitors should be listed in the <b>keepalive_monitors</b> script (the
+    variable PROGRAMS). This script is run periodically from crontab and ensures that all the
+    desired monitors are running (the <strong>crontab.nocol</strong> file is installed into
+    cron during the installation steps).</font></p>
+    <p><font size="2" face="Arial,Helvetica">Generally the monitors do not need any command
+    line arguments: the name and location of the configuration file and the data directory are
+    compiled into the monitors. However, you can always specify an alternate config file or
+    output data file using the '-c' or the '-o' command line options respectively. All
+    monitors also accept the '-d' flag to indicate debug mode, in which case they write
+    verbose error messages to stderr. If started from keepalive_monitors, these error
+    messages are stored in the <strong>run/xxxx.error</strong>&nbsp; file.</font></p>
+    <h4><font face="Arial,Helvetica">Configuration Files</font></h4>
+    <p><font size="2" face="Arial,Helvetica">The configuration file for each monitor is
+    located in the etc/ directory. Each of these files should be edited for your site. Note
+    that in most monitors, the 'name' of the device is not used by the monitor, but is
+    simply an operator-friendly name for the device.</font></p>
+    <p><font size="2" face="Arial,Helvetica">Currently, sending a HUP signal to the monitors
+    does NOT cause them to re-read the configuration file and preserve the existing state of
+    the variables being monitored.</font></p>
+    <h4><font face="Arial,Helvetica">noclogd - the Logging Daemon</font></h4>
+    <p><font size="2" face="Arial,Helvetica">The noclogd daemon listens on port 5354 of the
+    logging host for any events sent by the monitors. The name of the host where noclogd runs
+    is compiled into all the monitors and is not configurable in their config files at this
+    time.</font></p>
+    <p><font size="2" face="Arial,Helvetica">The noclogd process is similar to the Unix
+    'syslog' daemon and the configuration file allows piping the logged events to any external
+    process. To prevent any random host from sending it any messages, the list of allowed IP
+    addresses (which can log to it) is listed in the noclogd configuration file.</font></p>
+    <p><font size="2" face="Arial,Helvetica">Since this process can run external programs, it
+    is used to run the pager notification scripts, etc. This program can be used to log
+    messages to a database, send emails, etc.</font></p>
+    <p><font size="2" face="Arial,Helvetica">It should be noted that an 'event' in nocol is
+    generated <em><strong>only</strong></em> when a value crosses a threshold in any polling
+    interval. Hence, normally you will not see any logging activity in noclogd, but when a
+    device variable changes its state, an event will be logged. This means that a monitor
+    sends an event to noclogd both when a device goes down (e.g. from info level to warning
+    level) and when it comes back up (e.g. warning level to info level).</font></p>
+    <h4><font face="Arial,Helvetica">Routine Maintenance</font></h4>
+    <p><font size="2" face="Arial,Helvetica">Routine admin tasks in nocol consist of ensuring
+    that all the monitors are running (done by running <em>keepalive_monitors</em> from cron),
+    &nbsp;and rotating all the log files maintained by noclogd (done by running <em>log-maint</em>
+    periodically from crontab). See the sample </font><font size="2"
+    face="Lucida Console,Courier">nocol.crontab</font><font size="2" face="Arial,Helvetica">
+    for achieving these tasks.</font></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td colspan="2"><hr noshade color="#808080">
+    </td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td colspan="2" bgcolor="#C0C0C0"><strong><font size="4"><a name="userInterfaces">User
+    Interfaces</a></font></strong></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td></td>
+    <td><h4><font face="Arial,Helvetica">Netconsole</font></h4>
+    <p><font size="2" face="Arial,Helvetica">There are three different user interfaces to view
+    the nocol data. The simplest of them all is <strong>netconsole</strong>,&nbsp; which is a
+    non-graphical, curses based tool for displaying the raw data being collected by the
+    monitors. Any user on the system where the monitors are running can run this tool.</font></p>
+    <h4><font face="Arial,Helvetica">WebNocol</font></h4>
+    <p><font size="2" face="Arial,Helvetica">The Web interface for displaying nocol data is
+    divided into two scripts- <strong>genweb.pl</strong> which runs periodically from crontab
+    and generates 4 web pages (one for each severity level). The other program is a CGI script
+    <strong>webnocol.cgi</strong>, which gives added functionality to the user such as
+    troubleshooting, adding notes for an event, hiding a known event, etc. This script has its
+    own built-in access control based on the user, but as an alternative the typical .htaccess
+    method can easily be used.</font></p>
+    <h4><font face="Arial,Helvetica">tkNocol</font></h4>
+    <p><font size="2" face="Arial,Helvetica">This is a Tcl/tk based monitor using
+    client-server technology. A simple daemon (called 'ndaemon') runs on the nocol machine
+    listening on TCP port 5005 and all it does is periodically send the nocol raw data to all
+    connected clients. The client displays then parse and format this nocol raw data.
+    ndaemon has no access control at this time, so it is important to put a firewall in place to
+    restrict unauthorized access to ndaemon's TCP port.</font></p>
+    <p><font size="2" face="Arial,Helvetica">Note that none of these interfaces displays
+    historical data from 'noclogd'- they all work directly on the data being collected by the
+    monitors which represents the current state of the network.</font></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td colspan="2"><hr noshade color="#808080">
+    </td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td colspan="2" bgcolor="#C0C0C0"><strong><font size="4"><a name="notifications">Notifications
+    &amp; Reports</a></font></strong></td>
+  </tr>
+  <tr>
+    <td>&nbsp; </td>
+    <td></td>
+    <td><font size="2" face="Arial,Helvetica">A very flexible notification script called&nbsp;
+    '<strong>notifier.pl</strong>'&nbsp; is provided with nocol which has a configuration file
+    describing the type of event and required action. Currently the possible actions are&nbsp;
+    <em>mail</em> and <em>page</em>. A minimum and maximum age of the event can be defined
+    indicating that the action should be taken (paging or email) only if the age of the event
+    lies between these two values (in seconds). An option exists to allow 'repeat'
+    notification (once every hour) until the age is exceeded.</font><p><font size="2"
+    face="Arial,Helvetica">A more 'event' driven notification system can be written by using <em>noclogd</em>.
+    Any event can be piped to an external script by noclogd, so a page or email can be sent as
+    soon as an event occurs and is logged to noclogd. As an example, look at the '<strong>utility/beep_oncall</strong>'
+    script.</font></p>
+    <p><font size="2" face="Arial,Helvetica">Currently the only reporting tool for historical
+    analysis is '<strong>logstats</strong>' which parses the historical noclogd event logs and
+    generates a simple summary report. This is run by the '<strong>log-maint</strong>' script
+    which in turn is run periodically from crontab.</font></td>
+  </tr>
+</table>
+
+<hr>
+
+<address>
+  <a href="mailto:vikas@navya.com"><small>Vikas Aggarwal</small></a> 
+</address>
+</body>
+</html>
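
The opsguide text above says that a monitor sends an event to noclogd only when a variable changes level, once on the way down and once on the way back up. A minimal Perl sketch of that rule follows. It does not talk to noclogd and is not taken from the NOCOL sources: the helper event_for(), the device key 'gw1', the message format, and the assumption that every device starts at 'info' are all hypothetical.

```perl
use strict;
use warnings;

my %last_level;    # device key => level seen at the previous poll

# Hypothetical helper: return an event string when the level changed since
# the previous poll, or undef when nothing changed.
sub event_for {
    my ($key, $level) = @_;
    my $prev = exists $last_level{$key} ? $last_level{$key} : 'info';   # assumed start level
    $last_level{$key} = $level;
    return if $prev eq $level;        # same level as before: no event
    return "$key: $prev -> $level";
}

# Five polling intervals for one device.
my @events;
for my $level (qw(info info warning warning info)) {
    my $event = event_for('gw1', $level);
    push @events, $event if defined $event;
}
print "$_\n" for @events;
```

Five polls produce only two events here, one for each transition. That is why the log stays quiet while the network is stable.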
diff -Naur nocol-4.3/html/release.html nocol-4.3.1/html/release.html
--- nocol-4.3/html/release.html	Wed Jan 26 23:51:25 2000
+++ nocol-4.3.1/html/release.html	Mon Mar 27 22:31:48 2000
@@ -1,5 +1,5 @@
 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<!-- $Header: /home/vikas/src/nocol/html/RCS/release.html,v 1.3 2000/01/27 04:50:25 vikas Exp $ -->
+<!-- $Header: /home/vikas/src/nocol/html/RCS/release.html,v 1.4 2000/03/28 03:31:41 vikas Exp $ -->
 <html>
 
 <head>
@@ -49,6 +49,45 @@
 
 <h3><font face="Arial,Helvetica">Release Notes</font></h3>
 
+<h4><u>nocol v4.3.1 <i>(Mar 2000)</i></u></h4>
+
+<p>Minor bug-fix release.</p>
+
+<blockquote>
+  <table border="1" width="90%" bordercolor="#C0C0C0" cellspacing="0" cellpadding="3">
+    <tr>
+      <td width="5%"><font face="Arial,Helvetica" size="2">1.</font></td>
+      <td width="18%"><font face="Arial,Helvetica" size="2">portmon.c</font></td>
+      <td><font face="Arial,Helvetica" size="2">» Added a missing close(), which had left too many file
+      descriptors open<br>
+      » Now runs check_resp() after receiving EOF from the remote host. This should fix the problem
+      of receiving data with no \n in the entire data stream.</font></td>
+    </tr>
+    <tr>
+      <td width="5%"><font face="Arial,Helvetica" size="2">2.</font></td>
+      <td width="18%"><font face="Arial,Helvetica" size="2">snmpgeneric</font></td>
+      <td><font face="Arial,Helvetica" size="2">» Allow specifying client port number (<a
+      href="mailto:joe@hole-in-the.net">joe@hole-in-the.net</a>)<br>
+      » Sets MIBFILE_v2 variable also for the mib file</font></td>
+    </tr>
+    <tr>
+      <td width="5%"><font face="Arial,Helvetica" size="2">3.</font></td>
+      <td width="18%"><font face="Arial,Helvetica" size="2">webnocol.cgi</font></td>
+      <td><font face="Arial,Helvetica" size="2">Small fix to prevent possible loop.</font></td>
+    </tr>
+    <tr>
+      <td width="5%"><font face="Arial,Helvetica" size="2">4.</font></td>
+      <td width="18%"><font face="Arial,Helvetica" size="2">nocollib.pl</font></td>
+      <td><font face="Arial,Helvetica" size="2">Changed 'ps' to '/bin/ps'</font></td>
+    </tr>
+    <tr>
+      <td width="5%"><font face="Arial,Helvetica" size="2">5.</font></td>
+      <td width="18%"><font face="Arial,Helvetica" size="2">hostmon-client</font></td>
+      <td><font face="Arial,Helvetica" size="2">Changed 'ps' to '/bin/ps'</font></td>
+    </tr>
+  </table>
+</blockquote>
+
 <h4><u>nocol v4.3 <i>(Jan 2000)</i></u></h4>
 
 <blockquote>
@@ -320,7 +359,9 @@
 <p align="center"><a HREF="mailto:vikas@navya.com"><img SRC="images/feedback.jpg"
 BORDER="0" ALT="Feedback" width="76" height="14"></a></p>
 <!-- FONT SIZE=2>Copyright &copy; 1994-1997
+
       <A HREF="mailto:vikas@navya.com">Vikas Aggarwal</A> <BR>
+
     </FONT -->
 </body>
 </html>
diff -Naur nocol-4.3/include/version.h nocol-4.3.1/include/version.h
--- nocol-4.3/include/version.h	Thu Jan 20 18:33:02 2000
+++ nocol-4.3.1/include/version.h	Mon Mar 27 23:33:25 2000
@@ -1,5 +1,5 @@
 /*
- * $Id: version.h,v 4.3 2000/01/20 23:32:53 vikas Exp $
+ * $Id: version.h,v 4.3 2000/03/28 04:33:05 vikas Exp $
  */
 
 /*
@@ -24,6 +24,6 @@
 
 #ifndef _NOCOLVERSION_
 # define _NOCOLVERSION_
-  static char nocol_version[] = "$Revision: 4.3 $" ;
+  static char nocol_version[] = "$Revision: 4.3 $ (4.3.1)" ;
 #endif
 
diff -Naur nocol-4.3/perlnocol/hostmon-osclients/hostmon-client nocol-4.3.1/perlnocol/hostmon-osclients/hostmon-client
--- nocol-4.3/perlnocol/hostmon-osclients/hostmon-client	Fri Nov  5 16:47:18 1999
+++ nocol-4.3.1/perlnocol/hostmon-osclients/hostmon-client	Mon Mar 27 01:51:48 2000
@@ -1,6 +1,6 @@
 #!/usr/local/bin/perl
 #
-# $Header: /home/vikas/src/nocol/perlnocol/hostmon-osclients/RCS/hostmon-client,v 2.5 1999/11/05 21:46:54 vikas Exp $
+# $Header: /home/vikas/src/nocol/perlnocol/hostmon-osclients/RCS/hostmon-client,v 2.6 2000/03/27 06:51:26 vikas Exp $
 #
 #	hostmon-client.main
 #
@@ -303,7 +303,7 @@
     $ostype= `uname -s -r -m` ; chop $ostype; # OS, revision, arch
     $osfile = "hostmon-client.";
     $debug && print STDERR "OSTYPE = $ostype\n";
-    $PSCMD = "ps";	# ps command to allow pid on cmd line. Autoset below
+    $PSCMD = "/bin/ps";	# ps command to allow pid on cmd line. Autoset below
 
 
     # set boolean values for OS's
@@ -330,9 +330,9 @@
     ## force units to 1024 blocks for df and vmstat for SVR4
     # $ENV{'BLOCKSIZE'} = "1024";
 
-    # if ($ostype =~ /solaris|irix/i) { $PSCMD = "/bin/ps -p"; }
+    #if ($ostype =~ /SunOS\s+5/) { $PSCMD = "/bin/ps -p"; }
     local ($status) = grep(/usage/i, `$PSCMD 1 2>&1`);
-    if ($status == 1) { $PSCMD = "ps -f -p" ;}
+    if ($status) { $PSCMD = "/bin/ps -f -p" ;}	# $status holds the first 'usage' line, if any
 }
 
 sub standalone {
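
The ps probe in the hunk above runs "$PSCMD 1" and looks for a usage message to decide whether a pid must be passed with -p. The sketch below shows the same decision as a pure function, and also shows how grep behaves differently under list assignment and in scalar context. The helper choose_pscmd() and both sample outputs are made up for illustration; real ps output varies by system.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the ps invocation from the text printed by
# "<ps> 1". A usage message is taken to mean a bare pid was rejected.
sub choose_pscmd {
    my ($base, @probe_output) = @_;
    my $usage_lines = grep { /usage/i } @probe_output;   # scalar context: number of matches
    return $usage_lines >= 1 ? "$base -p" : $base;
}

# Made-up sample outputs, not captured from a real system.
my @sysv = ("usage: ps [ -aAdeflcj ] [ -p proclist ]\n");
my @bsd  = ("  PID TT  STAT      TIME COMMAND\n",
            "    1 ??  Ss     0:00.10 /sbin/init\n");

# List assignment yields the first matching line, not a count.
my ($first) = grep { /usage/i } @sysv;

print choose_pscmd('/bin/ps', @sysv), "\n";
print choose_pscmd('/bin/ps', @bsd), "\n";
```

Because a parenthesised assignment receives the matching line itself, comparing that value numerically with 1 does not test whether a match was found. Counting in scalar context, or testing the value for truth, does.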
diff -Naur nocol-4.3/perlnocol/nocollib.pl nocol-4.3.1/perlnocol/nocollib.pl
--- nocol-4.3/perlnocol/nocollib.pl	Fri Nov  5 16:59:34 1999
+++ nocol-4.3.1/perlnocol/nocollib.pl	Mon Mar 27 22:28:54 2000
@@ -1,6 +1,6 @@
 #!/usr/local/bin/perl
 #
-# $Header: /home/vikas/src/nocol/perlnocol/RCS/nocollib.pl,v 1.15 1999/11/05 21:59:13 vikas Exp $
+# $Header: /home/vikas/src/nocol/perlnocol/RCS/nocollib.pl,v 1.16 2000/03/28 03:28:43 vikas Exp $
 #
 # 	nocollib.pl - perl library of NOCOL routines
 #
@@ -316,10 +316,10 @@
     local($me,$dir)=@_;
     local($localhost)=`hostname`; chop ($localhost);
     local($pid,$host);
-    local($PSCMD) = "ps";	# cmd to see process by giving a pid
+    local($PSCMD) = "/bin/ps";	# cmd to see process by giving a pid
 
     local ($status) = scalar grep(/usage/i, `$PSCMD 1 2>&1`);
-    if ($status >= 1) { $PSCMD = "ps -p" ;}	# hope no other flags needed
+    if ($status >= 1) { $PSCMD = "/bin/ps -p" ;}	# hope no other flags needed
     
     $pidfile= "$dir/$me.pid";	# cannot be local()
 
diff -Naur nocol-4.3/perlnocol/snmpgeneric nocol-4.3.1/perlnocol/snmpgeneric
--- nocol-4.3/perlnocol/snmpgeneric	Mon Nov  8 23:21:45 1999
+++ nocol-4.3.1/perlnocol/snmpgeneric	Tue Mar 21 16:29:23 2000
@@ -1,6 +1,6 @@
 #!/usr/bin/perl 
 ##
-# $Header: /home/vikas/src/nocol/perlnocol/RCS/snmpgeneric,v 1.3 1999/11/09 04:21:23 vikas Exp $
+# $Header: /home/vikas/src/nocol/perlnocol/RCS/snmpgeneric,v 1.5 2000/03/21 21:28:57 vikas Exp $
 #
 #  snmpgeneric - perl monitor for generic SNMP variables.
 # 	Directly monitors SNMP variables from the hosts listed.
@@ -47,6 +47,7 @@
 #$mibfile = "$etcdir/mibII.txt" ;	# location of MIB file SET_THIS
 $mibfile = "$etcdir/mib-v2.txt" ;	# location of MIB file SET_THIS
 $ENV{"MIBFILE"}= $mibfile ;
+$ENV{"MIBFILE_v2"}= $mibfile ;
 
 $numtries = 2;   # number of times to try and connect before failing
 
@@ -129,7 +130,9 @@
   local ($acount, $isok) = (0, 0);
   
   # $host is the text hostname   $router is the IP
-  
+  # Check if a port number is attached to $router:$port
+  ($router, $port) = split(/:/, $router, 2);
+
   if ($debug)
   {
     print "Checking $router\n";
@@ -150,8 +153,14 @@
     
     $myoid=$oid{$item};		# get the OID ready
     $myoid =~ s/^\+(.*)$/\1/;	# remove the leading +
-    
-    $cmd = "$snmpwalk $router $community{$item} $myoid";
+
+    if ($port) {
+      $cmd = "$snmpwalk -p $port $router $community{$item} $myoid";
+    }
+    else {
+      $cmd = "$snmpwalk $router $community{$item} $myoid";
+    }      
+
     open (WALK, "$cmd |") || die "Could not run \"$cmd\"\n";
     while (<WALK>)
     {
@@ -181,7 +190,12 @@
     $tries=$numtries;
     while ((! (($active =~ /INTEGER/)||($active =~ /Timeticks/)) ) && ($tries) )
     {
-      $cmd = "$snmpget $router $community{$item} $oid{$item}";
+      if ($port) {
+	$cmd = "$snmpget -p $port $router $community{$item} $oid{$item}";
+      }
+      else {
+	$cmd = "$snmpget $router $community{$item} $oid{$item}";
+      }
       print "cmd=$cmd\n" if $debug;
       $active = `$cmd`;
       print "$active" if $debug;
diff -Naur nocol-4.3/perlnocol/snmpgeneric-confg nocol-4.3.1/perlnocol/snmpgeneric-confg
--- nocol-4.3/perlnocol/snmpgeneric-confg	Tue Oct 26 00:34:03 1999
+++ nocol-4.3.1/perlnocol/snmpgeneric-confg	Tue Mar 21 16:24:16 2000
@@ -3,7 +3,7 @@
 #  Ed Landa  (elanda@comstar.net)  May 1999
 #
 ###
-# HOST IP SNMP-OID  VARNAME COMMUNITY WARN ERROR CRITICAL [COMP]
+# HOST IP[:PORT] SNMP-OID  VARNAME COMMUNITY WARN ERROR CRITICAL [COMP]
 #
 # If OID starts with a plus, then we will walk that OID and add up values that
 # are returned from the COMP evalution.
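
The optional IP[:PORT] field introduced above is split at the first colon in snmpgeneric. For readers following the C side of this patch, roughly the same split can be sketched in C. This is only an illustration under stated assumptions: split_host_port() is a hypothetical helper and is not part of NOCOL, and a missing or empty port is reported as 0, which mirrors the "if ($port)" test in the Perl code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of NOCOL.
 * Splits "host[:port]" in place at the first ':' and returns the
 * port number, or 0 when no port (or an empty port) was given.
 * On return, spec holds only the host part. */
static int split_host_port(char *spec)
{
    char *colon;

    assert(spec != NULL);
    colon = strchr(spec, ':');
    if (colon == NULL)
        return 0;               /* plain "host", use the default port */
    *colon = '\0';              /* terminate the host part */
    return atoi(colon + 1);     /* "" or non-numeric text gives 0 */
}
```

A caller would add the "-p port" argument only when the returned value is non-zero, as the snmpwalk and snmpget branches in the patch do.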
diff -Naur nocol-4.3/portmon/portmon.c nocol-4.3.1/portmon/portmon.c
--- nocol-4.3/portmon/portmon.c	Thu Jan 20 00:28:27 2000
+++ nocol-4.3.1/portmon/portmon.c	Thu Apr  6 15:32:26 2000
@@ -1,5 +1,5 @@
 /* #define DEBUG		/* */
-/* $Header: /home/vikas/src/nocol/portmon/RCS/portmon.c,v 2.3 2000/01/20 05:27:18 vikas Exp $  */
+/* $Header: /home/vikas/src/nocol/portmon/RCS/portmon.c,v 2.5 2000/04/06 19:31:44 vikas Exp $  */
 
 /* Copyright 1994 Vikas Aggarwal, vikas@navya.com */
 
@@ -11,7 +11,7 @@
  * response string.
  *
  * CAVEATS:
- *	1) Uses 'strstr' and not a real regular expression. Case insensitive.
+ *	1) Uses case insensitive 'strstr' and not a real regular expression.
  *	2) Looks only at the first buffer of the response unless using the
  *	   timeouts to calculate the response time.
  *	3) Does not implement milli-second timers while reading responses
@@ -23,6 +23,15 @@
 
 /*
  * $Log: portmon.c,v $
+ * Revision 2.5  2000/04/06 19:31:44  vikas
+ * Now replaces all '\0' in the read stream with a '\n'. Needed by
+ * a user who has a host which terminates lines with \0
+ *
+ * Revision 2.4  2000/04/04 04:46:22  vikas
+ * Added close() to prevent max file open problem.
+ * Better error logging.
+ * Now checking for any read buffer even if \n not found.
+ *
  * Revision 2.3  2000/01/20 05:27:18  vikas
  * Fixed error in processing sites where we are just testing connectivity.
  * Needed to call process_site() if the responselist was null.
@@ -68,7 +77,7 @@
 
 /*  */
 #ifndef lint
-static char rcsid[] = "$Id: portmon.c,v 2.3 2000/01/20 05:27:18 vikas Exp $" ;
+static char rcsid[] = "$Id: portmon.c,v 2.5 2000/04/06 19:31:44 vikas Exp $" ;
 #endif
 
 #include "portmon.h"
@@ -216,7 +225,7 @@
   FD_ZERO(&rset); FD_ZERO(&wset);
   nleft = nhosts;
 
-  if (debug > 1) fprintf(stderr, "testing %d sites simultaneously\n", nhosts);
+  if (debug > 1) fprintf(stderr, "Testing %d sites simultaneously\n", nhosts);
 
   /* issue a non-blocking connect to each host */
   for (i = 0; i < nhosts; ++i)
@@ -241,8 +250,8 @@
 	errno != EINPROGRESS)
     {		/* some error */
       if (debug)
-	fprintf(stderr, "connect() failed for %s- %s\n", (hv[i])->ipaddr,
-		sys_errlist[errno]);
+	fprintf(stderr, "connect() failed for %s:%d- %s\n",
+		hv[i]->ipaddr, hv[i]->port, sys_errlist[errno]);
       close(sockarray[i]);
       --nleft;
     }
@@ -275,7 +284,7 @@
 #ifdef DEBUG
     if (debug > 2)
       fprintf(stderr,
-	      "checkports() calling select(), timeout %ld, nleft= %d\n",
+	      "\ncheckports() calling select(), timeout %ld, nleft= %d\n",
 	      tval.tv_sec, nleft);
 #endif
     if ( (n = select(maxfd + 1, &rs, &ws, NULL, &tval)) <= 0)
@@ -320,9 +329,11 @@
 	    else  fprintf(stderr, "%s\n", sys_errlist[errno]);
 	  }
 	  FD_CLR(sockarray[i], &rset); FD_CLR(sockarray[i], &wset);
+	  close(sockarray[i]);	/* ensure this is closed */
 	  --nleft;
 	  if (debug > 2)
-	    fprintf(stderr, "DONE %d. %s\n", i, hv[i]->ipaddr);
+	    fprintf(stderr, "DONE #%d. %s:%d (%s)\n",
+		    i, hv[i]->hname, hv[i]->port, hv[i]->ipaddr);
 	}
 	else	/* ready for reading/writing and connected */
 	{
@@ -338,7 +349,8 @@
 	      close(sockarray[i]);	/* ensure this is closed */
 	      --nleft;
 	      if (debug > 2)
-		fprintf(stderr, "DONE %d. %s\n", i, hv[i]->ipaddr);
+		fprintf(stderr, "DONE #%d. %s:%d (%s)\n",
+			i, hv[i]->hname, hv[i]->port, hv[i]->ipaddr);
 	    }
 
 	  if (hv[i]->wptr == NULL || *(hv[i]->wptr) == '\0')
@@ -353,13 +365,11 @@
 
   /* here if timeout or all the connects have been processed */
 
-  if (nleft)
-  {
-    if (debug > 1)
-      fprintf(stderr, " %d sites unprocessed (no response)\n", nleft);
-    for (i = 0; i < nhosts; ++i)
-      close(sockarray[i]);
-  }
+  if (nleft && debug > 1)
+    fprintf(stderr, " %d sites unprocessed (no response)\n", nleft);
+
+  for (i = 0; i < nhosts; ++i)		/* close any open sockets */
+    close(sockarray[i]);
 
   return (0);
 
@@ -410,15 +420,15 @@
   {
 #ifdef DEBUG
     if (debug > 1)
-      fprintf(stderr, "  (debug) host %s:%d- sent %d bytes\n",
+      fprintf(stderr, "  (debug) Host %s:%d- sent %d bytes\n",
 	      h->hname, h->port, n);
 #endif
     return 0;
   }
 
   if (debug)
-    fprintf(stderr, "  (debug) %s: host %s:%d- Sent string '%s'\n",
-	    prognm, h->hname, h->port, h->writebuf) ;
+    fprintf(stderr, "  (debug) Host %s:%d- Sent string '%s'\n",
+	    h->hname, h->port, h->writebuf) ;
 
   return 1;
 
@@ -433,7 +443,7 @@
   int sock;		/* connected socket */
   struct _harray *h;
 {
-  int n;
+  int i, n;
   int sflags;
   int buflen, maxsev;
   register char *r, *s;
@@ -461,8 +471,13 @@
       return(1);	/* done testing */
     }
 
-  buflen = h->readbuf + sizeof(h->readbuf) - h->rptr;
+  /* now fill any remaining read buffer space we have */
+  buflen = h->readbuf + sizeof(h->readbuf) - h->rptr;	/* amount we can read*/
   n = read(sock, h->rptr, buflen - 1);
+#ifdef DEBUG  	/* */
+  if (debug > 2)
+    fprintf(stderr, "  read %d bytes from %s:%d\n", n, h->hname, h->port);
+#endif	/* */
   if (n < 0)
   {		/* read() error */
     if (errno == EWOULDBLOCK)
@@ -480,47 +495,54 @@
     h->status = 0;	/* mark as down */
     return(1);		/* finished testing */
   }	/* end if (n < 0)  */
-  else if (n == 0)	/* end of file */
+
+  /* if n==0, then we have read end of file, so do a final check_resp() */
+
+  /* replace any \0 in the stream with a \n */
+  for (i = 0; i < n ; ++i)
+    if ((h->rptr)[i] == '\0')
+      (h->rptr)[i] = '\n';
+
+  (h->rptr) += n;		/* increment pointer */
+  *(h->rptr) = '\0';
+
+  if (n > 0 && (r = (char *)strrchr(h->readbuf, '\n')) == NULL)  /* no \n */
   {
-    if (h->quitstr && *(h->quitstr))
-      write(sock, h->quitstr, strlen(h->quitstr));
-    close (sock);
-    h->testseverity = h->connseverity;
-    h->status = 0;	/* mark as down */
-    return 1;		/* finished testing */
+    if ( n < (buflen - 1) )	/* remaining empty buffer space */
+      return (0);		/* need to continue reading */
+    else			/* filled buffer, but no \n yet */
+      r = h->rptr - 32;		/* set to about 32 chars back from end */
   }
 
-#ifdef DEBUG  
-  if (debug > 2)
-    fprintf(stderr, "  read %d bytes from %s\n", n, h->hname);
-#endif
+  /* here if end-of-file or found a newline */
+  maxsev = check_resp(h->readbuf, h->responselist);
 
-  (h->rptr) += n;		/* increment pointer */
-  *(h->rptr) = '\0'; 
-  if ( (r = (char *)strrchr(h->readbuf, '\n')) == NULL &&  n < (buflen - 1) )
-    return (0);		/* need to continue reading */
-
-  if (r == NULL)	/* filled buffer, but no \n yet */
-    r = h->rptr - 20;	/* set to about 20 chars back from end */
-  if ( (maxsev = check_resp(h->readbuf, h->responselist)) == -1 )
+  if (maxsev == -1)
   {	/* no match in response list */
-    for (++r, s = h->readbuf; *r; )
-      *s++ = *r++;	/* shift stuff after \n to start of readbuf */
-    h->rptr = s;		/* point to next location to be read */
-    return 0;			/* still not done */
+    if (n == 0)		/* end of file, so we are done */
+      maxsev = h->connseverity;
+    else
+    {
+      for (++r, s = h->readbuf; *r; )
+	*s++ = *r++;	/* shift stuff after \n to start of readbuf */
+      *s = '\0';	       	/* let's be safe */
+      h->rptr = s;		/* point to next location to be read */
+      return 0;			/* still not done */
+    }
   }
-
-  /* Here if we found a match in check_resp() */
-  
-  if (h->timeouts[0] != 0)
-  {				/* we are checking port speed */
-    if (debug > 1)
-      fprintf(stderr,"  (debug) elapsed time= %ld secs\n", h->elapsedsecs);
-    if (h->elapsedsecs < h->timeouts[0])	maxsev = E_INFO;
-    else if (h->elapsedsecs < h->timeouts[1])	maxsev = E_WARNING;
-    else if (h->elapsedsecs < h->timeouts[2])	maxsev = E_ERROR;
-    else maxsev = E_CRITICAL;
+  else
+  {			/* Here if we found a match in check_resp() */
+    if (h->timeouts[0] != 0)
+    {				/* we are checking port speed */
+      if (debug > 1)
+	fprintf(stderr,"  (debug) elapsed time= %ld secs\n", h->elapsedsecs);
+      if (h->elapsedsecs < h->timeouts[0])	maxsev = E_INFO;
+      else if (h->elapsedsecs < h->timeouts[1])	maxsev = E_WARNING;
+      else if (h->elapsedsecs < h->timeouts[2])	maxsev = E_ERROR;
+      else maxsev = E_CRITICAL;
+    }
   }
+
   if (debug)
     fprintf(stderr," (debug) process_host(%s:%d): returning severity %d\n",
 	    h->hname, h->port, maxsev);
@@ -539,8 +561,7 @@
 
 /*+ 
  * FUNCTION:
- * 	Check the list of responses. Notice doing a strstr() which is NOT
- * case sensitive.
+ * 	Check the list of responses using Strcasestr()
  */
 check_resp(readstr, resarr)
   char *readstr;
@@ -549,7 +570,7 @@
   struct _response *r;
 
   if (debug > 1)
-    fprintf(stderr, " (debug) %s: Checking response '%s'\n", prognm,readstr);
+    fprintf(stderr, "  (debug) check_resp() '%s'\n", readstr);
 
   for (r = resarr; r ; r = r->next)
   {
@@ -559,14 +580,14 @@
     {
 #ifdef DEBUG
       if (debug > 1)
-	fprintf(stderr," (debug) check_resp(): Matched '%s'\n",	r->response);
+	fprintf(stderr,"   (debug) check_resp(): Matched '%s'\n", r->response);
 #endif
       return(r->severity);
     }
   }	/* for() */
 
   if (debug)
-    fprintf (stderr, " check_resp(): No response matched for site\n");
+    fprintf (stderr, "  check_resp(): No response matched for site\n");
 
   return(-1);		/* No response matched given list */
 
@@ -664,7 +685,7 @@
 readconfig(fdout)
   int fdout ;				/* output data filename */
 {
-  int mxsever ;
+  int mxsever, i;
   char *j1;				/* temp string pointers */
   FILE *cfd ;
   EVENT v;                            	/* Defined in NOCOL.H */
@@ -705,7 +726,7 @@
   while(fgetLine(cfd, record, MAXLINE - 3) > 0 ) 	/* keeps the \n */
   {
     static int skiphost;
-    int port ;
+    int port;
     int checkspeed = 0;
     int readquitstr = 0;
     struct sockaddr_in sin;			/* temporary */
@@ -925,12 +946,13 @@
   fclose (cfd);       		/* Not needed any more */
 
   if (debug > 1)
-    for (h = hostlist ; h; h = h->next)
+    for (h = hostlist, i=0 ; h; h = h->next)
     {
-      fprintf(stderr, "Host=%s %s :%d, Sendstr=%s, MaxSev= %d\n", h->hname,
-	      h->ipaddr, h->port, (h->writebuf ? h->writebuf : ""), h->connseverity);
+      fprintf(stderr, "#%d. Host=%s %s :%d, MaxSev= %d, Sendstr=%s",
+	      i++, h->hname, h->ipaddr, h->port,  h->connseverity,
+	      (h->writebuf ? h->writebuf : "\n"));
       for (r = h->responselist; r; r = r->next)
-	fprintf(stderr, "\t%s (%d)\n", r->response, r->severity);
+	fprintf(stderr, "\t%s (sev=%d)\n", r->response, r->severity);
     }
 
   return(1);                          /* All OK  */
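
The read-path changes to process_host() in the portmon.c hunks above do two things: NUL bytes in the data just read are turned into newlines, and when no response matches, the text after the last newline is moved to the front of the read buffer. The following is a minimal standalone sketch of those two steps, not the portmon code itself. The names nul_to_newline() and shift_tail() and the keep parameter are assumptions made for illustration. Unlike the patch, the sketch applies the "keep the last few characters" fallback whenever no newline is present, whereas portmon does so only once the buffer is full.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replace NUL bytes among the n bytes just read with '\n', so that
 * the str*() functions see the whole response as one C string. */
static void nul_to_newline(char *p, size_t n)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < n; ++i)
        if (p[i] == '\0')
            p[i] = '\n';
}

/* After a failed match: keep only the text following the last '\n'
 * (or, if there is none, the last `keep` characters), moved to the
 * start of buf. Returns the position where the next read() should
 * store data. buf must be NUL terminated. */
static char *shift_tail(char *buf, size_t keep)
{
    size_t len, tlen;
    char *tail;

    assert(buf != NULL);
    len = strlen(buf);
    tail = strrchr(buf, '\n');
    if (tail != NULL)
        ++tail;                       /* first char after the newline */
    else
        tail = (len > keep) ? buf + len - keep : buf;

    tlen = strlen(tail);
    memmove(buf, tail, tlen + 1);     /* regions may overlap */
    return buf + tlen;
}
```

Keeping the unterminated tail means a response string that straddles two read() calls can still be matched on the next pass.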
diff -Naur nocol-4.3/webnocol/genweb.pl nocol-4.3.1/webnocol/genweb.pl
--- nocol-4.3/webnocol/genweb.pl	Mon Nov  1 08:46:54 1999
+++ nocol-4.3.1/webnocol/genweb.pl	Mon Mar 27 23:37:13 2000
@@ -1,6 +1,6 @@
 #!/usr/local/bin/perl
 #
-# $Header: /home/vikas/src/nocol/webnocol/RCS/genweb.pl,v 1.10 1999/11/01 13:46:45 vikas Exp $ 
+# $Header: /home/vikas/src/nocol/webnocol/RCS/genweb.pl,v 1.11 2000/03/28 04:37:02 vikas Exp $ 
 #
 #			genweb.pl
 # ------------
@@ -69,7 +69,7 @@
 
 #########################################################################
 
-$VERSIONSTR = "4.2.2";		# version
+$VERSIONSTR = "4.3.1";		# version
 
 ## Customize $baseurl
 
@@ -180,7 +180,8 @@
   local ($ADMINMODE) = ($lvl eq "User") ? 0 : 1; # No href links for userPage
   $cnt{$lvl} = 1;	# serial number per view
 
-  open (OUTPUT, ">$webdir/${lvl}.html") or die "Unable to open output file $!";
+  open (OUTPUT, ">$webdir/${lvl}.html") or die 
+    "Unable to open output file ($webdir/${lvl}.html) $!";
   select OUTPUT;		# default for print statements
 
   &print_html_prologue($thispage, $lvl, $refresh);
@@ -330,7 +331,7 @@
   local ($thispage, $lvl, $refresh) = @_;
 
   local ($action) = $levels[$ilevels{$lvl}];
-  local ($id) = '$Id: genweb.pl,v 1.10 1999/11/01 13:46:45 vikas Exp $';#'
+  local ($id) = '$Id: genweb.pl,v 1.11 2000/03/28 04:37:02 vikas Exp $';#'
 
   $id =~ s/\$//g;	# cleanup
   print <<EOT;
diff -Naur nocol-4.3/webnocol/webnocol.cgi nocol-4.3.1/webnocol/webnocol.cgi
--- nocol-4.3/webnocol/webnocol.cgi	Thu Jan 27 00:21:22 2000
+++ nocol-4.3.1/webnocol/webnocol.cgi	Tue Mar 21 16:11:58 2000
@@ -1,6 +1,6 @@
 #!/usr/local/bin/perl
 #
-# $Header: /home/vikas/src/nocol/webnocol/RCS/webnocol.cgi,v 2.6 2000/01/27 05:19:56 vikas Exp $
+# $Header: /home/vikas/src/nocol/webnocol/RCS/webnocol.cgi,v 2.7 2000/03/21 21:11:07 vikas Exp $
 #
 #			webnocol.cgi
 #
@@ -171,6 +171,8 @@
 # 
 sub set_userlevel {
 
+  # quick and fast; however, the HTTP_REFERER needs to be trimmed
+  # before checking.
   if ($ENV{'HTTP_REFERER'} eq $processor  && 
       $FORM{'userlevel'} ne "" && $FORM{'user'} ne "")
   {
@@ -190,6 +192,7 @@
       {
 	local ($junk, $junk, $file_userlevel) = split /:/;
 	$userlevel = int($file_userlevel);
+	$FORM{'userlevel'} = $file_userlevel;
 	close(AUTH);
 	return;
       }
@@ -207,6 +210,7 @@
       local ($file_cookie, $file_user, $file_userlevel, $file_age) = split /:/;
       $FORM{'user'} = $file_user;
       $userlevel = int($file_userlevel);
+      $FORM{'userlevel'} = $file_userlevel;
       close(COOKIE);
       return;
     }
@@ -257,6 +261,7 @@
   }	# while(AUTH)
   close(AUTH);
 
+  $FORM{'userlevel'} = $userlevel;
   if (!$found) { &print_auth_form; }
 
   ## now generate a cookie and store in the cookie file
@@ -400,9 +405,9 @@
 
   if ($ldebug > 2 && $userlevel < 3) {
     print "<!-- debug output -->\n";
+    print "<p><b>Current userlevel = $userlevel</b> </p>";
     print "<p><h4>FORM Variables</h4>\n";
     for (keys %FORM) { print "<tt>$_ = $FORM{$_} </tt><br>" ;}
-    print "<tt>userlevel = $userlevel</tt> <br>";
     print "<p><hr>\n <h4>ENV Variables</h4>\n";
     for (keys %ENV) { print "<tt>$_ = $ENV{$_} </tt><br>" ;}
   }
@@ -421,7 +426,7 @@
 	 <input type=hidden name=variable value="$FORM{'variable'}">
 	 <input type=hidden name=sender value="$FORM{'sender'}">
 	 <input type=hidden name=user value="$FORM{'user'}">
-	 <input type=hidden name=userlevel value="$userlevel">
+	 <input type=hidden name=userlevel value="$FORM{'userlevel'}">
 	 <input type=hidden name=return value="$FORM{'return'}">
 	 <input type=hidden name=displaylevel value="$FORM{'displaylevel'}">
 EoState

-NOCOL-
-Design and Internals
-vikas@navya.com
-Jan 19, 2000
-
-Overview
-  ¤ Design principles
-  ¤ Architecture Monitors
-User Interfaces
-  ¤ Netconsole
-  ¤ WebNocol
-  ¤ tkNocol
-Reporting
-Future Work
-Overview
-NOCOL is a system and network monitoring software which runs on Unix platforms and
-monitors reachability, ports, routes, system resources, etc. It is modular in design, and
-allows adding new monitors easily and without impacting other portions of the software- in
-fact, a large number of the monitors are contributed by various NOCOL users.
-The basic design principles behind NOCOL are relatively few:
-  allow multiple users to see the same data being collected instead of requiring each user
-    to start their own set of monitors
-  multiple layers of severity to avoid any false alarms (to stop the NOC operator from
-    ignoring an alarm because it 'usually goes away')
-  incremental data storage (dont store every data sample- only store a datapoint when the
-    severity of an event changes).
-  be able to view the events from a non-graphical interface
-
-It might seem strange, but the initial versions of SunNet Manager, CiscoWorks, etc. had
-none of the above features when NOCOL was originally written. Most of the commercial
-packages required pretty extensive hardware to run, and it seemed like they sacrificed a
-lot in order to present a pretty graphical interface. Nocol in contrast could run on a
-very low end system, the monitors could be separated from the logging and reporting
-machine since they communicated over a network, and the datapoints collected were very
-small in number since they were only recorded when the severity of a device/variable
-changed. So, the disk space on a machine could vary from 10% to 60% full, and only one
-entry is logged since this would all be considered 'normal'. If the disk becomes 80% full,
-a 'warning' message is generated and another datapoint is logged. This simple
-approach gives amazingly small volumes of data, and yet presents a perfectly comprehensive
-(though quantized) report on the variables.
-The architecture of nocol itself is very simple- the monitors poll the devices and
-assign a threshold to each 'poll' (called an 'event'). These thresholds are user settable
-and vary from monitor to monitor (in fact, the rest of the software does not care what is
-being monitored and does not store any intelligence about the variables). All intelligence
-of the variable being monitored and what conditions are to normal and abnormal is built
-into the monitor.
-
-The monitor would then set:
-  the current value of the variable (thruput, lost packets)
-  the current threshold (there can be upto 3 thresholds for 3 different levels of
-    severity)
-  timestamp, etc.
-
-and invoke a nocol API function. This writes out the current values, etc. to a realtime
-data file on disk (this contains the current state of any device/variable) and if the
-severity has changed, then this also gets logged to a incremental logging daemon
-(noclogd).
-All user displays can then display the data from the realtime data directory, whereas
-the alarm and notification subsystem gets activated by the incremental 'noclogd'
-process. The noclogd process can filter events based on user defined criteria, and invoke
-an SMS pager, send email, perhaps even run some automated tests or open a trouble ticket.
-This simple architecture has proven to work very effectively in this application. The
-base system has not really changed since the software was initially written, but new
-monitors, displays, notification software is continually being added without any changes
-to the core system.
-
-The Monitors
-The monitors collect variable values and compares to see if it exceeds any of the 3
-thresholds (warning, error, critical- these thresholds are user configurable). This is all
-done using the nocol library functions, so in effect, all the monitor needs to do is get
-the value for the variable being monitored and read the thresholds from a config file. The
-nocol library ensures consistency in the way that the monitoring is processed by the rest
-of the system.
-Each monitor is unique in the way that it monitors its respective variable. The DNS
-monitor needs to make an authoritative DNS query to see if the dns server is configured
-properly, the Port monitor needs to connect to TCP or UDP ports to ensure that any
-processes are responding properly, and the SNMP monitor needs to monitor snmp variables
-using the SNMP protocol. The intelligence about the entity being monitored and how to
-monitor it lies strictly in the monitor- the rest of the nocol subsystem is just
-expecting a device name, variable and its value.
-A fair amount of effort has gone into making the monitors very efficient where possible
-in order to allow them to scale to a large number of devices. Connectionless (UDP)
-monitors are specially well suited to using the select() system call so that many devices
-can be queried at the same time and the monitor then waits for the responses to come in.
-The other option was to fork multiple processes with a single parent and each process
-monitors one device. However, the level of scalability that could be achieved with the
-first method proved to be far more than what could be achieved with the forking method.
-To emphasize the above, consider 'pinging' 100 devices with 5 packets each, waiting 1
-second for each response and 1 sec between each packet to the same host. If done serially,
-this would take at least 500 seconds for each pass. If we fork multiple processes to do it
-in parallel, this would take about 5 seconds, but we would have to fork a 100 processes.
-The 'multiping' monitor could send out 1 packet to each of the 100 devices in about
-10 seconds and then listen for the responses to come in- effectively taking about 15
-seconds for the entire pass.
-Building this level of 'multi-tasking' is a lot more difficult in the TCP based
-services since it would require non-blocking I/O, but it it important to do this for
-monitors such as 'portmon'. All of these type of monitors (using select()) are limited by
-the MAXFD value (maximum open file descriptors that can be handled by the select() call).
-The 'hostmon' monitor is an example of letting the remote hosts being monitored
-do the local data collection (i.e. distributing the 'time consuming' part to
-hostmon-clients). The 'hostmon' process running on the nocol host simply takes all these
-data files and uses them as raw input to the process.
-In some cases, the monitors do not need any data other than what is in the nocol data
-structure written to disk (the raw data), whereas in others they need to store ancillary,
-variable and device specific information in memory. All possible efforts are made to avoid
-storing unnecessary data in memory and having 'bloated' monitoring processes.
-
-User Interfaces
-The user interfaces need to display the current state of the devices being
-monitored, and this 'current' data is stored on disk (in the 'data' directory). This
-allows any number of users and monitors to view the same consistent data, and run only one
-set of monitors (unlike some other systems which need a separate monitor for each
-display).
-The other diversion from traditional network monitoring packages is the displaying of
-monitored data using text lines and not a map or other graphical interface. The reason
-this approach was taken is that in practical experience, a network diagram was always done
-in some 'drawing' tool and the map on the NMS was not updated regularly. Even today, most
-network/lan diagrams are maintained in a tool such as Visio, and the NMS graphical
-interface is always a 'second' copy. This and being able to view line based data from any
-terminal weighed very heavily in favor of a non-graphical user interface.
-Netconsole (curses)
-Netconsole is a simple Unix 'curses' based TTY interface. It reads the raw data from
-disk and formats it for displaying on the screen. It has limited intelligence, and its
-method of setting an alarm is when it sees a change in the number of 'down' items. This is
-the original user interface and was written to let engineers view the state of the
-systems/network over a low speed connection (over which X windows, etc. would not be
-feasible).
-WebNocol (Web)
-Contributed by Rick Beebe, this is a Web based frontend to the datafiles. It
-allows running CGI's and troubleshooting listed events and all the other benefits of HTTP.
-
-This web interface automatically refreshes periodically, plays an audio clip if a site
-changes its severity level, etc. A 'status' message can be displayed next to each event
-which is inserted by any valid operator. Users are assigned access levels which controls
-how much information they can view or edit.
-tkNocol (Tcl/Tcl)
-This is a client-server application contributed by Lydia Leong. The tkNocol application
-connects to the ndaemon process on the host system, and displays the nocol data
-in a X-window.
-
-This interface needs 'tixwish' on the system. Any number of clients can connect
-to the simple process (ndaemon) running on the nocol host which sends data to all
-the clients periodically. Currently there is no access control configured on the ndaemon
-host, so this should be protected by a firewall, but this interface can be extended to add
-these features in the daemon if needed.
-
-Reporting
-The monitors in NOCOL generate a 'historical' event (logged to the noclogd
-logging daemon) only when the severity of a variable changes (i.e. it goes from warning
-to error or from critical back to info. This is done to reduce
-the amount of historical data collected and restrict it only to 'relevant' datapoints.
-This quantized data storage allows a monitor to poll a device or variable as frequently as
-it likes (30 secs, 10 minutes), but it will generate a logging entry only if the variable
-crosses one of the thresholds.
-This approach of classifying the data into 'bins' reduces the quantity of
-historical data significantly. Even though some granularity is lost, statistical analysis
-can easily be done on the collected data by using the time interval that a variable
-remained in a particular level.
-'noclogd' is similar to the Unix 'syslog' process- it allows piping the log message to
-an external program or writing to a file based on the monitor name. This forms the
-basis of invoking SMS scripts to do paging, sending email, automated insertion into
-trouble ticketing systems, auto problem analysis, etc.- the possibilities are virtually
-unlimited.
-Currently this system writes to flat files, but the data can easily be piped to a
-process that writes into a database. Note that the 'current' data that the monitors write
-to disk (the raw data which is displayed by the user interfaces) is overwritten in every
-pass by the monitor. Hence the size of those files is fixed and does not grow over time.
-
-Future Work
-The package does not interface to any database, and would benefit greatly from storing
-the raw (monitored) and noclogd historical information in a database such as MySQL, etc.
-This would allow co-relating the various variables being monitored for any device (e.g.
-show the state of all variables being monitored for device lan-gw). Graphs and reports
-could be charted from the historical noclogd database.
-A Java based user interface along the same lines as tkNocol would allow running the
-display on any platform and one could build a lot of graphing, reporting functionality
-into the gui itself.
-The GUI could also support collapsing all the variables for an event in one line, and
-only display the variable when it needs to be displayed.
-Instead of the current line based GUI, it would be good to be able to display a
-map of the network and the devices. The raw data would have no information on coordinates
-or drawing, but a separate file could contain all the information necessary to create a
-graphical display. As an example, the file could contain coordinates of nodes and edges,
-and hierarchical relationships between the devices- the user interface could read this
-data and construct the diagram of the network.
-In order to make this scalable, it would be useful to allow various NOCOL's to interact
-with each other. This is easily doable using the noclogd daemon, since noclogd
-can be enhanced to send an event to other noclogd's running on remote hosts. The data can
-be isolated and referred to using the 'nodename' to prefix the data/events.
-
-
-Vikas Aggarwal
-
-
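
The design notes describe storing a datapoint only when the severity of a variable changes, using up to three thresholds (warning, error, critical). A minimal sketch of that idea follows. It is not the NOCOL library code: the SEV_* names, struct var_state, classify() and sample() are hypothetical, the thresholds are assumed to be ascending with higher values being worse, and the real event structure lives in nocol.h.

```c
#include <assert.h>

/* Hypothetical severity levels, ordered from best to worst.
 * These are NOT the values defined in nocol.h. */
enum sev { SEV_INFO = 0, SEV_WARNING, SEV_ERROR, SEV_CRITICAL };

struct var_state {
    int      have_last;   /* 0 until the first sample is seen */
    enum sev last;        /* severity of the previous sample  */
};

/* Map a value onto a severity 'bin' using three ascending
 * thresholds: thres[0]=warning, thres[1]=error, thres[2]=critical. */
static enum sev classify(long value, const long thres[3])
{
    assert(thres[0] <= thres[1] && thres[1] <= thres[2]);
    if (value >= thres[2]) return SEV_CRITICAL;
    if (value >= thres[1]) return SEV_ERROR;
    if (value >= thres[0]) return SEV_WARNING;
    return SEV_INFO;
}

/* Process one polled value. Returns 1 when the severity changed
 * (or on the very first sample) and a log entry would be written,
 * 0 when the sample falls in the same bin as the previous one. */
static int sample(struct var_state *st, long value, const long thres[3])
{
    enum sev now = classify(value, thres);

    if (st->have_last && now == st->last)
        return 0;
    st->have_last = 1;
    st->last = now;
    return 1;
}
```

With assumed thresholds of 80, 90 and 95 percent, disk usage samples of 10, 35 and 60 produce a single log entry, and a later sample of 80 produces a second one, which matches the disk space example in the notes.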
		NOCOL- + Design and Internals + vikas@navya.com + Mar 22, 2000
		+
Overview + ¤ Design principles + ¤ Architecture Monitors + User Interfaces + ¤ Netconsole + ¤ WebNocol + ¤ tkNocol + Reporting + Future Work +		Overview + NOCOL is a system and network monitoring software which runs on Unix platforms and + monitors reachability, ports, routes, system resources, etc. It is modular in design, and + allows adding new monitors easily and without impacting other portions of the software- in + fact, a large number of the monitors are contributed by various NOCOL users. + The basic design principles behind NOCOL are relatively few: + allow multiple users to see the same data being collected instead of requiring each user + to start their own set of monitors + multiple layers of severity to avoid any false alarms (to stop the NOC operator from + ignoring an alarm because it 'usually goes away') + incremental data storage (dont store every data sample- only store a datapoint when the + severity of an event changes). + be able to view the events from a non-graphical interface + + It might seem strange, but the initial versions of SunNet Manager, CiscoWorks, etc. had + none of the above features when NOCOL was originally written. Most of the commercial + packages required pretty extensive hardware to run, and it seemed like they sacrificed a + lot in order to present a pretty graphical interface. Nocol in contrast could run on a + very low end system, the monitors could be separated from the logging and reporting + machine since they communicated over a network, and the datapoints collected were very + small in number since they were only recorded when the severity of a device/variable + changed. So, the disk space on a machine could vary from 10% to 60% full, and only one + entry is logged since this would all be considered 'normal'. If the disk becomes 80% full, + a 'warning' message is generated and another datapoint is logged. 
This simple + approach gives amazingly small volumes of data, and yet presents a perfectly comprehensive + (though quantized) report on the variables. + The architecture of nocol itself is very simple- the monitors poll the devices and + assign a threshold to each 'poll' (called an 'event'). These thresholds are user settable + and vary from monitor to monitor (in fact, the rest of the software does not care what is + being monitored and does not store any intelligence about the variables). All intelligence + of the variable being monitored and what conditions are to normal and abnormal is built + into the monitor. + + The monitor would then set: + the current value of the variable (thruput, lost packets) + the current threshold (there can be upto 3 thresholds for 3 different levels of + severity) + timestamp, etc. + + and invoke a nocol API function. This writes out the current values, etc. to a realtime + data file on disk (this contains the current state of any device/variable) and if the + severity has changed, then this also gets logged to a incremental logging daemon + (noclogd). + All user displays can then display the data from the realtime data directory, whereas + the alarm and notification subsystem gets activated by the incremental 'noclogd' + process. The noclogd process can filter events based on user defined criteria, and invoke + an SMS pager, send email, perhaps even run some automated tests or open a trouble ticket. + An 'event' is basically a unique tuple of device name + device + address + variable name. Each event has a current data value of the variable being + monitored, and also a threshold value corresponding to the current severity level. This is + best understood by looking at the event data structure in the nocol.h C + include file. + This simple architecture has proven to work very effectively in this application. 
The base system has not really changed since the software was initially written, but
new monitors, displays, and notification software are continually being added without
any changes to the core system.

The Monitors

Each monitor collects the value of a variable and checks whether it exceeds any of the
3 thresholds (warning, error, critical; these thresholds are user configurable). This
is all done using the nocol library functions, so in effect all the monitor needs to do
is get the value of the variable being monitored and read the thresholds from a config
file. The nocol library ensures consistency in the way that the monitoring is processed
by the rest of the system.

Each monitor is unique in the way that it monitors its respective variable. The DNS
monitor needs to make an authoritative DNS query to see if the DNS server is configured
properly, the Port monitor needs to connect to TCP or UDP ports to ensure that the
processes are responding properly, and the SNMP monitor needs to monitor SNMP variables
using the SNMP protocol. The intelligence about the entity being monitored and how to
monitor it lies strictly in the monitor; the rest of the NOCOL subsystem just expects a
device name, a variable and its value.

A fair amount of effort has gone into making the monitors very efficient where possible
so that they scale to a large number of devices. Connectionless (UDP) monitors are
especially well suited to using the select() system call, so that many devices can be
queried at the same time while the monitor waits for the responses to come in. The
other option was to fork multiple processes from a single parent, with each process
monitoring one device. However, the first method proved to scale far better than the
forking method.
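
The select()-based approach described above can be sketched as follows. This is a
minimal illustration in Python, not the NOCOL implementation (which is written in C and
Perl); the function name probe_all, the payload and the timeout are assumptions made
for the example. It sends one UDP probe to every target first, then waits on all the
sockets at once.

```python
import select
import socket
import time


def probe_all(targets, payload=b"ping", timeout=1.0):
    """Send one UDP probe to each (ip, port) target, then collect replies.

    Hypothetical helper for illustration. Returns the set of targets that
    answered before the timeout expired.
    """
    pending = {}  # socket -> target still awaiting a reply
    for target in targets:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setblocking(False)
        sock.connect(target)  # only replies from this peer are delivered
        try:
            sock.send(payload)
        except OSError:
            sock.close()
            continue
        pending[sock] = target

    answered = set()
    deadline = time.monotonic() + timeout
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # One select() call waits on every outstanding probe at once.
        readable, _, _ = select.select(list(pending), [], [], remaining)
        for sock in readable:
            try:
                sock.recv(1024)
                answered.add(pending[sock])
            except OSError:
                pass  # e.g. an ICMP 'port unreachable' surfaced as an error
            sock.close()
            del pending[sock]

    for sock in pending:  # targets that never replied
        sock.close()
    return answered
```

All probes share a single timeout window, so the total wait does not grow with the
number of targets. The number of sockets that one select() call can watch is bounded by
a system limit on file descriptors.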

To illustrate, consider 'pinging' 100 devices with 5 packets each, waiting 1 second for
each response and 1 second between packets to the same host. Done serially, this would
take at least 500 seconds for each pass. If we fork multiple processes to do it in
parallel, it would take about 5 seconds, but we would have to fork 100 processes. The
'multiping' monitor could send out 1 packet to each of the 100 devices in about 10
seconds and then listen for the responses to come in, effectively taking about 15
seconds for the entire pass.

Building this level of 'multi-tasking' is a lot more difficult for TCP-based services
since it requires non-blocking I/O, but it is important to do this for monitors such as
'portmon'. All monitors of this type (using select()) are limited by the MAXFD value
(the maximum number of open file descriptors that the select() call can handle).

The 'hostmon' monitor is an example of letting the remote hosts (that are being
monitored) do the local data collection (i.e. distributing the 'time consuming' part to
the hostmon-clients). The 'hostmon' process runs on the NOCOL monitoring host and simply
takes all these data files and uses them as raw input for processing.

In some cases, the monitors do not need any data other than what is in the nocol data
structure written to disk (the raw data), whereas in others they need to store
ancillary, variable- and device-specific information in memory. Every effort is made to
avoid storing unnecessary data in memory and having 'bloated' monitoring processes.

User Interfaces

The user interfaces need to display the current state of the devices being monitored,
and this 'current' data is stored on disk (in the 'data' directory). This allows any
number of users and monitors to view the same consistent data while running only one
set of monitors (unlike some other systems, which need a separate monitor for each
display).
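
As a rough sketch of how one set of monitors can feed any number of displays through
files on disk, the Python example below keeps one fixed-size record per event and
overwrites it in place. The record layout, field sizes, severity numbers and function
names are hypothetical and chosen only for illustration; the real event structure is
defined in nocol.h and is not reproduced here.

```python
import os
import struct

# Hypothetical record layout (NOT the EVENT struct from nocol.h):
# device name, variable name, value, threshold, severity, timestamp.
RECORD = struct.Struct("<32s16sqqBq")


def write_event(path, slot, device, variable, value, threshold, severity, when):
    """Monitor side: overwrite record number `slot` in place."""
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(slot * RECORD.size)
        f.write(RECORD.pack(device.encode(), variable.encode(),
                            value, threshold, severity, when))


def read_events(path):
    """Display side: any number of readers can load the same records."""
    events = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            dev, var, value, threshold, severity, when = RECORD.unpack(chunk)
            events.append((dev.rstrip(b"\0").decode(),
                           var.rstrip(b"\0").decode(),
                           value, threshold, severity, when))
    return events
```

Because each pass rewrites the same slots, the file size depends only on the number of
events being monitored, not on how long the monitor has been running.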

The other departure from traditional network monitoring packages is that monitored data
is displayed as lines of text and not as a map or other graphical interface. This
approach was taken because, in practical experience, the network diagram was always
drawn in some 'drawing' tool and the map on the NMS was not updated regularly. Even
today, most network/LAN diagrams are maintained in a tool such as Visio, and the NMS
graphical interface is always a 'second' copy. This, and being able to view line-based
data from any terminal, weighed very heavily in favor of a non-graphical user interface.

Netconsole (curses)

Netconsole is a simple Unix 'curses' based TTY interface. It reads the raw data from
disk and formats it for display on the screen. It has limited intelligence: it raises
an alarm when it sees a change in the number of 'down' items. This is the original user
interface and was written to let engineers view the state of the systems/network over a
low-speed connection (over which X Windows, etc. would not be feasible).

WebNocol (Web)

Contributed by Rick Beebe, this is a Web-based frontend to the data files. It allows
running CGIs and troubleshooting listed events, and brings all the other benefits of
HTTP.

This web interface refreshes automatically at regular intervals, plays an audio clip if
a site changes its severity level, etc. Any valid operator can insert a 'status'
message, which is displayed next to the event. Users are assigned access levels which
control how much information they can view or edit.

tkNocol (Tcl/Tk)

This is a client-server application contributed by Lydia Leong. The tkNocol application
connects to the ndaemon process on the host system and displays the NOCOL data in an X
window.

This interface needs 'tixwish' on the system.
Any number of clients can connect to the simple process (ndaemon) running on the NOCOL
host, which sends data to all the clients periodically. Currently there is no access
control on the ndaemon host, so it should be protected by a firewall, but the daemon
can be extended to add access control if needed.

Reporting

The monitors in NOCOL generate an event (logged to the noclogd logging daemon) only
when the severity of a variable changes (i.e. it goes from warning to error, or from
critical back to info). The thresholds for the various severities are defined by the
user, and this tends to reduce the irrelevant data points in the collected data. This
threshold-triggered event generation allows a monitor to poll a device or variable as
frequently as it likes (every 30 seconds, every 10 minutes), but it generates a logging
entry only if the variable crosses one of the thresholds.

This approach of recording values only when the state changes also reduces the quantity
of historical data significantly. Even though some granularity is lost, statistical
analysis can easily be done on the collected data by using the time interval for which
a variable remained at a particular level.

'noclogd' is similar to the Unix 'syslog' process: it allows piping the log message to
an external program or writing it to a file based on the monitor name. This forms the
basis for invoking SMS scripts to do paging, sending email, automated insertion into
trouble ticketing systems, automatic problem analysis, etc.; the possibilities are
virtually unlimited.

Currently this system writes to flat files, but the data can easily be piped to a
process that writes into a database. Note that the 'current' data that the monitors
write to disk (the raw data which is displayed by the user interfaces) is overwritten
in every pass by the monitor. Hence the size of those files is fixed and does not grow
over time.
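
The threshold-triggered logging described in this section can be sketched in a few
lines. This is an illustration in Python under stated assumptions, not code from NOCOL:
the function names are invented, thresholds are assumed to rise with severity, and a
value equal to a threshold is assumed to count as crossing it.

```python
# Severity names follow the text; everything else here is illustrative.
LEVELS = ("info", "warning", "error", "critical")


def severity(value, thresholds):
    """Map a value to a severity, given ascending
    (warning, error, critical) thresholds."""
    level = 0
    for limit in thresholds:
        if value >= limit:  # assumption: reaching a threshold crosses it
            level += 1
    return LEVELS[level]


def log_changes(samples, thresholds):
    """Keep only the (time, value, severity) samples where severity changed."""
    log = []
    last = None
    for when, value in samples:
        level = severity(value, thresholds)
        if level != last:
            log.append((when, value, level))
            last = level
    return log


def time_in_level(log, end_time):
    """Seconds spent at each severity, computed from the change log alone."""
    totals = {}
    following = log[1:] + [(end_time, None, None)]
    for (start, _, level), (stop, _, _) in zip(log, following):
        totals[level] = totals.get(level, 0) + stop - start
    return totals
```

For example, with thresholds (80, 90, 95) and disk-usage samples of 10, 35, 60, 82, 85
and 50 percent taken every 60 seconds from time 0, log_changes() keeps only three of the
six samples (info at 0, warning at 180, info at 300), and time_in_level() with an end
time of 360 reports 240 seconds at 'info' and 120 seconds at 'warning'.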

Future Work

The package does not interface to any database, and would benefit greatly from storing
the raw (monitored) data and the noclogd historical information in a database such as
MySQL. This would allow correlating the various variables being monitored for any
device (e.g. showing the state of all variables being monitored for device lan-gw).
Graphs and reports could be charted from the historical noclogd database.

A Java-based user interface along the same lines as tkNocol would allow running the
display on any platform, and one could build a lot of graphing and reporting
functionality into the GUI itself.

The GUI could also support collapsing all the variables for an event into one line, and
displaying a variable only when it needs to be displayed.

Instead of the current line-based GUI, it would be good to be able to display a map of
the network and the devices. The raw data would have no information on coordinates or
drawing, but a separate file could contain all the information necessary to create a
graphical display. As an example, the file could contain coordinates of nodes and
edges, and hierarchical relationships between the devices; the user interface could
read this data and construct the diagram of the network.

In order to make this scalable, it would be useful to allow multiple NOCOL
installations to interact with each other. This is easily doable using the noclogd
daemon, since noclogd can be enhanced to send an event to other noclogd processes
running on remote hosts. The data can be isolated and referred to by using the
'nodename' as a prefix for the data/events.

Vikas Aggarwal

1.	portmon.c	» Missing close() left too many file descriptors open.
		» Now runs check_resp() after receiving EOF from the remote host. Should fix the
		  problem of receiving data with no \n in the entire data stream.
2.	snmpgeneric	» Allow specifying client port number (joe@hole-in-the.net)
		» Also sets the MIBFILE_v2 variable for the mib file
3.	webnocol.cgi	Small fix to prevent possible loop.
4.	nocollib.pl	Changed 'ps' to '/bin/ps'
5.	hostmon-client	Changed 'ps' to '/bin/ps'