diff -Naur nocol-4.3/INSTALL nocol-4.3.1/INSTALL
--- nocol-4.3/INSTALL Wed Jan 26 23:56:19 2000
+++ nocol-4.3.1/INSTALL Tue Mar 21 00:24:06 2000
@@ -1,4 +1,4 @@
-## $Id: INSTALL,v 4.4 2000/01/27 04:55:57 vikas Exp $
+## $Id: INSTALL,v 4.5 2000/03/21 05:23:58 vikas Exp $
 
 INSTALLATION INSTRUCTIONS FOR 'NOCOL' v4.3
 ==========================================
@@ -147,25 +147,31 @@
 ---------
 
 There is a PERL interface for developing additional NOCOL monitors. To use
-this, you need to have PERL installed on your system (perl is available
-from ftp.uu.net or from ftp.netlabs.com).
+this, you need to have PERL installed on your system.
 
-1. If using 'hostmon', edit the hostmon server and the clients to set
-   the port & permitted hosts. The client routines do not use nocollib.pl
-   and can be run entirely standalone on remote hosts (just copy over
-   perlnocol/hostmon-osclients to all your Unix clients and start up
-   hostmon-client at boot time by makeing an entry in your /etc/rc.local
-   or equivalent file). As an example, you can do the following on all
-   your Unix hosts:
+1. If using 'hostmon', you need to run the standalone 'hostmon-client'
+   programs on the machines you want monitored, and run the 'hostmon'
+   process on the 'nocol' server. Check the '@permithosts' line in the
+   'hostmon-client' program to ensure that it allows the nocol host to
+   connect to the hostmon-client processes. Then copy over the entire
+   'perlnocol/hostmon-osclients' directory to all the Unix hosts that
+   you want monitored. These client routines do not use nocollib.pl
+   and do not use any configuration file.
+   Start up hostmon-client at boot time by making an entry in your
+   /etc/rc.local or equivalent file. As an example, you can do the
+   following on all your Unix hosts you want monitored:
        cd $ROOTDIR/bin
        rsh host1 mkdir /usr/local/nocol
        rcp -r hostmon-osclients host1:/usr/local/nocol
        rlogin host1
        # Now edit your /etc/rc.local or whatever system startup script
        # and add the line:
-       # (cd /usr/local/nocol/hostmon-osclients; ./hostmon-client)
-       # Run this command manully for now since you are not rebooting
+       # (cd /usr/local/nocol/hostmon-osclients; ./hostmon-client)
+       # Run this command manually for now since you are not rebooting
        # your machine.
+
+   The 'hostmon' process on the nocol host will be restarted by the
+   'keepalive_monitors' process. Edit the hostmon-confg file.
 
 2. To use 'snmpmon', edit and set the thresholds in the snmpmon-confg file.
    List the devices that need to be monitored in the
diff -Naur nocol-4.3/Makefile.mid nocol-4.3.1/Makefile.mid
--- nocol-4.3/Makefile.mid Wed Jan 26 23:58:24 2000
+++ nocol-4.3.1/Makefile.mid Thu Apr 6 15:35:19 2000
@@ -1,4 +1,4 @@
-# $Header: /home/vikas/src/nocol/RCS/Makefile.mid,v 1.3 2000/01/27 04:58:15 vikas Exp $
+# $Header: /home/vikas/src/nocol/RCS/Makefile.mid,v 1.4 2000/04/06 19:35:08 vikas Exp $
 #
 # Makefile for 'nocol'. This file simply calls on other Makefiles in
 # the subdirectories to do all the work. All the definitions are used
@@ -7,7 +7,7 @@
 # To 'make' for only one program, use
 #      make "SRCS=trapmon" [install|clean]
 #
-REV = "4.3"
+REV = "4.3.1"
 package=@package@
 OS=@OS@
 WHOAMI=@WHOAMI@
diff -Naur nocol-4.3/html/design.html nocol-4.3.1/html/design.html
--- nocol-4.3/html/design.html Wed Jan 26 23:39:32 2000
+++ nocol-4.3.1/html/design.html Mon Mar 27 22:08:58 2000
@@ -1,230 +1,237 @@
- - - -The Design of NOCOL - - - - -
- - - - - - - - - - - - - - - - - - - - - -
 

NOCOL- - Design and Internals
- vikas@navya.com
- Jan 19, 2000

 
-
Overview
-
Design principles
- Architecture

Monitors

-

User Interfaces
-
Netconsole
- WebNocol
- tkNocol

-

Reporting

-

Future Work
-

Overview

-

NOCOL is a system and network monitoring software which runs on Unix platforms and - monitors reachability, ports, routes, system resources, etc. It is modular in design, and - allows adding new monitors easily and without impacting other portions of the software- in - fact, a large number of the monitors are contributed by various NOCOL users.

-

The basic design principles  behind NOCOL are relatively few:

    -
  • allow multiple users to see the same data being collected instead of requiring each user - to start their own set of monitors
  • -
  • multiple layers of severity to avoid any false alarms (to stop the NOC operator from - ignoring an alarm because it 'usually goes away')
  • -
  • incremental data storage (dont store every data sample- only store a datapoint when the - severity of an event changes).
  • -
  • be able to view the events from a non-graphical interface
  • -
-

It might seem strange, but the initial versions of SunNet Manager, CiscoWorks, etc. had - none of the above features when NOCOL was originally written. Most of the commercial - packages required pretty extensive hardware to run, and it seemed like they sacrificed a - lot in order to present a pretty graphical interface. Nocol in contrast could run on a - very low end system, the monitors could be separated from the logging and reporting - machine since they communicated over a network, and the datapoints collected were very - small in number since they were only recorded when the severity of a device/variable - changed. So, the disk space on a machine could vary from 10% to 60% full, and only one - entry is logged since this would all be considered 'normal'. If the disk becomes 80% full, -   a 'warning' message is generated and another datapoint is logged. This simple - approach gives amazingly small volumes of data, and yet presents a perfectly comprehensive - (though quantized) report on the variables.

-

The architecture of nocol itself is very simple- the monitors poll the devices and - assign a threshold to each 'poll' (called an 'event'). These thresholds are user settable - and vary from monitor to monitor (in fact, the rest of the software does not care what is - being monitored and does not store any intelligence about the variables). All intelligence - of the variable being monitored and what conditions are to normal and abnormal is built - into the monitor.

-

nocol-arch.gif (8177 bytes)

-

The monitor would then set:

    -
  • the current value of the variable (thruput, lost packets)
  • -
  • the current threshold (there can be upto 3 thresholds for 3 different levels of - severity)
  • -
  • timestamp, etc.
  • -
-

and invoke a nocol API function. This writes out the current values, etc. to a realtime - data file on disk (this contains the current state of any device/variable) and if the - severity has changed, then this also gets logged to a incremental logging daemon - (noclogd).

-

All user displays can then display the data from the realtime data directory, whereas - the alarm and notification subsystem gets activated by the incremental  'noclogd' - process. The noclogd process can filter events based on user defined criteria, and invoke - an SMS pager, send email, perhaps even run some automated tests or open a trouble ticket.

-

This simple architecture has proven to work very effectively in this application. The - base system has not really changed since the software was initially written, but new - monitors, displays, notification software is continually being added without any changes - to the core system.

-
-

The Monitors

-

The monitors collect variable values and compares to see if it exceeds any of the 3 - thresholds (warning, error, critical- these thresholds are user configurable). This is all - done using the nocol library functions, so in effect, all the monitor needs to do is get - the value for the variable being monitored and read the thresholds from a config file. The - nocol library ensures consistency in the way that the monitoring is processed by the rest - of the system.

-

Each monitor is unique in the way that it monitors its respective variable. The DNS - monitor needs to make an authoritative DNS query to see if the dns server is configured - properly, the Port monitor needs to connect to TCP or UDP ports to ensure that any - processes are responding properly, and the SNMP monitor needs to monitor snmp variables - using the SNMP protocol. The intelligence about the entity being monitored and how to - monitor it lies strictly in the monitor- the rest of the nocol subsystem  is just - expecting a  device name, variable and its value.

-

A fair amount of effort has gone into making the monitors very efficient where possible - in order to allow them to scale to a large number of devices. Connectionless (UDP) - monitors are specially well suited to using the select() system call so that many devices - can be queried at the same time and the monitor then waits for the responses to come in. - The other option was to fork multiple processes with a single parent and each process - monitors one device. However, the level of scalability that could be achieved with the - first method proved to be far more than what could be achieved with the forking method.

-

To emphasize the above, consider 'pinging' 100 devices with 5 packets each, waiting 1 - second for each response and 1 sec between each packet to the same host. If done serially, - this would take at least 500 seconds for each pass. If we fork multiple processes to do it - in parallel, this would take about 5 seconds, but we would have to fork a 100 processes. - The  'multiping' monitor could send out 1 packet to each of the 100 devices in about - 10 seconds and then listen for the responses to come in- effectively taking about 15 - seconds for the entire pass.

-

Building this level of 'multi-tasking' is a lot more difficult in the TCP based - services since it would require non-blocking I/O, but it it important to do this for - monitors such as 'portmon'. All of these type of monitors (using select()) are limited by - the MAXFD value (maximum open file descriptors that can be handled by the select() call).

-

The 'hostmon'  monitor is an example of letting the remote hosts being monitored - do the local data collection (i.e. distributing the 'time consuming' part to - hostmon-clients). The 'hostmon' process running on the nocol host simply takes all these - data files and uses them as raw input to the process.

-

In some cases, the monitors do not need any data other than what is in the nocol data - structure written to disk (the raw data), whereas in others they need to store ancillary, - variable and device specific information in memory. All possible efforts are made to avoid - storing unnecessary data in memory and having 'bloated' monitoring processes.

-
-

User Interfaces

-

The user interfaces need to display the current state of  the devices being - monitored, and this 'current' data is stored on disk (in the 'data' directory). This - allows any number of users and monitors to view the same consistent data, and run only one - set of monitors (unlike some other systems which need a separate monitor for each - display).

-

The other diversion from traditional network monitoring packages is the displaying of - monitored data using text lines and not a map or other graphical interface. The reason - this approach was taken is that in practical experience, a network diagram was always done - in some 'drawing' tool and the map on the NMS was not updated regularly. Even today, most - network/lan diagrams are maintained in a tool such as Visio, and the NMS graphical - interface is always a 'second' copy. This and being able to view line based data from any - terminal weighed very heavily in favor of a non-graphical user interface.

-

Netconsole (curses)

-

Netconsole is a simple Unix 'curses' based TTY interface. It reads the raw data from - disk and formats it for displaying on the screen. It has limited intelligence, and its - method of setting an alarm is when it sees a change in the number of 'down' items. This is - the original user interface and was written to let engineers view the state of the - systems/network over a low speed connection (over which X windows, etc. would not be - feasible).

-

WebNocol (Web)

-

Contributed  by Rick Beebe, this is a Web based frontend to the datafiles. It - allows running CGI's and troubleshooting listed events and all the other benefits of HTTP.

-

webnocol.gif (20197 bytes)

-

This web interface automatically refreshes periodically, plays an audio clip if a site - changes its severity level, etc. A 'status' message can be displayed next to each event - which is inserted by any valid operator. Users are assigned access levels which controls - how much information they can view or edit.

-

tkNocol (Tcl/Tcl)

-

This is a client-server application contributed by Lydia Leong. The tkNocol application - connects to the ndaemon process on the host system, and displays the nocol data - in a X-window.

-

tkNocol.gif (18782 bytes)

-

This interface needs 'tixwish' on the system. Any number of clients can  connect - to the simple process (ndaemon) running on the nocol host which sends data to all - the clients periodically. Currently there is no access control configured on the ndaemon - host, so this should be protected by a firewall, but this interface can be extended to add - these features in the daemon if needed.

-
-

Reporting

-

The monitors in NOCOL generate a 'historical' event (logged to the noclogd - logging daemon) only when the severity of a variable changes (i.e. it goes from warning - to error or from critical back to info. This is done to reduce - the amount of historical data collected and restrict it only to 'relevant' datapoints. - This quantized data storage allows a monitor to poll a device or variable as frequently as - it likes (30 secs, 10 minutes), but it will generate a logging entry only if the variable - crosses one of the thresholds.

-

This approach of classifying the data into 'bins'  reduces the quantity of - historical data significantly. Even though some granularity is lost, statistical analysis - can easily be done on the collected data by using the time interval that a variable - remained in a particular level.

-

'noclogd' is similar to the Unix 'syslog' process- it allows piping the log message to - an external program or writing to a file  based on the monitor name. This forms the - basis of  invoking SMS scripts to do paging, sending email, automated insertion into - trouble ticketing systems, auto problem analysis, etc.- the possibilities are virtually - unlimited.

-

Currently this system writes to flat files, but the data can easily be piped to a - process that writes into a database. Note that the 'current' data that the monitors write - to disk (the raw data which is displayed by the user interfaces) is overwritten in every - pass by the monitor. Hence the size of those files is fixed and does not grow over time.

-
-

Future Work

-

The package does not interface to any database, and would benefit greatly from storing - the raw (monitored) and noclogd historical information in a database such as MySQL, etc. - This would allow co-relating the various variables being monitored for any device (e.g. - show the state of all variables being monitored for device lan-gw). Graphs and reports - could be charted from the historical noclogd database.

-

A Java based user interface along the same lines as tkNocol would allow running the - display on any platform and one could build a lot of graphing, reporting functionality - into the gui itself.

-

The GUI could also support collapsing all the variables for an event in one line, and - only display the variable when it needs to be displayed.

-

Instead of the current line based GUI, it would be good to be able to display a - map of the network and the devices. The raw data would have no information on coordinates - or drawing, but a separate file could contain all the information necessary to create a - graphical display. As an example, the file could contain coordinates of nodes and edges, - and hierarchical relationships between the devices- the user interface could read this - data and construct the diagram of the network.

-

In order to make this scalable, it would be useful to allow various NOCOL's to interact - with each other. This is easily doable using the noclogd daemon, since noclogd - can be enhanced to send an event to other noclogd's running on remote hosts. The data can - be isolated and  referred to using the 'nodename' to prefix the data/events.

-
-
- Vikas Aggarwal -
-
-
- - + + + + + +The Design of NOCOL + + + + +
+ + + + + + + + + + + + + + + + + + + + + +
 

NOCOL- + Design and Internals
+ vikas@navya.com
+ Mar 22, 2000

 
+
Overview
+
Design principles
+ Architecture

Monitors

+

User Interfaces
+
Netconsole
+ WebNocol
+ tkNocol

+

Reporting

+

Future Work
+

Overview

+

NOCOL is system and network monitoring software that runs on Unix platforms and + monitors reachability, ports, routes, system resources, etc. It is modular in design, and + allows adding new monitors easily and without impacting other portions of the software- in + fact, a large number of the monitors are contributed by various NOCOL users.

+

The basic design principles  behind NOCOL are relatively few:

    +
  • allow multiple users to see the same data being collected instead of requiring each user + to start their own set of monitors
  • +
  • multiple layers of severity to avoid any false alarms (to stop the NOC operator from + ignoring an alarm because it 'usually goes away')
  • +
  • incremental data storage (don't store every data sample- only store a datapoint when the + severity of an event changes).
  • +
  • be able to view the events from a non-graphical interface
  • +
+

It might seem strange, but the initial versions of SunNet Manager, CiscoWorks, etc. had + none of the above features when NOCOL was originally written. Most of the commercial + packages required pretty extensive hardware to run, and it seemed like they sacrificed a + lot in order to present a pretty graphical interface. Nocol in contrast could run on a + very low end system, the monitors could be separated from the logging and reporting + machine since they communicated over a network, and the datapoints collected were very + small in number since they were only recorded when the severity of a device/variable + changed. So, the disk space on a machine could vary from 10% to 60% full, and only one + entry is logged since this would all be considered 'normal'. If the disk becomes 80% full, +   a 'warning' message is generated and another datapoint is logged. This simple + approach gives amazingly small volumes of data, and yet presents a perfectly comprehensive + (though quantized) report on the variables.
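The disk-space example above can be sketched in a few lines of Python. This is only an illustration of the "log on severity change" idea, not NOCOL's actual code: the 80% warning level comes from the text, while the 90% and 95% limits and the function names are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical thresholds for a "percent of disk used" variable. Only the
# 80% 'warning' level is taken from the text; 90 and 95 are assumptions.
THRESHOLDS: List[Tuple[float, str]] = [
    (95.0, "critical"),
    (90.0, "error"),
    (80.0, "warning"),
]


def severity(value: float) -> str:
    """Return the most severe level whose threshold the value has reached."""
    for limit, level in THRESHOLDS:
        if value >= limit:
            return level
    return "info"


def incremental_log(samples: List[float]) -> List[Tuple[int, float, str]]:
    """Keep a datapoint only when the severity differs from the previous poll."""
    logged = []
    previous: Optional[str] = None
    for index, value in enumerate(samples):
        level = severity(value)
        if level != previous:
            logged.append((index, value, level))
            previous = level
    return logged


if __name__ == "__main__":
    # Six polls, but only three datapoints are kept.
    for entry in incremental_log([10, 35, 60, 80, 85, 50]):
        print(entry)
```

With these assumed thresholds, the polls at 10%, 35% and 60% collapse into a single 'info' entry, the poll at 80% adds a 'warning' entry, and the drop back to 50% adds one more 'info' entry.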

+

The architecture of nocol itself is very simple- the monitors poll the devices and + assign a threshold to each 'poll' (called an 'event'). These thresholds are user settable + and vary from monitor to monitor (in fact, the rest of the software does not care what is + being monitored and does not store any intelligence about the variables). All intelligence + about the variable being monitored, and about which conditions are normal or abnormal, is built + into the monitor.

+

nocol-arch.gif (8177 bytes)

+

The monitor would then set:

    +
  • the current value of the variable (thruput, lost packets)
  • +
  • the current threshold (there can be up to 3 thresholds for 3 different levels of + severity)
  • +
  • timestamp, etc.
  • +
+

and invoke a nocol API function. This writes out the current values, etc. to a realtime + data file on disk (this contains the current state of any device/variable) and, if the + severity has changed, the event is also logged to the incremental logging daemon + (noclogd).

+

All user displays can then display the data from the realtime data directory, whereas + the alarm and notification subsystem gets activated by the incremental  'noclogd' + process. The noclogd process can filter events based on user defined criteria, and invoke + an SMS pager, send email, perhaps even run some automated tests or open a trouble ticket.

+

An 'event' is basically a unique tuple of  device name + device + address + variable name. Each event has a current data value of the variable being + monitored, and also a threshold value corresponding to the current severity level. This is + best understood by looking at the event data structure in the nocol.h C + include file.
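As a rough illustration of the tuple described above, the sketch below keys each event on (device name, device address, variable name) and keeps one current record per key. The field names are assumptions chosen for readability; they are not copied from the event structure in nocol.h, which remains the authoritative definition.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (device name, device address, variable name)
EventKey = Tuple[str, str, str]


@dataclass
class Event:
    # Hypothetical field names, not the members of the nocol.h structure.
    device_name: str
    device_addr: str
    var_name: str
    value: float      # current value of the variable
    threshold: float  # threshold for the current severity level
    severity: str

    @property
    def key(self) -> EventKey:
        """The tuple that uniquely identifies this event."""
        return (self.device_name, self.device_addr, self.var_name)


def store(table: Dict[EventKey, Event], event: Event) -> None:
    """Insert or overwrite, so there is one current record per unique tuple."""
    table[event.key] = event


if __name__ == "__main__":
    table: Dict[EventKey, Event] = {}
    store(table, Event("lan-gw", "192.0.2.1", "ICMP-ping", 0.0, 3.0, "info"))
    store(table, Event("lan-gw", "192.0.2.1", "ICMP-ping", 5.0, 3.0, "warning"))
    print(len(table), table[("lan-gw", "192.0.2.1", "ICMP-ping")].severity)
```

A second poll of the same device and variable overwrites the earlier record instead of adding a new one, which matches the "current state" role of the realtime data.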

+

This simple architecture has proven to work very effectively in this application. The + base system has not really changed since the software was initially written, but new + monitors, displays, and notification software are continually being added without any changes + to the core system.

+
+

The Monitors

+

Each monitor collects the value of a variable and checks whether it exceeds any of the 3 + thresholds (warning, error, critical- these thresholds are user configurable). This is all + done using the nocol library functions, so in effect, all the monitor needs to do is get + the value for the variable being monitored and read the thresholds from a config file. The + nocol library ensures consistency in the way that the monitoring is processed by the rest + of the system.

+

Each monitor is unique in the way that it monitors its respective variable. The DNS + monitor needs to make an authoritative DNS query to see if the dns server is configured + properly, the Port monitor needs to connect to TCP or UDP ports to ensure that any + processes are responding properly, and the SNMP monitor needs to monitor snmp variables + using the SNMP protocol. The intelligence about the entity being monitored and how to + monitor it lies strictly in the monitor- the rest of the nocol subsystem  is just + expecting a  device name, variable and its value.

+

A fair amount of effort has gone into making the monitors very efficient where possible + in order to allow them to scale to a large number of devices. Connectionless (UDP) + monitors are especially well suited to using the select() system call so that many devices + can be queried at the same time and the monitor then waits for the responses to come in. + The other option was to fork multiple processes with a single parent, with each process + monitoring one device. However, the level of scalability that could be achieved with the + first method proved to be far more than what could be achieved with the forking method.

+

To emphasize the above, consider 'pinging' 100 devices with 5 packets each, waiting 1 + second for each response and 1 sec between each packet to the same host. If done serially, + this would take at least 500 seconds for each pass. If we fork multiple processes to do it + in parallel, this would take about 5 seconds, but we would have to fork 100 processes. + The  'multiping' monitor could send out 1 packet to each of the 100 devices in about + 10 seconds and then listen for the responses to come in- effectively taking about 15 + seconds for the entire pass.
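The serial and forked figures quoted above follow from simple arithmetic, which the short sketch below reproduces. The function names are made up for the example, and the 15 second figure for 'multiping' is the text's own estimate, so it is deliberately not derived here.

```python
def serial_seconds(devices: int, packets: int, wait_per_packet: float) -> float:
    """Lower bound when every packet to every device is sent one after another."""
    return devices * packets * wait_per_packet


def forked_seconds(packets: int, wait_per_packet: float) -> float:
    """One process per device: all devices are pinged at the same time."""
    return packets * wait_per_packet


if __name__ == "__main__":
    devices, packets, wait = 100, 5, 1.0
    print("serial pass :", serial_seconds(devices, packets, wait), "seconds")
    print("forked pass :", forked_seconds(packets, wait), "seconds,",
          devices, "processes")
```

The forked pass is fast only because it spends one process per device, which is the cost the select() based approach avoids.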

+

Building this level of 'multi-tasking' is a lot more difficult in the TCP based + services since it would require non-blocking I/O, but it is important to do this for + monitors such as 'portmon'. All monitors of this type (using select()) are limited by + the MAXFD value (maximum open file descriptors that can be handled by the select() call).

+

The 'hostmon'  monitor is an example of letting the remote hosts (that are being + monitored) do the local data collection (i.e. distributing the 'time consuming' part to + hostmon-clients). The 'hostmon' process runs on the nocol monitoring host and simply takes + all these data files and uses them as raw input for processing.

+

In some cases, the monitors do not need any data other than what is in the nocol data + structure written to disk (the raw data), whereas in others they need to store ancillary, + variable and device specific information in memory. All possible efforts are made to avoid + storing unnecessary data in memory and having 'bloated' monitoring processes.

+
+

User Interfaces

+

The user interfaces need to display the current state of  the devices being + monitored, and this 'current' data is stored on disk (in the 'data' directory). This + allows any number of users and monitors to view the same consistent data, and run only one + set of monitors (unlike some other systems which need a separate monitor for each + display).

+

The other diversion from traditional network monitoring packages is the displaying of + monitored data using text lines and not a map or other graphical interface. The reason + this approach was taken is that in practical experience, a network diagram was always done + in some 'drawing' tool and the map on the NMS was not updated regularly. Even today, most + network/lan diagrams are maintained in a tool such as Visio, and the NMS graphical + interface is always a 'second' copy. This and being able to view line based data from any + terminal weighed very heavily in favor of a non-graphical user interface.

+

Netconsole (curses)

+

Netconsole is a simple Unix 'curses' based TTY interface. It reads the raw data from + disk and formats it for displaying on the screen. It has limited intelligence: it + raises an alarm when it sees a change in the number of 'down' items. This is + the original user interface and was written to let engineers view the state of the + systems/network over a low speed connection (over which X windows, etc. would not be + feasible).

+

WebNocol (Web)

+

Contributed  by Rick Beebe, this is a Web based frontend to the datafiles. It + allows running CGI's and troubleshooting listed events and all the other benefits of HTTP.

+

webnocol.gif (20197 bytes)

+

This web interface automatically refreshes periodically, plays an audio clip if a site + changes its severity level, etc. A 'status' message, inserted by any valid operator, + can be displayed next to each event. Users are assigned access levels which control + how much information they can view or edit.

+

tkNocol (Tcl/Tk)

+

This is a client-server application contributed by Lydia Leong. The tkNocol application + connects to the ndaemon process on the host system, and displays the nocol data + in an X window.

+

tkNocol.gif (18782 bytes)

+

This interface needs 'tixwish' on the system. Any number of clients can  connect + to the simple process (ndaemon) running on the nocol host which sends data to all + the clients periodically. Currently there is no access control configured on the ndaemon + host, so this should be protected by a firewall, but this interface can be extended to add + these features in the daemon if needed.

+
+

Reporting

+

The monitors in NOCOL generate an event (logged to the noclogd logging daemon) + only when the severity of a variable changes (e.g. it goes from warning to + error or from critical back to info). The thresholds for the + various severities are defined by the user, which tends to keep irrelevant data + points out of the collected data. This threshold-triggered event generation allows a + monitor to poll a device or variable as frequently as it likes (every 30 secs, every 10 minutes), but + it generates a logging entry only if the variable crosses one of the thresholds.

+

This approach of recording values only when the state changes also reduces the quantity + of historical data significantly. Even though some granularity is lost, statistical + analysis can easily be done on the collected data by using the time interval that a + variable remained in a particular level.

+

'noclogd' is similar to the Unix 'syslog' process- it allows piping the log message to + an external program or writing to a file  based on the monitor name. This forms the + basis of  invoking SMS scripts to do paging, sending email, automated insertion into + trouble ticketing systems, auto problem analysis, etc.- the possibilities are virtually + unlimited.
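The routing idea can be sketched as a table that maps a monitor name to one or more actions. This is a hypothetical model and not the noclogd configuration format or its code: the helper names, the "*" catch-all entry and the handler signatures are all assumptions made for the example.

```python
import subprocess
from typing import Callable, Dict, List

Handler = Callable[[str], None]


def to_file(path: str) -> Handler:
    """Build a handler that appends each message to a flat file."""
    def handler(message: str) -> None:
        with open(path, "a", encoding="utf-8") as log:
            log.write(message + "\n")
    return handler


def to_program(argv: List[str]) -> Handler:
    """Build a handler that pipes each message to an external program's stdin."""
    def handler(message: str) -> None:
        subprocess.run(argv, input=message + "\n", text=True, check=False)
    return handler


def dispatch(routes: Dict[str, List[Handler]], monitor: str, message: str) -> int:
    """Run the handlers for this monitor, then any catch-all "*" handlers.

    Returns the number of handlers that were invoked.
    """
    handlers = routes.get(monitor, []) + routes.get("*", [])
    for handler in handlers:
        handler(message)
    return len(handlers)


if __name__ == "__main__":
    # print() stands in for a pager script or a log file in this demo.
    routes: Dict[str, List[Handler]] = {"pingmon": [print], "*": [print]}
    dispatch(routes, "pingmon", "lan-gw ICMP-ping critical")
```

In a real setup the 'pingmon' entry might use to_program() with a paging script and the catch-all entry might use to_file() with a log path; both of those targets would be site specific.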

+

Currently this system writes to flat files, but the data can easily be piped to a + process that writes into a database. Note that the 'current' data that the monitors write + to disk (the raw data which is displayed by the user interfaces) is overwritten in every + pass by the monitor. Hence the size of those files is fixed and does not grow over time.

+
+

Future Work

+

The package does not interface to any database, and would benefit greatly from storing + the raw (monitored) and noclogd historical information in a database such as MySQL. + This would allow correlating the various variables being monitored for any device (e.g. + show the state of all variables being monitored for device lan-gw). Graphs and reports + could be charted from the historical noclogd database.

+

A Java based user interface along the same lines as tkNocol would allow running the + display on any platform and one could build a lot of graphing, reporting functionality + into the gui itself.

+

The GUI could also support collapsing all the variables for an event in one line, and + only display the variable when it needs to be displayed.

+

Instead of the current line based GUI, it would be good to be able to display a + map of the network and the devices. The raw data would have no information on coordinates + or drawing, but a separate file could contain all the information necessary to create a + graphical display. As an example, the file could contain coordinates of nodes and edges, + and hierarchical relationships between the devices- the user interface could read this + data and construct the diagram of the network.

+

In order to make this scalable, it would be useful to allow various NOCOL's to interact + with each other. This is easily doable using the noclogd daemon, since noclogd + can be enhanced to send an event to other noclogd's running on remote hosts. The data can + be isolated and  referred to using the 'nodename' to prefix the data/events.

+
+
+ Vikas Aggarwal +
+
+
+ + diff -Naur nocol-4.3/html/faq.html nocol-4.3.1/html/faq.html --- nocol-4.3/html/faq.html Wed Jan 26 23:51:08 2000 +++ nocol-4.3.1/html/faq.html Mon Mar 27 22:06:07 2000 @@ -1,159 +1,192 @@ - - - - - -nocol FAQ - - - - -

NOCOL : Frequently Asked Questions (FAQ)

- -

- - -

Last updated Jan 2000

- -
    -
  1. General -
  2. -
  3. Installation -
  4. -
  5. Miscellaneous -
  6. -
- -
- - -
-
What is NOCOL ?
-
nocol (Network Operation Center On-Line) is a collection of - system and network monitoring agents which have a common viewing interface and logging - mechanism. It can be used to monitor your LAN or WAN network devices as well as your Unix - systems and services.
-
How does NOCOL differ from MRTG ?
-
MRTG is primarily a graphing tool. Nocol is a monitoring package which detects - outages or errors on your devices. All data is quantized in nocol and you lose granularity - (which might or might not be preferable).

The two packages are complements of each - other.

-
-
Where do I get NOCOL
-
The distribution site is at www.netplex-tech.com. It - can also be downloaded via ftp from ftp.navya.com
-
What about support ?
-
NOCOL is freeware, and hence no official support is available. However, it is a popular - product and you can send messages to the nocol-users@navya.com mailing list or - search the Web using any popular Internet search engine (Altavista, Excite) for your - queries.
- You can also email queries to nocol-support@navya.com - which might grow larger in some distant future.
-
What is SNIPS ?
-
NOCOL was originally developed in 1991 and released as freeware. Since then, the - software has almost completely been rewritten and except for the old curses based 'netmon' - interface, not much else remains the same.
- SNIPS (System and Network Integrated Polling Software) - is the next version of nocol (after v4.3) with many new features such as distributed - monitoring for scalability, data graphing, parallel SNMP queries, SNMPv2, MRTG interface - for data collection and much more.

SNIPS will be announced on the nocol-users - mailing list when it is available.

-
-
Is nocol Y2K compliant ?
-
Yes. All events are logged to noclogd in the Unix timestamp format, so the - timestamps are not effected by the Y2K problem.
-
- -
What are the hardware requirements ?
-
Nocol can run very comfortably on any Pentium-100 class Unix machine with 64MB of RAM - and monitor several hundred devices. It is very lightweight in design and implementation.
-
Should I run nocol as root ?
-
NO! You should create a separate user such as 'nocol' or 'snips' and all - monitors should be run by this user. The few monitors which require root priveleges (such - as pingmon or trapmon) are installed as suid root in the nocol bin/ directory.
-
I am getting lots of messages from keepalive_monitors about - restarting
-
Either your system's ps command is not listing the complete program name and - keepalive_monitors is trying to restart the program since it thinks its down or else the - monitor being restarted cannot write the pid or data file and is dying (incorrect owner - and permissions on the nocol/run directory).

If the monitor is not running, - then try running it in debug mode (most monitors will take the -d option for - running in debug mode).

-
-
multiping gives error socket: Operation not permitted
-
multiping requires a raw socket, and needs to be installed suid root. You - probably did not run make root while installing nocol. Check the ownership and - permission of this program- it must show mode -rwsr-x--x with owner - root. If not, do the following:
		chown root multiping
-		chmod 4751 multiping
-	
-
-
Nothing is being logged to noclogd
-
Events are logged ONLY when their state changes. Thus, an event will be logged - to noclogd if a site goes from info level to warning level, etc.
-
 
-
Why do all the events clear when I send kill -HUP to a monitor?
-
Currently the monitors restart when getting a HUP signal and do not preserve the - existing state information. The feature of just re-reading the config file on getting a - HUP signal is yet to be added.
-
- -
Can nocol handle SNMPv2 ?
-
NOCOL currently uses the CMU SNMP software which does not implement SNMPv2. This will be - implemented in the next version (snips).
-
How can I page myself when a site goes down?
-
Assuming that you have an alphanumeric pager and can page yourself using email or any - other perl script, you can page yourself on a particular event by using noclogd and piping - the events to a simple script such as utility/beep_oncall.
- In addition to noclogd, you can also run utility/notifier.pl to page you.
- Paging software such as qpage can be used to do the - actual paging.
-
How do I get notified when a site comes back up ?
-
All monitors in nocol log events to noclogd based on the worst of new severity - or previous severity of an event.

Hence, when a site goes down first, it will be logged - at 'warning' level. If it comes back up, it will be marked as up but will be logged at a - loglevel of 'warning' since that was the old severity. This mechanism allows you to not - only detect when a device goes critical, but also detect when the device comes back up.

-
-
How do I get paged as soon as a site goes down ?
-
In order to avoid false alarms (and prevent operators from getting into the habit of - wait-and-it-will-go-away), NOCOL will escalate any events severity gradually. If you want - to get paged or notified as soon as a site or variable changes, you can watch it at the Warning - level instead of the Critical level.
-
Does nocol run on windows NT ?
-
Nope. No plans to port it at this time either.
-
Who maintains NOCOL ?
-
This software is currently maintained by Vikas - Aggarwal. Numerous authors have made contributions which have been added to the - package.
-
- -

- -

Feedback

- - + + + + + +nocol FAQ + + + + +

NOCOL : Frequently Asked Questions (FAQ)

+ +

+ + +

Last updated Mar 2000

+ +
    +
  1. General +
  2. +
  3. Installation +
  4. +
  5. Miscellaneous +
  6. +
+ +
+ +

GENERAL

+ +
+
What is NOCOL ?
+
nocol (Network Operation Center On-Line) is a collection of + system and network monitoring agents which have a common viewing interface and logging + mechanism. It can be used to monitor your LAN or WAN network devices as well as your Unix + systems and services.
+
 
+
How does NOCOL differ from MRTG ?
+
MRTG is primarily a graphing tool. Nocol is a monitoring package which detects + outages or errors on your devices. All data is quantized in nocol and you lose granularity + (which might or might not be preferable).
+ The two packages are complements of each other.
+
 
+
Where do I get NOCOL ?
+
The distribution site is at www.netplex-tech.com. It + can also be downloaded via ftp from ftp.navya.com
+
 
+
What about support ?
+
NOCOL is freeware, and hence no official support is available. However, it is a popular + product and you can send messages to the nocol-users@navya.com mailing list or + search the Web using any popular Internet search engine (Altavista, Excite) for your + queries.
+ You can also email queries to nocol-support@navya.com + which might grow larger in some distant future.
+
 
+
What is SNIPS ?
+
NOCOL was originally developed in 1991 and released as freeware. Since then, the + software has almost completely been rewritten and except for the old curses based 'netmon' + interface, not much else remains the same.
+ SNIPS (System and Network Integrated Polling Software) + is the next version of nocol (after v4.3) with many new features such as distributed + monitoring for scalability, data graphing, parallel SNMP queries, SNMPv2, MRTG interface + for data collection and much more.

SNIPS will be announced on the nocol-users + mailing list when it is available.

+
+
Is nocol Y2K compliant ?
+
Yes. All events are logged to noclogd in the Unix timestamp format, so the + timestamps are not affected by the Y2K problem.
+
+ +
+ +

INSTALLATION

+ +
+
What are the hardware requirements ?
+
Nocol can run very comfortably on any Pentium-100 class Unix machine with 64MB of RAM + and monitor several hundred devices. It is very lightweight in design and implementation.
+
 
+
Should I run nocol as root ?
+
NO! You should create a separate user such as 'nocol' or 'snips' and all + monitors should be run by this user. The few monitors which require root privileges (such + as pingmon or trapmon) are installed as suid root in the nocol bin/ directory.
+
 
+
I am getting lots of messages from keepalive_monitors about + restarting
+
Either your system's ps command is not listing the complete program name and + keepalive_monitors is trying to restart the program since it thinks it's down, or else the + monitor being restarted cannot write the pid or data file and is dying (incorrect owner + or permissions on the nocol/run directory).

If the monitor is not running, + then try running it in debug mode (most monitors will take the -d option for + running in debug mode).

+
+
multiping gives error socket: Operation not permitted
+
multiping requires a raw socket, and needs to be installed suid root. You + probably did not run make root while installing nocol. Check the ownership and + permission of this program- it must show mode -rwsr-x--x with owner + root. If not, do the following:
		chown root multiping
+
+		chmod 4751 multiping
+
+	
+
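The ownership and mode check described above can be scripted. The following is a minimal sketch, not part of the nocol distribution: the helper names file_mode and is_suid_root are hypothetical, and the path in the usage comment assumes the default /usr/local/nocol install directory.

```shell
# Print the 10-character permission string of a file, e.g. -rwsr-x--x
file_mode() {
    ls -ld "$1" | cut -c1-10
}

# Succeed only if the file shows mode -rwsr-x--x and is owned by root,
# which is what 'chown root' followed by 'chmod 4751' should produce.
is_suid_root() {
    [ "$(file_mode "$1")" = "-rwsr-x--x" ] &&
        [ "$(ls -ld "$1" | awk '{print $3}')" = "root" ]
}

# Usage (path is an assumption based on the default install directory):
#   is_suid_root /usr/local/nocol/bin/multiping || echo "re-run 'make root'"
```

Only the first ten characters of the ls output are compared because some systems append an extra marker after the permission bits.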
+
Nothing is being logged to noclogd
+
Events are logged ONLY when their state changes. Thus, an event will be logged + to noclogd if a site goes from info level to warning level, etc.
+
 
+
Why do all the events clear when I send kill -HUP to a monitor?
+
Currently the monitors restart when getting a HUP signal and do not preserve the + existing state information. The feature of just re-reading the config file on getting a + HUP signal is yet to be added.
+
+ +
+ +

MISC

+ +
+
Can nocol handle SNMPv2 ?
+
NOCOL currently uses the CMU SNMP software which does not implement SNMPv2. This will be + implemented in the next version (snips).
+
 
+
How can I page myself when a site goes down?
+
Assuming that you have an alphanumeric pager and can page yourself using email or any + other perl script, you can page yourself on a particular event by using noclogd and piping + the events to a simple script such as utility/beep_oncall.
+ In addition to noclogd, you can also run utility/notifier.pl to page you.
+ Paging software such as qpage can be used to do the + actual paging.
+
 
+
How do I get notified when a site comes back up ?
+
All monitors in nocol log events to noclogd based on the worst of new severity + or previous severity of an event.

Hence, when a site goes down first, it will be logged + at 'warning' level. If it comes back up, it will be marked as up but will be logged at a + loglevel of 'warning' since that was the old severity. This mechanism allows you to not + only detect when a device goes critical, but also detect when the device comes back up.
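The "worst of new or previous severity" rule can be illustrated with a short sketch. It is not taken from the nocol sources: the function names are hypothetical, and the ordering info < warning < error < critical is an assumption based on the levels named in this documentation.

```shell
# Map a severity name to a rank (assumed ordering, lowest to highest).
severity_rank() {
    case "$1" in
        info)     echo 0 ;;
        warning)  echo 1 ;;
        error)    echo 2 ;;
        critical) echo 3 ;;
        *)        return 1 ;;
    esac
}

# An event is logged only when its severity changes.
should_log() {   # $1 = previous severity, $2 = new severity
    [ "$1" != "$2" ]
}

# The log level used is the worse of the previous and new severity.
log_level() {    # $1 = previous severity, $2 = new severity
    old_rank=$(severity_rank "$1") || return 1
    new_rank=$(severity_rank "$2") || return 1
    if [ "$old_rank" -ge "$new_rank" ]; then echo "$1"; else echo "$2"; fi
}
```

Under these assumptions a site recovering from warning to info is logged at the warning level, which is why a log watcher set to 'warning' sees both the failure and the recovery.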

+
+
How do I get paged as soon as a site goes down ?
+
In order to avoid false alarms (and prevent operators from getting into the habit of + wait-and-it-will-go-away), NOCOL will escalate any event's severity gradually. If you want + to get paged or notified as soon as a site or variable changes, you can watch it at the Warning + level instead of the Critical level.
+
 
+
Can I set up host or variable dependencies in NOCOL?
+
The various displays do not handle dependencies at this time and will require code + enhancements. It is possible to write a dependency based monitor and tie it into noclogd + easily, but this has not been developed yet.
+
 
+
Where can I find an SSL web monitor?
+
This monitor is simple to write but requires linking against an SSL library which might + be subject to US export regulations and hence is not available.
+
 
+
Does nocol run on windows NT ?
+
Nope. No plans to port it at this time either.
+
 
+
Who maintains NOCOL ?
+
This software is currently maintained by Vikas + Aggarwal. Numerous authors have made contributions which have been added to the + package.
+
+ +

+ +

Feedback

+ + diff -Naur nocol-4.3/html/index.html nocol-4.3.1/html/index.html --- nocol-4.3/html/index.html Thu Jan 27 00:10:06 2000 +++ nocol-4.3.1/html/index.html Mon Mar 27 22:07:13 2000 @@ -1,104 +1,106 @@ - - - - - - - -NOCOL SNIPS Network Monitoring & Management Home Page - - - - - - - - - - - - - - - - - - - -
      <h1>NOCOL Network Monitoring Software</h1> 

 

     

Current Version 4.3

-

NOCOL/SNIPS is a system and network monitoring software that runs on Unix - systems and can poll network and system devices. It is capable of monitoring nameservers, - web ports, host performance, syslogs, radius servers, BGP peers, etc. New monitors can be - added easily (via a C or Perl API).

-

All monitors have a common display and postprocessing interface (logging, notification, - etc.) This design allows running only one set of monitors and any number of displays each - seeing the same consistent data.

-

False alarms are avoided by escalating events through severity levels- hence if a site - is unreachable, the site will be tested 2 more times before finally indicating that it is - 'critical'. All events are logged, and the operator has the capability to decide which - level to view the events at.

-

Available monitors are:

-
- - - - - - - - - - - - - - - - - - - - - - - - - - -
ICMP pingRPC portmapperOSI ping
Ethernet loadTCP portsNameserver
Radius serverSyslog messagesMailq
NTPUPS (APC) batteryUnix host perf
BGP peersSNMP variablesData throughput
-
-

Click here to see a sample of the web interface.

-

The software is available from http://www.netplex-tech.com/software/nocol - or via ftp from ftp.navya.com.

-

nocol-users@navya.com is a mailing list for general discussion of nocol. Click here to subscribe to this mailing list - (send subscribe in the BODY of your email).

-

Send bug reports to nocol-bugs@navya.com

-
-

The latest versions of these documents can be found online

-
-

Feedback

-

-

Netplex Technologies Inc.

- - + + + + + + + +NOCOL SNIPS Network Monitoring & Management Home Page + + + + + + + + + + + + + + + + + + + +
      <h1>NOCOL Network Monitoring Software</h1> 

 

     

Current Version 4.3.1

+

NOCOL/SNIPS is a system and network monitoring software that runs on Unix + systems and can poll network and system devices. It is capable of monitoring nameservers, + web ports, host performance, syslogs, radius servers, BGP peers, etc. New monitors can be + added easily (via a C or Perl API).

+

All monitors have a common display and postprocessing interface (logging, notification, + etc.) This design allows running only one set of monitors and any number of displays each + seeing the same consistent data.

+

False alarms are avoided by escalating events through severity levels- hence if a site + is unreachable, the site will be tested 2 more times before finally indicating that it is + 'critical'. All events are logged, and the operator has the capability to decide which + level to view the events at.
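The escalation idea can be sketched as a small state function. This is an illustration only, not the monitors' actual code: the name next_severity is hypothetical, and the steps info, warning, error, critical (one step per failed test, reset on success) are an assumption consistent with a site reaching 'critical' on its third consecutive failed test.

```shell
# next_severity CURRENT RESULT
#   CURRENT: current severity of the event
#   RESULT:  outcome of the latest test, 'up' or 'down'
next_severity() {
    if [ "$2" = "up" ]; then
        echo info            # a successful test clears the event
        return
    fi
    case "$1" in
        info)    echo warning ;;
        warning) echo error ;;
        *)       echo critical ;;   # error or critical stays/becomes critical
    esac
}

# Usage: three failed tests in a row take a site from info to critical.
#   s=info
#   for r in down down down; do s=$(next_severity "$s" "$r"); done
```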

+

Available monitors are:

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
ICMP pingRPC portmapperOSI ping
Ethernet loadTCP portsNameserver
Radius serverSyslog messagesMailq
NTPUPS (APC) batteryUnix host perf
BGP peersSNMP variablesData throughput
+
+

Click here to see a sample of the web interface.

+

The software is available from http://www.netplex-tech.com/software/nocol + or via ftp from ftp.navya.com.

+

nocol-users@navya.com is a mailing list for general discussion of nocol. Click here to subscribe to this mailing list + (send subscribe in the BODY of your email).

+

Send bug reports to nocol-bugs@navya.com

+
+

The latest versions of these documents can be found online

+
+

Feedback

+

+

Netplex Technologies Inc.

+ + diff -Naur nocol-4.3/html/opsguide.html nocol-4.3.1/html/opsguide.html --- nocol-4.3/html/opsguide.html Wed Jan 26 23:44:52 2000 +++ nocol-4.3.1/html/opsguide.html Mon Mar 27 22:09:01 2000 @@ -1,214 +1,221 @@ - - - -NOCOL Operations Guide - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  NOCOL Operations - Guide

Version 4.3
- Last Updated Jan 22, 2000

 
  Contents
  Running NOCOL
    -
  • File locations
  • -
  • Selecting the monitors
  • -
  • Configuration files
  • -
  • noclogd
  • -
  • Routine Maintenance
  • -
-

User Interfaces

    -
  • netconsole
  • -
  • webNocol
  • -
  • tkNocol
  • -
-

Notifications - & Reports

    -
  • SMS Paging
  • -
  • Email
  • -
  • Reports
  • -
-

You must read the Installation document - prior to reading this Operations guide.

 
-
  Running - NOCOL
 

File Locations

-

The main directory where nocol gets installed is - specified at compile time (default is set to /usr/local/nocol). Under this directory, the - following sub-directories exist:

-
- - - - - - - - - - - - - - - - - - - - - -
bin/All monitors and utility scripts are in this directory.
data/The raw data collected by the monitors
etc/All configuration files, and the snmp MIB file.
msgs/All files in this directory are displayed in the - 'netconsole'  msgs subwindow.
run/The PID files for all the monitors (used to ensure only - one copy of a monitor runs at a time), and the error
-
-

Running the Monitors

-

Nocol has a large number of independent monitors- - all desired monitors should be listed in the keepalive_monitors script (the - variable PROGRAMS). This script is run periodically from crontab and ensures that all the - desired monitors are running (the crontab.nocol file is installed into - cron during the installation steps).

-

Generally the monitors do not need any command - line argument- the name and location of the configuration file and the data directory is - compiled into the monitors. However, you can always specify an alternate config file or - output data file using the '-c' or the '-o' command line options respectively. All - monitors also accept the '-d' flag to indicate debug mode, in which case they write - verbose error messages to the stderr. If started from keepalive_monitors, these error - messages are stored in the run/xxxx.error  file.

-

Configuration Files

-

The configuration file for each monitor is - located in the etc/ directory. Each of these files should be edited for your site. Note - that in most monitors, the 'name' of the device is not used by the monitor, but is - basically a operator friendly name for the device.

-

Currently, sending a HUP signal to the monitors - does NOT cause them to re-read the configuration file and preserve the existing state of - the variables being monitored.

-

noclogd - the Logging Daemon

-

The noclogd daemon listens on port 5354 of the - logging host for any events sent by the monitors. The name of the host where noclogd runs - is compiled into all the monitors and is not configurable in their config files at this - time.

-

The noclogd process is similar to the Unix - 'syslog' daemon and the configuration file allows piping the logged events to any external - process. To prevent any random host from sending it any messages, the list of allowed IP - addresses (which can log to it) is listed in the noclogd configuration file.

-

Since this process can run external programs, it - is used to run the pager notification scripts, etc. This program can be used to log - messages to a database, send emails, etc.

-

It should be noted that an 'event' in nocol is - generated only when a value crosses a threshold in any polling - interval. Hence, normally you will not see any logging activity in noclogd, but when a - device variable changes its state, an event will be logged.

-

Routine Maintenance

-

Routine admin tasks in nocol consist of ensuring - that all the monitors are running (done by running keepalive_monitors from cron), -  and rotating all the log files maintained by noclogd (done by running log-maint - periodically from crontab). See the sample nocol.crontab - for achieving these tasks.

 
-
  User - Interfaces
 

Netconsole

-

There are three different user interfaces to view - the nocol data. The simplest of them all is netconsole,  which is a - non-graphical, curses based tool for displaying the raw data being collected by the - monitors. Any user on the system where the monitors are running can run this tool.

-

WebNocol

-

The Web interface for displaying nocol data is - divided into two scripts- genweb.pl which runs periodically from crontab - and generates 4 web pages (one for each severity level). The other program is a CGI script - webnocol.cgi, which gives added functionality to the user such as - troubleshooting, adding notes for an event, hiding a known event, etc. This script has its - own built in access control based on the user, but as an alternative the typical .htaccess - method can easily be used.

-

tkNocol

-

This is a Tcl/tk based monitor using - client-server technology. A simple daemon (called 'ndaemon') runs on the nocol machine - listening on TCP port 5005 and all it does is periodically send the nocol raw data to all - connected clients. The client displays then parse and format/display this nocol raw data. - ndaemon has no access control at this time, so it is important to put a firewall to - restrict unauthorized access to ndaemon's TCP port.

-

Note that none of these interfaces displays - historical data from 'noclogd'- they all work directly on the data being collected by the - monitors which represents the current state of the network.

 
-
  Notifications - & Reports
  A very flexible notification script called  - 'notifier.pl'  is provided with nocol which has a configuration file - describing the type of event and required action. Currently the possible actions are  - mail and page.

A minimum - and maximum age of the event can be defined indicating that the action should be taken - (paging or email) only if the age of the event lies between these two values (in seconds). - An option exists to allow 'repeat' notification (once every hour) until the age is - exceeded.

-

Currently the only reporting tool for historical - analysis is 'logstats' which parses the historical noclogd event logs and generates a - simple summary report. This is run by the 'log-maint' script which in turn is run - periodically from crontab.

- -
- -
- Vikas Aggarwal -
- - + + + + + +NOCOL Operations Guide + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
  NOCOL Operations + Guide

Version 4.3
+ Last Updated: Mar 22, 2000

 
  Contents
  Running NOCOL
    +
  • File locations
  • +
  • Selecting the monitors
  • +
  • Configuration files
  • +
  • noclogd
  • +
  • Routine Maintenance
  • +
+

User Interfaces

    +
  • netconsole
  • +
  • webNocol
  • +
  • tkNocol
  • +
+

Notifications + & Reports

    +
  • SMS Paging
  • +
  • Email
  • +
  • Reports
  • +
+

You must read the Installation document + prior to reading this Operations guide.

 
+
  Running + NOCOL
 

File Locations

+

The main directory where nocol gets installed is + specified at compile time (default is set to /usr/local/nocol). Under this directory, the + following sub-directories exist:

+
+ + + + + + + + + + + + + + + + + + + + + +
bin/All monitors and utility scripts are in this directory.
data/The raw data collected by the monitors
etc/All configuration files, and the snmp MIB file.
msgs/All files in this directory are displayed in the + 'netconsole'  msgs subwindow.
run/The PID files for all the monitors (used to ensure only + one copy of a monitor runs at a time), and the error
+
+

Running the Monitors

+

Nocol has a large number of independent monitors- + all desired monitors should be listed in the keepalive_monitors script (the + variable PROGRAMS). This script is run periodically from crontab and ensures that all the + desired monitors are running (the crontab.nocol file is installed into + cron during the installation steps).

+

Generally the monitors do not need any command + line argument- the name and location of the configuration file and the data directory is + compiled into the monitors. However, you can always specify an alternate config file or + output data file using the '-c' or the '-o' command line options respectively. All + monitors also accept the '-d' flag to indicate debug mode, in which case they write + verbose error messages to the stderr. If started from keepalive_monitors, these error + messages are stored in the run/xxxx.error  file.

+

Configuration Files

+

The configuration file for each monitor is + located in the etc/ directory. Each of these files should be edited for your site. Note + that in most monitors, the 'name' of the device is not used by the monitor, but is + basically an operator-friendly name for the device.

+

Currently, sending a HUP signal to a monitor + restarts it. It does NOT simply re-read the configuration file while preserving the existing state of + the variables being monitored.

+

noclogd - the Logging Daemon

+

The noclogd daemon listens on port 5354 of the + logging host for any events sent by the monitors. The name of the host where noclogd runs + is compiled into all the monitors and is not configurable in their config files at this + time.

+

The noclogd process is similar to the Unix + 'syslog' daemon and the configuration file allows piping the logged events to any external + process. To prevent any random host from sending it any messages, the list of allowed IP + addresses (which can log to it) is listed in the noclogd configuration file.

+

Since this process can run external programs, it + is used to run the pager notification scripts, etc. This program can be used to log + messages to a database, send emails, etc.

+

It should be noted that an 'event' in nocol is + generated only when a value crosses a threshold in any polling + interval. Hence, normally you will not see any logging activity in noclogd, but when a + device variable changes its state, an event will be logged. This means that an event will + be sent by a monitor to noclogd both when it goes down (e.g. from info level to warning + level) and also when it comes back up (e.g. warning level to info level).

+

Routine Maintenance

+

Routine admin tasks in nocol consist of ensuring + that all the monitors are running (done by running keepalive_monitors from cron), +  and rotating all the log files maintained by noclogd (done by running log-maint + periodically from crontab). See the sample nocol.crontab + for achieving these tasks.

 
+
  User + Interfaces
 

Netconsole

+

There are three different user interfaces to view + the nocol data. The simplest of them all is netconsole,  which is a + non-graphical, curses based tool for displaying the raw data being collected by the + monitors. Any user on the system where the monitors are running can run this tool.

+

WebNocol

+

The Web interface for displaying nocol data is + divided into two scripts- genweb.pl which runs periodically from crontab + and generates 4 web pages (one for each severity level). The other program is a CGI script + webnocol.cgi, which gives added functionality to the user such as + troubleshooting, adding notes for an event, hiding a known event, etc. This script has its + own built in access control based on the user, but as an alternative the typical .htaccess + method can easily be used.

+

tkNocol

+

This is a Tcl/tk based monitor using + client-server technology. A simple daemon (called 'ndaemon') runs on the nocol machine + listening on TCP port 5005 and all it does is periodically send the nocol raw data to all + connected clients. The client displays then parse and format/display this nocol raw data. + ndaemon has no access control at this time, so it is important to put a firewall in place to + restrict unauthorized access to ndaemon's TCP port.

+

Note that none of these interfaces displays + historical data from 'noclogd'- they all work directly on the data being collected by the + monitors which represents the current state of the network.

 
+
  Notifications + & Reports
  A very flexible notification script called  + 'notifier.pl'  is provided with nocol which has a configuration file + describing the type of event and required action. Currently the possible actions are  + mail and page. A minimum and maximum age of the event can be defined + indicating that the action should be taken (paging or email) only if the age of the event + lies between these two values (in seconds). An option exists to allow 'repeat' + notification (once every hour) until the age is exceeded.
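The age window and hourly repeat can be pictured with a small decision function. This is a sketch under stated assumptions, not notifier.pl's implementation: the function name, its argument order, the inclusive window bounds and the 'yes' repeat flag are all hypothetical, since the configuration file format is not shown here.

```shell
# should_notify AGE MIN_AGE MAX_AGE SINCE_LAST REPEAT
#   AGE         age of the event in seconds
#   MIN_AGE     lower bound of the window (assumed inclusive)
#   MAX_AGE     upper bound of the window (assumed inclusive)
#   SINCE_LAST  seconds since the last notification, empty if none sent yet
#   REPEAT      'yes' to repeat once every hour inside the window
should_notify() {
    age=$1 min_age=$2 max_age=$3 since_last=$4 repeat=$5

    # Outside the configured age window: never notify.
    [ "$age" -ge "$min_age" ] && [ "$age" -le "$max_age" ] || return 1

    # First notification for this event.
    [ -z "$since_last" ] && return 0

    # Repeat at most once an hour while still inside the window.
    [ "$repeat" = "yes" ] && [ "$since_last" -ge 3600 ]
}

# Usage: should_notify 600 300 7200 "" no && echo "send page"
```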

A more 'event' driven notification system can be written by using noclogd. + Any event can be piped to an external script by noclogd, so a page or email can be sent as + soon as an event occurs and is logged to noclogd. As an example, look at the 'utility/beep_oncall' + script.

+

Currently the only reporting tool for historical + analysis is 'logstats' which parses the historical noclogd event logs and + generates a simple summary report. This is run by the 'log-maint' script + which in turn is run periodically from crontab.

+ +
+ +
+ Vikas Aggarwal +
+ + diff -Naur nocol-4.3/html/release.html nocol-4.3.1/html/release.html --- nocol-4.3/html/release.html Wed Jan 26 23:51:25 2000 +++ nocol-4.3.1/html/release.html Mon Mar 27 22:31:48 2000 @@ -1,5 +1,5 @@ - + @@ -49,6 +49,45 @@

Release Notes

+

nocol v4.3.1 (Mar 2000)

+ +

Minor release to fix patches.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
1.portmon.c Missing close() left too many file + descriptors
+ Now running check_resp() after receiving EOF from the remote host. Should fix problem + of receiving data with no \n in entire data stream.
2.snmpgeneric Allow specifying client port number (joe@hole-in-the.net)
+ Sets MIBFILE_v2 variable also for the mib file
3.webnocol.cgiSmall fix to prevent possible loop.
4.nocollib.plChanged 'ps' to '/bin/ps'
5.hostmon-clientChanged 'ps' to '/bin/ps'
+
+

nocol v4.3 (Jan 2000)

@@ -320,7 +359,9 @@

Feedback

diff -Naur nocol-4.3/include/version.h nocol-4.3.1/include/version.h --- nocol-4.3/include/version.h Thu Jan 20 18:33:02 2000 +++ nocol-4.3.1/include/version.h Mon Mar 27 23:33:25 2000 @@ -1,5 +1,5 @@ /* - * $Id: version.h,v 4.3 2000/01/20 23:32:53 vikas Exp $ + * $Id: version.h,v 4.3 2000/03/28 04:33:05 vikas Exp $ */ /* @@ -24,6 +24,6 @@ #ifndef _NOCOLVERSION_ # define _NOCOLVERSION_ - static char nocol_version[] = "$Revision: 4.3 $" ; + static char nocol_version[] = "$Revision: 4.3 $ (4.3.1)" ; #endif diff -Naur nocol-4.3/perlnocol/hostmon-osclients/hostmon-client nocol-4.3.1/perlnocol/hostmon-osclients/hostmon-client --- nocol-4.3/perlnocol/hostmon-osclients/hostmon-client Fri Nov 5 16:47:18 1999 +++ nocol-4.3.1/perlnocol/hostmon-osclients/hostmon-client Mon Mar 27 01:51:48 2000 @@ -1,6 +1,6 @@ #!/usr/local/bin/perl # -# $Header: /home/vikas/src/nocol/perlnocol/hostmon-osclients/RCS/hostmon-client,v 2.5 1999/11/05 21:46:54 vikas Exp $ +# $Header: /home/vikas/src/nocol/perlnocol/hostmon-osclients/RCS/hostmon-client,v 2.6 2000/03/27 06:51:26 vikas Exp $ # # hostmon-client.main # @@ -303,7 +303,7 @@ $ostype= `uname -s -r -m` ; chop $ostype; # OS, revision, arch $osfile = "hostmon-client."; $debug && print STDERR "OSTYPE = $ostype\n"; - $PSCMD = "ps"; # ps command to allow pid on cmd line. Autoset below + $PSCMD = "/bin/ps"; # ps command to allow pid on cmd line. 
Autoset below # set boolean values for OS's @@ -330,9 +330,9 @@ ## force units to 1024 blocks for df and vmstat for SVR4 # $ENV{'BLOCKSIZE'} = "1024"; - # if ($ostype =~ /solaris|irix/i) { $PSCMD = "/bin/ps -p"; } + #if ($ostype =~ /SunOS\s+5/) { $PSCMD = "/bin/ps -p"; } local ($status) = grep(/usage/i, `$PSCMD 1 2>&1`); - if ($status == 1) { $PSCMD = "ps -f -p" ;} + if ($status == 1) { $PSCMD = "/bin/ps -f -p" ;} } sub standalone { diff -Naur nocol-4.3/perlnocol/nocollib.pl nocol-4.3.1/perlnocol/nocollib.pl --- nocol-4.3/perlnocol/nocollib.pl Fri Nov 5 16:59:34 1999 +++ nocol-4.3.1/perlnocol/nocollib.pl Mon Mar 27 22:28:54 2000 @@ -1,6 +1,6 @@ #!/usr/local/bin/perl # -# $Header: /home/vikas/src/nocol/perlnocol/RCS/nocollib.pl,v 1.15 1999/11/05 21:59:13 vikas Exp $ +# $Header: /home/vikas/src/nocol/perlnocol/RCS/nocollib.pl,v 1.16 2000/03/28 03:28:43 vikas Exp $ # # nocollib.pl - perl library of NOCOL routines # @@ -316,10 +316,10 @@ local($me,$dir)=@_; local($localhost)=`hostname`; chop ($localhost); local($pid,$host); - local($PSCMD) = "ps"; # cmd to see process by giving a pid + local($PSCMD) = "/bin/ps"; # cmd to see process by giving a pid local ($status) = scalar grep(/usage/i, `$PSCMD 1 2>&1`); - if ($status >= 1) { $PSCMD = "ps -p" ;} # hope no other flags needed + if ($status >= 1) { $PSCMD = "/bin/ps -p" ;} # hope no other flags needed $pidfile= "$dir/$me.pid"; # cannot be local() diff -Naur nocol-4.3/perlnocol/snmpgeneric nocol-4.3.1/perlnocol/snmpgeneric --- nocol-4.3/perlnocol/snmpgeneric Mon Nov 8 23:21:45 1999 +++ nocol-4.3.1/perlnocol/snmpgeneric Tue Mar 21 16:29:23 2000 @@ -1,6 +1,6 @@ #!/usr/bin/perl ## -# $Header: /home/vikas/src/nocol/perlnocol/RCS/snmpgeneric,v 1.3 1999/11/09 04:21:23 vikas Exp $ +# $Header: /home/vikas/src/nocol/perlnocol/RCS/snmpgeneric,v 1.5 2000/03/21 21:28:57 vikas Exp $ # # snmpgeneric - perl monitor for generic SNMP variables. # Directly monitors SNMP variables from the hosts listed. 
@@ -47,6 +47,7 @@ #$mibfile = "$etcdir/mibII.txt" ; # location of MIB file SET_THIS $mibfile = "$etcdir/mib-v2.txt" ; # location of MIB file SET_THIS $ENV{"MIBFILE"}= $mibfile ; +$ENV{"MIBFILE_v2"}= $mibfile ; $numtries = 2; # number of times to try and connect before failing @@ -129,7 +130,9 @@ local ($acount, $isok) = (0, 0); # $host is the text hostname $router is the IP - + # Check if a port number is attached to $router:$port + ($router, $port) = split(/:/, $router, 2); + if ($debug) { print "Checking $router\n"; @@ -150,8 +153,14 @@ $myoid=$oid{$item}; # get the OID ready $myoid =~ s/^\+(.*)$/\1/; # remove the leading + - - $cmd = "$snmpwalk $router $community{$item} $myoid"; + + if ($port) { + $cmd = "$snmpwalk -p $port $router $community{$item} $myoid"; + } + else { + $cmd = "$snmpwalk $router $community{$item} $myoid"; + } + open (WALK, "$cmd |") || die "Could not run \"$cmd\"\n"; while () { @@ -181,7 +190,12 @@ $tries=$numtries; while ((! (($active =~ /INTEGER/)||($active =~ /Timeticks/)) ) && ($tries) ) { - $cmd = "$snmpget $router $community{$item} $oid{$item}"; + if ($port) { + $cmd = "$snmpget -p $port $router $community{$item} $oid{$item}"; + } + else { + $cmd = "$snmpget $router $community{$item} $oid{$item}"; + } print "cmd=$cmd\n" if $debug; $active = `$cmd`; print "$active" if $debug; diff -Naur nocol-4.3/perlnocol/snmpgeneric-confg nocol-4.3.1/perlnocol/snmpgeneric-confg --- nocol-4.3/perlnocol/snmpgeneric-confg Tue Oct 26 00:34:03 1999 +++ nocol-4.3.1/perlnocol/snmpgeneric-confg Tue Mar 21 16:24:16 2000 @@ -3,7 +3,7 @@ # Ed Landa (elanda@comstar.net) May 1999 # ### -# HOST IP SNMP-OID VARNAME COMMUNITY WARN ERROR CRITICAL [COMP] +# HOST IP[:PORT] SNMP-OID VARNAME COMMUNITY WARN ERROR CRITICAL [COMP] # # If OID starts with a plus, then we will walk that OID and add up values that # are returned from the COMP evalution. 
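The IP[:PORT] handling added to snmpgeneric can be mirrored in shell for illustration. This is a sketch, not part of the patch: the function names are hypothetical, the address in the usage comment is a documentation placeholder, and the command is only printed, never executed. The '-p' option is the one the patch passes to snmpwalk.

```shell
# split_router_port "IP[:PORT]"
# Sets $router and $port; $port is empty when no port is given.
# Splits at the first colon only, like the Perl split(/:/, $router, 2).
split_router_port() {
    case "$1" in
        *:*) router=${1%%:*}; port=${1#*:} ;;
        *)   router=$1; port= ;;
    esac
}

# build_walk_cmd IP[:PORT] COMMUNITY OID
# Prints the snmpwalk command line, adding '-p PORT' only when a port is set.
build_walk_cmd() {
    split_router_port "$1"
    if [ -n "$port" ]; then
        echo "snmpwalk -p $port $router $2 $3"
    else
        echo "snmpwalk $router $2 $3"
    fi
}

# Usage: build_walk_cmd 192.0.2.1:1161 public 1.3.6.1.2.1.1.3
```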
diff -Naur nocol-4.3/portmon/portmon.c nocol-4.3.1/portmon/portmon.c --- nocol-4.3/portmon/portmon.c Thu Jan 20 00:28:27 2000 +++ nocol-4.3.1/portmon/portmon.c Thu Apr 6 15:32:26 2000 @@ -1,5 +1,5 @@ /* #define DEBUG /* */ -/* $Header: /home/vikas/src/nocol/portmon/RCS/portmon.c,v 2.3 2000/01/20 05:27:18 vikas Exp $ */ +/* $Header: /home/vikas/src/nocol/portmon/RCS/portmon.c,v 2.5 2000/04/06 19:31:44 vikas Exp $ */ /* Copyright 1994 Vikas Aggarwal, vikas@navya.com */ @@ -11,7 +11,7 @@ * response string. * * CAVEATS: - * 1) Uses 'strstr' and not a real regular expression. Case insensitive. + * 1) Uses case insensitive 'strstr' and not a real regular expression. * 2) Looks only at the first buffer of the response unless using the * timeouts to calculate the response time. * 3) Does not implement milli-second timers while reading responses @@ -23,6 +23,15 @@ /* * $Log: portmon.c,v $ + * Revision 2.5 2000/04/06 19:31:44 vikas + * Now replaces all '\0' in the read stream with a '\n'. Needed by + * a user who has a host which terminates lines with \0 + * + * Revision 2.4 2000/04/04 04:46:22 vikas + * Added close() to prevent max file open problem. + * Better error logging. + * Now checking for any read buffer even if \n not found. + * * Revision 2.3 2000/01/20 05:27:18 vikas * Fixed error in processing sites where we are just testing connectivity. * Needed to call process_site() if the responselist was null. 
@@ -68,7 +77,7 @@ /* */ #ifndef lint -static char rcsid[] = "$Id: portmon.c,v 2.3 2000/01/20 05:27:18 vikas Exp $" ; +static char rcsid[] = "$Id: portmon.c,v 2.5 2000/04/06 19:31:44 vikas Exp $" ; #endif #include "portmon.h" @@ -216,7 +225,7 @@ FD_ZERO(&rset); FD_ZERO(&wset); nleft = nhosts; - if (debug > 1) fprintf(stderr, "testing %d sites simultaneously\n", nhosts); + if (debug > 1) fprintf(stderr, "Testing %d sites simultaneously\n", nhosts); /* issue a non-blocking connect to each host */ for (i = 0; i < nhosts; ++i) @@ -241,8 +250,8 @@ errno != EINPROGRESS) { /* some error */ if (debug) - fprintf(stderr, "connect() failed for %s- %s\n", (hv[i])->ipaddr, - sys_errlist[errno]); + fprintf(stderr, "connect() failed for %s:%d- %s\n", + hv[i]->ipaddr, hv[i]->port, sys_errlist[errno]); close(sockarray[i]); --nleft; } @@ -275,7 +284,7 @@ #ifdef DEBUG if (debug > 2) fprintf(stderr, - "checkports() calling select(), timeout %ld, nleft= %d\n", + "\ncheckports() calling select(), timeout %ld, nleft= %d\n", tval.tv_sec, nleft); #endif if ( (n = select(maxfd + 1, &rs, &ws, NULL, &tval)) <= 0) @@ -320,9 +329,11 @@ else fprintf(stderr, "%s\n", sys_errlist[errno]); } FD_CLR(sockarray[i], &rset); FD_CLR(sockarray[i], &wset); + close(sockarray[i]); /* ensure this is closed */ --nleft; if (debug > 2) - fprintf(stderr, "DONE %d. %s\n", i, hv[i]->ipaddr); + fprintf(stderr, "DONE #%d. %s:%d (%s)\n", + i, hv[i]->hname, hv[i]->port, hv[i]->ipaddr); } else /* ready for reading/writing and connected */ { @@ -338,7 +349,8 @@ close(sockarray[i]); /* ensure this is closed */ --nleft; if (debug > 2) - fprintf(stderr, "DONE %d. %s\n", i, hv[i]->ipaddr); + fprintf(stderr, "DONE #%d. 
%s:%d (%s)\n", + i, hv[i]->hname, hv[i]->port, hv[i]->ipaddr); } if (hv[i]->wptr == NULL || *(hv[i]->wptr) == '\0') @@ -353,13 +365,11 @@ /* here if timeout or all the connects have been processed */ - if (nleft) - { - if (debug > 1) - fprintf(stderr, " %d sites unprocessed (no response)\n", nleft); - for (i = 0; i < nhosts; ++i) - close(sockarray[i]); - } + if (nleft && debug > 1) + fprintf(stderr, " %d sites unprocessed (no response)\n", nleft); + + for (i = 0; i < nhosts; ++i) /* close any open sockets */ + close(sockarray[i]); return (0); @@ -410,15 +420,15 @@ { #ifdef DEBUG if (debug > 1) - fprintf(stderr, " (debug) host %s:%d- sent %d bytes\n", + fprintf(stderr, " (debug) Host %s:%d- sent %d bytes\n", h->hname, h->port, n); #endif return 0; } if (debug) - fprintf(stderr, " (debug) %s: host %s:%d- Sent string '%s'\n", - prognm, h->hname, h->port, h->writebuf) ; + fprintf(stderr, " (debug) Host %s:%d- Sent string '%s'\n", + h->hname, h->port, h->writebuf) ; return 1; @@ -433,7 +443,7 @@ int sock; /* connected socket */ struct _harray *h; { - int n; + int i, n; int sflags; int buflen, maxsev; register char *r, *s; @@ -461,8 +471,13 @@ return(1); /* done testing */ } - buflen = h->readbuf + sizeof(h->readbuf) - h->rptr; + /* now fill any remaining read buffer space we have */ + buflen = h->readbuf + sizeof(h->readbuf) - h->rptr; /* amount we can read*/ n = read(sock, h->rptr, buflen - 1); +#ifdef DEBUG /* */ + if (debug > 2) + fprintf(stderr, " read %d bytes from %s:%d\n", n, h->hname, h->port); +#endif /* */ if (n < 0) { /* read() error */ if (errno == EWOULDBLOCK) @@ -480,47 +495,54 @@ h->status = 0; /* mark as down */ return(1); /* finished testing */ } /* end if (n < 0) */ - else if (n == 0) /* end of file */ + + /* if n==0, then we have read end of file, so do a final check_resp() */ + + /* replace any \0 in the stream with a \n */ + for (i = 0; i < n ; ++i) + if ((h->rptr)[i] == '\0') + (h->rptr)[i] = '\n'; + + (h->rptr) += n; /* increment pointer */ + 
*(h->rptr) = '\0'; + + if (n > 0 && (r = (char *)strrchr(h->readbuf, '\n')) == NULL) /* no \n */ { - if (h->quitstr && *(h->quitstr)) - write(sock, h->quitstr, strlen(h->quitstr)); - close (sock); - h->testseverity = h->connseverity; - h->status = 0; /* mark as down */ - return 1; /* finished testing */ + if ( n < (buflen - 1) ) /* remaining empty buffer space */ + return (0); /* need to continue reading */ + else /* filled buffer, but no \n yet */ + r = h->rptr - 32; /* set to about 32 chars back from end */ } -#ifdef DEBUG - if (debug > 2) - fprintf(stderr, " read %d bytes from %s\n", n, h->hname); -#endif + /* here if end-of-file or found a newline */ + maxsev = check_resp(h->readbuf, h->responselist); - (h->rptr) += n; /* increment pointer */ - *(h->rptr) = '\0'; - if ( (r = (char *)strrchr(h->readbuf, '\n')) == NULL && n < (buflen - 1) ) - return (0); /* need to continue reading */ - - if (r == NULL) /* filled buffer, but no \n yet */ - r = h->rptr - 20; /* set to about 20 chars back from end */ - if ( (maxsev = check_resp(h->readbuf, h->responselist)) == -1 ) + if (maxsev == -1) { /* no match in response list */ - for (++r, s = h->readbuf; *r; ) - *s++ = *r++; /* shift stuff after \n to start of readbuf */ - h->rptr = s; /* point to next location to be read */ - return 0; /* still not done */ + if (n == 0) /* end of file, so we are done */ + maxsev = h->connseverity; + else + { + for (++r, s = h->readbuf; *r; ) + *s++ = *r++; /* shift stuff after \n to start of readbuf */ + *s = '\0'; /* lets be safe */ + h->rptr = s; /* point to next location to be read */ + return 0; /* still not done */ + } } - - /* Here if we found a match in check_resp() */ - - if (h->timeouts[0] != 0) - { /* we are checking port speed */ - if (debug > 1) - fprintf(stderr," (debug) elapsed time= %ld secs\n", h->elapsedsecs); - if (h->elapsedsecs < h->timeouts[0]) maxsev = E_INFO; - else if (h->elapsedsecs < h->timeouts[1]) maxsev = E_WARNING; - else if (h->elapsedsecs < h->timeouts[2]) 
maxsev = E_ERROR; - else maxsev = E_CRITICAL; + else + { /* Here if we found a match in check_resp() */ + if (h->timeouts[0] != 0) + { /* we are checking port speed */ + if (debug > 1) + fprintf(stderr," (debug) elapsed time= %ld secs\n", h->elapsedsecs); + if (h->elapsedsecs < h->timeouts[0]) maxsev = E_INFO; + else if (h->elapsedsecs < h->timeouts[1]) maxsev = E_WARNING; + else if (h->elapsedsecs < h->timeouts[2]) maxsev = E_ERROR; + else maxsev = E_CRITICAL; + } } + if (debug) fprintf(stderr," (debug) process_host(%s:%d): returning severity %d\n", h->hname, h->port, maxsev); @@ -539,8 +561,7 @@ /*+ * FUNCTION: - * Check the list of responses. Notice doing a strstr() which is NOT - * case sensitive. + * Check the list of responses using Strcasestr() */ check_resp(readstr, resarr) char *readstr; @@ -549,7 +570,7 @@ struct _response *r; if (debug > 1) - fprintf(stderr, " (debug) %s: Checking response '%s'\n", prognm,readstr); + fprintf(stderr, " (debug) check_resp() '%s'\n", readstr); for (r = resarr; r ; r = r->next) { @@ -559,14 +580,14 @@ { #ifdef DEBUG if (debug > 1) - fprintf(stderr," (debug) check_resp(): Matched '%s'\n", r->response); + fprintf(stderr," (debug) check_resp(): Matched '%s'\n", r->response); #endif return(r->severity); } } /* for() */ if (debug) - fprintf (stderr, " check_resp(): No response matched for site\n"); + fprintf (stderr, " check_resp(): No response matched for site\n"); return(-1); /* No response matched given list */ @@ -664,7 +685,7 @@ readconfig(fdout) int fdout ; /* output data filename */ { - int mxsever ; + int mxsever, i; char *j1; /* temp string pointers */ FILE *cfd ; EVENT v; /* Defined in NOCOL.H */ @@ -705,7 +726,7 @@ while(fgetLine(cfd, record, MAXLINE - 3) > 0 ) /* keeps the \n */ { static int skiphost; - int port ; + int port; int checkspeed = 0; int readquitstr = 0; struct sockaddr_in sin; /* temporary */ @@ -925,12 +946,13 @@ fclose (cfd); /* Not needed any more */ if (debug > 1) - for (h = hostlist ; h; h = h->next) 
+ for (h = hostlist, i=0 ; h; h = h->next) { - fprintf(stderr, "Host=%s %s :%d, Sendstr=%s, MaxSev= %d\n", h->hname, - h->ipaddr, h->port, (h->writebuf ? h->writebuf : ""), h->connseverity); + fprintf(stderr, "#%d. Host=%s %s :%d, MaxSev= %d, Sendstr=%s", + i++, h->hname, h->ipaddr, h->port, h->connseverity, + (h->writebuf ? h->writebuf : "\n")); for (r = h->responselist; r; r = r->next) - fprintf(stderr, "\t%s (%d)\n", r->response, r->severity); + fprintf(stderr, "\t%s (sev=%d)\n", r->response, r->severity); } return(1); /* All OK */ diff -Naur nocol-4.3/webnocol/genweb.pl nocol-4.3.1/webnocol/genweb.pl --- nocol-4.3/webnocol/genweb.pl Mon Nov 1 08:46:54 1999 +++ nocol-4.3.1/webnocol/genweb.pl Mon Mar 27 23:37:13 2000 @@ -1,6 +1,6 @@ #!/usr/local/bin/perl # -# $Header: /home/vikas/src/nocol/webnocol/RCS/genweb.pl,v 1.10 1999/11/01 13:46:45 vikas Exp $ +# $Header: /home/vikas/src/nocol/webnocol/RCS/genweb.pl,v 1.11 2000/03/28 04:37:02 vikas Exp $ # # genweb.pl # ------------ @@ -69,7 +69,7 @@ ######################################################################### -$VERSIONSTR = "4.2.2"; # version +$VERSIONSTR = "4.3.1"; # version ## Customize $baseurl @@ -180,7 +180,8 @@ local ($ADMINMODE) = ($lvl eq "User") ? 0 : 1; # No href links for userPage $cnt{$lvl} = 1; # serial number per view - open (OUTPUT, ">$webdir/${lvl}.html") or die "Unable to open output file $!"; + open (OUTPUT, ">$webdir/${lvl}.html") or die + "Unable to open output file ($webdir/${lvl}.html) $!"; select OUTPUT; # default for print statements &print_html_prologue($thispage, $lvl, $refresh); @@ -330,7 +331,7 @@ local ($thispage, $lvl, $refresh) = @_; local ($action) = $levels[$ilevels{$lvl}]; - local ($id) = '$Id: genweb.pl,v 1.10 1999/11/01 13:46:45 vikas Exp $';#' + local ($id) = '$Id: genweb.pl,v 1.11 2000/03/28 04:37:02 vikas Exp $';#' $id =~ s/\$//g; # cleanup print < 2 && $userlevel < 3) { print "\n"; + print "

Current userlevel = $userlevel

"; print "

FORM Variables

\n"; for (keys %FORM) { print "$_ = $FORM{$_}
" ;} - print "userlevel = $userlevel
"; print "


\n

ENV Variables

\n"; for (keys %ENV) { print "$_ = $ENV{$_}
" ;} } @@ -421,7 +426,7 @@ - + EoState