    Troubleshooting

9.1 Troubleshooting the DGE - ProvDB Connection

Upon startup, each DGE component connects to the provisioning database (located on the provisioning server) and downloads all tests that are configured for that DGE. The DGE components maintain a connection to the provisioning database at all times. As devices and tests are added, updated, or removed, the provisioning server notifies the relevant DGE of the changes in real time.

If the communications link between the provisioning database and the DGE is broken, the DGE repeatedly attempts to restore the connection, while continuing to monitor, using the configuration information that it has cached in memory. Once the connection to the provisioning database is restored, the DGE shuts down. A cron job restarts the DGE shortly thereafter. The reason for the shutdown and restart is that while the DGE was unable to communicate with the provisioning server, it may have missed notices about changes to device/test configurations. In the process of restarting, the DGE downloads a fresh copy of the list of tests and proceeds with normal operation.

Problem: Newly added tests remain in UNKNOWN state

For a detailed explanation of the factors that can cause tests to go into UNKNOWN state, see Section 13.2, "NetVigil Status View".

  1. Make sure that the DGE that controls the device to which the tests belong has not lost its connectivity to the provisioning server. If the connection is down and the DGE is running with its cached configuration, it does not know about newly added tests. The DGE should automatically restart itself when the connection is restored. If it does not, see "Problem: DGE does not automatically restart when the connection to the provisioning database is restored".
  2. Check the load (CPU utilization, load average, blocked disk I/O) on the DGE host. In high-load situations, it may take longer to schedule and run newly added tests.
  3. Make sure that the DGE process is running. On UNIX systems, you can check this with the following command:
		cd NETVIGIL_HOME
		etc/monitor.init status 

Here, NETVIGIL_HOME is the directory in which NetVigil is installed (typically /usr/local/NetVigil).

You can also see whether the DGE is running from the Web Interface. If the DGE is not running, when you drill down into older devices, TEST TIME and DURATION values for tests that are not in UNKNOWN state should be light blue, indicating that the test results are old.

Problem: DGE does not automatically restart when the connection to the provisioning database is restored

Make sure that the crontab entry for root on the DGE includes the contents of NETVIGIL_HOME/etc/crontab.netvigil.
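
The check above can be scripted. The following is a minimal sketch, not a NetVigil-supplied tool: the function name missing_cron_entries and the temporary file path are hypothetical, and the comparison assumes that entries were copied into root's crontab verbatim (whole-line, whitespace-exact matches). It prints each non-comment line of the template that does not appear in the installed crontab, so empty output means that nothing is missing.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetVigil): list template entries that are
# absent from an installed crontab.
#   $1 = template file, e.g. NETVIGIL_HOME/etc/crontab.netvigil
#   $2 = file holding the installed crontab, e.g. saved output of `crontab -l`
missing_cron_entries() {
    template=$1
    installed=$2
    # Ignore blank lines and comment lines in the template.
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$template" |
    while IFS= read -r entry; do
        # -F: fixed string, -x: match the whole line, -q: no output.
        if ! grep -Fxq -- "$entry" "$installed"; then
            printf '%s\n' "$entry"
        fi
    done
}

# Example usage, run as root on the DGE host (paths are placeholders):
#   crontab -l > /tmp/root.crontab
#   missing_cron_entries NETVIGIL_HOME/etc/crontab.netvigil /tmp/root.crontab
```

Any line that the helper prints is a candidate to add to root's crontab. Because the match is exact, an entry that was reformatted by hand is also reported and should be compared by eye.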

9.2 Log Files Used in Troubleshooting

Several log files can be useful in troubleshooting. All log files are located under the NETVIGIL_HOME/logs directory.

Log File          Used By
stderr.log        All startup scripts, monitors
netvigil.error    Any warning, error, or critical level messages generated by any component are logged in this file.
monitor.info      Information on monitors is logged to this file as tests are performed, actions are triggered, etc.
webapp.info       All user tasks, both in the web application and the BVE socket server, are logged to this file. Tasks include create, delete, update, suspend, and resume tasks performed on devices, Departments, users, etc.
tomcat.log        Any errors generated inside JSP pages in the web application component are logged in this file.
poet.log          Provisioning database specific errors
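
When investigating a problem, it can help to look at the most recent entries in all of these files at once. The following is a minimal sketch under stated assumptions: the function name tail_netvigil_logs is hypothetical, the file names are the ones listed in the table above, and nothing is assumed about the format of the log entries, so the helper prints the last lines of each file without filtering them.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetVigil): show the last N lines of each
# log file named in the table above.
#   $1 = logs directory, e.g. NETVIGIL_HOME/logs
#   $2 = number of lines to show per file (defaults to 20)
tail_netvigil_logs() {
    logdir=$1
    count=${2:-20}
    for name in stderr.log netvigil.error monitor.info webapp.info tomcat.log poet.log; do
        file=$logdir/$name
        if [ -r "$file" ]; then
            printf '==> %s <==\n' "$name"
            tail -n "$count" "$file"
        else
            # Report files that are absent or unreadable instead of failing.
            printf '==> %s (not found) <==\n' "$name"
        fi
    done
}

# Example usage (NETVIGIL_HOME is a placeholder for the install directory):
#   tail_netvigil_logs NETVIGIL_HOME/logs 50
```

Starting with netvigil.error is usually reasonable, because the table above states that warning, error, and critical messages from every component are collected there.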

 

Fidelia Technology, Inc.