
    DGE Management

6.1 Configuring Data Gathering Engines (DGEs)

NetVigil uses a distributed, tiered architecture in which data collection and storage are handled by the DGE component. Each DGE polls data from network devices, servers and applications, then aggregates this performance data in real time and stores it in a local relational database. The DGE also triggers actions and notifications when it detects that a threshold has been crossed.

6.1.1 Configuring a New DGE

If you would like to expand your NetVigil system to monitor additional devices in remote geographical or logical locations, you can install the software on another physical machine and integrate it seamlessly into your existing setup. The main steps are:

  1. Configure the DGE itself with a unique name (etc/dge.xml)
  2. Log into the web application as superuser, and add the DGE name into a new or existing "Location".

You can add multiple DGEs in the same location for load balancing or increasing monitoring capacity, and also distribute DGEs in multiple locations as needed.

6.1.2 Changing DGE Database Type

By default, the DGE database is set to MySQL (licensed and shipped with NetVigil). However, you can use Oracle version 8.1 or higher instead.

NetVigil includes the appropriate Oracle JDBC drivers.

  1. Create the database objects.
    1. cd $NETVIGIL_HOME/database/schema/oracle
    2. Log into Oracle SQL*Plus.
    3. Run the following scripts at the SQL prompt:
       @TableScript.sql
       @AutoIncrementScript.sql
       @NullValueTriggerScript.sql

If the auto increment script does not run successfully, execute each of the statements in the script individually at the SQL prompt.

  2. Update the NetVigil configuration file. Edit netvigil/etc/netvigil.xml and replace the statement:
    <dge vendor="mysql" ... 

with:

<dge vendor="jdbc:oracle:thin" port="1521"
user="insert_username_here"
password="insert_password_here"
name="insert_database_name_here"
driver="oracle.jdbc.driver.OracleDriver"
minConnections="4" maxConnections="10"
debugging="false"
url="jdbc:oracle:thin:$USER/$PASSWORD@$DEVICE:$PORT:$DATABASE"/>

Note: Remember to place the correct parameter values within the quotation marks.
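
To make the shape of the url attribute easier to read, the following Python sketch expands the same template into a complete Oracle thin-driver URL. It only illustrates the resulting string: how NetVigil itself substitutes the $USER, $PASSWORD, $DEVICE, $PORT and $DATABASE placeholders is an assumption here, and the user, password, host and database names are hypothetical.

```python
from string import Template

# Template copied from the url attribute above. The substitution rules
# NetVigil applies are assumed, not documented in this section.
URL_TEMPLATE = Template("jdbc:oracle:thin:$USER/$PASSWORD@$DEVICE:$PORT:$DATABASE")


def expand_url(user, password, device, port, database):
    """Fill in the placeholders and return the resulting JDBC URL."""
    return URL_TEMPLATE.substitute(
        USER=user,
        PASSWORD=password,
        DEVICE=device,
        PORT=port,
        DATABASE=database,
    )


# Hypothetical values, for illustration only.
example_url = expand_url("nv_user", "nv_secret", "dbhost.example.com", 1521, "NVDB")
print(example_url)
```

The expanded value has the form user/password@host:port:database, which is the layout the url attribute encodes.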

6.1.3 Changing DGE Aggregation Scheme

You can change the DGE data aggregation scheme by updating NETVIGIL_HOME/etc/aggregation.xml. However, you MUST make this change before installation rather than after it. If you change the aggregation scheme on a production system, all existing performance data will be deleted and fresh aggregation databases will be created.

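When a fresh NetVigil system is installed, the aggregation scheme is loaded from aggregation.xml and stored in the provisioning database. To update the scheme on an installed system, you therefore have to delete the existing scheme from the provisioning database and remove the existing DGE databases, as follows: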

  1. Shut down NetVigil on ALL hosts running NetVigil:
    su
    cd $NETVIGIL_HOME
    etc/netvigil.init stop
  2. Remove the existing DGE database on all DGEs. If you have sufficient disk space, make a backup by moving the directory aside, and keep a copy of the old aggregation scheme with it:
    mv database/aggregateddatadb database/aggregateddatadb.OLD
    cp etc/aggregation.xml database/aggregateddatadb.OLD/
    Otherwise, delete the directory:
    rm -rf database/aggregateddatadb
  3. Edit etc/aggregation.xml with the desired scheme.
  4. Edit database/schema/alter/rmaggscheme.sh and uncomment the 3 lines towards the end of the script:
    # $JAVA_HOME/bin/java $CVPARAM \
    # com.fidelia.emerald.utils.RemoveAggregationSchemes \
    # $INSTALL_DIR/etc/$DBCONF
  5. Start the provisioning database (etc/provdb.init start) which will also start some other dependencies.
  6. Reset the aggregation scheme in the provisioning database using the modified script
    sh database/schema/alter/rmaggscheme.sh
    If you receive any error messages, contact Fidelia Support for assistance before proceeding any further.
  7. If there are no errors, stop the provisioning database and then restart all the components:
    etc/provdb.init stop
    sleep 30
    etc/netvigil.init start

Increasing the length of time for which data is retained directly increases the size of the DGE database.

6.1.4 Disk Space Requirements for DGE Aggregation

Note: A DGE Disk Space Requirements calculator is available at http://support.fidelia.com/resources/dbsize/

The DGE database stores three main data types:

  1. Aggregated performance data
  2. Event data (threshold violations)
  3. Syslog and Trap text messages

WARNING: The largest component is typically the aggregated performance database. You can change the DGE data aggregation scheme by updating $NETVIGIL_HOME/etc/aggregation.xml. However, you MUST change this before installation and not after, since the aggregation scheme is loaded into the configuration database on initialization. Changing the scheme after installation deletes all existing performance data and creates fresh aggregation databases (please contact support@fidelia.com if you would like to modify your aggregation scheme after installation).

Each aggregated data value is 30 bytes in size (including the size of its index). For the default aggregation scheme:

5 minute samples for 1 day = 60/5*24 = 288 samples
15 minute samples for 7 days = 60/15*24*7 = 672 samples
60 minute samples for 90 days = 60/60*24*90 = 2160 samples
1 day samples for 3 years = 1*365*3 = 1095 samples
TOTAL size per test = (288+672+2160+1095) * 30 bytes = 4215 * 30 bytes = approximately 126 KB

For 10,000 tests, the DGE database is therefore approximately 1.26 GB.
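
The same arithmetic can be repeated for any aggregation scheme. The Python sketch below is a minimal illustration, assuming the 30 bytes per value stated above and decimal units (1 GB = 10^9 bytes); the function names and the (interval, retention) tuple layout are illustrative choices, not part of NetVigil.

```python
BYTES_PER_VALUE = 30          # from the text: one aggregated value, including its index
MINUTES_PER_DAY = 24 * 60


def samples(interval_minutes, retention_days):
    """Number of samples kept for one aggregation tier."""
    return retention_days * MINUTES_PER_DAY // interval_minutes


def bytes_per_test(scheme):
    """Total storage for one test across all tiers of the scheme."""
    return sum(samples(interval, days) for interval, days in scheme) * BYTES_PER_VALUE


# Default scheme as (sample interval in minutes, retention in days).
DEFAULT_SCHEME = [
    (5, 1),            # 5 minute samples for 1 day
    (15, 7),           # 15 minute samples for 7 days
    (60, 90),          # 60 minute samples for 90 days
    (1440, 3 * 365),   # 1 day samples for 3 years
]

per_test = bytes_per_test(DEFAULT_SCHEME)
total_gb = per_test * 10_000 / 1e9    # 10,000 tests, decimal gigabytes

print(per_test, "bytes per test")
print(total_gb, "GB for 10,000 tests")
```

Substituting a different list of tiers gives a quick estimate before committing to a scheme; the figures exclude event data and Syslog/Trap messages.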

The database sizes for 10,000 tests using some alternate aggregation schemes are listed in the table below:

Database Size for Specific Aggregation Schemes

Aggregation Scheme                                                               DB Size for 10,000 tests
5 min for 1 day, 15 min for 1 week, 1 hour for 3 months, 1 day for 3 years       1.3 GB
5 min for 1 day, 15 min for 1 week, 1 hour for 1 month, 1 day for 2 years        0.75 GB
5 min for 1 day, 15 min for 1 month, 1 hour for 3 months, 1 day for 2 years      1.8 GB
5 min for 1 day, 15 min for 1 week, 1 hour for 6 months, 1 day for 2 years       1.9 GB
5 min for 30 days, 30 min for 3 months, 2 hours for 6 months, 1 day for 3 years  4.8 GB

Oracle also requires space for transaction logs. The transaction log size must be set to a minimum of 32MB.

6.2 Adding DGE into BVE database

Once you have configured a new physical DGE, you must configure the BVE engine to recognize and use the DGE. This is done by adding the DGE to a "Location".

6.2.1 Adding Locations and DGEs via web

The superuser creates DGE locations and adds new DGEs into these locations using the web application. When provisioning devices, you assign them to a Location, and the BVE engine automatically selects the DGE. The Location can be any logical or functional name, e.g., New York or datacenter3 or finance.

Figure: DGE Locations Page
Figure: DGE Management page

Figure: Create a New Data Gathering Engine page

6.2.2 Understanding DGE Load Balancing

DGEs are grouped within DGE locations. A DGE location is simply a way of grouping DGEs for load balancing; DGEs in the same DGE location need not be in the same physical location.

For multiple DGEs in a single "location", NetVigil uses a load balancing mechanism based on configurable test limits to ensure that DGE hosts are not overloaded. Two limits, soft and hard, determine whether a DGE has the capacity to take on a newly provisioned device. Once a DGE reaches its soft limit, only tests for devices already on that DGE can be added to it. Once the number of tests reaches the hard limit, no more tests can be provisioned on that DGE. Otherwise, a new device is provisioned on the least loaded DGE. To optimize performance, the tests for a single device are never split across multiple DGEs.
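
The selection rules described above can be sketched in a few lines of Python. This is an illustration of the soft and hard limit logic only, not NetVigil's actual implementation: the Dge class, its field names, the limit values and the use of the raw test count as the measure of "least loaded" are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Dge:
    name: str
    test_count: int   # tests currently provisioned on this DGE
    soft_limit: int   # hypothetical configured soft limit
    hard_limit: int   # hypothetical configured hard limit


def pick_dge_for_new_device(dges: List[Dge]) -> Optional[Dge]:
    """Return the least loaded DGE still below its soft limit, or None."""
    candidates = [d for d in dges if d.test_count < d.soft_limit]
    return min(candidates, key=lambda d: d.test_count, default=None)


def can_add_test_to_existing_device(dge: Dge) -> bool:
    """Tests for devices already on a DGE are accepted until the hard limit."""
    return dge.test_count < dge.hard_limit


# Hypothetical location containing three DGEs.
location = [
    Dge("dge-a", 900, 1000, 1200),
    Dge("dge-b", 400, 1000, 1200),
    Dge("dge-c", 1100, 1000, 1200),   # past the soft limit, below the hard limit
]

chosen = pick_dge_for_new_device(location)
print(chosen.name if chosen else "no capacity in this location")
```

In this example dge-c no longer receives new devices but can still take additional tests for the devices it already monitors, which keeps all tests for a device on one DGE.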

6.3 Using the DGE Controller

6.3.1 Monitoring DGE operation/capacity

The DGE component keeps track of the different types of monitors that are running, the number of objects processed and the number of items waiting in the various queues. To view this information, telnet to port 7655 (the default, or the port that you have configured) on the server where the DGE component is running:

% telnet my_dge 7655 
Trying n.n.n.n...
Connected to my_dge
Escape character is '^]'. 
NetVigil device monitor 
password: *****
<<welcome>> 

Once logged in, you can use the status command to view the health of each monitor, as well as the number of times they have performed a health check of configured elements:

controller> status
<<begin>> 
Monitor[sql] - com.fidelia.emerald.monitor.SqlQueryMonitor
		Number of passes: 0
		Work Units processed: 0
		Thread Status: alive 
Monitor[radius] - com.fidelia.emerald.monitor.RadiusMonitor
		Number of passes: 993
		Work Units processed: 993
		Thread Status: alive 
Monitor[ldap] - com.fidelia.emerald.monitor.LdapMonitor
		Number of passes: 0
		Work Units processed: 0
		Thread Status: alive 
[additional status lines removed]
<<end>> 

On a healthy DGE, Thread Status for all the monitors should indicate alive and the number of passes and number of work units processed should be increasing, provided there are one or more tests of that particular type configured (and not suspended) in the system.
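
If you capture the status output to a file or string, the health check described above can be automated. The Python sketch below parses monitor blocks in the format shown earlier and lists any monitor whose thread is not reported as alive. It assumes the output keeps exactly that layout; the function names are illustrative.

```python
import re

# Sample text copied from the status output shown above.
SAMPLE_STATUS = """\
Monitor[sql] - com.fidelia.emerald.monitor.SqlQueryMonitor
    Number of passes: 0
    Work Units processed: 0
    Thread Status: alive
Monitor[radius] - com.fidelia.emerald.monitor.RadiusMonitor
    Number of passes: 993
    Work Units processed: 993
    Thread Status: alive
"""

HEADER_RE = re.compile(r"^Monitor\[(\w+)\] - (\S+)$")
FIELD_RE = re.compile(r"^(Number of passes|Work Units processed|Thread Status): (.+)$")


def parse_monitors(status_text):
    """Return {monitor name: {field: value}} for each monitor block."""
    monitors = {}
    current = None
    for raw_line in status_text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            current = monitors.setdefault(header.group(1), {"class": header.group(2)})
            continue
        field = FIELD_RE.match(line)
        if field and current is not None:
            key, value = field.groups()
            current[key] = int(value) if value.isdigit() else value
    return monitors


def dead_monitors(monitors):
    """Names of monitors whose Thread Status is anything other than alive."""
    return sorted(name for name, info in monitors.items()
                  if info.get("Thread Status") != "alive")


monitors = parse_monitors(SAMPLE_STATUS)
print(dead_monitors(monitors))
```

Comparing the "Number of passes" values from two captures taken some minutes apart would additionally show whether each configured monitor is making progress.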

The DGE status server also provides important information regarding capacity planning. The Schedule Queue section of the status command output indicates how many tests are waiting to be performed:

MonitorServer
		Schedule Queue [Monitor[sql]] Size: 0
		Schedule Queue [Monitor[ldap]] Size: 0
		Schedule Queue [Monitor[radius]] Size: 0
		Schedule Queue [Monitor[port]] Size: 0
		Schedule Queue [Monitor[ntp]] Size: 0
		Schedule Queue [Monitor[poet]] Size: 0
		Schedule Queue [Monitor[ping]] Size: 0
		Schedule Queue [Monitor[snmp]] Size: 2
		Schedule Queue [Monitor[dns]] Size: 0
		Schedule Queue [Monitor[external]] Size: 0
		Result Queue Size: 0
		Aggregation Writer Queue Size: 0
		Result Writer Queue Size: 0
		Event Writer Queue Size: 0 

In the event of a network outage, the size of different queues may grow to a large number depending on the network topology and reachability of each device. Once the outage has been resolved, the queues should start to decrease. However, if a queue continues to grow under normal operating conditions, new tests are being added to the queue faster than existing tests can be performed, and your DGE capacity has been exceeded. At this point you need to do one of the following: add another DGE at the same location, move some tests/devices to a different DGE (at the same or a different location), reduce the frequency of the tests, or suspend some tests until capacity on the DGE can be increased.
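
One way to spot a queue that keeps growing is to compare several captures of the status output taken at intervals. The Python sketch below extracts the Schedule Queue sizes and reports the monitors whose queue grew between every pair of consecutive captures. The capture text is hypothetical, and "strictly increasing across all captures" is an assumed heuristic rather than a NetVigil rule.

```python
import re

QUEUE_RE = re.compile(r"Schedule Queue \[Monitor\[(\w+)\]\] Size: (\d+)")


def schedule_queue_sizes(status_text):
    """Return {monitor name: queue size} parsed from status output."""
    return {name: int(size) for name, size in QUEUE_RE.findall(status_text)}


def steadily_growing(snapshots):
    """Monitors whose queue size increased between every consecutive snapshot."""
    if len(snapshots) < 2:
        return []
    growing = []
    for name in snapshots[-1]:
        sizes = [snap.get(name, 0) for snap in snapshots]
        if all(earlier < later for earlier, later in zip(sizes, sizes[1:])):
            growing.append(name)
    return sorted(growing)


# Hypothetical captures taken a few minutes apart.
captures = [
    "Schedule Queue [Monitor[ping]] Size: 0\nSchedule Queue [Monitor[snmp]] Size: 2",
    "Schedule Queue [Monitor[ping]] Size: 0\nSchedule Queue [Monitor[snmp]] Size: 40",
    "Schedule Queue [Monitor[ping]] Size: 1\nSchedule Queue [Monitor[snmp]] Size: 95",
]

snapshots = [schedule_queue_sizes(text) for text in captures]
growing = steadily_growing(snapshots)
print(growing)
```

A monitor reported here during normal operation, with no outage in progress, is a candidate for the capacity measures listed above.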

When you have finished, use the quit command to log out of the DGE status server:

controller> quit
<<bye>>
Connection closed by foreign host. 

6.4 Upgrading DGE Hardware

As the load on a DGE increases, it may become necessary to upgrade the hardware to raise the physical limits of the machine. If the upgrade only adds resources (memory, disk space, etc.) to the same machine, no special steps are needed. However, if the DGE is being moved to a new physical server for any reason, perform the following steps. Refer to the section on database backup/restoration for additional details.

  1. Install NetVigil on the new host, making sure to use the same DGE name as the old one when asked during the install. Otherwise, you will need to edit $NETVIGIL_HOME/etc/dge.xml and set the DGE name there.
  2. Copy the following directories and files in their entirety from the old host to the new one, using tar on Unix machines or by copying them in Windows:

database/
etc/licenseKey.xml
etc/dge.xml
etc/netvigil.xml

  3. If the new host will have a different IP address, you will also need to log into the web application as superuser and change the IP address of the relevant DGE from SUPERUSER -> DGE Mgmt.


Fidelia Technology, Inc.