    Reports

NetVigil provides extensive, flexible reporting at several levels (container, device, test) and of several types (fault, performance, SLA). Most reports are generated in real time: the BVE reporting engine collects raw data from the DGEs and builds the graphs and statistics from it.

There are three levels of reports with increasing flexibility: Summary, Advanced and Custom. The reporting framework allows arbitrary custom reports and statistics to be generated on the fly.

The different types of reports available from NetVigil are:

The graphs automatically switch from a linear to a logarithmic scale when the range of values is too wide (e.g. a bandwidth spike, which would render a linear graph useless).
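The scale selection described above can be illustrated with a minimal sketch. The function name `choose_scale` and the `max_ratio` cutoff of 1000 are assumptions made for illustration; the manual does not state how NetVigil decides that a range is too wide.

```python
def choose_scale(values, max_ratio=1000.0):
    """Pick 'log' when the spread of positive values exceeds max_ratio.

    max_ratio is a hypothetical cutoff, not a documented NetVigil setting.
    """
    positive = [v for v in values if v > 0]
    if len(positive) < 2:
        return "linear"
    ratio = max(positive) / min(positive)
    return "log" if ratio > max_ratio else "linear"


# Steady traffic stays linear; a single large spike switches to log.
steady = [40.0, 55.0, 48.0, 61.0]
spiky = [40.0, 55.0, 48.0, 950000.0]
print(choose_scale(steady), choose_scale(spiky))
```

A ratio test is only one possible heuristic; it ignores zero and negative samples because they cannot be plotted on a logarithmic axis.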

15.1 Summary Reports

These reports give a quick snapshot of the past week. There are technical reports for a technical manager and executive reports for a business manager, the latter showing the impact on business service containers.

15.1.1 Technical Summary Report

This report gives the top N problems, the distribution of problems by day of week, correlation graphs for all devices that had problems over the past week, trend graphs for various elements, and network, CPU and disk issues across all devices.

15.1.2 Business Impact Summary

This report shows which devices or elements caused service containers to go down, a correlation graph for the top 10 service containers (excluding OK elements), and the top N service containers sorted by downtime. All reports are interactive, so you can drill down into any report for more detailed analysis in real time.

Top-N service container outages

15.2 Advanced Reports

These reports give operational and engineering analysis of your IT infrastructure and answer some commonly asked questions.

Advanced Reports Menu

15.2.1 Fault Management Reports

The Fault Management Reports provide an in-depth analysis of the events in which tests, devices and services crossed their thresholds. They include Device and Service reports on the most fault-prone services and the number of events that occurred.

These reports use events (threshold violations) to calculate the number of times a test entered a warning or critical condition and the total time it spent there. These reports answer questions such as:

Event History

Provides a consolidated view of events for either the last 24 hours or a specific historical month. Each report entry is a unique combination of device name, test name and severity, and gives both the total duration spent in that severity (e.g. CRITICAL, WARNING) and the number of times the test entered it. Below the text listing, a horizontal bar graph displays the top 10 'worst' results. Clicking any column heading in the text list automatically updates this graph.
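The grouping behind an Event History entry can be sketched as follows. The `Event` record, its field names and the helper functions are hypothetical; they only illustrate aggregating by device, test and severity, and do not reflect NetVigil's internal data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: one period spent in one severity."""
    device: str
    test: str
    severity: str
    duration_s: int


def summarize(events):
    """Map (device, test, severity) -> (total duration in seconds, count)."""
    totals = defaultdict(lambda: [0, 0])
    for e in events:
        entry = totals[(e.device, e.test, e.severity)]
        entry[0] += e.duration_s
        entry[1] += 1
    return {key: tuple(value) for key, value in totals.items()}


def worst(summary, n=10, by="duration"):
    """Return the n worst entries, sorted by total duration or by count."""
    index = 0 if by == "duration" else 1
    ranked = sorted(summary.items(), key=lambda kv: kv[1][index], reverse=True)
    return ranked[:n]


events = [
    Event("router1", "Packet Loss", "CRITICAL", 300),
    Event("router1", "Packet Loss", "CRITICAL", 120),
    Event("web1", "CPU", "WARNING", 600),
]
summary = summarize(events)
print(worst(summary, n=10, by="duration"))
```

Switching the `by` argument mirrors clicking a different column heading: the same summary is re-ranked and the top 10 changes accordingly.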

Service Instability

Reports on the top 10, 25 or 50 services by number of events. The report shows the frequency distribution of events by hour of the day, by day of the week or month, and by event duration.

Device Instability

Reports on the top 10, 25 or 50 devices by number of events. The report shows the frequency distribution of events by hour of the day, by day of the week or month, and by event duration. You can choose all devices or a category of devices.
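The hour-of-day and day-of-week distributions used by the instability reports amount to simple counting over event timestamps. The sketch below assumes events are available as Python `datetime` values; the function name `distributions` is hypothetical.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def distributions(timestamps):
    """Count events per hour of day and per day of week."""
    by_hour = Counter(ts.hour for ts in timestamps)
    by_weekday = Counter(WEEKDAYS[ts.weekday()] for ts in timestamps)
    return by_hour, by_weekday


timestamps = [
    datetime(2024, 1, 1, 9, 15),   # a Monday
    datetime(2024, 1, 1, 9, 45),
    datetime(2024, 1, 2, 14, 0),   # a Tuesday
]
by_hour, by_weekday = distributions(timestamps)
print(dict(by_hour), dict(by_weekday))
```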

Threshold Violation Reports

Provide data on threshold violations for bandwidth, CPU, memory and disk utilization.

15.2.2 Performance and Capacity Planning Reports

These reports help you plan your IT infrastructure investments and direct them where they are needed. They show exactly where capacity constraints are creating performance bottlenecks.

Top 'N' Usage & Trend Report

Reports the top N devices or tests by highest or lowest usage values. This helps with capacity planning: adding redundant capacity where it is needed and removing excess capacity where it is not. This report can be based on the status of one or more test types.
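Ranking by highest or lowest usage can be sketched with the standard library. The device names, the usage figures and the `top_n` helper are made up for illustration.

```python
import heapq


def top_n(usage, n, highest=True):
    """Return the n (name, value) pairs with the highest or lowest usage."""
    pick = heapq.nlargest if highest else heapq.nsmallest
    return pick(n, usage.items(), key=lambda kv: kv[1])


# Hypothetical average CPU utilization (%) per device.
usage = {"db1": 91.0, "web1": 35.5, "web2": 12.0, "mail1": 67.2}
print(top_n(usage, 2))                  # candidates for added capacity
print(top_n(usage, 2, highest=False))   # candidates for consolidation
```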

30 Day Upcoming Trends

Gives a trend analysis of bandwidth, CPU and disk space utilization for the next month so that you can plan accordingly.
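The manual does not say how the 30-day trend is computed. As one plausible illustration, the sketch below fits a least-squares straight line to daily utilization samples and extrapolates it 30 days ahead; the linear model and the function names are assumptions, not the product's documented method.

```python
def linear_fit(ys):
    """Least-squares slope and intercept for ys sampled at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project(ys, days_ahead=30):
    """Extrapolate the fitted line days_ahead beyond the last sample."""
    slope, intercept = linear_fit(ys)
    return intercept + slope * (len(ys) - 1 + days_ahead)


# Hypothetical daily disk utilization (%), growing half a point per day.
disk_pct = [50.0, 50.5, 51.0, 51.5, 52.0]
print(round(project(disk_pct), 1))
```

A straight line is the simplest choice; real utilization data is often seasonal, so a projection like this is best read as a rough planning aid.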

15.2.3 SLA Reports

For a detailed explanation of SLA thresholds, see Section 5.4.2, "Admin and SLA Thresholds" on page 5-93.

Unavailability/Downtime Report

The Unavailability/Downtime SLA Report is based on device availability as measured by the ICMP packet loss test. The report shows how many times and for how long Packet Loss tests were in the Critical or Unreachable states. The SLA threshold for the Packet Loss test is used to determine when the test was in Critical state. This report shows the Top 10 devices by amount of "unavailability", displaying total time unavailable and % unavailable, with graphics showing either view.
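The two views of this report (total time unavailable and percentage unavailable) are related by simple arithmetic over the reporting period. The sketch below assumes the downtime per device has already been totalled in seconds; the function names and sample figures are hypothetical.

```python
def unavailability(downtime_s, period_s):
    """Map device -> (seconds unavailable, percent of period unavailable)."""
    return {dev: (secs, 100.0 * secs / period_s)
            for dev, secs in downtime_s.items()}


def top_unavailable(downtime_s, period_s, n=10):
    """Return the n devices with the most downtime, worst first."""
    rows = unavailability(downtime_s, period_s)
    return sorted(rows.items(), key=lambda kv: kv[1][0], reverse=True)[:n]


WEEK_S = 7 * 24 * 3600  # one-week reporting period
downtime = {"router1": 6048, "switch2": 60480, "fw1": 0}
for device, (secs, pct) in top_unavailable(downtime, WEEK_S):
    print(f"{device}: {secs} s unavailable ({pct:.1f}%)")
```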

Users may link to an availability distribution report/graph as well. This histogram is a distribution of the numbers of devices falling into blocks of 10% availability. That is, it displays the number of devices falling between 0-10% availability, 10-20% availability, and so on.
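The availability distribution can be sketched as a ten-bin histogram. How the report treats values that fall exactly on a boundary is not stated in the manual; this sketch assumes each block includes its lower bound and that 100% is counted in the 90-100% block.

```python
def availability_histogram(availability_pct):
    """Count devices per 10% availability block (index 0 is 0-10%)."""
    bins = [0] * 10
    for pct in availability_pct:
        index = min(int(pct // 10), 9)  # assumed: 100% joins the top block
        bins[index] += 1
    return bins


# Hypothetical availability percentages for six devices.
bins = availability_histogram([99.9, 100.0, 95.0, 87.5, 42.0, 10.0])
for i, count in enumerate(bins):
    print(f"{i * 10}-{i * 10 + 10}%: {count}")
```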

Threshold Violation Report

The Threshold Violation report allows you to run reports on system resources (CPU, disk space, bandwidth, etc.), comparing test results with SLA thresholds.
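Comparing test results with an SLA threshold can be sketched as a scan over evenly spaced samples. The 5-minute sample interval, the function name and the rule that a violation is any value strictly above the threshold are assumptions made for illustration.

```python
def sla_violations(samples, threshold, interval_s=300):
    """Return (number of violation episodes, total seconds in violation).

    Consecutive samples above the threshold count as one episode.
    """
    episodes = 0
    violating = 0
    previous_over = False
    for value in samples:
        over = value > threshold
        if over:
            violating += 1
            if not previous_over:
                episodes += 1
        previous_over = over
    return episodes, violating * interval_s


# Hypothetical CPU utilization samples (%) against a 90% SLA threshold.
cpu = [70, 92, 95, 60, 91, 50]
print(sla_violations(cpu, threshold=90))
```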

You can also create custom reports for other tests using SLA thresholds.

15.2.4 Alarm Reports

For alarms generated from text messages such as SNMP traps, syslog messages or other logs inserted via the ISM API, the following reports are available:

15.2.5 Stored and Scheduled Reports

You can save any custom or advanced report and then schedule it to run automatically and, if desired, email the results. Whenever an advanced or custom report is generated, a 'Save' option is displayed on the report so that you can save it under a custom name. These saved reports are all listed under this menu item.

15.3 Custom Reports

In addition to the large number of preset reports listed above, NetVigil offers complete flexibility in creating ad-hoc reports over any time period. You can select the data for the report by specifying the device or test names, the time period and other such parameters. You can also choose the type of report to generate, such as a top-N table, a trend report or a correlation graph.

The categories available under Custom reports are:

Test Level Reports

Generates one or more of the following reports for particular tests of the chosen test types on a device: Top Ten, Number of Events Distribution, Event Duration Distribution, Number of Events, Performance, Statistics and Trend Analysis.

Device Level Reports

Generates one or more of the following reports for devices of a particular vendor and device type: Top Ten, Number of Events Distribution, Event Duration Distribution and Number of Events.

Event Log

Shows the distribution of events over time for chosen types of tests or for a particular device.

Dashboard Reports

Short graphical reports covering the last 24 hours at 5-minute intervals for a particular test or device, or for chosen types of tests.

Composite Reports

Plots similar tests on a single graph, allowing their performance to be compared.

15.4 Sample Reports

The following screenshots show some of the commonly used NetVigil reports.

15.4.1 Instability Report

This report pinpoints the main problems across the entire IT infrastructure by calculating the total time devices spent in the Critical or Warning state. It also shows the distribution of alarms on a daily basis, by day of the week and by hour of day. You can drill down into a device to see all the individual problems on that device, and view trend reports from the same screen.

Device Instability Report

15.4.2 Test Performance Detail

Test performance details

15.4.3 Trend Analysis Reports

Trend Analysis

15.4.4 Event Correlation Report

This report shows a 24-hour snapshot of all the individual elements of a service container or a device. If a particular test has gone into any non-OK state within an hour, that hour is colored to reflect the non-OK state. This report allows you to correlate various problems in your service container or device and see which events happened during the same hour of the day.
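The hour-by-hour grid behind this report can be sketched as one row of 24 cells per test. The severity names, the tuple layout of an event and the rule that the worst severity seen in an hour determines that hour's color are assumptions; the manual only says the hour reflects the non-OK state.

```python
SEVERITY_RANK = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def correlation_grid(events, tests):
    """Build test -> list of 24 hourly states from (test, hour, severity)."""
    grid = {test: ["OK"] * 24 for test in tests}
    for test, hour, severity in events:
        # Assumed rule: keep the worst severity seen within the hour.
        if SEVERITY_RANK[severity] > SEVERITY_RANK[grid[test][hour]]:
            grid[test][hour] = severity
    return grid


def co_occurring(grid, hour):
    """List the tests that were in any non-OK state during the given hour."""
    return sorted(test for test, row in grid.items() if row[hour] != "OK")


events = [
    ("CPU", 14, "WARNING"),
    ("CPU", 14, "CRITICAL"),
    ("Packet Loss", 14, "WARNING"),
    ("Disk", 3, "WARNING"),
]
grid = correlation_grid(events, ["CPU", "Packet Loss", "Disk"])
print(co_occurring(grid, 14))
```

Reading down a single column of the grid, as `co_occurring` does, is what lets you see which problems coincided in the same hour.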

24 Hour Event Correlation Report


Fidelia Technology, Inc.