Scalable Network Monitoring from Zyrion

Zyrion Traverse is a real-time network monitoring product for end-to-end visibility across networks, servers and applications. Its real-time monitors let you drill down instantly from high-level transactions, through applications and servers, to the network or even an isolated device, highlighting potential problems before they occur. Traverse's functionality is available through a standard Web interface, so it can be accessed from anywhere without complex client requirements. In addition, Traverse is easily configurable and provides multi-level views for every department within an enterprise.

Key network monitoring features

Distributed, Fault-tolerant Architecture
 

Zyrion Traverse is based on a flexible, horizontally scalable three-tier architecture designed to collect massive amounts of data and monitor tens of thousands of elements while keeping data collection separate from reporting and graphing. The distributed monitors, or Data Gathering Elements (DGEs), store all performance data locally and forward only events or alarms to an upstream entity. The reporting and graphing engines can also be distributed and scaled as necessary. The databases run in mirrored mode, achieving levels of fault tolerance not possible with traditional NMS products.

In small environments, all of Traverse’s components can run on a single system. As requirements grow, each component of Traverse’s architecture can be moved onto new hardware. If a new location comes online, add a new DGE and configure the system to recognize it within minutes. If an existing DGE is becoming overloaded, add another DGE in the same location and all new devices will automatically be provisioned on the new DGE. In a complex environment, you can have any number of Web reporting engines and any number of data collection servers, with data stored in any desired location. This architecture provides the flexibility and reliability needed to meet enterprise requirements.
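The provisioning behavior described above can be illustrated with a minimal sketch. It is not Traverse's actual implementation: the `DGE` class, the `capacity` field, and the "least-loaded DGE with spare room" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DGE:
    """Hypothetical model of a Data Gathering Element (not a Traverse API)."""
    name: str
    location: str
    capacity: int                      # assumed limit on monitored devices
    devices: list = field(default_factory=list)

    def has_room(self):
        return len(self.devices) < self.capacity


def provision(device, location, dges):
    """Assign a device to the least-loaded DGE in its location that has room."""
    candidates = [d for d in dges if d.location == location and d.has_room()]
    if not candidates:
        raise RuntimeError(f"no DGE with spare capacity in {location}")
    target = min(candidates, key=lambda d: len(d.devices))
    target.devices.append(device)
    return target.name
```

Under these assumptions, adding a second DGE to a location whose first DGE is full means every new device lands on the new one, with no change to the devices already provisioned.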

Real-time architecture and dynamic correlation for intelligent reporting 

Traverse can monitor any SNMP-enabled device with its efficient multi-threaded polling engine, which sends multiple queries to a host in a single packet. It can use existing MIBs or work with any custom enterprise MIB and SNMP agent. Traverse is a network monitoring solution for routers, switches, IP connectivity, servers, and applications. In addition to SNMP devices, Traverse can import data from any non-SNMP device through its real-time data import API.
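To show why batching queries per host reduces polling overhead, here is a small sketch that groups pending queries by host so that each group could be sent as one request. It does not use a real SNMP library or Traverse's polling engine; `batch_queries` and the `max_per_packet` limit are hypothetical names chosen for this example.

```python
from collections import defaultdict


def batch_queries(queries, max_per_packet=20):
    """Group (host, oid) pairs so each host's OIDs share as few packets as possible.

    max_per_packet is an assumed cap on OIDs per request, not a Traverse setting.
    Returns a list of (host, [oid, ...]) tuples, one per packet to send.
    """
    by_host = defaultdict(list)
    for host, oid in queries:
        by_host[host].append(oid)

    packets = []
    for host, oids in by_host.items():
        # Split long OID lists into chunks that fit the assumed per-packet cap.
        for i in range(0, len(oids), max_per_packet):
            packets.append((host, oids[i:i + max_per_packet]))
    return packets
```

With three pending queries across two hosts, this yields two packets instead of three; the saving grows with the number of tests configured per device.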

A sample of Traverse's network monitoring capabilities:

Traverse’s URL transaction monitor can track the performance of multi-step, web-based transactions consisting of forms, hyperlinks and frames, to measure actual service delivery to end users.

Data Gathering Elements (DGEs) are installed in at least one location within the enterprise and can be used to monitor devices anywhere on the Internet. DGEs store the data locally, reducing traffic overhead for the system, and are designed to store data for several years to support in-depth trend analysis with minimal or no database maintenance.

Traverse provides status views across all devices in the organization and can group them into departments, applications, or other logical views. In addition, users can drill down on specific devices and tests to see recent and long-term aggregated performance information in real time. Traverse’s architecture also makes it easy to deploy in networks that use NAT or firewalls.

Configure Traverse instantly to manage custom devices

Traverse can be set up and installed in a stand-alone environment very rapidly. Default test settings, action profiles, and reports are pre-loaded into the system via XML-based configuration files. Lists of devices can be imported into the system via Traverse’s flexible API, which automatically detects and provisions all appropriate tests for each device. Thanks to the multi-threaded design, polling intervals can be as short as 1 minute. Traverse’s auto-discovery offers true “two-click” provisioning of a device via the user interface.

Flexible Open API for Extensibility and Easy Integration: An API for provisioning users, devices, tests and actions enables mass data entry and updates. The API supports C, Perl, Java or any other common programming language. This provides very easy integration with existing OSS infrastructure.

Traverse integrates seamlessly with existing NMS products such as HP OpenView or Netcool. It extends the capabilities of HP OpenView Network Node Manager (NNM) to give detailed performance reports on all elements of a device within a few hours of deployment. Traverse automatically extracts the node list and topology information from NNM and integrates into the NNM user interface. Additionally, it can send traps and events to other products.

 

Zyrion Traverse provides context-sensitive help with smart correlation, built from feedback received from existing customers. This integrated solution has helped reduce MTTR within days of deploying Traverse at these organizations.

 

Multi-level user administration and security enables “virtual departmental NMS”

Zyrion Traverse enables network administrators to flexibly provide access to systems information using multi-level views. Administrators can be given responsibility for all systems in the organization, or assigned to specific departments or locations. This flexibility makes it easy to give groups such as the help desk the level of access to systems information they need to do their jobs. Privileges to add devices and tests, set thresholds or new actions, or update any of these are governed by the security group to which a user is assigned.

System administration for initial system configuration is provided by the top level (Superuser) administrator. The Superuser can set up collection servers, other administrators, and assign privileges to administrators and systems. The Superuser can, for example, set up application groups, and assign differing global thresholds between the applications. The Superuser also has overwrite privileges to create devices, tests, actions, etc. for any system. Administrators (or Enterprise-level users) can manage any number of application groups (or departments), or be assigned to a single application group. Administrators can add and update new devices, tests, and actions for the application groups they are assigned to.

Both users and devices can be created as read-only. This additional level of security maintains a view into the system for non-administrative users, such as help desk personnel, but restricts modifications. For example, multiple departments may share a switch and receive reports on their bandwidth usage without being able to change performance settings.
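The access rules in this section can be summarized in a short sketch. The role names, the `User` class, and both functions are hypothetical and only model the behavior described above (Superuser, administrators scoped to application groups, and read-only users); they are not Traverse's security API.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    role: str                          # assumed values: "superuser", "admin", "readonly"
    groups: set = field(default_factory=set)   # application groups / departments


def can_view(user, group):
    """Superusers see everything; other users see only their assigned groups."""
    return user.role == "superuser" or group in user.groups


def can_modify(user, group):
    """Only superusers, or admins assigned to the group, may change settings."""
    if user.role == "superuser":
        return True
    return user.role == "admin" and group in user.groups
```

In this model a help desk user created as read-only for a shared switch's group can view its reports, while `can_modify` returns `False` for the same user and group.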

Other key features: Extensive Reporting, Data Archival and Alerts

Reporting: Zyrion Traverse provides on-demand, real-time reports, the heart of any network monitoring software, for end-to-end services and individual components across the organization. Using a single system to monitor all components allows you to correlate events across an entire service. Zyrion Traverse analyzes a full set of metrics on systems performance, from network availability and throughput to deeper application monitoring of servers or databases. Sample reporting capabilities:

As an example, an eCommerce or Payroll service might consist of a Web application server (BEA Weblogic), a database (Oracle), running on several systems (Windows, Solaris) and connected by a network (Cisco, CheckPoint). Traverse allows you to monitor the entire eCommerce transaction as well as all individual components in real time, allowing instant drill down when the service performance degrades or fails.

Data Archiving: Designed for collecting and storing massive amounts of performance data, Zyrion Traverse performs real-time progressive data aggregation to consolidate data and reduce storage requirements. Data is stored in any relational database and can be accessed directly or via Zyrion’s API. Smart data management techniques ensure that relevant data is not discarded, while statistical data is retained for several years without complex data management issues.
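As a rough illustration of progressive aggregation, the sketch below rolls raw samples up into fixed-size time buckets and keeps only the minimum, maximum and average per bucket. The `rollup` function and the choice of statistics are assumptions for this example; the source does not describe which aggregates Traverse actually stores.

```python
def rollup(samples, bucket_seconds):
    """Consolidate (timestamp, value) samples into per-bucket summaries.

    Returns a list of (bucket_start, minimum, maximum, average) tuples,
    sorted by bucket start time. Timestamps are seconds since any epoch.
    """
    buckets = {}
    for ts, value in samples:
        start = ts - ts % bucket_seconds       # align to the bucket boundary
        buckets.setdefault(start, []).append(value)

    return [
        (start, min(values), max(values), sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]
```

Applying the same function again with a larger bucket to older data (for example 5-minute summaries for last week, hourly summaries for last year) is one way storage could shrink as data ages while long-term trends stay available.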

Smart Event Management and Smart Notification: Traverse can avoid sending multiple notifications when a device goes down or is unavailable. For example, if a device is unreachable, all configured notifications for other tests on that device are suppressed until packet loss returns to normal.

Traverse’s alarms are triggered by thresholds set by the administrator. Warning and Critical thresholds provide real-time alarms on the screen, with notification via email, digital pager, and instant messaging. Notifications can be assigned to any test, with different actions for each type of test. In addition, Traverse provides multi-level action profiles to support a comprehensive escalation process. Because actions and notifications are defined in an action template that is assigned to tests, a simple change to the template automatically takes effect for all of a user's tests.
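The relationship between thresholds, tests and a shared action template can be sketched as follows. All names here (`ActionProfile`, `MonitoredTest`, `evaluate`) are hypothetical, and the sketch assumes that higher values are worse; it is an illustration of the idea, not Traverse's data model.

```python
from dataclasses import dataclass, field


@dataclass
class ActionProfile:
    """Hypothetical action template: severity name -> recipients to notify."""
    recipients: dict = field(default_factory=dict)


@dataclass
class MonitoredTest:
    name: str
    warning: float
    critical: float
    profile: ActionProfile             # shared reference, not a copy


def severity(test, value):
    """Classify a reading; assumes higher values are worse (e.g. CPU load)."""
    if value >= test.critical:
        return "CRITICAL"
    if value >= test.warning:
        return "WARNING"
    return "OK"


def evaluate(test, value):
    """Return the severity and the recipients the template names for it."""
    level = severity(test, value)
    return level, list(test.profile.recipients.get(level, []))
```

Because every test holds a reference to the same profile object, adding a recipient to the profile changes who is notified for all tests that use it, which mirrors the template behavior described above.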

Action profiles can be assigned by time of day, providing you with flexibility in changing actions for the night shift. You can also create custom actions such as automatic restarting of a database after a crash.
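Selecting an action profile by time of day might look like the sketch below. The schedule format and the `active_profile` function are assumptions for illustration; the only subtle point is handling a shift that wraps past midnight.

```python
from datetime import time


def active_profile(now, schedule, default):
    """Pick the profile whose time window contains `now`.

    schedule is a list of (start, end, profile_name) with datetime.time bounds;
    a window whose start is later than its end is treated as wrapping midnight.
    """
    for start, end, profile in schedule:
        if start <= end:
            match = start <= now < end
        else:                          # e.g. 20:00 -> 08:00 night shift
            match = now >= start or now < end
        if match:
            return profile
    return default
```

For example, with a day window of 08:00 to 20:00 and a night window of 20:00 to 08:00, a reading taken at 23:30 would be handled by the night-shift profile.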

Traverse also suppresses alarms for cascading events based on topology relationships between devices and uses complex heuristics to avoid false suppression.
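A much simplified version of topology-based suppression is sketched below: a device's alarm is suppressed if any upstream device on its path is also down, so only the likely root cause is reported. The `parent` map and `root_causes` function are hypothetical, and this sketch does not attempt the heuristics Traverse uses to avoid false suppression.

```python
def root_causes(down, parent):
    """Return the down devices that have no down ancestor in the topology.

    down   -- set of device names currently failing
    parent -- dict mapping each device to its upstream device (or None)
    """
    roots = set()
    for device in down:
        ancestor = parent.get(device)
        seen = set()                   # guards against loops in the map
        suppressed = False
        while ancestor is not None and ancestor not in seen:
            if ancestor in down:
                suppressed = True      # an upstream failure explains this one
                break
            seen.add(ancestor)
            ancestor = parent.get(ancestor)
        if not suppressed:
            roots.add(device)
    return roots
```

If a switch and the two servers behind it all stop responding, this reports only the switch, so the operator receives one alarm instead of three.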
