Monday, March 10, 2014

Data Center Overviews with Nagios/Icinga Nagvis Visualization


A Data Center Overview Map provides detailed information at the system level, primarily Layers 1 through 4 of the OSI model. At Layer 1, for instance, hardware condition is shown as hard disk, memory and processor information. At Layer 4, transport-layer information is provided by TCP host checks.
NagVis Data Center Overview
Additional information about configuring NagVis is available in the article on Enterprise Overviews with NagVis.
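
To make the Layer 4 part concrete, here is a minimal sketch of what a TCP host check does, written in Python rather than taken from any actual plugin. The function name `check_tcp` and its parameters are made up for illustration; the only convention borrowed from Nagios/Icinga is the plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import socket

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host, port, timeout=5.0):
    """Try a TCP connect and return (exit_code, message).

    Hypothetical helper for illustration; a real deployment would
    normally use an existing plugin instead.
    """
    try:
        # create_connection performs the TCP handshake or raises OSError
        # (refused, unreachable, timed out, name resolution failure).
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"TCP OK - {host}:{port} accepts connections"
    except OSError as exc:
        return CRITICAL, f"TCP CRITICAL - {host}:{port}: {exc}"
```

A successful handshake says nothing about the application behind the port, which is why the service checks on the map are kept separate from the host checks.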

Nagios / Icinga / NagVis Model

For the systems monitored by this model, the map will consist of:
  • A Background Image depicting two hardware racks
  • Host Icons to depict the Servers and WAN Router
  • Service Icons to depict selected Server and WAN Router Interfaces
  • Service Group Lines to depict iSCSI and WAN Links
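
The four kinds of map objects listed above can be sketched as a small Python script that prints a NagVis-style map definition. This is only an illustration: the host name, service description, service group name, image file name and coordinates are all invented, and the attribute names (`map_image`, `view_type`, and so on) reflect my understanding of the NagVis map config format and should be checked against the NagVis documentation for the installed version.

```python
def render_object(obj_type, attrs):
    """Render one 'define <type> { key=value ... }' block."""
    lines = [f"define {obj_type} {{"]
    lines += [f"    {key}={value}" for key, value in attrs.items()]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical objects: every name and coordinate below is a placeholder.
MAP_OBJECTS = [
    # Background image depicting the two hardware racks.
    ("global", {"alias": "Philadelphia Data Center", "map_image": "racks.png"}),
    # Host icon for a server.
    ("host", {"host_name": "phl-dc01", "x": 120, "y": 80}),
    # Service icon for one interface on that server.
    ("service", {"host_name": "phl-dc01",
                 "service_description": "iSCSI Interface",
                 "x": 160, "y": 80}),
    # Service group drawn as a line between two points.
    ("servicegroup", {"servicegroup_name": "iscsi-links",
                      "view_type": "line",
                      "x": "160,320", "y": "80,80"}),
]


def render_map(objects):
    """Join all object blocks into one map config string."""
    return "\n\n".join(render_object(t, a) for t, a in objects) + "\n"


if __name__ == "__main__":
    print(render_map(MAP_OBJECTS))
```

Generating the file from a list keeps icon coordinates in one place, which helps when the background image changes and everything has to be nudged.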
Once the Data Center Overview is finalized, it depicts all monitored objects as up and running. The whole point of a model is to simulate behavior under real-world conditions. To do so, a series of service and link failures in the Philadelphia Data Center, ending with the WAN link on the Philadelphia Router, will be simulated. The failures will be:
  1. A secondary interface on the Philadelphia Monitoring Server will be administratively shut down
  2. The Windows Domain Controller's Active Directory Web Service will be stopped
  3. The iSCSI interface on the Windows Domain Controller connecting to the iSCSI SAN Server will be disconnected
  4. The router interface on the Harrisburg WAN Router connecting to the Philadelphia WAN Router will be disconnected
  5. The router interface on the Philadelphia WAN Router connecting to the already unavailable Harrisburg WAN Router will be disconnected
For this test, an Icinga Server running NagVis located in the Pittsburgh Data Center shall monitor and report the condition of services in the Philadelphia Data Center.

Testing the Model's NagVis Reporting

The expected behavior is:
  1. The monitoring server will recognize the administratively down Philadelphia Monitoring Server interface and report the Interface Service Icon as WARNING Yellow
  2. The monitoring server will recognize the stopped Active Directory Web Services and report the service as CRITICAL Red
  3. The monitoring server will recognize the disconnected interface between the Philadelphia Windows Domain Controller and the Philadelphia SAN Server and report the Interface Service Icon and iSCSI Link Service Group Line as DOWN Red
  4. The monitoring server will recognize the disconnected Harrisburg WAN Router interface and report the Interface Service as DOWN Red and the Philadelphia WAN Router as AVAILABLE Green
  5. The monitoring server will recognize the disconnected interface on the Philadelphia WAN Router and report the Interface Service as DOWN Red and Router as CRITICAL Red
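
Step 3 above relies on a Service Group Line turning red as soon as one of its members fails. A rough sketch of that "worst state wins" roll-up is below. The severity ordering and the color names are simplifying assumptions for illustration (NagVis uses configurable state weights and more states than the three shown), and `summary_state` and `line_color` are hypothetical helper names.

```python
# Assumed severity ranking: higher number wins when states are combined.
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}

# Colors as used in the expected-behavior list.
COLOR = {"OK": "green", "WARNING": "yellow", "CRITICAL": "red"}


def summary_state(member_states):
    """Return the most severe state among the members.

    An empty group is treated as OK here, which is an assumption
    made only to keep the sketch total.
    """
    return max(member_states, key=SEVERITY.__getitem__, default="OK")


def line_color(member_states):
    """Color a service group line by its summary state."""
    return COLOR[summary_state(member_states)]
```

For example, `line_color(["OK", "CRITICAL"])` evaluates to `"red"`, matching the iSCSI link line after one interface is disconnected.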
Upon restoring the Services and WAN Link interfaces, it takes three to four minutes for the two monitoring servers to update the host checks and report all objects as available. However, the interface also allows the administrator to manually reschedule and refresh Host and Service Checks to more quickly update information.
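
The manual reschedule mentioned above can also be scripted. As far as I know, Nagios and Icinga 1.x accept external commands such as `SCHEDULE_FORCED_SVC_CHECK` and `SCHEDULE_FORCED_HOST_CHECK` written to their command file. The sketch below only formats those command lines; the function names are hypothetical, the host and service names in the usage note are placeholders, and the command file path depends on the installation, so it is left as a parameter.

```python
import time


def forced_service_check(host, service, now=None):
    """Format an external command that forces an immediate service check."""
    ts = int(time.time()) if now is None else int(now)
    return f"[{ts}] SCHEDULE_FORCED_SVC_CHECK;{host};{service};{ts}\n"


def forced_host_check(host, now=None):
    """Format an external command that forces an immediate host check."""
    ts = int(time.time()) if now is None else int(now)
    return f"[{ts}] SCHEDULE_FORCED_HOST_CHECK;{host};{ts}\n"


def submit(command_file, line):
    """Append one command line to the monitoring server's command file.

    command_file is installation specific and must be supplied by the caller.
    """
    with open(command_file, "a") as fh:
        fh.write(line)
```

Calling `forced_service_check("phl-wan-router", "Interface Fa0/1")` and passing the result to `submit` with the local command file path should queue the check right away instead of waiting for the next scheduled interval.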

The video below demonstrates that the expected behavior is, indeed, what happens.

