Saturday, October 19, 2013

Installing and Configuring Basic Zabbix Functionality on Debian Wheezy

The Zabbix project began in 1998 when Alexei Vladishev started working on an internal project.  By 2001, it was released in alpha and the first stable release was in 2004.  Six years is a long time for a project to reach stable release, but Zabbix is an ambitious undertaking.

Zabbix uses a variety of mechanisms to collect data.  It supports SNMP gets and also provides an installable host agent.  The host agent supports passive and active checks:  in a passive check, the server polls the agent for a value; in an active check, the agent retrieves its list of checks from the server, collects the data itself and pushes the results back.  It is also designed to be scalable, providing a data-collecting proxy and a Java JMX application-monitoring proxy -- the Zabbix Java Gateway.

It is reasonably easy to install and configure the basic functionality.  But Zabbix is not an entry-level application.  An experienced administrator is necessary to design and install a full-featured deployment.  For example, implementing SNMP checks is described on the supporting documentation web page, but the administrator configuring the checks needs to have some familiarity with SNMP MIBs to obtain useful (and appropriate) OIDs. Adding functionality beyond the considerable amount available out-of-the-box also requires some regular expression knowledge.

Even the Debian Wheezy installation requires some extra work.  Debian's apt packaging system is normally very good at installing all of the dependencies.  Not so with Zabbix.  It took some hunting around the blogosphere to figure out a workable installation.  Start by installing the Postfix Mail Server and MySQL database.  Zabbix also supports Postgres databases, but the virtual machine test environment in which this is deployed is served well by MySQL.

Zabbix Installation

The command line installation requires three steps:
  1. apt-get install postfix postfix-mysql mysql-client mysql-server
  2. apt-get install apache2 apache2-mpm-prefork apache2-utils libexpat1 libapache2-mod-php5 php5-common php5-gd php5-mysql php5-cli php5-cgi libapache2-mod-fcgid php-pear php-auth php5-mcrypt mcrypt php5-imagick imagemagick php5-curl libcurl4-openssl-dev
  3. apt-get install zabbix-agent zabbix-server-mysql zabbix-frontend-php phpmyadmin
The Debian Postfix installation offers several configuration options.  The test network has a Microsoft Exchange server, so Postfix will use the Exchange Hub as its smarthost.  Provide the mail server name and the DNS name or IP address of the Exchange smarthost.  The Debian MySQL installation will prompt for the password of the MySQL database root user.  The Apache web server installation requires PHP support.  Unfortunately, the Debian package dependencies do not provide all the software Zabbix needs, so supply a more explicit list of packages.  Finally, install the Zabbix agent, the MySQL back end server and the PHP front end applications.
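The smarthost answer given to the Postfix installer ends up as a relayhost line in /etc/postfix/main.cf; a minimal sketch, with placeholder names standing in for the test environment's Exchange hub:

```
# /etc/postfix/main.cf -- route outbound Zabbix mail through the Exchange hub
# (both host names below are placeholders, not values from this post)
myhostname = zabbix.example.local
relayhost = [exchange.example.local]
```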

The video below illustrates these three steps with prompted configurations designed for the test environment.

Upon completing the command line installation, copy the Zabbix Apache configuration file to the Apache conf.d directory and proceed with the web-based installation.  You will have to correct the php.ini file, create and populate a zabbix database, install the zabbix.conf.php configuration file, modify the /etc/default/zabbix-server file and update the /etc/zabbix/zabbix_server.conf file with the correct username and password.  These steps are illustrated in the video below.

Unlike many Debian LAMP applications that are installed but securely locked down, there is a bit more work to deploy Zabbix.  First, copy the /usr/share/doc/zabbix-frontend-php/apache.conf file, renaming it as /etc/apache2/conf.d/zabbix.conf.  Use PHPMyAdmin to create and populate a zabbix database.  The installation is now ready to continue with the Zabbix web installer.  Browse to http://<servername>/zabbix.  It first checks the /etc/php5/php.ini file and the default installation requires some modifications.  These are clearly indicated.  Once the edits are complete, recheck and proceed.  Next, configure Zabbix for the MySQL server.  Then supply the Zabbix server name and review the final check.  Download the zabbix.conf.php file and upload it to the server's /etc/zabbix directory. The server is now almost ready to use.  Modify the /etc/default/zabbix-server file from the default START=no to START=yes.  Then, provide the mysql user "root" and its password to the /etc/zabbix/zabbix_server.conf file.
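Gathered into one place, the file-level steps above look roughly like this -- a sketch assuming the Debian Wheezy package paths described in this post, run as root:

```shell
# Publish the Zabbix front end under Apache
cp /usr/share/doc/zabbix-frontend-php/apache.conf /etc/apache2/conf.d/zabbix.conf
service apache2 restart

# After the web installer finishes, upload its zabbix.conf.php to /etc/zabbix,
# then enable the server daemon
sed -i 's/^START=no/START=yes/' /etc/default/zabbix-server

# Set DBUser and DBPassword in /etc/zabbix/zabbix_server.conf to match the
# database created earlier, then start the server
service zabbix-server start
```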

Zabbix Configuration


Browse to http://<servername>/zabbix and supply the default user name "Admin" and password "zabbix".  The default theme is clean and attractive, but I prefer darker colors and change the theme to Black and Blue.

On a production deployment, one of the first things to configure appears near the end of the support site's documentation:  Discovery.  This is a great feature, even when implemented in a basic form.  Zabbix provides a very configurable discovery system that saves a lot of work for an administrator who knows how to set it up.

The first step is to deploy Zabbix Agents on the monitored hosts.  Many operating systems are supported, but only Linux and Windows are described here.  The Linux client is well documented.  There are only three mandatory changes to apply:  Server, ServerActive and Hostname.  The first two define the IP address of the Zabbix server for passive and active checks.  The third provides the unique host name to the server. The Windows client configuration file is a deplorable mess:  a long string of characters lacking carriage returns.  Just add the three lines above to the file and call it good enough for now.
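The three mandatory settings live in the agent's configuration file (on Debian, /etc/zabbix/zabbix_agentd.conf); the address and host name below are placeholders:

```
# /etc/zabbix/zabbix_agentd.conf -- the only three required changes
Server=192.168.1.5            # Zabbix server address, for passive checks
ServerActive=192.168.1.5      # Zabbix server address, for active checks
Hostname=web01.example.local  # unique name reported to the server
```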

Host Groups
Devices may be logically grouped by type using Host Groups.  "Linux Servers" is included by default.  For this example, add a group "Windows Servers" to display them under one heading.


Templates provide a configurable set of monitoring, trigger and display items and a large number are included by default.  For this example, modify "Template OS Linux" and "Template OS Windows" to add a default SNMP community string "public" to the OS templates.


There is much more to Discovery than simply pinging hosts or identifying agents.  The Discovery - Agent interaction may be programmed to take care of a great deal of drudgery using Actions.  Actions specify Operations to perform when specified Conditions are met.  Take this rule, for instance.  The Conditions specify the returned Zabbix Agent OS value is like "Linux" and the Operation is to place the Server in the Host Group "Linux Servers" and apply the monitoring Template "Template OS Linux" to the host.  With no further manual work, discovered hosts are configured with a reasonably comprehensive set of monitoring rules.  Clone this rule and change the OS value to "Windows" and the actions to automatically add Windows hosts to the Windows Host Group and apply the Template OS Windows.

The default Discovery rule is disabled and configured to search the network.  Modify its IP range to match your local addressing scheme, but restrict the Discovery Process to one subnet at a time, preferably a 24-bit netmask.  By default, this Discovery Process will look for hosts using Zabbix Agent queries and ICMP pings.  Continue adding subnets to define the set of hosts to be monitored.  How many you initially configure depends upon the hardware on which Zabbix is deployed, but test with three to ten.  It is a bit resource hungry when fully operational.

If you plan to monitor devices for which there are no Zabbix Agents -- such as switches and routers -- you should also add SNMP to the Discovery options.  Provide the read-only community (typically public) and OID ifDescr and SNMP will search for Ethernet interfaces.
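Before adding SNMP to the Discovery rule, it is worth confirming that a device actually answers the community string; a quick check with the Debian snmp package (the address and community string are placeholders):

```shell
# Walk ifDescr (1.3.6.1.2.1.2.2.1.2) to list a device's interfaces
snmpwalk -v 2c -c public 192.168.1.1 1.3.6.1.2.1.2.2.1.2
```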

The following video depicts the above Discovery and Actions processes.

Zabbix Screens

Screens are a customizable way to aggregate information.  There is a great variety of information that may be presented, and one example does not suffice.  However, the video below shows how to add all of the network's WAN interfaces to a screen, an external URL to a NagVis visualization and the private interfaces for the Philadelphia router and Domain Controller.  Once configured, the Domain Controller begins a download and WAN links begin to fail.  Since the WAN routers use OSPF, they quickly fail over to live routes and the download continues.  Zabbix graphs depict the traffic on WAN links throughout the scenario.

Zabbix Alerts

Monitoring systems need to provide notifications.  Zabbix provides well-implemented notification and tracking.  First, add additional monitoring functionality to the SAN servers by applying Template SNMP Disks.  Then create a user and the User Group SAN Admins.

Unfortunately, Zabbix does not have readily configured topology logic in its alert system.  For instance, if a router fails, Zabbix will alert not only that the router has failed, but also that every host behind it is now unreachable.  Thus, administrators are flooded with alerts that distract from the root cause of the problem:  a failed router.  This situation leads to confusion and additional difficulty diagnosing problems.

Thus, while Zabbix is very useful and one of the best Debian systems visualization packages available, it is primarily a host monitoring tool and not an enterprise-class systems management tool.  It is well worth deploying for host trending and visualization, but its limited alerting logic argues against its use as an enterprise-class monitoring tool.

Friday, October 18, 2013

Xymon Host Monitoring

Xymon, like rrdtool, has a long history.  Its oldest antecedent is Big Brother, a project that has become commercial and well worth checking out.  The original Big Brother project lay dormant as the developers concentrated on the Professional edition.  Two notable forks picked up the original project:  Big Sister and Hobbit.  Forks are confusing enough, but the Hobbit team eventually decided to drop the Tolkien-themed names (preferring to avoid copyrighted and trademarked nomenclature) and adopted Xymon.  Working with Xymon can be a bit of a dinosaur dig, with old names sprinkled throughout the software's configuration files.

Xymon gathers information with an installed client application and also monitors TCP services.  The original Big Brother project did not provide much more than that, but the Xymon team has added more functionality.  With a bit of extra work, specific processes may also be monitored.

As with most Debian software, installation is easy:  apt-get install xymon xymon-client apache2.

Also like many Debian web-based application installations, the default installation is locked down and needs a bit of configuration file editing.  

Below you see the first indication of the project's history -- the /etc/apache2/conf.d directory's Xymon configuration file is named hobbit.

Simply change the line "Allow from localhost  ::1/128" to "Allow from all."

"Allow from all" by itself is insecure and the web server should be protected with authentication -- the .htaccess file.  The Xymon team has provided documentation of how to use the files that implement their version of Linux/Apache .htaccess authentication -- hobbitpasswd and hobbitgroups located in the /etc/hobbit directory.  Yes, another bit of inherited nomenclature.


Once the web server is configured, browse to http://<servername>/hobbit.  The monitoring host is already configured.

The configuration files are kept in the /etc/hobbit directory.  Lo and behold, we are greeted with even older heritage -- the bb-hosts and bb-services files.  Ahhh, that brings back some fond memories.

Most of the editing is in the bb-hosts file.  The format is well-explained in the comments, but the general idea is the hosts lines consist of an IP address, host name, # character and list of services.  These services are in addition to those provided by the xymon client; more on that later.
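A sketch of two such host lines (the addresses and names are invented for illustration):

```
# /etc/hobbit/bb-hosts -- IP address, host name, "#", then extra service tags
192.168.1.10   www1.example.local    # http
192.168.1.11   mail1.example.local   # smtp imap
```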

A few quick edits bring up more hosts.

Unfortunately, it is not uncommon for the client to insert the wrong host name in the monitored host's /etc/default/hobbit-client file.  Often (as in this case), the host name is pulled from an old /etc/hosts entry that was never updated (my bad).

The host names on the clients must match those in the /etc/hobbit/bb-hosts file.  Once they do, the default client services (CPU, disk, files, memory and msgs, ports and procs) begin to show up.

There is also a monitoring agent for Windows.  The 32-bit application runs as a service and is configured from a file in the C:\Program Files (x86)\BBWin\etc directory.

The configuration file is fairly well documented, but not as simple as the Linux client's.  However, it is easy to add server-side TCP service monitoring as described below.

The information and presentation have improved significantly under the Xymon team.  Although no port checks are defined, clicking on a ports icon brings up netstat output.

Clicking on a procs icon brings up ps output.

Adding additional services to monitor is very easy.  The bb-services file is well documented and defining a service is as simple as naming it and adding a TCP port number to monitor.  This definition will only return a green (up) or red (down) state.  Adding SEND, EXPECT and OPTIONS lines can distinguish a service running in a degraded state and return yellow (degraded), as long as you know how the service operates.  Below is an illustration of the mysql service running on port 3306.
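A minimal entry of the kind described, assuming the stock file's stanza format:

```
# /etc/hobbit/bb-services -- a bare TCP check for MySQL
[mysql]
   port 3306
```

With only a port defined, Xymon reports green if the TCP connection succeeds and red if it fails; send/expect lines are needed to detect a degraded service.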

One host returns green and one red.  The explanation is simple:  the down host is newly installed and has the default bind-address configuration in the /etc/mysql/my.cnf file.  Commenting out that line and restarting the mysql service returns the state to green; mysql is now listening on all interfaces.

The Xymon team has incorporated rrdtool reporting into the application.  This is a great benefit for performance trending.

They have also provided a comprehensive host report with the output from Linux systems utilities.

Adding TCP ports and processes to monitor is relatively simple because the application provides netstat and ps outputs.

However, Xymon's notification capability is limited.  If a router or link fails, the monitoring server floods out failure notifications for every monitored service that is unreachable behind the failure, making it difficult to pick out the root cause.

Since Xymon is primarily a host-based monitoring application, it is not designed to monitor multi-homed hosts.  Thus, in the test network on which it is installed -- with redundant WAN links -- it may not catch a failed router interface if the configuration monitors hosts rather than individual interfaces.

Even with two failed interfaces, Xymon may not be aware of a problem because all hosts are reachable on at least one interface.

There is a way around this problem: defining interfaces as hosts with unique names.  Thus, each router has an eth0 through eth3 host name configured to monitor the four interfaces.  Simply doing this in the bb-hosts file leads to a rather cluttered monitoring entry page, however.
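For example, one router's four interfaces might be declared as four pseudo-hosts (all addresses and names below are placeholders):

```
# /etc/hobbit/bb-hosts -- one entry per router interface
10.1.0.1    phl-rtr-eth0   #
10.1.1.1    phl-rtr-eth1   #
10.1.2.1    phl-rtr-eth2   #
10.1.3.1    phl-rtr-eth3   #
```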

However, the bb-hosts file supports defining pages below the entry page quite easily.  And since these interface host definitions do not match the host name in the router's /etc/default/hobbit-client file, they report only connection and trends information.  It is a workable and efficient solution; with experience, the formatting syntax is flexible enough to keep the display manageable.

Notifications are managed in the hobbit-alerts.cfg file.  Recipients are defined as an e-mail address or a script (e.g. an SMS message sent by a script).  The following snippets of the configuration file provide an overview of some of the available options.
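A hedged sketch of the rule format -- the recipient address, phone number and script path below are placeholders:

```
# /etc/hobbit/hobbit-alerts.cfg -- rules pair match criteria with recipients
HOST=*  SERVICE=conn
    MAIL netops@example.local REPEAT=30 RECOVERED
    SCRIPT /usr/local/bin/send-sms 5551234567 FORMAT=SMS
```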

Xymon appears rather limited when first installed, but digging into the well-documented configuration files reveals many more options.  It is simple enough to deploy as an entry-level monitoring tool, with enough features to provide a reasonably comprehensive monitoring, trending, reporting and notification system.  It is not resource-intensive and scales fairly well with the option to deploy multi-server configurations.

Xymon is also a good tool to deploy in an emergency because it sets up fairly quickly and can provide sufficient information to identify problem hosts in a large, complex data center.

Thursday, October 17, 2013

Using Munin on Debian Wheezy for Monitoring and Trend Analysis

The heritage of Tobi Oetiker's rrdtool is described in a previous post on Cacti.  As mentioned there, a rich ecosystem of applications incorporates rrdtool -- Munin is the subject of this post.
Windows Memory Counters During Stress Tests
Cacti uses SNMP  -- a publicly available monitoring protocol -- to gather system information.  Munin uses its own agent, software that is provided by the application developers and installed on each host.  There is SNMP support described on the Munin site, but its forte is the installable agent.  Linux, FreeBSD, NetBSD, Solaris, AIX, OS/X Darwin, HP-UX and Windows are supported as of May 18, 2013.

Installing and Configuring the Munin Server

Installation on Debian Wheezy is trouble-free:  apt-get install munin munin-node apache2.  Note that the default installation uses Apache MPM-Worker, while many other Debian monitoring applications use Prefork.  If you plan on deploying multiple monitoring systems, you may wish to specify Prefork.  Be forewarned, Munin is resource-intensive and may be better left on a stand-alone system with MPM-Worker.

By default, Munin installs with the web server locked down.

For a testing environment, simply change the /etc/apache2/conf.d/munin definitions from "Allow from localhost ::1" to "Allow from all."  This is insecure and a production deployment should use authentication.

After restarting Apache ("service apache2 restart"), the default installation configuration will display with localhost.localdomain the only available host.

Configuring Linux Munin Nodes

Two files control Munin's data collection:  /etc/munin/munin.conf for the server and /etc/munin/munin-node.conf for the monitored hosts.  The only required additions to the munin-node.conf file are "allow <address>" lines for each Munin monitoring server IP address.  This file may then be copied by scp to each monitored host and the node process restarted (service munin-node restart).
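The allow directive takes a regular expression matching the server's IP address, so the additions look like this (the second address is a placeholder for the Munin server):

```
# /etc/munin/munin-node.conf -- which servers may poll this node
allow ^127\.0\.0\.1$
allow ^192\.168\.1\.5$
```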

However, you may customize the Munin Node configuration for each host.  The /etc/munin/plugins/ directory contains symlinks to plugins stored in the /usr/share/munin/plugins directory.  Simply delete unwanted symlinks and add new ones to those available in /usr/share/munin/plugins/.
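For instance, enabling one plugin and removing another might look like this (the plugin names are examples from the stock /usr/share/munin/plugins directory; run on the node as root):

```shell
# Enable MySQL thread monitoring and drop the interrupt statistics graph
ln -s /usr/share/munin/plugins/mysql_threads /etc/munin/plugins/mysql_threads
rm /etc/munin/plugins/irqstats
service munin-node restart
```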

Configuring Windows Munin Nodes

The Windows Munin Node runs as a service and is configured from its munin-node configuration file.  The default installation is pictured below.
Configuring additional counters for Windows Nodes is very easy and provides access to the large number of Windows Performance Counters.  For example, a large set of Microsoft SQL Server Performance Counters is available through Performance Monitor.  Simply select a desired counter and add a Performance Counter definition to the node configuration file.

 Microsoft Exchange Server offers a similarly large set of Performance Counters.

Configuring the Munin Server to Collect Node Data

The munin.conf server configuration file requires a bit more work, but not much.  Simply add the host name, IP address and (normally) use_node_name yes statements to define each host as specified and restart the munin process (service munin restart).  Munin has a variety of configuration and reporting options, but they are beyond the scope of this article and should be reviewed at Munin's extensive online documentation.
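A sketch of one such host definition (the group name and address are placeholders; Munin groups hosts by the domain part of the bracketed name):

```
# /etc/munin/munin.conf -- one stanza per monitored node
[example.local;web01]
    address 192.168.1.20
    use_node_name yes
```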

Once the configured server begins to collect data, the hosts appear, grouped by domain name and sorted by host name.  Here, two Windows Server 2008 servers are the first to appear.

Munin runs from a cron job at five minute intervals.  Do NOT change that interval, or the server will miss data and expend more system resources trying to interpret what it gets.  If Munin is grinding the server to a standstill, try editing the munin.conf file to tune the number of simultaneous processes.

Make no mistake.  Munin can eat up resources.  The server contacts the monitored nodes requesting the last five minutes of collected data (the munin-node process on monitored hosts).  Once the data has been collected on the server, a variety of processes (described in the munin.conf file) begin to process the data.  The number of processes can be configured in a fairly granular fashion.  Munin-html then takes over and formats the html output.  Now the real number crunching begins:  the munin-graph process.  This is a CPU- and disk-intensive process, generating the resultant graphs.  This virtual server has two Intel Core-i5 2.6 GHz processors assigned and they are pegged.  Disk IO is also intensive.
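The process tuning mentioned above is a handful of munin.conf settings; for example, max_processes caps the concurrent pollers (the value here is illustrative, not a recommendation):

```
# /etc/munin/munin.conf -- rein in a resource-hungry master
max_processes 4
```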

Soon, all of the monitored nodes appear on the overview page.

Why would anyone deploy a system as resource intensive as described?  Take a look at the results.  Even on a laptop -- with data collection gaps when it and the virtual machines are shut down -- the presentation is beautiful.

Munin excels at reporting processor, memory and process data.  Munin has beauty AND brains.

Munin, like Cacti, is not an alerting system.  It is designed for more granular trend analysis and capacity planning than Cacti, at the price of much greater system loads.  It succeeds with presentation-quality output.

The final words: Management-Pleasing Colors.