
Thursday, October 17, 2013

Using Munin on Debian Wheezy for Monitoring and Trend Analysis

The heritage of Tobi Oetiker's rrdtool was covered in a previous post describing Cacti.  As mentioned there, a rich ecosystem of applications incorporates rrdtool -- Munin is the subject of this post.
[Screenshot: Windows memory counters during stress tests]

Cacti uses SNMP -- a publicly available monitoring protocol -- to gather system information.  Munin uses its own agent: software provided by the application developers and installed on each host.  There is SNMP support described on the Munin site, but Munin's forte is the installable agent.  Linux, FreeBSD, NetBSD, Solaris, AIX, OS X (Darwin), HP-UX and Windows are supported as of May 18, 2013.

Installing and Configuring the Munin Server

Installation on Debian Wheezy is trouble-free:  apt-get install munin munin-node apache2.  Note that the default installation uses Apache's worker MPM, while many other Debian monitoring applications use prefork.  If you plan to deploy multiple monitoring systems, you may wish to specify prefork.  Be forewarned: Munin is resource-intensive and may be better left on a stand-alone system running the worker MPM.
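A minimal sketch of that installation, plus a check of which MPM the Apache build provides (the prefork package name is as shipped on Wheezy):

    # Install the Munin master, a local node and Apache
    apt-get install munin munin-node apache2

    # Confirm which MPM this Apache build is using (worker vs. prefork)
    apache2ctl -V | grep -i mpm

    # Only if other monitoring tools on the box require prefork:
    apt-get install apache2-mpm-prefork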


By default, Munin installs with the web server locked down.



For a testing environment, simply change the /etc/apache2/conf.d/munin definitions from "Allow from localhost 127.0.0.0/8 ::1" to "Allow from all."  This is insecure and a production deployment should use authentication.
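For reference, the relevant stanza in /etc/apache2/conf.d/munin ends up looking roughly like this (abridged; the packaged file contains additional directives):

    Alias /munin /var/cache/munin/www
    <Directory /var/cache/munin/www>
        Order allow,deny
        # Default was: Allow from localhost 127.0.0.0/8 ::1
        Allow from all
        Options None
    </Directory>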


After restarting Apache ("service apache2 restart"), the default installation will display with localhost.localdomain as the only available host.


Configuring Linux Munin Nodes


Two files control Munin's data collection:  /etc/munin/munin.conf for the server and /etc/munin/munin-node.conf for the monitored hosts.  The only required additions to the munin-node.conf file are "allow <address>" lines for each Munin monitoring server IP address.  This file may then be copied by scp to each monitored host and the node process restarted (service munin-node restart).
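A sketch of the additions, with 192.168.1.10 standing in for the Munin server's address (note that the allow directive takes a regular expression):

    # /etc/munin/munin-node.conf (excerpt)
    allow ^127\.0\.0\.1$
    allow ^::1$
    allow ^192\.168\.1\.10$    # allow the Munin server (example address)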

However, you may customize the Munin node configuration for each host.  The /etc/munin/plugins/ directory contains symlinks to plugins stored in the /usr/share/munin/plugins directory.  Simply delete unwanted symlinks and add new ones pointing to plugins available in /usr/share/munin/plugins/.
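For example, a hypothetical session that drops one plugin and links in another (the plugin names are purely illustrative; use whatever suits the host):

    cd /etc/munin/plugins
    rm nfs_client                                  # stop collecting an unwanted metric
    ln -s /usr/share/munin/plugins/ntp_offset .    # enable another stock plugin
    service munin-node restart                     # pick up the new plugin set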


Configuring Windows Munin Nodes

The Windows Munin node runs as a service and is configured from its munin-node configuration file.  The default installation is pictured below.

Configuring additional counters for Windows nodes is very easy and provides access to the large number of Windows Performance Counters.  For example, a large set of Microsoft SQL Server performance counters is available through Performance Monitor.  Simply select a desired counter and add a Performance Counter definition to the node configuration file.
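The exact stanza syntax depends on the munin-node-win32 build in use, so treat the following as an illustrative sketch only -- the section and key names here are assumptions to be checked against the sample configuration file shipped with the installer:

    ; illustrative sketch -- verify section and key names against the
    ; sample configuration shipped with your munin-node-win32 build
    [PerfCounterPlugin_sqlbuffercache]
    Object=SQLServer:Buffer Manager
    Counter=Buffer cache hit ratio
    GraphTitle=SQL Server buffer cache hit ratio
    GraphCategory=sqlserver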






 Microsoft Exchange Server offers a similarly large set of Performance Counters.


Configuring the Munin Server to Collect Node Data

The munin.conf server configuration file requires a bit more work, but not much.  Simply add the host name, IP address and (normally) a use_node_name yes statement to define each host.  No daemon restart is needed -- the munin-cron job picks up the change on its next pass.  Munin has a variety of configuration and reporting options, but they are beyond the scope of this article; see Munin's extensive online documentation.
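A sketch of such an entry, using an example host name and address:

    # /etc/munin/munin.conf (excerpt)
    [web01.example.lan]
        address 192.168.1.21
        use_node_name yes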


Once the configured server begins to collect data, the hosts appear, grouped by domain name and sorted by host name.  Here, two Windows Server 2008 servers are the first to appear.


Munin runs from a cron job at five-minute intervals.  Do NOT lengthen that interval, or the server will miss data and expend more system resources trying to interpret what it gets.  If Munin is grinding the server to a standstill, edit the munin.conf file instead to tune the number of simultaneous processes.
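Those knobs live in /etc/munin/munin.conf; as a sketch, with directive names as documented for the Munin 2.0 series that Wheezy ships:

    # /etc/munin/munin.conf (excerpt)
    max_processes 4     # concurrent munin-update pollers (default 16)
    max_graph_jobs 2    # concurrent munin-graph workers (default 6)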



Make no mistake: Munin can eat up resources.  The server contacts the monitored nodes (the munin-node process on each host), requesting the last five minutes of collected data.  Once the data has been collected on the server, a variety of processes (described in the munin.conf file) begin to work through it, and their number can be configured in a fairly granular fashion.  munin-html then takes over and formats the HTML output.  Now the real number crunching begins: the munin-graph process, which is CPU- and disk-intensive as it generates the resultant graphs.  This virtual server has two Intel Core i5 2.6 GHz processor cores assigned, and they are pegged.  Disk I/O is also intensive.




Soon, all of the monitored nodes appear on the overview page.


Why would anyone deploy a system as resource-intensive as this?  Take a look at the results.  Even on a laptop -- with data-collection gaps when it and the virtual machines are shut down -- the presentation is beautiful.

Munin excels at reporting processor, memory and process data.  Munin has beauty AND brains.









Munin, like Cacti, is not an alerting system.  It is designed for more granular trend analysis and capacity planning than Cacti, at the price of much greater system loads.  It succeeds with presentation-quality output.

The final words: Management-Pleasing Colors.
