Monday, March 31, 2014

High Availability Stand Alone Zabbix (Failover Zabbix, MySQL, Apache and Postfix with DRBD-Pacemaker)

This article describes fault tolerance with the Distributed Replicated Block Device (DRBD), which is analogous to disk mirroring over a network.

DRBD-Pacemaker Failover HA Cluster Diagram



High-Availability Clustering Technologies


High-availability clustering is designed to increase system availability.  For this article, we will be using:

  • Distributed Replicated Block Device (DRBD)
  • Corosync - Pacemaker

These technologies present two servers as one to the network; one server is active and the other is waiting to take over if the first fails or is taken off line.

The cluster is composed of elements that differ from a single-server deployment.  

  • Resources unique to each node
  • Data shared between nodes on the cluster
  • Services shared between the nodes on the cluster

Resources Unique to Each Node

Each node is a server with its own operating system and hardware.  The processor, memory, disk and IO subsystems (including network interfaces) are controlled by the operating system installed on the boot partition.

Data Shared Between Nodes on the Cluster

A Zabbix server includes several components:
  • Apache2 Web Server
  • MySQL Database Server
  • Postfix Mail Server
  • Zabbix Server
  • Zabbix Agent
  • Zabbix PHP Frontend
Once configured, Apache and Postfix do not require additional modifications and may remain unique to each server; their configuration files do not need to be shared.

For a MySQL cluster, there are two types of shared data:  configuration files and databases.  The configuration files are those located in the /etc/mysql/ directory.  When shared between the two nodes, the MySQL server will have an identical configuration regardless of the node that is active.  However, there are circumstances in which the MySQL configuration files may be unique to each server.  The databases are kept in the /var/lib/mysql/ directory and include the log files.

The Zabbix configuration files (of the server, client and front-end web server) are stored in the /etc/zabbix directory.  The PHP Frontend files (used by Apache) are stored in the /usr/share/zabbix/ directory; these may remain on each server.

MySQL Clustering Caveats

Although the two nodes share the same MySQL databases, UNDER NO CIRCUMSTANCES SHALL THE TWO NODES SIMULTANEOUSLY ACCESS THE DATABASES.  That is, only one node may run the mysqld daemon at any given time.  If two MySQL daemons access the same database, there will eventually be corruption.  The clustering software controls which node accesses the data.

DRBD - Corosync - Pacemaker Overview

The illustration below depicts a high-availability cluster design.  Each server has four network interfaces:

  • eth0 -- the publicly addressable interfaces
  • eth1 -- the DRBD data replication and control interfaces
  • eth2 and eth3 -- the Corosync - Pacemaker control interfaces

The first interface -- eth0 -- is the publicly addressable interface that provides MySQL database and Apache web server (for PHPMyAdmin) access.  Two IP addresses (one unique to each server) are assigned at boot time and a third is assigned by the Corosync - Pacemaker portion of the clustering software.

The second interface -- eth1 -- is controlled by the DRBD daemon.  The daemon is configured with two or more files: /etc/drbd.d/global_common.conf plus one resource file (r0.res, r1.res, r2.res...) for each shared block device.  For this example, only r0.res is installed.  System-wide settings are defined in the global_common.conf file.  Settings specific to each pair of shared block devices are defined in the r0.res (and any additional) resource files.  DRBD defines an entire block device (or hard drive) as shared and replicated between two nodes.  In this example, block device /dev/sdb (a SCSI drive in each server) is shared between the two nodes as /dev/drbd0.  Once configured, both servers see /dev/sdb as a new block device /dev/drbd0, and only ONE may mount it at any given time.  The server with the mounted partition replicates any changes to the failover node.  If the first server fails or is taken off line, the other server immediately mounts its /dev/drbd0 block device and assumes control of replication.
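For reference, a minimal r0.res resource file looks roughly like the sketch below.  The host names (node1, node2) and the eth1 replication addresses are placeholders for illustration, not values from this installation; LCMC generates the equivalent file automatically.

  resource r0 {
    protocol C;
    on node1 {
      device    /dev/drbd0;
      disk      /dev/sdb;
      address   192.168.10.1:7788;   # eth1 address of the first node (assumed)
      meta-disk internal;
    }
    on node2 {
      device    /dev/drbd0;
      disk      /dev/sdb;
      address   192.168.10.2:7788;   # eth1 address of the second node (assumed)
      meta-disk internal;
    }
  }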

The third and fourth interfaces -- eth2 and eth3 -- are controlled by the Corosync - Pacemaker software.  These interfaces provide communication links defining the status of each defined resource and control how and where shared services (such as shared DRBD block devices, IP addresses, and the MySQL / Apache2 daemons) run.  In failover clustering, only one node may actively be in control at any time.


DRBD-Pacemaker Failover HA Cluster Diagram

Installing and Configuring the MySQL - Apache - Postfix - Zabbix Failover Cluster

It is assumed the reader knows how to set up a Zabbix server.  If not, please read and understand this article:  Installing and Configuring Basic Zabbix Functionality on Debian Wheezy

Leave MySQL listening on the default loopback interface 127.0.0.1.  Then, start the Linux Cluster Management Console (LCMC) -- a Java application that will install and configure everything required for clustering.  Select the two nodes by name or IP address, install Pacemaker (NOT Heartbeat) and DRBD.

Once both nodes have the required software, configure the cluster.  LCMC will prompt you for two interfaces to use in the cluster (select eth2 and eth3) and the two-node system will be recognized as a Cluster.  Configure global options and then select device /dev/sdb on one node and mirror it to /dev/sdb on the other to configure DRBD device /dev/drbd0.  Format it with an ext4 file system and make sure to perform an initial full synchronization.
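LCMC performs these steps for you, but they can also be checked from a shell on the node that currently holds the DRBD Primary role -- a minimal sketch:

  • cat /proc/drbd  (watch the initial synchronization progress)
  • mkfs.ext4 /dev/drbd0  (format the replicated device, never /dev/sdb directly)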

When the DRBD device finishes synchronization, create a shared mount point -- /mnt/sdb -- and IP address (10.195.0.100 shared between eth0 on the nodes).  The cluster will now recognize the DRBD device as a file system on the Active node.

The shared data must then be moved to the DRBD device.  Stop MySQL on each node.  On the Active node, move the directories /etc/mysql, /var/lib/mysql and /etc/zabbix/ to /mnt/sdb/etc/mysql, /mnt/sdb/var/lib/mysql and /mnt/sdb/etc/zabbix, respectively, and create symlinks back to their original locations.  On the Inactive node, delete the mysql and zabbix directories and replace them with symlinks to the same locations -- even though those locations are not yet visible.

On the Active Node:
  • mkdir -p /mnt/sdb/etc /mnt/sdb/var/lib
  • mv /etc/mysql /mnt/sdb/etc/mysql
  • mv /etc/zabbix /mnt/sdb/etc/zabbix
  • mv /var/lib/mysql /mnt/sdb/var/lib/mysql
  • ln -s /mnt/sdb/etc/mysql /etc/mysql
  • ln -s /mnt/sdb/etc/zabbix /etc/zabbix
  • ln -s /mnt/sdb/var/lib/mysql /var/lib/mysql
On the Inactive Node:
  • rm -r /etc/mysql
  • rm -r /etc/zabbix
  • rm -r /var/lib/mysql
  • ln -s /mnt/sdb/etc/mysql /etc/mysql
  • ln -s /mnt/sdb/etc/zabbix /etc/zabbix
  • ln -s /mnt/sdb/var/lib/mysql /var/lib/mysql

The LCMC console is then used to finalize the shared services.  Add Primitive LSB Init Script resources (that is, only running on one server at a time) for MySQL and Apache2.  Once the services are installed, change the listener address in /etc/mysql/my.cnf to the shared IP Address of the cluster (10.195.0.100 for this installation).
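The my.cnf change itself is a single directive in the [mysqld] section:

  [mysqld]
  bind-address = 10.195.0.100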

Fail the servers back and forth several times to check that the system performs as expected.  Finally, install a database.  For this article, I install the Zabbix Monitoring database.

The video below illustrates the entire installation and configuration process.



Sunday, March 30, 2014

DRBD High Availability MySQL Servers

High-Availability Clustering Technologies

High-availability clustering is designed to increase system availability.  For this article, we will be using:

  • Distributed Replicated Block Device (DRBD)
  • Corosync - Pacemaker

These technologies present two servers as one to the network; one server is active and the other is waiting to take over if the first fails or is taken off line.

The cluster is composed of elements that differ from a single-server deployment.  

  • Resources unique to each node
  • Data shared between nodes on the cluster
  • Services shared between the nodes on the cluster

Resources Unique to Each Node

Each node is a server with its own operating system and hardware.  The processor, memory, disk and IO subsystems (including network interfaces) are controlled by the operating system installed on the boot partition.

Data Shared Between Nodes on the Cluster

For a MySQL cluster, there are two types of shared data:  configuration files and databases.  The configuration files are those located in the /etc/mysql/ directory.  When shared between the two nodes, the MySQL server will have an identical configuration regardless of the node that is active.  However, there are circumstances in which the MySQL configuration files may be unique to each server.  The databases are kept in the /var/lib/mysql/ directory and include the log files.

MySQL Clustering Caveats

Although the two nodes share the same MySQL databases, UNDER NO CIRCUMSTANCES SHALL THE TWO NODES SIMULTANEOUSLY ACCESS THE DATABASES.  That is, only one node may run the mysqld daemon at any given time.  If two MySQL daemons access the same database, there will eventually be corruption.  The clustering software controls which node accesses the data.



DRBD - Corosync - Pacemaker Overview

The illustration below depicts a high-availability cluster design.  Each server has four network interfaces:

  • eth0 -- the publicly addressable interfaces
  • eth1 -- the DRBD data replication and control interfaces
  • eth2 and eth3 -- the Corosync - Pacemaker control interfaces

The first interface -- eth0 -- is the publicly addressable interface that provides MySQL database and Apache web server (for PHPMyAdmin) access.  Two IP addresses (one unique to each server) are assigned at boot time and a third is assigned by the Corosync - Pacemaker portion of the clustering software.

The second interface -- eth1 -- is controlled by the DRBD daemon.  The daemon is configured with two or more files: /etc/drbd.d/global_common.conf plus one resource file (r0.res, r1.res, r2.res...) for each shared block device.  For this example, only r0.res is installed.  System-wide settings are defined in the global_common.conf file.  Settings specific to each pair of shared block devices are defined in the r0.res (and any additional) resource files.  DRBD defines an entire block device (or hard drive) as shared and replicated between two nodes.  In this example, block device /dev/sdb (a SCSI drive in each server) is shared between the two nodes as /dev/drbd0.  Once configured, both servers see /dev/sdb as a new block device /dev/drbd0, and only ONE may mount it at any given time.  The server with the mounted partition replicates any changes to the failover node.  If the first server fails or is taken off line, the other server immediately mounts its /dev/drbd0 block device and assumes control of replication.

The third and fourth interfaces -- eth2 and eth3 -- are controlled by the Corosync - Pacemaker software.  These interfaces provide communication links defining the status of each defined resource and control how and where shared services (such as shared DRBD block devices, IP addresses, and the MySQL / Apache2 daemons) run.  In failover clustering, only one node may actively be in control at any time.


Installing and Configuring the MySQL - Apache Failover Cluster

Begin by installing MySQL and Apache.  Leave MySQL listening on the default loopback interface 127.0.0.1.  Then, start the Linux Cluster Management Console (LCMC) -- a Java application that will install and configure everything required for clustering.  Select the two nodes by name or IP address, install Pacemaker (NOT Heartbeat) and DRBD.

Once both nodes have the required software, configure the cluster.  LCMC will prompt you for two interfaces to use in the cluster (select eth2 and eth3) and the two-node system will be recognized as a Cluster.  Configure global options and then select device /dev/sdb on one node and mirror it to /dev/sdb on the other to configure DRBD device /dev/drbd0.  Format it with an ext4 file system and make sure to perform an initial full synchronization.

When the DRBD device finishes synchronization, create a shared mount point -- /mnt/sdb -- and IP address (10.195.0.100 shared between eth0 on the nodes).  The cluster will now recognize the DRBD device as a file system on the Active node.

The shared data must then be moved to the DRBD device.  Stop MySQL on each node.  On the Active node, move the directories /etc/mysql and /var/lib/mysql to /mnt/sdb/etc/mysql and /mnt/sdb/var/lib/mysql, respectively, and create symlinks back to their original locations.  On the Inactive node, delete the mysql directories and replace them with symlinks to the same locations -- even though those locations are not yet visible.

The LCMC console is then used to finalize the shared services.  Add Primitive LSB Init Script resources (that is, only running on one server at a time) for MySQL and Apache2.
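For readers who prefer the crm shell to the LCMC GUI, a roughly equivalent configuration is sketched below.  The resource names are arbitrary and the parameters LCMC actually generates may differ; this is an illustration, not a dump of the cluster configuration.

  primitive p_drbd_r0 ocf:linbit:drbd params drbd_resource="r0"
  ms ms_drbd_r0 p_drbd_r0 meta master-max="1" clone-max="2" notify="true"
  primitive p_fs_sdb ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/mnt/sdb" fstype="ext4"
  primitive p_ip ocf:heartbeat:IPaddr2 params ip="10.195.0.100" cidr_netmask="24"
  primitive p_mysql lsb:mysql
  primitive p_apache2 lsb:apache2
  group g_mysql p_fs_sdb p_ip p_mysql p_apache2
  colocation col_g_mysql_on_drbd inf: g_mysql ms_drbd_r0:Master
  order ord_drbd_before_g_mysql inf: ms_drbd_r0:promote g_mysql:start

These lines are entered with "crm configure" and activated with "commit".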

Fail the servers back and forth several times to check that the system performs as expected.  Finally, install a database.  For this article, I install the Zabbix Monitoring database.

The video below illustrates the entire installation and configuration process.



Tuesday, March 25, 2014

High-Availability Nagios / Icinga on a DRBD - Corosync - Pacemaker Failover Cluster

Failover clustering is relatively uncomplicated and provides high availability.  It consists of two servers -- preferably almost identical -- in which one is active and the other passive.  In the event the active server fails or is manually taken offline, the passive server assumes the active role.

This may sound like the two nodes are independent, but they share resources.  There are data and shared services that must be available to both servers (though never simultaneously).  Shared data is stored on a shared drive controlled by the operating system with the help of clustering software, or on a SAN with a cluster-aware file system; shared services are controlled by the clustering software.
DRBD High-Availability Clustering


Installing and Configuring DRBD, Corosync and Pacemaker with LCMC

The first step is to provide the servers with a shared data drive.  In this case, the shared drive will be two physical drives -- one on each server -- replicated by the Distributed Replicated Block Device (DRBD) software package.  DRBD is analogous to drive mirroring; however, the drives reside on different servers and are mirrored over a network connection.  The / partition of the servers is installed on drive /dev/sda, and a blank drive -- /dev/sdb -- will be controlled by DRBD.  Once configured, /dev/sdb will no longer be accessed directly and will be addressed as /dev/drbd0.

The second step is to share the drive with the servers, control which one is active and assign a shared IP address by which the cluster will be available.  This is controlled by the Pacemaker - Corosync clustering software packages.  Pacemaker and Corosync use network interfaces (preferably at least two) to maintain a "ring" over which the servers provide status updates.  The cluster will move resource control from one server to the other, either manually or automatically.

For this setup, the servers each have four network controllers:

  • eth0 -- publicly-available
  • eth1 -- dedicated to DRBD data control and replication
  • eth2 and eth3 -- dedicated to Pacemaker - Corosync services control
There are several files that control these applications and they may be configured manually. DRBD is set up with the files /etc/drbd.d/global_common.conf and /etc/drbd.d/r0.res (or more if there are multiple drive resources).  Pacemaker - Corosync is set up with the /etc/corosync/corosync.conf file.
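For illustration, the two-ring portion of /etc/corosync/corosync.conf (Corosync 1.x, as shipped with Debian Wheezy) looks roughly like the sketch below; the bind network addresses for eth2 and eth3 are assumptions.

  totem {
    version: 2
    secauth: off
    rrp_mode: passive
    interface {
      ringnumber: 0
      bindnetaddr: 192.168.20.0    # eth2 subnet (assumed)
      mcastaddr: 226.94.1.1
      mcastport: 5405
    }
    interface {
      ringnumber: 1
      bindnetaddr: 192.168.30.0    # eth3 subnet (assumed)
      mcastaddr: 226.94.1.2
      mcastport: 5407
    }
  }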

There is another application -- the Linux Cluster Management Console (LCMC) -- that automates setting up such a cluster.  LCMC is Java-based and will install / configure all of the software necessary to operate a failover cluster.

The video below demonstrates adding the two unconfigured servers to LCMC, installing and configuring DRBD, Pacemaker and Corosync, setting up and replicating a shared ext4 partition on the shared data drive, adding a Pacemaker-controlled drive partition, adding a shared IP address and testing Active-Passive operations and failover.

Shared Apache, Postfix, MySQL and Icinga Resources

Once the basic shared resources are configured and tested, we can add applications that are controlled by the cluster.  There are a number of other posts in this blog that describe setting up Nagios and Icinga; refer to them for details.

Begin by installing and configuring the Apache2 web server, Postfix mail server and MySQL database server.  Then install and configure all of the Icinga monitoring, web and database packages.  These must be operating correctly on both servers and have identical configurations before clustering may proceed.

Now some important distinctions must be identified:  data and services that are controlled by the operating system versus those controlled by the cluster.  For instance, the clustering software runs independently on each server and is controlled by the operating system, available at boot time.  The web, mail and database servers share data and resources and are controlled by the cluster; they must be disabled at boot time and started by the clustering software.  This is very important.  For instance, MySQL will be configured so that the configuration files (/etc/mysql/*.*) and data are moved to the shared drive and symlinked back to their original locations.  Thus, since only one server has access to the configuration and data at a time, the MySQL daemon must be disabled at boot, and the DRBD - Pacemaker - Corosync clustering software decides which server has access to the files and starts the services.  I use the Webmin interface to disable all shared services at boot time.
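On Debian Wheezy, the same result can be achieved from a shell with update-rc.d (Webmin does this through its GUI); the service list below is illustrative:

  • update-rc.d mysql disable
  • update-rc.d apache2 disable
  • update-rc.d postfix disable
  • update-rc.d icinga disable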

This process is illustrated in the video below.  Upon completion, the cluster will be in control of all shared services (web, mail, database, Icinga, etc.).  Failover is illustrated.

Testing High-Availability Nagios / Icinga on a DRBD - Corosync - Pacemaker Failover Cluster

Once failover is demonstrated, it is time to allow the servers to collect, process and display data.  Which shared data files to move to the replicated DRBD drive depends upon which services are installed.  For instance, PNP4Nagios -- the RRDTool graphing add-on -- stores shared data in /var/lib/pnp4nagios; this directory is moved to the DRBD drive and symlinked back on each node.
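The move-and-symlink pattern is the same one used for the MySQL files -- a sketch, assuming /mnt/sdb is the DRBD file system mounted on the active node:

  • mkdir -p /mnt/sdb/var/lib
  • mv /var/lib/pnp4nagios /mnt/sdb/var/lib/pnp4nagios
  • ln -s /mnt/sdb/var/lib/pnp4nagios /var/lib/pnp4nagios

On the inactive node, remove the local /var/lib/pnp4nagios directory and create the same symlink.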

Also be aware that some directories do not move well.  There are numerous symlinks in the base installations that end up pointing to nothing if moved and symlinked back to the node file systems.  The /etc/icinga directory is a case in point.  If this directory is moved to the shared drive, Icinga's operation becomes, at best, unstable.  Thus, updates to the Icinga configuration must be manually installed on EACH node.


However, as the video below demonstrates, sharing the MySQL databases and moving selected PNP4Nagios and NagVis directories to the shared drive provide high-availability performance.






Wednesday, March 12, 2014

Configuring Debian Wheezy as Routers for a Four-Office Virtual Network

A previous article described installing Debian Wheezy servers on Oracle VirtualBox VMs.  This article continues setting up a four-office test environment by deploying Debian Wheezy with Quagga routing software.  These routers are the nervous system of the virtual network, connecting the offices together and to the Internet over redundant links.

Quagga on Debian

Quagga is a routing package whose command structure is similar to Cisco's IOS.  Forked from the Zebra project, Quagga continues to add features.  By default, it is accessed through local telnet sessions to specific ports for each supported routing protocol.  An integrated vty shell is available if so compiled, but this article will use the more familiar process of editing separate configuration files for different daemons.  For this example, Open Shortest Path First (OSPF) is utilized for its fast convergence and rapid recovery from faults.

Installation is simple:  "apt-get install quagga."  There are a few configuration steps before it will work.  First, edit the /etc/quagga/daemons file.  There are a series of statements defining which daemons are active; for our configuration, set "zebra" and "ospfd" to "yes."
  • zebra=yes
  • bgpd=no
  • ospfd=yes
  • ospf6d=no
  • ripd=no
  • ripngd=no
  • isisd=no
  • babeld=no
Each active daemon must also have a configuration file present in the /etc/quagga/ directory owned by user and group quagga with permissions 640.  Create "zebra.conf" and "ospfd.conf" files with the statements:
  • password ########
  • enable password ########
The daemons may then be started with "service quagga start."
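The ownership and permission requirements translate to commands like these; the password and enable password statements listed above go inside each file (replace ######## with your own values):

  • touch /etc/quagga/zebra.conf /etc/quagga/ospfd.conf
  • chown quagga:quagga /etc/quagga/zebra.conf /etc/quagga/ospfd.conf
  • chmod 640 /etc/quagga/zebra.conf /etc/quagga/ospfd.conf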

The first daemon to configure is Zebra – a name inherited from the original open source project that was picked up by the Quagga team.  Start by establishing a telnet session to port 2601, where the IOS-like login and syntax are apparent.  Enable password encryption (“service password-encryption”) and assign access passwords.  You may configure the interfaces with IP addresses, but it is not necessary because the daemons pick them up automatically.

OSPF is configured over a telnet session to port 2604.  The configurations used in this model are discussed in detail below.

Quagga injects its routes into the Linux kernel routing table.  Notice some of the differences between the Quagga OSPF route metrics (costs) and those in the Linux kernel.  Under Quagga, directly connected interfaces have a default cost of 10, but under the Linux kernel, they are 0.  There are some other important differences.  For instance, you may not assign a router-wide device IP address through Quagga.  To mimic one, create a separate loopback interface on the Linux host (/etc/network/interfaces) and give it a routable IP address.  Thus, the Coudersport Router is configured with lo:10 at address 10.0.0.254.
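The loopback alias is defined in /etc/network/interfaces; a sketch for the Coudersport router, where the /32 host mask is an assumption:

  auto lo:10
  iface lo:10 inet static
      address 10.0.0.254
      netmask 255.255.255.255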

OSPF WAN Model Overview

The model network consists of:
  1. Coudersport – The Host Laptop upon which VirtualBox runs.  It acts as a proxy-firewall Internet gateway router connected to the Internet by a Wireless LAN device and provided with three "public" Host-Only VirtualBox NICs (vboxnet0 through 2).
  2. Philadelphia – A WAN Router with three "public" interfaces and one private interface for a subnetted Class A network.
  3. Harrisburg – A WAN Router with three "public" interfaces and one private interface for a subnetted Class A network.
  4. Pittsburgh – A WAN Router with three "public" interfaces and one private interface for a subnetted Class A network.
  5. A backbone (Area 0) network, consisting of point-to-point connections between the above "public" interfaces.
The rationale for this design is that it mimics a live WAN network without the additional overhead required for VPNs.  That is, despite point-to-point links on an Area 0 backbone, the behavior accurately portrays the desired configuration without the overhead of additional "public" routers and VPN connections.

Initial WAN Router Configurations

Each router is configured as an Area router that summarizes the private routes it serves.  The basic OSPF configuration is quite simple:
  • ospf router-id 10.0.0.254 (10.64.0.254, 10.128.0.254, 10.192.0.254) 
  • network 10.0.0.0/10 area 10.0.0.0 (10.64.0.0/10 area 10.64.0.0, etc.) 
  • network 172.16.0.0/16 area 0.0.0.0 
  • area 0.0.0.0 range 172.16.0.0/16 
  • area 10.0.0.0 range 10.0.0.0/10 (area 10.64.0.0 range 10.64.0.0/10, etc.)
This configuration sets the router ID, defines the private network served and its area number, the backbone network to which the "public" interfaces are connected and summarizes the routes for the connected networks.  More on that later.
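Putting those statements in context, the OSPF section of a Philadelphia ospfd.conf would look roughly like this (passwords and interface sections omitted):

  !
  router ospf
   ospf router-id 10.64.0.254
   network 10.64.0.0/10 area 10.64.0.0
   network 172.16.0.0/16 area 0.0.0.0
   area 0.0.0.0 range 172.16.0.0/16
   area 10.64.0.0 range 10.64.0.0/10
  !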
 
The Coudersport proxy gateway, connected to the Internet, contains two additional lines:
  • passive-interface wlan0
  • default-information originate metric 1
The first statement prevents OSPF update announcements on the public, Internet-connected interface.  The second statement instructs OSPF to announce that this device is a default gateway -- a route to networks unknown to the routing process on other routers.

ABRs, ASBRs and Route Summarization

The backbone network -- Area 0 or Area 0.0.0.0 -- forms the core of the network.  All other areas are connected to the core, either directly or through virtual-links.  In this model, the core is an isolated network (not connected to any external systems).

Philadelphia, Pittsburgh and Harrisburg act as Area Border Routers (ABRs) -- routers configured with one or more areas and connected to the backbone.  Coudersport, with an Internet connection, acts as an Autonomous System Border Router (ASBR) -- strictly speaking, an ABR that is also connected to routers running another protocol (e.g. BGP).  In this case, it is simply a default gateway.

Only ABRs and ASBRs may provide route summaries.  For instance, the statement "area 10.64.0.0 range 10.64.0.0/10" on the Philadelphia router advertises a summary of every network in this range, rather than each individual network as a separate entry.  Thus, the summary includes networks 10.64.0.0/24, 10.64.1.0/24, 10.64.2.0/24... 10.127.255.0/24 -- 16,384 24-bit (254-host) networks.  This drastically reduces overhead for routing tables and updates between areas.  The four routers are each connected to the backbone Area 0 and provide route summaries to the other areas.

Basic OSPF Router Operation

Although simple, the configuration described above is also fully functional.  It provides a redundant mesh network, in which links -- not routers -- may fail and communications continue uninterrupted.  The simplicity of this bare-bones configuration is demonstrated in the video below.  Each router directs traffic directly to adjacent areas -- it lacks any traffic shaping or preferential behavior.  It works, but only for simple configurations in which "all things are equal."  The video initially depicts only the Coudersport Router operating and follows changes in the routing tables as each other router is started up and exchanges link state information with the rest of the network.




OSPF Operation and Updates

The video above displays a series of routers starting up and being monitored by a Nagios / NagVis server.  That's fine, but there is a lot more happening.  Each router running a link-state protocol keeps track of three sets of information in tables.  The information is:

  • Its immediate neighbor routers.
  • All the other routers in the network, or in its area of the network, and their attached networks.
  • The best paths to each destination.
And the tables are:
  • OSPF neighbor table = adjacency database
  • OSPF topology table = OSPF topology database = LSDB
  • Routing table = forwarding database
OSPF routers merely keep track of neighboring routers during normal operations.  They only exchange information when there is a topology change.  A defined process then efficiently updates the topology change among routers.  That process is:
  1. When a link changes state, the device that detected the change creates a link-state advertisement (LSA) concerning that link.
  2. The LSA propagates to all neighboring devices using a special multicast address (224.0.0.5).
  3. Each routing device stores the LSA and forwards it to all neighboring devices (in the same area).
  4. This flooding of the LSA ensures that all routing devices can update their databases and then update their routing tables to reflect the new topology.
  5. The LSDB is used to calculate the best paths through the network.
  6. Link-state routers find the best paths to a destination by applying Dijkstra’s algorithm, also known as SPF, against the LSDB to build the SPF tree.
  7. Each router selects the best paths from their SPF tree and places them in their routing table.
OSPF packets are used for:
  • Neighbor discovery, to form adjacencies
  • Flooding link-state information, to facilitate LSDBs being built in each router
  • Running SPF to calculate the shortest path to all known destinations
  • Populating the routing table with the best routes to all known destinations

Using OSPF Interface Costs to Modify Routes

The model configuration described above is simple because traffic takes the shortest route, defined by the number of hops -- and direct routes are available as long as all the links are active.  That may not be desirable.  Suppose there is "a reason" for the Philadelphia, Harrisburg and Pittsburgh traffic to route preferentially through the Philadelphia - Coudersport interface, through the Pittsburgh - Coudersport interface if that fails, and as a last resort through the Harrisburg - Coudersport interface.  One way to achieve that is by assigning administrative costs to interfaces.

Using Quagga, telnet to the localhost on port 2604 (OSPF) and issue the "config terminal" command.  Go to the "interface" set of commands and assign a cost with the command "ip ospf cost #" to prioritize traffic.  The greater the cost, the less preferred a path through that interface.  These costs are additive along a route.
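For example, assigning a cost of 100 to an interface (the interface name here is assumed) from a telnet session to port 2604:

  • telnet localhost 2604
  • enable
  • configure terminal
  • interface eth1
  • ip ospf cost 100
  • end
  • write memory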

To achieve the results described above, we will use the following interface costs:
  • Coudersport Router: 100 on the Pittsburgh interface, 200 on the Harrisburg interface
  • Pittsburgh Router: 100 on the Coudersport interface
  • Harrisburg Router: 200 on the Coudersport interface
Once these costs are assigned, the network will perform as desired.  The video below shows how it works.

Using OSPF Interface Bandwidth to Modify Routes

Assigning administrative costs to interfaces may work for small networks and simple configurations, but not for larger, dynamically changing ones.  It quickly becomes a time-consuming, cumbersome and error-prone administrative task.  A more viable option is to assign bandwidth values to interfaces and let OSPF calculate costs from the bandwidth.

The cost is calculated by dividing a reference bandwidth by the value of bandwidth assigned to an interface.  Thus, the greater the bandwidth, the lower the cost.  If you choose to override the default reference bandwidth used for the calculation using the "auto-cost reference-bandwidth" command, make sure to do so on ALL routers and using the same value.  If different reference bandwidths are used, inappropriate cost calculations result and faster interfaces may end up with higher costs than slower ones.

The reference bandwidth default value is 100 Mb/s, but this is no longer a desirable value.  The minimum calculated cost is 1, so a Gigabit Ethernet and Fast Ethernet port will have the same -- 1 -- cost.  Override the value with "auto-cost reference-bandwidth 1000" if Gigabit Ethernet is the fastest speed on the network.

For this model, we will assign bandwidths to all of the "public" interfaces in the backbone Area 0:

  • Coudersport - Philadelphia: 36,000 kb/s (T-3)
  • Coudersport - Harrisburg: 1,500 kb/s (T-1)
  • Coudersport - Pittsburgh: 18,000 kb/s (1/2 T-3 or 12 x T-1)
  • Harrisburg - Philadelphia: 72,000 kb/s (2 x T-3)
  • Harrisburg - Pittsburgh: 72,000 kb/s (2 x T-3)
  • Philadelphia - Pittsburgh: 36,000 kb/s (T-3)
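In Quagga, the bandwidth is assigned per interface (in kilobits per second) in the zebra configuration, and the reference bandwidth under "router ospf" in ospfd; a sketch for one end of the Coudersport - Philadelphia link, with the interface name assumed:

  ! zebra (telnet port 2601)
  interface eth1
   bandwidth 36000
  !
  ! ospfd (telnet port 2604)
  router ospf
   auto-cost reference-bandwidth 1000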

The results are interesting.  Philadelphia and Harrisburg still preferentially route to the Internet through the Philadelphia - Coudersport link, and through Pittsburgh if that fails.  Pittsburgh, however, routes through the Pittsburgh - Coudersport link.  Thus, we distribute Internet traffic over two links instead of one.  The same result can be achieved with administrative interface costs, but doing so for a large network would be a burdensome administrative task.  That does not mean interface administrative costs are not useful -- they override bandwidth cost calculations when assigned and may be used for fine tuning -- but the bandwidth calculation is automated and effective.

The video below demonstrates configuring bandwidth costs and how routes change as interfaces fail.

Merely setting bandwidths in Quagga does not actually set interface speeds.  This model's routers are built with Intel PRO/1000 desktop adapters on the virtual machines, so they operate at Gigabit Ethernet speed.  To accurately reflect WAN performance, the link speed must be changed to much lower values.  Merely changing the adapter speed through the Linux driver options limits the choice to Gigabit Ethernet, Fast Ethernet and Ethernet speeds -- 1,000, 100 and 10 Mb/s, respectively -- and is not granular enough to model WANs.  Another option is to write iptables rules to limit traffic.  That works well and can also be used to configure firewall QoS features.  But a simpler solution is to install the package "wondershaper."  It is not feature-rich, but it does everything we need for this model -- it sets interface speeds to whatever values we want.  From the man page:

  • wondershaper [ interface ] [ downlink ] [ uplink ]
  • Configures the wondershaper on the  specified  interface,  given the  specified  downlink  speed  in kilobits per second, and the specified uplink speed in kilobits per second.

That's all we need to configure realistic WAN link speeds for the model.
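For example, capping an interface at the modeled Coudersport - Philadelphia T-3 rate (the interface name is assumed):

  • wondershaper eth1 36000 36000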


NagVis - Zabbix Video Demonstration

Finally, a brief demonstration of the routing protocols in operation during WAN link failures.




Monday, March 10, 2014

Data Center Overviews with Nagios/Icinga Nagvis Visualization


A Data Center Overview Map provides detailed information at the system level.  This is primarily Layers 1 through 4 of the OSI Networking Model.  At Layer 1, for instance, hardware condition is displayed as hard disk, memory and processor information.  At Layer 4, Transport Layer information is provided by TCP host checks.
NagVis Data Center Overview
Additional information about configuring NagVis is available in the article on Enterprise Overviews with NagVis.

Nagios / Icinga / NagVis Model

For the systems monitored by this model, that will be:
  • A Background Image depicting two hardware racks
  • Host Icons to depict the Servers and WAN Router
  • Service Icons to depict selected Server and WAN Router Interfaces
  • Service Group Lines to depict iSCSI and WAN Links
Once the Data Center Overview is finalized, it depicts all monitored objects as up and running.  The whole point of a model is to simulate behavior under real-world conditions.  To do so, a series of failures in the Philadelphia Data Center will be simulated.  The failures will be:
  1. A secondary interface on the Philadelphia Monitoring Server will be administratively shut down
  2. The Windows Domain Controller's Active Directory Web Service will be stopped
  3. The iSCSI interface on the Windows Domain Controller connecting to the iSCSI SAN Server will be disconnected
  4. The router interface on the Harrisburg WAN Router connecting to the Philadelphia WAN Router will be disconnected
  5. The router interface on the Philadelphia WAN Router connecting to the already unavailable Harrisburg WAN Router will be disconnected
For this test, an Icinga Server running NagVis located in the Pittsburgh Data Center shall monitor and report the condition of services in the Philadelphia Data Center.

Testing the Model's NagVis Reporting

The expected behavior is:
  1. The monitoring server will recognize the administratively down Philadelphia Monitoring Server interface and report the Interface Service Icon as WARNING Yellow
  2. The Monitoring Server will recognize the stopped Active Directory Web Services as Critical Red
  3. The monitoring server will recognize the disconnected interface between Philadelphia Windows Domain Controller and Philadelphia SAN Server and report  the Interface Service Icon and iSCSI Link Service Group Line as DOWN Red
  4. The monitoring server will recognize the disconnected interface as DOWN Red and the Philadelphia WAN Router as AVAILABLE Green
  5. The monitoring server will recognize the disconnected interface on the Philadelphia WAN Router and report the Interface Service as DOWN Red and Router as CRITICAL Red
Upon restoring the Services and WAN Link interfaces, it takes three to four minutes for the two monitoring servers to update the host checks and report all objects as available. However, the interface also allows the administrator to manually reschedule and refresh Host and Service Checks to more quickly update information.

The video below demonstrates that the expected behavior is, indeed, what happens.


Sunday, March 9, 2014

Enterprise Overviews with Nagios/Icinga Nagvis Visualization

This article describes how to install and configure NagVis, a highly customizable add-on visualization package.  Nagios and Icinga include  a basic Map feature that depicts the enterprise, but it is not very customizable.
NagVis Enterprise Overview


Icinga-Web and the NagVis Default Automap

NagVis is an add-on for Nagios and Icinga that provides highly customizable views.  It uses the NDO / IDO database as its source of information.  The Maps are then defined through a dedicated user interface.  NagVis also includes, by default, an Automap; it is a slightly improved view compared to the Nagios / Icinga Map views, but it also does not include many customizable features.

The video below depicts the Icinga Web pages to view Hosts, Host Groups, Services and Service Groups and the NagVis Automap feature. 





Creating and Configuring a NagVis Map


NagVis provides much, much more than the default Automap feature.  The Automap only provides Host objects arranged according to Parent relationships as defined in the configuration files.  Custom Maps provide a wide range of Object types (Services, Host and Service Groups, etc.), Backgrounds, and Shapes (images that depict user-defined objects).

An Enterprise Overview Map only needs to provide the highest-level information.  For the systems monitored by this model, that will be:
  • A Background Image depicting the geographical area covered
  • Shapes to depict WAN Router locations
  • Host Icons to depict the WAN Routers
  • Host Group icons to depict Data Center servers
  • Service Icons to depict WAN Router Interfaces
  • Service Group Lines to depict WAN Links
  • Service Group Icons to depict the enterprise-wide health of groups of monitored services.
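Behind the editor, each NagVis map is a plain-text .cfg file in the NagVis maps directory.  A heavily abbreviated sketch of the kind of objects the editor writes -- the names, background image and coordinates are purely illustrative:

  define global {
      alias=Enterprise Overview
      map_image=pennsylvania.png
  }
  define host {
      host_name=philadelphia-router
      x=520
      y=340
  }
  define servicegroup {
      servicegroup_name=wan-links
      x=60
      y=60
  }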
The video below demonstrates how these items are added. 



Nagios / Icinga / NagVis Model

Once the Enterprise Overview is finalized, it depicts all monitored objects as up and running. The whole point of a model is to simulate behavior under real-world conditions.  To do so, a series of WAN link failures on the Philadelphia Router will be simulated.  The failures will be:
  1. The router interface on the Philadelphia Router connecting to the Harrisburg Router will be administratively shut down
  2. The router interface on the Philadelphia Router connecting to the Pittsburgh Router will be disconnected
  3. The router interface on the Philadelphia Router connecting to the Coudersport Router will be disconnected

There is a Nagios server located in the Philadelphia Data Center that will maintain connectivity with the Data Center Hosts.  There is an Icinga server located in the Pittsburgh Data Center that will maintain connectivity with all hosts outside Philadelphia.  The WAN Routers are running OSPF and alternate routes will become active as WAN links fail.

Testing the Model's NagVis Reporting

The expected behavior is:
  1. Both monitoring servers will recognize the administratively down interface and report the Interface Service Icon and WAN Link Service Group Line as WARNING Yellow
  2. Both monitoring servers will recognize the disconnected interface between Philadelphia and Pittsburgh and report the Interface Service Icon and WAN Link Service Group Line as DOWN Red
  3. When all WAN interfaces on the Philadelphia Router are down, the Pittsburgh Monitoring Server will report the Router as DOWN Red and the child Philadelphia Data Center Host Group Icon (representing four servers) as UNREACHABLE Purple and the Philadelphia Monitoring Server NagVis web service will error because it is unreachable.
  4. The Philadelphia Monitoring Server will recognize all Data Center Hosts and Services as available, the Philadelphia Router as available, but with CRITICAL Red interfaces and the remainder of the network as Unreachable.

Upon restoring the WAN Link interfaces, it takes three to four minutes for the two monitoring servers to update the host checks and report all objects as available.

The video below demonstrates that the expected behavior is, indeed, what happens.