Stephen Fritz on Systems Engineering: High Availability Zabbix and Zabbix Proxy Servers


Introduction

This article explains how to install high-availability (failover) Zabbix and Zabbix Proxy servers.

High-availability clustering is designed to increase system availability. For this article, we will be using:

Distributed Replicated Block Device (DRBD)
Corosync - Pacemaker

These technologies present two servers as one to the network; one server is active and the other waits to take over if the first fails or is taken offline.

The cluster is composed of three kinds of elements, which is what distinguishes it from a single-server deployment:

Resources unique to each node
Data shared between nodes on the cluster
Services shared between the nodes on the cluster

Image: http://www.drbd.org/uploads/pics/overview_02.gif, used under the Creative Commons Attribution–Share Alike 3.0 Unported license ("CC-BY-SA").

Resources Unique to Each Node

Each node is a server with its own operating system and hardware. The processor, memory, disk and IO subsystems (including network interfaces) are controlled by the operating system installed on the boot partition.

Data Shared Between Nodes on the Cluster

For a MySQL cluster, there are two types of shared data: configuration files and databases. The configuration files are those located in the /etc/mysql/ directory. When shared between the two nodes, the MySQL server will have an identical configuration regardless of the node that is active. However, there are circumstances in which the MySQL configuration files may be unique to each server. The databases are kept in the /var/lib/mysql/ directory and include the log files.

MySQL Clustering Caveats

Although the two nodes share the same MySQL databases, UNDER NO CIRCUMSTANCES SHALL THE TWO NODES SIMULTANEOUSLY ACCESS THE DATABASES. That is, only one node may run the mysqld daemon at any given time. If two MySQL daemons access the same database, there will eventually be corruption. The clustering software controls which node accesses the data.

DRBD - Corosync - Pacemaker Overview

The illustration below depicts a high-availability cluster design. Each server has four network interfaces:

eth0 -- the publicly addressable interfaces
eth1 -- the DRBD data replication and control interfaces
eth2 and eth3 -- the Corosync - Pacemaker control interfaces

The first interface -- eth0 -- is the publicly addressable interface that provides MySQL database and Apache web server (for PHPMyAdmin) access. Two IP addresses (one unique to each server) are assigned at boot time and a third, shared address is assigned by the Corosync - Pacemaker portion of the clustering software.

The second interface -- eth1 -- is controlled by the DRBD daemon. This daemon is configured with two or more files: /etc/drbd.d/global_common.conf and one resource file in /etc/drbd.d/ (r0.res, r1.res, r2.res...) for each shared block device. For this example, only r0.res is installed. System-wide settings are defined in the global_common.conf file. Settings specific to each pair of shared block devices are defined in the r0.res (and other) resource files. DRBD defines an entire block device (or hard drive) as shared and replicated between two nodes. In this example, block device /dev/sdb (a SCSI drive in each server) is shared between the two nodes as /dev/drbd0. Once configured, both servers see /dev/sdb as a new block device, /dev/drbd0, and only ONE may mount it at any given time. The server with the mounted partition replicates any changes to the failover node. If the first server fails or is taken offline, the other server immediately mounts its /dev/drbd0 block device and assumes control of replication.
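To make the shape of a resource file more concrete, here is a minimal Python sketch that renders something like r0.res. The article's cluster was configured through LCMC, so this is only an illustration: the host names (node1, node2), the eth1 replication addresses and the port are assumed placeholder values, not taken from the actual setup.

```python
# Sketch: render a DRBD resource file such as /etc/drbd.d/r0.res.
# Host names, addresses and port are hypothetical placeholders.

def render_drbd_resource(name, device, disk, nodes, port=7789):
    """Return the text of a DRBD resource definition.

    `nodes` maps each node's host name to the IP address of its
    replication interface (eth1 in this article).
    """
    lines = [f"resource {name} {{"]
    for host, address in nodes.items():
        lines += [
            f"  on {host} {{",
            f"    device    {device};",
            f"    disk      {disk};",
            f"    address   {address}:{port};",
            "    meta-disk internal;",
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


r0 = render_drbd_resource(
    name="r0",
    device="/dev/drbd0",
    disk="/dev/sdb",
    nodes={"node1": "192.168.1.1", "node2": "192.168.1.2"},  # assumed values
)
print(r0)
```

Each "on" section describes the same resource from one node's point of view, which is why both sections name the same device and backing disk and differ only in the address.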

The third and fourth interfaces -- eth2 and eth3 -- are controlled by the Corosync - Pacemaker software. These interfaces provide communication links defining the status of each defined resource and control how and where shared services (such as shared DRBD block devices, IP addresses, and the MySQL / Apache2 daemons) run. In failover clustering, only one node may actively be in control at any time.
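The "only one node in control" rule can be pictured with a toy model. The sketch below is not how Corosync or Pacemaker is implemented; the class and node names are made up purely to illustrate active/passive behavior.

```python
# Toy model of active/passive failover; names are hypothetical and this
# does not reflect Pacemaker's internals.

class FailoverCluster:
    def __init__(self, nodes):
        self.online = {node: True for node in nodes}
        self.active = nodes[0]  # the first node starts as the active one

    def fail(self, node):
        """Mark a node offline and move the services if it was active."""
        self.online[node] = False
        if node == self.active:
            survivors = [n for n, up in self.online.items() if up]
            self.active = survivors[0] if survivors else None

    def recover(self, node):
        """Bring a node back online; it rejoins as standby unless
        no node is currently active."""
        self.online[node] = True
        if self.active is None:
            self.active = node

    def runs_services(self, node):
        """True only for the single node that currently holds the services."""
        return node == self.active


cluster = FailoverCluster(["node1", "node2"])
cluster.fail("node1")     # node2 takes over
cluster.recover("node1")  # node1 returns as the standby node
```

In this model a recovered node does not take the services back automatically; whether a real cluster fails back depends on how its resources are configured.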

Installing and Configuring the Zabbix - MySQL - Apache - Postfix Failover Cluster

Begin by installing MySQL, Apache and Postfix. Leave MySQL listening on the default loopback interface 127.0.0.1. Then, start the Linux Cluster Management Console (LCMC) -- a Java application that will install and configure everything required for clustering. Select the two nodes by name or IP address, then install Pacemaker (NOT Heartbeat) and DRBD.

Once both nodes have the required software, configure the cluster. LCMC will prompt you for two interfaces to use in the cluster (select eth2 and eth3) and the two-node system will be recognized as a Cluster. Configure global options and then select device /dev/sdb on one node and mirror it to /dev/sdb on the other to configure DRBD device /dev/drbd0. Format it with an ext4 file system and make sure to perform an initial full synchronization.

When the DRBD device finishes synchronization, create a shared mount point -- /mnt/sdb -- and a shared IP address (10.195.0.100, shared between eth0 on the nodes). The cluster will now recognize the DRBD device as a file system on the Active node.

The shared data must then be moved to the DRBD device. Stop MySQL on each node. On the Active node, move the directories /etc/mysql, /var/lib/mysql and /etc/zabbix/ to /mnt/sdb/etc/mysql, /mnt/sdb/var/lib/mysql and /mnt/sdb/etc/zabbix, respectively, and create symlinks back to their original locations. On the Inactive node, delete the mysql and zabbix directories and replace them with symlinks to the same locations -- even though those locations are not yet visible.
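The move-and-symlink step can be sketched in Python as follows. This is an illustration under assumptions, not the procedure used in the video: the two function names are hypothetical, the code must run as root with the services stopped, and nothing here checks that the DRBD device is actually mounted.

```python
import os
import shutil


def relocate_to_shared(original, shared_root):
    """Active node: move `original` under `shared_root`, keeping its
    path, then leave a symlink at the original location."""
    original = os.path.normpath(original)  # drops any trailing slash
    target = os.path.join(shared_root, original.lstrip("/"))
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.move(original, target)
    os.symlink(target, original)
    return target


def link_to_shared(original, shared_root):
    """Inactive node: replace `original` with a symlink to the shared
    location, which does not exist here until the device is mounted."""
    original = os.path.normpath(original)
    target = os.path.join(shared_root, original.lstrip("/"))
    if os.path.islink(original):
        os.unlink(original)
    elif os.path.isdir(original):
        shutil.rmtree(original)  # deletes this node's local copy
    os.symlink(target, original)
    return target


# Example calls (not executed here):
#   relocate_to_shared("/etc/mysql", "/mnt/sdb")   # on the Active node
#   link_to_shared("/etc/mysql", "/mnt/sdb")       # on the Inactive node
```

On the Inactive node the symlinks dangle until a failover mounts /mnt/sdb there, which is the behavior the paragraph above describes.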

The LCMC console is then used to finalize the shared services. Add Primitive LSB Init Script resources (that is, only running on one server at a time) for MySQL, Apache2, Postfix, Zabbix Server and Zabbix-Agent.
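For readers who prefer the command line to LCMC, the same resources could roughly be expressed as crm shell commands. The sketch below only builds the command strings. The resource IDs, the group name, the monitor interval and the init script names (Debian/Ubuntu style) are all assumptions, and a real configuration would also need to tie these services to the file system and IP address resources.

```python
# Sketch: build crm shell commands for the LSB-managed services.
# Resource IDs and init script names are assumptions, not taken
# from the cluster in this article.

SERVICES = {
    "p_mysql": "mysql",
    "p_apache2": "apache2",
    "p_postfix": "postfix",
    "p_zabbix_server": "zabbix-server",
    "p_zabbix_agent": "zabbix-agent",
}


def crm_commands(services, group="g_zabbix", monitor_interval="30s"):
    """Return one 'primitive' command per service plus a 'group' command
    that keeps the services together on one node, started in order."""
    commands = [
        f"crm configure primitive {rid} lsb:{script} "
        f"op monitor interval={monitor_interval}"
        for rid, script in services.items()
    ]
    commands.append(f"crm configure group {group} " + " ".join(services))
    return commands


commands = crm_commands(SERVICES)
for command in commands:
    print(command)
```

Grouping is one way to keep the services on the same node; it is shown here as a design choice, not as what LCMC generates.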

Fail the servers back and forth several times to check that the system performs as expected. Finally, install a database. For this article, I install the Zabbix Monitoring database.

The video below illustrates the entire installation and configuration process.

Installing the Zabbix Proxy - MySQL Failover Cluster

Zabbix Proxy only requires the zabbix-agent and zabbix-proxy-mysql packages. The proxy is linked to the server by configuration file settings. It is then configured and controlled through the server console.
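As a rough illustration of those configuration file settings, the sketch below renders the handful of zabbix_proxy.conf parameters that link a proxy to its server and database. The proxy name, database name and credentials are placeholder assumptions; pointing Server at the cluster's shared address 10.195.0.100 is also an assumption about how the two clusters would be connected.

```python
# Sketch: render the core settings of a zabbix_proxy.conf file.
# All values passed in below are placeholders.

def render_proxy_conf(server, hostname, db_name, db_user, db_password):
    """Return configuration text linking a proxy to its server and database."""
    settings = {
        "Server": server,      # address of the Zabbix server
        "Hostname": hostname,  # must match the proxy name defined on the server
        "DBName": db_name,
        "DBUser": db_user,
        "DBPassword": db_password,
    }
    return "".join(f"{key}={value}\n" for key, value in settings.items())


proxy_conf = render_proxy_conf(
    server="10.195.0.100",       # assumed: the server cluster's shared IP
    hostname="zabbix-proxy-01",  # hypothetical proxy name
    db_name="zabbix_proxy",
    db_user="zabbix",
    db_password="change-me",
)
print(proxy_conf)
```

Using the shared address means the proxy keeps reporting to whichever server node is currently active.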

The cluster configuration -- unlike the system setup -- is very much like configuring Zabbix Server failover clustering. Similar hardware, identical clustering software and symlinked shared directories are used.

The video below demonstrates the Zabbix Proxy failover cluster installation.

Zabbix Proxy Performance

Zabbix Proxy is not designed to distribute load. Although it has its own database, it is configured from and subordinate to a Zabbix server, and all the results are still processed by the Zabbix server. There is a small performance improvement at the server, but not much.

Zabbix Proxy is intended to provide fault-tolerance, particularly across unreliable links. The brief video below demonstrates the process loads on an active Zabbix Server and Proxy.


Tuesday, April 1, 2014