Tuesday, February 17, 2015

GNS3 - VirtualBox Part 11: Debian/Ubuntu Bonded NIC Layer 2 Switches

This article demonstrates how to configure Debian/Ubuntu VirtualBox guests to operate as Layer 2 switches with bonded NICs, aggregating several adapters into higher-speed logical interfaces.


Introduction

There have been many implementations of adapter bonding -- often vendor-specific and proprietary; these implementations are not germane to this article.  Over time, published standards have replaced proprietary ones.  Linux supports seven different bonding types and a Linux Bonding HOW-TO document is available at kernel.org.

This article describes Mode 4, IEEE 802.3ad Dynamic link aggregation.  This is a common implementation and requires switch support.  It is widely supported by vendors.

Bonding Modes

The mode option specifies one of the bonding policies. The default is balance-rr (round robin). Possible values are:

Balance-rr or 0

Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.

Active-Backup or 1

Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. This mode provides fault tolerance.

Balance-XOR or 2

XOR policy: Transmit based on the selected transmit hash policy. The default policy is a simple [(source MAC address XOR'd with destination MAC address XOR packet type ID) modulo slave count]. Alternate transmit policies may be selected via the xmit_hash_policy option. This mode provides load balancing and fault tolerance. 

Broadcast or 3

Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance.

802.3ad or 4

IEEE 802.3ad Dynamic link aggregation. Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification. Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option. Most switches will require some type of configuration to enable 802.3ad mode. This mode provides load balancing and fault tolerance.

Balance-TLB or 5

Adaptive transmit load balancing: channel bonding that does not require any special switch support. In tlb_dynamic_lb=1 mode, the outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. In tlb_dynamic_lb=0 mode, load balancing based on current load is disabled and the load is distributed using only the hash distribution. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.

Balance-ALB or 6

Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special switch support. The receive load is distributed sequentially (round robin) among the group of highest speed slaves in the bond. When a link is reconnected or a new slave joins the bond, the receive traffic is redistributed among all active slaves in the bond.
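
The default transmit hash quoted for balance-xor (and used by default in 802.3ad mode) can be illustrated in a few lines of Python. This is a simplified sketch of the formula as worded in this article, not the kernel's code: the function names and the use of the full 48-bit MAC addresses are assumptions made for illustration, and the real driver may fold the addresses differently.

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def layer2_slave_index(src_mac: str, dst_mac: str, packet_type: int, slave_count: int) -> int:
    """(source MAC XOR destination MAC XOR packet type ID) modulo slave count."""
    if slave_count < 1:
        raise ValueError("a bond needs at least one slave")
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac) ^ packet_type) % slave_count


if __name__ == "__main__":
    # MAC addresses borrowed from the sample output later in this article;
    # 0x0800 is the EtherType for IPv4.
    index = layer2_slave_index("08:00:27:1c:e8:ec", "08:00:27:74:2e:0d", 0x0800, 2)
    print("frame leaves on slave index", index)
```

Because the inputs are fixed for a given source/destination pair, every frame of that conversation leaves on the same slave. A single flow therefore never exceeds the speed of one adapter; the aggregate gain appears only across many flows.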


Installing ifenslave and Modifying Configuration Files

The first step is to install the ifenslave package (apt-get install ifenslave).  The package includes the commands and kernel module support.  Boot time kernel module loading requires adding a line -- bonding -- to /etc/modules thus:
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.

lp
rtc
bonding


A new file -- /etc/modprobe.d/modules.conf -- is required for additional configuration.  While many configuration options may later be added to the /etc/network/interfaces file, they may also be added to this one.

alias bond0 bonding
options bonding mode=4 miimon=100 downdelay=200 updelay=200 max_bonds=2
alias bond1 bonding
options bonding mode=4 miimon=100 downdelay=200 updelay=200 max_bonds=2

The illustrated options are not mandatory, but advisable.  The miimon option specifies the interval (milliseconds) at which MII Link Monitoring occurs.  The default value is 0, which disables MII Link Monitoring.  The downdelay and updelay options specify delay times (milliseconds) between MII Link Monitoring detection of a state change and application of the change; each should be a multiple of the miimon value and will be rounded down to the nearest multiple otherwise.  Another important option is "max_bonds=#."  Per the Bonding HOW-TO, it specifies the number of bonding devices the driver creates when it loads; the default is 1, which provides only bond0.  If you plan to add more than one bonded interface, you will need to specify "max_bonds=#" as a larger value.
Additional configuration options are detailed in the kernel.org Linux Bonding HOW-TO document.
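
The relationship between miimon and the two delay options can be checked before editing the file. The following is a minimal sketch, assuming the round-down behavior described in the Bonding HOW-TO; the function name effective_delay is hypothetical and not part of any package.

```python
def effective_delay(delay_ms: int, miimon_ms: int) -> int:
    """Round delay_ms down to a whole multiple of miimon_ms."""
    if miimon_ms <= 0:
        # Assumption: the delays are only meaningful with MII monitoring enabled.
        raise ValueError("updelay/downdelay only apply when miimon is greater than 0")
    return (delay_ms // miimon_ms) * miimon_ms


if __name__ == "__main__":
    for requested in (200, 250, 99):
        print(requested, "->", effective_delay(requested, 100))
```

With miimon=100, the 200 ms values used above pass through unchanged, whereas a value such as 250 would be applied as 200.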

Configuring from the Command Line Interface

Bonded interfaces may be configured from the command line or added to the /etc/rc.local file.  The following commands utilize iproute2, ifenslave and the deprecated bridge-utils packages:
ifenslave bond0 eth0 eth1
ifenslave bond1 eth2 eth3 eth4

ip link set bond0 up
ip link set bond1 up
ip link add dev br0 type bridge
ip link set dev bond0 master br0
ip link set dev bond1 master br0
ip addr add 10.64.0.4/255.255.255.0 dev br0
ip route add default via 10.64.0.1
ip link set dev br0 up
brctl stp br0 on
It is not necessary to set the individual Ethernet adapters to "up" when using bonded interfaces.  However, if unbonded Ethernet adapters are to be used in the bridge, they must be set to "up" thus:
ip link set dev eth5 up

Configuring /etc/network/interfaces

Alternatively, bonded Ethernet interfaces may be configured in the /etc/network/interfaces file, which lets the ifupdown package manage them at a higher level.  The following configuration creates a bridge (br0) with a static IP address, two bonds (bond0 = eth0 and eth1; bond1 = eth2, eth3 and eth4) and three Ethernet interfaces (eth5, eth6 and eth7) that are members of bridge br0.

auto lo
iface lo inet loopback

auto bond0
iface bond0 inet manual
pre-up ifenslave bond0 eth0 eth1
post-up ip link set dev bond0 master br0
pre-down ip link set dev bond0 nomaster
post-down ifenslave -d bond0 eth0 eth1

auto bond1
iface bond1 inet manual
pre-up ifenslave bond1 eth2 eth3 eth4
post-up ip link set dev bond1 master br0
pre-down ip link set dev bond1 nomaster
post-down ifenslave -d bond1 eth2 eth3 eth4
 

iface eth5 inet manual


iface eth6 inet manual

iface eth7 inet manual


auto br0
iface br0 inet static
address 10.64.0.4
netmask 255.255.255.0
gateway 10.64.0.1
dns-nameservers 192.168.1.1 8.8.8.8 4.4.4.4

bridge_stp on
bridge_waitport 0
bridge_fd 0
bridge_ports bond0 bond1 eth5 eth6 eth7
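
Each bond stanza above repeats the same slave list in its pre-up and post-down lines, and the two lists are easy to let drift apart when editing by hand. As a hedged illustration, the sketch below renders a stanza from a single list; bond_stanza is a hypothetical helper written for this article, not part of ifupdown or ifenslave.

```python
from typing import Sequence


def bond_stanza(bond: str, slaves: Sequence[str], bridge: str) -> str:
    """Render an ifupdown stanza that enslaves `slaves` and joins `bridge`."""
    if not slaves:
        raise ValueError("a bond needs at least one slave interface")
    members = " ".join(slaves)
    lines = [
        f"auto {bond}",
        f"iface {bond} inet manual",
        f"pre-up ifenslave {bond} {members}",
        f"post-up ip link set dev {bond} master {bridge}",
        f"pre-down ip link set dev {bond} nomaster",
        # Release exactly the interfaces that pre-up enslaved.
        f"post-down ifenslave -d {bond} {members}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(bond_stanza("bond1", ["eth2", "eth3", "eth4"], "br0"))
```

The printed text matches the bond1 stanza shown above and can be pasted into /etc/network/interfaces after review.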

Testing the Switches

The Legacy ifconfig Command

This command is of limited utility for bonded interfaces.  Note that it lists the bond and Ethernet interfaces as up and whether they are masters (bonds) or slaves (Ethernet).  It does not specify details of master-slave relationships.
bond0     Link encap:Ethernet  HWaddr 08:00:27:1c:e8:ec 
          inet6 addr: fe80::a00:27ff:fe1c:e8ec/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:420 errors:0 dropped:3 overruns:0 frame:0
          TX packets:576 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:46060 (46.0 KB)  TX bytes:52141 (52.1 KB)

bond1     Link encap:Ethernet  HWaddr 08:00:27:83:45:d7 
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:72 errors:0 dropped:2 overruns:0 frame:0
          TX packets:473 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:8736 (8.7 KB)  TX bytes:38259 (38.2 KB)

br0       Link encap:Ethernet  HWaddr 08:00:27:1c:e8:ec 
          inet addr:10.64.0.4  Bcast:10.64.0.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe1c:e8ec/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:367 errors:0 dropped:1 overruns:0 frame:0
          TX packets:212 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:34606 (34.6 KB)  TX bytes:27137 (27.1 KB)

eth0      Link encap:Ethernet  HWaddr 08:00:27:1c:e8:ec 
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:88 errors:0 dropped:0 overruns:0 frame:0
          TX packets:32 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8736 (8.7 KB)  TX bytes:3604 (3.6 KB)

...

eth4      Link encap:Ethernet  HWaddr 08:00:27:83:45:d7 
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:26 errors:0 dropped:2 overruns:0 frame:0
          TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3032 (3.0 KB)  TX bytes:6982 (6.9 KB)

The ip addr Command

This command -- part of the newer iproute2 package -- provides detailed information about interface states and master-slave relationships.  For instance, Ethernet interfaces eth0 and eth1 list bond0 as their master, and bond0 lists br0 as its master.
2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
    link/ether 08:00:27:1c:e8:ec brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
    link/ether 08:00:27:1c:e8:ec brd ff:ff:ff:ff:ff:ff
...
9: eth7: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 08:00:27:81:dd:5a brd ff:ff:ff:ff:ff:ff
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default
    link/ether 08:00:27:1c:e8:ec brd ff:ff:ff:ff:ff:ff
    inet6 fe80::a00:27ff:fe1c:e8ec/64 scope link
       valid_lft forever preferred_lft forever
11: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default
    link/ether 08:00:27:83:45:d7 brd ff:ff:ff:ff:ff:ff
12: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 08:00:27:1c:e8:ec brd ff:ff:ff:ff:ff:ff
    inet 10.64.0.4/24 brd 10.64.0.255 scope global br0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe1c:e8ec/64 scope link
       valid_lft forever preferred_lft forever
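
The master-slave relationships in this output can also be extracted mechanically. The sketch below is one possible approach, assuming the text layout shown above; the sample is an excerpt of that listing, and the function name master_map is hypothetical.

```python
import re

SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
    link/ether 08:00:27:1c:e8:ec brd ff:ff:ff:ff:ff:ff
9: eth7: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 08:00:27:81:dd:5a brd ff:ff:ff:ff:ff:ff
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default
    link/ether 08:00:27:1c:e8:ec brd ff:ff:ff:ff:ff:ff
12: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 08:00:27:1c:e8:ec brd ff:ff:ff:ff:ff:ff
"""

# Header lines start with an index and an interface name; detail lines are indented.
HEADER = re.compile(r"^\d+:\s+(?P<name>[^:@\s]+)[^:]*:\s+<[^>]*>(?P<rest>.*)$")
MASTER = re.compile(r"\bmaster\s+(\S+)")


def master_map(ip_addr_output: str) -> dict:
    """Map each interface name to its master, or None if it has none."""
    result = {}
    for line in ip_addr_output.splitlines():
        header = HEADER.match(line)
        if not header:
            continue  # indented address lines carry no master information
        master = MASTER.search(header.group("rest"))
        result[header.group("name")] = master.group(1) if master else None
    return result


if __name__ == "__main__":
    for name, master in master_map(SAMPLE).items():
        print(name, "->", master)
```

On a live switch the same function could be fed the captured output of ip addr instead of the embedded sample.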

Listing /proc/net/bonding Files

The /proc/net/bonding directory contains one file per bond, named after the bond, with bond-specific information; the listing below comes from a different switch whose bond uses Ethernet adapters eth2, eth3 and eth4.  Notice that the bond Transmit Hash Policy is the default Layer 2.  Also notice that the default LACP rate -- slow -- applies.  LACP is the protocol that negotiates bundling between compliant switches.  The default rate is slow (every 30 seconds), while fast (every 1 second) must be specified manually with the lacp_rate option.
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 3
    Actor Key: 17
    Partner Key: 17
    Partner Mac Address: 08:00:27:74:2e:0d

Slave Interface: eth2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:83:45:d7
Aggregator ID: 1
Slave queue ID: 0

Slave Interface: eth3
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:1d:37:ae
Aggregator ID: 1
Slave queue ID: 0

Slave Interface: eth4
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:7e:25:d0
Aggregator ID: 1
Slave queue ID: 0
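
Because the report is a series of "key: value" lines, it is straightforward to check from a script. The following is a minimal sketch, assuming the layout shown above; the sample is an abbreviated copy of that listing, and parse_bonding is a hypothetical helper.

```python
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100

802.3ad info
LACP rate: slow

Slave Interface: eth2
MII Status: up
Link Failure Count: 0

Slave Interface: eth3
MII Status: up
Link Failure Count: 0

Slave Interface: eth4
MII Status: up
Link Failure Count: 0
"""


def parse_bonding(text: str) -> dict:
    """Split the report into bond-level fields and one dict per slave."""
    bond = {}
    slaves = []
    current = bond  # fields before the first "Slave Interface" belong to the bond
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not value:
            continue  # blank lines and headings such as "802.3ad info"
        if key == "Slave Interface":
            current = {"name": value}
            slaves.append(current)
        else:
            current[key] = value
    return {"bond": bond, "slaves": slaves}


if __name__ == "__main__":
    report = parse_bonding(SAMPLE)
    print("LACP rate:", report["bond"]["LACP rate"])
    for slave in report["slaves"]:
        print(slave["name"], slave["MII Status"], slave["Link Failure Count"])
```

On a switch with a configured bond, the text would come from reading the matching file under /proc/net/bonding rather than from the embedded sample.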

The Legacy brctl Commands

The legacy bridge-utils package provides three commands -- brctl show, brctl showmacs and brctl showstp -- that provide an overview of basic bridge configuration and operation.

brctl show <bridge_name>

brctl show br0
bridge name     bridge id               STP enabled     interfaces
br0             8000.0800271ce8ec       yes             bond0
                                                        bond1

brctl showmacs <bridge_name>

brctl showmacs br0
port no    mac addr        is local?    ageing timer
  1    08:00:27:1c:e8:ec    yes           0.00
  2    08:00:27:83:45:d7    yes           0.00
  1    ca:02:10:48:00:06    no           0.00

brctl showstp <bridge_name>

brctl showstp br0
br0
 bridge id        8000.0800271ce8ec
 designated root    8000.0800271ce8ec
 root port           0                                 path cost           0
 max age          20.00                           bridge max age          20.00
 hello time           2.00                          bridge hello time       2.00
 forward delay           2.00                     bridge forward delay       2.00
 ageing time         300.00
 hello timer           0.10                         tcn timer           0.00
 topology change timer       0.00            gc timer         141.52
 flags           


bond0 (1)
 port id        8001                                      state             forwarding
 designated root    8000.0800271ce8ec       path cost           4
 designated bridge    8000.0800271ce8ec    message age timer       0.00
 designated port    8001                             forward delay timer       0.00
 designated cost       0                               hold timer           0.00
 flags           

bond1 (2)
 port id        8002                                      state             forwarding
 designated root    8000.0800271ce8ec       path cost         100
 designated bridge    8000.0800271ce8ec    message age timer       0.00
 designated port    8002                             forward delay timer       0.00
 designated cost       0                               hold timer           0.00
 flags           

GNS3 Configuration

The GNS3 topology shown at the top of this article contains three Linux switches -- PHL-Core, PHL-Servers and PHL-Storage.  PHL-Core is connected to two routers over bridged Ethernet ports eth0 and eth1; it is connected to PHL-Servers over the bridged two-adapter bond0 interface (eth2 and eth3 to eth0 and eth1, respectively).  PHL-Servers is connected to PHL-Storage over the bridged three-adapter bond1 interface (eth2, eth3 and eth4 to eth0, eth1 and eth2, respectively).  The /etc/network/interfaces files for the three switches are below.

PHL-Core Configuration

auto lo
iface lo inet loopback

auto bond0
iface bond0 inet manual
pre-up ifenslave bond0 eth2 eth3
post-up ip link set dev bond0 master br0
pre-down ip link set dev bond0 nomaster
post-down ifenslave -d bond0 eth2 eth3

auto br0
iface br0 inet static
address 10.64.0.2
netmask 255.255.255.0
gateway 10.64.0.1
dns-nameservers 192.168.1.1 8.8.8.8 4.4.4.4
bridge_stp on
bridge_waitport 0
bridge_fd 0
bridge_ports eth0 eth1 bond0

iface eth0 inet manual

iface eth1 inet manual

PHL-Servers Configuration

auto lo
iface lo inet loopback

auto bond0
iface bond0 inet manual
pre-up ifenslave bond0 eth0 eth1
post-up ip link set dev bond0 master br0
pre-down ip link set dev bond0 nomaster
post-down ifenslave -d bond0 eth0 eth1

auto bond1
iface bond1 inet manual
pre-up ifenslave bond1 eth2 eth3 eth4
post-up ip link set dev bond1 master br0
pre-down ip link set dev bond1 nomaster
post-down ifenslave -d bond1 eth2 eth3 eth4

auto br0
iface br0 inet static
address 10.64.0.4
netmask 255.255.255.0
gateway 10.64.0.1
dns-nameservers 192.168.1.1 8.8.8.8 4.4.4.4
bridge_stp on

bridge_waitport 0
bridge_fd 0
bridge_ports bond0 bond1

PHL-Storage Configuration

auto lo
iface lo inet loopback

auto bond0
iface bond0 inet manual
pre-up ifenslave bond0 eth0 eth1 eth2
post-up ip link set dev bond0 master br0
pre-down ip link set dev bond0 nomaster
post-down ifenslave -d bond0 eth0 eth1 eth2

auto br0
iface br0 inet static
address 10.64.0.5
netmask 255.255.255.0
gateway 10.64.0.1
dns-nameservers 192.168.1.1 8.8.8.8 4.4.4.4
bridge_stp on

bridge_waitport 0
bridge_fd 0
bridge_ports bond0

