Originally published on April 24, 2006 at linux.com.
Content revised May 25, 2006.
The ubiquitous Linux, Apache, MySQL, and PHP/Perl/Python (LAMP)
combination powers many interactive web sites and projects. When
demand exceeds the capabilities of a single server, the database
is typically moved to a different server to spread the workload.
When demand exceeds a two-server solution, it's time to think
cluster.
LAMP cluster defined
Before getting into the details, it helps to distinguish what is meant
by cluster in the context of this article. This is not the Beowulf kind
of cluster that uses specialized message passing software to tackle a
compute-intensive task. Also, it does not cover high-availability
features such as automatic failover. Rather, it is a load-sharing
cluster that distributes web requests among multiple web and database
servers while appearing to be a single application.
Everything required to implement this cluster is done with software that
ships with most Linux distributions, making it easy
and (relatively) inexpensive to implement. We'll
construct a cluster using seven computers for a fictitious company,
foo.com. Two servers will run DNS (primary and backup) to distribute
web requests among three web servers that read and write data from two
MySQL database servers.
Any number of different designs can be built, with more or fewer of each
kind of server, but the model will serve as a good illustration of what
can be done.
Load balancing
The first part of the cluster handles load balancing by using the round
robin feature of the popular DNS software, Berkeley Internet Name Domain
(BIND). To use round robin, each web server must have its own public
IP address. A common scenario is to use network address translation and
port forwarding at the firewall to assign each web server a public IP
address while internally using a private address. In the DNS example, I
show private IP addresses, but public IPs are required for the web servers
so DNS can work its magic.
This snippet from the DNS zone definition for foo.com assigns the
same name to each of the three web servers, but uses different IP
addresses for each:
;
; Domain database for foo.com
;
foo.com. IN SOA ns1.foo.com. hostmaster.foo.com. (
2006032801 ; serial
10800 ; refresh
3600 ; retry
86400 ; expire
86400 ; default_ttl
)
;
; Name servers
;
foo.com. IN NS ns1.foo.com.
foo.com. IN NS ns2.foo.com.
;
; Web servers
; (private IPs shown, but public IPs are required)
;
www IN A 10.1.1.11
www IN A 10.1.1.12
www IN A 10.1.1.13
When DNS gets a request to resolve the name www.foo.com, it returns
all three addresses but rotates their order from one request to the next.
Most clients use the first address in the answer, so successive visitors
are sent to different servers.
Theoretically, each web server will get one third of the web traffic.
Due to DNS caching and because some requests may use more resources than
others, the load will not be shared equally. However, over time it will
come close.
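To make the rotation concrete, here is a minimal Python sketch of the idea. It is a simplified model, not how BIND is implemented: it assumes the record set is rotated by one position per query and that every client uses the first address in the answer. It ignores caching entirely, which is the main reason real traffic is less even. The function name round_robin_answers is made up for this illustration.

```python
from collections import Counter, deque

# Addresses copied from the zone snippet above.
WEB_SERVERS = ["10.1.1.11", "10.1.1.12", "10.1.1.13"]


def round_robin_answers(addresses, queries):
    """Yield the first address a client would see for each query.

    Simplified model: the record set is rotated by one position per
    query and the client always uses the first address in the answer.
    """
    ring = deque(addresses)
    for _ in range(queries):
        yield ring[0]
        ring.rotate(-1)  # move the address just handed out to the back


# Count how many of 300 simulated lookups land on each server.
counts = Counter(round_robin_answers(WEB_SERVERS, 300))
print(counts)
```

Under these assumptions each address is handed out the same number of times. Adding a cache in front of the generator would be one way to see how caching skews the split.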
Hardware load balancers
If round robin DNS is too crude and you have the money, hardware load
balancers offer better performance. Some take into account
the actual load on each web server to maximize cluster performance
instead of just delegating incoming requests evenly. They may also
have features to solve the cookie problem discussed below.
Cisco and Citrix
are among the popular vendors of hardware load balancers.
You can even use round robin DNS in front of the hardware load
balancers.
Web servers
Configuring the web servers is largely the same as configuring a single
Apache web server with one exception. The content on each web server
has to be identical to maintain the illusion that visitors are using one
web site and not three. That requires some mechanism to keep the
content synchronized.
The most elegant solution would be to use some kind of global
shared file system. NFS won't work very well due to locking and
performance issues.
Red Hat's Global File System
might work, or InterMezzo, but they are beyond the scope of this article.
For file synchronization, my tool of choice is rsync.
First, designate one server,
web1 for example, as the primary web server and the other two as
secondaries. We make content changes only on the primary web server and
let rsync and cron update the others every minute. Due to the
advanced algorithms in rsync, content updates happen quickly.
I recommend creating a special user account on each web server, called
"syncer" or something similar. The syncer account needs to have write
permissions to the web content directory on each server. Then, generate
a pair of secure shell (SSH) keys for the syncer account using
ssh-keygen on the primary web server and append the public key to
/home/syncer/.ssh/authorized_keys on the other two web servers. This
allows the use of password-less SSH along with rsync to keep the content
updated.
Here is a shell script that uses rsync to update the web content.
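#!/bin/bash
# Push web content from the primary (web1) to each secondary.
# -a (archive) already implies -r; --delete removes files deleted on the primary.
rsync -a -v -e "ssh -l syncer" --delete /var/www/ web2:/var/www/
rsync -a -v -e "ssh -l syncer" --delete /var/www/ web3:/var/www/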
This script should be set up in cron to run every minute and push updates
out to web2 and web3.
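If the list of secondaries grows, the same push can be driven from a small program instead of one line per host. The following Python sketch is one possible shape, not a replacement for the script above. The function names build_rsync_command and push_all are invented for this example, and the runner parameter exists only so the logic can be exercised without actually calling rsync.

```python
import subprocess

SECONDARIES = ["web2", "web3"]  # secondary hosts named in the article
CONTENT_DIR = "/var/www/"       # trailing slash: sync the contents, not the directory itself


def build_rsync_command(host, content_dir=CONTENT_DIR, user="syncer"):
    """Return the rsync argument list that mirrors content_dir to host."""
    return [
        "rsync", "-a", "--delete",
        "-e", f"ssh -l {user}",
        content_dir, f"{host}:{content_dir}",
    ]


def push_all(hosts=SECONDARIES, runner=subprocess.run):
    """Push content to every host and return the hosts whose sync failed."""
    failed = []
    for host in hosts:
        result = runner(build_rsync_command(host))
        if result.returncode != 0:
            failed.append(host)
    return failed
```

Calling push_all() from a script run by cron does the same work as the shell version, and a non-empty return value tells you which secondaries may be serving stale content.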
The cookie conundrum and application design
Cookies can be a tricky issue when LAMP applications use this kind of
cluster. By default, PHP stores the session data tied to each visitor's
session cookie in the /tmp directory on the server where it is running.
If a visitor starts a session on one web server but future HTTP requests
are handled by a different web server in the cluster, the session data
won't be there and things won't work as expected.
Because the visitor's resolver caches the IP address of a web server,
this doesn't happen often, but it is something that must be accounted for
and may require some application programming changes. One solution to the
cookie problem is to use a shared session directory for all web servers.
Be particularly aware of this issue when using pre-built LAMP applications.
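The problem, and the shared-directory fix, can be shown with a toy model. In this Python sketch a dictionary stands in for the directory where a server keeps the state tied to a visitor's cookie. The WebServer class and the session ID "abc123" are invented for the illustration and do not correspond to any real Apache or PHP interface.

```python
class WebServer:
    """A web server whose session storage is a dict standing in for a directory."""

    def __init__(self, name, session_store):
        self.name = name
        self.sessions = session_store

    def login(self, session_id, user):
        # Record server-side state for the visitor's session cookie.
        self.sessions[session_id] = {"user": user}

    def whoami(self, session_id):
        # Look up the visitor; None means this server has no such session.
        data = self.sessions.get(session_id)
        return data["user"] if data else None


# Local storage: each server has its own store.
web1, web2 = WebServer("web1", {}), WebServer("web2", {})
web1.login("abc123", "alice")
local_result = web2.whoami("abc123")   # web2 never saw this session

# Shared storage: both servers use the same store.
shared = {}
web1, web2 = WebServer("web1", shared), WebServer("web2", shared)
web1.login("abc123", "alice")
shared_result = web2.whoami("abc123")  # web2 finds the session web1 created

print(local_result, shared_result)
```

With separate stores the second server does not recognize the visitor. With one shared store it does, which is all a shared directory buys you.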
Aside from the cookie issue, the only other requirement for the
application is that all database writes are sent to the database master,
while reads should be distributed between the master and slave(s). In
our example cluster, I would configure the master web server to read
from the master database server, while the other two web servers would read
from the slave database server. All web servers write to the master
database server.
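In application code, that rule comes down to one decision per query. The sketch below assumes the host names web1, db1 (master), and db2 (slave) used in this article. The function choose_db_host and the list of write verbs are illustrative only, and a real application would also need to send reads that must see their own just-written data to the master.

```python
DB_MASTER = "db1"     # master database server
DB_SLAVE = "db2"      # slave database server
PRIMARY_WEB = "web1"  # the web server configured to read from the master

# Statements that change data and therefore must go to the master.
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}


def choose_db_host(sql, web_server):
    """Pick the database host for a statement issued from web_server."""
    words = sql.split()
    verb = words[0].upper() if words else ""
    if verb in WRITE_VERBS:
        return DB_MASTER  # all writes go to the master
    # Reads: the primary web server uses the master, the others use the slave.
    return DB_MASTER if web_server == PRIMARY_WEB else DB_SLAVE


print(choose_db_host("SELECT * FROM orders", "web2"))
print(choose_db_host("update orders set paid = 1", "web3"))
```

Keeping this decision in one function means a change to the layout, such as adding a second slave, touches a single place in the application.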
Database servers
MySQL has a replication feature to keep databases on different servers
synchronized. It uses what is known as log replay, meaning that a
transaction log is created on the master server which is then read by a
slave server and applied to the database. As with the web servers, we
designate one database server as the master, call it db1 to match the
naming convention we used earlier, and the other one, db2, is the slave.
To set up the master, the first thing to do is create a replication
account. This is a user ID defined in MySQL, not a system account, that
is used by the slaves to authenticate to the master in order to read the
logs. For simplicity, I'll create a MySQL user called "copy" with a
password of "copypass". You will need a better password for a
production system. This MySQL command creates the copy user and gives
it the necessary privileges:
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.*
TO copy@"10.1.0.0/255.255.0.0"
IDENTIFIED BY 'copypass';
Next, edit the MySQL configuration file, /etc/my.cnf, and add these
entries in the [mysqld] section:
# Replication Master Server (default)
# binary logging is required for replication
log-bin
# required unique id
server-id = 1
The log-bin entry enables the binary log file required for replication.
The server-id must be unique among all servers taking part in replication;
here the master uses 1.
Then, restart MySQL. You should see the new binary log file in the
MySQL directory with the default name of $HOSTNAME-bin.001.
MySQL will create new log files as needed.
To set up the slave, edit the /etc/my.cnf file and
add these entries in the [mysqld] section:
# required unique id
server-id = 2
#
# The replication master for this slave - required
# (replace with the actual IP of the master database server)
master-host = 10.1.1.21
#
# The username the slave will use for authentication when
# connecting to the master - required
master-user = copy
# The password the slave will authenticate with when connecting to
# the master - required
master-password = copypass
# How often to retry lost connections to the master
master-connect-retry = 15
# binary logging - not required for slaves, but recommended
log-bin
While not required for a slave, it is good planning to create the MySQL
replication user ("copy" in our example) on each slave in case it needs to
take over from the master in an emergency.
Restart MySQL on the slave and it will attempt to connect to the master
and begin replicating transactions. When replication is started for the
first time (even unsuccessfully), the slave will create a master.info
file with all the replication settings in the default database
directory, usually /var/lib/mysql.
To recap the database configuration steps:
- create a MySQL replication user on the master (and optionally on the slave)
- grant privileges to the replication user
- edit /etc/my.cnf on master and restart mysql
- edit /etc/my.cnf on the slave(s) and restart mysql
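Most replication setup mistakes are small configuration slips, so a quick sanity check of the two files can save time. The following Python sketch uses the standard configparser module to read the [mysqld] entries shown above and flag the obvious problems. The function names load_mysqld and check_pair are invented for this example. It is a rough aid under simple assumptions: a real my.cnf may contain directives such as !includedir that configparser cannot read.

```python
import configparser

# The [mysqld] entries from the article, as they would appear in each file.
MASTER_CNF = """
[mysqld]
log-bin
server-id = 1
"""

SLAVE_CNF = """
[mysqld]
server-id = 2
master-host = 10.1.1.21
master-user = copy
master-password = copypass
master-connect-retry = 15
log-bin
"""


def load_mysqld(text):
    """Parse my.cnf-style text and return its [mysqld] section."""
    # allow_no_value accepts bare options such as log-bin;
    # interpolation=None keeps '%' in passwords from being interpreted.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(text)
    return parser["mysqld"]


def check_pair(master_text, slave_text):
    """Return a list of problems found in a master/slave configuration pair."""
    master, slave = load_mysqld(master_text), load_mysqld(slave_text)
    problems = []
    if "log-bin" not in master:
        problems.append("master: log-bin is not enabled")
    if master.get("server-id") == slave.get("server-id"):
        problems.append("server-id must differ between master and slave")
    for key in ("master-host", "master-user", "master-password"):
        if not slave.get(key):
            problems.append(f"slave: {key} is missing")
    return problems


print(check_pair(MASTER_CNF, SLAVE_CNF))
```

An empty list means the pair passes these basic checks. It does not prove that replication will start, only that the required entries are present.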
How to tell if replication is working
On the master, log in to the mysql monitor and use "show master status":
mysql> show master status\G
************************ 1. row ************************
File: master-bin.006
Position: 73
Binlog_do_db:
Binlog_ignore_db:
1 row in set (0.00 sec)
On the slave, log in to the mysql monitor and use "show slave status":
mysql> show slave status\G
************************ 1. row ************************
Master_Host: master.foo.com
Master_User: copy
Master_Port: 3306
Connect_retry: 15
Master_Log_File: intranet-bin.006
[snip]
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
The most important fields are Slave_IO_Running and
Slave_SQL_Running.
They should both have values of Yes. Of course, the real test is to
execute a write query to a database on the master and see if the results
appear on the slave. When replication is working, slave updates usually
appear within milliseconds.
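Checking those two fields by eye gets tedious, so it can be scripted. The Python sketch below parses the vertical (\G) output format shown above and reports whether both replication threads are running. MySQL names the SQL thread field Slave_SQL_Running. The sample text is abbreviated, and the function names parse_vertical_output and replication_healthy are invented for this example.

```python
# Abbreviated sample in the same format as the output shown above.
SAMPLE = """\
*************************** 1. row ***************************
Master_Host: master.foo.com
Master_User: copy
Master_Port: 3306
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
"""


def parse_vertical_output(text):
    """Parse 'Name: value' lines from mysql's vertical output into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # row separator lines contain no colon and are skipped
            fields[name.strip()] = value.strip()
    return fields


def replication_healthy(fields):
    """True only if both the I/O thread and the SQL thread report Yes."""
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")


status = parse_vertical_output(SAMPLE)
print(replication_healthy(status))
```

In practice the text would come from running the status query on the slave and capturing its output, and a cron job could raise an alert when the function returns False.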
Recovering from a database error
If the slave database server loses power or the network connection, it
will no longer be able to stay synchronized with the master. If the
outage is short, replication should pick up where it left off. However,
if a serious error occurs on the slave, the safest way to get
replication working again is to:
- stop mysql on the master and slave
- dump the master database
- reload the database on the slave
- start mysql on the master
- start mysql on the slave
Depending on the nature of the problem, a full reload on the slave
may not be necessary, but this procedure should always work.
If the problem is with the master database server and it will be down
for a while, the slave can be reconfigured as the master by updating its
IP address and /etc/my.cnf file. All web servers have to be changed to
read from the new master. When the old master is repaired, it can be
brought up as the slave server and the web servers changed to read from
the slave again.
Going large
Clusters make it possible to scale a web application to handle a
tremendous number of requests. As traffic builds, network bandwidth
also becomes an issue. Top tier hosting providers can supply the
redundancy and bandwidth required for scaling. The number of possible
cluster configurations is limited only by your imagination. MySQL 4.1
and later also include a storage engine designed for distributed
databases, called NDB, that provides another option. For more in-depth
information on MySQL clustering, see the MySQL web site or
High Performance MySQL by Jeremy Zawodny and Derek Balling.

This work is licensed under a
Creative Commons Attribution-NonCommercial 2.5 License.