How to set up a 3-node Galera cluster on Ubuntu 12.04

Galera is a multi-master replication solution for MySQL and an interesting alternative to the standard master-master MySQL replication we are all so used to. One main advantage of Galera is that it replicates synchronously, which reduces the risk of data inconsistency between masters.

Setup on RackSpace Cloud

3x 512MB RAM instances, with 20GB storage space
1x Load Balancer for MySQL, RoundRobin algorithm, Health check enabled
1x 512MB RAM instance for testing
OS: Ubuntu 12.04 LTS 64bit

Goal:

Quickly set up a Galera cluster and run some benchmarks using sysbench.

Note: For the sake of simplicity I will refer to the Galera instances as node01, node02 and node03. The test instance will be referred to as test01.

Common settings on all nodes

On every node execute:

  1. Run apt-get update and apt-get upgrade to bring the instances up to date.
  2. Install required packages
    apt-get install libaio1 libssl0.9.8 mysql-client libdbd-mysql-perl libdbi-perl
  3. Download Galera wsrep provider
    wget https://launchpad.net/galera/2.x/23.2.4/+download/galera-23.2.4-amd64.deb
    dpkg -i galera-23.2.4-amd64.deb
  4. Download MySQL server with wsrep patch
    wget https://launchpad.net/codership-mysql/5.5/5.5.28-23.7/+download/mysql-server-wsrep-5.5.28-23.7-amd64.deb
    dpkg -i mysql-server-wsrep-5.5.28-23.7-amd64.deb
  5. I had some issues and I had to create /var/log/mysql
    mkdir -pv /var/log/mysql
    chown mysql:mysql -R /var/log/mysql
  6. Secure the mysql installation and assign a good password to root user:
    service mysql restart
    mysql_secure_installation
  7. Create a user for the Galera nodes to use for connecting and replication
    mysql -p
    mysql> grant all privileges on *.* to galera@'%' identified by 'password';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> flush privileges;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> set global max_connect_errors = 10000;
    Query OK, 0 rows affected (0.01 sec)
  8. Edit /etc/hosts and make sure you add all the nodes and their corresponding IPs
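Step 8 can be scripted along these lines. This is only a sketch: the function name add_host is made up for illustration, and the 10.0.0.x addresses in the usage comment are placeholders, not the addresses used in this setup.

```shell
#!/bin/sh
# Append "IP NAME" to a hosts file unless NAME is already listed there.
#   $1 = hosts file, $2 = IP address, $3 = host name
add_host() {
    file=$1 ip=$2 name=$3
    if ! grep -Eq "[[:space:]]$name([[:space:]]|\$)" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage on every node, as root (placeholder IPs, replace with your own):
#   add_host /etc/hosts 10.0.0.1 node01
#   add_host /etc/hosts 10.0.0.2 node02
#   add_host /etc/hosts 10.0.0.3 node03
```

Running it twice leaves the file unchanged the second time, so it is safe to include in a provisioning script.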

Galera setup for each node

Edit /etc/mysql/conf.d/wsrep.cnf and change the values of the following variables:

Configuration for node01:

wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="galera"
wsrep_cluster_address="gcomm://"
wsrep_sst_method=mysqldump
wsrep_sst_auth=galera:password

Configuration for node02:

wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="galera"
wsrep_cluster_address="gcomm://node01:4567"
wsrep_sst_method=mysqldump
wsrep_sst_auth=galera:password

Configuration for node03:

wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="galera"
wsrep_cluster_address="gcomm://node02:4567"
wsrep_sst_method=mysqldump
wsrep_sst_auth=galera:password
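The three configurations differ only in wsrep_cluster_address, so they can be generated from one template. A minimal sketch, assuming the values shown above; the function name wsrep_settings is mine, not part of Galera:

```shell
#!/bin/sh
# Print the wsrep settings for one node. The only per-node value is the
# peer to join; an empty peer ("gcomm://") bootstraps a new cluster.
#   $1 = peer address such as "node01:4567", or "" for the first node
wsrep_settings() {
    peer=$1
    cat <<EOF
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="galera"
wsrep_cluster_address="gcomm://$peer"
wsrep_sst_method=mysqldump
wsrep_sst_auth=galera:password
EOF
}

wsrep_settings ""              # node01
wsrep_settings "node01:4567"   # node02
wsrep_settings "node02:4567"   # node03
```

The output still has to be merged into /etc/mysql/conf.d/wsrep.cnf on each node by hand; the sketch only prints the values.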

Testing the setup

Now restart mysql on all the nodes and check whether the cluster is working:

service mysql restart
mysql -p
mysql> show status like 'wsrep%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
| wsrep_ready        | ON    |
+--------------------+-------+
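The same check can be automated. The sketch below assumes the status rows arrive as tab-separated name/value pairs, which is what the mysql client prints with the -N and -B options; the function name wsrep_healthy is hypothetical.

```shell
#!/bin/sh
# Read "name<TAB>value" status rows on stdin and succeed only if the
# cluster has the expected number of nodes and this node reports ready.
#   $1 = expected cluster size
wsrep_healthy() {
    awk -v want="$1" '
        $1 == "wsrep_cluster_size" { size = $2 }
        $1 == "wsrep_ready"        { ready = $2 }
        END { exit !(size == want && ready == "ON") }
    '
}

# Usage on a node (prompts for the root password):
#   mysql -p -N -B -e "show status like 'wsrep%'" | wsrep_healthy 3 && echo "cluster OK"
```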

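One more thing before you are done:
On node01, change the cluster address to wsrep_cluster_address="gcomm://node03:4567" and restart the mysql server. With the empty gcomm:// address node01 would bootstrap a new cluster on its next restart instead of rejoining the existing one.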

Benchmarks were performed from the test01 instance using the sysbench 0.5 OLTP read-only complex test:

sysbench OLTP (ro) Galera cluster transactions vs threads
Threads  Transactions/s
1        15
2        25
4        49
8        103
16       205
32       390
64       506
128      653

[Chart: Galera cluster transactions per second vs threads]

sysbench OLTP (ro) Galera cluster avg response time
Threads  Avg response time (ms)  Min response time (ms)  Approx. 95% (ms)
1        66                      42                      131
2        79                      53                      135
4        80                      42                      153
8        77                      42                      136
16       77                      43                      143
32       81                      42                      142
64       125                     48                      322
128      194                     45                      427
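The two result sets agree with each other: with N client threads each waiting on its own transaction, the average response time should be roughly N divided by the throughput. A quick check using figures from the tables above (the function name expected_avg_ms is mine):

```shell
#!/bin/sh
# Expected average response time in milliseconds for `threads` clients
# that together achieve `tps` transactions per second.
#   $1 = threads, $2 = transactions per second
expected_avg_ms() {
    awk -v threads="$1" -v tps="$2" 'BEGIN { printf "%.0f\n", threads / tps * 1000 }'
}

expected_avg_ms 1 15      # measured average: 66 ms
expected_avg_ms 16 205    # measured average: 77 ms
expected_avg_ms 128 653   # measured average: 194 ms
```

The computed values land within a couple of milliseconds of the measured averages, which suggests the response times are in milliseconds and that the load balancer added no large queueing delay of its own.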

[Chart: Galera cluster response times vs threads]

Benchmark Galera cluster vs MySQL master-master on RackSpace

Setup:

Before starting, I would like to point out that I compared 2 instances (master-master) against 3 instances (Galera cluster), so this is not a like-for-like comparison. It’s more of a “what if I switch from master-master replication to a 3-node Galera cluster” test.

MySQL Master-Master replication:

2x 512MB instances with 20GB of storage, Ubuntu 12.04 64bit. mysql-server 5.5 was used with no optimization changes to my.cnf beyond those required for master-master replication.
1x LoadBalancer, RoundRobin algorithm

Galera 3 nodes cluster:

3x 512MB instances with 20GB of storage, Ubuntu 12.04 64bit. The mysql-server 5.5 build from Galera was used with no changes to my.cnf; only the required per-node changes were made in wsrep.cnf.
1x LoadBalancer, RoundRobin algorithm

Test instance:

1x 512MB instance with 20GB of storage, Ubuntu 12.04 64bit running sysbench

sysbench --test=oltp --mysql-host=loadbalancer_ip --mysql-user=root --mysql-password=password --oltp-table-size=1000000 prepare

The tests were performed on a database of about 256MB using InnoDB table(s). No optimization changes were made to the default my.cnf files, except those required to set up replication.
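Only the prepare step is shown above. A matching run step for sysbench 0.4 might be built as follows. This is a sketch: the options are standard sysbench 0.4 OLTP flags, but the exact command lines used for these tests were not recorded, and the function name sysbench_run_cmd is made up.

```shell
#!/bin/sh
# Print a sysbench 0.4 OLTP run command.
#   $1 = number of threads, $2 = duration in seconds, $3 = "ro" or "rw"
sysbench_run_cmd() {
    readonly_flag=off
    [ "$3" = "ro" ] && readonly_flag=on
    echo sysbench --test=oltp \
        --mysql-host=loadbalancer_ip --mysql-user=root --mysql-password=password \
        --oltp-table-size=1000000 --oltp-read-only="$readonly_flag" \
        --num-threads="$1" --max-time="$2" --max-requests=0 run
}

sysbench_run_cmd 16 60 ro   # 16 threads, 1 minute, read only
sysbench_run_cmd 1 180 rw   # 1 thread, 3 minutes, read/write
```

The function only prints the command so it can be reviewed first; pipe the output to sh to execute it.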

sysbench OLTP transactions per second
Test                    Master-Master  Single node  Galera cluster
1 thread, 3m            10.97          17.11        12
16 threads, 1m, rw      154            140          0
16 threads, 1m, r only  217            158.7        206
32 threads, 1m, r only  325            160.79       375

[Chart: Galera cluster vs master-master transactions per second]

As you can see from the table and graph, I had some issues running sysbench against the Galera cluster in rw mode with 16 threads. From what I have found on the Internet, it’s an issue with sysbench 0.4.12, so I will attempt to rerun the tests with a newer version.

Apache2 worker vs prefork for ISPConfig benchmark

I’ve been running the latest version of ISPConfig (3.0.4) on an Amazon cloud t1.micro instance for some time to host several small sites, mostly WordPress. I’m quite happy with the performance of the instance. The OS is Ubuntu 10.04 LTS. Until recently I used the default MPM, which is prefork, but I decided to test worker as well. In case you are wondering, I use mod_fcgid for all the sites. I performed several tests with ab (Apache Benchmark) to see which MPM can serve the most requests per second.

While I do not claim this is the best setup, I think worker is better suited for me. Some people have reported problems with the worker MPM. So far so good, but I will update this post if there are any issues.

Test results:

prefork:

Concurrency Level: 32
Time taken for tests: 7.834 seconds
Complete requests: 5000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 4972
Total transferred: 84831033 bytes
HTML transferred: 83206915 bytes
Requests per second: 638.27 [#/sec] (mean)
Time per request: 50.136 [ms] (mean)
Time per request: 1.567 [ms] (mean, across all concurrent requests)
Transfer rate: 10575.21 [Kbytes/sec] received

worker:

Concurrency Level: 32
Time taken for tests: 7.096 seconds
Complete requests: 5000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 4968
Total transferred: 84877824 bytes
HTML transferred: 83247322 bytes
Requests per second: 704.63 [#/sec] (mean)
Time per request: 45.414 [ms] (mean)
Time per request: 1.419 [ms] (mean, across all concurrent requests)
Transfer rate: 11681.17 [Kbytes/sec] received
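The ab command line itself is not shown. Output like this is what one would expect from 5000 requests at concurrency 32 with keep-alive enabled, so the invocation in the comment below is an assumption, and its URL is a placeholder. The relative gain of worker over prefork follows from the two "Requests per second" figures; pct_change is a helper name made up for this sketch.

```shell
#!/bin/sh
# Assumed invocation (not recorded in the post; URL is a placeholder):
#   ab -k -n 5000 -c 32 http://example.com/

# Percentage change from `old` to `new`, to one decimal place.
#   $1 = old value, $2 = new value
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 638.27 704.63   # prefork -> worker, requests per second
```

That works out to a gain of roughly 10% for worker in this particular test.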

MySQL benchmark: RDS vs EC2 performance

The setup: one m1.small EC2 instance vs one db.m1.small RDS instance, with the tests run from the m1.small instance. The goal is to determine how the site will perform when the database is moved from localhost to a remote instance.

I used sysbench for the MySQL benchmarks. On a Linux server running Ubuntu 10.04 you can simply install it with the following command (it’s obvious, but just in case):

sudo apt-get install sysbench

The first tests compared an m1.small EC2 instance running mysql-server 5.1.41-3ubuntu12.8 against a db.m1.small RDS instance running MySQL server 5.1.50. The test database had 10,000 records, the number of threads was 1, and the test was oltp.

sysbench --test=oltp --mysql-host=smalltest.us-east-1.rds.amazonaws.com --mysql-user=root --mysql-password=password --max-time=180 --max-requests=0 prepare
sysbench --test=oltp --mysql-host=smalltest.us-east-1.rds.amazonaws.com --mysql-user=root --mysql-password=password --max-time=180 --max-requests=0 run

The results

m1.small EC2 instance:

OLTP test statistics:
queries performed:
read: 263354
write: 94055
other: 37622
total: 395031
transactions: 18811 (104.50 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 357409 (1985.56 per sec.)
other operations: 37622 (209.01 per sec.)

Test execution summary:
total time: 180.0044s
total number of events: 18811
total time taken by event execution: 179.7827
per-request statistics:
min: 4.04ms
avg: 9.56ms
max: 616.04ms
approx. 95 percentile: 38.42ms

db.m1.small RDS instance:

OLTP test statistics:
queries performed:
read: 188230
write: 67225
other: 26890
total: 282345
transactions: 13445 (74.67 per sec.)
deadlocks: 0 (0.00 per sec.)
read/write requests: 255455 (1418.74 per sec.)
other operations: 26890 (149.34 per sec.)

Test execution summary:
total time: 180.0573s
total number of events: 13445
total time taken by event execution: 179.9174
per-request statistics:
min: 9.08ms
avg: 13.38ms
max: 904.58ms
approx. 95 percentile: 20.99ms

As you can see, the EC2 instance performs 40% more transactions than the RDS instance. Nothing unexpected so far.
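The 40% figure comes straight from the two "transactions" lines. A small sketch that pulls the per-second rate out of sysbench 0.4 output and compares the two runs; the function name sysbench_tps is hypothetical, and the two input lines are copied from the results above.

```shell
#!/bin/sh
# Extract the per-second transaction rate from sysbench 0.4 output on stdin,
# e.g. "transactions: 18811 (104.50 per sec.)" yields 104.50
sysbench_tps() {
    sed -n 's/.*transactions:.*(\([0-9.]*\) per sec.).*/\1/p'
}

ec2=$(echo 'transactions: 18811 (104.50 per sec.)' | sysbench_tps)
rds=$(echo 'transactions: 13445 (74.67 per sec.)'  | sysbench_tps)

# Relative difference between the two runs, rounded to a whole percent.
awk -v a="$ec2" -v b="$rds" 'BEGIN { printf "EC2 is %.0f%% faster\n", (a / b - 1) * 100 }'
```

In practice the full sysbench output would be piped in, for example `sysbench ... run | sysbench_tps`.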

Time to move on and increase the number of threads to 10.

m1.small EC2 instance:

OLTP test statistics:
queries performed:
read: 264866
write: 94545
other: 37818
total: 397229
transactions: 18899 (104.97 per sec.)
deadlocks: 20 (0.11 per sec.)
read/write requests: 359411 (1996.22 per sec.)
other operations: 37818 (210.05 per sec.)

Test execution summary:
total time: 180.0462s
total number of events: 18899
total time taken by event execution: 1799.9289
per-request statistics:
min: 4.08ms
avg: 95.24ms
max: 2620.70ms
approx. 95 percentile: 445.91ms

db.m1.small RDS instance:

OLTP test statistics:
queries performed:
read: 343812
write: 122772
other: 49109
total: 515693
transactions: 24551 (136.18 per sec.)
deadlocks: 7 (0.04 per sec.)
read/write requests: 466584 (2588.13 per sec.)
other operations: 49109 (272.41 per sec.)

Test execution summary:
total time: 180.2788s
total number of events: 24551
total time taken by event execution: 1801.8298
per-request statistics:
min: 13.41ms
avg: 73.39ms
max: 1126.02ms
approx. 95 percentile: 143.83ms

In this test the small RDS instance is faster than the EC2 instance, 136 vs 105 transactions per second. I’ve also benchmarked a large RDS instance (the next size available after db.m1.small), which reached 185 transactions per second. Quite good, but the price is 4x higher.

The next test was performed against a database of 10 million records with 16 threads. This time I only benchmarked a small and a large RDS instance. The large instance managed 228 transactions per second, while the small one got a decent score of 127 transactions per second. One thing I noticed during this test is that the small instance started to use its swap, while the large one did not have this issue. This is probably because a 10M-record database is approx. 2.5GB and the small RDS instance only has 1.7GB of RAM.

So if you are planning to grow and want an easy way to do it, moving your database to its own RDS instance is one of the first things you should consider. One of the immediate effects you will notice is that CPU usage on the EC2 instance is greatly reduced, leaving more power for the web server. You can easily increase the size and capacity of the RDS instance with just a few clicks. The backups are done automatically, which is great considering how many times I have had to recover databases.

MySQL benchmarks using Amazon EC2 instances

Here are some tests I’ve run on Amazon using AMIs provided by scalr for the mysql role. I’ve used the benchmark scripts supplied by MySQL located in /usr/share/mysql/sql-bench. I had to install a package before running the tests:

apt-get install libdbd-pg-perl

After that everything was simple:

root@ec2# mysql
mysql> create database test;
mysql> quit;
root@ec2# cd /usr/share/mysql/sql-bench
root@ec2# perl run-all-tests --dir='/root/'

For the EBS tests I did the following:
-created a 1GB EBS volume in scalr
-attached it to the instance I was testing
-noted the device name (/dev/sdb for example)

root@ec2# apt-get install xfsprogs
root@ec2# mkfs.xfs /dev/sdb
root@ec2# mkdir /mnt/storage
root@ec2# cp -R /var/lib/mysql /mnt/storage/
root@ec2# chown mysql:mysql -R /mnt/storage/mysql

-edited /etc/mysql/my.cnf and changed datadir from "/var/lib/mysql" to "/mnt/storage/mysql"
-restarted the mysql server and started the tests:


root@ec2# /etc/init.d/mysql restart
root@ec2# mysql
mysql> drop database test;
mysql> create database test;
mysql> quit;
root@ec2# cd /usr/share/mysql/sql-bench
root@ec2# perl run-all-tests --dir='/root/'
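The datadir edit described above can be scripted. This is a sketch with a made-up function name (set_datadir). Note that the commands listed earlier create the filesystem but do not show it being mounted; the sketch assumes the volume was mounted on /mnt/storage before the data was copied.

```shell
#!/bin/sh
# Assumption: the EBS volume is mounted first, for example
#   mount /dev/sdb /mnt/storage
# otherwise the copied data ends up on the root disk, not on EBS.

# Point datadir at a new location in a my.cnf-style file.
# A backup of the original file is kept next to it with a .bak suffix.
#   $1 = config file, $2 = new data directory
set_datadir() {
    sed -i.bak "s|^datadir[[:space:]]*=.*|datadir = $2|" "$1"
}

# Usage (as root):
#   set_datadir /etc/mysql/my.cnf /mnt/storage/mysql
```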

Instance types used and their codes:

m1.small ($0.10/hour) – Small Instance (Default) 1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB of instance storage, 32-bit platform

m1.large ($0.40/hour) – Large Instance 7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform

c1.medium ($0.20/hour) – High-CPU Medium Instance 1.7 GB of memory, 5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of instance storage, 32-bit platform

c1.xlarge ($0.80/hour) – High-CPU Extra Large Instance 7 GB of memory, 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform

EC2 Compute Unit (ECU) – One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.


instance       seconds  usr     sys    cpu     tests
m1.small       1823     196.54  28.66  225.2   3425950
m1.small+ebs   1646     197.18  29.61  226.79  3425950
m1.large       1072     157.06  26.97  184.03  3425950
m1.large+ebs   1088     154.23  25.23  179.46  3425950
c1.medium      902      131.18  25.63  156.81  3425950
c1.medium+ebs  901      130.76  28.84  159.6   3425950
c1.xlarge      704      123.31  32.8   156.11  3425950
c1.xlarge+ebs  781      121.02  29.52  150.54  3425950

Below you can see a nice chart of how much time it took each instance to finish the benchmark tests. Either I did something terribly wrong or EBS doesn’t improve MySQL performance.
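The size of the EBS effect can be read off the seconds column of the table. A quick sketch (ebs_delta is a name made up for this example):

```shell
#!/bin/sh
# Percentage change in total benchmark time when moving from instance
# storage to EBS. Negative means the EBS run finished faster.
#   $1 = seconds on instance storage, $2 = seconds on EBS
ebs_delta() {
    awk -v inst="$1" -v ebs="$2" 'BEGIN { printf "%+.1f%%\n", (ebs - inst) / inst * 100 }'
}

ebs_delta 1823 1646   # m1.small
ebs_delta 1072 1088   # m1.large
ebs_delta 902 901     # c1.medium
ebs_delta 704 781     # c1.xlarge
```

The sign flips between instance types, which fits the conclusion that EBS gave no consistent improvement in these runs.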