Benchmarking from French Territory

Galera development has finally reached a state where it is possible to start serious testing with real DBMS benchmarks. However, the current Galera release supports only SQL statement-level replication. Row-level replication (utilizing MySQL row-based replication, RBR) is still under development. But even at statement level, the scalability seems quite good and is worth testing.

I rented a farmhouse in Bergerac, France, with the plan to run
Galera benchmark tests from there during an extensive two-week benchmarking period. The plan was otherwise good, but
the internet connection never came alive in that house. The WLAN
'livebox' let down all expectations, and some configuration black magic would have been needed to sort it out.
So I eventually ended up sitting in McDonald's Bergerac, since it had the most hassle-free internet connection available.
Yes, it was me sitting at the corner table and drinking all those Grand Cafes, shame on me.

However, the testing was good (and so was the coffee, as always). I'll summarize the results below, but first some words about the test system and environment.

sqlgen
I first wanted to test with just simple SELECT/UPDATE scenarios to see how Galera scales with different cluster sizes and conflict rates. For testing, I used Alex's 'sqlgen' load generator.
Sqlgen is an extendable SQL generator, which can run different 'scenarios' against a group of DBMSs. We intended to publish
sqlgen as open source, but Alex is not too happy with the code
quality and does not want to talk about it anymore. I just
wonder why: it works for me exactly as expected, and with sqlgen I can hit a Galera cluster pretty hard. Powerful tool.

I used sqlgen's multihost property to share the load evenly
between all cluster nodes. I have earlier tried testing with
various middleware load balancers (e.g. MySQL Proxy and the 'pen' TCP load balancer), but load balancing always seems to cause a serious bottleneck for the testing.
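To illustrate what sharing the load evenly on the client side can look like, here is a minimal sketch in Python. It is not sqlgen code, and how sqlgen's multihost property works internally is not described here. The host names and the `assign_users` helper are hypothetical; the only figure taken from the tests is 5 users per node.

```python
from collections import Counter
from itertools import cycle


def assign_users(hosts, users_per_node):
    """Assign client connections to hosts in round-robin order.

    Assumption: an even split is achieved by simple round-robin on the
    client, with no middleware load balancer in between.
    """
    total_users = users_per_node * len(hosts)
    host_cycle = cycle(hosts)
    return [next(host_cycle) for _ in range(total_users)]


# Hypothetical host names for a 4-node cluster.
hosts = ["node1", "node2", "node3", "node4"]
assignment = assign_users(hosts, users_per_node=5)

# Show how many connections each node receives.
print(Counter(assignment))
```

Because each client connects directly to its node, no single proxy process sits in the path of every statement, which is the bottleneck the load balancers introduced.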

Test Cluster
The test cluster has 4 more or less identical nodes and one client machine, which runs the tests. The hardware is HP
ProLiant DL320 G2 (3 GHz Pentium 4 with Hyper-Threading), with RAM
sizes ranging from 800 MB to 1 GB. The hardware is quite old
by current standards, but this is the best we could
get (well, for nothing).

All cluster nodes were running in each test. I just configured the test to use 1-4 master nodes in each test run. So, e.g., in the 2-node cluster test I actually had 2 master nodes and 2 slave nodes.

The operating system is CentOS 5.1.

There are two 1Gb networks used, one for client connections,
and the other for replication traffic.

Galera Release
The Galera release I used for these tests is mysql-5.1.24-rc-302,367M. As you might guess, this one is based on MySQL 5.1.24rc. This Galera release has only SQL statement-level application available.

The group communication system was Spread 3.17.04. Teemu has the first version of his group communication backend, called 'vsbes', ready for testing, but unfortunately I did not get a CentOS build early enough for these tests.

Test Scenarios

For testing, I ran simple UPDATE/SELECT statements against the cluster.

DB size: 50 tables, 1000 rows each
trx length: random 2-8 statements
statement: random SELECT or UPDATE referencing one row
users: 5 per node
duration: 800 secs
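
The scenario parameters above can be sketched as a small transaction generator. This is only an illustration in Python, not sqlgen's actual implementation: the table names (`t0`..`t49`), the column names (`id`, `val`), the uniform random choices and the `read_fraction` parameter are all assumptions made for the example.

```python
import random

N_TABLES = 50          # DB size: 50 tables ...
ROWS_PER_TABLE = 1000  # ... with 1000 rows each


def random_statement(rng, read_fraction):
    """Build one SELECT or UPDATE that references a single row.

    Table and column names are hypothetical; the schema used in the
    tests is not described in the post.
    """
    table = f"t{rng.randrange(N_TABLES)}"
    row_id = rng.randrange(ROWS_PER_TABLE)
    if rng.random() < read_fraction:
        return f"SELECT val FROM {table} WHERE id = {row_id}"
    return f"UPDATE {table} SET val = val + 1 WHERE id = {row_id}"


def random_transaction(rng, read_fraction):
    """Build a transaction of 2-8 random statements (assumed uniform)."""
    n_statements = rng.randint(2, 8)
    body = [random_statement(rng, read_fraction) for _ in range(n_statements)]
    return ["BEGIN"] + body + ["COMMIT"]


rng = random.Random(42)  # fixed seed so the example is repeatable
trx = random_transaction(rng, read_fraction=0.7)  # 70/30 read/write mix
print("\n".join(trx))
```

With a uniform draw between 2 and 8, a transaction averages 5 statements, which is consistent with the roughly 5:1 ratio of statements to transactions in the result tables.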

I ran several tests with different read/write ratios: 100/0, 90/10, 70/30, 50/50, 30/70 and 0/100. I ran each test first against plain MySQL 5.1.24rc and then against a Galera cluster with 1, 2, 3 and 4 nodes.

Test summaries
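Here are the test summaries. For each R/W ratio tested, the result tables show the figures obtained with plain MySQL and with Galera clusters of 1-4 master nodes. The '1/sec' columns are per-second rates, and 'abrtd' is the number of aborted transactions, followed by its percentage of all transactions. The transactions per second figure (TRANSACT 1/sec) is the one to look for here.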

R/W ratio 100/0:
        R/W      SELECTs         UPDATEs      TRANSACT     (abrtd  %)   CONNECTIONS
               total   1/sec    total  1/sec  total  1/sec              total 1/sec
plain   inf  2928429  3660.53      0    0.00  585664  732.08    5  0.0  39121  48.90
1 node  inf  2774580  3468.20      0    0.00  555290  694.11    5  0.0  37056  46.32
2 nodes inf  5418385  6772.94      0    0.00 1083957 1354.94   10  0.0  72278  90.35
3 nodes inf  7924326  9905.33      0    0.00 1586812 1983.50   35  0.0 105924 132.40
4 nodes inf 10276153 12844.93      0    0.00 2058321 2572.85   85  0.0 137262 171.57

R/W ratio 90/10:
        R/W      SELECTs         UPDATEs        TRANSACT   (abrtd  %)   CONNECTIONS
               total  1/sec    total  1/sec   total 1/sec               total 1/sec
plain   9.0  2412753 3015.93  268487  335.61  535950  669.93   3  0.0 35716  44.64
1 node  9.0  2131439 2664.28  237750  297.19  474265  592.83   6  0.0 31635  39.54
2 nodes 9.0  3637129 4546.36  404563  505.70  808327 1010.40  16  0.0 53958  67.45
3 nodes 9.0  4789957 5987.35  532785  665.97 1065853 1332.29  58  0.0 71125  88.90
4 nodes 9.0  5686359 7107.80  631594  789.48 1263779 1579.69  85  0.0 84293 105.36

R/W ratio 70/30:
        R/W      SELECTs         UPDATEs        TRANSACT   (abrtd  %)   CONNECTIONS
               total  1/sec    total  1/sec   total 1/sec               total 1/sec
plain   2.3  1733594 2166.98  743245  929.05  495502  619.37    5  0.0 32947  41.18
1 node  2.3  1424751 1780.92  611635  764.54  407019  508.77    4  0.0 27103  33.88
2 nodes 2.3  2184616 2730.75  937592 1171.98  624614  780.76   93  0.0 41668  52.08
3 nodes 2.3  2681934 3352.35 1148881 1436.07  766113  957.62  234  0.0 51064  63.83
4 nodes 2.3  2945579 3681.85 1261155 1576.39  841385 1051.70  517  0.1 56098  70.12

R/W ratio 50/50:
        R/W      SELECTs         UPDATEs        TRANSACT   (abrtd  %)   CONNECTIONS
               total  1/sec    total  1/sec   total 1/sec               total 1/sec
plain   1.0  1126349 1407.93 1122834 1403.53  450015  562.52   11  0.0 30004  37.50
1 node  1.0   940427 1175.52  939391 1174.23  375788  469.73    4  0.0 25028  31.28
2 nodes 1.0  1342998 1678.73 1343771 1679.70  536774  670.96  190  0.0 35769  44.71
3 nodes 1.0  1541570 1926.94 1541060 1926.30  615907  769.87  666  0.1 41125  51.41
4 nodes 1.0  1699547 2124.38 1700641 2125.75  679022  848.76 1154  0.2 45272  56.59

R/W ratio 30/70:
        R/W      SELECTs         UPDATEs        TRANSACT   (abrtd  %)   CONNECTIONS
               total  1/sec    total  1/sec   total   1/sec            total 1/sec
plain   0.4   678227  847.78 1585116 1981.39  452797  565.99    4  0.0 30218  37.77
1 node  0.4   542838  678.54 1269179 1586.46  362444  453.05    5  0.0 24179  30.22
2 nodes 0.4   728475  910.59 1704157 2130.18  485987  607.48  408  0.1 32333  40.42
3 nodes 0.4   810440 1013.04 1894855 2368.55  540251  675.31 1062  0.2 36034  45.04
4 nodes 0.4   880494 1100.58 2056919 2571.06  585425  731.76 1858  0.3 39056  48.82

R/W ratio 0/100:
        R/W      SELECTs         UPDATEs        TRANSACT   (abrtd  %)   CONNECTIONS
               total  1/sec    total  1/sec   total   1/sec            total 1/sec
plain   0.0        0    0.00 2267208 2834.00  453616  567.02    5  0.0 30280  37.85
1 node  0.0        0    0.00 1748911 2186.12  349629  437.03    4  0.0 23324  29.15
2 nodes 0.0        0    0.00 2171305 2714.11  434160  542.70  640  0.1 28957  36.20
3 nodes 0.0        0    0.00 2350447 2938.02  468227  585.28 1968  0.4 31223  39.03
4 nodes 0.0        0    0.00 2490604 3113.19  495079  618.84 3183  0.6 32964  41.20

Conclusion
In general, the scalability seems reasonably good in these tests.
In the 100/0 case, the difference between plain MySQL and a 1-node Galera cluster is ~5%. I'm not happy with this overhead; under a pure read load, the overhead should be close to 0.
In the 0/100 case, you would expect that a Galera cluster never gets better figures than plain MySQL. The better performance with the 3- and 4-node Galera clusters must mean that, with 5 connections, the client-server communication has significant latency. On the other hand, when Galera applies replicated write sets, this communication latency is avoided. I need to re-test with different numbers of users to verify this.
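
For anyone who wants to check the arithmetic behind these remarks, the following Python sketch computes throughput relative to plain MySQL from the TRANSACT 1/sec column of the result tables. The figures are copied from the tables; the `relative_to_plain` helper and the output layout are just illustrative choices.

```python
# Transactions per second (TRANSACT 1/sec), copied from the result tables.
# Keys 1-4 are the number of Galera master nodes.
trx_per_sec = {
    "100/0": {"plain": 732.08, 1: 694.11, 2: 1354.94, 3: 1983.50, 4: 2572.85},
    "90/10": {"plain": 669.93, 1: 592.83, 2: 1010.40, 3: 1332.29, 4: 1579.69},
    "70/30": {"plain": 619.37, 1: 508.77, 2: 780.76, 3: 957.62, 4: 1051.70},
    "50/50": {"plain": 562.52, 1: 469.73, 2: 670.96, 3: 769.87, 4: 848.76},
    "30/70": {"plain": 565.99, 1: 453.05, 2: 607.48, 3: 675.31, 4: 731.76},
    "0/100": {"plain": 567.02, 1: 437.03, 2: 542.70, 3: 585.28, 4: 618.84},
}


def relative_to_plain(rates):
    """Return each cluster size's throughput as a multiple of plain MySQL."""
    base = rates["plain"]
    return {nodes: rate / base for nodes, rate in rates.items() if nodes != "plain"}


for ratio, rates in trx_per_sec.items():
    rel = relative_to_plain(rates)
    overhead_pct = (1 - rel[1]) * 100  # single-node Galera vs plain MySQL
    cols = "  ".join(f"{n} node(s)={rel[n]:.2f}x" for n in sorted(rel))
    print(f"{ratio:>6}: {cols}  (1-node overhead {overhead_pct:.1f}%)")
```

For the read-only case this gives roughly a 5% single-node overhead and about 3.5x the plain MySQL throughput with four nodes, while in the write-only case only the 3- and 4-node clusters end up above 1.0x.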
All in all, this seems like a good start for more complicated benchmarks. But I must travel back to Finland for further testing...