Benchmarking from French Territory
Galera development has finally reached a state where it is possible to start serious testing with real DBMS benchmarks. However, the current Galera release supports only SQL statement-level replication. Row-level replication (utilizing MySQL row-based replication, RBR) is still under development. But even at statement level, the scalability looks quite good and is worth testing.
I rented a farmhouse in Bergerac, France, planning to run Galera benchmark tests from there during an extensive two-week benchmarking period. The plan was otherwise good, but the internet connection in that house never came alive. The WLAN 'livebox' failed all expectations, and some configuration black magic would have been needed to sort it out.
So I eventually ended up sitting in McDonald's Bergerac, since it had the most hassle-free internet connection available.
Yes, that was me sitting at the corner table drinking all those Grand Cafes, shame on me.
However, the testing was good (and so was the coffee, as always). I'll summarize the results below, but first some words about the test system and environment.
sqlgen
I first wanted to test with just simple SELECT/UPDATE scenarios to see how Galera scales with different cluster sizes and conflict rates. For testing, I used Alex's 'sqlgen' load generator.
Sqlgen is an extensible SQL generator that can run different 'scenarios' against a group of DBMSs. We intended to publish sqlgen as open source, but Alex is not too happy with the code quality and does not want to talk about it anymore. I just wonder why: it works for me exactly as expected, and with sqlgen I can hit a Galera cluster pretty hard. Powerful tool.
I used sqlgen's multihost property to share the load evenly between all cluster nodes. I have earlier tried testing with various middleware load balancers (e.g. MySQL Proxy and the 'pen' TCP load balancer), but load balancing always seems to cause a serious bottleneck for the testing.
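How sqlgen implements its multihost option is not shown here, so the following is only a rough illustration of the general idea: the client assigns its sessions to the cluster nodes round-robin, so no balancer sits in the data path. The host names and the `assign_users` helper are hypothetical.

```python
from itertools import cycle


def assign_users(hosts, users_per_node):
    """Assign client sessions to hosts round-robin, so each host gets the same count."""
    total_users = users_per_node * len(hosts)
    host_cycle = cycle(hosts)
    return [next(host_cycle) for _ in range(total_users)]


# Hypothetical host names; the real cluster node names are not given in the post.
hosts = ["node1", "node2", "node3", "node4"]
assignment = assign_users(hosts, users_per_node=5)

for host in hosts:
    print(host, assignment.count(host))
```

Each client session would then open its own direct connection to the host it was assigned.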
Test Cluster
The test cluster has 4 more or less identical nodes and one client machine, which runs the tests. The hardware is HP ProLiant DL320 G2 (3 GHz hyperthreading Pentium 4) with RAM sizes ranging from 800 MB to 1 GB. The hardware is quite old by current standards, but this is the best we could get (well, for nothing).
All cluster nodes were running in each test. I just configured the test to use 1-4 master nodes in each test run. So, e.g., in the 2-node cluster test I actually had 2 master nodes and 2 slave nodes.
The operating system is CentOS 5.1.
There are two 1Gb networks in use: one for client connections and the other for replication traffic.
Galera Release
The Galera release I used for these tests is mysql-5.1.24-rc-302,367M. As you might guess, it is based on MySQL 5.1.24rc. This Galera release has only SQL statement-level application available.
The group communication system was Spread 3.17.04. Teemu has a first version of his group communication backend, called 'vsbes', ready for testing, but unfortunately I did not get a CentOS build early enough for these tests.
Test Scenarios
For testing, I ran simple UPDATE/SELECT statements against the cluster.
- DB size: 50 tables, 1000 rows each
- trx length: random 2-8 statements
- statement: random SELECT or UPDATE referencing one row
- users: 5 per node
- duration: 800 secs
I ran several tests with different read/write ratios: 100/0, 90/10, 70/30, 50/50, 30/70 and 0/100. I ran each test first against plain MySQL 5.1.24rc and then against a Galera cluster with 1, 2, 3 and 4 nodes.
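To make the scenario concrete, here is a minimal sketch of how such a workload could be generated. It is not sqlgen's code: the table names (`t0`..`t49`), the `id` and `val` columns, and both helper functions are assumptions made for illustration only.

```python
import random


def make_statement(rng, read_fraction, tables=50, rows=1000):
    """Build one single-row statement; SELECT with probability read_fraction."""
    table = f"t{rng.randrange(tables)}"  # hypothetical table naming scheme
    row_id = rng.randrange(rows)         # hypothetical integer primary key
    if rng.random() < read_fraction:
        return f"SELECT val FROM {table} WHERE id = {row_id}"
    return f"UPDATE {table} SET val = val + 1 WHERE id = {row_id}"


def make_transaction(rng, read_fraction):
    """Build a transaction of 2-8 random statements (both bounds inclusive)."""
    length = rng.randint(2, 8)
    return [make_statement(rng, read_fraction) for _ in range(length)]


rng = random.Random(42)  # fixed seed so a run can be repeated
for statement in make_transaction(rng, read_fraction=0.9):  # the 90/10 mix
    print(statement)
```

With only 50 x 1000 rows and every node accepting writes, a higher UPDATE share makes it more likely that two concurrent transactions touch the same row, which is where conflicts come from.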
Test summaries
Here are the test summaries. For each R/W ratio tested, the result tables show the figures obtained with plain MySQL and with 1-4 master node Galera clusters. The trx/sec figure (the 1/sec column under TRANSACT) is the one to look at here.
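The derived columns in the result tables can be cross-checked from the raw totals. The sketch below reflects my reading of the columns, which is an assumption: R/W is total SELECTs divided by total UPDATEs, and the abort percentage is aborted transactions over total transactions, both rounded to one decimal. The `derived_columns` helper is hypothetical.

```python
def derived_columns(selects, updates, trx_total, trx_aborted):
    """Recompute the R/W ratio and the abort percentage from raw totals."""
    # A pure read load has no UPDATEs, which the tables report as "inf".
    rw_ratio = float("inf") if updates == 0 else round(selects / updates, 1)
    aborted_pct = round(100.0 * trx_aborted / trx_total, 1)
    return rw_ratio, aborted_pct


# Raw totals copied from the 90/10 "plain" row and the 0/100 "4 nodes" row.
print(derived_columns(2412753, 268487, 535950, 3))
print(derived_columns(0, 2490604, 495079, 3183))
```

Under this reading the recomputed values agree with the R/W and % columns printed for those two rows.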
R/W ratio 100/0:
setup     R/W     SELECTs            UPDATEs           TRANSACT                     CONNECTIONS
                  total     1/sec    total    1/sec    total    1/sec  abrtd    %   total   1/sec
plain     inf   2928429   3660.53        0     0.00   585664   732.08      5  0.0   39121   48.90
1 node    inf   2774580   3468.20        0     0.00   555290   694.11      5  0.0   37056   46.32
2 nodes   inf   5418385   6772.94        0     0.00  1083957  1354.94     10  0.0   72278   90.35
3 nodes   inf   7924326   9905.33        0     0.00  1586812  1983.50     35  0.0  105924  132.40
4 nodes   inf  10276153  12844.93        0     0.00  2058321  2572.85     85  0.0  137262  171.57
R/W ratio 90/10:
setup     R/W     SELECTs            UPDATEs           TRANSACT                     CONNECTIONS
                  total     1/sec    total    1/sec    total    1/sec  abrtd    %   total   1/sec
plain     9.0   2412753   3015.93   268487   335.61   535950   669.93      3  0.0   35716   44.64
1 node    9.0   2131439   2664.28   237750   297.19   474265   592.83      6  0.0   31635   39.54
2 nodes   9.0   3637129   4546.36   404563   505.70   808327  1010.40     16  0.0   53958   67.45
3 nodes   9.0   4789957   5987.35   532785   665.97  1065853  1332.29     58  0.0   71125   88.90
4 nodes   9.0   5686359   7107.80   631594   789.48  1263779  1579.69     85  0.0   84293  105.36
R/W ratio 70/30:
setup     R/W     SELECTs            UPDATEs           TRANSACT                     CONNECTIONS
                  total     1/sec    total    1/sec    total    1/sec  abrtd    %   total   1/sec
plain     2.3   1733594   2166.98   743245   929.05   495502   619.37      5  0.0   32947   41.18
1 node    2.3   1424751   1780.92   611635   764.54   407019   508.77      4  0.0   27103   33.88
2 nodes   2.3   2184616   2730.75   937592  1171.98   624614   780.76     93  0.0   41668   52.08
3 nodes   2.3   2681934   3352.35  1148881  1436.07   766113   957.62    234  0.0   51064   63.83
4 nodes   2.3   2945579   3681.85  1261155  1576.39   841385  1051.70    517  0.1   56098   70.12
R/W ratio 50/50:
setup     R/W     SELECTs            UPDATEs           TRANSACT                     CONNECTIONS
                  total     1/sec    total    1/sec    total    1/sec  abrtd    %   total   1/sec
plain     1.0   1126349   1407.93  1122834  1403.53   450015   562.52     11  0.0   30004   37.50
1 node    1.0    940427   1175.52   939391  1174.23   375788   469.73      4  0.0   25028   31.28
2 nodes   1.0   1342998   1678.73  1343771  1679.70   536774   670.96    190  0.0   35769   44.71
3 nodes   1.0   1541570   1926.94  1541060  1926.30   615907   769.87    666  0.1   41125   51.41
4 nodes   1.0   1699547   2124.38  1700641  2125.75   679022   848.76   1154  0.2   45272   56.59
R/W ratio 30/70:
setup     R/W     SELECTs            UPDATEs           TRANSACT                     CONNECTIONS
                  total     1/sec    total    1/sec    total    1/sec  abrtd    %   total   1/sec
plain     0.4    678227    847.78  1585116  1981.39   452797   565.99      4  0.0   30218   37.77
1 node    0.4    542838    678.54  1269179  1586.46   362444   453.05      5  0.0   24179   30.22
2 nodes   0.4    728475    910.59  1704157  2130.18   485987   607.48    408  0.1   32333   40.42
3 nodes   0.4    810440   1013.04  1894855  2368.55   540251   675.31   1062  0.2   36034   45.04
4 nodes   0.4    880494   1100.58  2056919  2571.06   585425   731.76   1858  0.3   39056   48.82
R/W ratio 0/100:
setup     R/W     SELECTs            UPDATEs           TRANSACT                     CONNECTIONS
                  total     1/sec    total    1/sec    total    1/sec  abrtd    %   total   1/sec
plain     0.0         0      0.00  2267208  2834.00   453616   567.02      5  0.0   30280   37.85
1 node    0.0         0      0.00  1748911  2186.12   349629   437.03      4  0.0   23324   29.15
2 nodes   0.0         0      0.00  2171305  2714.11   434160   542.70    640  0.1   28957   36.20
3 nodes   0.0         0      0.00  2350447  2938.02   468227   585.28   1968  0.4   31223   39.03
4 nodes   0.0         0      0.00  2490604  3113.19   495079   618.84   3183  0.6   32964   41.20
Conclusion
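In general, the scalability seems reasonably good in these tests: with a pure read load, 4 nodes deliver about 3.5 times the transaction rate of plain MySQL, while with a pure write load the gain over plain MySQL shrinks to about 9%.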
In the 100/0 case, the difference between plain MySQL and a 1-node Galera cluster is ~5%. I'm not happy with this overhead; under a pure read load, the overhead should be close to 0.
In the 0/100 case, you would expect that a Galera cluster never gets better figures than plain MySQL. The better performance with 3- and 4-node Galera clusters must mean that, with 5 connections, the client-server communication has significant latency. On the other hand, when Galera applies replicated write sets, this communication latency is avoided. I need to re-test with different numbers of users to verify this.
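The overhead and scaling figures discussed here can be recomputed from the trx/sec column of the result tables. The sketch below only does that arithmetic; the `overhead` and `speedup` helpers are names I made up for it.

```python
# Transactions per second (TRANSACT 1/sec), copied from the result tables.
# Each list holds: plain MySQL, then Galera with 1, 2, 3 and 4 master nodes.
TRX_PER_SEC = {
    "100/0": [732.08, 694.11, 1354.94, 1983.50, 2572.85],
    "90/10": [669.93, 592.83, 1010.40, 1332.29, 1579.69],
    "70/30": [619.37, 508.77, 780.76, 957.62, 1051.70],
    "50/50": [562.52, 469.73, 670.96, 769.87, 848.76],
    "30/70": [565.99, 453.05, 607.48, 675.31, 731.76],
    "0/100": [567.02, 437.03, 542.70, 585.28, 618.84],
}


def overhead(ratio):
    """Fraction of plain MySQL throughput lost by a 1-node Galera cluster."""
    plain, one_node = TRX_PER_SEC[ratio][0], TRX_PER_SEC[ratio][1]
    return 1.0 - one_node / plain


def speedup(ratio, nodes):
    """Throughput of an n-node Galera cluster relative to plain MySQL."""
    return TRX_PER_SEC[ratio][nodes] / TRX_PER_SEC[ratio][0]


for ratio in TRX_PER_SEC:
    gains = ", ".join(f"{speedup(ratio, n):.2f}x" for n in range(1, 5))
    print(f"{ratio:>6}: 1-node overhead {overhead(ratio):5.1%}, vs plain: {gains}")
```

For 100/0 this gives a 1-node overhead of roughly 5%, and for 0/100 the cluster only passes plain MySQL from 3 nodes upward, which matches the observations above.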
All in all, this seems like a good start for more complicated benchmarks. But I must travel back to Finland for further testing...
- seppo's blog
