New RBR Oars Make Galera Fly

Don't worry if the title does not make much sense to you. RBR here refers to the MySQL 5.1 Row Based Replication method, which we use in the latest Galera release. Galera, on the other hand, is also known as a multi "oar motored" rowing boat. That should explain the title semantics.

We previously had only an SQL statement level write set representation, and applying it required full SQL statement processing in the MySQL engine. The RBR representation contains the direct tuple changes as binary data, so the applying phase is more or less short-circuited. To see the effect on performance, I ran sqlgen benchmarks against plain MySQL and against SQL level and RBR level Galera clusters. The cluster size again varied from 1 to 4 nodes. We have merged the Galera integration into the latest 5.1.26rc release, so MySQL 5.1.26rc is used as the reference.

Here again I have individual graphs for each R/W load profile. Plain MySQL performance is shown as a horizontal line, and Galera performance at cluster sizes of 1 to 4 nodes is plotted to show the scaling. There are separate curves for Galera in SQL and RBR modes. The y-axis metric is transactions/sec.


R/W ratio: 100/0

http://www.codership.com/sites/default/files/pictures/sqlgen_2_100x0.jpg


R/W ratio: 90/10

http://www.codership.com/sites/default/files/pictures/sqlgen_2_90x10.jpg


R/W ratio: 70/30

http://www.codership.com/sites/default/files/pictures/sqlgen_2_70x30.jpg


R/W ratio: 50/50

http://www.codership.com/sites/default/files/pictures/sqlgen_2_50x50.jpg


R/W ratio: 30/70

http://www.codership.com/sites/default/files/pictures/sqlgen_2_30x70.jpg


R/W ratio: 0/100

http://www.codership.com/sites/default/files/pictures/sqlgen_2_0x100.jpg

For the interested reader, here are the exact figures from the test logs. Only plain MySQL and the RBR level Galera clusters are compared here; I did not want to repeat the SQL level results.

The test setup is again as follows.

DB size: 50 tables, 1000 rows each
trx length: random 2-8 statements
statement: random SELECT or UPDATE referencing one row
users: 5 per node
duration: 800 secs

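
To make the load profile above concrete, here is a minimal Python sketch of how one such transaction could be generated. This is not the sqlgen code; the table names (t0..t49), the columns (id, val) and the function name make_transaction are assumptions made up for illustration. Only the sizes and the 2-8 statement range come from the setup above.

```python
import random

N_TABLES = 50          # from the setup: 50 tables
ROWS_PER_TABLE = 1000  # from the setup: 1000 rows each


def make_transaction(read_ratio, rng):
    """Build one transaction as a list of SQL strings.

    read_ratio is the probability that a statement is a SELECT,
    e.g. 0.7 for the 70/30 profile. Table and column names are
    hypothetical; the real sqlgen schema may differ.
    """
    statements = []
    for _ in range(rng.randint(2, 8)):  # random 2-8 statements
        table = rng.randrange(N_TABLES)
        row = rng.randrange(ROWS_PER_TABLE)
        if rng.random() < read_ratio:
            statements.append(
                f"SELECT val FROM t{table} WHERE id = {row}")
        else:
            statements.append(
                f"UPDATE t{table} SET val = val + 1 WHERE id = {row}")
    return statements


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    for stmt in make_transaction(0.7, rng):
        print(stmt)
```

A uniform 2-8 statement range gives 5 statements per transaction on average, which agrees with the ratio of statements to transactions in the logs below.
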
R/W ratio 100/0:
        R/W      SELECTs         UPDATEs      TRANSACT     (abrtd  %)   CONNECTIONS
               total   1/sec    total  1/sec  total  1/sec              total 1/sec
plain   inf  2866823  3583.50      0    0.00  573360   716.69   3  0.0 38233   47.79
1 node  inf  2684098  3355.10      0    0.00  537082   671.35   5  0.0 35752   44.69
2 nodes inf  5282776  6603.42      0    0.00  1056405 1320.50  10  0.0 70524   88.15
3 nodes inf  7738388  9672.79      0    0.00  1548897 1936.08  45  0.0 103225 129.03
4 nodes inf  10073732 12591.84     0    0.00  2015923 2519.84  47  0.0 134407 168.00
R/W ratio 90/10:
        R/W      SELECTs         UPDATEs      TRANSACT     (abrtd  %)   CONNECTIONS
               total   1/sec    total  1/sec  total  1/sec              total 1/sec
plain   9.0  2377462 2971.81  264315  330.39   528437 660.54     5  0.0 35215  44.02
1 node  9.0  2076543 2595.66  230808  288.51   461169 576.46     5  0.0 30741  38.43
2 nodes 9.0  3687412 4609.23  410543  513.18   819856 1024.81   21  0.0 54563  68.20
3 nodes 9.0  5009908 6262.30  557697  697.11  1114313 1392.87   64  0.0 74391  92.99
4 nodes 9.0  6164267 7705.13  686034  857.52  1371202 1713.96  124  0.0 91448 114.31
R/W ratio 70/30:
        R/W      SELECTs         UPDATEs      TRANSACT     (abrtd  %)   CONNECTIONS
               total   1/sec    total  1/sec  total  1/sec              total 1/sec
plain   2.3  1723648 2154.55   739206  924.00  492374  615.46    4  0.0 32909  41.14
1 node  2.3  1346832 1683.53   575308  719.13  384285  480.35    4  0.0 25638  32.05
2 nodes 2.3  2273404 2841.72   973865 1217.32  648926  811.15   96  0.0 43314  54.14
3 nodes 2.3  2966560 3708.15  1272302 1590.36  848110 1060.12  270  0.0 56600  70.75
4 nodes 2.3  3539225 4423.95  1519969 1899.93 1012036 1265.02  489  0.0 67378  84.22
R/W ratio 50/50:
        R/W      SELECTs         UPDATEs      TRANSACT     (abrtd  %)   CONNECTIONS
               total   1/sec    total  1/sec  total  1/sec              total 1/sec
plain   1.0  1166453 1458.06  1169114 1461.39  467005  583.75    4  0.0 31186  38.98
1 node  1.0  883614  1104.51   883547 1104.43  352995  441.24    4  0.0 23619  29.52
2 nodes 1.0  1436937 1796.16  1437124 1796.39  574882  718.60  248  0.0 38322  47.90
3 nodes 1.0  1838091 2297.58  1838769 2298.43  735640  919.54  564  0.1 49002  61.25
4 nodes 1.0  2158841 2698.50  2157055 2696.27  862219 1077.75 1009  0.1 57563  71.95
R/W ratio 30/70:
        R/W      SELECTs         UPDATEs      TRANSACT     (abrtd  %)   CONNECTIONS
               total   1/sec    total  1/sec  total  1/sec              total 1/sec
plain   0.4   681777  852.22  1585901 1982.36  453250 566.56    4  0.0 30188  37.73
1 node  0.4   504558  630.69  1180303 1475.37  336746 420.93    5  0.0 22460  28.07
2 nodes 0.4   809377 1011.71  1890133 2362.65  539906 674.88  386  0.1 35988  44.98
3 nodes 0.4  1016769 1270.94  2375951 2969.90  678398 847.99 1049  0.2 45269  56.59
4 nodes 0.4  1169442 1461.78  2737223 3421.47  779683 974.59 1877  0.2 52042  65.05
R/W ratio 0/100:
        R/W      SELECTs         UPDATEs      TRANSACT     (abrtd  %)   CONNECTIONS
               total   1/sec    total  1/sec  total  1/sec              total 1/sec
plain   0.0       0    0.00  2273204 2841.50  454476 568.09    3  0.0 30351  37.94
1 node  0.0       0    0.00  1602163 2002.69  320219 400.27    4  0.0 21314  26.64
2 nodes 0.0       0    0.00  2461794 3077.22  491694 614.61  718  0.1 32868  41.08
3 nodes 0.0       0    0.00  3018818 3773.47  602072 752.58 1885  0.3 40114  50.14
4 nodes 0.0       0    0.00  3402732 4253.30  677429 846.76 3371  0.5 45212  56.51
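
The scaling is easier to read as a ratio against plain MySQL. The following Python sketch takes the transactions/sec values from the tables above and divides each Galera result by the plain MySQL result for the same profile. The names TPS and speedups are just picked for this example.

```python
# transactions/sec copied from the tables above:
# profile -> (plain MySQL, [1 node, 2 nodes, 3 nodes, 4 nodes])
TPS = {
    "100/0": (716.69, [671.35, 1320.50, 1936.08, 2519.84]),
    "90/10": (660.54, [576.46, 1024.81, 1392.87, 1713.96]),
    "70/30": (615.46, [480.35, 811.15, 1060.12, 1265.02]),
    "50/50": (583.75, [441.24, 718.60, 919.54, 1077.75]),
    "30/70": (566.56, [420.93, 674.88, 847.99, 974.59]),
    "0/100": (568.09, [400.27, 614.61, 752.58, 846.76]),
}


def speedups(profile):
    """Return Galera throughput relative to plain MySQL, per cluster size."""
    plain, galera = TPS[profile]
    return [tps / plain for tps in galera]


if __name__ == "__main__":
    print("profile  1 node  2 nodes  3 nodes  4 nodes")
    for profile in TPS:
        row = "  ".join(f"{s:7.2f}" for s in speedups(profile))
        print(f"{profile:>7}  {row}")
```

With these figures, the 4 node cluster runs at roughly 3.5 times plain MySQL on the read-only profile and roughly 1.5 times on the write-only profile, while a single Galera node stays below plain MySQL in every profile.
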

So what's next? Well, I have been setting up more complicated benchmarking scenarios to see how different load profiles affect the replication performance. I have some preliminary results with the OSDB and DBT2 (TPC-C) benchmarks, and I plan to try at least DOTS as well. Running these benchmarks takes a lot of tuning and sometimes also modifying the test code. It is hard to estimate when I will have reliable results available, but I hope I can publish something in two weeks' time.

On the engineering front, there is a lot of progress in the GCS implementation. I also need to run benchmarks to see the difference between the spread and vsbes GCSs.