Early Conflict Detection
It is not unusual for a database application to have a "hot spot": a table or a row that is accessed far more often than the rest. The DBT2 benchmark (and therefore TPC-C, I guess) is one such example. It has a 'warehouse' table which is naturally quite short, with one row per warehouse. Even for a single MySQL server it becomes a performance bottleneck. To improve the benchmark score, people run DBT2 with tens, hundreds and even thousands of warehouses. However, on my test laptops (which are quite weak by today's standards) the best score of about 4300 NOTPM is achieved with only 5 warehouses, as with 6 warehouses the database no longer fits in RAM and starts swapping. Swapping is BAD, mkay?
Anyways, every 'New Order' transaction starts with updating one of the rows in that warehouse table. This is wise, as transactions simply queue for the lock on that row, but deadlocks are inevitable. On a single node rollbacks reach 1%. On a 2-node cluster it is a whopping 10%! This is too bad. I wonder if real-life applications can cope with such a rollback rate. This is the weakness of the changeset certification algorithm, where the nodes process transactions independently of each other and exchange changesets only when they are ready to commit. It is easy to see how they conflict on the warehouse table rows. It is equally easy to see that these certification conflicts are detected only at a late stage of the transaction, so rollbacks are especially expensive. And yes, DBT2 does not scale very well on Galera. (But hey, two nodes perform as well as native MySQL and you get high availability with that!)
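To make the late conflict concrete, here is a minimal sketch of certification as I described it above. It is a toy model, not Galera's actual code: the `WriteSet` and `Certifier` names, the sequence-number bookkeeping and the key format are all assumptions made for illustration. Two nodes each run a 'New Order' transaction against warehouse 1 from the same snapshot, and the second changeset fails certification only after all of its work has been done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    """Changeset a node sends out when its transaction is ready to commit."""
    node: str
    snapshot_seqno: int  # last committed seqno the transaction had seen
    keys: frozenset      # rows modified, e.g. ("warehouse", 1)


class Certifier:
    """Toy certification: first committer wins (hypothetical, simplified)."""

    def __init__(self):
        self.seqno = 0
        self.last_writer = {}  # key -> seqno of the last committed write

    def certify(self, ws):
        # Conflict if any row was changed by a transaction that committed
        # after this transaction took its snapshot.
        for key in ws.keys:
            if self.last_writer.get(key, 0) > ws.snapshot_seqno:
                return False  # the whole transaction must roll back
        self.seqno += 1
        for key in ws.keys:
            self.last_writer[key] = self.seqno
        return True


certifier = Certifier()
row = ("warehouse", 1)

# Both nodes start from the same snapshot and update the same warehouse row.
ws_a = WriteSet("node-a", snapshot_seqno=0, keys=frozenset({row}))
ws_b = WriteSet("node-b", snapshot_seqno=0, keys=frozenset({row}))
# A transaction on a different warehouse does not conflict.
ws_c = WriteSet("node-b", snapshot_seqno=0, keys=frozenset({("warehouse", 2)}))

result_a = certifier.certify(ws_a)
result_b = certifier.certify(ws_b)
result_c = certifier.certify(ws_c)
print(result_a, result_b, result_c)
```

In this model node-b learns about the conflict only at commit time, which is exactly why the rollback is so expensive.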
What are generally proposed solutions?
- Distributed Locks. I'm scared just to think about it. Without getting into the gory details, it's going to be a clear winner among performance bottlenecks.
- Locks Reordering. This is equally frightening. I'm not going to think about how to get it working. As far as I can tell, they have been mostly speculating about it. I'll let them have the credit for the first implementation too. But at least it does not seem to be a performance bottleneck.
- Conflict-Aware Load Balancing. This is something worth considering, but to do it efficiently you need to know the application context, hence it must be done in the application. The prospects of creating general-purpose conflict-aware load balancing middleware seem rather dim. However, more and more applications use data partitioning nowadays. This is mostly driven by huge amounts of data, but essentially it is the aforementioned "conflict-aware load balancing". So putting conflict-aware load balancing in the application may not be such a crazy idea (how about conflict-aware JBoss?).
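For the last option, a rough sketch of what conflict-aware routing inside the application could look like for DBT2: send every transaction for a given warehouse to the same node, so that conflicts on that warehouse row turn into ordinary local lock waits instead of cross-node certification failures. The `route` function, the node names and the modulo partitioning are assumptions for illustration only, not something DBT2 or Galera provides.

```python
def route(warehouse_id, nodes):
    """Pick the node that owns this warehouse (hypothetical static partitioning)."""
    return nodes[warehouse_id % len(nodes)]


nodes = ["node-a", "node-b"]

# With 5 warehouses, each warehouse is always served by exactly one node.
assignments = {w: route(w, nodes) for w in range(1, 6)}
for warehouse_id, node in assignments.items():
    print(f"warehouse {warehouse_id} -> {node}")
```

The catch is the one mentioned above: only the application knows that the warehouse id is the right partitioning key.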
At today's dev meeting we seem to have come up with a fairly cheap (if not free) alternative to reduce the performance hit caused by rollbacks, which we dubbed the Early Conflict Detection approach. While it is not intended to reduce the conflict rate specifically, by aborting transactions at an early stage we can make rollbacks much cheaper, raising the overall cluster performance. I hope we'll get to see how it works out in practice in the next few months.
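A back-of-the-envelope model shows why cheaper rollbacks matter even at an unchanged conflict rate. It assumes that every rolled-back transaction is retried, that each attempt fails independently with probability `p`, and that an aborted attempt has completed a fraction `f` of its work before being stopped. The 10% rollback rate is the 2-node figure from above; the early-abort fraction of 0.1 is a made-up number, not a measurement.

```python
def work_per_commit(p, f):
    """Average work spent per committed transaction, in units of one full run.

    p: probability that an attempt is rolled back
    f: fraction of the transaction's work done before the abort
    """
    failed_attempts = p / (1.0 - p)  # expected aborted attempts per commit
    return 1.0 + failed_attempts * f


rollback_rate = 0.10  # 2-node rollback rate quoted above

late = work_per_commit(rollback_rate, f=1.0)   # conflict found at commit time
early = work_per_commit(rollback_rate, f=0.1)  # hypothetical early abort

print(f"late abort:  {late:.3f} units of work per commit")
print(f"early abort: {early:.3f} units of work per commit")
```

Under these assumptions the conflict rate stays the same, but most of the wasted work disappears.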
- alex's blog
