Controlling Auto Increments
MySQL 5.1 introduces new system variables (auto_increment_increment, auto_increment_offset) for managing auto increment 'sequences'. Using these variables, it is possible to set up a multi master replication environment, where auto increment sequences in each master node interleave, and no conflicts should happen in the replication. No matter which master(s) get the INSERTs.
Logically auto increment sequence is a shared resource, which would require distributed locking to deal with. However, auto increment sequence interleaving circumvents the need to lock, it sort of splits the auto increment sequence to several node specific sequences, and it is not a shared resource anymore.
auto_increment_increment and auto_increment_offset have been implemented as session variables, as opposed to being global. We felt at first a bit uncomfortable with this, as there is obvious risk of getting conflicts. Apparently, the idea is to let user make separation with tables which are shared in the cluster and tables which are local to each node. If some dedicated session(s) touch only local tables, they can have session specific increment and offset values set to 1. And for sessions needing access to cluster wide shared tables, proper cluster-aware auto increment settings should be used.
These auto increment control variables are suitable for our Galera replication model as well. We however, wanted to go one step further and prevent any possibility of auto increment conflicts in the cluster. Galera runs on top of group communication system, which has has a real time view of cluster memberships. We can therefore adjust increment and offset variables on the fly, triggered by any changes in cluster configuration. We implemented cluster view handler, which the group communication calls whenever somebody joins or leaves the group. The handler code is passed the number of members in the group and group ID of the processing node. These translate nicely to increment and offset values, cluster size will be the auto_increment_increment and auto_increment_offset can be calculated from the node ID.
To make auto increment controlling effective, we decided to restrict user's access to auto increment variables to read only. We wanted to play safe and make sure that no conflicts can happen. But, in order to be transparent, we also added one more variable: wsrep_auto_increment_control to define if automatic auto increment controlling is enabled or not. With auto increment controlling enabled, cluster will take fully care of setting the increment and offset variables and this will guarantee no conflicts. If auto increment controlling is disabled, system will behave as default MySQL, and user must specify increment and offset globally or by session to suitable values.
Our implementation is based to MySQL 5.1.30, which happens to suffer from a known problem with slave applying: http://bugs.mysql.com/bug.php?id=41986
This bug is so severe, that we backported the fix for this issue. So current wsrep integration code is actually 5.1.30 + 41986 patch + wsrep related changes.
- seppo's blog
- Login or register to post comments
