This article discusses how to upgrade the cluster software. A rolling upgrade is often regarded as the alpha and omega of cluster software upgrades, but we'll look at other possibilities too.
A rolling upgrade is the type of upgrade everybody means when they talk about “High Availability”. It is supposed to provide uninterrupted service during the upgrade. The idea is that you
The main advantage of this method is that if something goes wrong with the upgrade, the other nodes are still working, so you have time to sort it out.
However, this method has some issues that deserve consideration:
In the end, the availability of the cluster during a rolling upgrade may not be as high as expected.
The idea behind a bulk upgrade is to upgrade all nodes of an idle cluster at once in order to avoid time-consuming state transfers. However, it produces a very short but complete service outage.
Operations 2-3-4 can be done on all nodes in parallel, which (when properly scripted) reduces the service outage to virtually the time needed for a single server restart.
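The parallel execution described above could be scripted roughly as follows. This is only a sketch: the helper name run_on_all_nodes, the node names and the upgrade-node.sh script in the usage note are hypothetical, and the runner (for example ssh) is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Sketch: run one command on every node in parallel, then wait for all of them.
# Usage: run_on_all_nodes RUNNER COMMAND NODE...
# RUNNER is invoked as: RUNNER <node> <command>   (ssh fits this shape)
# Returns the number of nodes on which the command failed.
run_on_all_nodes() {
    runner=$1
    cmd=$2
    shift 2
    pids=""
    for node in "$@"; do
        # Start the command on this node in the background.
        $runner "$node" "$cmd" &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        # wait returns the exit status of the given background job.
        wait "$pid" || failed=$((failed + 1))
    done
    return "$failed"
}
```

Assuming a per-node script exists that performs the upgrade operations, a call might look like `run_on_all_nodes "ssh -n" "sudo /usr/local/bin/upgrade-node.sh" node1 node2 node3`. A non-zero return value means at least one node needs attention before the load is resumed.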
The main advantage of this method is that for huge databases it may be much faster and result in better availability than the rolling upgrade.
Always use this method to upgrade a 2-node cluster, as a rolling upgrade with blocking state transfers would result in a much longer service outage.
The main drawback of this method is that it relies on the upgrade and restart being very quick. However, shutting down InnoDB may take up to a few minutes (flushing dirty pages), and if something goes wrong during the upgrade, there is very little time to fix it. Therefore it may be advisable not to upgrade all nodes at once but to try it on a single node first.
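The advice to try a single node first can be expressed as a small wrapper. The sketch below is illustrative only: upgrade_canary_first and the step command it calls are hypothetical names, and the step is assumed to take a node name and return non-zero on failure.

```shell
#!/bin/sh
# Sketch: upgrade one node first, and touch the others only if that succeeds.
# Usage: upgrade_canary_first STEP FIRST_NODE [OTHER_NODE...]
# STEP is invoked as: STEP <node>
upgrade_canary_first() {
    step=$1
    canary=$2
    shift 2
    if ! $step "$canary"; then
        echo "upgrade failed on $canary; remaining nodes left untouched" >&2
        return 1
    fi
    # The first node went through, so run the remaining nodes in parallel.
    for node in "$@"; do
        $step "$node" &
    done
    # A bare wait blocks until all background jobs end; it does not
    # report their individual exit codes.
    wait
}
```

If the first node fails, the rest of the cluster keeps running the old version, which leaves time to investigate. Failures on the remaining nodes are not reported by this sketch and would have to be checked separately.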
If only the Galera provider needs to be upgraded, the bulk upgrade method can be optimized further to take only a few seconds. The following is an example for 64-bit CentOS (or RHEL):
# rpm -e galera
# rpm -i <new galera rpm>
mysql> SET GLOBAL wsrep_provider='none';
mysql> SET GLOBAL wsrep_provider='/usr/lib64/galera/libgalera_smm.so';
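On one node, start a new cluster by setting an empty cluster address:
mysql> SET GLOBAL wsrep_cluster_address='gcomm://';
On each remaining node, connect to that node (node1 in this example):
mysql> SET GLOBAL wsrep_cluster_address='gcomm://node1';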
Normally, reloading the provider and connecting to the cluster should take less than 10 seconds, so there is virtually no service outage.
But the most important feature of this method is that the warmed-up InnoDB buffer pool is fully preserved, so the cluster resumes operating at full speed as soon as the load is resumed.
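Before resuming the load, it may be worth confirming that each node has actually rejoined the cluster. The following sketch polls the wsrep_local_state_comment status variable until it reports Synced. The function name wait_until_synced is hypothetical, and the default query assumes that the mysql client can connect without extra options (for example through credentials in ~/.my.cnf).

```shell
#!/bin/sh
# Sketch: poll the local node until it reports the Synced state.
# Usage: wait_until_synced TIMEOUT_SECONDS [QUERY_COMMAND...]
# QUERY_COMMAND must print a line of the form "<name> <state>".
wait_until_synced() {
    timeout=$1
    shift
    # Default query; -N drops column names, -B gives tab-separated output.
    [ $# -gt 0 ] || set -- mysql -N -B -e "SHOW STATUS LIKE 'wsrep_local_state_comment'"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # The second field of the output is the state name.
        state=$("$@" 2>/dev/null | awk '{print $2}')
        [ "$state" = "Synced" ] && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

For example, `wait_until_synced 10 || echo "node not synced yet"` gives the node roughly the 10 seconds mentioned above before reporting a problem.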