Adding node to PXC cluster, without full SST

Recently we needed to add a Percona MySQL server node to an already existing cluster. This has to be done without having one of the nodes go into DONOR mode and initiating a full SST. IST is faster than SST, it is non-blocking and has no significant performance impact on the donor. It should be the preferred option whenever possible.

At the end of this blog you will have joined a node to the cluster, and it will initiate an Incremental State Transfer instead of the full SST. We will do so by manually creating a backup.

NOTE: Many of the optimizations made in this post assume that innodb_file_per_table is enabled.

The Galera cache

In your data directory you will be able to see files called galera.cache. These files contains a copy of all the recent writesets and are needed to be able to perform an Incremental State Transfer. Depending on the speed of your backup and the size of you data set the gcache may have to be increased to be able to perform an IST.

Adjusting the gcache size can be done by setting the following parameter, this is not a dynamic variables and will need a restart:

wsrep_provider_options="gcache.size=1G"

Creating a backup with Innobackupex

Firstly we’re going to create a backup with innobackupex, this is done in two stages:

  • Creating a backup, this will copy all the InnoDB and MyISAM files.
  • Prepare the backup, this will perform crash recovery on all the table files and replay the logs.

Installing Innobackupex

The script, created by Percona, can be found in the percona-toolkit and can be installed from the default repositories:

apt install percona-toolkit

Creating a backup

We’ll perform the first step creating the backup.

innobackupex --defaults-file=/etc/mysql/debian.cnf --rsync --no-timestamp --parallel=4 --close-files --galera-info /path/to/backup

Let’s go over all the options:

  • defaults-file: This points to your mysql-client configuration, optionally you can configure the username and password inline using --user and --password.
  • rsync: This flag tells the underlying xtrabackup tool to use rsync as copy method instead of spawning separate cp processes for each file. This is especially useful for MySQL database servers with a lot of tables.
  • no-timestamp: Prevent innobackupex from creating a new directory with the timestamp
  • parallel: This will spawn 4 parallel xtrabackup threads, drastically increasing speeds if you have many tables.
  • close-files: This tells xtrabackup to close the files once it’s done copying them, this will also increase performance when many small tables are involved.
  • galera-info: This parameter makes sure the exact information of the gcache state is saved, we need this later when joining the new node. You can compare this to the old binlog locations in traditional MySQL replication.
  • /path/to/backup: The location where the backup will be store

Make sure that the location you are about the create the backup has enough free space to create the backup of the current data set. Alternatively you could mount the new node on a location and send the backup straight to the new node.

NOTE: For prarallel, rsync and close-files to really have a big effect innodb_file_per_table must be enabled enabled. Otherwise the threads are limited by the speed of the single ibdata1 file.

Once the backup is completed, which depending on your data set could take up to a couple of hours, we move on to the next step.

innobackupex: Backup created in directory '/path/to/backup'
111225 00:00:53  innobackupex: completed OK!

For more information on this process, refer to the Percona documentation.

Prepare the backup

After creating a backup, the data is not ready to be restored. There might be uncommitted transactions to be undone or transactions in the logs to be replayed. Doing those pending operations will make the data files consistent and it is the purpose of the prepare stage. Once this has been done, the data is ready to be used.

innobackupex --defaults-file=/etc/mysql/debian.cnf --apply-log /path/to/backup`

After applying the logs, your will be greeted by the following lines:

111225  1:01:57  InnoDB: Shutdown completed; log sequence number 1609228
111225 01:01:57  innobackupex: completed OK!

For more information on this proces, refer to the Percona documenation.

Restoring the backup

First we have to move the backup to the new node to be joined to the cluster. Make sure mysql is stopped on the new node.

On the new node:

systemctl stop mysql

On the node where the backups is made: We can now rsync the backup to the joiner node.

rsync -rav /path/to/backup/ root@mynewnode:/var/lib/mysql/

Once the backup has been rsynced to the new node, we adjust the rights on the new to make sure MySQL can read and edit the files:

chown -r mysql:mysql /var/lib/mysql/

Adjusting the grastate.dat

Now we have to convince the new, joining, node that it has to correct data and that it wont need to request a full SST. This is the part where we make sure a IST is performed. For this we need both the uuid and seqno of the galera status. Luckily we created out backup using the galera-info flag and all this information has been saved for us:

cat /var/lib/mysql/xtrabackup_galera_info

The following information will be returned:

d38587ce-246c-11e5-bcce-6bbd0831cc0f:1352215

Where both are:

  • uuid: d38587ce-246c-11e5-bcce-6bbd0831cc0f
  • seqno: 1352215

We will apply both of these to a new file called grastate.dat:

echo "# GALERA saved state
version: 2.1
uuid:    <YOUR UUID>
seqno:   <YOUR SEQNO>
cert_index:" > /var/lib/mysql/grastate.dat

Start MySQL

Now we can go ahead and start MySQL.

systemctl start mysql

It will see that it has a UUID and Seqno and will try to start the IST instead of the SST:

2019-07-09T20:02:27.180301Z 0 [Note] WSREP: Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.0.0.2' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '12258' --mysqld-version '5.7.25-28-57'   '' )
2019-07-09T20:02:27.664871Z 1 [Note] WSREP: Prepared SST/IST request: xtrabackup-v2|10.0.0.2:4444/xtrabackup_sst//1
2019-07-09T20:02:27.665759Z 0 [Note] WSREP: Member 1.0 (mynewnode) requested state transfer from '*any*'. Selected 0.0 (myoldnode)(SYNCED) as donor.
2019-07-09T20:02:28.375259Z 0 [Note] WSREP: 0.0 (myoldnode): State transfer to 1.0 (mynewnode) complete.
2019-07-09T20:02:28.386829Z WSREP_SST: [INFO] xtrabackup_ist received from donor: Running IST
2019-07-09T20:02:28.394920Z 0 [Note] WSREP: SST complete, seqno: 241434341
2019-07-09T20:02:40.108259Z 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 241442790)