Recover Percona cluster
- Identify the new master:
Identify the unit which has safe_to_bootstrap=1
juju status mysql
Note #1: If all are '0', SSH to the one with the biggest sequence number. If all have the same seqno, SSH to a random one, and then:
sudo vi /var/lib/percona-xtradb-cluster/grastate.dat
Change safe_to_bootstrap from '0' to '1'
To get the seqnos:
juju run --application=mysql "cat /var/lib/percona-xtradb-cluster/grastate.dat"
Note #2: If all are '-1', run mysqld_safe --wsrep-recover
on all three units and compare; a line will report the recovered position as uuid:seqno. Take a note of that seqno and pick the unit with the highest one.
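For reference, grastate.dat looks roughly like this (the uuid and seqno below are illustrative):
# GALERA saved state
version: 2.1
uuid:    1e4bc1f8-2b8c-11ea-95b8-1e3c49d5e9d0
seqno:   1234567
safe_to_bootstrap: 1
The unit with the highest seqno holds the most recent data; safe_to_bootstrap: 1 marks the unit Galera considers safe to bootstrap from.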
For the rest of the example, we assume the master is mysql/0
- Before bootstrapping the master, it's a good idea to move the VIP there and to prevent Juju from trying to do any operations on the slaves. The following steps stop MySQL and move the VIP away from the slaves:
juju run-action hacluster-mysql/1 --wait pause
juju run-action mysql/1 --wait pause
juju run-action hacluster-mysql/2 --wait pause
juju run-action mysql/2 --wait pause
Note: the unit numbers might differ in your deployment
Confirm mysql is stopped on those units and kill any mysqld processes if necessary. Also confirm that the VIP is now placed on the master unit.
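One quick way to check (the <VIP> placeholder and these exact commands are suggestions, not charm output):
juju ssh mysql/1 "sudo systemctl is-active mysql; pgrep -a mysqld"
juju ssh mysql/2 "sudo systemctl is-active mysql; pgrep -a mysqld"
juju ssh mysql/1 "sudo pkill -9 mysqld"  # only if pgrep still lists processes
juju ssh mysql/0 "ip addr | grep -F <VIP>"  # the VIP should be on the master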
- Bootstrap master:
Confirm all mysqld processes are down on the master and kill them if necessary, then:
sudo systemctl stop mysql.service
sudo systemctl start mysql@bootstrap.service # Bionic
sudo /etc/init.d/mysql bootstrap-pxc && sudo systemctl start mysql # Xenial
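On Bionic you can sanity-check the bootstrap unit before moving on (on Xenial, check mysql.service instead):
juju ssh mysql/0 "sudo systemctl status mysql@bootstrap.service --no-pager"
The wsrep check in the next step remains the authoritative one.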
- Run SHOW GLOBAL STATUS and confirm that the master is the Primary unit with a cluster size of 1 (wsrep_cluster_size and wsrep_cluster_status):
MYSQLPWD=$(juju run --unit mysql/0 leader-get mysql.passwd)
juju run --unit mysql/0 "mysql -uroot -p${MYSQLPWD} -e \"SHOW global status;\"" | grep -Ei "wsrep_cluster"
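The output should look roughly like this (the conf_id and uuid values are illustrative):
wsrep_cluster_conf_id     1
wsrep_cluster_size        1
wsrep_cluster_state_uuid  1e4bc1f8-2b8c-11ea-95b8-1e3c49d5e9d0
wsrep_cluster_status      Primary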
- Refresh juju status mysql (via the update-status hooks) to confirm the master is now active:
juju run --application mysql "hooks/update-status" && juju run --application hacluster-mysql "hooks/update-status" && juju status mysql
Your cluster should be operational by now; the slaves still have to be added back
- Start the first slave:
juju run-action mysql/1 --wait resume
juju run-action hacluster-mysql/1 --wait resume
- Run SHOW GLOBAL STATUS again and confirm that the master is still the Primary unit and that the cluster size has increased by 1 (wsrep_cluster_size and wsrep_cluster_status):
MYSQLPWD=$(juju run --unit mysql/0 leader-get mysql.passwd)
juju run --unit mysql/0 "mysql -uroot -p${MYSQLPWD} -e \"SHOW global status;\"" | grep -Ei "wsrep_cluster"
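Expected output now (only size and status matter here):
wsrep_cluster_size        2
wsrep_cluster_status      Primary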
- Confirm the first slave is now active in Juju:
juju run --application mysql "hooks/update-status" && juju run --application hacluster-mysql "hooks/update-status" && juju status mysql
- If the state is still not active: on the slaves, the systemctl status mysql output sometimes shows 'failed' or 'timed out' even when everything is fine, because the systemd unit times out before MySQL finishes resyncing. Restart the service if that's the case:
juju run --unit=mysql/1 "sudo systemctl restart mysql.service"
- Confirm it is now active:
juju run --application mysql "hooks/update-status" && juju run --application hacluster-mysql "hooks/update-status" && juju status mysql
- Repeat the slave-recovery steps above (resume, wsrep check, update-status, restart if needed) for mysql/2
- Final check:
juju status mysql
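A healthy end state looks roughly like this (addresses, machine numbers and the exact message depend on your deployment and charm revision):
Unit      Workload  Agent  Machine  Public address  Message
mysql/0*  active    idle   0        10.0.0.10       Unit is ready
mysql/1   active    idle   1        10.0.0.11       Unit is ready
mysql/2   active    idle   2        10.0.0.12       Unit is ready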
Note #1: If any of the units shows 'seeded file missing' at the end of the procedure, you can fix it like this:
juju run --unit=mysql/X 'echo "done" | sudo tee -a /var/lib/percona-xtradb-cluster/seeded && sudo chown mysql:mysql /var/lib/percona-xtradb-cluster/seeded'
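To verify the fix took (same unit placeholder X):
juju run --unit=mysql/X 'ls -l /var/lib/percona-xtradb-cluster/seeded && cat /var/lib/percona-xtradb-cluster/seeded'
The file should be owned by mysql:mysql and contain 'done'.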
Note #2: If one of the slaves doesn't start at all, logging something along the lines of "MySQL PID not found, pid_file detected/guessed: /var/run/mysqld/mysqld.pid", try this:
juju ssh to the unit
sudo systemctl stop mysql
Stop/kill pending mysqld processes
sudo rm -rf /var/run/mysqld/mysqld.*  # stale pid/socket files
sudo systemctl start mysql
juju run --application mysql "hooks/update-status" && juju run --application hacluster-mysql "hooks/update-status" && juju status mysql
Note #3: As a last resort, if one of the slaves doesn't start at all, you might have to rebuild its DB from scratch, using the following procedure:
juju run-action hacluster-mysql/X --wait pause
juju run-action mysql/X --wait pause
juju ssh mysql/X
Stop/kill pending mysqld processes
sudo mv /var/lib/percona-xtradb-cluster /var/lib/percona-xtradb-cluster.bak
sudo mkdir /var/lib/percona-xtradb-cluster
sudo chown mysql:mysql /var/lib/percona-xtradb-cluster
sudo chmod 700 /var/lib/percona-xtradb-cluster
juju run-action mysql/X --wait resume
sudo du /var/lib/percona-xtradb-cluster # to see replication progress (a progress-watching sketch follows this procedure)
Once it's done, check that processes are running (sudo ps -ef | grep mysqld) and that the service shows as up (sudo systemctl status mysql).
If the service shows as timed out (sometimes systemd times out before the sync finishes), restart it: sudo systemctl restart mysql
juju run-action hacluster-mysql/X --wait resume
juju run --application mysql "hooks/update-status" && juju run --application hacluster-mysql "hooks/update-status" && juju status mysql
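To watch the resync progress mentioned in the du step above, one option is to compare the rebuilt unit's growing datadir against the donor's (X is the rebuilt unit; the watch interval is a suggestion):
juju run --unit=mysql/0 "sudo du -sh /var/lib/percona-xtradb-cluster"
juju ssh mysql/X
sudo watch -n 30 du -sh /var/lib/percona-xtradb-cluster  # run on the unit; sizes should converge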