mysql高可用之MHA

实验环境:

master:server1(172.25.14.1)
Candicate slave : server2 (172.25.14.2)
slave : server3 (172.25.14.3)
manager:server4(172.25.14.4)

一.搭建主从复制

1.master搭建

[root@server1 ~]# mysql -p
Enter password: 
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
[root@server1 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> grant replication slave on *.* to repl@'172.25.14.%'identified by 'Westos+123';
Query OK, 0 rows affected, 1 warning (0.06 sec)

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000003 |      843 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

2.slave搭建(server2与server3相同)

[root@server2 mysql]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> change master to master_host='172.25.14.1',master_user='repl',master_password='Westos+123',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.68 sec)

mysql> start slave;

mysql> show slave status\G;
Query OK, 0 rows affected (0.03 sec)
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.14.1
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000004
          Read_Master_Log_Pos: 154
               Relay_Log_File: server2-relay-bin.000004
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

二.安装MHA软件

3个节点安装

server1,server2和server3都安装
安装需要的包
mha4mysql-node-0.54-0.el6.noarch.rpm

yum install perl-devel perl-CPAN perl-DBD-MySQL -y

rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm

管理节点安装manager

server4

需要的安装包
mha4mysql-manager-0.56-0.el6.noarch.rpm
mha4mysql-node-0.56-0.el6.noarch.rpm
系统自带

yum install perl-devel perl-CPAN perl-DBD-MySQL -y

需要的其他rpm包

perl-Log-Dispatch-2.27-1.el6.noarch.rpm
perl-Mail-Sender-0.8.16-3.el6.noarch.rpm
perl-Mail-Sendmail-0.79-12.el6.noarch.rpm
perl-MIME-Lite-3.027-2.el6.noarch.rpm
perl-MIME-Types-1.28-2.el6.noarch.rpm
perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm
perl-Config-Tiny-2.12-7.1.el6.noarch.rpm
perl-Email-Date-Format-1.002-5.el6.noarch.rpm

免密配置

[root@server4 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
17:dc:04:de:f6:4e:1d:1c:77:f0:3d:47:cf:e9:d4:82 root@server4
The key's randomart image is:
+--[ RSA 2048]----+
|          ... .o+|
|         o + ..+O|
|          + E .*B|
|           o .oo+|
|        S .   o..|
|         .   o   |
|              .  |
|                 |
|                 |
+-----------------+
[root@server4 ~]# cd /root/.ssh/
[root@server4 .ssh]# yum install -y rsync
[root@server4 .ssh]# rsync -p * server1:/root/.ssh/
root@server1's password: 
[root@server4 .ssh]# rsync -p * server2:/root/.ssh/
root@server2's password: 
[root@server4 .ssh]# rsync -p * server3:/root/.ssh/
root@server3's password:

测试是否免密连接

[root@server4 .ssh]# ssh server1
Last login: Thu Aug  9 14:16:05 2018 from 172.25.14.250
[root@server1 ~]# logout
Connection to server1 closed.
[root@server4 .ssh]# ssh server2
Last login: Thu Aug  9 14:16:00 2018 from 172.25.14.250
[root@server2 ~]# logout
Connection to server2 closed.
[root@server4 .ssh]# ssh server3
Last login: Thu Aug  9 14:15:58 2018 from 172.25.14.250
[root@server3 ~]# logout
Connection to server3 closed.

三.mha配置

manage节点配置

[root@server4 masterha]# pwd
/etc/masterha    ##此目录为mkdir目录
[root@server4 masterha]# vim app.cnf

[server default]
manager_log=/etc/masterha/mha.log ##日志
manager_workdir=/etc/masterha/ ##工作目录
master_binlog_dir=/var/lib/mysql
#master_ip_online_change_script=/etc/masterha/master_ip_online_change
password=Westos+123  ##监控密码
ping_interval=1  ##设置监控主库,发送ping包的时间间隔,默认是3秒,尝试三次没有回
应的时候自动进行railover
remote_workdir=/tmp
repl_password=Westos+123 ##主从复制用户密码
repl_user=repl ##主从复制用户名
ssh_user=root ##ssh用户名
user=root

[server1]
hostname=172.25.14.1
port=3306

[server2]
candidate_master=1 ##设置为候选master,如果设置该参数以后,发生主从切换以后将会>将此从库提升为主库,即使这个主库不是集群中事件最新的slave
check_repl_delay=0 ##默认情况下如果一个slave落后master 100M的relay logs的话,MHA将不会选择该slave作为一个新的master,因为对于这个slave的恢复需要花费很长时间,通过设置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延时,这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换的过程中一定是新的master
hostname=172.25.14.2
port=3306

[server3]
hostname=172.25.14.3
port=3306

slave节点配置
server2和server3配置relay log的清除方式和slave配置只读,但不要写入配置文件,因为master机down掉后可能随时会升级成master

[root@server2 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 14
Server version: 5.7.17 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> set global relay_log_purge=0;
Query OK, 0 rows affected (0.00 sec)

mysql> set global read_only=on;
Query OK, 0 rows affected (0.00 sec)

检测ssh配置

[root@server4 masterha]# masterha_check_ssh --conf=/etc/masterha/app1/app.cnf
Thu Aug  9 14:58:34 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
/etc/masterha/app1/app.cnf:No such file or directory
 at /usr/share/perl5/vendor_perl/MHA/SSHCheck.pm line 148
[root@server4 masterha]# masterha_check_ssh --conf=/etc/masterha/app.cnf 
Thu Aug  9 14:58:45 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug  9 14:58:45 2018 - [info] Reading application default configuration from /etc/masterha/app.cnf..
Thu Aug  9 14:58:45 2018 - [info] Reading server configuration from /etc/masterha/app.cnf..
Thu Aug  9 14:58:45 2018 - [info] Starting SSH connection tests..
Thu Aug  9 14:58:46 2018 - [debug] 
Thu Aug  9 14:58:45 2018 - [debug]  Connecting via SSH from root@172.25.14.1(172.25.14.1:22) to root@172.25.14.2(172.25.14.2:22)..
Warning: Permanently added '172.25.14.2' (RSA) to the list of known hosts.
Thu Aug  9 14:58:45 2018 - [debug]   ok.
Thu Aug  9 14:58:45 2018 - [debug]  Connecting via SSH from root@172.25.14.1(172.25.14.1:22) to root@172.25.14.3(172.25.14.3:22)..
Warning: Permanently added '172.25.14.3' (RSA) to the list of known hosts.
Thu Aug  9 14:58:45 2018 - [debug]   ok.
Thu Aug  9 14:58:46 2018 - [debug] 
Thu Aug  9 14:58:46 2018 - [debug]  Connecting via SSH from root@172.25.14.2(172.25.14.2:22) to root@172.25.14.1(172.25.14.1:22)..
Thu Aug  9 14:58:46 2018 - [debug]   ok.
Thu Aug  9 14:58:46 2018 - [debug]  Connecting via SSH from root@172.25.14.2(172.25.14.2:22) to root@172.25.14.3(172.25.14.3:22)..
Thu Aug  9 14:58:46 2018 - [debug]   ok.
Thu Aug  9 14:58:47 2018 - [debug] 
Thu Aug  9 14:58:46 2018 - [debug]  Connecting via SSH from root@172.25.14.3(172.25.14.3:22) to root@172.25.14.1(172.25.14.1:22)..
Thu Aug  9 14:58:46 2018 - [debug]   ok.
Thu Aug  9 14:58:46 2018 - [debug]  Connecting via SSH from root@172.25.14.3(172.25.14.3:22) to root@172.25.14.2(172.25.14.2:22)..
Thu Aug  9 14:58:46 2018 - [debug]   ok.
Thu Aug  9 14:58:47 2018 - [info] All SSH connection tests passed successfully.

检测repl环境
注意报错

[root@server4 masterha]# masterha_check_repl --conf=/etc/masterha/app.cnf 
Thu Aug  9 14:59:06 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug  9 14:59:06 2018 - [info] Reading application default configuration from /etc/masterha/app.cnf..
Thu Aug  9 14:59:06 2018 - [info] Reading server configuration from /etc/masterha/app.cnf..
Thu Aug  9 14:59:06 2018 - [info] MHA::MasterMonitor version 0.56.
Thu Aug  9 14:59:07 2018 - [info] GTID failover mode = 1
Thu Aug  9 14:59:07 2018 - [info] Dead Servers:
Thu Aug  9 14:59:07 2018 - [info] Alive Servers:
Thu Aug  9 14:59:07 2018 - [info]   172.25.14.1(172.25.14.1:3306)
Thu Aug  9 14:59:07 2018 - [info]   172.25.14.2(172.25.14.2:3306)
Thu Aug  9 14:59:07 2018 - [info]   172.25.14.3(172.25.14.3:3306)
Thu Aug  9 14:59:07 2018 - [info] Alive Slaves:
Thu Aug  9 14:59:07 2018 - [info]   172.25.14.2(172.25.14.2:3306)  Version=5.7.17 (oldest major version between slaves) log-bin:disabled
Thu Aug  9 14:59:07 2018 - [info]     GTID ON
Thu Aug  9 14:59:07 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 14:59:07 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Aug  9 14:59:07 2018 - [info]   172.25.14.3(172.25.14.3:3306)  Version=5.7.17 (oldest major version between slaves) log-bin:disabled
Thu Aug  9 14:59:07 2018 - [info]     GTID ON
Thu Aug  9 14:59:07 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 14:59:07 2018 - [info] Current Alive Master: 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 14:59:07 2018 - [info] Checking slave configurations..
Thu Aug  9 14:59:07 2018 - [warning]  log-bin is not set on slave 172.25.14.2(172.25.14.2:3306). This host cannot be a master.
Thu Aug  9 14:59:07 2018 - [warning]  log-bin is not set on slave 172.25.14.3(172.25.14.3:3306). This host cannot be a master.
Thu Aug  9 14:59:07 2018 - [info] Checking replication filtering settings..
Thu Aug  9 14:59:07 2018 - [info]  binlog_do_db= , binlog_ignore_db= 
Thu Aug  9 14:59:07 2018 - [info]  Replication filtering check ok.
Thu Aug  9 14:59:07 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln361] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf
Thu Aug  9 14:59:07 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48
Thu Aug  9 14:59:07 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Thu Aug  9 14:59:07 2018 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

解决方法

[root@server1 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> grant all on *.* to root@'172.25.14.%' identified by 'Westos+123';
Query OK, 0 rows affected, 1 warning (0.14 sec)

mysql> grant all on *.* to repl@'172.25.14.%' identified by 'Westos+123';
Query OK, 0 rows affected, 1 warning (0.08 sec)

再次测试

[root@server4 ~]# masterha_check_repl --conf=/etc/masterha/app.cnf
Thu Aug  9 15:14:27 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug  9 15:14:27 2018 - [info] Reading application default configuration from /etc/masterha/app.cnf..
Thu Aug  9 15:14:27 2018 - [info] Reading server configuration from /etc/masterha/app.cnf..
Thu Aug  9 15:14:27 2018 - [info] MHA::MasterMonitor version 0.56.
Thu Aug  9 15:14:27 2018 - [info] GTID failover mode = 1
Thu Aug  9 15:14:27 2018 - [info] Dead Servers:
Thu Aug  9 15:14:27 2018 - [info] Alive Servers:
Thu Aug  9 15:14:27 2018 - [info]   172.25.14.1(172.25.14.1:3306)
Thu Aug  9 15:14:27 2018 - [info]   172.25.14.2(172.25.14.2:3306)
Thu Aug  9 15:14:27 2018 - [info]   172.25.14.3(172.25.14.3:3306)
Thu Aug  9 15:14:27 2018 - [info] Alive Slaves:
Thu Aug  9 15:14:27 2018 - [info]   172.25.14.2(172.25.14.2:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 15:14:27 2018 - [info]     GTID ON
Thu Aug  9 15:14:27 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 15:14:27 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Aug  9 15:14:27 2018 - [info]   172.25.14.3(172.25.14.3:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 15:14:27 2018 - [info]     GTID ON
Thu Aug  9 15:14:27 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 15:14:27 2018 - [info] Current Alive Master: 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 15:14:27 2018 - [info] Checking slave configurations..
Thu Aug  9 15:14:27 2018 - [info]  read_only=1 is not set on slave 172.25.14.2(172.25.14.2:3306).
Thu Aug  9 15:14:27 2018 - [info]  read_only=1 is not set on slave 172.25.14.3(172.25.14.3:3306).
Thu Aug  9 15:14:27 2018 - [info] Checking replication filtering settings..
Thu Aug  9 15:14:27 2018 - [info]  binlog_do_db= , binlog_ignore_db= 
Thu Aug  9 15:14:27 2018 - [info]  Replication filtering check ok.
Thu Aug  9 15:14:27 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Thu Aug  9 15:14:27 2018 - [info] Checking SSH publickey authentication settings on the current master..
Thu Aug  9 15:14:27 2018 - [info] HealthCheck: SSH to 172.25.14.1 is reachable.
Thu Aug  9 15:14:27 2018 - [info] 
172.25.14.1(172.25.14.1:3306) (current master)
 +--172.25.14.2(172.25.14.2:3306)
 +--172.25.14.3(172.25.14.3:3306)

Thu Aug  9 15:14:27 2018 - [info] Checking replication health on 172.25.14.2..
Thu Aug  9 15:14:27 2018 - [info]  ok.
Thu Aug  9 15:14:27 2018 - [info] Checking replication health on 172.25.14.3..
Thu Aug  9 15:14:27 2018 - [info]  ok.
Thu Aug  9 15:14:27 2018 - [warning] master_ip_failover_script is not defined.
Thu Aug  9 15:14:27 2018 - [warning] shutdown_script is not defined.
Thu Aug  9 15:14:27 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

测试:

manager机开启监控

[root@server4 ~]# nohup masterha_manager --conf=/etc/masterha/app.cnf &
[1] 1383
[root@server4 ~]# nohup: ignoring input and appending output to `nohup.out'

测试
将master机的mysql down掉后

 1598 pts/0    S      0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql
 1843 pts/0    Sl     0:01 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/my
 1905 pts/0    R+     0:00 ps ax
[root@server1 ~]# kill -9 1598
[root@server1 ~]# kill -9 1843

manager机会自动生成日志等文件

[root@server4 masterha]# cat mha.log 
----- Failover Report -----

app: MySQL Master failover 172.25.14.1(172.25.14.1:3306) to 172.25.14.2(172.25.14.2:3306) succeeded

Master 172.25.14.1(172.25.14.1:3306) is down!

Check MHA Manager logs at server4:/etc/masterha/mha.log for details.

Started automated(non-interactive) failover.
Selected 172.25.14.2(172.25.14.2:3306) as a new master.
172.25.14.2(172.25.14.2:3306): OK: Applying all logs succeeded.
172.25.14.3(172.25.14.3:3306): OK: Slave started, replicating from 172.25.14.2(172.25.14.2:3306)
172.25.14.2(172.25.14.2:3306): Resetting slave info succeeded.
Master failover to 172.25.14.2(172.25.14.2:3306) completed successfully.

日志文件中提示master已经由2接管
此时我们在server2和server3分别查看
server2

mysql> show master status;
+------------------+----------+--------------+------------------+----------------------------------------------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                |
+------------------+----------+--------------+------------------+----------------------------------------------------------------------------------+
| mysql-bin.000001 |      444 |              |                  | 0d809020-9b85-11e8-ba58-525400917839:1-3,
ddc02999-9b85-11e8-8220-5254000cc710:1 |
+------------------+----------+--------------+------------------+----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

server3

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.14.2
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 444
               Relay_Log_File: server3-relay-bin.000002
                Relay_Log_Pos: 657
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

将server的mysql开启,并将他手动设置为slave,指向新的master

[root@server1 ~]# /etc/init.d/mysqld start
Starting mysqld:                                           [  OK  ]
[root@server1 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show slave status;
Empty set (0.00 sec)

mysql> change master to master_host='172.25.14.2',master_user='repl',master_password='Westos+123',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.71 sec)

mysql> start slave;
Query OK, 0 rows affected (0.02 sec)

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.14.2
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 444
               Relay_Log_File: server1-relay-bin.000002
                Relay_Log_Pos: 657
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

五.master的在线切换

将manager机上配置文件中的备用master指向注释掉

[server default]
manager_log=/etc/masterha/mha.log
manager_workdir=/etc/masterha/
master_binlog_dir=/var/lib/mysql
#master_ip_online_change_script=/etc/masterha/master_ip_online_change
password=Westos+123
ping_interval=1
remote_workdir=/tmp
repl_password=Westos+123
repl_user=repl
ssh_user=root
user=root

[server1]
hostname=172.25.14.1
port=3306

[server2]
#candidate_master=1
#check_repl_delay=0
hostname=172.25.14.2
port=3306

[server3]
hostname=172.25.14.3
port=3306

手动将master从server2切换到server1

[root@server4 masterha]# masterha_master_switch --conf=/etc/masterha/app.cnf --master_state=alive --new_master_host=172.25.14.1 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000
Thu Aug  9 16:03:38 2018 - [info] MHA::MasterRotate version 0.56.
Thu Aug  9 16:03:38 2018 - [info] Starting online master switch..
Thu Aug  9 16:03:38 2018 - [info] 
Thu Aug  9 16:03:38 2018 - [info] * Phase 1: Configuration Check Phase..
Thu Aug  9 16:03:38 2018 - [info] 
Thu Aug  9 16:03:38 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug  9 16:03:38 2018 - [info] Reading application default configuration from /etc/masterha/app.cnf..
Thu Aug  9 16:03:38 2018 - [info] Reading server configuration from /etc/masterha/app.cnf..
Thu Aug  9 16:03:38 2018 - [info] GTID failover mode = 1
Thu Aug  9 16:03:38 2018 - [info] Current Alive Master: 172.25.14.2(172.25.14.2:3306)
Thu Aug  9 16:03:38 2018 - [info] Alive Slaves:
Thu Aug  9 16:03:38 2018 - [info]   172.25.14.1(172.25.14.1:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 16:03:38 2018 - [info]     GTID ON
Thu Aug  9 16:03:38 2018 - [info]     Replicating from 172.25.14.2(172.25.14.2:3306)
Thu Aug  9 16:03:38 2018 - [info]   172.25.14.3(172.25.14.3:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 16:03:38 2018 - [info]     GTID ON
Thu Aug  9 16:03:38 2018 - [info]     Replicating from 172.25.14.2(172.25.14.2:3306)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 172.25.14.2(172.25.14.2:3306)? (YES/no): yes
Thu Aug  9 16:04:35 2018 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Thu Aug  9 16:04:35 2018 - [info]  ok.
Thu Aug  9 16:04:35 2018 - [info] Checking MHA is not monitoring or doing failover..
Thu Aug  9 16:04:35 2018 - [info] Checking replication health on 172.25.14.1..
Thu Aug  9 16:04:35 2018 - [info]  ok.
Thu Aug  9 16:04:35 2018 - [info] Checking replication health on 172.25.14.3..
Thu Aug  9 16:04:35 2018 - [info]  ok.
Thu Aug  9 16:04:35 2018 - [info] 172.25.14.1 can be new master.
Thu Aug  9 16:04:35 2018 - [info] 
From:
172.25.14.2(172.25.14.2:3306) (current master)
 +--172.25.14.1(172.25.14.1:3306)
 +--172.25.14.3(172.25.14.3:3306)

To:
172.25.14.1(172.25.14.1:3306) (new master)
 +--172.25.14.3(172.25.14.3:3306)
 +--172.25.14.2(172.25.14.2:3306)

Starting master switch from 172.25.14.2(172.25.14.2:3306) to 172.25.14.1(172.25.14.1:3306)? (yes/NO): yes
Thu Aug  9 16:04:41 2018 - [info] Checking whether 172.25.14.1(172.25.14.1:3306) is ok for the new master..
Thu Aug  9 16:04:41 2018 - [info]  ok.
Thu Aug  9 16:04:41 2018 - [info] 172.25.14.2(172.25.14.2:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Thu Aug  9 16:04:41 2018 - [info] 172.25.14.2(172.25.14.2:3306): Resetting slave pointing to the dummy host.
Thu Aug  9 16:04:41 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Aug  9 16:04:41 2018 - [info] 
Thu Aug  9 16:04:41 2018 - [info] * Phase 2: Rejecting updates Phase..
Thu Aug  9 16:04:41 2018 - [info] 
master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes
Thu Aug  9 16:05:15 2018 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Thu Aug  9 16:05:15 2018 - [info] Executing FLUSH TABLES WITH READ LOCK..
Thu Aug  9 16:05:15 2018 - [info]  ok.
Thu Aug  9 16:05:15 2018 - [info] Orig master binlog:pos is mysql-bin.000001:444.
Thu Aug  9 16:05:15 2018 - [info]  Waiting to execute all relay logs on 172.25.14.1(172.25.14.1:3306)..
Thu Aug  9 16:05:15 2018 - [info]  master_pos_wait(mysql-bin.000001:444) completed on 172.25.14.1(172.25.14.1:3306). Executed 0 events.
Thu Aug  9 16:05:15 2018 - [info]   done.
Thu Aug  9 16:05:15 2018 - [info] Getting new master's binlog name and position..
Thu Aug  9 16:05:15 2018 - [info]  mysql-bin.000008:194
Thu Aug  9 16:05:15 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.25.14.1', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Aug  9 16:05:15 2018 - [info] 
Thu Aug  9 16:05:15 2018 - [info] * Switching slaves in parallel..
Thu Aug  9 16:05:15 2018 - [info] 
Thu Aug  9 16:05:15 2018 - [info] -- Slave switch on host 172.25.14.3(172.25.14.3:3306) started, pid: 1571
Thu Aug  9 16:05:15 2018 - [info] 
Thu Aug  9 16:05:16 2018 - [info] Log messages from 172.25.14.3 ...
Thu Aug  9 16:05:16 2018 - [info] 
Thu Aug  9 16:05:15 2018 - [info]  Waiting to execute all relay logs on 172.25.14.3(172.25.14.3:3306)..
Thu Aug  9 16:05:15 2018 - [info]  master_pos_wait(mysql-bin.000001:444) completed on 172.25.14.3(172.25.14.3:3306). Executed 0 events.
Thu Aug  9 16:05:15 2018 - [info]   done.
Thu Aug  9 16:05:15 2018 - [info]  Resetting slave 172.25.14.3(172.25.14.3:3306) and starting replication from the new master 172.25.14.1(172.25.14.1:3306)..
Thu Aug  9 16:05:16 2018 - [info]  Executed CHANGE MASTER.
Thu Aug  9 16:05:16 2018 - [info]  Slave started.
Thu Aug  9 16:05:16 2018 - [info] End of log messages from 172.25.14.3 ...
Thu Aug  9 16:05:16 2018 - [info] 
Thu Aug  9 16:05:16 2018 - [info] -- Slave switch on host 172.25.14.3(172.25.14.3:3306) succeeded.
Thu Aug  9 16:05:16 2018 - [info] Unlocking all tables on the orig master:
Thu Aug  9 16:05:16 2018 - [info] Executing UNLOCK TABLES..
Thu Aug  9 16:05:16 2018 - [info]  ok.
Thu Aug  9 16:05:16 2018 - [info] Starting orig master as a new slave..
Thu Aug  9 16:05:16 2018 - [info]  Resetting slave 172.25.14.2(172.25.14.2:3306) and starting replication from the new master 172.25.14.1(172.25.14.1:3306)..
Thu Aug  9 16:05:17 2018 - [info]  Executed CHANGE MASTER.
Thu Aug  9 16:05:17 2018 - [info]  Slave started.
Thu Aug  9 16:05:17 2018 - [info] All new slave servers switched successfully.
Thu Aug  9 16:05:17 2018 - [info] 
Thu Aug  9 16:05:17 2018 - [info] * Phase 5: New master cleanup phase..
Thu Aug  9 16:05:17 2018 - [info] 
Thu Aug  9 16:05:17 2018 - [info]  172.25.14.1: Resetting slave info succeeded.
Thu Aug  9 16:05:17 2018 - [info] Switching master to 172.25.14.1(172.25.14.1:3306) completed successfully.

分别的server1,server2和server3查看
server1

mysql> show master status;
+------------------+----------+--------------+------------------+----------------------------------------------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                                                                |
+------------------+----------+--------------+------------------+----------------------------------------------------------------------------------+
| mysql-bin.000008 |      194 |              |                  | 0d809020-9b85-11e8-ba58-525400917839:1-3,
ddc02999-9b85-11e8-8220-5254000cc710:1 |
+------------------+----------+--------------+------------------+----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

server2

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.14.1
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000008
          Read_Master_Log_Pos: 194
               Relay_Log_File: server2-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000008
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

server3

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.14.1
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000008
          Read_Master_Log_Pos: 194
               Relay_Log_File: server3-relay-bin.000002
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000008
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

注:手动切换操作并不会被记录在mha的日志文件中

[root@server4 masterha]# cat app.cnf 
----- Failover Report -----

app: MySQL Master failover 172.25.14.1(172.25.14.1:3306) to 172.25.14.2(172.25.14.2:3306) succeeded

Master 172.25.14.1(172.25.14.1:3306) is down!

Check MHA Manager logs at server4:/etc/masterha/mha.log for details.

Started automated(non-interactive) failover.
Selected 172.25.14.2(172.25.14.2:3306) as a new master.
172.25.14.2(172.25.14.2:3306): OK: Applying all logs succeeded.
172.25.14.3(172.25.14.3:3306): OK: Slave started, replicating from 172.25.14.2(172.25.14.2:3306)
172.25.14.2(172.25.14.2:3306): Resetting slave info succeeded.
Master failover to 172.25.14.2(172.25.14.2:3306) completed successfully.

六.master手动切换

配置文件同在线切换
若master down,没有 nohup masterha_manager,则需手动切换

 1944 pts/0    S      0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysq
 2193 pts/0    Sl     0:01 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/m
 2237 ?        S      0:00 pickup -l -t fifo -u
 2253 pts/0    R+     0:00 ps ax
[root@server1 ~]# kill -9 1944
[root@server1 ~]# kill -9 2193

manger节点管理:

[root@server4 masterha]# masterha_master_switch --conf=/etc/masterha/app.cnf --master_state=dead --dead_master_host=172.25.14.1 -dead_master_port=3306 --new_master_host=172.25.14.2 --new_master_port=3306 --ignore_last_failover
--dead_master_ip=<dead_master_ip> is not set. Using 172.25.14.1.
Thu Aug  9 16:31:15 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug  9 16:31:15 2018 - [info] Reading application default configuration from /etc/masterha/app.cnf..
Thu Aug  9 16:31:15 2018 - [info] Reading server configuration from /etc/masterha/app.cnf..
Thu Aug  9 16:31:15 2018 - [info] MHA::MasterFailover version 0.56.
Thu Aug  9 16:31:15 2018 - [info] Starting master failover.
Thu Aug  9 16:31:15 2018 - [info] 
Thu Aug  9 16:31:15 2018 - [info] * Phase 1: Configuration Check Phase..
Thu Aug  9 16:31:15 2018 - [info] 
Thu Aug  9 16:31:15 2018 - [info] GTID failover mode = 1
Thu Aug  9 16:31:15 2018 - [info] Dead Servers:
Thu Aug  9 16:31:15 2018 - [info]   172.25.14.1(172.25.14.1:3306)
Thu Aug  9 16:31:15 2018 - [info] Checking master reachability via MySQL(double check)...
Thu Aug  9 16:31:15 2018 - [info]  ok.
Thu Aug  9 16:31:15 2018 - [info] Alive Servers:
Thu Aug  9 16:31:15 2018 - [info]   172.25.14.2(172.25.14.2:3306)
Thu Aug  9 16:31:15 2018 - [info]   172.25.14.3(172.25.14.3:3306)
Thu Aug  9 16:31:15 2018 - [info] Alive Slaves:
Thu Aug  9 16:31:15 2018 - [info]   172.25.14.2(172.25.14.2:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 16:31:15 2018 - [info]     GTID ON
Thu Aug  9 16:31:15 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 16:31:15 2018 - [info]   172.25.14.3(172.25.14.3:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 16:31:15 2018 - [info]     GTID ON
Thu Aug  9 16:31:15 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Master 172.25.14.1(172.25.14.1:3306) is dead. Proceed? (yes/NO): yes
Thu Aug  9 16:32:08 2018 - [info] Starting GTID based failover.
Thu Aug  9 16:32:08 2018 - [info] 
Thu Aug  9 16:32:08 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Aug  9 16:32:08 2018 - [info] 
Thu Aug  9 16:32:08 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Aug  9 16:32:08 2018 - [info] 
Thu Aug  9 16:32:08 2018 - [info] HealthCheck: SSH to 172.25.14.1 is reachable.
Thu Aug  9 16:32:09 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Aug  9 16:32:09 2018 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Thu Aug  9 16:32:09 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Aug  9 16:32:09 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Aug  9 16:32:09 2018 - [info] 
Thu Aug  9 16:32:09 2018 - [info] * Phase 3: Master Recovery Phase..
Thu Aug  9 16:32:09 2018 - [info] 
Thu Aug  9 16:32:09 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Aug  9 16:32:09 2018 - [info] 
Thu Aug  9 16:32:09 2018 - [info] The latest binary log file/position on all slaves is mysql-bin.000008:194
Thu Aug  9 16:32:09 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Aug  9 16:32:09 2018 - [info]   172.25.14.2(172.25.14.2:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 16:32:09 2018 - [info]     GTID ON
Thu Aug  9 16:32:09 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 16:32:09 2018 - [info]   172.25.14.3(172.25.14.3:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 16:32:09 2018 - [info]     GTID ON
Thu Aug  9 16:32:09 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 16:32:09 2018 - [info] The oldest binary log file/position on all slaves is mysql-bin.000008:194
Thu Aug  9 16:32:09 2018 - [info] Oldest slaves:
Thu Aug  9 16:32:09 2018 - [info]   172.25.14.2(172.25.14.2:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 16:32:09 2018 - [info]     GTID ON
Thu Aug  9 16:32:09 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 16:32:09 2018 - [info]   172.25.14.3(172.25.14.3:3306)  Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Thu Aug  9 16:32:09 2018 - [info]     GTID ON
Thu Aug  9 16:32:09 2018 - [info]     Replicating from 172.25.14.1(172.25.14.1:3306)
Thu Aug  9 16:32:09 2018 - [info] 
Thu Aug  9 16:32:09 2018 - [info] * Phase 3.3: Determining New Master Phase..
Thu Aug  9 16:32:09 2018 - [info] 
Thu Aug  9 16:32:09 2018 - [info] 172.25.14.2 can be new master.
Thu Aug  9 16:32:09 2018 - [info] New master is 172.25.14.2(172.25.14.2:3306)
Thu Aug  9 16:32:09 2018 - [info] Starting master failover..
Thu Aug  9 16:32:09 2018 - [info] 
From:
172.25.14.1(172.25.14.1:3306) (current master)
 +--172.25.14.2(172.25.14.2:3306)
 +--172.25.14.3(172.25.14.3:3306)

To:
172.25.14.2(172.25.14.2:3306) (new master)
 +--172.25.14.3(172.25.14.3:3306)

Starting master switch from 172.25.14.1(172.25.14.1:3306) to 172.25.14.2(172.25.14.2:3306)? (yes/NO): yes
Thu Aug  9 16:32:24 2018 - [info] New master decided manually is 172.25.14.2(172.25.14.2:3306)
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info] * Phase 3.3: New Master Recovery Phase..
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info]  Waiting all logs to be applied.. 
Thu Aug  9 16:32:24 2018 - [info]   done.
Thu Aug  9 16:32:24 2018 - [info] Getting new master's binlog name and position..
Thu Aug  9 16:32:24 2018 - [info]  mysql-bin.000001:444
Thu Aug  9 16:32:24 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.25.14.2', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Aug  9 16:32:24 2018 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 444, 0d809020-9b85-11e8-ba58-525400917839:1-3,
ddc02999-9b85-11e8-8220-5254000cc710:1
Thu Aug  9 16:32:24 2018 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
Thu Aug  9 16:32:24 2018 - [info] ** Finished master recovery successfully.
Thu Aug  9 16:32:24 2018 - [info] * Phase 3: Master Recovery Phase completed.
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info] * Phase 4: Slaves Recovery Phase..
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info] * Phase 4.1: Starting Slaves in parallel..
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info] -- Slave recovery on host 172.25.14.3(172.25.14.3:3306) started, pid: 1589. Check tmp log /etc/masterha//172.25.14.3_3306_20180809163115.log if it takes time..
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info] Log messages from 172.25.14.3 ...
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info]  Resetting slave 172.25.14.3(172.25.14.3:3306) and starting replication from the new master 172.25.14.2(172.25.14.2:3306)..
Thu Aug  9 16:32:24 2018 - [info]  Executed CHANGE MASTER.
Thu Aug  9 16:32:24 2018 - [info]  Slave started.
Thu Aug  9 16:32:24 2018 - [info]  gtid_wait(0d809020-9b85-11e8-ba58-525400917839:1-3,
ddc02999-9b85-11e8-8220-5254000cc710:1) completed on 172.25.14.3(172.25.14.3:3306). Executed 0 events.
Thu Aug  9 16:32:24 2018 - [info] End of log messages from 172.25.14.3.
Thu Aug  9 16:32:24 2018 - [info] -- Slave on host 172.25.14.3(172.25.14.3:3306) started.
Thu Aug  9 16:32:24 2018 - [info] All new slave servers recovered successfully.
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info] * Phase 5: New master cleanup phase..
Thu Aug  9 16:32:24 2018 - [info] 
Thu Aug  9 16:32:24 2018 - [info] Resetting slave info on the new master..
Thu Aug  9 16:32:25 2018 - [info]  172.25.14.2: Resetting slave info succeeded.
Thu Aug  9 16:32:25 2018 - [info] Master failover to 172.25.14.2(172.25.14.2:3306) completed successfully.
Thu Aug  9 16:32:25 2018 - [info] 

----- Failover Report -----

app: MySQL Master failover 172.25.14.1(172.25.14.1:3306) to 172.25.14.2(172.25.14.2:3306) succeeded

Master 172.25.14.1(172.25.14.1:3306) is down!

Check MHA Manager logs at server4 for details.

Started manual(interactive) failover.
Selected 172.25.14.2(172.25.14.2:3306) as a new master.
172.25.14.2(172.25.14.2:3306): OK: Applying all logs succeeded.
172.25.14.3(172.25.14.3:3306): OK: Slave started, replicating from 172.25.14.2(172.25.14.2:3306)
172.25.14.2(172.25.14.2:3306): Resetting slave info succeeded.
Master failover to 172.25.14.2(172.25.14.2:3306) completed successfully.

在server3查看

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.14.2
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 444
               Relay_Log_File: server3-relay-bin.000002
                Relay_Log_Pos: 414
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值