MySQL MHA Configuration and Testing

Environment:

192.168.10.10 Host1-master    

192.168.10.11 Host2-Slave1

192.168.10.12 Host3-Slave2

192.168.10.13   MHA Manager 

Linux : CentOS release 6.4 (Final)

MySQL version: 5.7.19

 

Step 1: Set up master-slave replication. For reference, see my earlier post:

https://blog.csdn.net/jswangchang/article/details/81116651

Step 2: Configure passwordless SSH login

Run the following command on every host: ssh-keygen -t rsa

[root@localhost .ssh]#  ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
47:95:e5:ba:22:45:94:57:14:fa:ea:68:e4:e0:97:0f root@localhost.localdomain
The key's randomart image is:
+--[ RSA 2048]----+
|          ...==. |
|         ...oo   |
|          o.. .  |
|         o   o   |
|        S o . .  |
|        .o.  o   |
|       ..+E.o    |
|        ..==     |
|         o..o    |
+-----------------+
[root@localhost .ssh]#

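Then run the following on 192.168.10.10 so that every host can log in to the others without a password. Each host's public key is collected into authorized_keys, and that file is then copied back to all hosts (192.168.10.10's own id_rsa.pub must be in the file as well):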

[root@localhost .ssh]# ssh 192.168.10.11 cat ~/.ssh/id_rsa.pub  >> authorized_keys
root@192.168.10.11's password:
[root@localhost .ssh]# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
[root@localhost .ssh]# rm -rf known_hosts
[root@localhost .ssh]#
[root@localhost .ssh]# ls
authorized_keys  id_rsa  id_rsa.pub
[root@localhost .ssh]# ssh 192.168.10.12 cat ~/.ssh/id_rsa.pub  >> authorized_keys
The authenticity of host '192.168.10.12 (192.168.10.12)' can't be established.
RSA key fingerprint is ed:79:e0:53:0a:77:ca:eb:09:ae:3d:e5:19:dd:3d:ba.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.10.12' (RSA) to the list of known hosts.
root@192.168.10.12's password:
[root@localhost .ssh]# ssh 192.168.10.13 cat ~/.ssh/id_rsa.pub  >> authorized_keys
The authenticity of host '192.168.10.13 (192.168.10.13)' can't be established.
RSA key fingerprint is ed:79:e0:53:0a:77:ca:eb:09:ae:3d:e5:19:dd:3d:ba.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.10.13' (RSA) to the list of known hosts.
root@192.168.10.13's password:
[root@localhost .ssh]#
[root@localhost .ssh]# scp authorized_keys 192.168.10.11:~/.ssh/authorized_keys
The authenticity of host '192.168.10.11 (192.168.10.11)' can't be established.
RSA key fingerprint is ed:79:e0:53:0a:77:ca:eb:09:ae:3d:e5:19:dd:3d:ba.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.10.11' (RSA) to the list of known hosts.
root@192.168.10.11's password:
authorized_keys                                                                                                                                                                100% 1632     1.6KB/s   00:00
[root@localhost .ssh]# scp authorized_keys 192.168.10.12:~/.ssh/authorized_keys
root@192.168.10.12's password:
authorized_keys                                                                                                                                                                100% 1632     1.6KB/s   00:00
[root@localhost .ssh]# scp authorized_keys 192.168.10.13:~/.ssh/authorized_keys
root@192.168.10.13's password:
authorized_keys                                                                                                                                                                100% 1632     1.6KB/s   00:00
[root@localhost .ssh]#
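
The same key exchange can also be done with ssh-copy-id, which appends a public key to the remote authorized_keys file. The following is only a sketch under assumptions: ssh-copy-id is installed, the root account is used as in this post, and push_key and DRY_RUN are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hosts from the environment list at the top of this post.
HOSTS="192.168.10.10 192.168.10.11 192.168.10.12 192.168.10.13"

# push_key (hypothetical helper): copy the local public key to every host.
# With DRY_RUN=1 (the default) it only prints the commands it would run.
push_key() {
    for h in $HOSTS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@$h"
        else
            ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$h"
        fi
    done
}
```

Running DRY_RUN=0 push_key on each of the four hosts after ssh-keygen should give the same result as the manual steps above; each run asks for the root password of every target once.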

Step 3: Install MHA

Download from: https://downloads.mariadb.com/MHA/

I used: mha4mysql-node-0.54-0.el6.noarch.rpm

                   mha4mysql-manager-0.55-0.el6.noarch.rpm

On all nodes, install the dependency package and MHA Node:

# yum install perl-DBD-MySQL

# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm

[root@localhost tmp]# yum install perl-DBD-MySQL
Loaded plugins: fastestmirror, refresh-packagekit
Loading mirror speeds from cached hostfile
 * base: repo.virtualhosting.hk
 * epel: mirrors.ustc.edu.cn
 * extras: repo.virtualhosting.hk
 * updates: repo.virtualhosting.hk
Setting up Install Process
Package perl-DBD-MySQL-4.013-3.el6.x86_64 already installed and latest version
Nothing to do
[root@localhost tmp]# cd /tmp
[root@localhost tmp]# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm
Preparing...                ########################################### [100%]
	package mha4mysql-node-0.54-0.el6.noarch is already installed
[root@localhost tmp]#

Install MHA Manager. Run this only on 192.168.10.13:

yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager

 

[root@localhost tmp]# rpm -ivh  mha4mysql-manager-0.55-0.el6.noarch.rpm
Preparing...                ########################################### [100%]
	package mha4mysql-manager-0.55-0.el6.noarch is already installed
[root@localhost tmp]#

Create the MySQL account that MHA uses for failover and grant it ALL PRIVILEGES. Here the account is mha@'192.168.10.%':

mysql> select user,host from mysql.user where user='mha';
+------+--------------+
| user | host         |
+------+--------------+
| mha  | 192.168.10.% |
+------+--------------+
1 row in set (0.00 sec)

mysql> show grants for mha@'192.168.10.%';
+-----------------------------------------------------+
| Grants for mha@192.168.10.%                         |
+-----------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'mha'@'192.168.10.%' |
+-----------------------------------------------------+
1 row in set (0.00 sec)

mysql>
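
The statements that produce an account like the one shown above are not included in this post. A minimal sketch, assuming MySQL 5.7 syntax and the password mha from the configuration file used later; mha_user_sql is a hypothetical helper that only prints the SQL and executes nothing:

```shell
#!/bin/sh
# mha_user_sql (hypothetical helper): print the statements that create the
# MHA account. The password 'mha' matches password=mha in app1.cnf.
mha_user_sql() {
    cat <<'EOF'
CREATE USER 'mha'@'192.168.10.%' IDENTIFIED BY 'mha';
GRANT ALL PRIVILEGES ON *.* TO 'mha'@'192.168.10.%';
EOF
}
```

One way to use it is mha_user_sql | mysql -uroot -p -h127.0.0.1 -P3358 on the master. With replication working, the account normally reaches the slaves as well, but it is worth confirming on each slave with the SELECT shown above.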

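Create the working directory on the manager:

mkdir -p /etc/masterha/app1

Then create /etc/masterha/app1/app1.cnf with the following contents: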

[server default]

manager_workdir=/etc/masterha/app1

manager_log=/etc/masterha/app1/manager.log

user=mha
password=mha

ssh_user=root

repl_user=repl
repl_password=repl

[server1]

hostname=192.168.10.10

port=3358

master_binlog_dir=/export/data/mysql/

candidate_master=1

check_repl_delay=0


[server2]

hostname=192.168.10.11

port=3358

master_binlog_dir=/export/data/mysql/

candidate_master=1

check_repl_delay=0

[server3]

hostname=192.168.10.12

port=3358

master_binlog_dir=/export/data/mysql

ignore_fail=1

no_master=1
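
Several later steps have to be repeated on every MySQL host, so it can help to read the host list straight from this file. A small sketch, assuming the hostname= lines are written as above; mha_hosts is a hypothetical helper, not part of MHA:

```shell
#!/bin/sh
# mha_hosts (hypothetical helper): print every hostname= value found in an
# MHA configuration file, one per line.
mha_hosts() {
    awk -F= '/^[ \t]*hostname[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}
```

For example, for h in $(mha_hosts /etc/masterha/app1/app1.cnf); do ssh root@$h hostname; done runs a command on each configured server.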

Check the passwordless SSH configuration:

masterha_check_ssh --conf=/etc/masterha/app1/app1.cnf

[root@localhost tmp]# masterha_check_ssh --conf=/etc/masterha/app1/app1.cnf
Sun Jan 28 11:19:49 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Jan 28 11:19:49 2018 - [info] Reading application default configurations from /etc/masterha/app1/app1.cnf..  # reads the application default configuration
Sun Jan 28 11:19:49 2018 - [info] Reading server configurations from /etc/masterha/app1/app1.cnf..  # reads the server configuration
Sun Jan 28 11:19:49 2018 - [info] Starting SSH connection tests..
Sun Jan 28 11:19:50 2018 - [debug]    # SSH connection tests
Sun Jan 28 11:19:49 2018 - [debug]  Connecting via SSH from root@192.168.10.10(192.168.10.10:22) to root@192.168.10.11(192.168.10.11:22)..
Sun Jan 28 11:19:49 2018 - [debug]   ok.
Sun Jan 28 11:19:49 2018 - [debug]  Connecting via SSH from root@192.168.10.10(192.168.10.10:22) to root@192.168.10.12(192.168.10.12:22)..
Sun Jan 28 11:19:50 2018 - [debug]   ok.
Sun Jan 28 11:19:50 2018 - [debug]   # connections from server1 to server2 and server3 are OK
Sun Jan 28 11:19:50 2018 - [debug]  Connecting via SSH from root@192.168.10.11(192.168.10.11:22) to root@192.168.10.10(192.168.10.10:22)..
Sun Jan 28 11:19:50 2018 - [debug]   ok.
Sun Jan 28 11:19:50 2018 - [debug]  Connecting via SSH from root@192.168.10.11(192.168.10.11:22) to root@192.168.10.12(192.168.10.12:22)..
Sun Jan 28 11:19:50 2018 - [debug]   ok.
Sun Jan 28 11:19:51 2018 - [debug]   # connections from server2 to server1 and server3 are OK
Sun Jan 28 11:19:50 2018 - [debug]  Connecting via SSH from root@192.168.10.12(192.168.10.12:22) to root@192.168.10.10(192.168.10.10:22)..
Sun Jan 28 11:19:51 2018 - [debug]   ok.
Sun Jan 28 11:19:51 2018 - [debug]  Connecting via SSH from root@192.168.10.12(192.168.10.12:22) to root@192.168.10.11(192.168.10.11:22)..
Sun Jan 28 11:19:51 2018 - [debug]   ok. # connections from server3 to server1 and server2 are OK
Sun Jan 28 11:19:51 2018 - [info] All SSH connection tests passed successfully.
[root@localhost tmp]#    # all SSH connection tests passed

Verify the replication configuration:

[root@localhost tmp]# masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
Sun Jan 28 11:20:53 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Jan 28 11:20:53 2018 - [info] Reading application default configurations from /etc/masterha/app1/app1.cnf..
Sun Jan 28 11:20:53 2018 - [info] Reading server configurations from /etc/masterha/app1/app1.cnf..   # reads app1.cnf
Sun Jan 28 11:20:53 2018 - [info] MHA::MasterMonitor version 0.55.
Sun Jan 28 11:20:54 2018 - [info] Dead Servers: # no dead servers
Sun Jan 28 11:20:54 2018 - [info] Alive Servers: # servers from the configuration file that are alive
Sun Jan 28 11:20:54 2018 - [info]   192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:20:54 2018 - [info]   192.168.10.11(192.168.10.11:3358)
Sun Jan 28 11:20:54 2018 - [info]   192.168.10.12(192.168.10.12:3358)
Sun Jan 28 11:20:54 2018 - [info] Alive Slaves: # slaves currently alive
Sun Jan 28 11:20:54 2018 - [info]   192.168.10.11(192.168.10.11:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:20:54 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:20:54 2018 - [info]     Primary candidate for the new Master (candidate_master is set) # a candidate master is configured
Sun Jan 28 11:20:54 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled 
Sun Jan 28 11:20:54 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:20:54 2018 - [info]     Not candidate for the new Master (no_master is set)
Sun Jan 28 11:20:54 2018 - [info] Current Alive Master: 192.168.10.10(192.168.10.10:3358) # the master currently alive
Sun Jan 28 11:20:54 2018 - [info] Checking slave configurations..
Sun Jan 28 11:20:54 2018 - [info]  read_only=1 is not set on slave 192.168.10.11(192.168.10.11:3358). # slave configuration check: read_only is not set
Sun Jan 28 11:20:54 2018 - [warning]  relay_log_purge=0 is not set on slave 192.168.10.11(192.168.10.11:3358).    # automatic relay log purging should be disabled
Sun Jan 28 11:20:54 2018 - [info]  read_only=1 is not set on slave 192.168.10.12(192.168.10.12:3358).
Sun Jan 28 11:20:54 2018 - [warning]  relay_log_purge=0 is not set on slave 192.168.10.12(192.168.10.12:3358).   
Sun Jan 28 11:20:54 2018 - [info] Checking replication filtering settings..
Sun Jan 28 11:20:54 2018 - [info]  binlog_do_db= , binlog_ignore_db= # no replication filters are set
Sun Jan 28 11:20:54 2018 - [info]  Replication filtering check ok.
Sun Jan 28 11:20:54 2018 - [info] Starting SSH connection tests.. # SSH connection check
Sun Jan 28 11:20:55 2018 - [info] All SSH connection tests passed successfully.
Sun Jan 28 11:20:55 2018 - [info] Checking MHA Node version..
Sun Jan 28 11:20:56 2018 - [info]  Version check ok.
Sun Jan 28 11:20:56 2018 - [info] Checking SSH publickey authentication settings on the current master..
Sun Jan 28 11:20:56 2018 - [info] HealthCheck: SSH to 192.168.10.10 is reachable.
Sun Jan 28 11:20:56 2018 - [info] Master MHA Node version is 0.54.
Sun Jan 28 11:20:56 2018 - [info] Checking recovery script configurations on the current master.. # checks the recovery script configuration
Sun Jan 28 11:20:56 2018 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/export/data/mysql/ --output_file=/var/tmp/save_binary_logs_test --manager_version=0.55 --start_file=mysql-bin.000039
Sun Jan 28 11:20:56 2018 - [info]   Connecting to root@192.168.10.10(192.168.10.10)..
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /export/data/mysql/, up to mysql-bin.000039 # binary logs found
Sun Jan 28 11:20:56 2018 - [info] Master setting check done.
Sun Jan 28 11:20:56 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun Jan 28 11:20:56 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.10.11 --slave_ip=192.168.10.11 --slave_port=3358 --workdir=/var/tmp --target_version=5.7.19-log --manager_version=0.55 --relay_log_info=/export/data/mysql/relay-log.info  --relay_dir=/export/data/mysql/  --slave_pass=xxx
Sun Jan 28 11:20:56 2018 - [info]   Connecting to root@192.168.10.11(192.168.10.11:22)..
Can't exec "mysqlbinlog": No such file or directory at /usr/local/share/perl5/MHA/BinlogManager.pm line 99. # mysqlbinlog cannot be executed: file not found
mysqlbinlog version not found!
 at /usr/local/bin/apply_diff_relay_logs line 482
Sun Jan 28 11:20:56 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln195] Slaves settings check failed!
Sun Jan 28 11:20:56 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln375] Slave configuration failed.
Sun Jan 28 11:20:56 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln386] Error happend on checking configurations.  at /usr/local/bin/masterha_check_repl line 48
Sun Jan 28 11:20:56 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln482] Error happened on monitoring servers.
Sun Jan 28 11:20:56 2018 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

Error: Can't exec "mysqlbinlog": No such file or directory....

Run this on every host: ln -s /export/servers/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
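
MHA runs mysqlbinlog by name over SSH, and a non-interactive SSH session usually has a shorter PATH than a login shell, which is why the symlink into /usr/local/bin helps. A quick way to confirm a command resolves is sketched below; need_cmd is a hypothetical helper, not an MHA tool:

```shell
#!/bin/sh
# need_cmd (hypothetical helper): succeed and print the location if the named
# command is on PATH; otherwise print a message to stderr and return 1.
need_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: NOT found on PATH" >&2
        return 1
    fi
}

# To check a remote host the way MHA reaches it, the same test can be run
# through SSH from the manager, for example:
#   ssh root@192.168.10.11 'command -v mysqlbinlog'
```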

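Running the check again gives another error: mysqlbinlog: [ERROR] unknown variable 'default-character-set=utf8'. Edit /etc/my.cnf, comment out default-character-set=utf8, and restart the instance. This change is needed on every host.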

(Read the error output carefully and resolve each error in turn according to its message.)

Run it again:

[root@localhost tmp]# masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
Sun Jan 28 11:34:42 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Jan 28 11:34:42 2018 - [info] Reading application default configurations from /etc/masterha/app1/app1.cnf..
Sun Jan 28 11:34:42 2018 - [info] Reading server configurations from /etc/masterha/app1/app1.cnf..
Sun Jan 28 11:34:42 2018 - [info] MHA::MasterMonitor version 0.55.
Sun Jan 28 11:34:42 2018 - [info] Dead Servers:
Sun Jan 28 11:34:42 2018 - [info] Alive Servers:
Sun Jan 28 11:34:42 2018 - [info]   192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:34:42 2018 - [info]   192.168.10.11(192.168.10.11:3358)
Sun Jan 28 11:34:42 2018 - [info]   192.168.10.12(192.168.10.12:3358)
Sun Jan 28 11:34:42 2018 - [info] Alive Slaves:
Sun Jan 28 11:34:42 2018 - [info]   192.168.10.11(192.168.10.11:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:34:42 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:34:42 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jan 28 11:34:42 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:34:42 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:34:42 2018 - [info]     Not candidate for the new Master (no_master is set)
Sun Jan 28 11:34:42 2018 - [info] Current Alive Master: 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:34:42 2018 - [info] Checking slave configurations..
Sun Jan 28 11:34:42 2018 - [info]  read_only=1 is not set on slave 192.168.10.11(192.168.10.11:3358).
Sun Jan 28 11:34:42 2018 - [warning]  relay_log_purge=0 is not set on slave 192.168.10.11(192.168.10.11:3358).
Sun Jan 28 11:34:42 2018 - [info]  read_only=1 is not set on slave 192.168.10.12(192.168.10.12:3358).
Sun Jan 28 11:34:42 2018 - [warning]  relay_log_purge=0 is not set on slave 192.168.10.12(192.168.10.12:3358).
Sun Jan 28 11:34:42 2018 - [info] Checking replication filtering settings..
Sun Jan 28 11:34:42 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Sun Jan 28 11:34:42 2018 - [info]  Replication filtering check ok.
Sun Jan 28 11:34:42 2018 - [info] Starting SSH connection tests..
Sun Jan 28 11:34:43 2018 - [info] All SSH connection tests passed successfully.
Sun Jan 28 11:34:43 2018 - [info] Checking MHA Node version..
Sun Jan 28 11:34:44 2018 - [info]  Version check ok.
Sun Jan 28 11:34:44 2018 - [info] Checking SSH publickey authentication settings on the current master..
Sun Jan 28 11:34:44 2018 - [info] HealthCheck: SSH to 192.168.10.10 is reachable.
Sun Jan 28 11:34:44 2018 - [info] Master MHA Node version is 0.54.
Sun Jan 28 11:34:44 2018 - [info] Checking recovery script configurations on the current master..
Sun Jan 28 11:34:44 2018 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/export/data/mysql/ --output_file=/var/tmp/save_binary_logs_test --manager_version=0.55 --start_file=mysql-bin.000039
Sun Jan 28 11:34:44 2018 - [info]   Connecting to root@192.168.10.10(192.168.10.10)..
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /export/data/mysql/, up to mysql-bin.000039
Sun Jan 28 11:34:44 2018 - [info] Master setting check done.
Sun Jan 28 11:34:44 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun Jan 28 11:34:44 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.10.11 --slave_ip=192.168.10.11 --slave_port=3358 --workdir=/var/tmp --target_version=5.7.19-log --manager_version=0.55 --relay_log_info=/export/data/mysql/relay-log.info  --relay_dir=/export/data/mysql/  --slave_pass=xxx
Sun Jan 28 11:34:44 2018 - [info]   Connecting to root@192.168.10.11(192.168.10.11:22)..
  Checking slave recovery environment settings..
    Opening /export/data/mysql/relay-log.info ... ok.
    Relay log found at /export/data/mysql, up to localhost-relay-bin.000004
    Temporary relay log file is /export/data/mysql/localhost-relay-bin.000004
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun Jan 28 11:34:45 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.10.12 --slave_ip=192.168.10.12 --slave_port=3358 --workdir=/var/tmp --target_version=5.7.19-log --manager_version=0.55 --relay_log_info=/export/data/mysql/relay-log.info  --relay_dir=/export/data/mysql/  --slave_pass=xxx
Sun Jan 28 11:34:45 2018 - [info]   Connecting to root@192.168.10.12(192.168.10.12:22)..
  Checking slave recovery environment settings..
    Opening /export/data/mysql/relay-log.info ... ok.
    Relay log found at /export/data/mysql, up to localhost-relay-bin.000002
    Temporary relay log file is /export/data/mysql/localhost-relay-bin.000002
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done. # tests mysqlbinlog output
    Cleaning up test file(s).. done. # removes the test files
Sun Jan 28 11:34:45 2018 - [info] Slaves settings check done.
Sun Jan 28 11:34:45 2018 - [info] # slave settings check finished
192.168.10.10 (current master) # current master/slave topology
 +--192.168.10.11
 +--192.168.10.12

Sun Jan 28 11:34:45 2018 - [info] Checking replication health on 192.168.10.11..
Sun Jan 28 11:34:45 2018 - [info]  ok. # replication health check OK
Sun Jan 28 11:34:45 2018 - [info] Checking replication health on 192.168.10.12..
Sun Jan 28 11:34:45 2018 - [info]  ok.
Sun Jan 28 11:34:45 2018 - [warning] master_ip_failover_script is not defined. # no master_ip_failover_script is defined
Sun Jan 28 11:34:45 2018 - [warning] shutdown_script is not defined. # no shutdown_script is defined
Sun Jan 28 11:34:45 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK. # replication check passed

The replication check passed.
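
The check still reports that read_only=1 and relay_log_purge=0 are not set on the slaves. Both are dynamic global variables in MySQL 5.7, so they can be changed without a restart. A sketch under assumptions: slave_settings_sql is a hypothetical helper that only prints the statements, and whether these settings suit a given setup is a judgment call.

```shell
#!/bin/sh
# slave_settings_sql (hypothetical helper): print the statements that address
# the two messages from masterha_check_repl. Nothing is executed here.
slave_settings_sql() {
    cat <<'EOF'
SET GLOBAL read_only = 1;
SET GLOBAL relay_log_purge = 0;
EOF
}
```

The output can be piped into the mysql client on each slave. SET GLOBAL does not survive a restart, so the same values would also need to go into /etc/my.cnf. With relay_log_purge=0, relay logs are no longer removed automatically and have to be cleaned up some other way, for example with the purge_relay_logs script shipped with MHA Node.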

Step 4: Perform the failover

On the manager, make sure that neither /etc/masterha/app1/app1.failover.complete nor /etc/masterha/app1/app1.failover.error exists. If either file is present, delete it manually.
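
That cleanup can be scripted. A minimal sketch, assuming the working directory and application name used in this post; clean_failover_markers is a hypothetical helper:

```shell
#!/bin/sh
# clean_failover_markers (hypothetical helper): remove the marker files left
# by a previous failover. Arguments: working directory, application name.
clean_failover_markers() {
    cfm_dir=${1:-/etc/masterha/app1}
    cfm_app=${2:-app1}
    # rm -f succeeds even when the files do not exist.
    rm -f "$cfm_dir/$cfm_app.failover.complete" "$cfm_dir/$cfm_app.failover.error"
}
```

Calling clean_failover_markers with no arguments targets /etc/masterha/app1/app1.failover.complete and /etc/masterha/app1/app1.failover.error and leaves everything else in the directory alone.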

Start MHA:

nohup masterha_manager --conf=/etc/masterha/app1/app1.cnf > /etc/masterha/app1/mha_manager.log 2>&1 &

Check the MHA status and confirm that it is running. The current master is 192.168.10.10:

[root@localhost tmp]#  masterha_check_status --conf=/etc/masterha/app1/app1.cnf
app1 (pid:88441) is running(0:PING_OK), master:192.168.10.10
[root@localhost tmp]#
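
For scripting, the master address can be pulled out of that status line. A sketch that assumes the output format shown above; master_from_status is a hypothetical helper:

```shell
#!/bin/sh
# master_from_status (hypothetical helper): read masterha_check_status output
# on stdin and print the master address. Prints nothing if the manager is not
# reported as running with PING_OK.
master_from_status() {
    sed -n 's/.*is running(0:PING_OK), master:\([^ ]*\).*/\1/p'
}
```

Typical use would be masterha_check_status --conf=/etc/masterha/app1/app1.cnf | master_from_status.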

View the manager log:

[root@localhost tmp]# tail -f /etc/masterha/app1/manager.log
192.168.10.10 (current master)
 +--192.168.10.11
 +--192.168.10.12

Sun Jan 28 11:38:06 2018 - [warning] master_ip_failover_script is not defined.
Sun Jan 28 11:38:06 2018 - [warning] shutdown_script is not defined.
Sun Jan 28 11:38:06 2018 - [info] Set master ping interval 3 seconds.
Sun Jan 28 11:38:06 2018 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sun Jan 28 11:38:06 2018 - [info] Starting ping health check on 192.168.10.10(192.168.10.10:3358)..
Sun Jan 28 11:38:06 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

The log above shows that the current master is 192.168.10.10, slave 1 is 192.168.10.11, and slave 2 is 192.168.10.12.

Test automatic failover:

Stop the master instance manually: service mysqld stop

The log now fills with messages. Reading through them carefully shows what happened:

[root@localhost tmp]# tail -f /etc/masterha/app1/manager.log
192.168.10.10 (current master)
 +--192.168.10.11
 +--192.168.10.12
# current master/slave topology
Sun Jan 28 11:38:06 2018 - [warning] master_ip_failover_script is not defined.
Sun Jan 28 11:38:06 2018 - [warning] shutdown_script is not defined.
Sun Jan 28 11:38:06 2018 - [info] Set master ping interval 3 seconds.
Sun Jan 28 11:38:06 2018 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sun Jan 28 11:38:06 2018 - [info] Starting ping health check on 192.168.10.10(192.168.10.10:3358)..
Sun Jan 28 11:38:06 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond.. # ping health check OK; monitoring until MySQL stops responding
Sun Jan 28 11:42:54 2018 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away) # warning: the MySQL SELECT ping failed
Sun Jan 28 11:42:54 2018 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/export/data/mysql/ --output_file=/var/tmp/save_binary_logs_test --manager_version=0.55 --binlog_prefix=mysql-bin # runs save_binary_logs in test mode as an SSH check
Sun Jan 28 11:42:54 2018 - [info] HealthCheck: SSH to 192.168.10.10 is reachable. # SSH is reachable
Sun Jan 28 11:42:57 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Jan 28 11:42:57 2018 - [warning] Connection failed 1 time(s)..
Sun Jan 28 11:43:00 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Jan 28 11:43:00 2018 - [warning] Connection failed 2 time(s)..
Sun Jan 28 11:43:03 2018 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Jan 28 11:43:03 2018 - [warning] Connection failed 3 time(s).. # three connection attempts failed
Sun Jan 28 11:43:03 2018 - [warning] Master is not reachable from health checker!
Sun Jan 28 11:43:03 2018 - [warning] Master 192.168.10.10(192.168.10.10:3358) is not reachable! # MySQL on the master is unreachable
Sun Jan 28 11:43:03 2018 - [warning] SSH is reachable. # SSH is reachable
Sun Jan 28 11:43:03 2018 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1/app1.cnf again, and trying to connect to all servers to check server status.. 
Sun Jan 28 11:43:03 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Jan 28 11:43:03 2018 - [info] Reading application default configurations from /etc/masterha/app1/app1.cnf.. # re-reads the configuration and connects to all servers to check their status
Sun Jan 28 11:43:03 2018 - [info] Reading server configurations from /etc/masterha/app1/app1.cnf..
Sun Jan 28 11:43:03 2018 - [info] Dead Servers: 
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info] Alive Servers:
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.11(192.168.10.11:3358)
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.12(192.168.10.12:3358)
Sun Jan 28 11:43:03 2018 - [info] Alive Slaves:
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.11(192.168.10.11:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:03 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:03 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info]     Not candidate for the new Master (no_master is set)
Sun Jan 28 11:43:03 2018 - [info] Checking slave configurations..
Sun Jan 28 11:43:03 2018 - [info]  read_only=1 is not set on slave 192.168.10.11(192.168.10.11:3358).
Sun Jan 28 11:43:03 2018 - [warning]  relay_log_purge=0 is not set on slave 192.168.10.11(192.168.10.11:3358).
Sun Jan 28 11:43:03 2018 - [info]  read_only=1 is not set on slave 192.168.10.12(192.168.10.12:3358).
Sun Jan 28 11:43:03 2018 - [warning]  relay_log_purge=0 is not set on slave 192.168.10.12(192.168.10.12:3358).
Sun Jan 28 11:43:03 2018 - [info] Checking replication filtering settings..
Sun Jan 28 11:43:03 2018 - [info]  Replication filtering check ok.
Sun Jan 28 11:43:03 2018 - [info] Master is down! # the master is down; the monitoring script stops
Sun Jan 28 11:43:03 2018 - [info] Terminating monitoring script.
Sun Jan 28 11:43:03 2018 - [info] Got exit code 20 (Master dead). # exit code 20 is returned
Sun Jan 28 11:43:03 2018 - [info] MHA::MasterFailover version 0.55.
Sun Jan 28 11:43:03 2018 - [info] Starting master failover.
Sun Jan 28 11:43:03 2018 - [info]  # master failover begins
Sun Jan 28 11:43:03 2018 - [info] * Phase 1: Configuration Check Phase..
Sun Jan 28 11:43:03 2018 - [info] 
Sun Jan 28 11:43:03 2018 - [info] Dead Servers:
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info] Checking master reachability via mysql(double check)..
Sun Jan 28 11:43:03 2018 - [info]  ok.
Sun Jan 28 11:43:03 2018 - [info] Alive Servers:
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.11(192.168.10.11:3358)
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.12(192.168.10.12:3358)
Sun Jan 28 11:43:03 2018 - [info] Alive Slaves:
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.11(192.168.10.11:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:03 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info]     Primary candidate for the new Master (candidate_master is set) # the candidate will be promoted to master
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:03 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info]     Not candidate for the new Master (no_master is set)
Sun Jan 28 11:43:03 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Sun Jan 28 11:43:03 2018 - [info]
Sun Jan 28 11:43:03 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Sun Jan 28 11:43:03 2018 - [info]  
Sun Jan 28 11:43:03 2018 - [info] Forcing shutdown so that applications never connect to the current master.. # forced shutdown so that applications stop connecting to the old master
Sun Jan 28 11:43:03 2018 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master ip address.  # the IP failover script is not set
Sun Jan 28 11:43:03 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.  # the shutdown script is not set
Sun Jan 28 11:43:03 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sun Jan 28 11:43:03 2018 - [info]  
Sun Jan 28 11:43:03 2018 - [info] * Phase 3: Master Recovery Phase..
Sun Jan 28 11:43:03 2018 - [info]   
Sun Jan 28 11:43:03 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sun Jan 28 11:43:03 2018 - [info]  # finding the most up-to-date slaves
Sun Jan 28 11:43:03 2018 - [info] The latest binary log file/position on all slaves is mysql-bin.000039:1193 # all slaves are at the latest binlog position
Sun Jan 28 11:43:03 2018 - [info] Latest slaves (Slaves that received relay log files to the latest): # the latest slaves are:
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.11(192.168.10.11:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:03 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info]     Primary candidate for the new Master (candidate_master is set) 
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:03 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info]     Not candidate for the new Master (no_master is set)
Sun Jan 28 11:43:03 2018 - [info] The oldest binary log file/position on all slaves is mysql-bin.000039:1193
Sun Jan 28 11:43:03 2018 - [info] Oldest slaves: # the oldest slaves:
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.11(192.168.10.11:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:03 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jan 28 11:43:03 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:03 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:03 2018 - [info]     Not candidate for the new Master (no_master is set)
Sun Jan 28 11:43:03 2018 - [info]
Sun Jan 28 11:43:03 2018 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Sun Jan 28 11:43:03 2018 - [info]  # saving the dead master's binlog
Sun Jan 28 11:43:03 2018 - [info] Fetching dead master's binary logs..
Sun Jan 28 11:43:03 2018 - [info] Executing command on the dead master 192.168.10.10(192.168.10.10:3358): save_binary_logs --command=save --start_file=mysql-bin.000039  --start_pos=1193 --binlog_dir=/export/data/mysql/ --output_file=/var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.55
  Creating /var/tmp if not exists..    ok. 
 Concat binary/relay logs from mysql-bin.000039 pos 1193 to mysql-bin.000039 EOF into /var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog ..
  Dumping binlog format description event, from position 0 to 123.. ok.
  Dumping effective binlog data from /export/data/mysql//mysql-bin.000039 position 1193 to tail(1216).. ok.
 Concat succeeded.
Sun Jan 28 11:43:04 2018 - [info] scp from root@192.168.10.10:/var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog to local:/etc/masterha/app1/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog succeeded. # the binlog is saved under /etc/masterha/app1
Sun Jan 28 11:43:04 2018 - [info] HealthCheck: SSH to 192.168.10.11 is reachable.
Sun Jan 28 11:43:04 2018 - [info] HealthCheck: SSH to 192.168.10.12 is reachable.
Sun Jan 28 11:43:04 2018 - [info] # SSH check OK
Sun Jan 28 11:43:04 2018 - [info] * Phase 3.3: Determining New Master Phase..
Sun Jan 28 11:43:04 2018 - [info]
Sun Jan 28 11:43:04 2018 - [info] Finding the latest slave that has all relay logs for recovering other slaves.. # finds the latest slave whose relay logs can bring the other slaves up to date
Sun Jan 28 11:43:04 2018 - [info] All slaves received relay logs to the same position. No need to resync each other.  # all slaves are at the same position; no resync is needed
Sun Jan 28 11:43:04 2018 - [info] Searching new master from slaves.. # choosing the new master from the slaves
Sun Jan 28 11:43:04 2018 - [info]  Candidate masters from the configuration file:
Sun Jan 28 11:43:04 2018 - [info]   192.168.10.11(192.168.10.11:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:04 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:04 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jan 28 11:43:04 2018 - [info]  Non-candidate masters:
Sun Jan 28 11:43:04 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Sun Jan 28 11:43:04 2018 - [info]     Replicating from 192.168.10.10(192.168.10.10:3358)
Sun Jan 28 11:43:04 2018 - [info]     Not candidate for the new Master (no_master is set)
Sun Jan 28 11:43:04 2018 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Sun Jan 28 11:43:04 2018 - [info] New master is 192.168.10.11(192.168.10.11:3358)
Sun Jan 28 11:43:04 2018 - [info] Starting master failover..
Sun Jan 28 11:43:04 2018 - [info] # 192.168.10.11 is the new master
From:
192.168.10.10 (current master) 
 +--192.168.10.11
 +--192.168.10.12
# the replication topology changes from the layout above to the one below
To:
192.168.10.11 (new master)
 +--192.168.10.12
Sun Jan 28 11:43:04 2018 - [info] # generate the differential log for the new master
Sun Jan 28 11:43:04 2018 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Sun Jan 28 11:43:04 2018 - [info] # the new master already has all relay logs, so no differential log is needed
Sun Jan 28 11:43:04 2018 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sun Jan 28 11:43:04 2018 - [info] Sending binlog.. # send the binlog
Sun Jan 28 11:43:05 2018 - [info] scp from local:/etc/masterha/app1/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog to root@192.168.10.11:/var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog succeeded.  # the binlog saved from the old master is sent to the new master
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] * Phase 3.4: Master Log Apply Phase..
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed. # note: if anything fails during this phase, you have to recover manually
Sun Jan 28 11:43:05 2018 - [info] Starting recovery on 192.168.10.11(192.168.10.11:3358).. # recovery starts
Sun Jan 28 11:43:05 2018 - [info]  Generating diffs succeeded.
Sun Jan 28 11:43:05 2018 - [info] Waiting until all relay logs are applied.
Sun Jan 28 11:43:05 2018 - [info]  done. # relay logs applied
Sun Jan 28 11:43:05 2018 - [info] Getting slave status..
Sun Jan 28 11:43:05 2018 - [info] This slave(192.168.10.11)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000039:1193). No need to recover from Exec_Master_Log_Pos.
Sun Jan 28 11:43:05 2018 - [info] Connecting to the target slave host 192.168.10.11, running recover script.. # connect to 192.168.10.11 and run the recovery script
Sun Jan 28 11:43:05 2018 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='mha' --slave_host=192.168.10.11 --slave_ip=192.168.10.11  --slave_port=3358 --apply_files=/var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog --workdir=/var/tmp --target_version=5.7.19-log --timestamp=20180128114303 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.55 --slave_pass=xxx
Sun Jan 28 11:43:05 2018 - [info] # apply the binlog saved from 192.168.10.10 to the new master
MySQL client version is 5.7.19. Using --binary-mode.
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog on 192.168.10.11:3358. This may take long time... 
Applying log files succeeded.  # the binlog was applied successfully
Sun Jan 28 11:43:05 2018 - [info]  All relay logs were successfully applied.
Sun Jan 28 11:43:05 2018 - [info] Getting new master's binlog name and position..
Sun Jan 28 11:43:05 2018 - [info]  mysql-bin.000017:154 # the new master's binlog file name and position
Sun Jan 28 11:43:05 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.10.11', MASTER_PORT=3358, MASTER_LOG_FILE='mysql-bin.000017',MASTER_LOG_POS=154, MASTER_USER='repl', MASTER_PASSWORD='xxx'; # every slave should start replicating from this position
Sun Jan 28 11:43:05 2018 - [warning] master_ip_failover_script is not set. Skipping taking over new master ip address.
Sun Jan 28 11:43:05 2018 - [info] ** Finished master recovery successfully.
Sun Jan 28 11:43:05 2018 - [info] * Phase 3: Master Recovery Phase completed.
Sun Jan 28 11:43:05 2018 - [info]   # master recovery succeeded
Sun Jan 28 11:43:05 2018 - [info] * Phase 4: Slaves Recovery Phase..
Sun Jan 28 11:43:05 2018 - [info] 
Sun Jan 28 11:43:05 2018 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] -- Slave diff file generation on host 192.168.10.12(192.168.10.12:3358) started, pid: 91595. Check tmp log /etc/masterha/app1/192.168.10.12_3358_20180128114303.log if it takes time..
Sun Jan 28 11:43:05 2018 - [info]  
Sun Jan 28 11:43:05 2018 - [info] Log messages from 192.168.10.12 ...
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sun Jan 28 11:43:05 2018 - [info] End of log messages from 192.168.10.12.
Sun Jan 28 11:43:05 2018 - [info] -- 192.168.10.12(192.168.10.12:3358) has the latest relay log events.  # 192.168.10.12 has the latest relay log
Sun Jan 28 11:43:05 2018 - [info] Generating relay diff files from the latest slave succeeded.
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] -- Slave recovery on host 192.168.10.12(192.168.10.12:3358) started, pid: 91597. Check tmp log /etc/masterha/app1/192.168.10.12_3358_20180128114303.log if it takes time..
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] Log messages from 192.168.10.12 ...
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] Sending binlog..
Sun Jan 28 11:43:05 2018 - [info] scp from local:/etc/masterha/app1/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog to root@192.168.10.12:/var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog succeeded. # the binlog saved from the old master is sent to slave 192.168.10.12
Sun Jan 28 11:43:05 2018 - [info] Starting recovery on 192.168.10.12(192.168.10.12:3358)..
Sun Jan 28 11:43:05 2018 - [info]  Generating diffs succeeded.
Sun Jan 28 11:43:05 2018 - [info] Waiting until all relay logs are applied.
Sun Jan 28 11:43:05 2018 - [info]  done.
Sun Jan 28 11:43:05 2018 - [info] Getting slave status..
Sun Jan 28 11:43:05 2018 - [info] This slave(192.168.10.12)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000039:1193). No need to recover from Exec_Master_Log_Pos.
Sun Jan 28 11:43:05 2018 - [info] Connecting to the target slave host 192.168.10.12, running recover script..
Sun Jan 28 11:43:05 2018 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='mha' --slave_host=192.168.10.12 --slave_ip=192.168.10.12  --slave_port=3358 --apply_files=/var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog --workdir=/var/tmp --target_version=5.7.19-log --timestamp=20180128114303 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.55 --slave_pass=xxx
Sun Jan 28 11:43:05 2018 - [info]
MySQL client version is 5.7.19. Using --binary-mode.
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.10.10_3358_20180128114303.binlog on 192.168.10.12:3358. This may take long time...
Applying log files succeeded.
Sun Jan 28 11:43:05 2018 - [info]  All relay logs were successfully applied.
Sun Jan 28 11:43:05 2018 - [info]  Resetting slave 192.168.10.12(192.168.10.12:3358) and starting replication from the new master 192.168.10.11(192.168.10.11:3358)..
Sun Jan 28 11:43:05 2018 - [info]  Executed CHANGE MASTER.
Sun Jan 28 11:43:05 2018 - [info]  Slave started. # slave 192.168.10.12 is reconfigured to point to the new master
Sun Jan 28 11:43:05 2018 - [info] End of log messages from 192.168.10.12.
Sun Jan 28 11:43:05 2018 - [info] -- Slave recovery on host 192.168.10.12(192.168.10.12:3358) succeeded.
Sun Jan 28 11:43:05 2018 - [info] All new slave servers recovered successfully.
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] * Phase 5: New master cleanup phase..
Sun Jan 28 11:43:05 2018 - [info]
Sun Jan 28 11:43:05 2018 - [info] Resetting slave info on the new master..
Sun Jan 28 11:43:05 2018 - [info]  192.168.10.11: Resetting slave info succeeded.
Sun Jan 28 11:43:05 2018 - [info] Master failover to 192.168.10.11(192.168.10.11:3358) completed successfully.
Sun Jan 28 11:43:05 2018 - [info]

----- Failover Report -----

app1: MySQL Master failover 192.168.10.10 to 192.168.10.11 succeeded
# master failover from 192.168.10.10 to 192.168.10.11 succeeded
Master 192.168.10.10 is down!

Check MHA Manager logs at localhost.localdomain:/etc/masterha/app1/manager.log for details.
# see the log at /etc/masterha/app1/manager.log for details
Started automated(non-interactive) failover.
The latest slave 192.168.10.11(192.168.10.11:3358) has all relay logs for recovery.
Selected 192.168.10.11 as a new master.
192.168.10.11: OK: Applying all logs succeeded.
192.168.10.12: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.10.12: OK: Applying all logs succeeded. Slave started, replicating from 192.168.10.11.
192.168.10.11: Resetting slave info succeeded.
Master failover to 192.168.10.11(192.168.10.11:3358) completed successfully.
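
If you want to pick up the new master from a script instead of reading the report by eye, something like the following might do. It is only a sketch: the function name new_master_from_report is made up for this post, and it assumes the report contains the "Selected <host> as a new master." line exactly as shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MHA): print the host that the failover
# report names as the new master. Reads the report or manager log on stdin
# and relies on the "Selected <host> as a new master." line.
new_master_from_report() {
    sed -n 's/^Selected \([^ ]*\) as a new master\..*$/\1/p' | tail -n 1
}
```

Example: new_master_from_report < /etc/masterha/app1/manager.log
If the log holds several failovers, only the last match is printed.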

The report shows that the master is now 192.168.10.11. Let's verify:

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.11
                  Master_User: repl
                  Master_Port: 3358
                Connect_Retry: 10
              Master_Log_File: mysql-bin.000017
          Read_Master_Log_Pos: 154
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000017
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              ...................

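Once the original master has recovered from the failure (start the 192.168.10.10 instance at this point), turn it into a new slave. First, get the binlog position on the manager: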

[root@localhost tmp]# grep -i "change master to" /etc/masterha/app1/manager.log | tail -1
Sun Jan 28 11:43:05 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.10.11', MASTER_PORT=3358, MASTER_LOG_FILE='mysql-bin.000017',MASTER_LOG_POS=154, MASTER_USER='repl', MASTER_PASSWORD='xxx';
[root@localhost tmp]#
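
The grep above returns the whole log line, timestamp included. A small sketch that keeps only the SQL statement could look like this; the function name change_master_stmt is hypothetical, and it assumes the statement sits on a single log line and ends with a semicolon, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MHA): print only the
# "CHANGE MASTER TO ...;" statement from the last matching line of the
# manager log passed as $1.
change_master_stmt() {
    grep -i 'change master to' "$1" | tail -n 1 |
        sed -n 's/.*\(CHANGE MASTER TO [^;]*;\).*/\1/p'
}
```

Example: change_master_stmt /etc/masterha/app1/manager.log
MHA masks the password in the log as 'xxx', so replace it with the real replication password before running the statement.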

Then run the statement on the original master:

mysql> CHANGE MASTER TO MASTER_HOST='192.168.10.11', MASTER_PORT=3358, MASTER_LOG_FILE='mysql-bin.000017',MASTER_LOG_POS=154, MASTER_USER='repl', MASTER_PASSWORD='repl';
Query OK, 0 rows affected, 2 warnings (0.02 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.11
                  Master_User: repl
                  Master_Port: 3358
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000017
          Read_Master_Log_Pos: 154
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000017
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
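
The two fields that matter in this output are Slave_IO_Running and Slave_SQL_Running. As a rough sketch, the check could be scripted as below. The function name replication_ok is made up, and it expects the "show slave status \G" output on stdin in the layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only if both replication threads
# report "Yes". Expects the output of "show slave status \G" on stdin.
replication_ok() {
    awk -F': *' '
        /^ *Slave_IO_Running:/  { io  = $2 }
        /^ *Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

Example: mysql -e 'show slave status \G' | replication_ok && echo "replication is running"
Connection options such as user, password and port are left out here; add whatever your setup needs.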

The new environment is now up.

Current environment:

192.168.10.10 slave

192.168.10.11  master

192.168.10.12 slave

Testing a manual failover:

If MHA Manager is still running, stop it manually first:

  masterha_stop --conf=/etc/masterha/app1/app1.cnf

[root@localhost tmp]#   masterha_stop --conf=/etc/masterha/app1/app1.cnf
MHA Manager is not running on app1(2:NOT_RUNNING).

Then check the replication status with MHA:

 masterha_check_repl  --conf=/etc/masterha/app1/app1.cnf

Start the switch:

masterha_master_switch --master_state=alive --conf=/etc/masterha/app1/app1.cnf --orig_master_is_new_slave --running_updates_limit=3600 --interactive=0

[root@localhost tmp]#  masterha_master_switch --master_state=alive --conf=/etc/masterha/app1/app1.cnf --orig_master_is_new_slave --running_updates_limit=3600 --interactive=0
Mon Jan 29 01:10:39 2018 - [info] MHA::MasterRotate version 0.55.
Mon Jan 29 01:10:39 2018 - [info] Starting online master switch..
Mon Jan 29 01:10:39 2018 - [info] # the online switch starts
Mon Jan 29 01:10:39 2018 - [info] * Phase 1: Configuration Check Phase..
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Jan 29 01:10:39 2018 - [info] Reading application default configurations from /etc/masterha/app1/app1.cnf..
Mon Jan 29 01:10:39 2018 - [info] Reading server configurations from /etc/masterha/app1/app1.cnf..
Mon Jan 29 01:10:39 2018 - [info] Current Alive Master: 192.168.10.11(192.168.10.11:3358)
Mon Jan 29 01:10:39 2018 - [info] Alive Slaves:
Mon Jan 29 01:10:39 2018 - [info]   192.168.10.10(192.168.10.10:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Mon Jan 29 01:10:39 2018 - [info]     Replicating from 192.168.10.11(192.168.10.11:3358)
Mon Jan 29 01:10:39 2018 - [info]     Primary candidate for the new Master (candidate_master is set) # the current master and slaves
Mon Jan 29 01:10:39 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Mon Jan 29 01:10:39 2018 - [info]     Replicating from 192.168.10.11(192.168.10.11:3358)
Mon Jan 29 01:10:39 2018 - [info]     Not candidate for the new Master (no_master is set)
Mon Jan 29 01:10:39 2018 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Mon Jan 29 01:10:39 2018 - [info]  ok.
Mon Jan 29 01:10:39 2018 - [info] Checking MHA is not monitoring or doing failover..
Mon Jan 29 01:10:39 2018 - [info] Checking replication health on 192.168.10.10..
Mon Jan 29 01:10:39 2018 - [info]  ok.
Mon Jan 29 01:10:39 2018 - [info] Checking replication health on 192.168.10.12..
Mon Jan 29 01:10:39 2018 - [info]  ok.
Mon Jan 29 01:10:39 2018 - [info] Searching new master from slaves..
Mon Jan 29 01:10:39 2018 - [info]  Candidate masters from the configuration file:
Mon Jan 29 01:10:39 2018 - [info]   192.168.10.10(192.168.10.10:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Mon Jan 29 01:10:39 2018 - [info]     Replicating from 192.168.10.11(192.168.10.11:3358)
Mon Jan 29 01:10:39 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Jan 29 01:10:39 2018 - [info]   192.168.10.11(192.168.10.11:3358)  Version=5.7.19-log log-bin:enabled
Mon Jan 29 01:10:39 2018 - [info]  Non-candidate masters:
Mon Jan 29 01:10:39 2018 - [info]   192.168.10.12(192.168.10.12:3358)  Version=5.7.19-log (oldest major version between slaves) log-bin:enabled
Mon Jan 29 01:10:39 2018 - [info]     Replicating from 192.168.10.11(192.168.10.11:3358)
Mon Jan 29 01:10:39 2018 - [info]     Not candidate for the new Master (no_master is set)
Mon Jan 29 01:10:39 2018 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Mon Jan 29 01:10:39 2018 - [info]
From:
192.168.10.11 (current master)  # current topology
 +--192.168.10.10
 +--192.168.10.12

To:
192.168.10.10 (new master)  # new topology
 +--192.168.10.12
 +--192.168.10.11
Mon Jan 29 01:10:39 2018 - [info] Checking whether 192.168.10.10(192.168.10.10:3358) is ok for the new master..
Mon Jan 29 01:10:39 2018 - [info]  ok.
Mon Jan 29 01:10:39 2018 - [info] 192.168.10.11(192.168.10.11:3358): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Mon Jan 29 01:10:39 2018 - [info] 192.168.10.11(192.168.10.11:3358): Resetting slave pointing to the dummy host.
Mon Jan 29 01:10:39 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [info] * Phase 2: Rejecting updates Phase..
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [warning] master_ip_online_change_script is not defined. Skipping disabling writes on the current master.
Mon Jan 29 01:10:39 2018 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Mon Jan 29 01:10:39 2018 - [info] Executing FLUSH TABLES WITH READ LOCK..
Mon Jan 29 01:10:39 2018 - [info]  ok.
Mon Jan 29 01:10:39 2018 - [info] Orig master binlog:pos is mysql-bin.000017:682.
Mon Jan 29 01:10:39 2018 - [info]  Waiting to execute all relay logs on 192.168.10.10(192.168.10.10:3358)..
Mon Jan 29 01:10:39 2018 - [info]  master_pos_wait(mysql-bin.000017:682) completed on 192.168.10.10(192.168.10.10:3358). Executed 0 events.
Mon Jan 29 01:10:39 2018 - [info]   done.
Mon Jan 29 01:10:39 2018 - [info] Getting new master's binlog name and position..
Mon Jan 29 01:10:39 2018 - [info]  mysql-bin.000040:194
Mon Jan 29 01:10:39 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.10.10', MASTER_PORT=3358, MASTER_LOG_FILE='mysql-bin.000040',MASTER_LOG_POS=194, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [info] * Switching slaves in parallel..
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [info] -- Slave switch on host 192.168.10.12(192.168.10.12:3358) started, pid: 60807
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [info] Log messages from 192.168.10.12 ...
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [info]  Waiting to execute all relay logs on 192.168.10.12(192.168.10.12:3358)..
Mon Jan 29 01:10:39 2018 - [info]  master_pos_wait(mysql-bin.000017:682) completed on 192.168.10.12(192.168.10.12:3358). Executed 0 events.
Mon Jan 29 01:10:39 2018 - [info]   done.
Mon Jan 29 01:10:39 2018 - [info]  Resetting slave 192.168.10.12(192.168.10.12:3358) and starting replication from the new master 192.168.10.10(192.168.10.10:3358)..
Mon Jan 29 01:10:39 2018 - [info]  Executed CHANGE MASTER.
Mon Jan 29 01:10:39 2018 - [info]  Slave started.
Mon Jan 29 01:10:39 2018 - [info] End of log messages from 192.168.10.12 ...
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [info] -- Slave switch on host 192.168.10.12(192.168.10.12:3358) succeeded.
Mon Jan 29 01:10:39 2018 - [info] Unlocking all tables on the orig master:
Mon Jan 29 01:10:39 2018 - [info] Executing UNLOCK TABLES..
Mon Jan 29 01:10:39 2018 - [info]  ok.
Mon Jan 29 01:10:39 2018 - [info] Starting orig master as a new slave..
Mon Jan 29 01:10:39 2018 - [info]  Resetting slave 192.168.10.11(192.168.10.11:3358) and starting replication from the new master 192.168.10.10(192.168.10.10:3358)..
Mon Jan 29 01:10:39 2018 - [info]  Executed CHANGE MASTER.
Mon Jan 29 01:10:39 2018 - [info]  Slave started.
Mon Jan 29 01:10:39 2018 - [info] All new slave servers switched successfully.
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [info] * Phase 5: New master cleanup phase..
Mon Jan 29 01:10:39 2018 - [info]
Mon Jan 29 01:10:39 2018 - [info]  192.168.10.10: Resetting slave info succeeded.
Mon Jan 29 01:10:39 2018 - [info] Switching master to 192.168.10.10(192.168.10.10:3358) completed successfully.

Verification:

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.10
                  Master_User: repl
                  Master_Port: 3358
                Connect_Retry: 10
              Master_Log_File: mysql-bin.000040
          Read_Master_Log_Pos: 194
               Relay_Log_File: localhost-relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000040
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
                        ...................

 
