MySQL High-Availability Architecture: MHA Setup and Testing (Part 2)

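I. MHA Features

MHA monitors the master of a replication topology and performs failover automatically as soon as it detects that the master has failed.

Even if some slaves have not received the latest relay log events, MHA identifies the differential relay log events on the most up-to-date slave and applies them to the other slaves, so all slaves end up consistent. MHA normally completes failover in seconds: 9-12 seconds to detect the master failure, 7-10 seconds to shut down the failed master to avoid split brain, and a few seconds to apply the differential relay logs to the new master, so the whole process usually finishes within 10-30 seconds. You can also set a priority to designate a particular slave as the candidate master. Because MHA repairs consistency among the slaves, any slave can be promoted to the new master without the consistency problems that would otherwise break replication.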

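II. Points to Note

1. Slaves must be set to read_only.

2. Once a failover happens, the manager process exits. Before restarting masterha_manager so that it can handle another failover, manually delete app1.failover.complete from the manager working directory.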
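The cleanup in point 2 can be sketched as a small shell function. This is only an illustration: the function name is hypothetical, and the default directory is an assumption taken from the manager_workdir value used in app1.conf later in this article.

```shell
#!/bin/sh
# Sketch: remove the marker left behind by a completed failover so the
# manager can handle another one.
# Assumption: the default directory below is the manager_workdir from
# app1.conf; pass your own directory as the first argument.
clear_failover_marker() {
    workdir="${1:-/var/log/masterha/app1}"
    marker="$workdir/app1.failover.complete"
    if [ -f "$marker" ]; then
        rm -f "$marker" && echo "removed $marker"
    else
        echo "no marker at $marker"
    fi
}
```

Call clear_failover_marker on the manager host before starting masterha_manager again.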

3. MHA calls the mysqlbinlog command during failover. If MySQL is not installed in the standard location, add symlinks manually:

ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql
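A quick pre-flight check can confirm that the tools are actually found on PATH. The failover log later in this article contains "sh: mysqlbinlog: command not found", which is the symptom this check is meant to catch early. The function name missing_tools is hypothetical, and this is a sketch rather than anything shipped with MHA.

```shell
#!/bin/sh
# Sketch: print every given tool that cannot be found on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}
```

Running missing_tools mysqlbinlog mysql on every MySQL node should print nothing. MHA runs its commands over SSH, so it is worth running the check through ssh as well, since a non-interactive session may see a different PATH than a login shell.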

4. Both the master and slave configurations must contain:

binlog-do-db=test
replicate-do-db=test
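Normally the master only needs binlog-do-db=test and the slaves only need replicate-do-db=test for replication to work. With that configuration alone, however, MHA reports the following error:
All log-bin enabled servers must have same binlog filtering rules (same binlog-do-db and binlog-ignore-db). Check SHOW MASTER STATUS output and set my.cnf correctly.
The message seems to say that the replicated databases must be identical, but what actually has to match is the database filtering section of each server's configuration file.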
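One way to catch this before running MHA is to compare the filter lines of two my.cnf files directly. The sketch below is a hypothetical helper, not part of MHA: it only looks at binlog-do-db, binlog-ignore-db, replicate-do-db and replicate-ignore-db lines, and it treats "-" and "_" in option names as equivalent.

```shell
#!/bin/sh
# Sketch: print the database filter rules of a my.cnf file in a
# normalized, sorted form (whitespace removed, "_" in the option name
# rewritten to "-").
filter_rules() {
    awk -F= '
        {
            key = $1; val = $2
            gsub(/[ \t]/, "", key); gsub(/[ \t]/, "", val)
            gsub(/_/, "-", key)
            if (key ~ /^(binlog|replicate)-(do|ignore)-db$/) print key "=" val
        }' "$1" | sort
}

# Succeeds only if both files carry the same set of filter rules.
same_filter_rules() {
    [ "$(filter_rules "$1")" = "$(filter_rules "$2")" ]
}
```

For example, same_filter_rules master.cnf slave.cnf || echo "filter rules differ", where both file names are placeholders for copies of each server's configuration.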
5. On the slaves, set relay_log_purge=0. Without it, MHA prints the warning "relay_log_purge=0 is not set on slave" (explained and tested later in this series).

6. In the MHA configuration file:

candidate_master=1                              # prefer this slave when promoting a new master
no_master=1                                     # never promote this server to master
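To see at a glance how these two flags are distributed across servers, the configuration file can be summarized with awk. This is a rough sketch with a hypothetical function name. It assumes sections named [server1], [server2] and so on, as in the app1.conf shown later, and it assumes that no_master wins when both flags are set on the same server.

```shell
#!/bin/sh
# Sketch: print "<section> <hostname> <role>" for every [serverN]
# section of an MHA configuration file.
# role is "candidate", "never" or "normal".
promotion_roles() {
    awk '
        function emit() {
            if (name == "") return
            role = "normal"
            if (cand == "1") role = "candidate"
            if (nom == "1") role = "never"      # assumption: no_master wins
            print name, host, role
        }
        /^[ \t]*\[server[0-9]+\]/ {
            emit()
            name = $0
            sub(/^[ \t]*\[/, "", name); sub(/\].*$/, "", name)
            host = "-"; cand = ""; nom = ""
            next
        }
        /^[ \t]*\[/ { emit(); name = ""; next }
        name != "" {
            line = $0
            sub(/#.*$/, "", line)       # drop trailing comments
            gsub(/[ \t]/, "", line)     # drop whitespace
            split(line, kv, "=")
            if (kv[1] == "hostname") host = kv[2]
            if (kv[1] == "candidate_master") cand = kv[2]
            if (kv[1] == "no_master") nom = kv[2]
        }
        END { emit() }
    ' "$1"
}
```

Running promotion_roles /etc/masterha/app1.conf lists one line per server, which makes it easy to spot a setup where no server is eligible for promotion.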

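7. Checks performed by masterha_check_repl

a. Read the configuration files
b. Check the MySQL servers listed in the configuration (identify master and slaves)
c. Check the slave settings
    read_only
    relay_log_purge
    replication filtering rules
d. Verify SSH equivalence (passwordless SSH between hosts)
e. Check the master's binlog-saving script (save_binary_logs), which reads events from the binlog after the master dies
f. Check that each slave can apply differential relay logs (apply_diff_relay_logs)
g. Check the IP switch script, if one is deployed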

8. Conditions for an online master switch

1. The IO thread is running on all slaves.
2. The SQL thread is running on all slaves.
3. Seconds_Behind_Master on all slaves is less than or equal to --running_updates_limit seconds.
4. On the master, no update query in the SHOW PROCESSLIST output has been running for more than --running_updates_limit seconds.
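The first three conditions can be checked by hand from the output of SHOW SLAVE STATUS\G. The function below is a hypothetical sketch, not MHA's own implementation: it reads that output on standard input and takes the lag limit in seconds as its only argument. Condition 4, the running queries on the master, is not covered.

```shell
#!/bin/sh
# Sketch: succeed only if the slave whose SHOW SLAVE STATUS\G output is
# on stdin has both threads running and a lag within the given limit.
slave_ready_for_switch() {
    awk -v limit="$1" '
        $1 == "Slave_IO_Running:"      { io = $2 }
        $1 == "Slave_SQL_Running:"     { sql = $2 }
        $1 == "Seconds_Behind_Master:" { lag = $2 }
        END {
            # A NULL or missing lag value never counts as ready.
            ok = (io == "Yes" && sql == "Yes" && lag ~ /^[0-9]+$/ && lag + 0 <= limit + 0)
            exit(ok ? 0 : 1)
        }'
}
```

A possible use is mysql -e 'SHOW SLAVE STATUS\G' | slave_ready_for_switch 1 on each slave, with connection options added as your setup requires.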


III. Setting Up the MHA Environment

# MySQL version
mysql> select version();
+------------+
| version()  |
+------------+
| 5.6.37-log |
+------------+
1 row in set (0.00 sec)
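# Hosts
192.168.18.50  master
192.168.18.60  slave
192.168.18.70  slave

# Set up SSH trust among the three servers
# Run ssh-keygen on each server, append every server's id_rsa.pub to a single authorized_keys file, then copy that file to /root/.ssh on each server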
[root@bogon ~]# ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:+FbMFmsG8/Yeq9Enfg4fN/0N/Ff0kcu/l0hq6g7YDFI root@bogon
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|                 |
|     E  o .     .|
|    .  . * o   o.|
|   . .. S @   ..+|
|    . =. * o o o+|
|     . +o . O *.*|
|       ..  * O.*B|
|        o++.+oo.B|

cat id_rsa.pub >> authorized_keys
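As an alternative to assembling authorized_keys by hand, the standard ssh-copy-id tool can push the local public key to each peer. This is a different technique from the manual append used above. The sketch below only prints the commands so they can be reviewed first; the function name is hypothetical, and the key path assumes the default location shown in the ssh-keygen output.

```shell
#!/bin/sh
# Sketch: print (not run) one ssh-copy-id command per target host.
# Assumption: the key pair lives in /root/.ssh, as in the output above.
print_key_copy_commands() {
    for host in "$@"; do
        echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$host"
    done
}
```

On each of the three servers, print_key_copy_commands 192.168.18.50 192.168.18.60 192.168.18.70 shows the commands to run. Each one asks for the root password of the target host once.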

# Install MySQL, then build a one-master, two-slave topology
# After installation, initialize the MySQL accounts
mysql> delete from mysql.user where user!='root' or host!='localhost';
Query OK, 5 rows affected (0.00 sec)

mysql> truncate table mysql.db;
Query OK, 0 rows affected (0.05 sec)

mysql> drop database test;
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.03 sec)
mysql> grant all privileges on *.* to 'mha'@'%' identified by '123456';
Query OK, 0 rows affected (0.01 sec)

mysql> grant replication slave on *.* to 'repl' identified by 'repl4slave';
Query OK, 0 rows affected (0.03 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.04 sec)

# Install the MHA software
#yum install perl-DBD-MySQL
#yum install perl-Config-Tiny
#yum install perl-Log-Dispatch
#yum install perl-Parallel-ForkManager
rpm -ivh mha4mysql-manager
rpm -ivh mha4mysql-node

# Configuration
[root@test1 masterha]# more masterha_default.conf 
[server default]
# MySQL user and password
user=mha
password=123456

# System SSH user
ssh_user=root

# Replication user
repl_user=repl
repl_password= repl4slave


# Monitoring
ping_interval=1
shutdown_script=""

# Scripts invoked on switchover
master_ip_failover_script= /etc/masterha/master_ip_failover
master_ip_online_change_script= /etc/masterha/master_ip_online_change
[root@test1 masterha]# more app1.conf 
[server default]


# MHA manager working directory
manager_workdir = /var/log/masterha/app1
manager_log = /var/log/masterha/app1/app1.log
remote_workdir = /var/log/masterha/app1

[server1]
hostname=test
port=3307
master_binlog_dir = /mydata/mysql/mysql_3307
candidate_master = 1
check_repl_delay = 0     # keeps failover from stalling when a slave is lagging at the time the master fails

[server2]
hostname=test1
port=3307
master_binlog_dir=/mydata/mysql/mysql_3307
no_master =1
check_repl_delay=0


[server3]
hostname=test2
port=3307
master_binlog_dir=/mydata/mysql/mysql_3307
candidate_master=1
check_repl_delay=0

# Test SSH
[root@test1 masterha]# masterha_check_ssh --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf 
Wed Sep 27 08:53:51 2017 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Wed Sep 27 08:53:51 2017 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Wed Sep 27 08:53:51 2017 - [info] Reading server configuration from /etc/masterha/app1.conf..
Wed Sep 27 08:53:51 2017 - [info] Starting SSH connection tests..
Wed Sep 27 08:53:53 2017 - [debug] 
Wed Sep 27 08:53:51 2017 - [debug]  Connecting via SSH from root@test(192.168.18.50:22) to root@test1(192.168.18.60:22)..
Wed Sep 27 08:53:51 2017 - [debug]   ok.
Wed Sep 27 08:53:51 2017 - [debug]  Connecting via SSH from root@test(192.168.18.50:22) to root@bogon(192.168.18.70:22)..
Wed Sep 27 08:53:52 2017 - [debug]   ok.
Wed Sep 27 08:53:53 2017 - [debug] 
Wed Sep 27 08:53:52 2017 - [debug]  Connecting via SSH from root@test1(192.168.18.60:22) to root@test(192.168.18.50:22)..
Wed Sep 27 08:53:52 2017 - [debug]   ok.
Wed Sep 27 08:53:52 2017 - [debug]  Connecting via SSH from root@test1(192.168.18.60:22) to root@bogon(192.168.18.70:22)..
Warning: Permanently added '192.168.18.70' (RSA) to the list of known hosts.
Wed Sep 27 08:53:53 2017 - [debug]   ok.
Wed Sep 27 08:53:53 2017 - [debug] 
Wed Sep 27 08:53:52 2017 - [debug]  Connecting via SSH from root@bogon(192.168.18.70:22) to root@test(192.168.18.50:22)..
Wed Sep 27 08:53:53 2017 - [debug]   ok.
Wed Sep 27 08:53:53 2017 - [debug]  Connecting via SSH from root@bogon(192.168.18.70:22) to root@test1(192.168.18.60:22)..
Warning: Permanently added '192.168.18.60' (RSA) to the list of known hosts.
Wed Sep 27 08:53:53 2017 - [debug]   ok.
Wed Sep 27 08:53:53 2017 - [info] All SSH connection tests passed successfully.

# Test the replication topology
[root@test1 masterha]# masterha_check_repl --global_conf=masterha_default.conf --conf=app1.conf 
Thu Sep 28 06:14:11 2017 - [info] Reading default configuration from masterha_default.conf..
Thu Sep 28 06:14:11 2017 - [info] Reading application default configuration from app1.conf..
Thu Sep 28 06:14:11 2017 - [info] Reading server configuration from app1.conf..
Thu Sep 28 06:14:11 2017 - [info] MHA::MasterMonitor version 0.56.
Thu Sep 28 06:14:12 2017 - [info] GTID failover mode = 0
Thu Sep 28 06:14:12 2017 - [info] Dead Servers:
Thu Sep 28 06:14:12 2017 - [info] Alive Servers:
Thu Sep 28 06:14:12 2017 - [info]   test(192.168.18.50:3307)
Thu Sep 28 06:14:12 2017 - [info]   test1(192.168.18.60:3307)
Thu Sep 28 06:14:12 2017 - [info]   test2(192.168.18.70:3307)
Thu Sep 28 06:14:12 2017 - [info] Alive Slaves:
Thu Sep 28 06:14:12 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:14:12 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:14:12 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:14:12 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:14:12 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:14:12 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:14:12 2017 - [info] Current Alive Master: test(192.168.18.50:3307)
Thu Sep 28 06:14:12 2017 - [info] Checking slave configurations..
Thu Sep 28 06:14:12 2017 - [warning]  relay_log_purge=0 is not set on slave test1(192.168.18.60:3307).
Thu Sep 28 06:14:12 2017 - [warning]  relay_log_purge=0 is not set on slave test2(192.168.18.70:3307).
Thu Sep 28 06:14:12 2017 - [info] Checking replication filtering settings..
Thu Sep 28 06:14:12 2017 - [info]  binlog_do_db= AAA, binlog_ignore_db= 
Thu Sep 28 06:14:12 2017 - [info]  Replication filtering check ok.
Thu Sep 28 06:14:12 2017 - [info] GTID (with auto-pos) is not supported
Thu Sep 28 06:14:12 2017 - [info] Starting SSH connection tests..
Thu Sep 28 06:14:13 2017 - [info] All SSH connection tests passed successfully.
Thu Sep 28 06:14:13 2017 - [info] Checking MHA Node version..
Thu Sep 28 06:14:14 2017 - [info]  Version check ok.
Thu Sep 28 06:14:14 2017 - [info] Checking SSH publickey authentication settings on the current master..
Thu Sep 28 06:14:14 2017 - [info] HealthCheck: SSH to test is reachable.
Thu Sep 28 06:14:15 2017 - [info] Master MHA Node version is 0.56.
Thu Sep 28 06:14:15 2017 - [info] Checking recovery script configurations on test(192.168.18.50:3307)..
Thu Sep 28 06:14:15 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/mydata/mysql/mysql_3307 --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.56 --start_file=mybinlog.000001 
Thu Sep 28 06:14:15 2017 - [info]   Connecting to root@192.168.18.50(test:22).. 
  Creating /var/log/masterha/app1 if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /mydata/mysql/mysql_3307, up to mybinlog.000001
Thu Sep 28 06:14:15 2017 - [info] Binlog setting check done.
Thu Sep 28 06:14:15 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Sep 28 06:14:15 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test1 --slave_ip=192.168.18.60 --slave_port=3307 --workdir=/var/log/masterha/app1 --target_version=5.6.37-log --manager_version=0.56 --relay_dir=/mydata/mysql/mysql_3307 --current_relay_log=mysql-relay-bin.000002  --slave_pass=xxx
Thu Sep 28 06:14:15 2017 - [info]   Connecting to root@192.168.18.60(test1:22).. 
  Checking slave recovery environment settings..
    Relay log found at /mydata/mysql/mysql_3307, up to mysql-relay-bin.000002
    Temporary relay log file is /mydata/mysql/mysql_3307/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Sep 28 06:14:15 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test2 --slave_ip=192.168.18.70 --slave_port=3307 --workdir=/var/log/masterha/app1 --target_version=5.6.37-log --manager_version=0.56 --relay_dir=/mydata/mysql/mysql_3307 --current_relay_log=mysql-relay-bin.000002  --slave_pass=xxx
Thu Sep 28 06:14:15 2017 - [info]   Connecting to root@192.168.18.70(test2:22).. 
  Checking slave recovery environment settings..
    Relay log found at /mydata/mysql/mysql_3307, up to mysql-relay-bin.000002
    Temporary relay log file is /mydata/mysql/mysql_3307/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Sep 28 06:14:16 2017 - [info] Slaves settings check done.
Thu Sep 28 06:14:16 2017 - [info] 
test(192.168.18.50:3307) (current master)
 +--test1(192.168.18.60:3307)
 +--test2(192.168.18.70:3307)

Thu Sep 28 06:14:16 2017 - [info] Checking replication health on test1..
Thu Sep 28 06:14:16 2017 - [info]  ok.
Thu Sep 28 06:14:16 2017 - [info] Checking replication health on test2..
Thu Sep 28 06:14:16 2017 - [info]  ok.
Thu Sep 28 06:14:16 2017 - [info] Checking master_ip_failover_script status:
Thu Sep 28 06:14:16 2017 - [info]   /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 
Thu Sep 28 06:14:16 2017 - [info]  OK.
Thu Sep 28 06:14:16 2017 - [warning] shutdown_script is not defined.
Thu Sep 28 06:14:16 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.


# First bring up the VIP manually
[root@test masterha]# ./init_vip.sh 

# Start MHA
[root@test1 masterha]# masterha_manager --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf >/tmp/mha_manager.log 2>&1 &

# Check the MHA startup log
Thu Sep 28 06:27:29 2017 - [info] MHA::MasterMonitor version 0.56.
Thu Sep 28 06:27:29 2017 - [info] GTID failover mode = 0
Thu Sep 28 06:27:29 2017 - [info] Dead Servers:
Thu Sep 28 06:27:29 2017 - [info] Alive Servers:
Thu Sep 28 06:27:29 2017 - [info]   test(192.168.18.50:3307)
Thu Sep 28 06:27:29 2017 - [info]   test1(192.168.18.60:3307)
Thu Sep 28 06:27:29 2017 - [info]   test2(192.168.18.70:3307)
Thu Sep 28 06:27:29 2017 - [info] Alive Slaves:
Thu Sep 28 06:27:29 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:27:29 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:27:29 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:27:29 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:27:29 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:27:29 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:27:29 2017 - [info] Current Alive Master: test(192.168.18.50:3307)
Thu Sep 28 06:27:29 2017 - [info] Checking slave configurations..
Thu Sep 28 06:27:29 2017 - [warning]  relay_log_purge=0 is not set on slave test1(192.168.18.60:3307).
Thu Sep 28 06:27:29 2017 - [warning]  relay_log_purge=0 is not set on slave test2(192.168.18.70:3307).
Thu Sep 28 06:27:29 2017 - [info] Checking replication filtering settings..
Thu Sep 28 06:27:29 2017 - [info]  binlog_do_db= AAA, binlog_ignore_db= 
Thu Sep 28 06:27:29 2017 - [info]  Replication filtering check ok.
Thu Sep 28 06:27:29 2017 - [info] GTID (with auto-pos) is not supported
Thu Sep 28 06:27:29 2017 - [info] Starting SSH connection tests..
Thu Sep 28 06:27:31 2017 - [info] All SSH connection tests passed successfully.
Thu Sep 28 06:27:31 2017 - [info] Checking MHA Node version..
Thu Sep 28 06:27:31 2017 - [info]  Version check ok.
Thu Sep 28 06:27:31 2017 - [info] Checking SSH publickey authentication settings on the current master..
Thu Sep 28 06:27:31 2017 - [info] HealthCheck: SSH to test is reachable.
Thu Sep 28 06:27:31 2017 - [info] Master MHA Node version is 0.56.
Thu Sep 28 06:27:31 2017 - [info] Checking recovery script configurations on test(192.168.18.50:3307)..
Thu Sep 28 06:27:31 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/mydata/mysql/mysql_3307 --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.56 --start_file=mybinlog.000001 
Thu Sep 28 06:27:31 2017 - [info]   Connecting to root@192.168.18.50(test:22).. 
  Creating /var/log/masterha/app1 if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /mydata/mysql/mysql_3307, up to mybinlog.000001
Thu Sep 28 06:27:31 2017 - [info] Binlog setting check done.
Thu Sep 28 06:27:31 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Sep 28 06:27:31 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test1 --slave_ip=192.168.18.60 --slave_port=3307 --workdir=/var/log/masterha/app1 --target_version=5.6.37-log --manager_version=0.56 --relay_dir=/mydata/mysql/mysql_3307 --current_relay_log=mysql-relay-bin.000002  --slave_pass=xxx
Thu Sep 28 06:27:31 2017 - [info]   Connecting to root@192.168.18.60(test1:22).. 
  Checking slave recovery environment settings..
    Relay log found at /mydata/mysql/mysql_3307, up to mysql-relay-bin.000002
    Temporary relay log file is /mydata/mysql/mysql_3307/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Sep 28 06:27:32 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test2 --slave_ip=192.168.18.70 --slave_port=3307 --workdir=/var/log/masterha/app1 --target_version=5.6.37-log --manager_version=0.56 --relay_dir=/mydata/mysql/mysql_3307 --current_relay_log=mysql-relay-bin.000002  --slave_pass=xxx
Thu Sep 28 06:27:32 2017 - [info]   Connecting to root@192.168.18.70(test2:22).. 
  Checking slave recovery environment settings..
    Relay log found at /mydata/mysql/mysql_3307, up to mysql-relay-bin.000002
    Temporary relay log file is /mydata/mysql/mysql_3307/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Sep 28 06:27:32 2017 - [info] Slaves settings check done.
Thu Sep 28 06:27:32 2017 - [info] 
test(192.168.18.50:3307) (current master)
 +--test1(192.168.18.60:3307)
 +--test2(192.168.18.70:3307)

Thu Sep 28 06:27:32 2017 - [info] Checking master_ip_failover_script status:
Thu Sep 28 06:27:32 2017 - [info]   /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 
Thu Sep 28 06:27:32 2017 - [info]  OK.
Thu Sep 28 06:27:32 2017 - [warning] shutdown_script is not defined.
Thu Sep 28 06:27:32 2017 - [info] Set master ping interval 1 seconds.
Thu Sep 28 06:27:32 2017 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Thu Sep 28 06:27:32 2017 - [info] Starting ping health check on test(192.168.18.50:3307)..
Thu Sep 28 06:27:32 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

# Failover test: stop the MySQL service on the test server
# Check the failover log
Thu Sep 28 06:32:42 2017 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Thu Sep 28 06:32:42 2017 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/mydata/mysql/mysql_3307 --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.56 --binlog_prefix=mybinlog
Thu Sep 28 06:32:42 2017 - [info] HealthCheck: SSH to test is reachable.
Thu Sep 28 06:32:43 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Sep 28 06:32:43 2017 - [warning] Connection failed 2 time(s)..
Thu Sep 28 06:32:44 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Sep 28 06:32:44 2017 - [warning] Connection failed 3 time(s)..
Thu Sep 28 06:32:45 2017 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Thu Sep 28 06:32:45 2017 - [warning] Connection failed 4 time(s)..
Thu Sep 28 06:32:45 2017 - [warning] Master is not reachable from health checker!
Thu Sep 28 06:32:45 2017 - [warning] Master test(192.168.18.50:3307) is not reachable!
Thu Sep 28 06:32:45 2017 - [warning] SSH is reachable.
Thu Sep 28 06:32:45 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha/masterha_default.conf and /etc/masterha/app1.conf again, and trying to connect to all servers to check server status..
Thu Sep 28 06:32:45 2017 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Thu Sep 28 06:32:45 2017 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Thu Sep 28 06:32:45 2017 - [info] Reading server configuration from /etc/masterha/app1.conf..
Thu Sep 28 06:32:45 2017 - [info] GTID failover mode = 0
Thu Sep 28 06:32:45 2017 - [info] Dead Servers:
Thu Sep 28 06:32:45 2017 - [info]   test(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info] Alive Servers:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)
Thu Sep 28 06:32:45 2017 - [info] Alive Slaves:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:45 2017 - [info] Checking slave configurations..
Thu Sep 28 06:32:45 2017 - [warning]  relay_log_purge=0 is not set on slave test1(192.168.18.60:3307).
Thu Sep 28 06:32:45 2017 - [warning]  relay_log_purge=0 is not set on slave test2(192.168.18.70:3307).
Thu Sep 28 06:32:45 2017 - [info] Checking replication filtering settings..
Thu Sep 28 06:32:45 2017 - [info]  Replication filtering check ok.
Thu Sep 28 06:32:45 2017 - [info] Master is down!
Thu Sep 28 06:32:45 2017 - [info] Terminating monitoring script.
Thu Sep 28 06:32:45 2017 - [info] Got exit code 20 (Master dead).
Thu Sep 28 06:32:45 2017 - [info] MHA::MasterFailover version 0.56.
Thu Sep 28 06:32:45 2017 - [info] Starting master failover.
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] GTID failover mode = 0
Thu Sep 28 06:32:45 2017 - [info] Dead Servers:
Thu Sep 28 06:32:45 2017 - [info]   test(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info] Checking master reachability via MySQL(double check)...
Thu Sep 28 06:32:45 2017 - [info]  ok.
Thu Sep 28 06:32:45 2017 - [info] Alive Servers:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)
Thu Sep 28 06:32:45 2017 - [info] Alive Slaves:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:45 2017 - [info] Starting Non-GTID based failover.
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Sep 28 06:32:45 2017 - [info] Executing master IP deactivation script:
Thu Sep 28 06:32:45 2017 - [info]   /etc/masterha/master_ip_failover --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 --command=stopssh --ssh_user=root  
Thu Sep 28 06:32:45 2017 - [info]  done.
Thu Sep 28 06:32:45 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Sep 28 06:32:45 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 3: Master Recovery Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] The latest binary log file/position on all slaves is mybinlog.000001:1048
Thu Sep 28 06:32:45 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:45 2017 - [info] The oldest binary log file/position on all slaves is mybinlog.000001:1048
Thu Sep 28 06:32:45 2017 - [info] Oldest slaves:
Thu Sep 28 06:32:45 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:45 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:45 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:45 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Thu Sep 28 06:32:45 2017 - [info] 
Thu Sep 28 06:32:45 2017 - [info] Fetching dead master's binary logs..
Thu Sep 28 06:32:45 2017 - [info] Executing command on the dead master test(192.168.18.50:3307): save_binary_logs --command=save --start_file=mybinlog.000001  --start_pos=1048 --binlog_dir=/mydata/mysql/mysql_3307 --output_file=/var/log/masterha/app1/saved_master_binlog_from_test_3307_20170928063245.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.56
  Creating /var/log/masterha/app1 if not exists..    ok.
 Concat binary/relay logs from mybinlog.000001 pos 1048 to mybinlog.000001 EOF into /var/log/masterha/app1/saved_master_binlog_from_test_3307_20170928063245.binlog ..
 Binlog Checksum enabled
  Dumping binlog format description event, from position 0 to 120.. ok.
  No need to dump effective binlog data from /mydata/mysql/mysql_3307/mybinlog.000001 (pos starts 1048, filesize 1048). Skipping.
sh: mysqlbinlog: command not found
Failed to save binary log: /var/log/masterha/app1/saved_master_binlog_from_test_3307_20170928063245.binlog is broken!
 at /usr/bin/save_binary_logs line 176
Thu Sep 28 06:32:46 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln760] Failed to save binary log events from the orig master. Maybe disks on binary logs are not accessible or binary log itself is corrupt?
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 3.3: Determining New Master Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Thu Sep 28 06:32:46 2017 - [info] All slaves received relay logs to the same position. No need to resync each other.
Thu Sep 28 06:32:46 2017 - [info] Searching new master from slaves..
Thu Sep 28 06:32:46 2017 - [info]  Candidate masters from the configuration file:
Thu Sep 28 06:32:46 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:46 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:46 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 06:32:46 2017 - [info]  Non-candidate masters:
Thu Sep 28 06:32:46 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 06:32:46 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 06:32:46 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 06:32:46 2017 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Thu Sep 28 06:32:46 2017 - [info] New master is test2(192.168.18.70:3307)
Thu Sep 28 06:32:46 2017 - [info] Starting master failover..
Thu Sep 28 06:32:46 2017 - [info] 
From:
test(192.168.18.50:3307) (current master)
 +--test1(192.168.18.60:3307)
 +--test2(192.168.18.70:3307)

To:
test2(192.168.18.70:3307) (new master)
 +--test1(192.168.18.60:3307)
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 3.4: Master Log Apply Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Thu Sep 28 06:32:46 2017 - [info] Starting recovery on test2(192.168.18.70:3307)..
Thu Sep 28 06:32:46 2017 - [info]  This server has all relay logs. Waiting all logs to be applied.. 
Thu Sep 28 06:32:46 2017 - [info]   done.
Thu Sep 28 06:32:46 2017 - [info]  All relay logs were successfully applied.
Thu Sep 28 06:32:46 2017 - [info] Getting new master's binlog name and position..
Thu Sep 28 06:32:46 2017 - [info]  mybinlog.000003:847
Thu Sep 28 06:32:46 2017 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='test2 or 192.168.18.70', MASTER_PORT=3307, MASTER_LOG_FILE='mybinlog.000003', MASTER_LOG_POS=847, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Sep 28 06:32:46 2017 - [info] Executing master IP activate script:
Thu Sep 28 06:32:46 2017 - [info]   /etc/masterha/master_ip_failover --command=start --ssh_user=root --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 --new_master_host=test2 --new_master_ip=192.168.18.70 --new_master_port=3307 --new_master_user='mha' --new_master_password='123456'  
Set read_only=0 on the new master.
Thu Sep 28 06:32:46 2017 - [info]  OK.
Thu Sep 28 06:32:46 2017 - [info] ** Finished master recovery successfully.
Thu Sep 28 06:32:46 2017 - [info] * Phase 3: Master Recovery Phase completed.
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 4: Slaves Recovery Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] -- Slave diff file generation on host test1(192.168.18.60:3307) started, pid: 5914. Check tmp log /var/log/masterha/app1/test1_3307_20170928063245.log if it takes time..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Log messages from test1 ...
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Thu Sep 28 06:32:46 2017 - [info] End of log messages from test1.
Thu Sep 28 06:32:46 2017 - [info] -- test1(192.168.18.60:3307) has the latest relay log events.
Thu Sep 28 06:32:46 2017 - [info] Generating relay diff files from the latest slave succeeded.
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] -- Slave recovery on host test1(192.168.18.60:3307) started, pid: 5916. Check tmp log /var/log/masterha/app1/test1_3307_20170928063245.log if it takes time..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Log messages from test1 ...
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Starting recovery on test1(192.168.18.60:3307)..
Thu Sep 28 06:32:46 2017 - [info]  This server has all relay logs. Waiting all logs to be applied.. 
Thu Sep 28 06:32:46 2017 - [info]   done.
Thu Sep 28 06:32:46 2017 - [info]  All relay logs were successfully applied.
Thu Sep 28 06:32:46 2017 - [info]  Resetting slave test1(192.168.18.60:3307) and starting replication from the new master test2(192.168.18.70:3307)..
Thu Sep 28 06:32:46 2017 - [info]  Executed CHANGE MASTER.
Thu Sep 28 06:32:46 2017 - [info]  Slave started.
Thu Sep 28 06:32:46 2017 - [info] End of log messages from test1.
Thu Sep 28 06:32:46 2017 - [info] -- Slave recovery on host test1(192.168.18.60:3307) succeeded.
Thu Sep 28 06:32:46 2017 - [info] All new slave servers recovered successfully.
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] * Phase 5: New master cleanup phase..
Thu Sep 28 06:32:46 2017 - [info] 
Thu Sep 28 06:32:46 2017 - [info] Resetting slave info on the new master..
Thu Sep 28 06:32:47 2017 - [info]  test2: Resetting slave info succeeded.
Thu Sep 28 06:32:47 2017 - [info] Master failover to test2(192.168.18.70:3307) completed successfully.
Thu Sep 28 06:32:47 2017 - [info] 

----- Failover Report -----

app1: MySQL Master failover test(192.168.18.50:3307) to test2(192.168.18.70:3307) succeeded

Master test(192.168.18.50:3307) is down!

Check MHA Manager logs at test1:/var/log/masterha/app1/app1.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on test(192.168.18.50:3307)
The latest slave test1(192.168.18.60:3307) has all relay logs for recovery.
Selected test2(192.168.18.70:3307) as a new master.
test2(192.168.18.70:3307): OK: Applying all logs succeeded.
test2(192.168.18.70:3307): OK: Activated master IP address.
test1(192.168.18.60:3307): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
test1(192.168.18.60:3307): OK: Applying all logs succeeded. Slave started, replicating from test2(192.168.18.70:3307)
test2(192.168.18.70:3307): Resetting slave info succeeded.
Master failover to test2(192.168.18.70:3307) completed successfully.
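
As noted earlier, the manager process exits once a failover completes, and the app1.failover.complete marker in the manager working directory has to be deleted by hand before masterha_manager can handle another failover. A minimal sketch of that cleanup follows. It assumes the working directory is /var/log/masterha/app1 (the directory seen in the log paths above) and that the application name is app1. The function name clear_failover_marker is hypothetical and not part of MHA.

```shell
#!/bin/sh
# Remove the failover marker left behind in the MHA Manager working directory.
# Assumptions (adjust to your own manager_workdir and application name):
#   $1 = working directory, default /var/log/masterha/app1
#   $2 = application name,  default app1
clear_failover_marker() {
    workdir="${1:-/var/log/masterha/app1}"
    app="${2:-app1}"
    marker="$workdir/$app.failover.complete"
    if [ -f "$marker" ]; then
        # Delete the marker so the manager can run another failover
        rm -f -- "$marker" && echo "removed $marker"
    else
        echo "no marker at $marker"
    fi
}
```

Calling clear_failover_marker with no arguments uses the defaults above. After that, masterha_manager can be started again in the usual way.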

# Check the running status of the manager
[root@test1 masterha]# masterha_check_status --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf 
app1 (pid:11183) is running(0:PING_OK), master:test
# Stop MHA
[root@test1 masterha]# masterha_stop --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf


# Manual online switch; masterha_manager must be stopped first
[root@test1 masterha]# masterha_master_switch --master_state=alive --global_conf=masterha_default.conf --conf=app1.conf 
Thu Sep 28 09:45:57 2017 - [info] MHA::MasterRotate version 0.56.
Thu Sep 28 09:45:57 2017 - [info] Starting online master switch..
Thu Sep 28 09:45:57 2017 - [info] 
Thu Sep 28 09:45:57 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Sep 28 09:45:57 2017 - [info] 
Thu Sep 28 09:45:57 2017 - [info] Reading default configuration from masterha_default.conf..
Thu Sep 28 09:45:57 2017 - [info] Reading application default configuration from app1.conf..
Thu Sep 28 09:45:57 2017 - [info] Reading server configuration from app1.conf..
Thu Sep 28 09:45:57 2017 - [info] GTID failover mode = 0
Thu Sep 28 09:45:57 2017 - [info] Current Alive Master: test(192.168.18.50:3307)
Thu Sep 28 09:45:57 2017 - [info] Alive Slaves:
Thu Sep 28 09:45:57 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 09:45:57 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 09:45:57 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 09:45:57 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 09:45:57 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 09:45:57 2017 - [info]     Primary candidate for the new Master (candidate_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on test(192.168.18.50:3307)? (YES/no): y
Thu Sep 28 09:45:59 2017 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Thu Sep 28 09:45:59 2017 - [info]  ok.
Thu Sep 28 09:45:59 2017 - [info] Checking MHA is not monitoring or doing failover..
Thu Sep 28 09:45:59 2017 - [info] Checking replication health on test1..
Thu Sep 28 09:45:59 2017 - [info]  ok.
Thu Sep 28 09:45:59 2017 - [info] Checking replication health on test2..
Thu Sep 28 09:45:59 2017 - [info]  ok.
Thu Sep 28 09:45:59 2017 - [info] Searching new master from slaves..
Thu Sep 28 09:45:59 2017 - [info]  Candidate masters from the configuration file:
Thu Sep 28 09:45:59 2017 - [info]   test(192.168.18.50:3307)  Version=5.6.37-log log-bin:enabled
Thu Sep 28 09:45:59 2017 - [info]   test2(192.168.18.70:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 09:45:59 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 09:45:59 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 28 09:45:59 2017 - [info]  Non-candidate masters:
Thu Sep 28 09:45:59 2017 - [info]   test1(192.168.18.60:3307)  Version=5.6.37-log (oldest major version between slaves) log-bin:enabled
Thu Sep 28 09:45:59 2017 - [info]     Replicating from 192.168.18.50(192.168.18.50:3307)
Thu Sep 28 09:45:59 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 28 09:45:59 2017 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Thu Sep 28 09:45:59 2017 - [info] 
From:
test(192.168.18.50:3307) (current master)
 +--test1(192.168.18.60:3307)
 +--test2(192.168.18.70:3307)

To:
test2(192.168.18.70:3307) (new master)
 +--test1(192.168.18.60:3307)

Starting master switch from test(192.168.18.50:3307) to test2(192.168.18.70:3307)? (yes/NO): y
Thu Sep 28 09:46:00 2017 - [info] Checking whether test2(192.168.18.70:3307) is ok for the new master..
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] * Phase 2: Rejecting updates Phase..
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] Executing master ip online change script to disable write on the current master:
Thu Sep 28 09:46:00 2017 - [info]   /etc/masterha/master_ip_online_change --command=stop --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 --orig_master_user='mha' --orig_master_password='123456' --new_master_host=test2 --new_master_ip=192.168.18.70 --new_master_port=3307 --new_master_user='mha' --new_master_password='123456' --orig_master_ssh_user=root --new_master_ssh_user=root  
Thu Sep 28 09:46:00 2017 120078 Set read_only on the new master.. ok.
Thu Sep 28 09:46:00 2017 124819 drop vip 192.168.18.100..
Thu Sep 28 09:46:00 2017 210095 Set read_only=1 on the orig master.. ok.
Thu Sep 28 09:46:00 2017 211292 Killing all application threads..
Thu Sep 28 09:46:00 2017 211318 done.
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Thu Sep 28 09:46:00 2017 - [info] Executing FLUSH TABLES WITH READ LOCK..
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] Orig master binlog:pos is mybinlog.000001:514.
Thu Sep 28 09:46:00 2017 - [info]  Waiting to execute all relay logs on test2(192.168.18.70:3307)..
Thu Sep 28 09:46:00 2017 - [info]  master_pos_wait(mybinlog.000001:514) completed on test2(192.168.18.70:3307). Executed 0 events.
Thu Sep 28 09:46:00 2017 - [info]   done.
Thu Sep 28 09:46:00 2017 - [info] Getting new master's binlog name and position..
Thu Sep 28 09:46:00 2017 - [info]  mybinlog.000006:120
Thu Sep 28 09:46:00 2017 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='test2 or 192.168.18.70', MASTER_PORT=3307, MASTER_LOG_FILE='mybinlog.000006', MASTER_LOG_POS=120, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Sep 28 09:46:00 2017 - [info] Executing master ip online change script to allow write on the new master:
Thu Sep 28 09:46:00 2017 - [info]   /etc/masterha/master_ip_online_change --command=start --orig_master_host=test --orig_master_ip=192.168.18.50 --orig_master_port=3307 --orig_master_user='mha' --orig_master_password='123456' --new_master_host=test2 --new_master_ip=192.168.18.70 --new_master_port=3307 --new_master_user='mha' --new_master_password='123456' --orig_master_ssh_user=root --new_master_ssh_user=root  
Thu Sep 28 09:46:00 2017 410159 Set read_only=0 on the new master.
Thu Sep 28 09:46:00 2017 410846Add vip 192.168.18.100 on eth1..
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] * Switching slaves in parallel..
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] -- Slave switch on host test1(192.168.18.60:3307) started, pid: 12970
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] Log messages from test1 ...
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info]  Waiting to execute all relay logs on test1(192.168.18.60:3307)..
Thu Sep 28 09:46:00 2017 - [info]  master_pos_wait(mybinlog.000001:514) completed on test1(192.168.18.60:3307). Executed 0 events.
Thu Sep 28 09:46:00 2017 - [info]   done.
Thu Sep 28 09:46:00 2017 - [info]  Resetting slave test1(192.168.18.60:3307) and starting replication from the new master test2(192.168.18.70:3307)..
Thu Sep 28 09:46:00 2017 - [info]  Executed CHANGE MASTER.
Thu Sep 28 09:46:00 2017 - [info]  Slave started.
Thu Sep 28 09:46:00 2017 - [info] End of log messages from test1 ...
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] -- Slave switch on host test1(192.168.18.60:3307) succeeded.
Thu Sep 28 09:46:00 2017 - [info] Unlocking all tables on the orig master:
Thu Sep 28 09:46:00 2017 - [info] Executing UNLOCK TABLES..
Thu Sep 28 09:46:00 2017 - [info]  ok.
Thu Sep 28 09:46:00 2017 - [info] All new slave servers switched successfully.
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:00 2017 - [info] * Phase 5: New master cleanup phase..
Thu Sep 28 09:46:00 2017 - [info] 
Thu Sep 28 09:46:02 2017 - [info]  test2: Resetting slave info succeeded.
Thu Sep 28 09:46:02 2017 - [info] Switching master to test2(192.168.18.70:3307) completed successfully.