Learning from MHA's Failover Design

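This post records a split-brain test of an MHA MySQL cluster when a node fails, covering the health check, how a network problem is identified, and the detailed steps of automatic failover to a standby node. It walks through a GTID-based master switch using real logs and shows both the automatic and the manual switch flow.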

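MHA is a classic MySQL high-availability solution, and some of its design ideas are worth studying. This post records the conclusions from my experiments.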

Topology
192.168.50.10 node1
192.168.50.11 node2
192.168.50.12 node3
192.168.50.13 master(manager)

  • Split-brain test
Wed Jan 13 23:06:03 2021 - [info] Set master ping interval 1 seconds.
Wed Jan 13 23:06:03 2021 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s 192.168.50.10 -s 192.168.56.11 -s 192.168.50.10              
Wed Jan 13 23:06:03 2021 - [info] Starting ping health check on 192.168.50.10(192.168.50.10:3308)..
Wed Jan 13 23:06:03 2021 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

A node anomaly is detected and the checks begin
Wed Jan 13 23:09:24 2021 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)  
Wed Jan 13 23:09:24 2021 - [info] Executing SSH check script: exit 0

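Following the secondary check rules, the manager tries to reach the failed host from each configured monitoring node in turn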
Wed Jan 13 23:09:25 2021 - [info] HealthCheck: SSH to 192.168.50.10 is reachable.
Monitoring server 192.168.50.10 is reachable, Master is not reachable from 192.168.50.10. OK.
Wed Jan 13 23:09:25 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Wed Jan 13 23:09:25 2021 - [warning] Connection failed 2 time(s)..
Wed Jan 13 23:09:26 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Wed Jan 13 23:09:26 2021 - [warning] Connection failed 3 time(s)..
Wed Jan 13 23:09:27 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Wed Jan 13 23:09:27 2021 - [warning] Connection failed 4 time(s)..
ssh: connect to host 192.168.56.11 port 22: Connection timed out
Monitoring server 192.168.56.11 is NOT reachable!

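The monitoring server 192.168.56.11 stays unreachable, so the state of the host that left the cluster cannot be confirmed. MHA treats this as a network problem and skips failover to avoid split-brain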
Wed Jan 13 23:09:30 2021 - [warning] At least one of monitoring servers is not reachable from this script. This is likely a network problem. Failover should not happen.
Wed Jan 13 23:09:30 2021 - [warning] Secondary network check script returned errors. Failover should not start so checking server status again. Check network settings for details.
Wed Jan 13 23:09:30 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Wed Jan 13 23:09:30 2021 - [warning] Connection failed 1 time(s)..
Wed Jan 13 23:09:30 2021 - [info] Executing SSH check script: exit 0
Wed Jan 13 23:09:31 2021 - [info] HealthCheck: SSH to 192.168.50.10 is reachable.
Monitoring server 192.168.50.10 is reachable, Master is not reachable from 192.168.50.10. OK.
Wed Jan 13 23:09:31 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Wed Jan 13 23:09:31 2021 - [warning] Connection failed 2 time(s)..
Wed Jan 13 23:09:32 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Wed Jan 13 23:09:32 2021 - [warning] Connection failed 3 time(s)..
Wed Jan 13 23:09:33 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Wed Jan 13 23:09:33 2021 - [warning] Connection failed 4 time(s)..
ssh: connect to host 192.168.56.11 port 22: Connection timed out

The same actions repeat in a loop, trying to reach the target machine from the hosts defined in the secondary check rules
Monitoring server 192.168.56.11 is NOT reachable!
Wed Jan 13 23:09:36 2021 - [warning] At least one of monitoring servers is not reachable from this script. This is likely a network problem. Failover should not happen.
Wed Jan 13 23:09:36 2021 - [warning] Secondary network check script returned errors. Failover should not start so checking server status again. Check network settings for details.
Wed Jan 13 23:09:36 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Wed Jan 13 23:09:36 2021 - [warning] Connection failed 1 time(s)..
Wed Jan 13 23:09:36 2021 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 192.168.50.10 -s 192.168.56.11 -s 192.168.50.10  --user=root  --master_host=192.168.50.10  --master_ip=192.168.50.10  --master_port=3308 --master_user=dba_mha --master_password=123456 --ping_type=SELECT
Wed Jan 13 23:09:36 2021 - [info] Executing SSH check script: exit 0
Wed Jan 13 23:09:36 2021 - [info] HealthCheck: SSH to 192.168.50.10 is reachable.
Monitoring server 192.168.50.10 is reachable, Master is not reachable from 192.168.50.10. OK.

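This early architecture does not use a Paxos/Raft election algorithm, so split-brain is theoretically possible. MHA makes a concession here, sacrificing functionality for safety: when it cannot rule out a network problem, it does not fail over.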
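
The decision visible in the logs above can be sketched as a small function. This is only an illustration of the rule as it appears in this test, not the code of masterha_secondary_check; the names `Probe` and `secondary_check` and the three result strings are made up for the example.

```python
from typing import Dict, NamedTuple


class Probe(NamedTuple):
    """Result of checking one monitoring server (hypothetical structure)."""
    monitor_reachable: bool  # could the manager reach the monitoring server over SSH?
    master_reachable: bool   # could that server connect to the master's MySQL port?


def secondary_check(probes: Dict[str, Probe]) -> str:
    """Return 'network_problem', 'master_alive' or 'failover'."""
    # If any monitoring server cannot be reached, the manager itself may be
    # the isolated party, so failover must not start.
    if any(not p.monitor_reachable for p in probes.values()):
        return "network_problem"
    # If any monitoring server still sees the master, the master is alive.
    if any(p.master_reachable for p in probes.values()):
        return "master_alive"
    # Every monitoring server is reachable and none of them sees the master.
    return "failover"


split_brain_test = {
    "192.168.50.10": Probe(monitor_reachable=True, master_reachable=False),
    "192.168.56.11": Probe(monitor_reachable=False, master_reachable=False),
}
print(secondary_check(split_brain_test))
```

With the addresses from the split-brain test, the unreachable 192.168.56.11 is enough to block failover, which is the behaviour the log shows.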

  • Automatic failover
Thu Jan 14 04:27:42 2021 - [info] Set master ping interval 1 seconds.
Thu Jan 14 04:27:42 2021 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s 192.168.50.10 -s 192.168.50.11 -s 192.168.50.10
Thu Jan 14 04:27:42 2021 - [info] Starting ping health check on 192.168.50.10(192.168.50.10:3308)..
Thu Jan 14 04:27:42 2021 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

A node anomaly is detected
Thu Jan 14 04:28:04 2021 - [warning] Got error on MySQL select ping: 1053 (Server shutdown in progress)
Thu Jan 14 04:28:04 2021 - [info] Executing SSH check script: exit 0
Thu Jan 14 04:28:04 2021 - [info] HealthCheck: SSH to 192.168.50.10 is reachable.

SSH connectivity is normal
Monitoring server 192.168.50.10 is reachable, Master is not reachable from 192.168.50.10. OK.
Monitoring server 192.168.50.11 is reachable, Master is not reachable from 192.168.50.11. OK.

MySQL connectivity is abnormal
Thu Jan 14 04:28:05 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Thu Jan 14 04:28:05 2021 - [warning] Connection failed 2 time(s)..
Monitoring server 192.168.50.10 is reachable, Master is not reachable from 192.168.50.10. OK.
Thu Jan 14 04:28:05 2021 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Thu Jan 14 04:28:06 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Thu Jan 14 04:28:06 2021 - [warning] Connection failed 3 time(s)..
Thu Jan 14 04:28:07 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.50.10' (111))
Thu Jan 14 04:28:07 2021 - [warning] Connection failed 4 time(s)..
Thu Jan 14 04:28:07 2021 - [warning] Master is not reachable from health checker!
Thu Jan 14 04:28:07 2021 - [warning] Master 192.168.50.10(192.168.50.10:3308) is not reachable!
Thu Jan 14 04:28:07 2021 - [warning] SSH is reachable.
Thu Jan 14 04:28:07 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 14 04:28:07 2021 - [info] Reading application default configuration from /home/mha/mysql-mha.conf..
Thu Jan 14 04:28:07 2021 - [info] Reading server configuration from /home/mha/mysql-mha.conf..
Thu Jan 14 04:28:08 2021 - [info] GTID failover mode = 1
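
The retry pattern in the log above (a 1 second ping interval, then "Connection failed 4 time(s)" before the master is declared unreachable) might look roughly like this. The function name, the `ping` callback and the defaults are assumptions read off this log, not MHA's actual implementation.

```python
import time
from typing import Callable


def master_looks_dead(ping: Callable[[], bool],
                      max_failures: int = 4,
                      interval: float = 1.0) -> bool:
    """Return True once `ping` has failed `max_failures` times in a row.

    `ping` is a hypothetical callback that returns True when the master
    answers. Any success ends the check immediately with False.
    """
    failures = 0
    while True:
        if ping():
            return False
        failures += 1
        if failures >= max_failures:
            return True
        time.sleep(interval)  # wait one ping interval before retrying
```

In the real run this counter only matters together with the secondary check: in the split-brain test the counter reached 4 as well, but failover was still skipped.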

The node is judged dead
Thu Jan 14 04:28:08 2021 - [info] Dead Servers:
Thu Jan 14 04:28:08 2021 - [info]   192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:08 2021 - [info] Alive Servers:
Thu Jan 14 04:28:08 2021 - [info]   192.168.50.11(192.168.50.11:3308)
Thu Jan 14 04:28:08 2021 - [info]   192.168.50.12(192.168.50.12:3308)

Surviving nodes that take part in the election
Thu Jan 14 04:28:08 2021 - [info] Alive Slaves:
Thu Jan 14 04:28:08 2021 - [info]   192.168.50.11(192.168.50.11:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled
Thu Jan 14 04:28:08 2021 - [info]     GTID ON
Thu Jan 14 04:28:08 2021 - [info]     Replicating from 192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:08 2021 - [info]   192.168.50.12(192.168.50.12:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled
Thu Jan 14 04:28:08 2021 - [info]     GTID ON
Thu Jan 14 04:28:08 2021 - [info]     Replicating from 192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:08 2021 - [info] Checking slave configurations..
Thu Jan 14 04:28:08 2021 - [info]  read_only=1 is not set on slave 192.168.50.12(192.168.50.12:3308).
Thu Jan 14 04:28:08 2021 - [info] Checking replication filtering settings..
Thu Jan 14 04:28:08 2021 - [info]  Replication filtering check ok.
Thu Jan 14 04:28:08 2021 - [info] Master is down!
Thu Jan 14 04:28:08 2021 - [info] Terminating monitoring script.
Thu Jan 14 04:28:08 2021 - [info] Got exit code 20 (Master dead).
Thu Jan 14 04:28:08 2021 - [info] MHA::MasterFailover version 0.57.

Start switching to a new master
Thu Jan 14 04:28:08 2021 - [info] Starting master failover.
Thu Jan 14 04:28:08 2021 - [info]
Thu Jan 14 04:28:08 2021 - [info] * Phase 1: Configuration Check Phase..
Thu Jan 14 04:28:08 2021 - [info]
Thu Jan 14 04:28:09 2021 - [info] GTID failover mode = 1
Thu Jan 14 04:28:09 2021 - [info] Dead Servers:
Thu Jan 14 04:28:09 2021 - [info]   192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:09 2021 - [info] Checking master reachability via MySQL(double check)...
Thu Jan 14 04:28:09 2021 - [info]  ok.
Thu Jan 14 04:28:09 2021 - [info] Alive Servers:
Thu Jan 14 04:28:09 2021 - [info]   192.168.50.11(192.168.50.11:3308)
Thu Jan 14 04:28:09 2021 - [info]   192.168.50.12(192.168.50.12:3308)
Thu Jan 14 04:28:09 2021 - [info] Alive Slaves:
Thu Jan 14 04:28:09 2021 - [info]   192.168.50.11(192.168.50.11:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled
Thu Jan 14 04:28:09 2021 - [info]     GTID ON
Thu Jan 14 04:28:09 2021 - [info]     Replicating from 192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:09 2021 - [info]   192.168.50.12(192.168.50.12:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled
Thu Jan 14 04:28:09 2021 - [info]     GTID ON
Thu Jan 14 04:28:09 2021 - [info]     Replicating from 192.168.50.10(192.168.50.10:3308)

Fail over based on GTID
Thu Jan 14 04:28:09 2021 - [info] Starting GTID based failover.
Thu Jan 14 04:28:09 2021 - [info]
Thu Jan 14 04:28:09 2021 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Jan 14 04:28:09 2021 - [info]
Thu Jan 14 04:28:09 2021 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Jan 14 04:28:09 2021 - [info]

Thu Jan 14 04:28:09 2021 - [info]   /usr/bin/master_ip_failover --orig_master_host=192.168.50.10 --orig_master_ip=192.168.50.10 --orig_master_port=3308 --command=stopssh --ssh_user=root

Remove the VIP from the old master
Disabling the VIP on old master: 192.168.50.10
Thu Jan 14 04:28:10 2021 - [info]  done.
Thu Jan 14 04:28:10 2021 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Jan 14 04:28:10 2021 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Jan 14 04:28:10 2021 - [info]
Thu Jan 14 04:28:10 2021 - [info] * Phase 3: Master Recovery Phase..
Thu Jan 14 04:28:10 2021 - [info]
Thu Jan 14 04:28:10 2021 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Jan 14 04:28:10 2021 - [info]

Compare the slaves to find the most recent position
Thu Jan 14 04:28:10 2021 - [info] The latest binary log file/position on all slaves is master-binlog.000006:948
Thu Jan 14 04:28:10 2021 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Jan 14 04:28:10 2021 - [info]   192.168.50.11(192.168.50.11:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled
Thu Jan 14 04:28:10 2021 - [info]     GTID ON
Thu Jan 14 04:28:10 2021 - [info]     Replicating from 192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:10 2021 - [info]   192.168.50.12(192.168.50.12:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled
Thu Jan 14 04:28:10 2021 - [info]     GTID ON
Thu Jan 14 04:28:10 2021 - [info]     Replicating from 192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:10 2021 - [info] The oldest binary log file/position on all slaves is master-binlog.000006:948
Thu Jan 14 04:28:10 2021 - [info] Oldest slaves:
Thu Jan 14 04:28:10 2021 - [info]   192.168.50.11(192.168.50.11:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled
Thu Jan 14 04:28:10 2021 - [info]     GTID ON
Thu Jan 14 04:28:10 2021 - [info]     Replicating from 192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:10 2021 - [info]   192.168.50.12(192.168.50.12:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled
Thu Jan 14 04:28:10 2021 - [info]     GTID ON
Thu Jan 14 04:28:10 2021 - [info]     Replicating from 192.168.50.10(192.168.50.10:3308)
Thu Jan 14 04:28:10 2021 - [info]
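
As a rough sketch of this comparison, the `file:position` coordinates printed in the log can be compared as (file sequence number, offset) pairs. The helper names are hypothetical, and the sketch assumes every slave reports the same binlog base name.

```python
from typing import Dict, List, Tuple


def parse_coord(coord: str) -> Tuple[int, int]:
    """Turn 'master-binlog.000006:948' into (6, 948)."""
    file_name, pos = coord.rsplit(":", 1)
    sequence = file_name.rsplit(".", 1)[1]
    return int(sequence), int(pos)


def latest_slaves(received: Dict[str, str]) -> List[str]:
    """Return the slaves whose received master coordinates are the newest."""
    if not received:
        return []
    newest = max(parse_coord(c) for c in received.values())
    return sorted(h for h, c in received.items() if parse_coord(c) == newest)


received = {
    "192.168.50.11": "master-binlog.000006:948",
    "192.168.50.12": "master-binlog.000006:948",
}
print(latest_slaves(received))
```

In this test both slaves had received up to master-binlog.000006:948, so both count as "latest" and as "oldest" at the same time, exactly as the log lists them.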

Decide the new master
Thu Jan 14 04:28:10 2021 - [info] * Phase 3.3: Determining New Master Phase..
Thu Jan 14 04:28:10 2021 - [info]
Thu Jan 14 04:28:10 2021 - [info] Searching new master from slaves..
Thu Jan 14 04:28:10 2021 - [info]  Candidate masters from the configuration file:
Thu Jan 14 04:28:10 2021 - [info]  Non-candidate masters:

Election succeeded
Thu Jan 14 04:28:10 2021 - [info] New master is 192.168.50.11(192.168.50.11:3308)
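
A simplified sketch of how such a choice could be made: prefer a configured candidate among the latest slaves, otherwise take the first eligible one. The `candidates` and `banned` sets loosely correspond to MHA's `candidate_master` and `no_master` settings; the function itself is hypothetical and leaves out other criteria the real tool may apply.

```python
from typing import List, Optional, Set


def pick_new_master(latest: List[str],
                    candidates: Set[str],
                    banned: Set[str]) -> Optional[str]:
    """Pick a new master from the slaves that hold the latest data."""
    eligible = [host for host in latest if host not in banned]
    # A host marked as preferred wins if it is among the eligible slaves.
    for host in eligible:
        if host in candidates:
            return host
    # Otherwise fall back to the first eligible slave, in listed order.
    return eligible[0] if eligible else None


latest = ["192.168.50.11", "192.168.50.12"]
print(pick_new_master(latest, candidates=set(), banned=set()))
```

The log above shows empty candidate and non-candidate lists, which is consistent with the first slave in the list, 192.168.50.11, being chosen.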

Start the switch
Thu Jan 14 04:28:10 2021 - [info] Starting master failover..
Thu Jan 14 04:28:10 2021 - [info]
From:
192.168.50.10(192.168.50.10:3308) (current master)
 +--192.168.50.11(192.168.50.11:3308)
 +--192.168.50.12(192.168.50.12:3308)

To:
192.168.50.11(192.168.50.11:3308) (new master)
 +--192.168.50.12(192.168.50.12:3308)
Thu Jan 14 04:28:10 2021 - [info]

Wait for the new master to finish applying its relay log
Thu Jan 14 04:28:10 2021 - [info] * Phase 3.3: New Master Recovery Phase..
Thu Jan 14 04:28:10 2021 - [info]
Thu Jan 14 04:28:10 2021 - [info]  Waiting all logs to be applied..
Thu Jan 14 04:28:10 2021 - [info]   done.
Thu Jan 14 04:28:10 2021 - [info] Getting new master's binlog name and position..

Get the new master's binlog position
Thu Jan 14 04:28:10 2021 - [info]  slave-binlog.000019:236
Thu Jan 14 04:28:10 2021 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: slave-binlog.000019, 236, 313251f3-55ea-11eb-ac1c-080027a042d7:1-5,
86b47f1b-55ba-11eb-aeb3-0800277dfa3f:1
Thu Jan 14 04:28:10 2021 - [info] Executing master IP activate script:
Option new_master_user does not take an argument
Option new_master_password does not take an argument


IN SCRIPT TEST====root|/sbin/ifconfig enp0s3:1 down==root|/sbin/ifconfig enp0s3:1 192.168.50.110/24===

Enable the VIP on the new master
Enabling the VIP - 192.168.50.110/24 on the new master - 192.168.50.11
bind: Cannot assign requested address
Thu Jan 14 04:28:10 2021 - [info]  OK.

Make the new master writable
Thu Jan 14 04:28:10 2021 - [info] Setting read_only=0 on 192.168.50.11(192.168.50.11:3308)..
Thu Jan 14 04:28:10 2021 - [info]  ok.
Thu Jan 14 04:28:10 2021 - [info] ** Finished master recovery successfully.
Thu Jan 14 04:28:10 2021 - [info] * Phase 3: Master Recovery Phase completed.
Thu Jan 14 04:28:10 2021 - [info]
Thu Jan 14 04:28:10 2021 - [info] * Phase 4: Slaves Recovery Phase..
Thu Jan 14 04:28:10 2021 - [info]
Thu Jan 14 04:28:10 2021 - [info]
Thu Jan 14 04:28:10 2021 - [info] * Phase 4.1: Starting Slaves in parallel..
Thu Jan 14 04:28:10 2021 - [info]

Wait for the slave to apply its remaining relay log
Thu Jan 14 04:28:10 2021 - [info] -- Slave recovery on host 192.168.50.12(192.168.50.12:3308) started, pid: 2199. Check tmp log /home/mha/192.168.50.12_3308_20210114042808.log if it takes time..
Thu Jan 14 04:30:06 2021 - [info]
Thu Jan 14 04:30:06 2021 - [info] Log messages from 192.168.50.12 ...
Thu Jan 14 04:30:06 2021 - [info]

Run RESET SLAVE and CHANGE MASTER to point the slave at the new master
Thu Jan 14 04:28:10 2021 - [info]  Resetting slave 192.168.50.12(192.168.50.12:3308) and starting replication from the new master 192.168.50.11(192.168.50.11:3308)..
Thu Jan 14 04:28:10 2021 - [info]  Executed CHANGE MASTER.
Thu Jan 14 04:30:06 2021 - [info]  Slave started.

Use gtid_wait to confirm the slave has caught up
Thu Jan 14 04:30:06 2021 - [info]  gtid_wait(313251f3-55ea-11eb-ac1c-080027a042d7:1-5,
86b47f1b-55ba-11eb-aeb3-0800277dfa3f:1) completed on 192.168.50.12(192.168.50.12:3308). Executed 0 events.
Thu Jan 14 04:30:06 2021 - [info] End of log messages from 192.168.50.12.
Thu Jan 14 04:30:06 2021 - [info] -- Slave on host 192.168.50.12(192.168.50.12:3308) started.
Thu Jan 14 04:30:06 2021 - [info] All new slave servers recovered successfully.
Thu Jan 14 04:30:06 2021 - [info]
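
"Caught up" here means the slave's executed GTID set contains the target set printed in the gtid_wait line. A minimal client-side sketch of that containment test is shown below; the helper names are made up, and it expands ranges into Python sets, which is fine for illustration but not for very large ranges. MySQL itself offers GTID_SUBSET() for the same comparison on the server.

```python
from typing import Dict, Set


def parse_gtid_set(text: str) -> Dict[str, Set[int]]:
    """Parse 'uuid:1-5,uuid2:1' into {uuid: {1, 2, 3, 4, 5}, uuid2: {1}}."""
    result: Dict[str, Set[int]] = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *ranges = part.split(":")
        ids = result.setdefault(uuid.lower(), set())
        for r in ranges:
            first, _, last = r.partition("-")
            ids.update(range(int(first), int(last or first) + 1))
    return result


def gtid_contains(executed: str, target: str) -> bool:
    """True if every transaction in `target` is also in `executed`."""
    have = parse_gtid_set(executed)
    return all(ids <= have.get(uuid, set())
               for uuid, ids in parse_gtid_set(target).items())


target = ("313251f3-55ea-11eb-ac1c-080027a042d7:1-5,\n"
          "86b47f1b-55ba-11eb-aeb3-0800277dfa3f:1")
print(gtid_contains(target, target))
```

In the log the wait finished with "Executed 0 events", meaning 192.168.50.12 already held the whole target set when the check ran.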

Thu Jan 14 04:30:06 2021 - [info] * Phase 5: New master cleanup phase..
Thu Jan 14 04:30:06 2021 - [info]

Run RESET SLAVE on the new master to clear the relay logs that are no longer needed
Thu Jan 14 04:30:06 2021 - [info] Resetting slave info on the new master..
Thu Jan 14 04:30:06 2021 - [info]  192.168.50.11: Resetting slave info succeeded.
Thu Jan 14 04:30:06 2021 - [info] Master failover to 192.168.50.11(192.168.50.11:3308) completed successfully.
Thu Jan 14 04:30:06 2021 - [info]

Failover report
----- Failover Report -----

mysql-mha: MySQL Master failover 192.168.50.10(192.168.50.10:3308) to 192.168.50.11(192.168.50.11:3308) succeeded

Master 192.168.50.10(192.168.50.10:3308) is down!

Check MHA Manager logs at master.localdomain:/home/mha/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.50.10(192.168.50.10:3308)
Selected 192.168.50.11(192.168.50.11:3308) as a new master.
192.168.50.11(192.168.50.11:3308): OK: Applying all logs succeeded.
192.168.50.11(192.168.50.11:3308): OK: Activated master IP address.
192.168.50.12(192.168.50.12:3308): OK: Slave started, replicating from 192.168.50.11(192.168.50.11:3308)
192.168.50.11(192.168.50.11:3308): Resetting slave info succeeded.
Master failover to 192.168.50.11(192.168.50.11:3308) completed successfully.

  • Manual switch test
[root@master mha]# masterha_master_switch --master_state=alive --conf=/home/mha/mysql-mha.conf --new_master_host=192.168.50.11 --new_master_port=3308 --orig_master_is_new_slave
Thu Jan 14 04:55:44 2021 - [info] MHA::MasterRotate version 0.57.
Thu Jan 14 04:55:44 2021 - [info] Starting online master switch..
Thu Jan 14 04:55:44 2021 - [info] 
Thu Jan 14 04:55:44 2021 - [info] * Phase 1: Configuration Check Phase..
Thu Jan 14 04:55:44 2021 - [info] 
Thu Jan 14 04:55:44 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 14 04:55:44 2021 - [info] Reading application default configuration from /home/mha/mysql-mha.conf..           
Thu Jan 14 04:55:44 2021 - [info] Reading server configuration from /home/mha/mysql-mha.conf..          
Thu Jan 14 04:55:45 2021 - [warning] SQL Thread is stopped(no error) on 192.168.50.12(192.168.50.12:3308)           
Thu Jan 14 04:55:45 2021 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.50.12(192.168.50.12:3308)  
Thu Jan 14 04:55:45 2021 - [info] Master configurations are as below: 
Master 192.168.50.11(192.168.50.11:3308), replicating from 192.168.50.12(192.168.50.12:3308), read-only      
Master 192.168.50.12(192.168.50.12:3308), replicating from 192.168.50.11(192.168.50.11:3308)

Thu Jan 14 04:55:45 2021 - [info] GTID failover mode = 1            

Check the current replication information of each node
Thu Jan 14 04:55:45 2021 - [info] Current Alive Master: 192.168.50.12(192.168.50.12:3308)        
Thu Jan 14 04:55:45 2021 - [info] Alive Slaves:
Thu Jan 14 04:55:45 2021 - [info]   192.168.50.10(192.168.50.10:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled   
Thu Jan 14 04:55:45 2021 - [info]     GTID ON
Thu Jan 14 04:55:45 2021 - [info]     Replicating from 192.168.50.12(192.168.50.12:3308)
Thu Jan 14 04:55:45 2021 - [info]   192.168.50.11(192.168.50.11:3308)  Version=8.0.22-13 (oldest major version between slaves) log-bin:enabled  
Thu Jan 14 04:55:45 2021 - [info]     GTID ON
Thu Jan 14 04:55:45 2021 - [info]     Replicating from 192.168.50.12(192.168.50.12:3308)

Run FLUSH NO_WRITE_TO_BINLOG TABLES on the original master to flush data, then verify that replication is healthy
It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.50.12(192.168.50.12:3308)? (YES/no): yes
Thu Jan 14 04:55:48 2021 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Thu Jan 14 04:55:48 2021 - [info]  ok.

Check that MHA is not monitoring and that no other failover is in progress
Thu Jan 14 04:55:48 2021 - [info] Checking MHA is not monitoring or doing failover..   
Thu Jan 14 04:55:48 2021 - [info] Checking replication health on 192.168.50.10..
Thu Jan 14 04:55:48 2021 - [info]  ok.
Thu Jan 14 04:55:48 2021 - [info] Checking replication health on 192.168.50.11..
Thu Jan 14 04:55:48 2021 - [info]  ok.

The replication checks pass, confirming that 192.168.50.11 can become the new master
Thu Jan 14 04:55:48 2021 - [info] 192.168.50.11 can be new master.            
Thu Jan 14 04:55:48 2021 - [info] 
From:
192.168.50.12(192.168.50.12:3308) (current master)
 +--192.168.50.10(192.168.50.10:3308)
 +--192.168.50.11(192.168.50.11:3308)

To:
192.168.50.11(192.168.50.11:3308) (new master)
 +--192.168.50.10(192.168.50.10:3308)
 +--192.168.50.12(192.168.50.12:3308)

Confirm the switch
Starting master switch from 192.168.50.12(192.168.50.12:3308) to 192.168.50.11(192.168.50.11:3308)? (yes/NO): yes
Thu Jan 14 04:55:49 2021 - [info] Checking whether 192.168.50.11(192.168.50.11:3308) is ok for the new master..
Thu Jan 14 04:55:49 2021 - [info]  ok.
Thu Jan 14 04:55:49 2021 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Jan 14 04:55:49 2021 - [info] 
Thu Jan 14 04:55:49 2021 - [info] * Phase 2: Rejecting updates Phase..
Thu Jan 14 04:55:49 2021 - [info] 

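The VIP is moved by a manual script here. Alternatively, remove the VIP before the switch starts and add it back by hand once it completes, which likewise cuts off application connections and prevents writes during the switch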
master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes

Lock the tables on the original master with FLUSH TABLES WITH READ LOCK and record the binlog position at that moment
Thu Jan 14 04:55:51 2021 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Thu Jan 14 04:55:51 2021 - [info] Executing FLUSH TABLES WITH READ LOCK..
Thu Jan 14 04:55:51 2021 - [info]  ok.
Thu Jan 14 04:55:51 2021 - [info] Orig master binlog:pos is slave-binlog.000009:587.

On the slave that is about to become the new master, use master_pos_wait() to confirm the binlog has been applied up to that position
Thu Jan 14 04:55:51 2021 - [info]  Waiting to execute all relay logs on 192.168.50.11(192.168.50.11:3308)..
Thu Jan 14 04:55:51 2021 - [info]  master_pos_wait(slave-binlog.000009:587) completed on 192.168.50.11(192.168.50.11:3308). Executed 0 events.
Thu Jan 14 04:55:51 2021 - [info]   done.

Get the binlog name and position on the new master for the other nodes' CHANGE MASTER
Thu Jan 14 04:55:51 2021 - [info] Getting new master's binlog name and position..
Thu Jan 14 04:55:51 2021 - [info]  slave-binlog.000020:641

Build the CHANGE MASTER statement from the configuration file and the position taken from the new master
Thu Jan 14 04:55:51 2021 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.50.11', MASTER_PORT=3308, MASTER_AUTO_POSITION=1, MASTER_USER='repluser', MASTER_PASSWORD='xxx';
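
To make the statement in the log concrete, here is a small sketch that assembles the same text from its parts. The function is hypothetical. Because the cluster runs with GTID and `MASTER_AUTO_POSITION=1`, no binlog file or position appears in the statement.

```python
def build_change_master(host: str, port: int, user: str, password: str) -> str:
    """Assemble a GTID auto-position CHANGE MASTER statement as plain text."""

    def quote(value: str) -> str:
        # Escape backslashes and single quotes for a SQL string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST={quote(host)}, "
        f"MASTER_PORT={int(port)}, "
        "MASTER_AUTO_POSITION=1, "
        f"MASTER_USER={quote(user)}, "
        f"MASTER_PASSWORD={quote(password)};"
    )


print(build_change_master("192.168.50.11", 3308, "repluser", "xxx"))
```

The values passed in are the ones shown in the log line; the password there is already masked as 'xxx'.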

Set read_only to 0 on the new master to allow writes
Thu Jan 14 04:55:51 2021 - [info] Setting read_only=0 on 192.168.50.11(192.168.50.11:3308)..
Thu Jan 14 04:55:51 2021 - [info]  ok.
Thu Jan 14 04:55:51 2021 - [info] 

Process the slaves other than the old master in parallel, using master_pos_wait() to confirm each has applied up to the old master's binlog position
Thu Jan 14 04:55:51 2021 - [info] * Switching slaves in parallel..
Thu Jan 14 04:55:51 2021 - [info] 
Thu Jan 14 04:55:51 2021 - [info] -- Slave switch on host 192.168.50.10(192.168.50.10:3308) started, pid: 3615
Thu Jan 14 04:55:51 2021 - [info] 
Thu Jan 14 04:55:52 2021 - [info] Log messages from 192.168.50.10 ...
Thu Jan 14 04:55:52 2021 - [info] 
Thu Jan 14 04:55:51 2021 - [info]  Waiting to execute all relay logs on 192.168.50.10(192.168.50.10:3308)..
Thu Jan 14 04:55:51 2021 - [info]  master_pos_wait(slave-binlog.000009:587) completed on 192.168.50.10(192.168.50.10:3308). Executed 0 events.
Thu Jan 14 04:55:51 2021 - [info]   done.

On the slaves other than the old master, run RESET SLAVE to clear the relay logs and CHANGE MASTER to point at the new master
Thu Jan 14 04:55:51 2021 - [info]  Resetting slave 192.168.50.10(192.168.50.10:3308) and starting replication from the new master 192.168.50.11(192.168.50.11:3308)..
Thu Jan 14 04:55:51 2021 - [info]  Executed CHANGE MASTER.
Thu Jan 14 04:55:51 2021 - [info]  Slave started.
Thu Jan 14 04:55:52 2021 - [info] End of log messages from 192.168.50.10 ...
Thu Jan 14 04:55:52 2021 - [info] 
Thu Jan 14 04:55:52 2021 - [info] -- Slave switch on host 192.168.50.10(192.168.50.10:3308) succeeded.

Release the READ LOCK on the old master
Thu Jan 14 04:55:52 2021 - [info] Unlocking all tables on the orig master:
Thu Jan 14 04:55:52 2021 - [info] Executing UNLOCK TABLES..
Thu Jan 14 04:55:52 2021 - [info]  ok.
Thu Jan 14 04:55:52 2021 - [info] Starting orig master as a new slave..

Run RESET SLAVE on the old master to clear the relay logs and CHANGE MASTER to point at the new master
Thu Jan 14 04:55:52 2021 - [info]  Resetting slave 192.168.50.12(192.168.50.12:3308) and starting replication from the new master 192.168.50.11(192.168.50.11:3308)..
Thu Jan 14 04:55:52 2021 - [info]  Executed CHANGE MASTER.
Thu Jan 14 04:55:52 2021 - [info]  Slave started.
Thu Jan 14 04:55:52 2021 - [info] All new slave servers switched successfully.
Thu Jan 14 04:55:52 2021 - [info] 
Thu Jan 14 04:55:52 2021 - [info] * Phase 5: New master cleanup phase..
Thu Jan 14 04:55:52 2021 - [info] 

Clean up the slave information on the new master
Thu Jan 14 04:55:52 2021 - [info]  192.168.50.11: Resetting slave info succeeded.
Thu Jan 14 04:55:52 2021 - [info] Switching master to 192.168.50.11(192.168.50.11:3308) completed successfully.
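
Putting the manual switch together, the order of operations read off the log above can be written down as a plain list of (host, action) pairs. This is just a summary of this one run under the stated assumptions; the function name and the action strings are made up, and the slaves that the log switches in parallel are flattened into sequence here.

```python
from typing import List, Tuple


def online_switch_plan(orig: str, new: str,
                       others: List[str]) -> List[Tuple[str, str]]:
    """List the steps of the online master switch in the order the log shows."""
    plan = [
        (orig, "FLUSH NO_WRITE_TO_BINLOG TABLES"),
        (orig, "FLUSH TABLES WITH READ LOCK"),
        (orig, "record binlog file:pos"),
        (new, "master_pos_wait(orig file:pos)"),
        (new, "record binlog file:pos"),
        (new, "set read_only=0"),
    ]
    for slave in others:
        plan += [
            (slave, "master_pos_wait(orig file:pos)"),
            (slave, "RESET SLAVE, CHANGE MASTER to new master"),
            (slave, "start slave"),
        ]
    plan += [
        (orig, "UNLOCK TABLES"),
        (orig, "RESET SLAVE, CHANGE MASTER to new master, start slave"),
        (new, "reset slave info"),
    ]
    return plan


for host, action in online_switch_plan("192.168.50.12", "192.168.50.11",
                                       ["192.168.50.10"]):
    print(host, action)
```

The point worth noticing is that the old master stays locked from the moment its position is recorded until the other slaves have been repointed, and only then is it unlocked and turned into a slave itself.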

