MHA分管理节点和数据库节点,数据库节点由mysql主从或者主主从构成,当主库挂掉后,管理节点会自动将从节点提升为主节点;管理节点的角色类似于oracle数据库中的fast start failover中的observer,但mha上层可以通过keepalive部署VIP,程序连接数据库使用VIP,从而实现后台数据库的故障切换透明化
MHA节点包含三个脚本,依赖perl模块;
save_binary_logs:保存和复制当掉的主服务器二进制日志;
apply_diff_relay_logs:识别差异的relay log事件,并应用于其他salve服务器;
purge_relay_logs:清除relay log文件;
需要在所有mysql服务器上安装MHA节点,MHA管理服务器也需要安装。MHA管理节点模块内部依赖MHA节点模块;
MHA管理节点通过ssh连接管理mysql服务器和执行MHA节点脚本。MHA节点依赖perl的DBD::mysql模块;
准备环境
配置M-M-S数据库复制,环境如下:
host1: 192.168.90.128 #manager, monitor
host2: 192.168.90.128 #master
hots3: 192.168.90.129 #备选master
host4: 192.168.90.130 #slave
安装MHA
安装mha node
在128、129、130上安装mha node
shell> rpm -q perl-DBD-MySQL ncftp
shell> tar zxvf mha4mysql-node-0.54.tar.gz
shell> perl Makefile.PL #依赖DBI、DBD::mysql模块
shell> make && make install
配置mha manager
1、 在128上安装mha manager
shell> yum install perl-Config-Tiny perl-Params-Validate perl-Log-Dispatch perl-Parallel-ForkManager
shell> tar zxvf mha4mysql-manager-0.55.tar.gz
shell> cd mha4mysql-manager-0.55
shell> perl Makefile.PL #依赖Time::HiRes、Config::Tiny、Log::Dispatch、Parallel::ForkManager、MHA::NodeConst 模块
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.609)
- DBD::mysql ...loaded. (4.029)
- Time::HiRes ...loaded. (1.9721)
- Config::Tiny ...loaded. (2.14)
- Log::Dispatch ...loaded. (2.44)
- Parallel::ForkManager ...loaded. (1.11)
- MHA::NodeConst ...loaded. (0.54)
*** Module::AutoInstall configuration finished.
Writing Makefile for mha4mysql::manager
shell> make && make install
2、 配置配置文件
shell> mkdir /etc/masterha
shell> mkdir -p /db/tool/masterha/app1
shell> cp samples/conf/* /etc/masterha/
shell> cat /etc/masterha/app1.cnf
================================================================================================
[server default]
manager_workdir=/db/tool/masterha/app1
manager_log=/db/tool/masterha/app1/manager.log
#mysql user and password
user=root
password=mypassword
ssh_user=root
repl_user=repluser
repl_password=mypassword
ping_interval=1
shutdown_script=""
#master_ip_failover_script=/usr/local/bin/master_ip_failover
#master_ip_online_change_script=/usr/local/bin/master_ip_online_change_script
#report_script=""
[server1]
hostname=192.168.90.128
master_binlog_dir=/mysql/mysql/
candidate_master=1
[server2]
hostname=192.168.90.129
master_binlog_dir=/mysql/mysql/
candidate_master=1
[server3]
hostname=192.168.90.130
master_binlog_dir=/mysql/mysql/
no_master=1
================================================================================================
配置ssh公钥免登录环境
在manager上192.168.90.128配置到个Node的无密码验证:
shell> ssh-keygen -t rsa
shell> ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.90.129
shell> ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.90.130
shell> cat .ssh/id_rsa.pub >> .ssh/authorized_keys
node129
shell> ssh-keygen -t rsa
shell> ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.90.128
shell> ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.90.130
node130
shell> ssh-keygen -t rsa
shell> ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.90.128
shell> ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.90.129
测试ssh连接
shell> masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu Feb 12 22:29:12 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Feb 12 22:29:12 2015 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Thu Feb 12 22:29:12 2015 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Thu Feb 12 22:29:12 2015 - [info] Starting SSH connection tests..
Thu Feb 12 22:29:13 2015 - [debug]
Thu Feb 12 22:29:12 2015 - [debug] Connecting via SSH from root@192.168.90.128(192.168.90.128:22) to
root@192.168.90.129(192.168.90.129:22)..
Thu Feb 12 22:29:12 2015 - [debug] ok.
Thu Feb 12 22:29:12 2015 - [debug] Connecting via SSH from root@192.168.90.128(192.168.90.128:22) to
root@192.168.90.130(192.168.90.130:22)..
Thu Feb 12 22:29:13 2015 - [debug] ok.
Thu Feb 12 22:29:14 2015 - [debug]
Thu Feb 12 22:29:12 2015 - [debug] Connecting via SSH from root@192.168.90.129(192.168.90.129:22) to
root@192.168.90.128(192.168.90.128:22)..
Thu Feb 12 22:29:13 2015 - [debug] ok.
Thu Feb 12 22:29:13 2015 - [debug] Connecting via SSH from root@192.168.90.129(192.168.90.129:22) to
root@192.168.90.130(192.168.90.130:22)..
Thu Feb 12 22:29:13 2015 - [debug] ok.
Thu Feb 12 22:29:14 2015 - [debug]
Thu Feb 12 22:29:13 2015 - [debug] Connecting via SSH from root@192.168.90.130(192.168.90.130:22) to
root@192.168.90.128(192.168.90.128:22)..
Thu Feb 12 22:29:13 2015 - [debug] ok.
Thu Feb 12 22:29:13 2015 - [debug] Connecting via SSH from root@192.168.90.130(192.168.90.130:22) to
root@192.168.90.129(192.168.90.129:22)..
Thu Feb 12 22:29:14 2015 - [debug] ok.
Thu Feb 12 22:29:14 2015 - [info] All SSH connection tests passed successfully.
测试复制状况
shell> masterha_check_repl --conf=/etc/masterha/app1.cnf
Fri Feb 13 00:29:50 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Feb 13 00:29:50 2015 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Fri Feb 13 00:29:50 2015 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Fri Feb 13 00:29:50 2015 - [info] MHA::MasterMonitor version 0.55.
Fri Feb 13 00:29:51 2015 - [info] Multi-master configuration is detected. Current primary(writable) master
is 192.168.90.128(192.168.90.128:3306)
Fri Feb 13 00:29:51 2015 - [info] Master configurations are as below:
Master 192.168.90.128(192.168.90.128:3306), replicating from 192.168.90.129(192.168.90.129:3306)
Master 192.168.90.129(192.168.90.129:3306), replicating from 192.168.90.128(192.168.90.128:3306), read-only
Fri Feb 13 00:29:51 2015 - [info] Dead Servers:
Fri Feb 13 00:29:51 2015 - [info] Alive Servers:
Fri Feb 13 00:29:51 2015 - [info] 192.168.90.128(192.168.90.128:3306)
Fri Feb 13 00:29:51 2015 - [info] 192.168.90.129(192.168.90.129:3306)
Fri Feb 13 00:29:51 2015 - [info] 192.168.90.130(192.168.90.130:3306)
Fri Feb 13 00:29:51 2015 - [info] Alive Slaves:
Fri Feb 13 00:29:51 2015 - [info] 192.168.90.129(192.168.90.129:3306) Version=5.6.19-log (oldest major version
between slaves) log-bin:enabled
Fri Feb 13 00:29:51 2015 - [info] Replicating from 192.168.90.128(192.168.90.128:3306)
Fri Feb 13 00:29:51 2015 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Feb 13 00:29:51 2015 - [info] 192.168.90.130(192.168.90.130:3306) Version=5.6.19-log (oldest major version
between slaves) log-bin:enabled
Fri Feb 13 00:29:51 2015 - [info] Replicating from 192.168.90.128(192.168.90.128:3306)
Fri Feb 13 00:29:51 2015 - [info] Not candidate for the new Master (no_master is set)
Fri Feb 13 00:29:51 2015 - [info] Current Alive Master: 192.168.90.128(192.168.90.128:3306)
Fri Feb 13 00:29:51 2015 - [info] Checking slave configurations..
Fri Feb 13 00:29:51 2015 - [info] Checking replication filtering settings..
Fri Feb 13 00:29:51 2015 - [info] binlog_do_db= , binlog_ignore_db=
Fri Feb 13 00:29:51 2015 - [info] Replication filtering check ok.
Fri Feb 13 00:29:51 2015 - [info] Starting SSH connection tests..
Fri Feb 13 00:29:55 2015 - [info] All SSH connection tests passed successfully.
Fri Feb 13 00:29:55 2015 - [info] Checking MHA Node version..
Fri Feb 13 00:29:56 2015 - [info] Version check ok.
Fri Feb 13 00:29:56 2015 - [info] Checking SSH publickey authentication settings on the current master..
Fri Feb 13 00:29:56 2015 - [info] HealthCheck: SSH to 192.168.90.128 is reachable.
Fri Feb 13 00:29:56 2015 - [info] Master MHA Node version is 0.54.
Fri Feb 13 00:29:56 2015 - [info] Checking recovery script configurations on the current master..
Fri Feb 13 00:29:56 2015 - [info] Executing command: save_binary_logs --command=test --start_pos=4
--binlog_dir=/mysql/mysql/ --output_file=/var/tmp/save_binary_logs_test
--manager_version=0.55 --start_file=mysql-info.000021
Fri Feb 13 00:29:56 2015 - [info] Connecting to root@192.168.90.128(192.168.90.128)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /mysql/mysql/, up to mysql-info.000021
Fri Feb 13 00:29:57 2015 - [info] Master setting check done.
Fri Feb 13 00:29:57 2015 - [info] Checking SSH publickey authentication and checking recovery script
configurations on all alive slave servers..
Fri Feb 13 00:29:57 2015 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root'
--slave_host=192.168.90.129 --slave_ip=192.168.90.129 --slave_port=3306
--workdir=/var/tmp --target_version=5.6.19-log --manager_version=0.55
--relay_log_info=/mysql/mysql/relay-log.info --relay_dir=/mysql/mysql/
--slave_pass=xxx
Fri Feb 13 00:29:57 2015 - [info] Connecting to root@192.168.90.129(192.168.90.129:22)..
Checking slave recovery environment settings..
Opening /mysql/mysql/relay-log.info ... ok.
Relay log found at /mysql/mysql, up to rac2-relay-bin.000039
Temporary relay log file is /mysql/mysql/rac2-relay-bin.000039
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Fri Feb 13 00:29:58 2015 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root'
--slave_host=192.168.90.130 --slave_ip=192.168.90.130 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.19-log
--manager_version=0.55 --relay_log_info=/mysql/mysql/relay-log.info --relay_dir=/mysql/mysql/ --slave_pass=xxx
Fri Feb 13 00:29:58 2015 - [info] Connecting to root@192.168.90.130(192.168.90.130:22)..
Checking slave recovery environment settings..
Opening /mysql/mysql/relay-log.info ... ok.
Relay log found at /mysql/mysql, up to rac3-relay-bin.000008
Temporary relay log file is /mysql/mysql/rac3-relay-bin.000008
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Fri Feb 13 00:29:58 2015 - [info] Slaves settings check done.
Fri Feb 13 00:29:58 2015 - [info]
192.168.90.128 (current master)
+--192.168.90.129
+--192.168.90.130
Fri Feb 13 00:29:58 2015 - [info] Checking replication health on 192.168.90.129..
Fri Feb 13 00:29:58 2015 - [info] ok.
Fri Feb 13 00:29:58 2015 - [info] Checking replication health on 192.168.90.130..
Fri Feb 13 00:29:58 2015 - [info] ok.
Fri Feb 13 00:29:58 2015 - [warning] master_ip_failover_script is not defined.
Fri Feb 13 00:29:58 2015 - [warning] shutdown_script is not defined.
Fri Feb 13 00:29:58 2015 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
如设置不正确,可能会报错,按照提示修改配置:
129、130设置
mysql> set global read_only=1;
mysql> set global relay_log_purge=0;
shell> ln -s /db/mysql/bin/mysql /usr/bin/mysql
shell> ln -s /db/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
启动mha manager
shell> masterha_manager --conf=/etc/masterha/app1.cnf > /db/tool/masterha/app1/manager.log 2>&1 &
查看日志文件
shell> tail -f /db/tool/masterha/app1/manager.log
192.168.90.128 (current master)
+--192.168.90.129
+--192.168.90.130
Fri Feb 13 00:58:12 2015 - [warning] master_ip_failover_script is not defined.
Fri Feb 13 00:58:12 2015 - [warning] shutdown_script is not defined.
Fri Feb 13 00:58:12 2015 - [info] Set master ping interval 1 seconds.
Fri Feb 13 00:58:12 2015 - [warning] secondary_check_script is not defined. It is highly recommended setting it
to check master reachability from two or more routes.
Fri Feb 13 00:58:12 2015 - [info] Starting ping health check on 192.168.90.128(192.168.90.128:3306)..
Fri Feb 13 00:58:12 2015 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
查看mha状态:
shell> masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:7024) is running(0:PING_OK), master:192.168.90.128
停止mha manager
shell> masterha_stop --conf=/etc/masterha/app1.cnf
Stopped app1 successfully.
测试failover过程
关闭128主库
shell> mysqladmin -u root -p shutdown
查看切换日志
shell> tail -f /db/tool/masterha/app1/manager.log
Fri Feb 13 01:19:26 2015 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)
Fri Feb 13 01:19:26 2015 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4
--binlog_dir=/mysql/mysql/ --output_file=/var/tmp/save_binary_logs_test
--manager_version=0.55 --binlog_prefix=mysql-info
Fri Feb 13 01:19:26 2015 - [info] HealthCheck: SSH to 192.168.90.128 is reachable.
Fri Feb 13 01:19:29 2015 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.90.128')
Fri Feb 13 01:19:29 2015 - [warning] Connection failed 3 time(s)..
Fri Feb 13 01:19:29 2015 - [warning] Master is not reachable from health checker!
Fri Feb 13 01:19:29 2015 - [warning] Master 192.168.90.128(192.168.90.128:3306) is not reachable!
Fri Feb 13 01:19:29 2015 - [warning] SSH is reachable.
Fri Feb 13 01:19:29 2015 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Fri Feb 13 01:19:30 2015 - [info] Dead Servers:
Fri Feb 13 01:19:30 2015 - [info] 192.168.90.128(192.168.90.128:3306)
Fri Feb 13 01:19:30 2015 - [info] Alive Servers:
Fri Feb 13 01:19:30 2015 - [info] 192.168.90.129(192.168.90.129:3306)
Fri Feb 13 01:19:30 2015 - [info] 192.168.90.130(192.168.90.130:3306)
Fri Feb 13 01:19:30 2015 - [info] Replicating from 192.168.90.128(192.168.90.128:3306)
Fri Feb 13 01:19:30 2015 - [info] Not candidate for the new Master (no_master is set)
Fri Feb 13 01:19:30 2015 - [info] Checking slave configurations..
Fri Feb 13 01:19:30 2015 - [info] Checking replication filtering settings..
Fri Feb 13 01:19:30 2015 - [info] Replication filtering check ok.
Fri Feb 13 01:19:30 2015 - [info] Master is down!
Fri Feb 13 01:19:30 2015 - [info] Terminating monitoring script.
Fri Feb 13 01:19:30 2015 - [info] Got exit code 20 (Master dead).
Fri Feb 13 01:19:30 2015 - [info] MHA::MasterFailover version 0.55.
Fri Feb 13 01:19:30 2015 - [info] Starting master failover.
Fri Feb 13 01:19:30 2015 - [info]
Fri Feb 13 01:19:30 2015 - [info] * Phase 1: Configuration Check Phase..
Fri Feb 13 01:19:31 2015 - [info] ** Phase 1: Configuration Check Phase completed.
Fri Feb 13 01:19:31 2015 - [info]
Fri Feb 13 01:19:31 2015 - [info] * Phase 2: Dead Master Shutdown Phase..
Fri Feb 13 01:19:32 2015 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Fri Feb 13 01:19:32 2015 - [info]
Fri Feb 13 01:19:32 2015 - [info] * Phase 3: Master Recovery Phase..
Fri Feb 13 01:19:32 2015 - [info]
Fri Feb 13 01:19:32 2015 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Fri Feb 13 01:19:32 2015 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Fri Feb 13 01:19:33 2015 - [info] * Phase 3.3: Determining New Master Phase..
Fri Feb 13 01:19:33 2015 - [info] New master is 192.168.90.129(192.168.90.129:3306)
Fri Feb 13 01:19:33 2015 - [info] Starting master failover..
Fri Feb 13 01:19:33 2015 - [info]
From:
192.168.90.128 (current master)
+--192.168.90.129
+--192.168.90.130
To:
192.168.90.129 (new master)
+--192.168.90.130
Fri Feb 13 01:19:33 2015 - [info]
Fri Feb 13 01:19:33 2015 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Fri Feb 13 01:19:33 2015 - [info] * Phase 3.4: Master Log Apply Phase..
Fri Feb 13 01:19:33 2015 - [info] Setting read_only=0 on 192.168.90.129(192.168.90.129:3306)..
Fri Feb 13 01:19:33 2015 - [info] ok.
Fri Feb 13 01:19:33 2015 - [info] ** Finished master recovery successfully.
Fri Feb 13 01:19:33 2015 - [info] * Phase 3: Master Recovery Phase completed.
Fri Feb 13 01:19:33 2015 - [info]
Fri Feb 13 01:19:33 2015 - [info] * Phase 4: Slaves Recovery Phase..
Fri Feb 13 01:19:33 2015 - [info]
Fri Feb 13 01:19:33 2015 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Fri Feb 13 01:19:34 2015 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Fri Feb 13 01:19:34 2015 - [info] Resetting slave 192.168.90.130(192.168.90.130:3306) and starting replication
from the new master 192.168.90.129(192.168.90.129:3306)..
Fri Feb 13 01:19:34 2015 - [info] Executed CHANGE MASTER.
Fri Feb 13 01:19:34 2015 - [info] Slave started.
Fri Feb 13 01:19:35 2015 - [info] End of log messages from 192.168.90.130.
Fri Feb 13 01:19:35 2015 - [info] -- Slave recovery on host 192.168.90.130(192.168.90.130:3306) succeeded.
Fri Feb 13 01:19:35 2015 - [info] All new slave servers recovered successfully.
Fri Feb 13 01:19:35 2015 - [info]
Fri Feb 13 01:19:35 2015 - [info] * Phase 5: New master cleanup phase..
Fri Feb 13 01:19:35 2015 - [info]
Fri Feb 13 01:19:35 2015 - [info] Resetting slave info on the new master..
Fri Feb 13 01:19:35 2015 - [info] 192.168.90.129: Resetting slave info succeeded.
Fri Feb 13 01:19:35 2015 - [info] Master failover to 192.168.90.129(192.168.90.129:3306) completed successfully.
Fri Feb 13 01:19:35 2015 - [info]
----- Failover Report -----
app1: MySQL Master failover 192.168.90.128 to 192.168.90.129 succeeded
Master 192.168.90.128 is down!
Check MHA Manager logs at rac1:/db/tool/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
The latest slave 192.168.90.129(192.168.90.129:3306) has all relay logs for recovery.
Selected 192.168.90.129 as a new master.
192.168.90.129: OK: Applying all logs succeeded.
192.168.90.130: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.90.130: OK: Applying all logs succeeded. Slave started, replicating from 192.168.90.129.
192.168.90.129: Resetting slave info succeeded.
Master failover to 192.168.90.129(192.168.90.129:3306) completed successfully.
重新启动128主库
shell> masterha_check_repl --conf=/etc/masterha/app1.cnf
...
Tue Mar 17 14:04:36 2015 - [info]
192.168.90.129 (current master)
+--192.168.90.128
+--192.168.90.130
Tue Mar 17 14:04:36 2015 - [info] Checking replication health on 192.168.90.128..
Tue Mar 17 14:04:36 2015 - [info] ok.
Tue Mar 17 14:04:36 2015 - [info] Checking replication health on 192.168.90.130..
Tue Mar 17 14:04:36 2015 - [info] ok.
Tue Mar 17 14:04:36 2015 - [warning] master_ip_failover_script is not defined.
Tue Mar 17 14:04:36 2015 - [warning] shutdown_script is not defined.
Tue Mar 17 14:04:36 2015 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
从上面的输出可以看出整个MHA的切换过程,共包括以下的步骤:
a). 配置文件检查阶段,这个阶段会检查整个集群配置文件配置
b). 宕机的master处理,这个阶段包括虚拟ip摘除操作,主机关机操作等
执行master_ip_failover_script,shutdown_script脚本如存在的话。
c). 复制dead maste和最新slave相差的relay log,并保存到MHA Manger具体的目录下
获得最近的slave(latest binary log file/position)
获得最旧的slave(latest binary log file/position)
Executing command on the dead master 192.168.90.128(192.168.90.128:3306): save_binary_logs --command=save
--start_file=mysql-info.000024 --start_pos=120 --binlog_dir=/mysql/mysql/ --output_file=/var/tmp/saved_
master_binlog_from_192.168.90.128_3306_20150317140026.binlog --handle_raw_binlog=1 --disable_log_bin=0
--manager_version=0.55
Creating /var/tmp if not exists.. ok.
scp from root@192.168.90.128:/var/tmp/saved_master_binlog_from_192.168.90.128_3306_20150317140026.binlog to
local:/db/tool/masterha/app1/saved_master_binlog_from_192.168.90.128_3306_20150317140026.binlog succeeded
d). 识别含有最新更新的slave作为主库并开始主库故障转移
e). new master应用从old master保存的二进制日志事件(binlog events)
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.90.128_3306_20150317140026.binlog
on 192.168.90.129:3306. This may take long time...
Applying log files succeeded.
Tue Mar 17 14:00:32 2015 - [info] All relay logs were successfully applied.
Tue Mar 17 14:00:32 2015 - [info] Getting new master's binlog name and position..
f). 记录new master binlog文件及位置,并修改read_only状态
All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.90.129',
MASTER_PORT=3306, MASTER_LOG_FILE='mysql-info.000017', MASTER_LOG_POS=246, MASTER_USER='repluser', MASTER_PASSWORD='xxx';
Setting read_only=0
g). Slaves恢复并改变主库
根据从库Exec_Master_Log_Pos执行,同5,应用差异的二进制/中继日志文件
All relay logs were successfully applied.
Resetting slave 192.168.90.130(192.168.90.130:3306) and starting replication from the new master
192.168.90.129(192.168.90.129:3306)..
Executed CHANGE MASTER.
h). new master清理
Resetting slave info on the new master..
192.168.90.129: Resetting slave info succeeded.
Master failover to 192.168.90.129(192.168.90.129:3306) completed successfully
即使没有延迟也至少要3s以上的切换时间,因为在M发生宕机的时候需要3s的重连验证。
参考文档:
http://www.cnblogs.com/zhoujinyi/p/3808673.html
http://ylw6006.blog.51cto.com/470441/1568853
整理自网络
Svoid
2015-03-17