1.1 简介
MHA(Master HA)是一款开源的 MySQL 的高可用程序,它为 MySQL 主从复制架构提供了 automating master failover 功能。MHA 在监控到 master 节点故障时,会提升其中拥有最新数据的 slave 节点成为新的master 节点,在此期间,MHA 会通过于其它从节点获取额外信息来避免一致性方面的问题。MHA 还提供了 master 节点的在线切换功能,即按需切换 master/slave 节点。
MHA 是由日本人 yoshinorim(原就职于DeNA现就职于FaceBook)开发的比较成熟的 MySQL 高可用方案。MHA 能够在30秒内实现故障切换,并能在故障切换中,最大可能的保证数据一致性。目前淘宝也正在开发相似产品 TMHA, 目前已支持一主一从
1.2 组件
MHA 服务有两种角色, MHA Manager(管理节点)和 MHA Node(数据节点):
MHA Manager:
通常单独部署在一台独立机器上管理多个 master/slave 集群(组),每个 master/slave 集群称作一个 application,用来管理统筹整个集群。
MHA node:
运行在每台 MySQL 服务器上(master/slave/manager),它通过监控具备解析和清理 logs 功能的脚本来加快故障转移。
主要是接收管理节点所发出指令的代理,代理需要运行在每一个 mysql 节点上
1.3 MHA集群架构图
1.4 服务器
Master | 192.168.42.141 | 主机 |
Slave | 192.168.42.142 | 从机 |
Slave | 192.168.42.143 | 从机 |
Manager | 192.168.42.143 | 管理节点 |
1.5 Mysql安装
1.5.1 给每一台机器设置主机名
hostnamectl set-hostname mysql01
hostnamectl set-hostname mysql02
hostnamectl set-hostname mysql03
重启机器
reboot
1.5.2 每一台机器关闭防火墙
systemctl stop firewalld
/etc/host新增配置
vim /etc/hosts
192.168.42.141 mysql01
192.168.42.142 mysql02
192.168.42.143 mysql03
1.5.3 每一台机器安装工具
yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget vim net-tools git iproute lrzsz bash-completion tree bridge-utils unzip bind-utils gcc
1.5.4 安装mysql
1.5.4.1 每一台机器下载以及安装依赖
wget http://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm
yum localinstall mysql57-community-release-el7-8.noarch.rpm
yum repolist enabled | grep "mysql.*-community.*"
1.5.4.2每一台机器安装mysql-community-server
yum install -y mysql-community-server
1.5.4.3 启动
systemctl start mysqld
1.5.4.4查看状态
systemctl status mysqld
1.5.4.5 设置开机启动
systemctl enable mysqld
1.5.4.6 重新加载配置
systemctl daemon-reload
1.5.4.7 查看初始化密码
grep 'temporary password' /var/log/mysqld.log
1.5.4.8 修改密码
全部机器修改root账号密码
登录
mysql -uroot -p
设置密码修改策略(校验级别最低,只校验密码长度)
set global validate_password_policy=0;
修改密码长度
set global validate_password_length=6;
设置root密码
set password for 'root'@'localhost'=password('123456');
刷新配置
flush privileges;
1.5.5 创建用户
全部机器都需要创建replication、mha账号
create user replication identified by '123456';
create user mha identified by '123456';
create user slave identified by '123456';
给用户replication,mha授权
grant all privileges on *.* to replication@'%' identified by '123456';
grant all privileges on *.* to mha @'%' identified by '123456';
grant all privileges on *.* to slave @'%' identified by '123456';
刷新配置
flush privileges;
检查用户创建是否成功
use mysql
select user,host from user;
验证replication能够相互登录
如果验证不成功需要关闭防火墙
192.168.42.141执行
mysql -ureplication -p123456 -h192.168.42.142
mysql -ureplication -p123456 -h192.168.42.143
192.168.42.142执行
mysql -ureplication -p123456 -h192.168.42.141
mysql -ureplication -p123456 -h192.168.42.143
192.168.42.143执行
mysql -ureplication -p123456 -h192.168.42.141
mysql -ureplication -p123456 -h192.168.42.142
1.5.6 192.168.1.141 配置my.cnf
vim /etc/my.cnf
log-bin
server-id=1
gtid_mode=on
enforce_gtid_consistency=on
1.5.7 192.168.1.142 配置my.cnf
vim /etc/my.cnf
log-bin
server-id=2
gtid_mode=on
enforce_gtid_consistency=on
1.5.8 192.168.1.143 配置my.cnf
vim /etc/my.cnf
log-bin
server-id=3
gtid_mode=on
enforce_gtid_consistency=on
1.5.9 重启mysql
systemctl restart mysqld;
1.6 主从配置
1.6.1 查看192.168.41.141主复制的指针位
mysql -uroot -p
show master status \G;
1.6.1 查看192.168.41.142主从配置
mysql -uroot -p
change master to master_host='192.168.42.141',
master_user='replication',
master_password='123456',
MASTER_LOG_FILE='mysql01-bin.000004', MASTER_LOG_POS=154;
start slave;
show slave status \G;
1.6.1 查看192.168.41.143主从配置
mysql -uroot -p
change master to master_host='192.168.42.141',
master_user='replication',
master_password='123456',
MASTER_LOG_FILE='mysql01-bin.000001', MASTER_LOG_POS=154;
start slave;
show slave status \G;
1.6.4 主从配置验证是否配置成功
192.168.42.141
192.168.42.142
192.168.42.143
1.7 ssh免密登录配置
每台机器配置
ssh-keygen -t rsa
cat /root/.ssh/id_rsa.pub |ssh root@192.168.42.141 'cat>>.ssh/authorized_keys'
cat /root/.ssh/id_rsa.pub |ssh root@192.168.42.142 'cat>>.ssh/authorized_keys'
cat /root/.ssh/id_rsa.pub |ssh root@192.168.42.143 'cat>>.ssh/authorized_keys'
1.8 MHA 免密配置
1.8.1所有机器安装epl依赖
yum -y install epel-release
wget http://rpms.remirepo.net/enterprise/remi-release-7.rpm
rpm -Uvh remi-release-7.rpm
1.8.2 所有机器下载与安装node
安装依赖
yum install -y perl-DBD-MySQL ncftp perl-DBI.x86
下载node
wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
安装node
rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
1.8.3 所有机器下载与安装manager
只在192.168.42.143机器安装
安装依赖
yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager
wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
rpm -Uvh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
1.8.4 创建配置mha.conf
所有机器在/etc/mha目录下创建mha.conf
创建目录
mkdir -p /etc/mha
vim mha.conf
##MHA配置
[server default]
user=mha
password=123456
ssh_user=root
repl_user=replication
repl_password=123456
manager_workdir=/var/log/masterha/db1
manager_log=/var/log/masterha/app1/db1.log
master_binlog_dir=/var/lib/mysql
remote_workdir=/data/log/masterha
secondary_check_script=masterha_secondary_check -s 192.168.42.141 -s 192.168.42.142
ping_interval=3
master_ip_failover_script=/script/masterha/master_ip_failover
report_script=/script/masterha/send_master_failover_mail
[server1]
hostname=192.168.42.141
candidate_master=1
[server2]
hostname=192.168.42.142
candidate_master=1
[server3]
hostname=192.168.42.143
no_master=1
1.8.5 创建master_ip_failover
在19.168.42.143机器操作
创建目录
mkdir -p /script/masterha
vim master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
#############################添加内容部分#########################################
my $vip = '192.168.42.88';
my $brdc = '192.168.42.255';
my $ifdev = 'ens33';
my $key = '1';
my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
修改文件权限
chmod 777 /script/masterha/master_ip_failover
1.8.6 创建send_master_failover_mail
vim /script/masterha/send_master_failover_mail
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;
#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='smtp.qq.com';
my $mail_from='xx@qq.com';
my $mail_user='xx@qq.com';
my $mail_pass='xx';
my $mail_to=['xx@36.cn'];
GetOptions(
'orig_master_host=s' => \$dead_master_host,
'new_master_host=s' => \$new_master_host,
'new_slave_hosts=s' => \$new_slave_hosts,
'subject=s' => \$subject,
'body=s' => \$body,
);
# Do whatever you want here
mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
sub mailToContacts {
my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
open my $DEBUG, "> /etc/mha/monitormail.log"
or die "Can't open the debug file:$!\n";
my $sender = new Mail::Sender {
ctype => 'text/plain; charset=utf-8',
encoding => 'utf-8',
smtp => $smtp,
from => $mail_from,
auth => 'LOGIN',
TLS_allowed => '0',
authid => $user,
authpwd => $passwd,
to => $mail_to,
subject => $subject,
debug => $DEBUG
};
$sender->MailMsg(
{ msg => $msg,
debug => $DEBUG
}
) or print $Mail::Sender::Error;
return 1;
}
说明
my $smtp='smtp.qq.com';
my $mail_from='xx@qq.com';
my $mail_user='xx@qq.com';
my $mail_pass='xx';
my $mail_to=['xx@36.cn'];
需要修改成自己的配置
创建文件monitormail.log
touch /etc/mha/monitormail.log
修改文件权限
chmod 777 /script/masterha/send_master_failover_mail
1.8.7 配置VIP
在192.168.42.141
ifconfig ens33:1 192.168.42.88/24
1.8.8 验证配置
MYSQL 复制环境检测工具
masterha_check_repl --conf=/etc/mha/mha.conf
MHA 依赖的 ssh 环境监测工具
masterha_check_ssh --conf=/etc/mha/mha.conf
1.8.9 创建db1.log
mkdir -p /var/log/masterha/app1
touch /var/log/masterha/app1/db1.log
1.8.10 启动
nohup masterha_manager -conf=/etc/mha/mha.conf&> /etc/mha/manager.log &
1.8.11 查看状态
masterha_check_status --conf=/etc/mha/mha.conf
1.8.12 停止命令
masterha_stop -conf=/etc/mha/mha.conf
1.8.13 模拟宕机
162.168.42.141没有宕机时
162.168.42.141
162.168.42.143日志如下
162.168.42.141宕机时
162.168.42.142
vip到了162.168.42.142机器
收到邮件如下
162.168.42.143日志如下
IN SCRIPT TEST====/usr/sbin/ip addr del 192.168.42.88/24 dev ens33 label ens33:1==/usr/sbin/ip addr add 192.168.42.88/24 brd 192.168.42.255 dev ens33 label ens33:1;/usr/sbin/arping -q -A -c 1 -I ens33 192.168.42.88;iptables -F;===
Checking the Status of the script.. OK
Fri Aug 27 19:09:26 2021 - [info] OK.
Fri Aug 27 19:09:26 2021 - [warning] shutdown_script is not defined.
Fri Aug 27 19:09:26 2021 - [info] Set master ping interval 3 seconds.
Fri Aug 27 19:09:26 2021 - [info] Set secondary check script: masterha_secondary_check -s 192.168.42.141 -s 192.168.42.142
Fri Aug 27 19:09:26 2021 - [info] Starting ping health check on 192.168.42.141(192.168.42.141:3306)..
Fri Aug 27 19:09:26 2021 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
--------------------------------------------------------------------------------------
Fri Aug 27 19:16:50 2021 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)
Fri Aug 27 19:16:50 2021 - [info] Executing secondary network check script: masterha_secondary_check -s 192.168.42.141 -s 192.168.42.142 --user=root --master_host=192.168.42.141 --master_ip=192.168.42.141 --master_port=3306 --master_user=mha --master_password=123456 --ping_type=SELECT
Fri Aug 27 19:16:50 2021 - [info] Executing SSH check script: exit 0
Fri Aug 27 19:16:50 2021 - [info] HealthCheck: SSH to 192.168.42.141 is reachable.
Monitoring server 192.168.42.141 is reachable, Master is not reachable from 192.168.42.141. OK.
Monitoring server 192.168.42.142 is reachable, Master is not reachable from 192.168.42.142. OK.
Fri Aug 27 19:16:51 2021 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Fri Aug 27 19:16:53 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.42.141' (111))
Fri Aug 27 19:16:53 2021 - [warning] Connection failed 2 time(s)..
Fri Aug 27 19:16:56 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.42.141' (111))
Fri Aug 27 19:16:56 2021 - [warning] Connection failed 3 time(s)..
Fri Aug 27 19:16:59 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.42.141' (111))
Fri Aug 27 19:16:59 2021 - [warning] Connection failed 4 time(s)..
Fri Aug 27 19:16:59 2021 - [warning] Master is not reachable from health checker!
Fri Aug 27 19:16:59 2021 - [warning] Master 192.168.42.141(192.168.42.141:3306) is not reachable!
Fri Aug 27 19:16:59 2021 - [warning] SSH is reachable.
Fri Aug 27 19:16:59 2021 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/mha.conf again, and trying to connect to all servers to check server status..
Fri Aug 27 19:16:59 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Aug 27 19:16:59 2021 - [info] Reading application default configuration from /etc/mha/mha.conf..
Fri Aug 27 19:16:59 2021 - [info] Reading server configuration from /etc/mha/mha.conf..
Fri Aug 27 19:17:00 2021 - [info] GTID failover mode = 1
Fri Aug 27 19:17:00 2021 - [info] Dead Servers:
Fri Aug 27 19:17:00 2021 - [info] 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:00 2021 - [info] Alive Servers:
Fri Aug 27 19:17:00 2021 - [info] 192.168.42.142(192.168.42.142:3306)
Fri Aug 27 19:17:00 2021 - [info] 192.168.42.143(192.168.42.143:3306)
Fri Aug 27 19:17:00 2021 - [info] Alive Slaves:
Fri Aug 27 19:17:00 2021 - [info] 192.168.42.142(192.168.42.142:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:00 2021 - [info] GTID ON
Fri Aug 27 19:17:00 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:00 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Aug 27 19:17:00 2021 - [info] 192.168.42.143(192.168.42.143:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:00 2021 - [info] GTID ON
Fri Aug 27 19:17:00 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:00 2021 - [info] Not candidate for the new Master (no_master is set)
Fri Aug 27 19:17:00 2021 - [info] Checking slave configurations..
Fri Aug 27 19:17:00 2021 - [info] read_only=1 is not set on slave 192.168.42.142(192.168.42.142:3306).
Fri Aug 27 19:17:00 2021 - [info] read_only=1 is not set on slave 192.168.42.143(192.168.42.143:3306).
Fri Aug 27 19:17:00 2021 - [info] Checking replication filtering settings..
Fri Aug 27 19:17:00 2021 - [info] Replication filtering check ok.
Fri Aug 27 19:17:00 2021 - [info] Master is down!
Fri Aug 27 19:17:00 2021 - [info] Terminating monitoring script.
Fri Aug 27 19:17:00 2021 - [info] Got exit code 20 (Master dead).
Fri Aug 27 19:17:00 2021 - [info] MHA::MasterFailover version 0.58.
Fri Aug 27 19:17:00 2021 - [info] Starting master failover.
Fri Aug 27 19:17:00 2021 - [info]
Fri Aug 27 19:17:00 2021 - [info] * Phase 1: Configuration Check Phase..
Fri Aug 27 19:17:00 2021 - [info]
Fri Aug 27 19:17:01 2021 - [info] GTID failover mode = 1
Fri Aug 27 19:17:01 2021 - [info] Dead Servers:
Fri Aug 27 19:17:01 2021 - [info] 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:01 2021 - [info] Checking master reachability via MySQL(double check)...
Fri Aug 27 19:17:01 2021 - [info] ok.
Fri Aug 27 19:17:01 2021 - [info] Alive Servers:
Fri Aug 27 19:17:01 2021 - [info] 192.168.42.142(192.168.42.142:3306)
Fri Aug 27 19:17:01 2021 - [info] 192.168.42.143(192.168.42.143:3306)
Fri Aug 27 19:17:01 2021 - [info] Alive Slaves:
Fri Aug 27 19:17:01 2021 - [info] 192.168.42.142(192.168.42.142:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:01 2021 - [info] GTID ON
Fri Aug 27 19:17:01 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:01 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Aug 27 19:17:01 2021 - [info] 192.168.42.143(192.168.42.143:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:01 2021 - [info] GTID ON
Fri Aug 27 19:17:01 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:01 2021 - [info] Not candidate for the new Master (no_master is set)
Fri Aug 27 19:17:01 2021 - [info] Starting GTID based failover.
Fri Aug 27 19:17:01 2021 - [info]
Fri Aug 27 19:17:01 2021 - [info] ** Phase 1: Configuration Check Phase completed.
Fri Aug 27 19:17:01 2021 - [info]
Fri Aug 27 19:17:01 2021 - [info] * Phase 2: Dead Master Shutdown Phase..
Fri Aug 27 19:17:01 2021 - [info]
Fri Aug 27 19:17:01 2021 - [info] Forcing shutdown so that applications never connect to the current master..
Fri Aug 27 19:17:01 2021 - [info] Executing master IP deactivation script:
Fri Aug 27 19:17:01 2021 - [info] /script/masterha/master_ip_failover --orig_master_host=192.168.42.141 --orig_master_ip=192.168.42.141 --orig_master_port=3306 --command=stopssh --ssh_user=root
IN SCRIPT TEST====/usr/sbin/ip addr del 192.168.42.88/24 dev ens33 label ens33:1==/usr/sbin/ip addr add 192.168.42.88/24 brd 192.168.42.255 dev ens33 label ens33:1;/usr/sbin/arping -q -A -c 1 -I ens33 192.168.42.88;iptables -F;===
Disabling the VIP on old master: 192.168.42.141
Fri Aug 27 19:17:02 2021 - [info] done.
Fri Aug 27 19:17:02 2021 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Fri Aug 27 19:17:02 2021 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] * Phase 3: Master Recovery Phase..
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] The latest binary log file/position on all slaves is mysql01-bin.000007:318
Fri Aug 27 19:17:02 2021 - [info] Latest slaves (Slaves that received relay log files to the latest):
Fri Aug 27 19:17:02 2021 - [info] 192.168.42.142(192.168.42.142:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:02 2021 - [info] GTID ON
Fri Aug 27 19:17:02 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:02 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Aug 27 19:17:02 2021 - [info] 192.168.42.143(192.168.42.143:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:02 2021 - [info] GTID ON
Fri Aug 27 19:17:02 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:02 2021 - [info] Not candidate for the new Master (no_master is set)
Fri Aug 27 19:17:02 2021 - [info] The oldest binary log file/position on all slaves is mysql01-bin.000007:318
Fri Aug 27 19:17:02 2021 - [info] Oldest slaves:
Fri Aug 27 19:17:02 2021 - [info] 192.168.42.142(192.168.42.142:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:02 2021 - [info] GTID ON
Fri Aug 27 19:17:02 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:02 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Aug 27 19:17:02 2021 - [info] 192.168.42.143(192.168.42.143:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:02 2021 - [info] GTID ON
Fri Aug 27 19:17:02 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:02 2021 - [info] Not candidate for the new Master (no_master is set)
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] * Phase 3.3: Determining New Master Phase..
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] Searching new master from slaves..
Fri Aug 27 19:17:02 2021 - [info] Candidate masters from the configuration file:
Fri Aug 27 19:17:02 2021 - [info] 192.168.42.142(192.168.42.142:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:02 2021 - [info] GTID ON
Fri Aug 27 19:17:02 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:02 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Aug 27 19:17:02 2021 - [info] Non-candidate masters:
Fri Aug 27 19:17:02 2021 - [info] 192.168.42.143(192.168.42.143:3306) Version=5.7.35-log (oldest major version between slaves) log-bin:enabled
Fri Aug 27 19:17:02 2021 - [info] GTID ON
Fri Aug 27 19:17:02 2021 - [info] Replicating from 192.168.42.141(192.168.42.141:3306)
Fri Aug 27 19:17:02 2021 - [info] Not candidate for the new Master (no_master is set)
Fri Aug 27 19:17:02 2021 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Fri Aug 27 19:17:02 2021 - [info] New master is 192.168.42.142(192.168.42.142:3306)
Fri Aug 27 19:17:02 2021 - [info] Starting master failover..
Fri Aug 27 19:17:02 2021 - [info]
From:
192.168.42.141(192.168.42.141:3306) (current master)
+--192.168.42.142(192.168.42.142:3306)
+--192.168.42.143(192.168.42.143:3306)
To:
192.168.42.142(192.168.42.142:3306) (new master)
+--192.168.42.143(192.168.42.143:3306)
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] * Phase 3.3: New Master Recovery Phase..
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] Waiting all logs to be applied..
Fri Aug 27 19:17:02 2021 - [info] done.
Fri Aug 27 19:17:02 2021 - [info] Getting new master's binlog name and position..
Fri Aug 27 19:17:02 2021 - [info] mysql02-bin.000003:194
Fri Aug 27 19:17:02 2021 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.42.142', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='replication', MASTER_PASSWORD='xxx';
Fri Aug 27 19:17:02 2021 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql02-bin.000003, 194, 527601db-06fa-11ec-8eb1-000c29a03380:1-6
Fri Aug 27 19:17:02 2021 - [info] Executing master IP activate script:
Fri Aug 27 19:17:02 2021 - [info] /script/masterha/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.42.141 --orig_master_ip=192.168.42.141 --orig_master_port=3306 --new_master_host=192.168.42.142 --new_master_ip=192.168.42.142 --new_master_port=3306 --new_master_user='mha' --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password
IN SCRIPT TEST====/usr/sbin/ip addr del 192.168.42.88/24 dev ens33 label ens33:1==/usr/sbin/ip addr add 192.168.42.88/24 brd 192.168.42.255 dev ens33 label ens33:1;/usr/sbin/arping -q -A -c 1 -I ens33 192.168.42.88;iptables -F;===
Enabling the VIP - 192.168.42.88 on the new master - 192.168.42.142
Fri Aug 27 19:17:02 2021 - [info] OK.
Fri Aug 27 19:17:02 2021 - [info] ** Finished master recovery successfully.
Fri Aug 27 19:17:02 2021 - [info] * Phase 3: Master Recovery Phase completed.
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] * Phase 4: Slaves Recovery Phase..
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] * Phase 4.1: Starting Slaves in parallel..
Fri Aug 27 19:17:02 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] -- Slave recovery on host 192.168.42.143(192.168.42.143:3306) started, pid: 12892. Check tmp log /var/log/masterha/db1/192.168.42.143_3306_20210827191700.log if it takes time..
Fri Aug 27 19:17:04 2021 - [info]
Fri Aug 27 19:17:04 2021 - [info] Log messages from 192.168.42.143 ...
Fri Aug 27 19:17:04 2021 - [info]
Fri Aug 27 19:17:02 2021 - [info] Resetting slave 192.168.42.143(192.168.42.143:3306) and starting replication from the new master 192.168.42.142(192.168.42.142:3306)..
Fri Aug 27 19:17:02 2021 - [info] Executed CHANGE MASTER.
Fri Aug 27 19:17:03 2021 - [info] Slave started.
Fri Aug 27 19:17:03 2021 - [info] gtid_wait(527601db-06fa-11ec-8eb1-000c29a03380:1-6) completed on 192.168.42.143(192.168.42.143:3306). Executed 0 events.
Fri Aug 27 19:17:04 2021 - [info] End of log messages from 192.168.42.143.
Fri Aug 27 19:17:04 2021 - [info] -- Slave on host 192.168.42.143(192.168.42.143:3306) started.
Fri Aug 27 19:17:04 2021 - [info] All new slave servers recovered successfully.
Fri Aug 27 19:17:04 2021 - [info]
Fri Aug 27 19:17:04 2021 - [info] * Phase 5: New master cleanup phase..
Fri Aug 27 19:17:04 2021 - [info]
Fri Aug 27 19:17:04 2021 - [info] Resetting slave info on the new master..
Fri Aug 27 19:17:04 2021 - [info] 192.168.42.142: Resetting slave info succeeded.
Fri Aug 27 19:17:04 2021 - [info] Master failover to 192.168.42.142(192.168.42.142:3306) completed successfully.
Fri Aug 27 19:17:04 2021 - [info] Deleted server1 entry from /etc/mha/mha.conf .
Fri Aug 27 19:17:04 2021 - [info]
----- Failover Report -----
mha: MySQL Master failover 192.168.42.141(192.168.42.141:3306) to 192.168.42.142(192.168.42.142:3306) succeeded
Master 192.168.42.141(192.168.42.141:3306) is down!
Check MHA Manager logs at mysql03:/var/log/masterha/app1/db1.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.42.141(192.168.42.141:3306)
Selected 192.168.42.142(192.168.42.142:3306) as a new master.
192.168.42.142(192.168.42.142:3306): OK: Applying all logs succeeded.
192.168.42.142(192.168.42.142:3306): OK: Activated master IP address.
192.168.42.143(192.168.42.143:3306): OK: Slave started, replicating from 192.168.42.142(192.168.42.142:3306)
192.168.42.142(192.168.42.142:3306): Resetting slave info succeeded.
Master failover to 192.168.42.142(192.168.42.142:3306) completed successfully.
1.8.14 模拟宕机修复
删除文件
先把192.168.42.143的这个文件删除(如果存在的话)
rm -rf /var/log/masterha/db1/mha.failover.complete
备份192.168.42.142主库以及查看指针位
备份主库
mysqldump --single-transaction --master-data=2 --all-databases -uroot -p123456>all.sql
把备份主库复制到192.168.42.141
scp all.sql root@192.168.42.141:/root
宕机需要复制主机的位置以及文件
查看192.168.42.142
mysql -uroot -p
show master status \G;
192.168.42.141主从配置
启动mysql
mysql start mysqld;
执行all.sql
mysql -uroot -p<all.sql
因为192.168.42.141原来是主机,所以登录重置下master
登入
mysql -uroot -p
重置主机
reset master;
stop slave;
由于
所以
change master to master_host='192.168.42.142',
master_user='replication',
master_password='123456',
MASTER_LOG_FILE='mysql02-bin.000001', MASTER_LOG_POS=154;
start slave;
show slave status \G;
192.168.42.143重启以及查看状态
重启
nohup masterha_manager -conf=/etc/mha/mha.conf&> /etc/mha/manager.log &
查看状态
masterha_check_status --conf=/etc/mha/mha.conf
查看日志
tail -f /var/log/masterha/app1/db1.log