This document addresses recovery from MySQL service outages. MHA provides automatic failover: when the master goes down, MHA picks the slave with the most complete data and promotes it to master. Combined with keepalived and an MHA failover script, a virtual IP floats to the new master, so other services can keep using the database after the original master fails.
I. Basic Environment
OS: CentOS 7
Architecture: x86
Database: mysql-5.7.33
MHA: mha4mysql-0.58
yum repository server: 192.168.1.164
MHA manager server: 192.168.1.169 (hostname: manager)
MySQL master server: 192.168.1.170 (hostname: master)
MySQL slave1 server: 192.168.1.171 (hostname: slave1)
MySQL slave2 server: 192.168.1.172 (hostname: slave2)
*Below, each server is referred to by its hostname.
II. Prerequisites
1. Point every server at the local yum repository (skip this if the servers have Internet access)
For the setup, see my other article on building a local yum repository, with remote access enabled.
When setting baseurl, remember to use your own yum server's address.
2. Set UseDNS to no in sshd
*Only needed if you do not want DNS lookups on SSH logins
vi /etc/ssh/sshd_config
Change #UseDNS yes to UseDNS no
3. Disable SELinux
vi /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled
4. Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
5. Set the hostname (optional, personal preference)
vi /etc/hostname
6. Reboot
reboot
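The hand edits in steps 2 and 3 can also be scripted with sed. A minimal sketch, run here against stand-in copies under a temp directory so it is safe to dry-run; on a real host you would point the variables at /etc/ssh/sshd_config and /etc/selinux/config and run as root:

```shell
# Sketch: apply the sshd and SELinux tweaks from steps 2-3 with sed.
# Uses stand-in copies so this can be dry-run without root; on a real
# host set SSHD_CONF=/etc/ssh/sshd_config and SELINUX_CONF=/etc/selinux/config.
set -eu

WORKDIR=$(mktemp -d)
SSHD_CONF="$WORKDIR/sshd_config"
SELINUX_CONF="$WORKDIR/selinux_config"

# Stand-ins for the stock files
printf '#UseDNS yes\n' > "$SSHD_CONF"
printf 'SELINUX=enforcing\n' > "$SELINUX_CONF"

# Step 2: uncomment UseDNS and set it to no
sed -i 's/^#\?UseDNS .*/UseDNS no/' "$SSHD_CONF"

# Step 3: disable SELinux on next boot
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' "$SELINUX_CONF"

cat "$SSHD_CONF" "$SELINUX_CONF"   # prints: UseDNS no / SELINUX=disabled
```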
III. MySQL Installation and Basic Configuration
Run steps 2-8 on master, slave1 and slave2 to install MySQL.
1. Download the bundle for your OS from the official site
mysql-5.7.33-1.el7.x86_64.rpm-bundle.tar
2. Upload it anywhere on the server and unpack it
tar -xvf mysql-5.7.33-1.el7.x86_64.rpm-bundle.tar
3. Install the unpacked packages
yum remove mysql-libs -y
yum install -y mysql*.rpm
4. Start MySQL and find the initial password
systemctl start mysqld.service
systemctl status mysqld.service
grep "password" /var/log/mysqld.log
*The last matching line shows the initial password (after the colon)
5. Log in to MySQL
mysql -uroot -p
#enter the initial password from above
6. Change the password
ALTER USER 'root'@'localhost' IDENTIFIED BY 'your-password';
7. Enable remote access
use mysql;
update user set host = '%' where user = 'root';
flush privileges;
8. Grant privileges
grant all on *.* to 'root'@'<this server's IP>' identified by 'your-password';
flush privileges;
IV. MySQL Replication Setup
1. Edit the MySQL configuration file
vi /etc/my.cnf
*The three MySQL servers are configured as follows
On the master server:
server-id=11
log-bin=master-log
log_slave_updates=1
bind-address=0.0.0.0
relay_log_purge=0
skip_name_resolve
On the slave1 server:
server-id=12
log-bin=master-log
log_slave_updates=1
bind-address=0.0.0.0
relay_log_purge=0
skip_name_resolve
read_only=1
On the slave2 server:
server-id=13
log-bin=master-log
log_slave_updates=1
bind-address=0.0.0.0
relay_log_purge=0
skip_name_resolve
read_only=1
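A common replication mistake is reusing the same server-id on two nodes. As a sanity check, a hedged sketch that extracts server-id from config fragments and confirms they are all distinct; the heredoc files stand in for the three /etc/my.cnf files, so on real hosts you would grep each node's /etc/my.cnf instead:

```shell
# Sketch: verify the three my.cnf fragments use distinct server-id values.
# The files created below stand in for /etc/my.cnf on master/slave1/slave2.
set -eu
WORKDIR=$(mktemp -d)

printf 'server-id=11\n' > "$WORKDIR/master.cnf"
printf 'server-id=12\n' > "$WORKDIR/slave1.cnf"
printf 'server-id=13\n' > "$WORKDIR/slave2.cnf"

total=$(grep -h '^server-id' "$WORKDIR"/*.cnf | wc -l)
unique=$(grep -h '^server-id' "$WORKDIR"/*.cnf | cut -d= -f2 | sort -u | wc -l)

if [ "$total" -eq "$unique" ]; then
    echo "OK: all $total server-id values are distinct"
else
    echo "ERROR: duplicate server-id values" >&2
    exit 1
fi
```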
2. Restart the mysqld service on all three MySQL servers
systemctl restart mysqld
3. Check the binary log status on the master server
After logging in to MySQL, run:
show master status;
You should see something like:
+-------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-log.000001 | 154 | | | |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
Note the value in the File column (here master-log.000001)
and the value in the Position column (here 154).
4. Configure replication on slave1 and slave2
(1) After logging in to MySQL, run:
change master to master_host='192.168.1.170', master_port=3306, master_user='root', master_password='123456', master_log_file='master-log.000001', master_log_pos=154;
#192.168.1.170 is the master server's IP
#123456 is the MySQL password
#master-log.000001 is the File value from the previous step
#154 is the Position value from the previous step
(2) Start the slave and check its status
start slave;
show slave status\G
(3) If these two lines show Yes, replication is working:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
(4) If it succeeded, skip this step.
If the slave status does not match (3), first make sure no data is being written; if you cannot guarantee that, lock the master with flush tables with read lock; (and run unlock tables; once replication is set up). Then, on the slave, run stop slave; followed by reset slave; and repeat (1)-(3).
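Since step (1) has to be repeated on every slave, it can help to generate the CHANGE MASTER statement from variables instead of retyping it. A minimal sketch, using the example values from above; the host, password, and log coordinates are placeholders you must replace with your own:

```shell
# Sketch: build the CHANGE MASTER TO statement from variables so the
# coordinates from `show master status` only need to be entered once.
set -eu

MASTER_HOST=192.168.1.170    # master server IP
MASTER_PORT=3306
REPL_USER=root
REPL_PASSWORD=123456         # your MySQL password
LOG_FILE=master-log.000001   # File column from `show master status`
LOG_POS=154                  # Position column from `show master status`

SQL="change master to master_host='$MASTER_HOST', master_port=$MASTER_PORT, \
master_user='$REPL_USER', master_password='$REPL_PASSWORD', \
master_log_file='$LOG_FILE', master_log_pos=$LOG_POS;"

echo "$SQL"
# On each slave, feed it to the client, e.g.:
#   echo "$SQL" | mysql -uroot -p
```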
V. MHA Installation and Configuration
1. Passwordless SSH between all nodes
(1) Run the following on all four servers (manager, master, slave1, slave2):
ssh-keygen -t rsa
#press Enter at every prompt
cat /root/.ssh/id_rsa.pub
#save the public key of each of the four servers
(2) On manager, collect the keys of all four machines (including its own):
cd /root/.ssh
touch authorized_keys
vi authorized_keys
#paste in the four public keys collected above
chmod 600 authorized_keys
(3) From manager, copy authorized_keys to the other three servers
scp /root/.ssh/authorized_keys root@192.168.1.170:/root/.ssh/
scp /root/.ssh/authorized_keys root@192.168.1.171:/root/.ssh/
scp /root/.ssh/authorized_keys root@192.168.1.172:/root/.ssh/
(4) From manager, ssh to any of the other servers to verify
ssh 192.168.1.170
#you should be logged straight into the .170 server; leave with the exit command
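As an alternative to the manual copy-paste in (1)-(3), ssh-copy-id can push a key to each host. A dry-run sketch that only echoes the commands it would run (drop the echo to actually execute them; the host list assumes the four IPs above). Note that MHA also needs the MySQL nodes to reach each other over SSH, which is why the authorized_keys file is shared among all four hosts:

```shell
# Sketch: distribute this host's public key with ssh-copy-id instead of
# hand-editing authorized_keys. Dry-run: echoes each command rather than
# running it; remove the `echo` on a real host.
set -eu

HOSTS="192.168.1.169 192.168.1.170 192.168.1.171 192.168.1.172"

for host in $HOSTS; do
    echo ssh-copy-id -i /root/.ssh/id_rsa.pub "root@$host"
done
```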
2. MHA installation
(1) Prepare the MHA files
mha-0.58.zip
After unpacking:
mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
mha4mysql-manager-0.58.tar.gz
mha4mysql-node-0.58-0.el7.centos.noarch.rpm
mha4mysql-node-0.58.tar.gz
Upload mha4mysql-node-0.58-0.el7.centos.noarch.rpm
to all four servers (manager, master, slave1, slave2).
Upload mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
to the manager server.
(2) Install the MHA node component on manager, master, slave1 and slave2
yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm
(3) On the manager server, install the extra dependencies the manager component needs
mha4-manager extra dependencies (mha4-manager-deps.zip)
After unpacking:
perl-Config-Tiny-2.14-7.el7.noarch.rpm
perl-Email-Date-Format-1.002-15.el7.noarch.rpm
perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm
perl-Mail-Sender-0.8.23-1.el7.noarch.rpm
perl-Mail-Sendmail-0.79-21.el7.noarch.rpm
perl-MIME-Lite-3.030-1.el7.noarch.rpm
perl-MIME-Types-1.38-2.el7.noarch.rpm
perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm
Install them all; the easiest way is to put them in one directory and run
yum install -y *.rpm
(4) Install the MHA manager component on the manager server
yum install -y mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
3. MHA configuration
mkdir /etc/mha_master
cd /etc/mha_master
vi app1.cnf
The configuration file:
[server default]
# manager's working directory
manager_workdir=/var/log/masterha/app1
# manager's log file
manager_log=/var/log/masterha/app1/manager.log
# working directory on the MySQL servers
remote_workdir=/var/log/masterha/app1
# script run on failover to move the VIP
master_ip_failover_script=/usr/local/bin/master_ip_failover
# check master reachability from additional hosts
# secondary_check_script= masterha_secondary_check -s 192.168.1.169 -s 192.168.1.171
# MySQL user name and password
user=root
password=your-mysql-password
[server1]
hostname=192.168.1.170
candidate_master=1
[server2]
hostname=192.168.1.171
candidate_master=1
[server3]
hostname=192.168.1.172
4. master_ip_failover setup
cd /usr/local/bin/
vi master_ip_failover
The script:
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;

my (
    $command,        $ssh_user,         $orig_master_host,
    $orig_master_ip, $orig_master_port, $new_master_host,
    $new_master_ip,  $new_master_port,  $new_master_user,
    $new_master_password
);
GetOptions(
    'command=s'             => \$command,
    'ssh_user=s'            => \$ssh_user,
    'orig_master_host=s'    => \$orig_master_host,
    'orig_master_ip=s'      => \$orig_master_ip,
    'orig_master_port=i'    => \$orig_master_port,
    'new_master_host=s'     => \$new_master_host,
    'new_master_ip=s'       => \$new_master_ip,
    'new_master_port=i'     => \$new_master_port,
    'new_master_user=s'     => \$new_master_user,
    'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
    if ( $command eq "stop" || $command eq "stopssh" ) {

        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
        # If you manage master ip address at global catalog database,
        # invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
            `ssh root\@$orig_master_host \"systemctl stop keepalived\"`;

            # updating global catalog, etc
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            my $new_master_handler = new MHA::DBHelper();

            # args: hostname, port, user, password, raise_error_or_not
            $new_master_handler->connect( $new_master_ip, $new_master_port,
                $new_master_user, $new_master_password, 1 );

            ## Set read_only=0 on the new master
            $new_master_handler->disable_log_bin_local();
            print "Set read_only=0 on the new master.\n";
            $new_master_handler->disable_read_only();

            ## Creating an app user on the new master
            print "Creating app user on the new master..\n";
            #FIXME_xxx_create_user( $new_master_handler->{dbh} );
            $new_master_handler->enable_log_bin_local();
            $new_master_handler->disconnect();

            ## Update master ip on the catalog database, etc
            # FIXME_xxx;
            `ssh root\@$new_master_host \"systemctl start keepalived\"`;
            $exit_code = 0;
        };
        if ($@) {
            warn $@;

            # If you want to continue failover, exit 10.
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {

        # do nothing
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
A brief explanation: keepalived handles the IP drift here. The virtual IP is fixed and floats among the three servers depending on where the keepalived service is running (the same effect could be achieved with ifconfig or ip addr commands instead). The script above is the official sample script with small modifications. The mechanism: when MHA detects that the master is down, it runs this script; the stop and start blocks are where you add the commands you want executed.
1. In the stop block, add: `ssh root\@$orig_master_host \"systemctl stop keepalived\"`;
**$orig_master_host is the old master's IP; systemctl stop keepalived stops keepalived there, which releases the virtual IP.
2. In the start block, add: `ssh root\@$new_master_host \"systemctl start keepalived\"`;
**$new_master_host is the new master's IP; systemctl start keepalived starts keepalived there, which brings the virtual IP up.
3. The FIXME_xxx_create_user and FIXME_xxx; lines in the stock sample must stay commented out; they are not used here.
5. Make master_ip_failover executable
chmod +x /usr/local/bin/master_ip_failover
VI. keepalived Installation and Configuration
1. Install keepalived on master, slave1 and slave2
yum install -y keepalived
2. Configure keepalived
In the configuration below, the virtual IP is 192.168.1.180
vi /etc/keepalived/keepalived.conf
The configuration:
! Configuration File for keepalived
global_defs {
    # set a distinct identifier on each server
    router_id mysql-master
}
vrrp_instance VI_1 {
    state BACKUP
    # network interface name; check with `ip addr`
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        # the virtual IP
        192.168.1.180
    }
}
# virtual IP and MySQL port
virtual_server 192.168.1.180 3306 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP
    # this server's real IP and real MySQL port;
    # change the IP to match each of master, slave1, slave2
    real_server 192.168.1.170 3306 {
        weight 1
    }
}
3. Start keepalived on the current master server
systemctl start keepalived
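Whether the VIP is currently bound on a given server can be confirmed by grepping the output of ip addr. A sketch that parses embedded sample output so it can be dry-run anywhere; on a live host you would pipe the real command in, e.g. `ip -o -4 addr show | grep 192.168.1.180`:

```shell
# Sketch: check whether the keepalived VIP is bound on this host.
# SAMPLE stands in for the output of `ip -o -4 addr show`.
set -eu

VIP=192.168.1.180

SAMPLE='2: ens33    inet 192.168.1.170/24 brd 192.168.1.255 scope global ens33
2: ens33    inet 192.168.1.180/32 scope global ens33'

if printf '%s\n' "$SAMPLE" | grep -qF "$VIP/"; then
    STATUS="VIP $VIP is active on this host"
else
    STATUS="VIP $VIP is NOT on this host"
fi
echo "$STATUS"
```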
VII. Start MHA on the manager Server
1. Check SSH connectivity
masterha_check_ssh -conf=/etc/mha_master/app1.cnf
On success:
All SSH connection tests passed successfully.
2. Check replication health across the cluster
masterha_check_repl -conf=/etc/mha_master/app1.cnf
On success:
MySQL Replication Health is OK.
3. Start MHA
nohup masterha_manager -conf=/etc/mha_master/app1.cnf -ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
4. Watch the MHA log
tail -f /var/log/masterha/app1/manager.log
On success:
[info] OK.
[warning] shutdown_script is not defined.
[info] Set master ping interval 3 seconds.
[warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
[info] Starting ping health check on 192.168.1.170(192.168.1.170:3306)..
[info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
5. Check MHA status
masterha_check_status -conf=/etc/mha_master/app1.cnf
On success:
app1 (pid:4316) is running(0:PING_OK), master:192.168.1.170
VIII. Testing MHA Failover and IP Drift
1. Simulate a MySQL outage on the master server
systemctl stop mysqld
2. Verify MHA's failover on slave2
mysql -uroot -p
#enter the password
Once logged in, run
show slave status\G
It shows
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.171
Master_User: root
Master_Port: 3306
Master_Host shows the master has switched from .170 to .171.
On the old master (.170), ip addr shows the virtual IP 192.168.1.180 is gone.
On the new master (.171), ip addr shows the virtual IP 192.168.1.180 is now bound.
IX. Notes on Recovering a Failed Node
1. To bring a failed node back, first start MySQL on it (skip if already running)
2. Make the recovered node read-only; log in to MySQL and run
set global read_only=1;
3. On the new master, lock the tables; log in to MySQL and run
flush tables with read lock;
4. Following section IV, configure the recovered node as a slave of the new master
5. On the new master, release the lock; log in to MySQL and run
unlock tables;
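Steps 1-5 above can be collected into one ordered SQL sequence. A sketch that just prints the statements with notes on where each one runs; the host, password, and log coordinates are placeholders to fill in from the current master's show master status:

```shell
# Sketch: print the recovery SQL in the order described above. The
# comments mark which node each statement runs on; <...> values are
# placeholders, not real credentials or coordinates.
set -eu

RECOVERY_SQL=$(cat <<'EOF'
-- on the recovered node: come back read-only
set global read_only=1;
-- on the current master: block writes while the slave catches up
flush tables with read lock;
-- on the recovered node: point it at the current master (fill in real values)
change master to master_host='<new-master-ip>', master_port=3306,
  master_user='root', master_password='<password>',
  master_log_file='<File>', master_log_pos=<Position>;
start slave;
-- on the current master: release the lock once replication is running
unlock tables;
EOF
)
printf '%s\n' "$RECOVERY_SQL"
```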