MYSQL MHA高可用环境搭建

1.环境架构介绍

机器名称IP角色权限
Mysql_Master10.0.0.111数据库Master数据库Master可读写、主库
Mysql_Slave110.0.0.112数据库Master只读、从库
Mysql_Slave210.0.0.113数据库Master只读、从库
Mysql_MHA10.0.0.114数据库Master高可用监控

2.主从搭建

2.1 MySQL安装

2.1.1 下载

wget https://cdn.mysql.com/archives/mysql-5.7/mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar

解压

tar xvf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar

2.1.2 安装

移除mariadb-lib依赖

[root@localhost ~]# rpm -qa|grep mariadb
mariadb-libs-5.5.60-1.el7_5.x86_64
[root@localhost ~]# rpm -e mariadb-libs-5.5.60-1.el7_5.x86_64 --nodeps

安装依赖 libaio 为mysql-community-server-5.7.28-1.el7.x86_64.rpm需要的依赖

[root@localhost ~]# yum install -y libaio

缺失情况会报以下错误

[root@localhost ~]# rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm 
warning: mysql-community-server-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
error: Failed dependencies:
	/usr/bin/perl is needed by mysql-community-server-5.7.28-1.el7.x86_64
	libaio.so.1()(64bit) is needed by mysql-community-server-5.7.28-1.el7.x86_64
	libaio.so.1(LIBAIO_0.1)(64bit) is needed by mysql-community-server-5.7.28-1.el7.x86_64
	libaio.so.1(LIBAIO_0.4)(64bit) is needed by mysql-community-server-5.7.28-1.el7.x86_64
	net-tools is needed by mysql-community-server-5.7.28-1.el7.x86_64
	perl(Getopt::Long) is needed by mysql-community-server-5.7.28-1.el7.x86_64
	perl(strict) is needed by mysql-community-server-5.7.28-1.el7.x86_64

安装mysql

rpm -ivh mysql-community-common-5.7.28-1.el7.x86_64.rpm 
rpm -ivh mysql-community-libs-5.7.28-1.el7.x86_64.rpm 
rpm -ivh mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm 
rpm -ivh mysql-community-client-5.7.28-1.el7.x86_64.rpm 
rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm 
rpm -ivh mysql-community-devel-5.7.28-1.el7.x86_64.rpm

2.1.3 启动MySQL

初始化用户

mysqld --initialize --user=mysql

查看初始化密码

cat /var/log/mysqld.log | grep password

启动MySQL服务

systemctl start mysqld

配置为开机启动

systemctl enable mysqld

修改默认密码

mysql -uroot -p
xxxxxx 输入默认密码
SET PASSWORD = PASSWORD('123456');

2.2 设置防火墙

2.2.1 配置/etc/sysconfig/iptables增加防火墙规则(生产推荐)

2.2.2.直接关闭防火墙(测试环境)

记得关,血的教训。

systemctl stop firewalld

2.3 Mysql主从配置

2.3.1 Master节点配置

修改Master的my.cnf配置文件
vi /etc/my.cnf
#bin_log配置 
log_bin=mysql-bin 
server-id=1 
sync-binlog=1 
binlog-ignore-db=information_schema 
binlog-ignore-db=mysql 
binlog-ignore-db=performance_schema 
binlog-ignore-db=sys 
#relay_log配置 
relay_log=mysql-relay-bin 
log_slave_updates=1 
relay_log_purge=0
重启服务
systemctl restart mysqld
主库给从库授权

登录主库后执行命令:

mysql -uroot -p123456

mysql> grant replication slave on *.* to root@'%' identified by '123456';
mysql> grant all privileges on *.* to root@'%' identified by '123456';

mysql> flush privileges;
//查看主库状态信息,例如master_log_file='mysql-bin.000001',master_log_pos=869 
mysql> show master status;

+------------------+----------+--------------+-------------------------------------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB                                | Executed_Gtid_Set |
+------------------+----------+--------------+-------------------------------------------------+-------------------+
| mysql-bin.000001 |      869 |              | information_schema,mysql,performance_schema,sys |                   |
+------------------+----------+--------------+-------------------------------------------------+-------------------+
1 row in set (0.00 sec)

注意:这个File:mysql-bin.000001 和 Position:869 是下面用于从节点同步填入的参数信息。

change master to master_host='10.0.0.111',master_port=3306,master_user='root',master_password ='123456',master_log_file='mysql-bin.000001',master_log_pos=869;

2.3.2 Slave节点配置

修改Slave的my.cnf配置文件
vi /etc/my.cnf

10.0.0.112配置

#bin_log配置 
log_bin=mysql-bin 
server-id=2
sync-binlog=1 
binlog-ignore-db=information_schema 
binlog-ignore-db=mysql 
binlog-ignore-db=performance_schema 
binlog-ignore-db=sys 
#relay_log配置 
relay_log=mysql-relay-bin 
log_slave_updates=1 
relay_log_purge=0
read_only=1

10.0.0.113配置

#bin_log配置 
log_bin=mysql-bin 
server-id=3
sync-binlog=1 
binlog-ignore-db=information_schema 
binlog-ignore-db=mysql 
binlog-ignore-db=performance_schema 
binlog-ignore-db=sys 
#relay_log配置 
relay_log=mysql-relay-bin 
log_slave_updates=1 
relay_log_purge=0
read_only=1
重启服务
systemctl restart mysqld
开启同步

登录Slave节点MySQL,执行同步操作命令,该命令内部参数为上面 show master status操作显示的参数

change master to master_host='10.0.0.111',master_port=3306,master_user='root',master_password ='123456',master_log_file='mysql-bin.000001',master_log_pos=869;

start slave; // 开启同步

2.4 配置半同步复制

2.4.1 Master节点

登录主节点MySQL,安装插件
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.03 sec)

mysql> show variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | OFF        |
| rpl_semi_sync_master_timeout              | 10000      |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set (0.00 sec)

mysql> 
使用vi /etc/my.cnf,修改MySQL配置文件
# 自动开启半同步复制 
rpl_semi_sync_master_enabled=ON 
rpl_semi_sync_master_timeout=1000
重启MySQL服务
systemctl restart mysqld

2.4.2 Slave节点

登录Slave命令行执行命令安装插件
mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
Query OK, 0 rows affected (0.02 sec)

mysql> show variables like '%semi%';
+---------------------------------+-------+
| Variable_name                   | Value |
+---------------------------------+-------+
| rpl_semi_sync_slave_enabled     | OFF   |
| rpl_semi_sync_slave_trace_level | 32    |
+---------------------------------+-------+
2 rows in set (0.00 sec)

mysql> 
使用vi /etc/my.cnf,修改MySQL配置文件
# 自动开启半同步复制 
rpl_semi_sync_slave_enabled=ON
重启MySQL服务
systemctl restart mysqld
测试半同步状态

通过MySQL命令检查参数是否开启半同步

show variables like '%semi%';

主节点信息:

mysql> show variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 1000       |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set (0.00 sec)

mysql> 

从节点信息:

mysql> show variables like '%semi%';
+---------------------------------+-------+
| Variable_name                   | Value |
+---------------------------------+-------+
| rpl_semi_sync_slave_enabled     | ON    |
| rpl_semi_sync_slave_trace_level | 32    |
+---------------------------------+-------+
2 rows in set (0.00 sec)

mysql> 

查看主节点日志

cat -f /var/log/mysqld.log

如果正确启动会出现如下日志信息

Start binlog_dump to master_thread_id(5) slave_server(2), pos(mysql-bin.000002, 154)
[Note] Start semi-sync binlog_dump to slave (server_id: 2), pos(mysql-bin.000002, 154)

3.MHA高可用搭建

3.1 配置四台机器ssh互通

全部服务器执行如下命令,生成公钥和私钥

ssh-keygen -t rsa

在Master、Slave1、Slave2分别执行下面命令,将公钥拷贝到MHA服务器

ssh-copy-id 10.0.0.114

MHA服务器检查.ssh/authorized_key文件是否包含3个公钥

cat /root/.ssh/authorized_keys

将MHA的公钥追加到authorized_keys文件中(此时共计4个公钥)

cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

从MHA服务器执行命令,向Master、Slave1、Slave2同步公钥信息。

scp /root/.ssh/authorized_keys root@10.0.0.111:/root/.ssh/authorized_keys
scp /root/.ssh/authorized_keys root@10.0.0.112:/root/.ssh/authorized_keys
scp /root/.ssh/authorized_keys root@10.0.0.113:/root/.ssh/authorized_keys

在MHA服务器通过SSH命令分别测试连接Master、Slave1、Slave2服务器。

ssh 10.0.0.111
exit
ssh 10.0.0.112
exit
ssh 10.0.0.113
exit

3.2 MHA下载安装

MHA下载

MySQL5.7对应的MHA版本是0.5.8

采用wget命令分别下载manager和node包

所有服务器都安装Node

wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm

MHA Manager服务器需要安装manager

wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

MHA Node 安装

所有服务器上安装mha4mysql-node(反复检查确认全部安装成功,血的教训)

MHA的Node依赖per-DBD-MySQL,需要先安装per-DBD-MySQL。

yum install perl-DBD-MySQL -y

安装mha4mysql-node

rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm

MHA Manager安装

MHA Manager依赖 per-Config-Tiny、per-Log-Dispatch、per-Parallel-ForkManager。

安装EPEL源,不安装的话 yum install 无法找到perl-Log-Dispatch、perl-Parallel-ForkManager的包。

wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm 
rpm -ivh epel-release-latest-7.noarch.rpm

更新后执行命令:

yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager -y

安装manager

rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

MHA 配置文件

MHA服务器需要为每个监控的Master/Slave集群提供一个专用的配置文件,所有的Master/Slave集群共享配置文件。

初始化配置目录
#目录说明 
#/var/log (CentOS目录) 
#       /mha (MHA监控根目录) 
#          /app1 (MHA监控实例根目录) 
#              /manager.log (MHA监控实例日志文件) 
mkdir -p /var/log/mha/app1 
touch /var/log/mha/app1/manager.log
配置监控全局配置文件

在主数据库(10.0.0.111) mysql库中创建一个新用户

mysql> create user 'mha'@'%' identified by '123123';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all on *.* to mha@'%' identified by '123123';
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> 

MHA服务器(10.0.0.114)创建全局监控文件

vi /etc/masterha_default.cnf
[server default] 
user=mha 
password=123123 
port=3306 
#ssh登录账号 
ssh_user=root 
#从库复制账号和密码 
repl_user=root 
repl_password=123456 
port=3306 
#ping次数 
ping_interval=1 
#二次检查的主机 
secondary_check_script=masterha_secondary_check -s 10.0.0.111 -s 10.0.0.112 -s 10.0.0.113
配置监控实例配置文件
# 先创建目录
mkdir -p /etc/mha 
vi /etc/mha/app1.cnf
[server default] 
#MHA监控实例根目录 
manager_workdir=/var/log/mha/app1 
#MHA监控实例日志文件 
manager_log=/var/log/mha/app1/manager.log 

#[serverx] 服务器编号 
#hostname 主机名 
#candidate_master 可以做主库 
#master_binlog_dir binlog日志文件目录 

[server1] 
hostname=10.0.0.111
candidate_master=1 
master_binlog_dir="/var/lib/mysql" 

[server2] 
hostname=10.0.0.112
candidate_master=1 
master_binlog_dir="/var/lib/mysql"

[server3] 
hostname=10.0.0.113
candidate_master=1 
master_binlog_dir="/var/lib/mysql"

MHA配置检查

执行ssh通信检查

在MHA Manager服务器执行命令:

masterha_check_ssh --conf=/etc/mha/app1.cnf

正确响应结果

[root@localhost mha]# masterha_check_ssh --conf=/etc/mha/app1.cnf
Fri Jun 11 06:46:22 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Fri Jun 11 06:46:22 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Fri Jun 11 06:46:22 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Fri Jun 11 06:46:22 2021 - [info] Starting SSH connection tests..
Fri Jun 11 06:46:24 2021 - [debug] 
Fri Jun 11 06:46:22 2021 - [debug]  Connecting via SSH from root@10.0.0.111(10.0.0.111:22) to root@10.0.0.112(10.0.0.112:22)..
Warning: Permanently added '10.0.0.112' (ECDSA) to the list of known hosts.
Fri Jun 11 06:46:22 2021 - [debug]   ok.
Fri Jun 11 06:46:22 2021 - [debug]  Connecting via SSH from root@10.0.0.111(10.0.0.111:22) to root@10.0.0.113(10.0.0.113:22)..
Fri Jun 11 06:46:23 2021 - [debug]   ok.
Fri Jun 11 06:46:24 2021 - [debug] 
Fri Jun 11 06:46:22 2021 - [debug]  Connecting via SSH from root@10.0.0.112(10.0.0.112:22) to root@10.0.0.111(10.0.0.111:22)..
Warning: Permanently added '10.0.0.111' (ECDSA) to the list of known hosts.
Fri Jun 11 06:46:23 2021 - [debug]   ok.
Fri Jun 11 06:46:23 2021 - [debug]  Connecting via SSH from root@10.0.0.112(10.0.0.112:22) to root@10.0.0.113(10.0.0.113:22)..
Fri Jun 11 06:46:24 2021 - [debug]   ok.
Fri Jun 11 06:46:25 2021 - [debug] 
Fri Jun 11 06:46:23 2021 - [debug]  Connecting via SSH from root@10.0.0.113(10.0.0.113:22) to root@10.0.0.111(10.0.0.111:22)..
Fri Jun 11 06:46:24 2021 - [debug]   ok.
Fri Jun 11 06:46:24 2021 - [debug]  Connecting via SSH from root@10.0.0.113(10.0.0.113:22) to root@10.0.0.112(10.0.0.112:22)..
Fri Jun 11 06:46:24 2021 - [debug]   ok.
Fri Jun 11 06:46:25 2021 - [info] All SSH connection tests passed successfully.
检测MySQL主从复制

在MHA Manager服务器上执行:

masterha_check_repl --conf=/etc/mha/app1.cnf

出现如下信息,表示集群没有问题。

[root@localhost ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
Fri Jun 11 06:54:46 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Fri Jun 11 06:54:46 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Fri Jun 11 06:54:46 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Fri Jun 11 06:54:46 2021 - [info] MHA::MasterMonitor version 0.58.
Fri Jun 11 06:54:47 2021 - [info] GTID failover mode = 0
Fri Jun 11 06:54:47 2021 - [info] Dead Servers:
Fri Jun 11 06:54:47 2021 - [info] Alive Servers:
Fri Jun 11 06:54:47 2021 - [info]   10.0.0.111(10.0.0.111:3306)
Fri Jun 11 06:54:47 2021 - [info]   10.0.0.112(10.0.0.112:3306)
Fri Jun 11 06:54:47 2021 - [info]   10.0.0.113(10.0.0.113:3306)
Fri Jun 11 06:54:47 2021 - [info] Alive Slaves:
Fri Jun 11 06:54:47 2021 - [info]   10.0.0.112(10.0.0.112:3306)  Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Fri Jun 11 06:54:47 2021 - [info]     Replicating from 10.0.0.111(10.0.0.111:3306)
Fri Jun 11 06:54:47 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Jun 11 06:54:47 2021 - [info]   10.0.0.113(10.0.0.113:3306)  Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Fri Jun 11 06:54:47 2021 - [info]     Replicating from 10.0.0.111(10.0.0.111:3306)
Fri Jun 11 06:54:47 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Jun 11 06:54:47 2021 - [info] Current Alive Master: 10.0.0.111(10.0.0.111:3306)
Fri Jun 11 06:54:47 2021 - [info] Checking slave configurations..
Fri Jun 11 06:54:47 2021 - [info] Checking replication filtering settings..
Fri Jun 11 06:54:47 2021 - [info]  binlog_do_db= , binlog_ignore_db= information_schema,mysql,performance_schema,sys
Fri Jun 11 06:54:47 2021 - [info]  Replication filtering check ok.
Fri Jun 11 06:54:47 2021 - [info] GTID (with auto-pos) is not supported
Fri Jun 11 06:54:47 2021 - [info] Starting SSH connection tests..
Fri Jun 11 06:54:49 2021 - [info] All SSH connection tests passed successfully.
Fri Jun 11 06:54:49 2021 - [info] Checking MHA Node version..
Fri Jun 11 06:54:50 2021 - [info]  Version check ok.
Fri Jun 11 06:54:50 2021 - [info] Checking SSH publickey authentication settings on the current master..
Fri Jun 11 06:54:50 2021 - [info] HealthCheck: SSH to 10.0.0.111 is reachable.
Fri Jun 11 06:54:51 2021 - [info] Master MHA Node version is 0.58.
Fri Jun 11 06:54:51 2021 - [info] Checking recovery script configurations on 10.0.0.111(10.0.0.111:3306)..
Fri Jun 11 06:54:51 2021 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000003 
Fri Jun 11 06:54:51 2021 - [info]   Connecting to root@10.0.0.111(10.0.0.111:22).. 
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000003
Fri Jun 11 06:54:51 2021 - [info] Binlog setting check done.
Fri Jun 11 06:54:51 2021 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Fri Jun 11 06:54:51 2021 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=10.0.0.112 --slave_ip=10.0.0.112 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.28-log --manager_version=0.58 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Fri Jun 11 06:54:51 2021 - [info]   Connecting to root@10.0.0.112(10.0.0.112:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mysql-relay-bin.000005
    Temporary relay log file is /var/lib/mysql/mysql-relay-bin.000005
    Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
    Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Jun 11 06:54:52 2021 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=10.0.0.113 --slave_ip=10.0.0.113 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.28-log --manager_version=0.58 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Fri Jun 11 06:54:52 2021 - [info]   Connecting to root@10.0.0.113(10.0.0.113:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mysql-relay-bin.000005
    Temporary relay log file is /var/lib/mysql/mysql-relay-bin.000005
    Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
    Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Jun 11 06:54:52 2021 - [info] Slaves settings check done.
Fri Jun 11 06:54:52 2021 - [info] 
10.0.0.111(10.0.0.111:3306) (current master)
 +--10.0.0.112(10.0.0.112:3306)
 +--10.0.0.113(10.0.0.113:3306)

Fri Jun 11 06:54:52 2021 - [info] Checking replication health on 10.0.0.112..
Fri Jun 11 06:54:52 2021 - [info]  ok.
Fri Jun 11 06:54:52 2021 - [info] Checking replication health on 10.0.0.113..
Fri Jun 11 06:54:52 2021 - [info]  ok.
Fri Jun 11 06:54:52 2021 - [warning] master_ip_failover_script is not defined.
Fri Jun 11 06:54:52 2021 - [warning] shutdown_script is not defined.
Fri Jun 11 06:54:52 2021 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

监测中发现的问题

  1. 从节点没有安装节点的错误信息

    [root@localhost mha]# masterha_check_repl --conf=/etc/mha/app1.cnf
    Fri Jun 11 06:49:15 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
    Fri Jun 11 06:49:15 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Fri Jun 11 06:49:15 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
    Fri Jun 11 06:49:15 2021 - [info] MHA::MasterMonitor version 0.58.
    Fri Jun 11 06:49:16 2021 - [info] GTID failover mode = 0
    Fri Jun 11 06:49:16 2021 - [info] Dead Servers:
    Fri Jun 11 06:49:16 2021 - [info] Alive Servers:
    Fri Jun 11 06:49:16 2021 - [info]   10.0.0.111(10.0.0.111:3306)
    Fri Jun 11 06:49:16 2021 - [info]   10.0.0.112(10.0.0.112:3306)
    Fri Jun 11 06:49:16 2021 - [info]   10.0.0.113(10.0.0.113:3306)
    Fri Jun 11 06:49:16 2021 - [info] Alive Slaves:
    Fri Jun 11 06:49:16 2021 - [info]   10.0.0.112(10.0.0.112:3306)  Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
    Fri Jun 11 06:49:16 2021 - [info]     Replicating from 10.0.0.111(10.0.0.111:3306)
    Fri Jun 11 06:49:16 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Fri Jun 11 06:49:16 2021 - [info]   10.0.0.113(10.0.0.113:3306)  Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
    Fri Jun 11 06:49:16 2021 - [info]     Replicating from 10.0.0.111(10.0.0.111:3306)
    Fri Jun 11 06:49:16 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Fri Jun 11 06:49:16 2021 - [info] Current Alive Master: 10.0.0.111(10.0.0.111:3306)
    Fri Jun 11 06:49:16 2021 - [info] Checking slave configurations..
    Fri Jun 11 06:49:16 2021 - [info] Checking replication filtering settings..
    Fri Jun 11 06:49:16 2021 - [info]  binlog_do_db= , binlog_ignore_db= information_schema,mysql,performance_schema,sys
    Fri Jun 11 06:49:16 2021 - [info]  Replication filtering check ok.
    Fri Jun 11 06:49:16 2021 - [info] GTID (with auto-pos) is not supported
    Fri Jun 11 06:49:16 2021 - [info] Starting SSH connection tests..
    Fri Jun 11 06:49:18 2021 - [info] All SSH connection tests passed successfully.
    Fri Jun 11 06:49:18 2021 - [info] Checking MHA Node version..
    Fri Jun 11 06:49:19 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln122] Got error when getting node version. Error:
    Fri Jun 11 06:49:19 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln123] 
    bash: apply_diff_relay_logs: command not found
    Fri Jun 11 06:49:19 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln150] node version on 10.0.0.112 not found! Is MHA Node package installed ?
     at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 377.
    Fri Jun 11 06:49:19 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. node version on 10.0.0.112 not found! Is MHA Node package installed ?
     at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 377.
    	...propagated at /usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm line 151.
    Fri Jun 11 06:49:19 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
    Fri Jun 11 06:49:19 2021 - [info] Got exit code 1 (Not master dead).
    
    MySQL Replication Health is NOT OK!
    
  2. 主节点没有安装node的错误信息

    [root@localhost mha]# masterha_check_repl --conf=/etc/mha/app1.cnf
    Fri Jun 11 06:51:14 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
    Fri Jun 11 06:51:14 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Fri Jun 11 06:51:14 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
    Fri Jun 11 06:51:14 2021 - [info] MHA::MasterMonitor version 0.58.
    Fri Jun 11 06:51:15 2021 - [info] GTID failover mode = 0
    Fri Jun 11 06:51:15 2021 - [info] Dead Servers:
    Fri Jun 11 06:51:15 2021 - [info] Alive Servers:
    Fri Jun 11 06:51:15 2021 - [info]   10.0.0.111(10.0.0.111:3306)
    Fri Jun 11 06:51:15 2021 - [info]   10.0.0.112(10.0.0.112:3306)
    Fri Jun 11 06:51:15 2021 - [info]   10.0.0.113(10.0.0.113:3306)
    Fri Jun 11 06:51:15 2021 - [info] Alive Slaves:
    Fri Jun 11 06:51:15 2021 - [info]   10.0.0.112(10.0.0.112:3306)  Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
    Fri Jun 11 06:51:15 2021 - [info]     Replicating from 10.0.0.111(10.0.0.111:3306)
    Fri Jun 11 06:51:15 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Fri Jun 11 06:51:15 2021 - [info]   10.0.0.113(10.0.0.113:3306)  Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
    Fri Jun 11 06:51:15 2021 - [info]     Replicating from 10.0.0.111(10.0.0.111:3306)
    Fri Jun 11 06:51:15 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Fri Jun 11 06:51:15 2021 - [info] Current Alive Master: 10.0.0.111(10.0.0.111:3306)
    Fri Jun 11 06:51:15 2021 - [info] Checking slave configurations..
    Fri Jun 11 06:51:15 2021 - [info] Checking replication filtering settings..
    Fri Jun 11 06:51:15 2021 - [info]  binlog_do_db= , binlog_ignore_db= information_schema,mysql,performance_schema,sys
    Fri Jun 11 06:51:15 2021 - [info]  Replication filtering check ok.
    Fri Jun 11 06:51:16 2021 - [info] GTID (with auto-pos) is not supported
    Fri Jun 11 06:51:16 2021 - [info] Starting SSH connection tests..
    Fri Jun 11 06:51:19 2021 - [info] All SSH connection tests passed successfully.
    Fri Jun 11 06:51:19 2021 - [info] Checking MHA Node version..
    Fri Jun 11 06:51:20 2021 - [info]  Version check ok.
    Fri Jun 11 06:51:20 2021 - [info] Checking SSH publickey authentication settings on the current master..
    Fri Jun 11 06:51:20 2021 - [info] HealthCheck: SSH to 10.0.0.111 is reachable.
    Fri Jun 11 06:51:21 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln122] Got error when getting node version. Error:
    Fri Jun 11 06:51:21 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln123] 
    bash: apply_diff_relay_logs: command not found
    Fri Jun 11 06:51:21 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln135] Failed to get MHA node version on the current master even though current master is reachable via SSH!
    Fri Jun 11 06:51:21 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48.
    Fri Jun 11 06:51:21 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
    Fri Jun 11 06:51:21 2021 - [info] Got exit code 1 (Not master dead).
    
    MySQL Replication Health is NOT OK!
    

启动MHA Manager

MHA Manager服务器执行命令:

nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf -- ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &

查看监控状态:

[root@localhost ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:5503) is running(0:PING_OK), master:10.0.0.111
[root@localhost ~]# 

查看监控日志:

[root@localhost ~]# tail -f /var/log/mha/app1/manager.log
10.0.0.111(10.0.0.111:3306) (current master)
 +--10.0.0.112(10.0.0.112:3306)
 +--10.0.0.113(10.0.0.113:3306)

Fri Jun 11 07:00:20 2021 - [warning] master_ip_failover_script is not defined.
Fri Jun 11 07:00:20 2021 - [warning] shutdown_script is not defined.
Fri Jun 11 07:00:20 2021 - [info] Set master ping interval 1 seconds.
Fri Jun 11 07:00:20 2021 - [info] Set secondary check script: masterha_secondary_check -s 10.0.0.111 -s 10.0.0.112 -s 10.0.0.113
Fri Jun 11 07:00:20 2021 - [info] Starting ping health check on 10.0.0.111(10.0.0.111:3306)..
Fri Jun 11 07:00:20 2021 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

测试MHA故障转移

在MHA Manager服务器监控日志信息:

tail -200f /var/log/mha/app1/manager.log

关闭主节点MySQL服务,模拟主节点宕机

systemctl stop mysqld

监控日志输出如下信息:

可以看到当前主节点转移到了112

Selected 10.0.0.112(10.0.0.112:3306) as a new master.


----- Failover Report -----

app1: MySQL Master failover 10.0.0.111(10.0.0.111:3306) to 10.0.0.112(10.0.0.112:3306) succeeded

Master 10.0.0.111(10.0.0.111:3306) is down!

Check MHA Manager logs at localhost.localdomain:/var/log/mha/app1/manager.log for details.

Started automated(non-interactive) failover.
The latest slave 10.0.0.112(10.0.0.112:3306) has all relay logs for recovery.
Selected 10.0.0.112(10.0.0.112:3306) as a new master.
10.0.0.112(10.0.0.112:3306): OK: Applying all logs succeeded.
10.0.0.113(10.0.0.113:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
10.0.0.113(10.0.0.113:3306): OK: Applying all logs succeeded. Slave started, replicating from 10.0.0.112(10.0.0.112:3306)
10.0.0.112(10.0.0.112:3306): Resetting slave info succeeded.
Master failover to 10.0.0.112(10.0.0.112:3306) completed successfully.

测试宕机服务恢复,检查数据一致性

在刚选出的主节点上执行测试脚本创建表 插入数据
CREATE DATABASE test;
USE test;

CREATE TABLE position ( id INT ( 20 ), NAME VARCHAR ( 50 ), salary VARCHAR ( 20 ), city VARCHAR ( 50 ) ) ENGINE = INNODB charset = utf8;

create TABLE position_detail ( id int(20), pid int(20), description text ) ENGINE=innodb charset=utf8;

insert into position values(1, 'Java', 13000, 'shanghai'); 
insert into position values(2, 'DBA', 20000, 'beijing');

insert into position_detail values(1, 1, 'Java Developer'); 
insert into position_detail values(2, 2, 'Database Administrator');

启动刚下线的节点服务
systemctl start mysqld
查询当前主节点的信息
mysql> show master status;
+------------------+----------+--------------+-------------------------------------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB                                | Executed_Gtid_Set |
+------------------+----------+--------------+-------------------------------------------------+-------------------+
| mysql-bin.000005 |     2908 |              | information_schema,mysql,performance_schema,sys |                   |
+------------------+----------+--------------+-------------------------------------------------+-------------------+
1 row in set (0.00 sec)
刚恢复的节点上执行如下SQL加入到集群中
stop slave;//关闭同步

change master to master_host='10.0.0.112',master_port=3306,master_user='root',master_password ='123456',master_log_file='mysql-bin.000005',master_log_pos=2908;

start slave; // 开启同步
三个节点启动成功后,再次尝试关掉主节点(112):

当关掉主节点112的时候 发现mha并没有立即切换其他从节点为主节点。

我们检查MHA Manager服务器执行命令以及监控状态和日志信息发现MHA并没有在执行。

检查监控实例,发现之前宕机的主节点(111)也不再监控文件中。

解决方案:

  1. 补充实例到 vi /etc/mha/app1.cnf文件中。
  2. 重新启动MHA Manager执行命令。
  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值