Introduction to MHA
MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. It was developed by Yoshinori Matsunobu (then at DeNA in Japan, now at Facebook) and is an excellent piece of software for failover and master promotion in MySQL high-availability environments.
During a failover, MHA can complete the entire database switchover automatically within 0 to 30 seconds, and it preserves data consistency to the greatest extent possible while doing so, achieving true high availability.
The software consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage multiple master-slave clusters, or it can run on one of the slave nodes.
MHA Node runs on every MySQL server. MHA Manager periodically probes the master of the cluster; when the master fails, it automatically promotes the slave with the most recent data to be the new master and repoints all other slaves at it. The entire failover process is completely transparent to the application.
MHA currently supports one-master, multi-slave topologies. A replication cluster used with MHA must contain at least three database servers, i.e. one master and two slaves: one server acts as the master, one as a standby master, and the third as a plain slave.
Master-slave planning
IP | Hostname | Role | OS user | OS password | DB user | DB password | MySQL version | Replication type
192.168.6.101 | node01 | master | root | oracle | root | oracle | 8.0.33 | GTID
192.168.6.102 | node02 | slave | root | oracle | root | oracle | 8.0.33 | GTID
192.168.6.103 | node03 | slave | root | oracle | root | oracle | 8.0.33 | GTID
MHA planning
IP | Hostname | Role | Notes | DB user | DB password | MySQL version | MHA version
192.168.6.101 | node01 | manager | management node | root | oracle | 8.0.33 | 0.58
192.168.6.101 | node01 | node | data node | root | oracle | 8.0.33 | 0.58
192.168.6.102 | node02 | node | data node | root | oracle | 8.0.33 | 0.58
192.168.6.103 | node03 | node | data node | root | oracle | 8.0.33 | 0.58
192.168.6.200 | node01 | VIP | virtual IP (initially on node01)
MHA Deployment and Testing
1. Configure passwordless SSH between the hosts
2. Install the manager and node packages
3. Create the MHA configuration file on the management node
4. Verify that SSH mutual trust works
5. Verify that MySQL replication is healthy
6. Configure the VIP
7. Start the MHA manager
8. Test automatic failover when the master goes down
9. Rebuild MHA after rejoining the original master as a slave
10. Test a planned switchover
Configure passwordless SSH between the hosts
1. Run ssh-keygen on every node
[root@node01 ~]# ssh-keygen -t rsa
[root@node02 ~]# ssh-keygen -t rsa
[root@node03 ~]# ssh-keygen -t rsa
2. Run ssh-copy-id from every node to every node (each host needs key-based access to all three, itself included, or masterha_check_ssh will fail)
[root@node01 ~]# for h in node01 node02 node03; do ssh-copy-id $h; done
[root@node02 ~]# for h in node01 node02 node03; do ssh-copy-id $h; done
[root@node03 ~]# for h in node01 node02 node03; do ssh-copy-id $h; done
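Before moving on, a quick loop can confirm that key-based login works from each node (a minimal sanity check; BatchMode=yes makes ssh fail instead of prompting for a password). Run it on each of the three hosts; it should print the three hostnames without any prompt:
[root@node01 ~]# for h in node01 node02 node03; do ssh -o BatchMode=yes $h hostname; done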
Install manager and node
1. Download the rpm packages
[root@node01 ~]# wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
[root@node01 ~]# wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node02 ~]# wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node03 ~]# wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node01 ~]# ll -h mha4mysql-*
-rw-r--r-- 1 root root 80K Dec 7 2021 mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
-rw-r--r-- 1 root root 36K Dec 7 2021 mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node02 ~]# ll -h mha4mysql-node-0.58-0.el7.centos.noarch.rpm
-rw-r--r-- 1 root root 36K Dec 7 2021 mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node03 ~]# ll -h mha4mysql-node-0.58-0.el7.centos.noarch.rpm
-rw-r--r-- 1 root root 36K Dec 7 2021 mha4mysql-node-0.58-0.el7.centos.noarch.rpm
2. Add the EPEL repository (it provides the Perl dependencies MHA needs)
[root@node01 ~]# yum install epel-release -y
[root@node02 ~]# yum install epel-release -y
[root@node03 ~]# yum install epel-release -y
3. Install node and manager on the management node
Install the node package first, then the manager package, since manager depends on node:
[root@node01 ~]# yum install mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y
[root@node01 ~]# yum install mha4mysql-manager-0.58-0.el7.centos.noarch.rpm -y
4. Install node on the data nodes
[root@node02 ~]# yum install mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y
[root@node03 ~]# yum install mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y
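To confirm the packages are in place, query rpm directly; only node01 should have the manager package:
[root@node01 ~]# rpm -q mha4mysql-manager mha4mysql-node
[root@node02 ~]# rpm -q mha4mysql-node
[root@node03 ~]# rpm -q mha4mysql-node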
Configure MHA on the management node
1. Create the MHA working directory
[root@node01 ~]# mkdir -p /data/mastermha
[root@node01 ~]# mkdir -p /data/mastermha/app1
2. Create the configuration file
[root@node01 ~]# vim /data/mastermha/app1.cnf
[server default]
#MHA manager working directory and log
manager_workdir=/data/mastermha/app1
manager_log=/data/mastermha/app1/manager.log
master_binlog_dir=/data/mysql/3306/data
#failover VIP script
master_ip_failover_script=/data/mastermha/master_ip_failover
#online switchover VIP script
master_ip_online_change_script=/data/mastermha/master_ip_online_change
#OS user for passwordless SSH
ssh_user=root
#MySQL user for MHA monitoring and management
user=root
password=oracle
#check the master's heartbeat every 3 seconds; after 4 consecutive failures the master is considered dead
ping_interval=3
ping_type=INSERT
#replication user
repl_user=repl
repl_password=oracle
#master node
[server1]
hostname=node01
ip=192.168.6.101
port=3306
#mark this host as a candidate master: after a failover it will be promoted even if it is not the slave with the newest events in the cluster
candidate_master=1
#slave node
[server2]
hostname=node02
ip=192.168.6.102
port=3306
candidate_master=1
#do not disqualify this candidate because of replication delay
check_repl_delay=0
#slave node
[server3]
hostname=node03
ip=192.168.6.103
port=3306
#never promote this host to master
no_master=1
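The configuration assumes that the monitoring user 'root'@'192.168.6.%' and the replication user 'repl'@'192.168.6.%' already exist on the master. If they do not, a minimal sketch of creating them (privileges simplified; adapt to your own policy) could be:
[root@node01 ~]# mysql -uroot -poracle -e "
CREATE USER IF NOT EXISTS 'root'@'192.168.6.%' IDENTIFIED BY 'oracle';
GRANT ALL PRIVILEGES ON *.* TO 'root'@'192.168.6.%' WITH GRANT OPTION;
CREATE USER IF NOT EXISTS 'repl'@'192.168.6.%' IDENTIFIED BY 'oracle';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.6.%';"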
Verify SSH mutual trust
1. The MHA toolset (listed here via shell tab completion)
[root@node01 ~]# masterha_
masterha_check_repl masterha_check_status masterha_manager masterha_master_switch masterha_stop
masterha_check_ssh masterha_conf_host masterha_master_monitor masterha_secondary_check
2. Run masterha_check_ssh
[root@node01 ~]# masterha_check_ssh --conf=/data/mastermha/app1.cnf
Mon Aug 21 12:36:14 2023 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 21 12:36:14 2023 - [info] Reading application default configuration from /data/mastermha/app1.cnf..
Mon Aug 21 12:36:14 2023 - [info] Reading server configuration from /data/mastermha/app1.cnf..
Mon Aug 21 12:36:14 2023 - [info] Starting SSH connection tests..
Mon Aug 21 12:36:15 2023 - [debug]
Mon Aug 21 12:36:14 2023 - [debug] Connecting via SSH from root@node01(192.168.6.101:22) to root@node02(192.168.6.102:22)..
Mon Aug 21 12:36:14 2023 - [debug] ok.
Mon Aug 21 12:36:14 2023 - [debug] Connecting via SSH from root@node01(192.168.6.101:22) to root@node03(192.168.6.103:22)..
Mon Aug 21 12:36:14 2023 - [debug] ok.
Mon Aug 21 12:36:15 2023 - [debug]
Mon Aug 21 12:36:14 2023 - [debug] Connecting via SSH from root@node02(192.168.6.102:22) to root@node01(192.168.6.101:22)..
Mon Aug 21 12:36:14 2023 - [debug] ok.
Mon Aug 21 12:36:14 2023 - [debug] Connecting via SSH from root@node02(192.168.6.102:22) to root@node03(192.168.6.103:22)..
Mon Aug 21 12:36:15 2023 - [debug] ok.
Mon Aug 21 12:36:16 2023 - [debug]
Mon Aug 21 12:36:15 2023 - [debug] Connecting via SSH from root@node03(192.168.6.103:22) to root@node01(192.168.6.101:22)..
Mon Aug 21 12:36:15 2023 - [debug] ok.
Mon Aug 21 12:36:15 2023 - [debug] Connecting via SSH from root@node03(192.168.6.103:22) to root@node02(192.168.6.102:22)..
Mon Aug 21 12:36:15 2023 - [debug] ok.
Mon Aug 21 12:36:16 2023 - [info] All SSH connection tests passed successfully.
Verify replication
1. The first replication check fails
[root@node01 ~]# masterha_check_repl --conf=/data/mastermha/app1.cnf
Mon Aug 21 12:39:21 2023 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 21 12:39:21 2023 - [info] Reading application default configuration from /data/mastermha/app1.cnf..
Mon Aug 21 12:39:21 2023 - [info] Reading server configuration from /data/mastermha/app1.cnf..
Mon Aug 21 12:39:21 2023 - [info] MHA::MasterMonitor version 0.58.
Creating directory /data/mastermha/app1.. done.
Mon Aug 21 12:39:22 2023 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln188] There is no alive server. We can't do failover
Mon Aug 21 12:39:22 2023 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Mon Aug 21 12:39:22 2023 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Aug 21 12:39:22 2023 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
2. Solution
Under MySQL 8.0 the authentication plugin of the MHA user 'root'@'192.168.6.%' and the replication user 'repl'@'192.168.6.%' must be changed to mysql_native_password, because MHA 0.58 cannot authenticate through caching_sha2_password, the 8.0 default. MySQL 5.7 needs no change, since its default plugin is already mysql_native_password.
root@node01 [(none)]> select user,host,plugin from mysql.user;
+------------------+-------------+-----------------------+
| user | host | plugin |
+------------------+-------------+-----------------------+
| clone | % | caching_sha2_password |
| pmm | 192.168.6.% | mysql_native_password |
| repl | 192.168.6.% | caching_sha2_password |
| root | 192.168.6.% | caching_sha2_password |
| mysql.infoschema | localhost | caching_sha2_password |
| mysql.session | localhost | caching_sha2_password |
| mysql.sys | localhost | caching_sha2_password |
| root | localhost | mysql_native_password |
+------------------+-------------+-----------------------+
8 rows in set (0.03 sec)
root@node01 [mysql]> alter user 'repl'@'192.168.6.%' identified with mysql_native_password by 'oracle';
Query OK, 0 rows affected (1.26 sec)
root@node01 [mysql]> flush privileges;
Query OK, 0 rows affected (0.24 sec)
root@node01 [mysql]> alter user 'root'@'192.168.6.%' identified with mysql_native_password by 'oracle';
Query OK, 0 rows affected (0.05 sec)
root@node01 [mysql]> flush privileges;
Query OK, 0 rows affected (0.01 sec)
root@node01 [mysql]> select user,host,plugin from mysql.user;
+------------------+-------------+-----------------------+
| user | host | plugin |
+------------------+-------------+-----------------------+
| clone | % | caching_sha2_password |
| pmm | 192.168.6.% | mysql_native_password |
| repl | 192.168.6.% | mysql_native_password |
| root | 192.168.6.% | mysql_native_password |
| mysql.infoschema | localhost | caching_sha2_password |
| mysql.session | localhost | caching_sha2_password |
| mysql.sys | localhost | caching_sha2_password |
| root | localhost | mysql_native_password |
+------------------+-------------+-----------------------+
8 rows in set (0.02 sec)
3. The replication check now passes
[root@node01 ~]# masterha_check_repl --conf=/data/mastermha/app1.cnf
Mon Aug 21 13:58:44 2023 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 21 13:58:44 2023 - [info] Reading application default configuration from /data/mastermha/app1.cnf..
Mon Aug 21 13:58:44 2023 - [info] Reading server configuration from /data/mastermha/app1.cnf..
Mon Aug 21 13:58:44 2023 - [info] MHA::MasterMonitor version 0.58.
Mon Aug 21 13:58:46 2023 - [info] GTID failover mode = 1
Mon Aug 21 13:58:46 2023 - [info] Dead Servers:
Mon Aug 21 13:58:46 2023 - [info] Alive Servers:
Mon Aug 21 13:58:46 2023 - [info] node01(192.168.6.101:3306)
Mon Aug 21 13:58:46 2023 - [info] node02(192.168.6.102:3306)
Mon Aug 21 13:58:46 2023 - [info] node03(192.168.6.103:3306)
Mon Aug 21 13:58:46 2023 - [info] Alive Slaves:
Mon Aug 21 13:58:46 2023 - [info] node02(192.168.6.102:3306) Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Mon Aug 21 13:58:46 2023 - [info] GTID ON
Mon Aug 21 13:58:46 2023 - [info] Replicating from 192.168.6.101(192.168.6.101:3306)
Mon Aug 21 13:58:46 2023 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Aug 21 13:58:46 2023 - [info] node03(192.168.6.103:3306) Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Mon Aug 21 13:58:46 2023 - [info] GTID ON
Mon Aug 21 13:58:46 2023 - [info] Replicating from 192.168.6.101(192.168.6.101:3306)
Mon Aug 21 13:58:46 2023 - [info] Not candidate for the new Master (no_master is set)
Mon Aug 21 13:58:46 2023 - [info] Current Alive Master: node01(192.168.6.101:3306)
Mon Aug 21 13:58:46 2023 - [info] Checking slave configurations..
Mon Aug 21 13:58:46 2023 - [info] read_only=1 is not set on slave node02(192.168.6.102:3306).
Mon Aug 21 13:58:46 2023 - [info] read_only=1 is not set on slave node03(192.168.6.103:3306).
Mon Aug 21 13:58:46 2023 - [info] Checking replication filtering settings..
Mon Aug 21 13:58:46 2023 - [info] binlog_do_db= , binlog_ignore_db=
Mon Aug 21 13:58:46 2023 - [info] Replication filtering check ok.
Mon Aug 21 13:58:46 2023 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Mon Aug 21 13:58:46 2023 - [info] Checking SSH publickey authentication settings on the current master..
Mon Aug 21 13:58:46 2023 - [info] HealthCheck: SSH to node01 is reachable.
Mon Aug 21 13:58:46 2023 - [info]
node01(192.168.6.101:3306) (current master)
+--node02(192.168.6.102:3306)
+--node03(192.168.6.103:3306)
Mon Aug 21 13:58:46 2023 - [info] Checking replication health on node02..
Mon Aug 21 13:58:46 2023 - [info] ok.
Mon Aug 21 13:58:46 2023 - [info] Checking replication health on node03..
Mon Aug 21 13:58:46 2023 - [info] ok.
Mon Aug 21 13:58:46 2023 - [warning] master_ip_failover_script is not defined.
Mon Aug 21 13:58:46 2023 - [warning] shutdown_script is not defined.
Mon Aug 21 13:58:46 2023 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
Configure the VIP
1. Add the virtual IP manually
[root@node01 ~]# ifconfig ens33:88 192.168.6.200
[root@node01 ~]# ip addr | grep ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.6.101/24 brd 192.168.6.255 scope global noprefixroute ens33
inet 192.168.6.200/24 brd 192.168.6.255 scope global secondary ens33:88
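For reference, the VIP can be taken down again with the same interface-alias syntax the failover scripts use (this matches the $ssh_stop_vip command defined later); only do this when you actually intend to move the VIP:
[root@node01 ~]# ifconfig ens33:88 down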
2. Connect to the database through the VIP
[root@node01 ~]# mysql -h192.168.6.200 -uroot -poracle
root@192.168.6.200 [(none)]> select user();
+-------------+
| user() |
+-------------+
| root@node01 |
+-------------+
1 row in set (0.01 sec)
root@192.168.6.200 [(none)]> select @@server_id;
+-------------+
| @@server_id |
+-------------+
| 1001 |
+-------------+
1 row in set (0.00 sec)
root@192.168.6.200 [(none)]> show master status;
+---------------+----------+--------------+------------------+---------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+---------------------------------------------+
| binlog.000017 | 341388 | | | b0ae4cca-32af-11ee-8ff1-000c29004cdf:1-1342 |
+---------------+----------+--------------+------------------+---------------------------------------------+
1 row in set (0.01 sec)
Start the MHA manager
1. Start the manager in the background
[root@node01 ~]# nohup masterha_manager --conf=/data/mastermha/app1.cnf &
2. Check the MHA status
[root@node01 ~]# masterha_check_status --conf=/data/mastermha/app1.cnf
app1 (pid:10126) is running(0:PING_OK), master:node01
3. Watch the log
[root@node01 ~]# tail -100f /data/mastermha/app1/manager.log
Mon Aug 21 14:23:44 2023 - [info] MHA::MasterMonitor version 0.58.
Mon Aug 21 14:23:46 2023 - [info] GTID failover mode = 1
Mon Aug 21 14:23:46 2023 - [info] Dead Servers:
Mon Aug 21 14:23:46 2023 - [info] Alive Servers:
Mon Aug 21 14:23:46 2023 - [info] node01(192.168.6.101:3306)
Mon Aug 21 14:23:46 2023 - [info] node02(192.168.6.102:3306)
Mon Aug 21 14:23:46 2023 - [info] node03(192.168.6.103:3306)
Mon Aug 21 14:23:46 2023 - [info] Alive Slaves:
Mon Aug 21 14:23:46 2023 - [info] node02(192.168.6.102:3306) Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Mon Aug 21 14:23:46 2023 - [info] GTID ON
Mon Aug 21 14:23:46 2023 - [info] Replicating from 192.168.6.101(192.168.6.101:3306)
Mon Aug 21 14:23:46 2023 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Aug 21 14:23:46 2023 - [info] node03(192.168.6.103:3306) Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Mon Aug 21 14:23:46 2023 - [info] GTID ON
Mon Aug 21 14:23:46 2023 - [info] Replicating from 192.168.6.101(192.168.6.101:3306)
Mon Aug 21 14:23:46 2023 - [info] Not candidate for the new Master (no_master is set)
Mon Aug 21 14:23:46 2023 - [info] Current Alive Master: node01(192.168.6.101:3306)
Mon Aug 21 14:23:46 2023 - [info] Checking slave configurations..
Mon Aug 21 14:23:46 2023 - [info] read_only=1 is not set on slave node02(192.168.6.102:3306).
Mon Aug 21 14:23:46 2023 - [info] read_only=1 is not set on slave node03(192.168.6.103:3306).
Mon Aug 21 14:23:46 2023 - [info] Checking replication filtering settings..
Mon Aug 21 14:23:46 2023 - [info] binlog_do_db= , binlog_ignore_db=
Mon Aug 21 14:23:46 2023 - [info] Replication filtering check ok.
Mon Aug 21 14:23:46 2023 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Mon Aug 21 14:23:46 2023 - [info] Checking SSH publickey authentication settings on the current master..
Mon Aug 21 14:23:46 2023 - [info] HealthCheck: SSH to node01 is reachable.
Mon Aug 21 14:23:46 2023 - [info]
node01(192.168.6.101:3306) (current master)
+--node02(192.168.6.102:3306)
+--node03(192.168.6.103:3306)
Mon Aug 21 14:23:46 2023 - [info] Checking master_ip_failover_script status:
Mon Aug 21 14:23:46 2023 - [info] /data/mastermha/master_ip_failover --command=status --ssh_user=root --orig_master_host=node01 --orig_master_ip=192.168.6.101 --orig_master_port=3306
IN SCRIPT TEST====/sbin/ifconfig ens33:88 down==/sbin/ifconfig ens33:88 192.168.6.111/24===
Checking the Status of the script.. OK
Mon Aug 21 14:23:46 2023 - [info] OK.
Mon Aug 21 14:23:46 2023 - [warning] shutdown_script is not defined.
Mon Aug 21 14:23:46 2023 - [info] Set master ping interval 3 seconds.
Mon Aug 21 14:23:46 2023 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Mon Aug 21 14:23:46 2023 - [info] Starting ping health check on node01(192.168.6.101:3306)..
Mon Aug 21 14:23:48 2023 - [info] Ping(INSERT) succeeded, waiting until MySQL doesn't respond..
Test failover when the master goes down
1. Edit the master_ip_failover script
[root@node01 ~]# vim /data/mastermha/master_ip_failover
my $vip = '192.168.6.200/24';
my $key = '88';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
2. Make the script executable
[root@node01 ~]# chmod +x /data/mastermha/master_ip_failover
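The script can be exercised by hand before trusting it with a real failover; this is the same status invocation the manager runs at startup (compare the "Checking master_ip_failover_script status" lines in the log above):
[root@node01 ~]# /data/mastermha/master_ip_failover --command=status --ssh_user=root --orig_master_host=node01 --orig_master_ip=192.168.6.101 --orig_master_port=3306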
3. Stop MySQL on the master
[root@node01 ~]# systemctl stop mysqld
[root@node01 ~]# masterha_check_status --conf=/data/mastermha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
4. Check the log: an automatic failover ran and node02 was promoted to master
[root@node01 ~]# tail -100f /data/mastermha/app1/manager.log
----- Failover Report -----
app1: MySQL Master failover node01(192.168.6.101:3306) to node02(192.168.6.102:3306) succeeded
Master node01(192.168.6.101:3306) is down!
Check MHA Manager logs at node01:/data/mastermha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on node01(192.168.6.101:3306)
Selected node02(192.168.6.102:3306) as a new master.
node02(192.168.6.102:3306): OK: Applying all logs succeeded.
node02(192.168.6.102:3306): OK: Activated master IP address.
node03(192.168.6.103:3306): OK: Slave started, replicating from node02(192.168.6.102:3306)
node02(192.168.6.102:3306): Resetting slave info succeeded.
Master failover to node02(192.168.6.102:3306) completed successfully.
5. The VIP now points at node02
root@192.168.6.200 [(none)]> select @@server_id;
+-------------+
| @@server_id |
+-------------+
| 1002 |
+-------------+
1 row in set (0.00 sec)
root@192.168.6.200 [(none)]> show processlist;
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 2593 | Waiting on empty queue | NULL |
| 8 | root | node02:56284 | NULL | Sleep | 1217 | | NULL |
| 29 | repl | node03:60278 | NULL | Binlog Dump GTID | 1105 | Source has sent all binlog to replica; waiting for more updates | NULL |
| 31 | root | node01:41758 | NULL | Query | 0 | init | show processlist |
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
4 rows in set (0.00 sec)
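It is also worth confirming from the shell that node03 now replicates from node02, as the failover report claims (a quick check, assuming the credentials from the plan above):
[root@node01 ~]# mysql -h node03 -uroot -poracle -e "show slave status\G" | grep -E "Master_Host|Slave_IO_Running|Slave_SQL_Running"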
Rebuild MHA after rejoining the original master as a slave
1. With node02 promoted to master, start MySQL on node01 again
[root@node01 ~]# systemctl start mysqld
2. Reconfigure node01 as a slave of node02
root@node01 [(none)]> change master to
-> master_host='192.168.6.102',
-> master_port=3306,
-> master_user='repl',
-> master_password='oracle',
-> master_auto_position=1;
Query OK, 0 rows affected, 7 warnings (0.18 sec)
root@node01 [(none)]> start slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
root@node01 [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 192.168.6.102
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000006
Read_Master_Log_Pos: 175901
Relay_Log_File: node01-relay-bin.000002
Relay_Log_Pos: 411
Relay_Master_Log_File: binlog.000006
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
root@192.168.6.200 [(none)]> show processlist;
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 2957 | Waiting on empty queue | NULL |
| 8 | root | node02:56284 | NULL | Sleep | 339 | | NULL |
| 29 | repl | node03:60278 | NULL | Binlog Dump GTID | 1469 | Source has sent all binlog to replica; waiting for more updates | NULL |
| 31 | root | node01:41758 | NULL | Query | 0 | init | show processlist |
| 34 | repl | node01:42764 | NULL | Binlog Dump GTID | 17 | Source has sent all binlog to replica; waiting for more updates | NULL |
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
5 rows in set (0.01 sec)
3. Edit the configuration file: node02 is now the master and node01 a slave, so update the MHA config to match
[root@node01 ~]# vim /data/mastermha/app1.cnf
[server1]
hostname=node02
ip=192.168.6.102
[server2]
hostname=node01
ip=192.168.6.101
4. Delete the app1.failover.complete file, otherwise starting the manager fails with the error below
[root@node01 ~]# tail -100f /data/mastermha/app1/manager.log
Mon Aug 21 16:06:03 2023 - [error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln310] Last failover was done at 2023/08/21 14:48:23. Current time is too early to do failover again. If you want to do failover, manually remove /data/mastermha/app1/app1.failover.complete and run this script again.
Mon Aug 21 16:06:03 2023 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/bin/masterha_manager line 65.
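Removing the marker file (the path is taken verbatim from the error message) and starting the manager again clears the error:
[root@node01 ~]# rm -f /data/mastermha/app1/app1.failover.complete
[root@node01 ~]# nohup masterha_manager --conf=/data/mastermha/app1.cnf &
[root@node01 ~]# masterha_check_status --conf=/data/mastermha/app1.cnf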
Test a planned switchover
1. Stop the manager before switching (masterha_master_switch must not run while the monitor is active)
[root@node01 ~]# masterha_stop --conf=/data/mastermha/app1.cnf
MHA Manager is not running on app1(2:NOT_RUNNING).
2. Edit the master_ip_online_change script
[root@node01 ~]# vim /data/mastermha/master_ip_online_change
my $vip = '192.168.6.200/24';
my $key = '88';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
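As with the failover script, make sure this one is executable, and re-run the SSH and replication checks against the updated config before switching:
[root@node01 ~]# chmod +x /data/mastermha/master_ip_online_change
[root@node01 ~]# masterha_check_ssh --conf=/data/mastermha/app1.cnf
[root@node01 ~]# masterha_check_repl --conf=/data/mastermha/app1.cnf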
3. Demote node02 to a slave and promote node01 back to master
[root@node01 ~]# masterha_master_switch --master_state=alive --conf=/data/mastermha/app1.cnf --new_master_host=node01 --orig_master_is_new_slave --running_updates_limit=10000
IN SCRIPT TEST====/sbin/ifconfig ens33:88 down==/sbin/ifconfig ens33:88 192.168.6.200/24===
Enabling the VIP - 192.168.6.200/24 on the new master - node01
Mon Aug 21 18:05:48 2023 - [info] ok.
Mon Aug 21 18:05:48 2023 - [info] Setting read_only=0 on node01(192.168.6.101:3306)..
Mon Aug 21 18:05:48 2023 - [info] ok.
Mon Aug 21 18:05:48 2023 - [info]
Mon Aug 21 18:05:48 2023 - [info] * Switching slaves in parallel..
Mon Aug 21 18:05:48 2023 - [info]
Mon Aug 21 18:05:48 2023 - [info] -- Slave switch on host node03(192.168.6.103:3306) started, pid: 72129
Mon Aug 21 18:05:48 2023 - [info]
Mon Aug 21 18:05:49 2023 - [info] Log messages from node03 ...
Mon Aug 21 18:05:49 2023 - [info]
Mon Aug 21 18:05:48 2023 - [info] Waiting to execute all relay logs on node03(192.168.6.103:3306)..
Mon Aug 21 18:05:48 2023 - [info] master_pos_wait(binlog.000008:237) completed on node03(192.168.6.103:3306). Executed 0 events.
Mon Aug 21 18:05:48 2023 - [info] done.
Mon Aug 21 18:05:48 2023 - [info] Resetting slave node03(192.168.6.103:3306) and starting replication from the new master node01(192.168.6.101:3306)..
Mon Aug 21 18:05:48 2023 - [info] Executed CHANGE MASTER.
Mon Aug 21 18:05:48 2023 - [info] Slave started.
Mon Aug 21 18:05:49 2023 - [info] End of log messages from node03 ...
Mon Aug 21 18:05:49 2023 - [info]
Mon Aug 21 18:05:49 2023 - [info] -- Slave switch on host node03(192.168.6.103:3306) succeeded.
Mon Aug 21 18:05:49 2023 - [info] Unlocking all tables on the orig master:
Mon Aug 21 18:05:49 2023 - [info] Executing UNLOCK TABLES..
Mon Aug 21 18:05:49 2023 - [info] ok.
Mon Aug 21 18:05:49 2023 - [info] Starting orig master as a new slave..
Mon Aug 21 18:05:49 2023 - [info] Resetting slave node02(192.168.6.102:3306) and starting replication from the new master node01(192.168.6.101:3306)..
Mon Aug 21 18:05:49 2023 - [info] Executed CHANGE MASTER.
Mon Aug 21 18:05:49 2023 - [info] Slave started.
Mon Aug 21 18:05:49 2023 - [info] All new slave servers switched successfully.
Mon Aug 21 18:05:49 2023 - [info]
Mon Aug 21 18:05:49 2023 - [info] * Phase 5: New master cleanup phase..
Mon Aug 21 18:05:49 2023 - [info]
Mon Aug 21 18:05:49 2023 - [info] node01: Resetting slave info succeeded.
Mon Aug 21 18:05:49 2023 - [info] Switching master to node01(192.168.6.101:3306) completed successfully.
root@192.168.6.200 [(none)]> select @@server_id;
+-------------+
| @@server_id |
+-------------+
| 1001 |
+-------------+
1 row in set (0.00 sec)
root@192.168.6.200 [(none)]> show processlist;
+------+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 4240 | Waiting on empty queue | NULL |
| 1553 | root | node01:51308 | NULL | Sleep | 1177 | | NULL |
| 2831 | repl | node03:44486 | NULL | Binlog Dump GTID | 38 | Source has sent all binlog to replica; waiting for more updates | NULL |
| 2832 | repl | node02:47518 | NULL | Binlog Dump GTID | 37 | Source has sent all binlog to replica; waiting for more updates | NULL |
| 2835 | root | node01:58570 | NULL | Query | 0 | init | show processlist |
+------+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
5 rows in set (0.00 sec)
4. Update the configuration file to match the restored topology
[root@node01 ~]# vim /data/mastermha/app1.cnf
[server1]
hostname=node01
ip=192.168.6.101
[server2]
hostname=node02
ip=192.168.6.102
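With the configuration back in sync with the topology, the manager can be started again the same way as before; masterha_check_status should now report node01 as the master:
[root@node01 ~]# nohup masterha_manager --conf=/data/mastermha/app1.cnf &
[root@node01 ~]# masterha_check_status --conf=/data/mastermha/app1.cnf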