Introduction to MHA
MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. It was developed by Yoshinori Matsunobu (then at DeNA in Japan, now at Facebook) and is an excellent piece of software for failover and master promotion in MySQL high-availability environments.
During a failover, MHA can complete the entire database switchover automatically within 0 to 30 seconds, and it preserves data consistency to the greatest extent possible while doing so, achieving true high availability.
The software consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage multiple master-slave clusters, or it can run on one of the slave nodes.
MHA Node runs on every MySQL server. MHA Manager periodically probes the master of the cluster; when the master fails, it automatically promotes the slave with the most recent data to be the new master and repoints all other slaves at it. The entire failover process is completely transparent to the application.
MHA currently supports one-master, multi-slave topologies. A replication cluster used with MHA must contain at least three database servers, i.e. one master and two slaves: one server acts as the master, one as a standby master, and the third as a plain slave.
Master-slave planning
IP | Hostname | Role | OS user | OS password | DB user | DB password | MySQL version | Replication type
192.168.6.101 | node01 | master | root | oracle | root | oracle | 8.0.33 | GTID
192.168.6.102 | node02 | slave | root | oracle | root | oracle | 8.0.33 | GTID
192.168.6.103 | node03 | slave | root | oracle | root | oracle | 8.0.33 | GTID
MHA planning
IP | Hostname | Role | Notes | DB user | DB password | MySQL version | MHA version
192.168.6.101 | node01 | manager | management node | root | oracle | 8.0.33 | 0.58
192.168.6.101 | node01 | node | data node | root | oracle | 8.0.33 | 0.58
192.168.6.102 | node02 | node | data node | root | oracle | 8.0.33 | 0.58
192.168.6.103 | node03 | node | data node | root | oracle | 8.0.33 | 0.58
192.168.6.200 | node01 | VIP | virtual IP (initially on node01)
MHA Deployment and Testing
1. Configure passwordless SSH between the hosts
2. Install the manager and node packages
3. Create the MHA configuration file on the management node
4. Verify that SSH mutual trust works
5. Verify that MySQL replication is healthy
6. Configure the VIP
7. Start the MHA manager
8. Test automatic failover when the master goes down
9. Rebuild MHA after rejoining the original master as a slave
10. Test a planned switchover
Configure passwordless SSH between the hosts
1. Run ssh-keygen on every node
[root@node01 ~]# ssh-keygen -t rsa
[root@node02 ~]# ssh-keygen -t rsa
[root@node03 ~]# ssh-keygen -t rsa
2. Run ssh-copy-id from every node to every node (each host needs key-based access to all three, itself included, or masterha_check_ssh will fail)
[root@node01 ~]# for h in node01 node02 node03; do ssh-copy-id $h; done
[root@node02 ~]# for h in node01 node02 node03; do ssh-copy-id $h; done
[root@node03 ~]# for h in node01 node02 node03; do ssh-copy-id $h; done
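Before moving on, a quick loop can confirm that key-based login works from each node (a minimal sanity check; BatchMode=yes makes ssh fail instead of prompting for a password). Run it on each of the three hosts; it should print the three hostnames without any prompt:
[root@node01 ~]# for h in node01 node02 node03; do ssh -o BatchMode=yes $h hostname; done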
Install manager and node
1. Download the rpm packages
[root@node01 ~]# wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
[root@node01 ~]# wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node02 ~]# wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node03 ~]# wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node01 ~]# ll -h mha4mysql-*
-rw-r--r-- 1 root root 80K Dec 7 2021 mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
-rw-r--r-- 1 root root 36K Dec 7 2021 mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node02 ~]# ll -h mha4mysql-node-0.58-0.el7.centos.noarch.rpm
-rw-r--r-- 1 root root 36K Dec 7 2021 mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@node03 ~]# ll -h mha4mysql-node-0.58-0.el7.centos.noarch.rpm
-rw-r--r-- 1 root root 36K Dec 7 2021 mha4mysql-node-0.58-0.el7.centos.noarch.rpm
2. Add the EPEL repository (it provides the Perl dependencies MHA needs)
[root@node01 ~]# yum install epel-release -y
[root@node02 ~]# yum install epel-release -y
[root@node03 ~]# yum install epel-release -y
3. Install node and manager on the management node
Install the node package first, then the manager package, since manager depends on node:
[root@node01 ~]# yum install mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y
[root@node01 ~]# yum install mha4mysql-manager-0.58-0.el7.centos.noarch.rpm -y
4. Install node on the data nodes
[root@node02 ~]# yum install mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y
[root@node03 ~]# yum install mha4mysql-node-0.58-0.el7.centos.noarch.rpm -y
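To confirm the packages are in place, query rpm directly; only node01 should have the manager package:
[root@node01 ~]# rpm -q mha4mysql-manager mha4mysql-node
[root@node02 ~]# rpm -q mha4mysql-node
[root@node03 ~]# rpm -q mha4mysql-node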
Configure MHA on the management node
1. Create the MHA working directory
[root@node01 ~]# mkdir -p /data/mastermha
[root@node01 ~]# mkdir -p /data/mastermha/app1
2. Create the configuration file
[root@node01 ~]# vim /data/mastermha/app1.cnf
[server default]
#MHA manager working directory and log
manager_workdir=/data/mastermha/app1
manager_log=/data/mastermha/app1/manager.log
master_binlog_dir=/data/mysql/3306/data
#failover VIP script
master_ip_failover_script=/data/mastermha/master_ip_failover
#online switchover VIP script
master_ip_online_change_script=/data/mastermha/master_ip_online_change
#OS user for passwordless SSH
ssh_user=root
#MySQL user for MHA monitoring and management
user=root
password=oracle
#check the master's heartbeat every 3 seconds; after 4 consecutive failures the master is considered dead
ping_interval=3
ping_type=INSERT
#replication user
repl_user=repl
repl_password=oracle
#master node
[server1]
hostname=node01
ip=192.168.6.101
port=3306
#mark this host as a candidate master: after a failover it will be promoted even if it is not the slave with the newest events in the cluster
candidate_master=1
#slave node
[server2]
hostname=node02
ip=192.168.6.102
port=3306
candidate_master=1
#do not disqualify this candidate because of replication delay
check_repl_delay=0
#slave node
[server3]
hostname=node03
ip=192.168.6.103
port=3306
#never promote this host to master
no_master=1
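The configuration assumes that the monitoring user 'root'@'192.168.6.%' and the replication user 'repl'@'192.168.6.%' already exist on the master. If they do not, a minimal sketch of creating them (privileges simplified; adapt to your own policy) could be:
[root@node01 ~]# mysql -uroot -poracle -e "
CREATE USER IF NOT EXISTS 'root'@'192.168.6.%' IDENTIFIED BY 'oracle';
GRANT ALL PRIVILEGES ON *.* TO 'root'@'192.168.6.%' WITH GRANT OPTION;
CREATE USER IF NOT EXISTS 'repl'@'192.168.6.%' IDENTIFIED BY 'oracle';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.6.%';"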
Verify SSH mutual trust
1. The MHA toolset (listed here via shell tab completion)
[root@node01 ~]# masterha_
masterha_check_repl masterha_check_status masterha_manager masterha_master_switch masterha_stop
masterha_check_ssh masterha_conf_host masterha_master_monitor masterha_secondary_check
2. Run masterha_check_ssh
[root@node01 ~]# masterha_check_ssh --conf=/data/mastermha/app1.cnf
Mon Aug 21 12:36:14 2023 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 21 12:36:14 2023 - [info] Reading application default configuration from /data/mastermha/app1.cnf..
Mon Aug 21 12:36:14 2023 - [info] Reading server configuration from /data/mastermha/app1.cnf..
Mon Aug 21 12:36:14 2023 - [info] Starting SSH connection tests..
Mon Aug 21 12:36:15 2023 - [debug]
Mon Aug 21 12:36:14 2023 - [debug] Connecting via SSH from root@node01(192.168.6.101:22) to root@node02(192.168.6.102:22)..
Mon Aug 21 12:36:14 2023 - [debug] ok.
Mon Aug 21 12:36:14 2023 - [debug] Connecting via SSH from root@node01(192.168.6.101:22) to root@node03(192.168.6.103:22)..
Mon Aug 21 12:36:14 2023 - [debug] ok.
Mon Aug 21 12:36:15 2023 - [debug]
Mon Aug 21 12:36:14 2023 - [debug] Connecting via SSH from root@node02(192.168.6.102:22) to root@node01(192.168.6.101:22)..
Mon Aug 21 12:36:14 2023 - [debug] ok.
Mon Aug 21 12:36:14 2023 - [debug] Connecting via SSH from root@node02(192.168.6.102:22) to root@node03(192.168.6.103:22)..
Mon Aug 21 12:36:15 2023 - [debug] ok.
Mon Aug 21 12:36:16 2023 - [debug]
Mon Aug 21 12:36:15 2023 - [debug] Connecting via SSH from root@node03(192.168.6.103:22) to root@node01(192.168.6.101:22)..
Mon Aug 21 12:36:15 2023 - [debug] ok.
Mon Aug 21 12:36:15 2023 - [debug] Connecting via SSH from root@node03(192.168.6.103:22) to root@node02(192.168.6.102:22)..
Mon Aug 21 12:36:15 2023 - [debug] ok.
Mon Aug 21 12:36:16 2023 - [info] All SSH connection tests passed successfully.
Verify replication
1. The first replication check fails
[root@node01 ~]# masterha_check_repl --conf=/data/mastermha/app1.cnf
Mon Aug 21 12:39:21 2023 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 21 12:39:21 2023 - [info] Reading application default configuration from /data/mastermha/app1.cnf..
Mon Aug 21 12:39:21 2023 - [info] Reading server configuration from /data/mastermha/app1.cnf..
Mon Aug 21 12:39:21 2023 - [info] MHA::MasterMonitor version 0.58.
Creating directory /data/mastermha/app1.. done.
Mon Aug 21 12:39:22 2023 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln188] There is no alive server. We can't do failover
Mon Aug 21 12:39:22 2023 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Mon Aug 21 12:39:22 2023 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Aug 21 12:39:22 2023 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
2. Solution
Under MySQL 8.0 the authentication plugin of the MHA user 'root'@'192.168.6.%' and the replication user 'repl'@'192.168.6.%' must be changed to mysql_native_password, because MHA 0.58 cannot authenticate through caching_sha2_password, the 8.0 default. MySQL 5.7 needs no change, since its default plugin is already mysql_native_password.
root@node01 [(none)]> select user,host,plugin from mysql.user;
+------------------+-------------+-----------------------+
| user | host | plugin |
+------------------+-------------+-----------------------+
| clone | % | caching_sha2_password |
| pmm | 192.168.6.% | mysql_native_password |
| repl | 192.168.6.% | caching_sha2_password |
| root | 192.168.6.% | caching_sha2_password |
| mysql.infoschema | localhost | caching_sha2_password |
| mysql.session | localhost | caching_sha2_password |
| mysql.sys | localhost | caching_sha2_password |
| root | localhost | mysql_native_password |
+------------------+-------------+-----------------------+
8 rows in set (0.03 sec)
root@node01 [mysql]> alter user 'repl'@'192.168.6.%' identified with mysql_native_password by 'oracle';
Query OK, 0 rows affected (1.26 sec)
root@node01 [mysql]> flush privileges;
Query OK, 0 rows affected (0.24 sec)
root@node01 [mysql]> alter user 'root'@'192.168.6.%' identified with mysql_native_password by 'oracle';
Query OK, 0 rows affected (0.05 sec)
root@node01 [mysql]> flush privileges;
Query OK, 0 rows affected (0.01 sec)
root@node01 [mysql]> select user,host,plugin from mysql.user;
+------------------+-------------+-----------------------+
| user | host | plugin |
+------------------+-------------+-----------------------+
| clone | % | caching_sha2_password |
| pmm | 192.168.6.% | mysql_native_password |
| repl | 192.168.6.% | mysql_native_password |
| root | 192.168.6.% | mysql_native_password |
| mysql.infoschema | localhost | caching_sha2_password |
| mysql.session | localhost | caching_sha2_password |
| mysql.sys | localhost | caching_sha2_password |
| root | localhost | mysql_native_password |
+------------------+-------------+-----------------------+
8 rows in set (0.02 sec)
3. The replication check now passes
[root@node01 ~]# masterha_check_repl --conf=/data/mastermha/app1.cnf
Mon Aug 21 13:58:44 2023 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 21 13:58:44 2023 - [info] Reading application default configuration from /data/mastermha/app1.cnf..
Mon Aug 21 13:58:44 2023 - [info] Reading server configuration from /data/mastermha/app1.cnf..
Mon Aug 21 13:58:44 2023 - [info] MHA::MasterMonitor version 0.58.
Mon Aug 21 13:58:46 2023 - [info] GTID failover mode = 1
Mon Aug 21 13:58:46 2023 - [info] Dead Servers:
Mon Aug 21 13:58:46 2023 - [info] Alive Servers:
Mon Aug 21 13:58:46 2023 - [info] node01(192.168.6.101:3306)
Mon Aug 21 13:58:46 2023 - [info] node02(192.168.6.102:3306)
Mon Aug 21 13:58:46 2023 - [info] node03(192.168.6.103:3306)
Mon Aug 21 13:58:46 2023 - [info] Alive Slaves:
Mon Aug 21 13:58:46 2023 - [info] node02(192.168.6.102:3306) Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Mon Aug 21 13:58:46 2023 - [info] GTID ON
Mon Aug 21 13:58:46 2023 - [info] Replicating from 192.168.6.101(192.168.6.101:3306)
Mon Aug 21 13:58:46 2023 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Aug 21 13:58:46 2023 - [info] node03(192.168.6.103:3306) Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Mon Aug 21 13:58:46 2023 - [info] GTID ON
Mon Aug 21 13:58:46 2023 - [info] Replicating from 192.168.6.101(192.168.6.101:3306)
Mon Aug 21 13:58:46 2023 - [info] Not candidate for the new Master (no_master is set)
Mon Aug 21 13:58:46 2023 - [info] Current Alive Master: node01(192.168.6.101:3306)
Mon Aug 21 13:58:46 2023 - [info] Checking slave configurations..
Mon Aug 21 13:58:46 2023 - [info] read_only=1 is not set on slave node02(192.168.6.102:3306).
Mon Aug 21 13:58:46 2023 - [info] read_only=1 is not set on slave node03(192.168.6.103:3306).
Mon Aug 21 13:58:46 2023 - [info] Checking replication filtering settings..
Mon Aug 21 13:58:46 2023 - [info] binlog_do_db= , binlog_ignore_db=
Mon Aug 21 13:58:46 2023 - [info] Replication filtering check ok.
Mon Aug 21 13:58:46 2023 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Mon Aug 21 13:58:46 2023 - [info] Checking SSH publickey authentication settings on the current master..
Mon Aug 21 13:58:46 2023 - [info] HealthCheck: SSH to node01 is reachable.
Mon Aug 21 13:58:46 2023 - [info]
node01(192.168.6.101:3306) (current master)
+--node02(192.168.6.102:3306)
+--node03(192.168.6.103:3306)
Mon Aug 21 13:58:46 2023 - [info] Checking replication health on node02..
Mon Aug 21 13:58:46 2023 - [info] ok.
Mon Aug 21 13:58:46 2023 - [info] Checking replication health on node03..
Mon Aug 21 13:58:46 2023 - [info] ok.
Mon Aug 21 13:58:46 2023 - [warning] master_ip_failover_script is not defined.
Mon Aug 21 13:58:46 2023 - [warning] shutdown_script is not defined.
Mon Aug 21 13:58:46 2023 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
Configure the VIP
1. Add the virtual IP manually
[root@node01 ~]# ifconfig ens33:88 192.168.6.200
[root@node01 ~]# ip addr | grep ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.6.101/24 brd 192.168.6.255 scope global noprefixroute ens33
inet 192.168.6.200/24 brd 192.168.6.255 scope global secondary ens33:88
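For reference, the VIP can be taken down again with the same interface-alias syntax the failover scripts use (this matches the $ssh_stop_vip command defined later); only do this when you actually intend to move the VIP:
[root@node01 ~]# ifconfig ens33:88 down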
2. Connect to the database through the VIP
[root@node01 ~]# mysql -h192.168.6.200 -uroot -poracle
root@192.168.6.200 [(none)]> select user();
+-------------+
| user() |
+-------------+
| root@node01 |
+-------------+
1 row in set (0.01 sec)
root@192.168.6.200 [(none)]> select @@server_id;
+-------------+
| @@server_id |
+-------------+
| 1001 |
+-------------+
1 row in set (0.00 sec)
root@192.168.6.200 [(none)]> show master status;
+---------------+----------+--------------+------------------+---------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+---------------------------------------------+
| binlog.000017 | 341388 | | | b0ae4cca-32af-11ee-8ff1-000c29004cdf:1-1342 |
+---------------+----------+--------------+------------------+---------------------------------------------+
1 row in set (0.01 sec)
Start the MHA manager
1. Start the manager in the background
[root@node01 ~]# nohup masterha_manager --conf=/data/mastermha/app1.cnf &
2. Check the MHA status
[root@node01 ~]# masterha_check_status --conf=/data/mastermha/app1.cnf
app1 (pid:10126) is running(0:PING_OK), master:node01
3. Watch the log
[root@node01 ~]# tail -100f /data/mastermha/app1/manager.log
Mon Aug 21 14:23:44 2023 - [info] MHA::MasterMonitor version 0.58.
Mon Aug 21 14:23:46 2023 - [info] GTID failover mode = 1
Mon Aug 21 14:23:46 2023 - [info] Dead Servers:
Mon Aug 21 14:23:46 2023 - [info] Alive Servers:
Mon Aug 21 14:23:46 2023 - [info] node01(192.168.6.101:3306)
Mon Aug 21 14:23:46 2023 - [info] node02(192.168.6.102:3306)
Mon Aug 21 14:23:46 2023 - [info] node03(192.168.6.103:3306)
Mon Aug 21 14:23:46 2023 - [info] Alive Slaves:
Mon Aug 21 14:23:46 2023 - [info] node02(192.168.6.102:3306) Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Mon Aug 21 14:23:46 2023 - [info] GTID ON
Mon Aug 21 14:23:46 2023 - [info] Replicating from 192.168.6.101(192.168.6.101:3306)
Mon Aug 21 14:23:46 2023 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Aug 21 14:23:46 2023 - [info] node03(192.168.6.103:3306) Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Mon Aug 21 14:23:46 2023 - [info] GTID ON
Mon Aug 21 14:23:46 2023 - [info] Replicating from 192.168.6.101(192.168.6.101:3306)
Mon Aug 21 14:23:46 2023 - [info] Not candidate for the new Master (no_master is set)
Mon Aug 21 14:23:46 2023 - [info] Current Alive Master: node01(192.168.6.101:3306)
Mon Aug 21 14:23:46 2023 - [info] Checking slave configurations..
Mon Aug 21 14:23:46 2023 - [info] read_only=1 is not set on slave node02(192.168.6.102:3306).
Mon Aug 21 14:23:46 2023 - [info] read_only=1 is not set on slave node03(192.168.6.103:3306).
Mon Aug 21 14:23:46 2023 - [info] Checking replication filtering settings..
Mon Aug 21 14:23:46 2023 - [info] binlog_do_db= , binlog_ignore_db=
Mon Aug 21 14:23:46 2023 - [info] Replication filtering check ok.
Mon Aug 21 14:23:46 2023 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Mon Aug 21 14:23:46 2023 - [info] Checking SSH publickey authentication settings on the current master..
Mon Aug 21 14:23:46 2023 - [info] HealthCheck: SSH to node01 is reachable.
Mon Aug 21 14:23:46 2023 - [info]
node01(192.168.6.101:3306) (current master)
+--node02(192.168.6.102:3306)
+--node03(192.168.6.103:3306)
Mon Aug 21 14:23:46 2023 - [info] Checking master_ip_failover_script status:
Mon Aug 21 14:23:46 2023 - [info] /data/mastermha/master_ip_failover --command=status --ssh_user=root --orig_master_host=node01 --orig_master_ip=192.168.6.101 --orig_master_port=3306
IN SCRIPT TEST====/sbin/ifconfig ens33:88 down==/sbin/ifconfig ens33:88 192.168.6.111/24===
Checking the Status of the script.. OK
Mon Aug 21 14:23:46 2023 - [info] OK.
Mon Aug 21 14:23:46 2023 - [warning] shutdown_script is not defined.
Mon Aug 21 14:23:46 2023 - [info] Set master ping interval 3 seconds.
Mon Aug 21 14:23:46 2023 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Mon Aug 21 14:23:46 2023 - [info] Starting ping health check on node01(192.168.6.101:3306)..
Mon Aug 21 14:23:48 2023 - [info] Ping(INSERT) succeeded, waiting until MySQL doesn't respond..
Test failover when the master goes down
1. Edit the master_ip_failover script
[root@node01 ~]# vim /data/mastermha/master_ip_failover
my $vip = '192.168.6.200/24';
my $key = '88';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
2. Make the script executable
[root@node01 ~]# chmod +x /data/mastermha/master_ip_failover
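The script can be exercised by hand before trusting it with a real failover; this is the same status invocation the manager runs at startup (compare the "Checking master_ip_failover_script status" lines in the log above):
[root@node01 ~]# /data/mastermha/master_ip_failover --command=status --ssh_user=root --orig_master_host=node01 --orig_master_ip=192.168.6.101 --orig_master_port=3306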
3. Stop MySQL on the master
[root@node01 ~]# systemctl stop mysqld
[root@node01 ~]# masterha_check_status --conf=/data/mastermha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
4. Check the log: an automatic failover ran and node02 was promoted to master
[root@node01 ~]# tail -100f /data/mastermha/app1/manager.log
----- Failover Report -----
app1: MySQL Master failover node01(192.168.6.101:3306) to node02(192.168.6.102:3306) succeeded
Master node01(192.168.6.101:3306) is down!
Check MHA Manager logs at node01:/data/mastermha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on node01(192.168.6.101:3306)
Selected node02(192.168.6.102:3306) as a new master.
node02(192.168.6.102:3306): OK: Applying all logs succeeded.
node02(192.168.6.102:3306): OK: Activated master IP address.
node03(192.168.6.103:3306): OK: Slave started, replicating from node02(192.168.6.102:3306)
node02(192.168.6.102:3306): Resetting slave info succeeded.
Master failover to node02(192.168.6.102:3306) completed successfully.
5. The VIP now points at node02
root@192.168.6.200 [(none)]> select @@server_id;
+-------------+
| @@server_id |
+-------------+
| 1002 |
+-------------+
1 row in set (0.00 sec)
root@192.168.6.200 [(none)]> show processlist;
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 2593 | Waiting on empty queue | NULL |
| 8 | root | node02:56284 | NULL | Sleep | 1217 | | NULL |
| 29 | repl | node03:60278 | NULL | Binlog Dump GTID | 1105 | Source has sent all binlog to replica; waiting for more updates | NULL |
| 31 | root | node01:41758 | NULL | Query | 0 | init | show processlist |
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
4 rows in set (0.00 sec)
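It is also worth confirming from the shell that node03 now replicates from node02, as the failover report claims (a quick check, assuming the credentials from the plan above):
[root@node01 ~]# mysql -h node03 -uroot -poracle -e "show slave status\G" | grep -E "Master_Host|Slave_IO_Running|Slave_SQL_Running"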
Rebuild MHA after rejoining the original master as a slave
1. With node02 promoted to master, start MySQL on node01 again
[root@node01 ~]# systemctl start mysqld
2. Reconfigure node01 as a slave of node02
root@node01 [(none)]> change master to
-> master_host='192.168.6.102',
-> master_port=3306,
-> master_user='repl',
-> master_password='oracle',
-> master_auto_position=1;
Query OK, 0 rows affected, 7 warnings (0.18 sec)
root@node01 [(none)]> start slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
root@node01 [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for source to send event
Master_Host: 192.168.6.102
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000006
Read_Master_Log_Pos: 175901
Relay_Log_File: node01-relay-bin.000002
Relay_Log_Pos: 411
Relay_Master_Log_File: binlog.000006
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
root@192.168.6.200 [(none)]> show processlist;
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 2957 | Waiting on empty queue | NULL |
| 8 | root | node02:56284 | NULL | Sleep | 339 | | NULL |
| 29 | repl | node03:60278 | NULL | Binlog Dump GTID | 1469 | Source has sent all binlog to replica; waiting for more updates | NULL |
| 31 | root | node01:41758 | NULL | Query | 0 | init | show processlist |
| 34 | repl | node01:42764 | NULL | Binlog Dump GTID | 17 | Source has sent all binlog to replica; waiting for more updates | NULL |
+----+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
5 rows in set (0.01 sec)
3. Edit the configuration file: node02 is now the master and node01 a slave, so update the MHA config to match
[root@node01 ~]# vim /data/mastermha/app1.cnf
[server1]
hostname=node02
ip=192.168.6.102
[server2]
hostname=node01
ip=192.168.6.101
4. Delete the app1.failover.complete file, otherwise starting the manager fails with the error below
[root@node01 ~]# tail -100f /data/mastermha/app1/manager.log
Mon Aug 21 16:06:03 2023 - [error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln310] Last failover was done at 2023/08/21 14:48:23. Current time is too early to do failover again. If you want to do failover, manually remove /data/mastermha/app1/app1.failover.complete and run this script again.
Mon Aug 21 16:06:03 2023 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/bin/masterha_manager line 65.
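Removing the marker file (the path is taken verbatim from the error message) and starting the manager again clears the error:
[root@node01 ~]# rm -f /data/mastermha/app1/app1.failover.complete
[root@node01 ~]# nohup masterha_manager --conf=/data/mastermha/app1.cnf &
[root@node01 ~]# masterha_check_status --conf=/data/mastermha/app1.cnf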
Test a planned switchover
1. Stop the manager before switching (masterha_master_switch must not run while the monitor is active)
[root@node01 ~]# masterha_stop --conf=/data/mastermha/app1.cnf
MHA Manager is not running on app1(2:NOT_RUNNING).
2. Edit the master_ip_online_change script
[root@node01 ~]# vim /data/mastermha/master_ip_online_change
my $vip = '192.168.6.200/24';
my $key = '88';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
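As with the failover script, make sure this one is executable, and re-run the SSH and replication checks against the updated config before switching:
[root@node01 ~]# chmod +x /data/mastermha/master_ip_online_change
[root@node01 ~]# masterha_check_ssh --conf=/data/mastermha/app1.cnf
[root@node01 ~]# masterha_check_repl --conf=/data/mastermha/app1.cnf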
3. Demote node02 to a slave and promote node01 back to master
[root@node01 ~]# masterha_master_switch --master_state=alive --conf=/data/mastermha/app1.cnf --new_master_host=node01 --orig_master_is_new_slave --running_updates_limit=10000
IN SCRIPT TEST====/sbin/ifconfig ens33:88 down==/sbin/ifconfig ens33:88 192.168.6.200/24===
Enabling the VIP - 192.168.6.200/24 on the new master - node01
Mon Aug 21 18:05:48 2023 - [info] ok.
Mon Aug 21 18:05:48 2023 - [info] Setting read_only=0 on node01(192.168.6.101:3306)..
Mon Aug 21 18:05:48 2023 - [info] ok.
Mon Aug 21 18:05:48 2023 - [info]
Mon Aug 21 18:05:48 2023 - [info] * Switching slaves in parallel..
Mon Aug 21 18:05:48 2023 - [info]
Mon Aug 21 18:05:48 2023 - [info] -- Slave switch on host node03(192.168.6.103:3306) started, pid: 72129
Mon Aug 21 18:05:48 2023 - [info]
Mon Aug 21 18:05:49 2023 - [info] Log messages from node03 ...
Mon Aug 21 18:05:49 2023 - [info]
Mon Aug 21 18:05:48 2023 - [info] Waiting to execute all relay logs on node03(192.168.6.103:3306)..
Mon Aug 21 18:05:48 2023 - [info] master_pos_wait(binlog.000008:237) completed on node03(192.168.6.103:3306). Executed 0 events.
Mon Aug 21 18:05:48 2023 - [info] done.
Mon Aug 21 18:05:48 2023 - [info] Resetting slave node03(192.168.6.103:3306) and starting replication from the new master node01(192.168.6.101:3306)..
Mon Aug 21 18:05:48 2023 - [info] Executed CHANGE MASTER.
Mon Aug 21 18:05:48 2023 - [info] Slave started.
Mon Aug 21 18:05:49 2023 - [info] End of log messages from node03 ...
Mon Aug 21 18:05:49 2023 - [info]
Mon Aug 21 18:05:49 2023 - [info] -- Slave switch on host node03(192.168.6.103:3306) succeeded.
Mon Aug 21 18:05:49 2023 - [info] Unlocking all tables on the orig master:
Mon Aug 21 18:05:49 2023 - [info] Executing UNLOCK TABLES..
Mon Aug 21 18:05:49 2023 - [info] ok.
Mon Aug 21 18:05:49 2023 - [info] Starting orig master as a new slave..
Mon Aug 21 18:05:49 2023 - [info] Resetting slave node02(192.168.6.102:3306) and starting replication from the new master node01(192.168.6.101:3306)..
Mon Aug 21 18:05:49 2023 - [info] Executed CHANGE MASTER.
Mon Aug 21 18:05:49 2023 - [info] Slave started.
Mon Aug 21 18:05:49 2023 - [info] All new slave servers switched successfully.
Mon Aug 21 18:05:49 2023 - [info]
Mon Aug 21 18:05:49 2023 - [info] * Phase 5: New master cleanup phase..
Mon Aug 21 18:05:49 2023 - [info]
Mon Aug 21 18:05:49 2023 - [info] node01: Resetting slave info succeeded.
Mon Aug 21 18:05:49 2023 - [info] Switching master to node01(192.168.6.101:3306) completed successfully.
root@192.168.6.200 [(none)]> select @@server_id;
+-------------+
| @@server_id |
+-------------+
| 1001 |
+-------------+
1 row in set (0.00 sec)
root@192.168.6.200 [(none)]> show processlist;
+------+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
| 5 | event_scheduler | localhost | NULL | Daemon | 4240 | Waiting on empty queue | NULL |
| 1553 | root | node01:51308 | NULL | Sleep | 1177 | | NULL |
| 2831 | repl | node03:44486 | NULL | Binlog Dump GTID | 38 | Source has sent all binlog to replica; waiting for more updates | NULL |
| 2832 | repl | node02:47518 | NULL | Binlog Dump GTID | 37 | Source has sent all binlog to replica; waiting for more updates | NULL |
| 2835 | root | node01:58570 | NULL | Query | 0 | init | show processlist |
+------+-----------------+--------------+------+------------------+------+-----------------------------------------------------------------+------------------+
5 rows in set (0.00 sec)
4. Update the configuration file to match the restored topology
[root@node01 ~]# vim /data/mastermha/app1.cnf
[server1]
hostname=node01
ip=192.168.6.101
[server2]
hostname=node02
ip=192.168.6.102
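With the configuration back in sync with the topology, the manager can be started again the same way as before; masterha_check_status should now report node01 as the master:
[root@node01 ~]# nohup masterha_manager --conf=/data/mastermha/app1.cnf &
[root@node01 ~]# masterha_check_status --conf=/data/mastermha/app1.cnf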