目录
一.MHA概述
1.MHA简介
- 由日本DeNA公司youshimaton开发
- 是一套优秀的实现MySQL高可用的解决方案
- 数据库的自动故障切换操作能做到在0~30秒之内完成
- MHA能确保在故障切换过程中最大限度保证数据的一致性,以达到真正意义上的高可用
2.MHA组成
- MHA Manager(管理节点)
- 可以单独部署在一台独立的机器上
- 也可以部署在某台数据库服务器上- MHA Node(数据节点)
- 存储数据的MySQL服务器
- 运行在每台MySQL服务器上
3.MHA工作过程
- 由Manager定时探测集群中的master节点
- 当master故障时,Manager自动将拥有最新数据的slave提升为新的master
二.部署MHA集群
1.环境准备架构图
2.数据库服务器基础配置
1) - 配置服务器192.168.1.11
[root@mysql1 ~]# vim /etc/my.cnf [mysqld] server_id = 11 log-bin = master11 plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so" rpl_semi_sync_master_enabled=1 rpl_semi_sync_slave_enabled=1 relay_log_purge=0 # 禁止slave_sql线程执行完relay log后将其删除。 ... ... [root@mysql1 ~]# systemctl start mysqld [root@mysql1 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> grant replication slave on *.* to repluser@'%' Identified by 'NSD2021@tedu.cn';
2) 配置服务器192.168.1.12
[root@mysql2 ~]# vim /etc/my.cnf [mysqld] server_id=12 log_bin=master12 plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so" rpl_semi_sync_master_enabled=1 rpl_semi_sync_slave_enabled=1 relay_log_purge=0 [root@mysql2 ~]# systemctl start mysqld [root@mysql2 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> grant replication slave on *.* to repluser@'%' Identified by 'NSD2021@tedu.cn';
3)配置服务器192.168.1.13
[root@mysql3 ~]# vim /etc/my.cnf [mysqld] server_id=13 log_bin=master13 plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so" rpl_semi_sync_master_enabled=1 rpl_semi_sync_slave_enabled=1 relay_log_purge=0 ... ... [root@mysql3 ~]# systemctl start mysqld [root@mysql3 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> grant replication slave on *.* to repluser@'%' Identified by 'NSD2021@tedu.cn';
3.配置ssh免密登录
- 配置服务器192.168.1.11
[root@mysql1 ~]# ssh-keygen [root@mysql1 ~]# for i in 12 13 > do > ssh-copy-id root@192.168.1.$i > done
- 配置服务器192.168.1.12
[root@mysql2 ~]# ssh-keygen [root@mysql2 ~]# for i in 11 13; do ssh-copy-id root@192.168.1.$i; done
- 配置服务器192.168.1.13
[root@mysql3 ~]# ssh-keygen [root@mysql3 ~]# for i in 11 12; do ssh-copy-id root@192.168.1.$i; done
### 配置管理服务器
- 实现免密连接三台数据库服务器
[root@mha1 ~]# ssh-keygen [root@mha1 ~]# for i in 11 12 13; do ssh-copy-id root@192.168.1.$i; done
4.安装MHA
1) 配置yum源
[root@zzgrhel8 mha-soft-student]# ls mha4mysql-node-0.56-0.el6.noarch.rpm perl-Config-Tiny-2.14-7.el7.noarch.rpm perl-Email-Date-Format-1.002-15.el7.noarch.rpm perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm perl-Mail-Sender-0.8.23-1.el7.noarch.rpm perl-Mail-Sendmail-0.79-21.el7.art.noarch.rpm perl-MIME-Lite-3.030-1.el7.noarch.rpm perl-MIME-Types-1.38-2.el7.noarch.rpm perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm [root@zzgrhel8 mha-soft-student]# cp *.rpm /var/www/html/mysql/ [root@zzgrhel8 mha-soft-student]# cd /var/www/html/mysql/ [root@zzgrhel8 mysql]# createrepo -v .
2)在三台数据库服务器和管理节点上安装perl软件包
# mha node节点 [root@mysql{1,2,3} ~]# yum clean all [root@mysql{1,2,3} ~]# yum install -y gcc pcre-devel pkgconfig autoconf automake perl-ExtUtils-MakeMaker perl-CPAN perl-DBI perl-DBD-MySQL # mha 管理节点 [root@mha1 ~]# yum clean all [root@mha1 ~]# yum install -y perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes perl-ExtUtils-MakeMaker perl-CPAN
5.配置一主多从结构
1) 配置主服务器192.168.1.11
[root@mysql1 ~]# mysql -uroot -pNSD2021@tedu.cn
mysql> show master status;
+-----------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-----------------+----------+--------------+------------------+-------------------+
| master11.000002 | 701 | | | |
+-----------------+----------+--------------+------------------+-------------------+
1 row in set (0.18 sec)
2)配置主服务器192.168.1.12
[root@mysql2 ~]# mysql -uroot -pNSD2021@tedu.cnmysql> change master to -> master_host="192.168.1.11", -> master_user="repluser", -> master_password="NSD2021@tedu.cn", -> master_log_file="master11.000002", -> master_log_pos=701; mysql> start slave; mysql> show slave status\G ... ... Slave_IO_Running: Yes Slave_SQL_Running: Yes
3)配置主服务器192.168.1.13
[root@mysql3 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> change master to -> master_host="192.168.1.11", -> master_user="repluser", -> master_password="NSD2021@tedu.cn", -> master_log_file="master11.000002", -> master_log_pos=701; mysql> start slave; mysql> show slave status\G ... ... Slave_IO_Running: Yes Slave_SQL_Running: Yes ... ...
6.配置MHA管理节点
1)安装软件
[root@mha1 ~]# yum clean all [root@mha1 ~]# yum install -y mha4mysql-node
# 拷贝mha服务端到管理节点
[root@zzgrhel8 mha]# scp mha4mysql-manager-0.56.tar.gz 192.168.1.15:/root [root@mha1 ~]# tar xf mha4mysql-manager-0.56.tar.gz [root@mha1 ~]# cd mha4mysql-manager-0.56 [root@mha1 mha4mysql-manager-0.56]# perl Makefile.PL [root@mha1 mha4mysql-manager-0.56]# make [root@mha1 mha4mysql-manager-0.56]# make install
2)修改主配置文件
[root@mha1 ~]# mkdir /etc/mha # 拷贝配置文件到mha服务器,并修改,或改本地配置文件 [root@zzgrhel8 mha]# scp app1.cnf 192.168.1.15:/etc/mha/ [root@mha1 ~]# vim /etc/mha/app1.cnf [server default] manager_log=/var/log/manager.log # 日志文件 manager_workdir=/etc/mha # 工作目录 master_ip_failover_script=/etc/mha/master_ip_failover # 故障切换脚本 repl_user=repluser # 主服务器数据同步授权用户 repl_password=NSD2021@tedu.cn # 密码 ssh_port=22 # ssh服务端口 ssh_user=root # 访问ssh服务用户 user=mhamon # 监控用户 password=NSD2021@tedu.cn # 密码 [server1] # 第1台数据库服务器配置 candidate_master=1 hostname=192.168.1.11 port=3306 [server2] # 第2台数据库服务器配置 candidate_master=1 hostname=192.168.1.12 port=3306 [server3] # 第3台数据库服务器配置 candidate_master=1 hostname=192.168.1.13 port=3306
3)创建故障切换脚本
# 拷贝故障切换脚本文件到mha服务器,并修改注:在/root/mha4mysql-manager-0.56该目录下故障文件,与主配置文都有
[root@zzgrhel8 mha]# scp master_ip_failover 192.168.1.15:/etc/mha/ [root@mha1 ~]# vim +35 /etc/mha/master_ip_failover ... ... my $vip = '192.168.1.100/24'; # Virtual IP my $key = "1"; my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip"; my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down"; ... ... [root@mha1 ~]# chmod +x /etc/mha/master_ip_failover - 在当前主服务器部署vip地址 [root@mysql1 ~]# ifconfig eth0:1 192.168.1.100/24 up [root@mysql1 ~]# ip a s eth0 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 52:54:00:22:3a:a0 brd ff:ff:ff:ff:ff:ff inet 192.168.1.11/24 brd 192.168.1.255 scope global noprefixroute eth0 valid_lft forever preferred_lft forever inet 192.168.1.100/24 brd 192.168.1.255 scope global secondary eth0:1 valid_lft forever preferred_lft forever inet6 fe80::96a0:26f0:cac4:f130/64 scope link noprefixroute valid_lft forever preferred_lft forever
7. 配置数据节点
1) 在所有数据库服务器上,安装mha-node软件包
[root@mysql{1,2,3} ~]# yum clean all [root@mysql{1,2,3} ~]# yum install -y mha4mysql-node-0.56-0.el6
2)在所有数据服务器上添加监控用户
[root@mysql1 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> grant all on *.* to mhamon@"%" identified by "NSD2021@tedu.cn"; mysql> show grants for mhamon; +---------------------------------------------+ | Grants for mhamon@% | +---------------------------------------------+ | GRANT ALL PRIVILEGES ON *.* TO 'mhamon'@'%' | +---------------------------------------------+ 1 row in set (0.00 sec) # 在从服务器上查看同步过来的监控用户 [root@mysql2 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> show grants for mhamon; +---------------------------------------------+ | Grants for mhamon@% | +---------------------------------------------+ | GRANT ALL PRIVILEGES ON *.* TO 'mhamon'@'%' | +---------------------------------------------+ 1 row in set (0.00 sec) [root@mysql3 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> show grants for mhamon; +---------------------------------------------+ | Grants for mhamon@% | +---------------------------------------------+ | GRANT ALL PRIVILEGES ON *.* TO 'mhamon'@'%' | +---------------------------------------------+ 1 row in set (0.00 sec)
8. 测试配置
#### 测试集群环境
- 在管理主机,测试ssh配置
[root@mha1 ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf ... ... Thu Apr 15 20:15:39 2021 - [info] All SSH connection tests passed successfully.
- 在管理主机,测试主从同步
[root@mha1 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf ... ... MySQL Replication Health is OK.
- 启动管理服务
[root@mha1 ~]# masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover # 打开新终端查看服务状态 [root@mha1 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:7199) is running(0:PING_OK), master:192.168.1.11 # 查看工作目录文件列表 [root@mha1 ~]# ls /etc/mha app1.cnf app1.master_status.health master_ip_failover ```
9.访问集群
- 在主服务器 添加访问数据的连接用户
[root@mysql1 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> create database db1 default charset utf8mb4; Query OK, 1 row affected (0.02 sec) mysql> grant select,insert on db1.* to dbuser1@"%" identified by 'NSD2021@tedu.cn'; Query OK, 0 rows affected, 1 warning (0.02 sec) mysql> create table db1.students(id int primary key auto_increment, name varchar(20)); Query OK, 0 rows affected (0.74 sec)
- 客户端192.168.1.10连接vip192.168.1.100地址访问集群
[root@node10 ~]# mysql -h192.168.1.100 -udbuser1 -pNSD2021@tedu.cn mysql> insert into db1.students values(1, 'tom'); Query OK, 1 row affected (0.04 sec) mysql> select * from db1.students; +----+------+ | id | name | +----+------+ | 1 | tom | +----+------+ 1 row in set (0.00 sec)
- 在两台从服务器查看数据
[root@mysql2 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> select * from db1.students; +----+------+ | id | name | +----+------+ | 1 | tom | +----+------+ 1 row in set (0.00 sec) [root@mysql3 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> select * from db1.students; +----+------+ | id | name | +----+------+ | 1 | tom | +----+------+ 1 row in set (0.00 sec)
10.测试高可用
- 停止主服务器192.168.1.11的mysql服务
[root@mysql1 ~]# systemctl stop mysqld
- 查看管理服务 ,输出的监控信息
# 此时启动管理服务的监控主机终端已经退出[root@mha1 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 is stopped(2:NOT_RUNNING).
- 客户端依然连接vip192.168.1.100地址,可以访问到数据。如果稍有卡顿,请耐心等待
[root@node10 ~]# mysql -h192.168.1.100 -udbuser1 -pNSD2021@tedu.cn mysql> select * from db1.students; +----+------+ | id | name | +----+------+ | 1 | tom | +----+------+ 1 row in set (0.00 sec)
- 在工作的两台主机192.168.1.12和192.168.1.13上查看VIP地址。VIP地址将会转移到某一台主机上。
[root@mysql2 ~]# ip a s eth0 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 link/ether 52:54:00:cf:98:9a brd ff:ff:ff:ff:ff:ff inet 192.168.1.12/24 brd 192.168.1.255 scope global noprefixroute eth0 valid_lft forever preferred_lft forever inet 192.168.1.100/24 brd 192.168.1.255 scope global secondary eth0:1 valid_lft forever preferred_lft forever inet6 fe80::88e7:d3e5:dc70:7b30/64 scope link noprefixroute valid_lft forever preferred_lft forever
- 在新的主服务器上更新数据
[root@mysql2 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> insert into db1.students values(2, 'jerry'); Query OK, 1 row affected (0.08 sec) # 在从服务器上查看数据是否同步 [root@mysql3 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> select * from db1.students; +----+-------+ | id | name | +----+-------+ | 1 | tom | | 2 | jerry | +----+-------+ 2 rows in set (0.00 sec)
11.修复故障服务器
- 启动出现故障服务器的mysqld服务
[root@mysql1 ~]# systemctl start mysqld
- 同步现有数据到修复的服务器
[root@mysql2 ~]# mysqldump -uroot -pNSD2021@tedu.cn --master-data db1 > db1.sql [root@mysql2 ~]# scp db1.sql 192.168.1.11:/root/ [root@mysql1 ~]# mysql -uroot -pNSD2021@tedu.cn db1 < db1.sql
- 配置修复的服务器为当前主服务器的从服务器
[root@mysql1 ~]# grep master12 db1.sql CHANGE MASTER TO MASTER_LOG_FILE='master12.000002', MASTER_LOG_POS=964; [root@mysql1 ~]# mysql -uroot -pNSD2021@tedu.cn mysql> change master to -> master_host="192.168.1.12", -> master_user="repluser", -> master_password="NSD2021@tedu.cn", -> master_log_file="master12.000002", -> master_log_pos=964; Query OK, 0 rows affected, 2 warnings (0.42 sec) mysql> start slave; Query OK, 0 rows affected (0.03 sec) mysql> show slave status\G ... ... Slave_IO_Running: Yes Slave_SQL_Running: Yes ... ...
- 配置管理服务器,将修复的服务器添加到服务器声明中
[root@mha1 ~]# vim /etc/mha/app1.cnf # 尾部追加以下内容 ... ... [server1] candidate_master=1 hostname=192.168.1.11 port=3306 [点击并拖拽以移动] # 测试集群环境 [root@mha1 ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf ... ... Thu Apr 15 20:52:53 2021 - [info] All SSH connection tests passed successfully. # 测试主从同步 [root@mha1 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf ... ... MySQL Replication Health is OK.
- 重启管理服务
[root@mha1 ~]# masterha_stop --conf=/etc/mha/app1.cnf [root@mha1 ~]# masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover & [root@mha1 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:7856) is running(0:PING_OK), master:192.168.1.12