本文首发于我的个人网站: https://hewanyue.com/
本文作者: Hechao
本文链接: https://hewanyue.com/blog/108203a7.html
MHA(master high availability)目前是MySQL高可用方面是一个相对成熟的解决方案。在切换过程中,mha能做到0-30s内自动完成数据库的切换,并且在切换过程中最大的保持数据的一致性,以达到真正意义上的高可用
MHA的组成:
MHA Manager(管理节点)和MHA Node(数据节点)。MHA Manager可以独立部署在一台独立的机器上,管理多个集群,也可以部署在从从库上。
当Master出现故障的时候,它可以自动将最新的数据的Slave提升为新的Master,然后将所有的Slave重新指向新的Master,整个故障转移过程是完全透明的。
实验演示图如下:
操作如下:
1、在192.168.32.71主机和192.168.32.72主机上搭建好1主3从的主从配置服务器;
2、在192.168.32.7主机作为监控主机,安装mhamanager包和node包,192.168.32.71主机、192.168.32.72主机只安装node包;
3、将三台主机的ssh公钥信息互相储存(或共用一个公钥私钥对),设置好相互之间的基于key验证免密登录。(因为用的是多实例数据库,SSH连通性验证时还会检查各个数据库之间的连通性,这就包括3307、3308、3309端口数据库之间的SSH连接,所以一定要在192.168.32.72主机家目录的.ssh/authorized_keys文件中加入192.168.32.72本机的公钥,否则会报错,切记,已踩坑)
4、编写mha配置文件,路径随意。
vim /etc/mha.cnf
[server default]
user=mhauser
password=mhuser
manager_workdir=/data/mastermha/group1/
manager_log=/data/mastermha/group1/manager.log
remote_workdir=/data/mastermha/group1/
ssh_user=root
repl_user=repluser
repl_password=repluser
ping_interval=1
master_binlog_dir=/data/mysql
[server1]
hostname=192.168.32.71
[server2]
hostname=192.168.32.72
port=3307
candidate_master=1
[server3]
hostname=192.168.32.72
port=3308
[server4]
hostname=192.168.32.72
port=3309
5、在主数据库服务器也就是192.168.32.71主机上创建监控账号(user=mhauser password=mhauser)和复制帐号(repl_user=repluser repl_password=repluser)(要和配置文件相同,已有则无需创建,可以用现成的,权限符合即可)
GRANT ALL ON *.* TO mhauser@'192.168.32.7%' IDENTIFIED BY 'mhauser';
GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'192.168.32.7%' IDENTIFIED BY 'repluser'
6、用MHA包自带工具进行测试
usr/bin/masterha_check_ssh --conf=/etc/mha.cnf
[root@CentOS7 ~]#/usr/bin/masterha_check_ssh --conf=/etc/mha.cnf
Fri Sep 27 11:17:25 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Sep 27 11:17:25 2019 - [info] Reading application default configuration from /etc/mha.cnf..
Fri Sep 27 11:17:25 2019 - [info] Reading server configuration from /etc/mha.cnf..
Fri Sep 27 11:17:25 2019 - [info] Starting SSH connection tests..
Fri Sep 27 11:17:28 2019 - [debug]
Fri Sep 27 11:17:25 2019 - [debug] Connecting via SSH from root@192.168.32.71(192.168.32.71:22) to root@192.168.32.72(192.168.32.72:22)..
Fri Sep 27 11:17:25 2019 - [debug] ok.
Fri Sep 27 11:17:25 2019 - [debug] Connecting via SSH from root@192.168.32.71(192.168.32.71:22) to root@192.168.32.72(192.168.32.72:22)..
Fri Sep 27 11:17:26 2019 - [debug] ok.
Fri Sep 27 11:17:26 2019 - [debug] Connecting via SSH from root@192.168.32.71(192.168.32.71:22) to root@192.168.32.72(192.168.32.72:22)..
Fri Sep 27 11:17:27 2019 - [debug] ok.
Fri Sep 27 11:17:28 2019 - [debug]
Fri Sep 27 11:17:26 2019 - [debug] Connecting via SSH from root@192.168.32.72(192.168.32.72:22) to root@192.168.32.71(192.168.32.71:22)..
Fri Sep 27 11:17:27 2019 - [debug] ok.
Fri Sep 27 11:17:27 2019 - [debug] Connecting via SSH from root@192.168.32.72(192.168.32.72:22) to root@192.168.32.72(192.168.32.72:22)..
Fri Sep 27 11:17:28 2019 - [debug] ok.
Fri Sep 27 11:17:28 2019 - [debug] Connecting via SSH from root@192.168.32.72(192.168.32.72:22) to root@192.168.32.72(192.168.32.72:22)..
Fri Sep 27 11:17:29 2019 - [info] All SSH connection tests passed successfully.
usr/bin/masterha_check_repl --conf=/etc/mha.cnf
[root@CentOS7 ~]#/usr/bin/masterha_check_repl --conf=/etc/mha.cnf
Fri Sep 27 11:18:12 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Sep 27 11:18:12 2019 - [info] Reading application default configuration from /etc/mha.cnf..
Fri Sep 27 11:18:12 2019 - [info] Reading server configuration from /etc/mha.cnf..
Fri Sep 27 11:18:12 2019 - [info] MHA::MasterMonitor version 0.56.
Fri Sep 27 11:18:13 2019 - [info] GTID failover mode = 0
Fri Sep 27 11:18:13 2019 - [info] Dead Servers:
Fri Sep 27 11:18:13 2019 - [info] Alive Servers:
Fri Sep 27 11:18:13 2019 - [info] 192.168.32.71(192.168.32.71:3306)
Fri Sep 27 11:18:13 2019 - [info] 192.168.32.72(192.168.32.72:3307)
Fri Sep 27 11:18:13 2019 - [info] 192.168.32.72(192.168.32.72:3308)
Fri Sep 27 11:18:13 2019 - [info] 192.168.32.72(192.168.32.72:3309)
Fri Sep 27 11:18:13 2019 - [info] Alive Slaves:
Fri Sep 27 11:18:13 2019 - [info] 192.168.32.72(192.168.32.72:3307) Version=10.4.8-MariaDB-log (oldest major version between slaves) log-bin:enabled
Fri Sep 27 11:18:13 2019 - [info] Replicating from 192.168.32.71(192.168.32.71:3306)
Fri Sep 27 11:18:13 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Sep 27 11:18:13 2019 - [info] 192.168.32.72(192.168.32.72:3308) Version=10.4.8-MariaDB-log (oldest major version between slaves) log-bin:enabled
Fri Sep 27 11:18:13 2019 - [info] Replicating from 192.168.32.71(192.168.32.71:3306)
Fri Sep 27 11:18:13 2019 - [info] 192.168.32.72(192.168.32.72:3309) Version=10.4.8-MariaDB-log (oldest major version between slaves) log-bin:enabled
Fri Sep 27 11:18:13 2019 - [info] Replicating from 192.168.32.71(192.168.32.71:3306)
Fri Sep 27 11:18:13 2019 - [info] Current Alive Master: 192.168.32.71(192.168.32.71:3306)
Fri Sep 27 11:18:13 2019 - [info] Checking slave configurations..
Fri Sep 27 11:18:13 2019 - [info] Checking replication filtering settings..
Fri Sep 27 11:18:13 2019 - [info] binlog_do_db= , binlog_ignore_db=
Fri Sep 27 11:18:13 2019 - [info] Replication filtering check ok.
Fri Sep 27 11:18:13 2019 - [info] GTID (with auto-pos) is not supported
Fri Sep 27 11:18:13 2019 - [info] Starting SSH connection tests..
Fri Sep 27 11:18:17 2019 - [info] All SSH connection tests passed successfully.
Fri Sep 27 11:18:17 2019 - [info] Checking MHA Node version..
Fri Sep 27 11:18:18 2019 - [info] Version check ok.
Fri Sep 27 11:18:18 2019 - [info] Checking SSH publickey authentication settings on the current master..
Fri Sep 27 11:18:18 2019 - [info] HealthCheck: SSH to 192.168.32.71 is reachable.
Fri Sep 27 11:18:18 2019 - [info] Master MHA Node version is 0.56.
Fri Sep 27 11:18:18 2019 - [info] Checking recovery script configurations on 192.168.32.71(192.168.32.71:3306)..
Fri Sep 27 11:18:18 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql --output_file=/data/mastermha/group1//save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000012
Fri Sep 27 11:18:18 2019 - [info] Connecting to root@192.168.32.71(192.168.32.71:22)..
Creating /data/mastermha/group1 if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /data/mysql, up to master-bin.000012
Fri Sep 27 11:18:19 2019 - [info] Binlog setting check done.
Fri Sep 27 11:18:19 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Fri Sep 27 11:18:19 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=192.168.32.72 --slave_ip=192.168.32.72 --slave_port=3307 --workdir=/data/mastermha/group1/ --target_version=10.4.8-MariaDB-log --manager_version=0.56 --relay_log_info=/data/mysql3307/data/relay-log.info --relay_dir=/data/mysql3307/data/ --slave_pass=xxx
Fri Sep 27 11:18:19 2019 - [info] Connecting to root@192.168.32.72(192.168.32.72:22)..
Checking slave recovery environment settings..
Opening /data/mysql3307/data/relay-log.info ... ok.
Relay log found at /data/mysql3307/data, up to relay-log.000034
Temporary relay log file is /data/mysql3307/data/relay-log.000034
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Fri Sep 27 11:18:19 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=192.168.32.72 --slave_ip=192.168.32.72 --slave_port=3308 --workdir=/data/mastermha/group1/ --target_version=10.4.8-MariaDB-log --manager_version=0.56 --relay_log_info=/data/mysql3308/data/relay-log.info --relay_dir=/data/mysql3308/data/ --slave_pass=xxx
Fri Sep 27 11:18:19 2019 - [info] Connecting to root@192.168.32.72(192.168.32.72:22)..
Checking slave recovery environment settings..
Opening /data/mysql3308/data/relay-log.info ... ok.
Relay log found at /data/mysql3308/data, up to relay-log.000009
Temporary relay log file is /data/mysql3308/data/relay-log.000009
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Fri Sep 27 11:18:20 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=192.168.32.72 --slave_ip=192.168.32.72 --slave_port=3309 --workdir=/data/mastermha/group1/ --target_version=10.4.8-MariaDB-log --manager_version=0.56 --relay_log_info=/data/mysql3309/data/relay-log.info --relay_dir=/data/mysql3309/data/ --slave_pass=xxx
Fri Sep 27 11:18:20 2019 - [info] Connecting to root@192.168.32.72(192.168.32.72:22)..
Checking slave recovery environment settings..
Opening /data/mysql3309/data/relay-log.info ... ok.
Relay log found at /data/mysql3309/data, up to relay-log.000009
Temporary relay log file is /data/mysql3309/data/relay-log.000009
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Fri Sep 27 11:18:20 2019 - [info] Slaves settings check done.
Fri Sep 27 11:18:20 2019 - [info]
192.168.32.71(192.168.32.71:3306) (current master)
+--192.168.32.72(192.168.32.72:3307)
+--192.168.32.72(192.168.32.72:3308)
+--192.168.32.72(192.168.32.72:3309)
Fri Sep 27 11:18:20 2019 - [info] Checking replication health on 192.168.32.72..
Fri Sep 27 11:18:20 2019 - [info] ok.
Fri Sep 27 11:18:20 2019 - [info] Checking replication health on 192.168.32.72..
Fri Sep 27 11:18:20 2019 - [info] ok.
Fri Sep 27 11:18:20 2019 - [info] Checking replication health on 192.168.32.72..
Fri Sep 27 11:18:20 2019 - [info] ok.
Fri Sep 27 11:18:20 2019 - [warning] master_ip_failover_script is not defined.
Fri Sep 27 11:18:20 2019 - [warning] shutdown_script is not defined.
Fri Sep 27 11:18:20 2019 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
均显示测试成功~
可以启动MHAmanager了,注意这是一个前端程序,如果检测到主数据库宕机,则会自动切换从数据为新的主服务器并使其他的从服务器从新的主服务器数据库上面复制数据,此后程序自动终止。
[root@CentOS7 ~]#/usr/bin/masterha_manager --conf=/etc/mha.cnf
Fri Sep 27 11:35:37 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Sep 27 11:35:37 2019 - [info] Reading application default configuration from /etc/mha.cnf..
Fri Sep 27 11:35:37 2019 - [info] Reading server configuration from /etc/mha.cnf..
在主数据库上将mysql进程杀掉后,MHA进程自动结束。
在3307端口数据库上查询,已经没有从节点信息了。
在3308端口数据库以及3309端口数据上显示,主数据库已变为更为3307端口数据库,说明转换主从结构成功!
PS:因为几个数据库都是源码编译安装,数据库路径和二进制文件储存位置都做出了调整,与默认位置不同,而yum安装的MHAmaster中如果配置文件中不写明master_binlog_dir的位置,则会默认去 /usr/bin/msyql/目录下找主机二进制文件(当然,mysql数据库也是yum安装的可忽略这些配置,一切默认就好,而配置文件中添加server主机时,如果端口号不是默认3306,一定要记得写明port端口号,如本文中多实例用的不同端口,标记清楚即可。)