MHA(Master High Availability)目前在MySQL高可用方面是一个相对成熟的解决方案,是一套优秀的作为MySQL高可用性环境下故障切换和主从提升的高可用软件。在MySQL故障切换过程中,MHA能做到在0~30秒之内自动完成数据库的故障切换操作,并且在进行故障切换的过程中,MHA能在最大程度上保证数据的一致性,以达到真正意义上的高可用。
该软件由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave节点上。MHA Node运行在每台MySQL服务器上,MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新数据的slave提升为新的master,然后将所有其他的slave重新指向新的master。整个故障转移过程对应用程序完全透明。
实验环境:rhel6.5 iptables and selinux disabled
server1 | IP:10.0.0.1 | MHA-manager |
server2 | IP:10.0.0.2 | MySQL-Master + MHA Node |
server3 | IP:10.0.0.3 | MySQL-Slave1 + MHA Node |
server4 | IP:10.0.0.4 | MySQL-Slave2 + MHA Node |
安装Mysql数据库(Server2、Server3、Server4):
# 安装MySQL数据库
[root@server2 ~]# ls
mysql-community-client-5.7.17-1.el6.x86_64.rpm
mysql-community-common-5.7.17-1.el6.x86_64.rpm
mysql-community-libs-5.7.17-1.el6.x86_64.rpm
mysql-community-libs-compat-5.7.17-1.el6.x86_64.rpm
mysql-community-server-5.7.17-1.el6.x86_64.rpm
[root@server2 ~]# yum install -y *
# 启动MySQL数据库
启动mysql
[root@server2 ~]# /etc/init.d/mysqld start
Initializing MySQL database: [ OK ]
Installing validate password plugin: [ OK ]
Starting mysqld: [ OK ]
# 查看系统初始密码
[root@server2 ~]# grep 'temporary password' /var/log/mysqld.log
2018-10-26T16:21:16.893671Z 1 [Note] A temporary password is generated for root@localhost: rJsv*43QbeNe
# MySQL数据库安全初始化,修改密码、root用户远程权限等配置
[root@server2 ~]# mysql_secure_installation
编辑MySQL配置文件/etc/my.cnf,添加以下内容
# 编辑MySQL配置文件
[root@server2 ~]# vim /etc/my.cnf
[mysqld]
# GTID
server_id=1 # ServerID,三台MySQL不能相同
gtid_mode=ON # 开启GTID模式
enforce_gtid_consistency=on
# bin_log
log_bin=mysql-binlog # 开启Log_bin
log_slave_updates=1
binlog_format=row
# relaylog
skip_slave_start=1
# 重启MySQL,使配置生效
[root@server2 ~]# /etc/init.d/mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]
配置MySQL主从复制
# 查看GTID状态
mysql> show global variables like '%gtid%';
+----------------------------------+-------+
| Variable_name | Value |
+----------------------------------+-------+
| binlog_gtid_simple_recovery | ON |
| enforce_gtid_consistency | ON |
| gtid_executed | |
| gtid_executed_compression_period | 1000 |
| gtid_mode | ON |
| gtid_owned | |
| gtid_purged | |
| session_track_gtids | OFF |
+----------------------------------+-------+
8 rows in set (0.01 sec)
# 在Master上创建同步用户
mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%' IDENTIFIED BY 'P@ssword1';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.01 sec)
# 在Slave上,指定Master信息
mysql> CHANGE MASTER TO MASTER_HOST='10.0.0.2',
-> MASTER_USER='repl',
-> MASTER_PASSWORD='P@ssword1',
-> MASTER_PORT=3306,
-> MASTER_AUTO_POSITION=1;
Query OK, 0 rows affected, 2 warnings (0.07 sec)
# 开启Slave
mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
# 查看Slave状态,Slave_IO_Running和Slave_IO_Running都为Yes,则正常
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.0.0.2
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-binlog.000003
Read_Master_Log_Pos: 589
Relay_Log_File: server3-relay-bin.000002
Relay_Log_Pos: 808
Relay_Master_Log_File: mysql-binlog.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
.................................................................
# 进行主从复制的创建,需要确保主从库的一致性,可以备份Master库并导入Slave来确保库的一致性。
# 因为我们是新安装的MySQL,没有进行其他多余的操作,所以无需进行同步库操作。
# 最好执行RESET MASTER;和RESET SLAVE;清除MASTER和SLAVE信息避免不必要的错误。
安装MHA软件
# mha软件rpm包
[root@server1 ~]# ls
mha4mysql-manager-0.56-0.el6.noarch.rpm
mha4mysql-node-0.56-0.el6.noarch.rpm
perl-Config-Tiny-2.12-7.1.el6.noarch.rpm
perl-Email-Date-Format-1.002-5.el6.noarch.rpm
perl-Log-Dispatch-2.27-1.el6.noarch.rpm
perl-Mail-Sender-0.8.16-3.el6.noarch.rpm
perl-Mail-Sendmail-0.79-12.el6.noarch.rpm
perl-MIME-Lite-3.027-2.el6.noarch.rpm
perl-MIME-Types-1.28-2.el6.noarch.rpm
perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm
[root@server1 ~]# yum install -y *
配置MHA
# 创建工作目录和配置文件目录
[root@server1 ~]# mkdir /usr/local/mha
[root@server1 ~]# mkdir /etc/mha
# 创建配置文件
[root@server1 ~]# vim /etc/mha/mysql.conf
[server default]
manager_workdir=/usr/local/mha # 工作目录
manager_log=/usr/local/mha/mha.log # 日志目录
master_binlog_dir=/var/lib/mysql # 数据库binlog目录
user=root # 数据库用户,用于监控管理
password=P@ssword1 # 数据库用户密码
ping_interval=1
remote_workdir=/tmp
repl_usr=repl # 主从复制用户
pepl_password=P@ssword1 # 主从复制用户密码
ssh_user=root # ssh用户
# 三台mysql服务指定
[server1]
hostname=10.0.0.2
port=3306
[server2]
hostname=10.0.0.3
port=3306
candidate_master=1 # 表示备用master
check_repl_delay=0 # 日志差异过大不做master
[server3]
hostname=10.0.0.4
port=3306
# no_master=1 # 表示不做master
创建节点间用于ssh免密连接的密钥,
# 生成公钥和私钥
[root@server1 ~]# ssh-keygen
# 发送至三台MySQL主机
[root@server1 ~]# ssh-copy-id 10.0.0.2
[root@server1 ~]# ssh-copy-id 10.0.0.3
[root@server1 ~]# ssh-copy-id 10.0.0.4
# ssh连接三台主机进行测试免密连接
MySQL主机安装MHA-Node
[root@server1 ~]# scp mha4mysql-node-0.56-0.el6.noarch.rpm server2:/root/
mha4mysql-node-0.56-0.el6.noarch.rpm 100% 35KB 35.5KB/s 00:00
[root@server1 ~]# scp mha4mysql-node-0.56-0.el6.noarch.rpm server3:/root/
mha4mysql-node-0.56-0.el6.noarch.rpm 100% 35KB 35.5KB/s 00:00
[root@server1 ~]# scp mha4mysql-node-0.56-0.el6.noarch.rpm server4:/root/
mha4mysql-node-0.56-0.el6.noarch.rpm 100% 35KB 35.5KB/s 00:00
# 三个节点分别安装MHA-Node
[root@server2 ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm
[root@server3 ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm
[root@server4 ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm
MHA用户授权
mysql> GRANT ALL PRIVILEGES ON *.* TO root@'%' IDENTIFIED BY 'P@ssword1';
Query OK, 0 rows affected, 1 warning (0.00 sec)
MANAGER的工具
masterha_check_ssh 检查MHA的SSH配置状况
masterha_check_repl 检查MySQL复制状况
masterha_manger 启动MHA
masterha_check_status 检测当前MHA运行状态
masterha_master_monitor 检测master是否宕机
masterha_master_switch 控制故障转移(自动或者手动)
masterha_conf_host 添加或删除配置的server信息
Node的工具:
save_binary_logs 保存和复制master的二进制日志
apply_diff_relay_logs 识别差异的中继日志事件并将其差异的事件应用于其他的slave
filter_mysqlbinlog 去除不必要的ROLLBACK事件(MHA已不再使用这个工具)
purge_relay_logs 清除中继日志(不会阻塞SQL线程)
检查MHA的SSH配置状况,
[root@server1 ~]# masterha_check_ssh --conf=/etc/mha/mysql.conf
Sat Oct 27 06:33:02 2018 - [warning] Global configuration file /etc/masterha_def ault.cnf not found. Skipping.
Sat Oct 27 06:33:02 2018 - [info] Reading application default configuration from /etc/mha/mysql.conf..
Sat Oct 27 06:33:02 2018 - [info] Reading server configuration from /etc/mha/mys ql.conf..
Sat Oct 27 06:33:02 2018 - [info] Starting SSH connection tests..
Sat Oct 27 06:33:02 2018 - [debug]
Sat Oct 27 06:33:02 2018 - [debug] Connecting via SSH from root@10.0.0.2(10.0.0 .2:22) to root@10.0.0.3(10.0.0.3:22)..
Sat Oct 27 06:33:02 2018 - [debug] ok.
Sat Oct 27 06:33:02 2018 - [debug] Connecting via SSH from root@10.0.0.2(10.0.0 .2:22) to root@10.0.0.4(10.0.0.4:22)..
Sat Oct 27 06:33:02 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [debug]
Sat Oct 27 06:33:02 2018 - [debug] Connecting via SSH from root@10.0.0.3(10.0.0 .3:22) to root@10.0.0.2(10.0.0.2:22)..
Sat Oct 27 06:33:03 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [debug] Connecting via SSH from root@10.0.0.3(10.0.0 .3:22) to root@10.0.0.4(10.0.0.4:22)..
Sat Oct 27 06:33:03 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [debug]
Sat Oct 27 06:33:03 2018 - [debug] Connecting via SSH from root@10.0.0.4(10.0.0 .4:22) to root@10.0.0.2(10.0.0.2:22)..
Sat Oct 27 06:33:03 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [debug] Connecting via SSH from root@10.0.0.4(10.0.0 .4:22) to root@10.0.0.3(10.0.0.3:22)..
Sat Oct 27 06:33:03 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [info] All SSH connection tests passed successfully.
检查SQL复制状况
[root@server1 ~]# masterha_check_ssh --conf=/etc/mha/mysql.conf
Sat Oct 27 06:29:05 2018 - [warning] Global configuration file /etc/masterha_def ault.cnf not found. Skipping.
Sat Oct 27 06:29:05 2018 - [info] Reading application default configuration from /etc/mha/mysql.conf..
Sat Oct 27 06:29:05 2018 - [info] Reading server configuration from /etc/mha/mys ql.conf..
Sat Oct 27 06:29:05 2018 - [info] Starting SSH connection tests..
Sat Oct 27 06:29:05 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln63]
Sat Oct 27 06:29:05 2018 - [debug] Connecting via SSH from root@10.0.0.2(10.0.0 .2:22) to root@10.0.0.3(10.0.0.3:22)..
Warning: Permanently added '10.0.0.3' (RSA) to the list of known hosts.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Sat Oct 27 06:29:05 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln111] SSH connection from root@10.0.0.2(10.0.0.2:22) to root@10.0.0.3(10.0.0.3 :22) failed!
Sat Oct 27 06:29:06 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln63]
Sat Oct 27 06:29:05 2018 - [debug] Connecting via SSH from root@10.0.0.3(10.0.0 .3:22) to root@10.0.0.2(10.0.0.2:22)..
Warning: Permanently added '10.0.0.2' (RSA) to the list of known hosts.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Sat Oct 27 06:29:06 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln111] SSH connection from root@10.0.0.3(10.0.0.3:22) to root@10.0.0.2(10.0.0.2 :22) failed!
Sat Oct 27 06:29:06 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln63]
Sat Oct 27 06:29:06 2018 - [debug] Connecting via SSH from root@10.0.0.4(10.0.0 .4:22) to root@10.0.0.2(10.0.0.2:22)..
Warning: Permanently added '10.0.0.2' (RSA) to the list of known hosts.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Sat Oct 27 06:29:06 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln111] SSH connection from root@10.0.0.4(10.0.0.4:22) to root@10.0.0.2(10.0.0.2 :22) failed!
SSH Configuration Check Failed!
at /usr/bin/masterha_check_ssh line 44
[root@server1 ~]# masterha_check_ssh --conf=/etc/mha/mysql.conf
Sat Oct 27 06:33:02 2018 - [warning] Global configuration file /etc/masterha_def ault.cnf not found. Skipping.
Sat Oct 27 06:33:02 2018 - [info] Reading application default configuration from /etc/mha/mysql.conf..
Sat Oct 27 06:33:02 2018 - [info] Reading server configuration from /etc/mha/mys ql.conf..
Sat Oct 27 06:33:02 2018 - [info] Starting SSH connection tests..
Sat Oct 27 06:33:02 2018 - [debug]
Sat Oct 27 06:33:02 2018 - [debug] Connecting via SSH from root@10.0.0.2(10.0.0 .2:22) to root@10.0.0.3(10.0.0.3:22)..
Sat Oct 27 06:33:02 2018 - [debug] ok.
Sat Oct 27 06:33:02 2018 - [debug] Connecting via SSH from root@10.0.0.2(10.0.0 .2:22) to root@10.0.0.4(10.0.0.4:22)..
Sat Oct 27 06:33:02 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [debug]
Sat Oct 27 06:33:02 2018 - [debug] Connecting via SSH from root@10.0.0.3(10.0.0 .3:22) to root@10.0.0.2(10.0.0.2:22)..
Sat Oct 27 06:33:03 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [debug] Connecting via SSH from root@10.0.0.3(10.0.0 .3:22) to root@10.0.0.4(10.0.0.4:22)..
Sat Oct 27 06:33:03 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [debug]
Sat Oct 27 06:33:03 2018 - [debug] Connecting via SSH from root@10.0.0.4(10.0.0 .4:22) to root@10.0.0.2(10.0.0.2:22)..
Sat Oct 27 06:33:03 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [debug] Connecting via SSH from root@10.0.0.4(10.0.0 .4:22) to root@10.0.0.3(10.0.0.3:22)..
Sat Oct 27 06:33:03 2018 - [debug] ok.
Sat Oct 27 06:33:03 2018 - [info] All SSH connection tests passed successfully.
[root@server1 ~]# masterha_check_repl --conf=/etc/mha/mysql.conf
Sat Oct 27 06:38:20 2018 - [warning] Global configuration file /etc/masterha_defau lt.cnf not found. Skipping.
Sat Oct 27 06:38:20 2018 - [info] Reading application default configuration from / etc/mha/mysql.conf..
Sat Oct 27 06:38:20 2018 - [info] Reading server configuration from /etc/mha/mysql .conf..
Sat Oct 27 06:38:20 2018 - [info] MHA::MasterMonitor version 0.56.
Sat Oct 27 06:38:20 2018 - [info] GTID failover mode = 1
Sat Oct 27 06:38:20 2018 - [info] Dead Servers:
Sat Oct 27 06:38:20 2018 - [info] Alive Servers:
Sat Oct 27 06:38:20 2018 - [info] 10.0.0.2(10.0.0.2:3306)
Sat Oct 27 06:38:20 2018 - [info] 10.0.0.3(10.0.0.3:3306)
Sat Oct 27 06:38:20 2018 - [info] 10.0.0.4(10.0.0.4:3306)
Sat Oct 27 06:38:20 2018 - [info] Alive Slaves:
Sat Oct 27 06:38:20 2018 - [info] 10.0.0.3(10.0.0.3:3306) Version=5.7.17-log (o ldest major version between slaves) log-bin:enabled
Sat Oct 27 06:38:20 2018 - [info] GTID ON
Sat Oct 27 06:38:20 2018 - [info] Replicating from 10.0.0.2(10.0.0.2:3306)
Sat Oct 27 06:38:20 2018 - [info] Primary candidate for the new Master (candid ate_master is set)
Sat Oct 27 06:38:20 2018 - [info] 10.0.0.4(10.0.0.4:3306) Version=5.7.17-log (o ldest major version between slaves) log-bin:enabled
Sat Oct 27 06:38:20 2018 - [info] GTID ON
Sat Oct 27 06:38:20 2018 - [info] Replicating from 10.0.0.2(10.0.0.2:3306)
Sat Oct 27 06:38:20 2018 - [info] Current Alive Master: 10.0.0.2(10.0.0.2:3306)
Sat Oct 27 06:38:20 2018 - [info] Checking slave configurations..
Sat Oct 27 06:38:20 2018 - [info] read_only=1 is not set on slave 10.0.0.3(10.0.0 .3:3306).
Sat Oct 27 06:38:20 2018 - [info] read_only=1 is not set on slave 10.0.0.4(10.0.0 .4:3306).
Sat Oct 27 06:38:20 2018 - [info] Checking replication filtering settings..
Sat Oct 27 06:38:20 2018 - [info] binlog_do_db= , binlog_ignore_db=
Sat Oct 27 06:38:20 2018 - [info] Replication filtering check ok.
Sat Oct 27 06:38:20 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Oct 27 06:38:20 2018 - [info] Checking SSH publickey authentication settings o n the current master..
Sat Oct 27 06:38:20 2018 - [info] HealthCheck: SSH to 10.0.0.2 is reachable.
Sat Oct 27 06:38:20 2018 - [info]
10.0.0.2(10.0.0.2:3306) (current master)
+--10.0.0.3(10.0.0.3:3306)
+--10.0.0.4(10.0.0.4:3306)
Sat Oct 27 06:38:20 2018 - [info] Checking replication health on 10.0.0.3..
Sat Oct 27 06:38:20 2018 - [info] ok.
Sat Oct 27 06:38:20 2018 - [info] Checking replication health on 10.0.0.4..
Sat Oct 27 06:38:20 2018 - [info] ok.
Sat Oct 27 06:38:20 2018 - [warning] master_ip_failover_script is not defined.
Sat Oct 27 06:38:20 2018 - [warning] shutdown_script is not defined.
Sat Oct 27 06:38:20 2018 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
启动MHA
[root@server1 ~]# masterha_manager --conf=/etc/mha/mysql.conf
Sat Oct 27 06:47:04 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Oct 27 06:47:04 2018 - [info] Reading application default configuration from /etc/mha/mysql.conf..
Sat Oct 27 06:47:04 2018 - [info] Reading server configuration from /etc/mha/mysql.conf..
查看MHA状态
[root@server1 ~]# masterha_check_status --conf=/etc/mha/mysql.conf
mysql (pid:11172) is running(0:PING_OK), master:10.0.0.2
测试:Server2为Master,停掉Server2的MySQL后,Server3的Slave自动转为Master,Server4的Master重新指向了Server3