MySQL_MHA主从自动切换实例
一、部署环境介绍
master:
Distributor ID: CentOS
Description: CentOS Linux release 7.8.2003 (Core)
Release: 7.8.2003
Codename: Core
slave1:
Distributor ID: CentOS
Description: CentOS Linux release 7.8.2003 (Core)
Release: 7.8.2003
Codename: Core
slave2:
Distributor ID: CentOS
Description: CentOS Linux release 7.8.2003 (Core)
Release: 7.8.2003
Codename: Core
HMA:
Distributor ID: CentOS
Description: CentOS Linux release 7.8.2003 (Core)
Release: 7.8.2003
Codename: Core
二、手动部署Mysql操作如下
1.环境检查
版本: Ver 14.14 Distrib 5.7.30, for linux-glibc2.12 (x86_64) using EditLine wrapper
检测系统是否自带Mysql
# rpm -qa|grep mysql
如果有进行强行卸载
# rpm -e --nodeps xxxxx
检测系统是否自带mariadb
# rpm -qa|grep mariadb
如果有进行强行卸载,删除命令和删除MySQL一样,这边就不在重复了。
检查mysql用户组和用户是否存在,如果没有,则创建
[root@wale /]# cat /etc/group | grep mysql [root@wale /]# cat /etc/passwd |grep mysql [root@wale /]# groupadd mysql [root@wale /]# useradd -r -g mysql mysql
2.安装
2.1.解压 安装包
tar xzvf mysql-5.7.30-linux-glibc2.12-x86_64.tar.gz
2.2.将解压包移动到usr/local 文件夹下
mv mysql-5.7.30-linux-glibc2.12-x86_64 /usr/local/mysql
2.3.更改mysql目录下所有的目录及文件夹所属的用户组和用户,以及权限
[root@wale /]# chown -R mysql:mysql /usr/local/mysql [root@wale /]# chmod -R 755 /usr/local/mysql
2.4.编译安装并初始化mysql,务必记住初始化输出日志末尾的密码(数据库管理员临时密码)
[root@wale /]# cd /usr/local/mysql/bin [root@wale bin]# ./mysqld --initialize --user=mysql --datadir=/usr/local/mysql/data --basedir=/usr/local/mysql
运行初始化命令成功后,末尾会打印数据库的登录密码
2.5在etc下新建my.cnf文件,并添加内容。
vi /etc/my.cnf [mysqld] datadir=/usr/local/mysql/data port=11306 sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES symbolic-links=0 max_connections=600 innodb_file_per_table=1 lower_case_table_names=1
新建my.cnf文件
3.启动
3.1测试启动mysql服务器
/usr/local/mysql/support-files/mysql.server start
3.2添加软连接,并重启mysql服务
[root@wale /]# ln -s /usr/local/mysql/support-files/mysql.server /etc/init.d/mysql [root@wale /]# ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql [root@wale /]# service mysql restart
3.3登录mysql,修改密码(密码为步骤5生成的临时密码)
[root@wale /]# mysql -u root -p Enter password: mysql>set password for root@localhost = password('yourpass');
4.关闭liunx的防火墙
[root@wale1 etc]# firewall-cmd --state running [root@wale1 etc]# systemctl stop firewalld.service [root@wale1 etc]# systemctl disable firewalld.service
三、主从数据库配置
1.配置master主库
my.cnf文件:
# 开启binlog
log_bin=mysql-bin
server-id = 1
sync-binlog=1
#忽略不需要同步的表
binlog-ignore-db=performance_schema
binlog-ignore-db=information_schema
binlog-ignore-db=sys
#指定需要同步的数据库
#binlog-do-db=lg_s4m1_test
修改完成后,重启mysql
#启动mysql
service mysql start
#停止mysql
service mysql stop
#重启mysql
service mysql restart
创建同步用户与连接执行连接授权
mysql> grant replication slave on *.* to 'mysync'@'192.168.85.%' identified by '123456';
mysql> flush privileges;
配置完成后,查看详细信息
mysql> SHOW MASTER STATUS;
+-------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 | 1949132 | | | |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0.01 sec)
2.配置slave从库
my.cnf 文件:可以不开启binlog,但是需要指定server-id,不能和主库重复。
server-id=2
#指定中继日志的名称
relay_log=mysql-relay-bin
#指定只读
read_only=1
重启mysql服务:
#启动mysql
service mysql start
#停止mysql
service mysql stop
#重启mysql
service mysql restart
登录mysql, 查询从机状态
show slave status;
reset slave
修改连接主库的信息
change master to
master_host='192.168.85.220',
master_port=13306,
master_user='mysync',
master_password='123456',
master_log_file='mysql-bin.000001',
master_log_pos=1949132;
启动从库配置
mysql> start slave;
再次查看状态
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.85.220
Master_User: mysync
Master_Port: 11306
Connect_Retry: 60
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 1949132
Relay_Log_File: relay-log.000002
Relay_Log_Pos: 1163
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1949132
Relay_Log_Space: 1364
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10
Master_UUID: a89b89c1-e2e2-11ea-87c8-000c29bd297d
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
四、半同步复制配置安装
1.master端:
1.1查询数据库是否支持动态安装插件
select @@have_dynamic_loading;
1.2查询插件列表
show plugins;
1.3安装semi插件
# liunx环境下安装的插件时,别名的后缀使用.so,但是windows下需要换成.dll,因为win系统不支持.so文件 install plugin rpl_semi_sync_master soname 'semisync_master.so'; # win系统安装插件 install plugin rpl_semi_sync_master soname 'semisync_master.dll';
1.4安装完成后查询插件参数
show variables like '%semi%';
参数:
- rpl_semi_sync_master_enabled | OFF | // 是否开启 off 为关闭
- rpl_semi_sync_master_timeout | 10000 | // 同步时长,默认10秒
修改semi参数
# 开启半同步 set global rpl_semi_sync_master_enabled=1; # 修改同步时间为1秒 set global rpl_semi_sync_master_timeout=1000;
2.slave端:
2.1 查询步骤同master端,此处省略。
2.2 安装semi插件
# liunx环境下安装的插件时,别名的后缀使用.so,但是windows下需要换成.dll,因为win系统不支持.so文件 install plugin rpl_semi_sync_slave soname 'semisync_slave.so'; # win系统安装插件 install plugin rpl_semi_sync_slave soname 'semisync_slave.dll';
2.3 安装完成后查询插件参数
install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
slave端的参数只有连个,此处只需要设置开启就可以了。
参数:rpl_semi_sync_slave_enabled OFF // 是否开启 off 为关闭
修改参数
# 开启半同步 set global rpl_semi_sync_slave_enabled=1;
2.4 重启slave
stop slave; start slave;
五、MHA安装与配置
1.主机配置
192.168.85.223 Manager 用于监控管理
192.168.85.220 Master mysql主库
192.168.85.209 Slave mysql从库1
192.168.85.223 Slave mysql从库2
在各个节点的etc/hosts文件中添加一下内容
192.168.85.223 node1.wale.com node1
192.168.85.220 node2.wale.com node2
192.168.85.209 node3.wale.com node3
192.168.85.223 node4.wale.com node4
执行命令 /etc/init.d/network restart 重启hosts;
执行命令”cat /etc/hosts“可以查看到hosts文件修改成功。
2.配置ssh免密登录
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1
当四台机器都进行了上述操作以后,我们可以在 manager 机器上看到如下文件
cd .ssh/
cat authorized_keys
文件为4组加密字符串:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC86fgtXVELP/K6U6fKA7cOwtPROx7A9cR4gE7MXtH4MIkrMDrbmXLzFWl9RVB+wnfio8mR2B/emUIv7YN1NOeilx4PKyRx5+D1N0Wxhgf7mdr9fkrXse14eCXL/IcMRX/VJ7bo+hKkiQXwIST+Am0/cK7bNzE0h1bQtaupS8/6HPPlcl6+yP1vORoFUKEaP3GTAymAxCcsIBkmHGNDFhYN+V6t3X3qlcKBvP3LgGrj6/3wWLAXGcYshzmL8EN55S4L2LIxbCfz1L2L3apEtx+UcXODbqU6ty0gVM8qhtpls4QNhw/5Rk4nUAfioOi2hClzL//GzKZNbOx30d8OWOZV root@wale1.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDxm27VzmPPPM5l3cK80vKAwcgsf3CwXvvNUHXa8ePoG1dsLEIjJJ/kSMZKBaq7y1HTHL/IS8beh+eDsXmaRIEQvf0A+TPVfajSMAhFi11ov11WduI/esXnkxgLLmO2slrpzdLhPrOXM5ZzzAaaGR71D/tcCY/eQ7VrtwL6MTZ2QBL1xjPO7v0SXii+327iEPH06EZhWd3Nu3HgaJ+gR0kywtOxFwdAOuAVC4WJl00KyWysfdwPXHC+t13e/x/RAkcJmtwtypJVEvJNoW6bpgTNsYaKQSAga2seTECE0jr18RoYsWPZaQDARgBWE7qz9T4pZhhC+HqJgjReBUuJr7Uz root@wale1.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3r2tJk+ALgU/BghZExuG1S7ING7MyJ7d1CNI7GL39OMeTRcgeFmTaxQvBeD3x1x6kiwoXe7tT1/GbEtQ/ol3OncrYGiDpix4R0xuoIEwTLoID8IezwzJJaGPOv9geywQtxSp3+wCr62f42SqRGgkNmy0/nbyrtIqzmpSFg6REqJqpGYJnU0RSAy36VS0Ve/TNvxg4srn5Hwr4vPR4bzJHI4PyAce5TLY+qM9N6CsrWzGvRfU6cZGBFpupetg4JVaLEKZDAUTnL4QQ1L8RiFxFVSEmmXdNoslP3aL7+n855ngnNUps7omySjLdnuS0gcJgaHf5Je/+4yK4Z/NoVUjT root@wale1.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzVnAv7nbwoZoo0oo0LIu17iH9tkmc1waPrHc/l0Xf0Fby+dQ+lXQdyrYfr1Ec266WWFkVJ9g+idQXQtAin1gwJKsVUBlhpKoXTW8btXpPAwlvDL8H3tMX67/X0hJnoEWrBJ15Fm9jyyf5OonrhbOV1YWBjC7tr3+m5Ehx3RhsIfg5Kf+VzPUrj3y0cSdMo5oK/iIhpJzsmoWtfs3WrDP40J0zhL0ukYJfPcZnWPJSPBavFs9vIkAHROvVcQVYl8jlMN73RKzmD91RJMqy7lHitxHGNxcQ3ZQSlxx8c2Ly/r05/o8MlrvXyrm4pYP3zbzhn2NCCwlyX0pLYcQ5Covd root@wale1.com
这个文件发送至另外三台机器,这四台机器就可以实现 ssh 无密码互通了:
scp authorized_keys root@node2:~/.ssh/
scp authorized_keys root@node3:~/.ssh/
scp authorized_keys root@node4:~/.ssh/
3.安装perl
MHA需要安装perl,安装步骤如下
3.1 安装gcc
yum install -y gcc
3.2 创建/usr/local/perl 目录,用于后面的安装
mkdir /usr/local/perl
3.3 加压perl ,解压后进入perl解压目中,执行
./Configure -des -Dprefix=/usr/local/perl -Dusethreads -Uversiononly
3.4 执行结束后在执行一下命令
make
make install
3.5 执行完成后perl -version 参看版本,显示版本信息,表示安装成功。
3.6 注意,安装manager,还需要安装perl的部分插件,一下是对应的安装目录,请在perl安装完成后,在Manager服务器上进行安装
perl-Config-Tiny-2.12-1.el6.rfx.noarch.rpm
perl-Email-Date-Format-1.002-15.el7.noarch.rpm
perl-libs-5.16.3-295.el7.x86_64.rpm
perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm
perl-Mail-Sender-0.8.23-1.el7.noarch.rpm
perl-Mail-Sendmail-0.79-21.el7.noarch.rpm
perl-MIME-Lite-3.030-1.el7.noarch.rpm
perl-MIME-Types-1.38-2.el7.noarch.rpm
perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm
六、安装MHA包
1.在manager节点服务器上安装mha4mysql-node-0.58-0.el7.centos.noarch.rpm 和 mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
其他3台数据库服务器上只需要安装mha4mysql-node-0.58-0.el7.centos.noarch.rpm
安装命令如下:
yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm
yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm
2.配置相关文件
在4台主机上创建mha的配置文件
mkdir /etc/mha_master
vim /etc/mha_master/mha.cnf
mha.cnf内容如下:
# 适用于server1,2,3个server的配置
[server default]
# mha管理用户/密码
user=mhaadmin
password=mhaadmin
# mha_master自己的工作路径
manager_workdir=/etc/mha_master/app1
# mha_master自己的日志文件
manager_log=/etc/mha_master/manager.log
# 每个远程主机的工作目录在何处
remote_workdir=/data/mha_master/app1
# 基于ssh的密钥认证
ssh_user=root
# 数据库用户名
repl_user=root
# 数据库密码
repl_password=123456
# ping间隔时长
ping_interval=1
# 节点2
[server1]
# 节点2主机地址
hostname=192.168.85.220
# 节点2的ssh端口
ssh_port=22
# 将来可不可以成为master候选节点/主节点
candidate_master=1
master_binlog_dir=/usr/local/mysql/data/
[server2]
hostname=192.168.85.209
ssh_port=22
candidate_master=1
master_binlog_dir=/usr/local/mysql/data/
[server3]
hostname=192.168.85.223
ssh_port=22
candidate_master=1
master_binlog_dir=/usr/local/mysql/data/
3.检查4个节点
3.1检测各节点间 ssh 互信通信配置是否 ok,在 Manager 机器上输入下述命令来检测:
masterha_check_ssh -conf=/etc/mha_master/mha.cnf
最后显示
All SSH connection tests passed successfully. – 表示互信成功
3.2检查管理的MySQL复制集群的连接配置参数是否OK,执行一下命令:
masterha_check_repl -conf=/etc/mha_master/mha.cnf
最后显示:
MySQL Replication Health is OK --表示检查成功
3.3启动MHA,执行一下命令
nohup masterha_manager -conf=/etc/mha_master/mha.cnf &> /etc/mha_master/manager.log &
查看 master 节点的状态:
masterha_check_status -conf=/etc/mha_master/mha.cnf
显示j结果为:mha (pid:4260) is running(0:PING_OK), master:192.168.81.67
表示mah启动成功
3.4停止MHA,执行一下命令
masterha_stop -conf=/etc/mha_master/mha.cnf
查看 master 节点的状态:
masterha_check_status -conf=/etc/mha_master/mha.cnf
显示j结果为:mha is stopped(2:NOT_RUNNING).
表示mah已经停止了
4.MHA故障转移测试步骤:
4.1登录master服务,执行ps -ef |grep mysql, 查询master的MySql的进程号,执行kill -9 ,杀掉进程,模拟mysql崩溃场景。
4.2登录manger 服务,查看日志:
tail -200 /etc/mha_master/manager.log
显示结果如下:
----- Failover Report -----
mha: MySQL Master failover 192.168.85.220(192.168.85.220:11306) to 192.168.85.209(192.168.85.209:11306) succeeded
Master 192.168.85.220(92.168.85.220:11306) is down!
Check MHA Manager logs at wale1.com:/etc/mha_master/manager.log for details.
该结果表示:master节点已经从79转移到67上了。
此时mha自动停止。
5.master服务的mysql恢复运行,加入从节点
5.1从slave2节点恢复数据导master服务数据库上。
mkdir /backup // 恢复数据的sql存放文件夹
mysqldump --all-database > /backup/mysql-backup-`date +%F-%T`-all.sql // 将binlog日志导出为sql
scp /backup/mysql-backup-2020-08-13-11:57:42-all.sql root@node2:~ // 将sql发送master服务器
5.2登录master服务,进行数据恢复
mysql < mysql-backup-2020-08-13-11:57:42-all.sql
5.3设置master为slave2服务器的从数据库,启动slaver并查询slave状态
change master to master_host='192.168.81.67',master_port=3306,master_user='root',master_password='123456',master_log_file='mysql-bin.000005',master_log_pos=414;
Query OK, 0 rows affected, 2 warnings (0.36 sec)
mysql> start slave;
Query OK, 0 rows affected (0.27 sec)
mysql> show slave status \G;
5.4再次执行检查操作
masterha_check_repl -conf=/etc/mha_master/mha.cnf
若没有问题,则启动 manager,注意,这次启动要记录日志:
masterha_manager -conf=/etc/mha_master/mha.cnf > /etc/mha_master/manager.log 2>&1 &
启动成功以后,我们来查看一下 master 节点的状态:
[root@wale1 ~]# masterha_check_status -conf=/etc/mha_master/mha.cnf
mha (pid:4260) is running(0:PING_OK), master:192.168.81.67
mha启动成功。进行主从数据测试。
完成以上步骤,MHA搭建测试基本完成。
七、遇到的问题
1.问题一
Failed to save binary log: Binlog not found from /var/lib/mysql,/var/log/mysql! If you got this error at MHA Manager, please set "master_binlog_dir=/path/to/binlog_directory_of_the_master
原因:mysql的binlog日志目录没有在默认的/var/log/mysql下,而是在其他地方,导致目录找不到。
解决办法:在manager节点的配置文件中每个节点下制定二进制日志的目录路径。
[server1]
hostname=192.168.199.105
candidate_master=1
master_binlog_dir=/mysql_data/ #指定二进制日志文件目录
2.问题二
Can't exec "mysqlbinlog": 没有那个文件或目录 at /usr/local/share/perl5/MHA/BinlogManager.pm line 106.mysqlbinlog version command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client options at /usr/local/bin/apply_diff_relay_logs line 493.
解决办法:
在所有节点上执行如下命令
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql
3.问题三
mysqldump命令找不到,执行一下命令:
ln -fs /usr/local/mysql/bin/mysqldump /usr/bin
4.问题四
故障转移后,短时间里再次进行故障转移,转移失败
[error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln310] Last failover was done at 2020/08/13 11:29:35. Current time is too early to do failover again. If you want to do failover, manually remove /etc/mha_master/app1/mha.failover.complete and run this script again.
上次故障切换是在2020/08/13 11:29:35完成的。当前时间太早,无法再次执行故障转移。如果要执行故障转移,请手动删除/etc/mha_master/app1/mha.failover.complete再运行这个脚本。