MySql-MHA-Docker集群部署搭建
本文概述
本文主要介绍 docker 安装配置Mysql 、安装配置MHA 从而实现快速构建高可用mysql集群。
机器准备
一个简单的表格是这么创建的:
地址 | 软件 | 备注 |
---|---|---|
172.22.1.68 | mysql (master)、mha data node | |
172.22.1.88 | mysql (slave)、mha data node | |
172.22.1.94 | mysql (备master)、mha data node | |
172.22.1.81 | MHA、mha data node、mha manage | 管理 |
架构图:
前置条件
- docker环境安装;
docker 安装参考:docker安装 - 机器具备访问外网的条件;
MySQL主从复制集群搭建
1、安装mysql:准备3台机器安装mysql
$ docker pull mysql:5.7.34
$ mkdir -p /usr/local/mysql
$ cd /usr/local/mysql
$ docker run -itd --privileged -p 3306:3306 --name mysql-5.7.34 -v $PWD/conf:/etc/mysql/conf.d -v $PWD/logs:/var/log/mysql -v $PWD/data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=zlwy@mysql -d mysql:5.7.34
$ docker exec -it mysql-5.7.34 bash
$ mysql -u root -p
$ GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'zlwy@mysql' WITH GRANT OPTION;
$ flush privileges;
2、安装配置mysql master
- 创建修改 my.cnf
$ vi /usr/local/mysql/conf/my.cnf
[mysqld]
log_bin=master-bin-68
log_bin_index=master-bin.index
server-id=1
# 每次写入都同步到binlog
sync-binlog=1
binlog-ignore-db=information_schema
# 忽略不同步
binlog-ignore-db=performance_schema
binlog-ignore-db=sys
gtid_mode=on
enforce_gtid_consistency=1
log_slave_updates=1
relay_log_index=master_relay_bin.index
relay_log=master_relay_bin
plugin_load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled=1
loose_rpl_semi_sync_slave_enabled=1
loose_rpl_semi_sync_master_timeout=5000
2 . 重启mysql: systemctl restart mysqld.service
$ docker restart mysql-5.7.34
$ docker exec -it mysql-5.7.34 bash
3 . 登录mysql进行账户授权,注:identified by 后需要填写你root账户对应的密码
$ grant replication slave on *.* to 'root'@'%' identified by 'zlwy@mysql';
$ GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'zlwy@mysql';
$ flush privileges;
$ show master status;
3、安装配置(88)mysql slave
- 创建修改 my.cnf
$ vi /usr/local/mysql/conf/my.cnf
[mysqld]
server-id=2
sync-binlog=1
log_bin=master-bin-68
binlog-ignore-db=information_schema
binlog-ignore-db=performance_schema
binlog-ignore-db=sys
relay_log_index=master_relay_bin.index
relay_log=master_relay_bin
relay_log_purge=0
innodb_log_file_size=256M
expire-logs-days=1
read_only=1
gtid_mode=on
enforce_gtid_consistency=on
log_slave_updates=1
plugin_load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled=1
loose_rpl_semi_sync_slave_enabled=1
loose_rpl_semi_sync_master_timeout=5000
$ docker restart mysql-5.7.34
$ docker exec -it mysql-5.7.34 bash
$ mysql -u root -p
- 配置MySQL slave
MySQL > show master status;
# master_log_file 和 master_log_pos的值来自 68的 show master status;
mysql > change master to
master_host='172.22.1.68',
master_port=3306,
master_user='root',
master_password='zlwy@mysql',
master_log_file='master-bin-68.000004',
master_log_pos=927,
MASTER_AUTO_POSITION=0;
mysql >start slave;
mysql >show slave status\G
3.可以查看配置信息
mysql > show global variables like '%gtid%';
4、安装配置(94)mysql slave
- 创建修改 my.cnf
$ vi /usr/local/mysql/conf/my.cnf
[mysqld]
server-id=3
sync-binlog=1
log_bin=master-bin-68
binlog-ignore-db=information_schema
binlog-ignore-db=performance_schema
binlog-ignore-db=sys
relay_log_index=master_relay_bin.index
relay_log=master_relay_bin
innodb_log_file_size=256M
expire-logs-days=1
read_only=1
gtid_mode=on
enforce_gtid_consistency=on
log_slave_updates=1
# 半同步 show variables like '%semi%';
plugin_load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled=1
loose_rpl_semi_sync_slave_enabled=1
loose_rpl_semi_sync_master_timeout=5000
$ docker restart mysql-5.7.34
$ docker exec -it mysql-5.7.34 bash
$ mysql -u root -p
- 配置MySQL slave
MySQL > show master status;
# master_log_file 和 master_log_pos的值来自 68的 show master status;
mysql > change master to
master_host='172.22.1.68',
master_port=3306,
master_user='root',
master_password='zlwy@mysql',
master_log_file='master-bin-68.000004',
master_log_pos=927,
MASTER_AUTO_POSITION=0;
# 参考:stop slave; reset slave all;
mysql >start slave;
mysql >show slave status\G
4、测试主从复制
# master(68) 创建一个数据库test
mysql> create database test;
# slave1 查看是否创建同步成功
mysql> show databases;
mysql> drop database test;
配置公钥互信 (四台机器ssh互通)
# 1. 启动ssh
service sshd restart
# 2. 在分发密钥是需要输入系统密码
# 生成密钥对
ssh-keygen -t rsa
# 分发到所有节点包括自己(4机器逗得)
ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.22.1.68
ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.22.1.88
ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.22.1.94
ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.22.1.81
MHA管理软件搭建
1、环境准备
四台机器节点安装以下组件: 4个节点都得装
$ yum install epel-release -y
$ yum install perl-DBD-MySQL -y
$ yum install perl-Config-Tiny -y
$ yum install perl-Log-Dispatch -y
$ yum install perl-Parallel-ForkManager -y
2、安装data node节点(四台机器)
wget https://qiniu.wsfnk.com/mha4mysql-node-0.58-0.el7.centos.noarch.rpm --no-check-certificate
rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
3、manager上安装(81)
wget https://qiniu.wsfnk.com/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm --no-check-certificate
rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
vim /etc/mha-manager.cnf
[server default]
manager_log=/var/log/mha/app1/manager.log
#mha_master自己的工作路径
manager_workdir=/var/log/mha/app1
#binlog
master_binlog_dir=/var/lib/mysql
#mha管理密码
password=zlwy@mysql
#ping间隔时长
ping_interval=3
#每个远程主机的工作目录在何处
remote_workdir=/var/log/mha/app1
#数据库密码
repl_password=zlwy@mysql
#数据库用户名
repl_user=root
#基于ssh的密钥认证
ssh_user=root
#mha管理用户
user=root
[server1]
hostname=172.22.1.68
port=3306
#将来可不可以成为master候选节点/主节点
candidate_master=1
[server2]
hostname=172.22.1.94
port=3306
candidate_master=1
[server3]
hostname=172.22.1.88
port=3306
no_master=1
- 运行检查SSH配置
$ masterha_check_ssh --conf=/etc/mha-manager.cnf
[info] All SSH connection tests passed successfully.
- 配置验证
masterha_check_repl --conf=/etc/mha-manager.cnf
Thu Jul 15 16:36:22 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jul 15 16:36:22 2021 - [info] Reading application default configuration from /etc/mha-manager.cnf..
Thu Jul 15 16:36:22 2021 - [info] Reading server configuration from /etc/mha-manager.cnf..
Thu Jul 15 16:36:22 2021 - [info] MHA::MasterMonitor version 0.58.
Thu Jul 15 16:36:23 2021 - [info] GTID failover mode = 1
Thu Jul 15 16:36:23 2021 - [info] Dead Servers:
Thu Jul 15 16:36:23 2021 - [info] Alive Servers:
Thu Jul 15 16:36:23 2021 - [info] 172.22.1.68(172.22.1.68:3306)
Thu Jul 15 16:36:23 2021 - [info] 172.22.1.94(172.22.1.94:3306)
Thu Jul 15 16:36:23 2021 - [info] 172.22.1.88(172.22.1.88:3306)
Thu Jul 15 16:36:23 2021 - [info] Alive Slaves:
Thu Jul 15 16:36:23 2021 - [info] 172.22.1.94(172.22.1.94:3306) Version=5.7.34-log (oldest major version between slaves) log-bin:enabled
Thu Jul 15 16:36:23 2021 - [info] GTID ON
Thu Jul 15 16:36:23 2021 - [info] Replicating from 172.22.1.68(172.22.1.68:3306)
Thu Jul 15 16:36:23 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Jul 15 16:36:23 2021 - [info] 172.22.1.88(172.22.1.88:3306) Version=5.7.34-log (oldest major version between slaves) log-bin:enabled
Thu Jul 15 16:36:23 2021 - [info] GTID ON
Thu Jul 15 16:36:23 2021 - [info] Replicating from 172.22.1.68(172.22.1.68:3306)
Thu Jul 15 16:36:23 2021 - [info] Not candidate for the new Master (no_master is set)
Thu Jul 15 16:36:23 2021 - [info] Current Alive Master: 172.22.1.68(172.22.1.68:3306)
Thu Jul 15 16:36:23 2021 - [info] Checking slave configurations..
Thu Jul 15 16:36:23 2021 - [info] Checking replication filtering settings..
Thu Jul 15 16:36:23 2021 - [info] binlog_do_db= , binlog_ignore_db= information_schema,performance_schema,sys
Thu Jul 15 16:36:23 2021 - [info] Replication filtering check ok.
Thu Jul 15 16:36:23 2021 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Thu Jul 15 16:36:23 2021 - [info] Checking SSH publickey authentication settings on the current master..
Thu Jul 15 16:36:28 2021 - [warning] HealthCheck: Got timeout on checking SSH connection to 172.22.1.68! at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 343.
Thu Jul 15 16:36:28 2021 - [info]
172.22.1.68(172.22.1.68:3306) (current master)
+--172.22.1.94(172.22.1.94:3306)
+--172.22.1.88(172.22.1.88:3306)
Thu Jul 15 16:36:28 2021 - [info] Checking replication health on 172.22.1.94..
Thu Jul 15 16:36:28 2021 - [info] ok.
Thu Jul 15 16:36:28 2021 - [info] Checking replication health on 172.22.1.88..
Thu Jul 15 16:36:28 2021 - [info] ok.
Thu Jul 15 16:36:28 2021 - [warning] master_ip_failover_script is not defined.
Thu Jul 15 16:36:28 2021 - [warning] shutdown_script is not defined.
Thu Jul 15 16:36:28 2021 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
- 启动 manager
$ nohup masterha_manager --conf=/etc/mha-manager.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
Thu Jul 15 16:36:48 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jul 15 16:36:48 2021 - [info] Reading application default configuration from /etc/mha-manager.cnf..
Thu Jul 15 16:36:48 2021 - [info] Reading server configuration from /etc/mha-manager.cnf..
Thu Jul 15 16:36:48 2021 - [info] MHA::MasterMonitor version 0.58.
Thu Jul 15 16:36:49 2021 - [info] GTID failover mode = 1
Thu Jul 15 16:36:49 2021 - [info] Dead Servers:
Thu Jul 15 16:36:49 2021 - [info] Alive Servers:
Thu Jul 15 16:36:49 2021 - [info] 172.22.1.68(172.22.1.68:3306)
Thu Jul 15 16:36:49 2021 - [info] 172.22.1.94(172.22.1.94:3306)
Thu Jul 15 16:36:49 2021 - [info] 172.22.1.88(172.22.1.88:3306)
Thu Jul 15 16:36:49 2021 - [info] Alive Slaves:
Thu Jul 15 16:36:49 2021 - [info] 172.22.1.94(172.22.1.94:3306) Version=5.7.34-log (oldest major version between slaves) log-bin:enabled
Thu Jul 15 16:36:49 2021 - [info] GTID ON
Thu Jul 15 16:36:49 2021 - [info] Replicating from 172.22.1.68(172.22.1.68:3306)
Thu Jul 15 16:36:49 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Jul 15 16:36:49 2021 - [info] 172.22.1.88(172.22.1.88:3306) Version=5.7.34-log (oldest major version between slaves) log-bin:enabled
Thu Jul 15 16:36:49 2021 - [info] GTID ON
Thu Jul 15 16:36:49 2021 - [info] Replicating from 172.22.1.68(172.22.1.68:3306)
Thu Jul 15 16:36:49 2021 - [info] Not candidate for the new Master (no_master is set)
Thu Jul 15 16:36:49 2021 - [info] Current Alive Master: 172.22.1.68(172.22.1.68:3306)
Thu Jul 15 16:36:49 2021 - [info] Checking slave configurations..
Thu Jul 15 16:36:49 2021 - [info] Checking replication filtering settings..
Thu Jul 15 16:36:49 2021 - [info] binlog_do_db= , binlog_ignore_db= information_schema,performance_schema,sys
Thu Jul 15 16:36:49 2021 - [info] Replication filtering check ok.
Thu Jul 15 16:36:49 2021 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Thu Jul 15 16:36:49 2021 - [info] Checking SSH publickey authentication settings on the current master..
Thu Jul 15 16:36:54 2021 - [warning] HealthCheck: Got timeout on checking SSH connection to 172.22.1.68! at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 343.
Thu Jul 15 16:36:54 2021 - [info]
172.22.1.68(172.22.1.68:3306) (current master)
+--172.22.1.94(172.22.1.94:3306)
+--172.22.1.88(172.22.1.88:3306)
Thu Jul 15 16:36:54 2021 - [warning] master_ip_failover_script is not defined.
Thu Jul 15 16:36:54 2021 - [warning] shutdown_script is not defined.
Thu Jul 15 16:36:54 2021 - [info] Set master ping interval 3 seconds.
Thu Jul 15 16:36:54 2021 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Thu Jul 15 16:36:54 2021 - [info] Starting ping health check on
- 查询日志:
tailf -n 100 /var/log/mha/app1/manager.log
- 检查集群状态
masterha_check_status --conf=/etc/mha-manager.cnf
mha-manager (pid:12443) is running(0:PING_OK), master:172.22.1.68
- 停止集群操作
masterha_stop --conf=/etc/mha-manager.cnf
报错
报错1:The replication receiver thread cannot start because the master has GTID_MODE = ON and this server has GTID_MODE = OFF.
# 添加如下设置
mysql> set global gtid_mode=OFF_PERMISSIVE;
mysql> set global gtid_mode=ON_PERMISSIVE;
mysql> set global enforce_gtid_consistency=on;
mysql> set global gtid_mode=ON;
# 可以查看配置信息
mysql > show global variables like '%gtid%';
报错2:Error ‘Can’t drop database ‘test’; database doesn’t exist’ on query.
问题出现原因:在主库上进行删除了一个数据库test,然后在其他的服务器上在搭建主库的从库,在从库上没有相应都已经删除的数据库test而导致的Slave_IO_Running: Yes,Slave_SQL_Running: No 。
处理的办法:停掉从库,然后在创建相应的已经删除的信息,然后在开启从库。
mysql> stop slave;
mysql> create database test;
mysql> start slave;
mysql> show slave status\G
$ rm -rf /var/log/mha/app1/mha-manager.failover.complete
报错3:Parameters MASTER_LOG_FILE, MASTER_LOG_POS, RELAY_LOG_FILE and RELAY_LOG_POS
问题介绍: 主库GTID关闭了。
解决的办法:
由于master_auto_position=1,现在改回0,重新用pos方法change就可以了。
MySQL > change master to master_auto_position=0;
###报错4:Got fatal error 1236 from master when reading data from binary log: ‘Binary log is not open’
mysql > stop slave;
mysql > reset slave;
mysql > start slave;