通过MGR(MySQL Group Replication)实现MySQL主从复制集群的时候,遇到很多问题:1.[GCS] There is no local IP address matching the one configured for the local node 。2.[GCS] Connection attempt from IP address 17.31.196.91 refused. Address is not in the IP whitelist。
一. 环境:CentOS Linux release 7.7 + MySQL5.7.28
二. 离线安装mysql
- 下载安装包:mysql-5.7.28-linux-glibc2.12-x86_64.tar.gz,地址:https://dev.mysql.com/downloads/mysql/
- 依次执行下面的命令,执行命令前,还要修改 hosts 文件。
vi /etc/hosts
tar -zxvf mysql-5.7.28-linux-glibc2.12-x86_64.tar.gz mv mysql-5.7.28-linux-glibc2.12-x86_64 mysql5728 mv mysql5728 /usr/local cd /usr/local/mysql5728 vim my.cnf
# hosts 添加以下内容 17.31.196.91 MGR-node1 17.31.196.92 MGR-node2
- my.cnf 配置内容如下
[mysql] default-character-set=UTF8 [client] port=3307 default-character-set=UTF8 [mysqld] port=3307 default-time-zone='+08:00' basedir=/usr/local/mysql5728 datadir=/usr/local/mysql5728/data max_connections=2000 max_connect_errors=10 character-set-server=UTF8 default-storage-engine=InnoDB sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION pid-file=/usr/local/mysql5728/mysqld.pid log-error=/usr/local/mysql5728/logs/mysqld.log #复制框架 server_id=1 gtid_mode=ON enforce_gtid_consistency=ON binlog_checksum=NONE log_bin=binlog log_slave_updates=ON binlog_format=ROW master_info_repository=TABLE relay_log_info_repository=TABLE #组复制设置 #server必须为每个事务收集写集合,并使用XXHASH64哈希算法将其编码为散列 transaction_write_set_extraction=XXHASH64 #告知插件加入或创建组命名,UUID loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" #server启动时不自启组复制,为了避免每次启动自动引导具有相同名称的第二个组,所以设置为OFF。 loose-group_replication_start_on_boot=off #告诉插件使用IP地址,端口24901用于接收组中其他成员转入连接 loose-group_replication_local_address="17.31.199.48:33061" #启动组server,种子server,加入组应该连接这些的ip和端口;其他server要加入组得由组成员同意 loose-group_replication_group_seeds="17.31.199.48:33061,17.31.199.49:33061" loose-group_replication_bootstrap_group=off report_host=17.31.199.48 report_port=3307
-
依次执行以下命令
mkdir logs touch logs/mysqld.log cd ..
-
没有mysql这个组和用户的话就创建
groupadd mysql useradd -g mysql mysql #文件夹赋权 chmod -R +660 mysql5728/ #更改文件属主和属组 chown -R mysql:mysql mysql5728/ chgrp -R mysql mysql5728/ #--defaults-file指定初始化时用到的配置文件,如果不指定会加载/etc/my.cnf,如果/etc/my.cnf不存在 #会加载默认配置 ./bin/mysqld --defaults-file=/usr/local/mysql5728/my.cnf --initialize --user=root -- basedir=/usr/local/mysql5728
-
初始化后,会生成data文件夹
chmod -R +660 data/ chown -R mysql:mysql data/ chgrp -R mysql data/
-
因为服务器已经部署了一个mysql,所以使用下面的方式启动
./mysqld_safe --defaults-file=/usr/local/mysql5728/my.cnf &
-
进入mysql客户端
#获取mysql初始密码 cat logs/mysqld.log | grep 'A temporary' #进入客户端 ./bin/mysql -u root -p
-
组复制操作命令
# 关闭binlog日志,经确认,binlog不是在start group_replication之后开始生成,而是在mysql服务启动的 # 时候,因为my.cnf里面配置了复制框架,在修改或添加用户的时候一定要设为0,关闭binlog,让修改或添加用 # 户的sql不计入binlog,这样从机也就不会执行到相关sql语句,因为binlog没有相关sql,那么从机的 # relaylog也不会有,不然从机在start group_replication的时候,会遇到事务执行中断的错误 SET SQL_LOG_BIN=0; set password='root'; # 让所有IP都可以访问 update mysql.user set host='%' where user='root'; grant all on *.* to 'root'@'%'; FLUSH PRIVILEGES; # 创建组复制用户 CREATE USER repl@'%' IDENTIFIED BY 'repl'; GRANT REPLICATION SLAVE ON *.* TO repl@'%'; FLUSH PRIVILEGES; SET SQL_LOG_BIN=1; CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='repl' FOR CHANNEL 'group_replication_recovery'; INSTALL PLUGIN group_replication SONAME 'group_replication.so';
-
创建MGR主机,在START GROUP_REPLICATION时失败,查看日志:[GCS] There is no local IP address matching the one configured for the local node (17.31.199.48:33061),原因:在开启组复制的时候,会查询本机可用的前12个IP(mysql官网有说明,地址https://bugs.mysql.com/bug.php?id=86772),想到17.31.199.48是路由器映射的IP,不是服务器的本地IP,可通过Linux命令 ifconfig 查看
SET GLOBAL group_replication_bootstrap_group=ON; START GROUP_REPLICATION;
-
更改my.cnf里面的 loose-group_replication_local_address、loose-group_replication_group_seeds、report_host 如下
loose-group_replication_local_address="17.31.196.91:33061" loose-group_replication_group_seeds="17.31.196.91:33061,17.31.196.92:33061" report_host=17.31.196.91
-
重启mysql服务,进入mysql客户端,并创建MGR主机,在START GROUP_REPLICATION时失败,查看日志:[GCS] Connection attempt from IP address 17.31.196.91 refused. Address is not in the IP whitelist。原因:没有设置组复制白名单。
SET GLOBAL group_replication_bootstrap_group=ON; START GROUP_REPLICATION;
-
更改my.cnf,在里面的组复制相关设置中添加白名单
loose-group_replication_ip_whitelist='17.31.196.91/24,17.31.196.92/24,127.0.0.1/8'
-
重启mysql服务,进入mysql客户端,并创建MGR主机
SET GLOBAL group_replication_bootstrap_group=ON; START GROUP_REPLICATION; SET GLOBAL group_replication_bootstrap_group=OFF; SELECT * FROM performance_schema.replication_group_members;
-
从服务器也按以上步骤创建mysql,要更改 my.cnf 的以下内容
server_id=2 loose-group_replication_local_address="17.31.196.92:33061" loose-group_replication_group_seeds="17.31.196.91:33061,17.31.196.92:33061" report_host=17.31.196.92
-
创建MGR从机
set global group_replication_allow_local_disjoint_gtids_join=ON; START GROUP_REPLICATION; SELECT * FROM performance_schema.replication_group_members;
-
如果查看到的 slave 状态一直是 recovring,则说明出了问题,查看MySQL报错日志,relaylog 回放失败,解决如下
STOP GROUP_REPLICATION; # Master_Log_File 主机binlog,Master_Log_Pos 执行错误点 change master to Master_Log_File='binlog.000004', Master_Log_Pos=394186; START GROUP_REPLICATION; SELECT * FROM performance_schema.replication_group_members;
-
关于 change master to 的详细说明请参见:https://blog.csdn.net/weixin_34408624/article/details/89773292