GR node failure: "The member contains transactions not present in the group"

Background

In a shared test environment running a 3-node, single-primary group replication cluster, starting node 3 (node3) one day produced the error: [ERROR] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.'
This post records the troubleshooting process and the lessons learned.

What the error means

The member contains transactions that do not exist in the group. Checking the data on node3 confirmed that it differs from the other nodes: most likely the instance was used as a standalone database while the GR service was not running.
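A quick way to confirm this (not part of the original troubleshooting, just a sketch) is to compare node3's executed GTID set against the group's, using the gtid_executed value copied from a healthy member:

-- On node3: list the GTIDs this node has executed
SELECT @@GLOBAL.gtid_executed;

-- Paste in the gtid_executed taken from a healthy member (placeholder value below).
-- GTID_SUBSET() returns 1 only if every transaction on node3 is also in the group,
-- so 0 means node3 has extra local transactions.
SELECT GTID_SUBSET(@@GLOBAL.gtid_executed,
                   'ba279ebc-9ac8-11ea-a379-a08cfdcc0735:1-250791:1000181-1000191') AS is_subset;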

How it was fixed

Based on the documentation, the problem can be resolved by resetting the master information and rejoining the node to the GR group.

  1. Check the current master status
    In the output below, Executed_Gtid_Set contains transactions under node3's own server UUID (14a7ed7d-…) in addition to the group's UUID (ba279ebc-…); those local-only transactions are what trigger the error.
mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000021
         Position: 246
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 14a7ed7d-7e42-11ea-add1-dc4a3e82bdf9:1-66725,
ba279ebc-9ac8-11ea-a379-a08cfdcc0735:1-250791:1000181-1000191
1 row in set (0.01 sec)
  2. Reset the master information
    The official description: RESET MASTER enables you to delete any binary log files and their related binary log index file, returning the master to its state before binary logging was started.
    This deletes the binary logs and clears the master information, including the executed GTID set.
RESET MASTER

Checking the master status again shows that the logs have been cleared.

mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 150
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set (0.00 sec)
  3. Rejoin the GR group
CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='password' 
FOR CHANNEL 'group_replication_recovery';
start group_replication;
  4. The error log shows that GR has started and recovery is in progress
    Slave I/O thread for channel 'group_replication_recovery': connected to master 'repl@s1:3306', replication started in log 'FIRST' at position 4
2020-06-15T23:09:28.805095Z 2 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2020-06-15T23:09:28.805232Z 2 [Note] Plugin group_replication reported: '[GCS] Added automatically IP ranges 10.91.57.189/26,127.0.0.1/8 to the whitelist'
2020-06-15T23:09:28.805563Z 2 [Note] Plugin group_replication reported: '[GCS] Translated 's3' to 127.0.0.1'
2020-06-15T23:09:28.805713Z 2 [Warning] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the whitelist. It is mandatory that it is added.'
2020-06-15T23:09:28.805741Z 2 [Note] Plugin group_replication reported: '[GCS] SSL was not enabled'
2020-06-15T23:09:28.805752Z 2 [Note] Plugin group_replication reported: 'Initialized group communication with configuration: group_replication_group_name: "ba279ebc-9ac8-11ea-a379-a08cfdcc0735"; group_replication_local_address: "s3:33061"; group_replication_group_seeds: "s1:33061,s2:33061,s3:33061"; group_replication_bootstrap_group: false; group_replication_poll_spin_loops: 0; group_replication_compression_threshold: 1000000; group_replication_ip_whitelist: "AUTOMATIC"'
2020-06-15T23:09:28.805859Z 2 [Note] Plugin group_replication reported: '[GCS] Configured number of attempts to join: 0'
2020-06-15T23:09:28.805880Z 2 [Note] Plugin group_replication reported: '[GCS] Configured time between attempts to join: 5 seconds'
2020-06-15T23:09:28.805893Z 2 [Note] Plugin group_replication reported: 'Member configuration: member_id: 3; member_uuid: "14a7ed7d-7e42-11ea-add1-dc4a3e82bdf9"; single-primary mode: "true"; group_replication_auto_increment_increment: 7; '
2020-06-15T23:09:28.806102Z 185 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-06-15T23:09:28.949470Z 188 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log './s3-relay-bin-group_replication_applier.000001' position: 4
2020-06-15T23:09:28.949474Z 2 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2020-06-15T23:09:28.949674Z 2 [Note] Plugin group_replication reported: 'auto_increment_increment is set to 7'
2020-06-15T23:09:28.949683Z 2 [Note] Plugin group_replication reported: 'auto_increment_offset is set to 3'
2020-06-15T23:09:28.949934Z 0 [Note] Plugin group_replication reported: 'XCom protocol version: 3'
2020-06-15T23:09:28.949962Z 0 [Note] Plugin group_replication reported: 'XCom initialized and ready to accept incoming connections on port 33061'
2020-06-15T23:09:30.817333Z 2 [Note] Plugin group_replication reported: 'This server is working as secondary member with primary member address s2:3306.'
2020-06-15T23:09:30.817574Z 191 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'
2020-06-15T23:09:30.817687Z 0 [Note] Plugin group_replication reported: 'Group membership changed to s3:3306, s2:3306, s1:3306 on view 15901818388998341:33.'
2020-06-15T23:09:31.060257Z 191 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='s2', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='s1', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2020-06-15T23:09:31.235763Z 191 [Note] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor 9a44a006-7b43-11ea-a3eb-a08cfdc63e31 at s1 port: 3306.'
2020-06-15T23:09:31.236009Z 193 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2020-06-15T23:09:31.237547Z 193 [Note] Slave I/O thread for channel 'group_replication_recovery': connected to master 'repl@s1:3306',replication started in log 'FIRST' at position 4

Check the GR status:

mysql> SELECT * FROM performance_schema.replication_group_members;

All three members are visible, but node3 stays in RECOVERING and, after a while, fails and drops out of the group.
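The causes listed below were found in the error log; as a sketch (not from the original troubleshooting), the same recovery-channel errors can also be inspected from SQL:

-- State and last error of the recovery channel
SHOW SLAVE STATUS FOR CHANNEL 'group_replication_recovery'\G

-- Or query performance_schema for the last applier error on that channel
SELECT channel_name, last_error_number, last_error_message
  FROM performance_schema.replication_applier_status_by_worker
 WHERE channel_name = 'group_replication_recovery';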

Problems encountered

  1. Error_code: 1050, Table 'repl0523' already exists
2020-06-15T19:16:28.282318Z 22 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Table 'repl0523' already exists' on query. Default database: 'test'. Query: 'create table repl0523 (id int,name varchar(30))', Error_code: 1050

The table already exists on node3, so it has to be dropped before recovery can continue.
Because the node is already a GR secondary it is read-only by default, and the cleanup statements must also not be replicated to the other nodes, where they would corrupt good data. The steps used:

SET SQL_LOG_BIN=0;        -- disable binary logging for this session so the cleanup stays local
SET GLOBAL read_only=0;   -- temporarily allow writes on this node
DROP TABLE repl0523;      -- drop the conflicting table
SET SQL_LOG_BIN=1;        -- re-enable binary logging
-- reset the master information as in the steps above
START GROUP_REPLICATION;  -- rejoin the group
  2. Multiple primary key defined
2020-06-15T20:14:40.801344Z 150 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Multiple primary key defined' on query. Default database: 'test'. Query: 'ALTER TABLE `test`.`test_record_table`

The replicated statement tries to add another primary key to the table. A table can only have one primary key, and on node3 this table already has one; the binlog shows the statement creating the primary key again. Dropping the primary key from the table on node3 let recovery continue (a sketch follows).
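Presumably the same keep-it-local pattern as above applies; a minimal sketch, with the table name taken from the error message:

SET SQL_LOG_BIN=0;        -- keep the change local to node3
SET GLOBAL read_only=0;   -- temporarily allow writes, as above
ALTER TABLE `test`.`test_record_table` DROP PRIMARY KEY;
SET SQL_LOG_BIN=1;
START GROUP_REPLICATION;  -- retry recovery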

  3. Error_code: 1007, database exists
 2020-06-15T21:57:50.313727Z 165 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Can't create database 'test_db'; database exists' on query. Default database: 'test_db'. Query: '/* ApplicationName=DataGrip 2019.3.1 */ create schema TEST_DB', Error_code: 1007

The database already exists on node3. It was dropped (following the same procedure as above) and recovery continued.

Summary

  1. After RESET MASTER the node rejoins GR as a brand-new member, and every transaction executed since the group was created is replayed; any leftover local data therefore has to be removed first.
  2. The local transactions previously executed on the newly joined node are not touched by this process; those local transactions are exactly what caused node3 to fall out of GR in the first place.
  3. One open question: online recovery is slow. How to first load the existing data into a new node and only then join it to GR is left for a later study.