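Method 1: worked example
The topology is A->B->C. Suppose replication on slave C fails because the binlog it needs has been purged on B, while the binlogs on A are still complete.
Solution: point C at A and replicate from A.
Run on C:
stop slave;
change master to master_host='<A_IP>',master_port=<A_port>,MASTER_USER='RepUser',MASTER_PASSWORD='<password>',master_auto_position=1;
start slave;
show slave status \G
Check that Slave_IO_Running and Slave_SQL_Running are both Yes.
Once C has caught up, point its replication back at B.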
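The last step, repointing C at B, can be sketched as follows. This is a minimal sketch that assumes B accepts the same RepUser replication account and that C has already caught up with A; `<B_IP>`, `<B_port>` and `<password>` are placeholders, not values from the original case.

```sql
-- Run on C, only after SHOW SLAVE STATUS reports Seconds_Behind_Master: 0
-- while C is replicating from A.
STOP SLAVE;

-- Placeholders below are assumptions; substitute B's real address and credentials.
CHANGE MASTER TO
    master_host='<B_IP>',
    master_port=<B_port>,
    MASTER_USER='RepUser',
    MASTER_PASSWORD='<password>',
    master_auto_position=1;

START SLAVE;

-- Both Slave_IO_Running and Slave_SQL_Running should be Yes.
SHOW SLAVE STATUS\G
```

With master_auto_position=1, C tells B which GTIDs it has already executed, so no binlog file name or position needs to be specified.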
Method 2: worked example
1 Problem description
Replication on the slave fails with the following error:
mysql> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.1.203
Master_User: RepUser
Master_Port: 3307
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 436
Relay_Log_File: pc3-relay-bin.000003
Relay_Log_Pos: 649
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 436
Relay_Log_Space: 1598
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: 50418265-e17f-11e9-95b3-080027040516
Master_Info_File: /data/server/mysql_3307/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 200313 13:34:44
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: a4ac8cd2-e17c-11e9-a602-080027040516:40-42
Executed_Gtid_Set: 50418265-e17f-11e9-95b3-080027040517:1-155,
a4ac8cd2-e17c-11e9-a602-080027040516:1-42
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
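You can reproduce this failure in your own virtual machines as follows:
- Stop replication on the slave: stop slave;
- On the master: flush logs; insert a few rows of test data; flush logs; purge binary logs to '<newest binlog file>';
- Start replication on the slave again; the error then appears.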
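The reproduction steps above might look like this when written out. This is only a sketch: the table `dba.t` with a single integer column `id` is inferred from the binlog excerpt and the verification query later in this article, and the binlog file name passed to PURGE is hypothetical (use the last file listed by SHOW BINARY LOGS).

```sql
-- On the slave: pause replication so the slave falls behind.
STOP SLAVE;

-- On the master: rotate the binlog, write test rows, rotate again.
FLUSH LOGS;
INSERT INTO dba.t (id) VALUES (32);   -- assumed table and column
INSERT INTO dba.t (id) VALUES (33);
FLUSH LOGS;

-- On the master: find the newest binlog file, then purge everything before it.
SHOW BINARY LOGS;
PURGE BINARY LOGS TO 'mysql-bin.000017';   -- hypothetical file name

-- On the slave: resume replication; Last_IO_Errno should now show 1236.
START SLAVE;
SHOW SLAVE STATUS\G
```

Only do this on a test instance, since the purged binlogs cannot be recovered from the server afterwards.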
2 Cause
The binlogs that the slave needs in order to continue replicating have been purged on the master.
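One way to confirm this cause is to compare the master's purged GTIDs with what the slave has executed. The sketch below uses the built-in GTID_SUBTRACT() function; the quoted placeholder must be replaced by hand with the value read from the master, and the column alias is just an illustrative name.

```sql
-- On the master: GTIDs whose binlog files no longer exist on this server.
SELECT @@GLOBAL.gtid_purged;

-- On the slave: paste the master's gtid_purged value in place of the placeholder.
-- A non-empty result lists transactions the slave still needs
-- but the master can no longer send, which is what error 1236 reports.
SELECT GTID_SUBTRACT('<gtid_purged from the master>',
                     @@GLOBAL.gtid_executed) AS purged_but_not_executed;
```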
3 Solution
3.1 Find the transactions that the master has executed but the slave has not
3.1.1 Compare Executed_Gtid_Set on the master and the slave
Command:
show master status;
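# Output on the master:
# Output on the slave:
Comparing the two, the slave appears to be missing the two transactions a4ac8cd2-e17c-11e9-a602-080027040516:43-44. The binlog containing them was purged on the master, which is why replication on the slave fails.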
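Instead of comparing the two sets by eye, the difference can be computed with GTID_SUBTRACT(). In the sketch below the slave's set is copied from the SHOW SLAVE STATUS output above; the master's set is an assumption (only the range 1-44 implied by the inference above), because the master's actual output is not reproduced here.

```sql
-- First argument: the master's Executed_Gtid_Set (assumed, see lead-in).
-- Second argument: the slave's Executed_Gtid_Set from SHOW SLAVE STATUS.
SELECT GTID_SUBTRACT(
    'a4ac8cd2-e17c-11e9-a602-080027040516:1-44',
    '50418265-e17f-11e9-95b3-080027040517:1-155,a4ac8cd2-e17c-11e9-a602-080027040516:1-42'
) AS missing_on_slave;
-- Result: a4ac8cd2-e17c-11e9-a602-080027040516:43-44
```

The function returns every GTID in the first set that is not in the second, so the statement can be run on any MySQL server with the two sets pasted in as strings.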
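3.1.2 Extract the SQL for those GTIDs from the binlog
This step requires that the binlog file containing these GTIDs (mysql-bin.000016 here) is still available somewhere, for example in a binlog backup.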
mysqlbinlog -v mysql-bin.000016 --include-gtids='a4ac8cd2-e17c-11e9-a602-080027040516:43-44' >44.log
less 44.log
The output contains the entries for both GTIDs:
SET @@SESSION.GTID_NEXT= 'a4ac8cd2-e17c-11e9-a602-080027040516:43'/*!*/;
……
Below each entry are the corresponding SQL statements:
BINLOG '
NhtrXhMBAAAAKwAAAD8KAAAAAG4AAAAAAAEAA2RiYQABdAABAwABeSxbrQ==
NhtrXh4BAAAAKAAAAGcKAAAAAG4AAAAAAAEAAgAB//4gAAAA4AsBIQ==
'/*!*/;
### INSERT INTO `dba`.`t`
### SET
### @1=32
……
SET @@SESSION.GTID_NEXT= 'a4ac8cd2-e17c-11e9-a602-080027040516:44'/*!*/;
# at 2759
#200313 13:33:58 server id 1 end_log_pos 2830 CRC32 0xd24b2596 Query thread_id=2 exec_time=0 error_code=0
SET TIMESTAMP=1584077638/*!*/;
BEGIN
/*!*/;
# at 2830
#200313 13:33:58 server id 1 end_log_pos 2873 CRC32 0xad0f2ca4 Table_map: `dba`.`t` mapped to number 110
# at 2873
#200313 13:33:58 server id 1 end_log_pos 2913 CRC32 0xa80a5f68 Write_rows: table id 110 flags: STMT_END_F
BINLOG '
RhtrXhMBAAAAKwAAADkLAAAAAG4AAAAAAAEAA2RiYQABdAABAwABpCwPrQ==
RhtrXh4BAAAAKAAAAGELAAAAAG4AAAAAAAEAAgAB//4hAAAAaF8KqA==
'/*!*/;
### INSERT INTO `dba`.`t`
### SET
### @1=33
3.1.3 Verify on both the master and the slave whether the data is missing
select * from dba.t where id=33;
3.2 Apply these two transactions manually on the slave
mysqlbinlog mysql-bin.000016 --include-gtids='a4ac8cd2-e17c-11e9-a602-080027040516:43-44' | mysql -h 192.168.1.204 -u root -psystem@123 -P 3307
3.3 Restart replication on the slave
stop slave;
start slave;
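show slave status \G  # Check that replication is running normally (no trailing ";" after \G, which would trigger "No query specified").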