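1. Cluster Management
1.1. Skipping Replication Errors
1.1.1. Skipping a Single Replication Error in GTID Mode
(1) Analyze the error reported by the replica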
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.21.167.73
Master_User: Repl
Master_Port: 20201
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 1083
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 971
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4' at master log mysql-bin.000001, end_log_pos 1052. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 805
Relay_Log_Space: 1456
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4' at master log mysql-bin.000001, end_log_pos 1052. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 16773
Master_UUID: eef85c8f-bef5-11eb-a336-00e0ed90dfcb
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 210527 22:24:43
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-4
Executed_Gtid_Set: eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-3
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
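Retrieved_Gtid_Set lists the transactions the replica has received from the master, and Executed_Gtid_Set lists the transactions the replica has already executed. Here the replica has received eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-4 but executed only 1-3, so it failed while applying the master's transaction eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4. In principle, skipping eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4 is enough to let replication resume.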
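The comparison of the two GTID sets can be sketched in Python. This is a minimal illustration, not part of the original procedure: the helper names `parse_gtid_set` and `missing_gtids` are hypothetical, and the parser expands every interval into individual numbers, so it is only suitable for small sets like the one in this example. On the server itself, the built-in `GTID_SUBTRACT()` function performs the same subtraction.

```python
def parse_gtid_set(text):
    """Parse a GTID set such as 'uuid:1-4,uuid2:7:9-10' into {uuid: set of ints}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A bare number such as '5' is an interval of length one.
            numbers.update(range(int(start), int(end or start) + 1))
    return result


def missing_gtids(retrieved, executed):
    """Return the GTIDs present in `retrieved` but absent from `executed`."""
    got, done = parse_gtid_set(retrieved), parse_gtid_set(executed)
    missing = []
    for uuid in sorted(got):
        for number in sorted(got[uuid] - done.get(uuid, set())):
            missing.append(f"{uuid}:{number}")
    return missing


# Values copied from the SHOW SLAVE STATUS output above.
retrieved = "eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-4"
executed = "eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-3"
print(missing_gtids(retrieved, executed))
```

With the values from this incident, the only GTID received but not executed is eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4, which matches the transaction named in Last_SQL_Error.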
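1) If the error message names the GTID directly, skip it with the steps below. Committing an empty transaction under that GTID records it as executed on the replica, so the applier no longer tries to apply it: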
mysql> stop slave;
mysql> set gtid_next='eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4';
mysql> begin;commit;
mysql> set gtid_next='AUTOMATIC';
mysql> start slave;
mysql> show slave status\G
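As an illustration of the steps above, the following Python sketch extracts the failed GTID from the Last_SQL_Error text and prints the statements to run on the replica. The function names `failed_gtid` and `skip_statements` are hypothetical. The script only builds the statement text and does not connect to MySQL, so the output should still be reviewed before it is executed by hand.

```python
import re

# A single GTID: a server UUID followed by ':' and a transaction number.
GTID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:\d+")


def failed_gtid(last_sql_error):
    """Return the first GTID found in the error text, or None if there is none."""
    match = GTID_RE.search(last_sql_error)
    return match.group(0) if match else None


def skip_statements(gtid):
    """Build the statements that commit an empty transaction under `gtid`."""
    if not GTID_RE.fullmatch(gtid):
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        "STOP SLAVE;",
        f"SET gtid_next='{gtid}';",
        "BEGIN; COMMIT;",
        "SET gtid_next='AUTOMATIC';",
        "START SLAVE;",
    ]


# Error text copied from the Last_SQL_Error field above.
error = (
    "Coordinator stopped because there were error(s) in the worker(s). "
    "The most recent failure being: Worker 1 failed executing transaction "
    "'eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4' at master log mysql-bin.000001, "
    "end_log_pos 1052."
)
gtid = failed_gtid(error)
print("\n".join(skip_statements(gtid)))
```

Validating the GTID with `fullmatch` before interpolating it keeps malformed or multi-GTID input out of the generated `SET gtid_next` statement.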
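2) If the error message does not name the GTID directly, locate the GTID to skip by analyzing the master's binlog, as follows.
First, query the per-worker applier status for the detailed error: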
mysql> select * from performance_schema.replication_applier_status_by_worker;
+--------------+-----------+-----------+---------------+----------------------------------------+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+
| CHANNEL_NAME | WORKER_ID | THREAD_ID | SERVICE_STATE | LAST_SEEN_TRANSACTION | LAST_ERROR_NUMBER | LAST_ERROR_MESSAGE | LAST_ERROR_TIMESTAMP |
+--------------+-----------+-----------+---------------+----------------------------------------+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+
| | 1 | NULL | OFF | eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4 | 1062 | Worker 1 failed executing transaction 'eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4' at master log mysql-bin.000001, end_log_pos 1052; Could not execute Write_rows event on table zhili7.testerr; Duplicate entry '2' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000001, end_log_pos 1052 | 2021-05-27 22:24:43 |
| | 2 | NULL | OFF | | 0
+--------------+-----------+-----------+---------------+----------------------------------------+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------+
The error in this case is: Could not execute Write_rows event on table zhili7.testerr; Duplicate entry '2' for key 'PRIMARY', Error_code: 1062.
The replica's Exec_Master_Log_Pos is 805 and the end_log_pos in Last_Error is 1052, so the transaction that blocks replication lies between these two offsets in mysql-bin.000001.
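The offset window described above can be derived mechanically. The Python sketch below parses the binlog file name and end_log_pos out of the error text and assembles a mysqlbinlog argument list for that window. The helper names `binlog_window` and `mysqlbinlog_command` are hypothetical, and the regular expression assumes the error text has the "at master log FILE, end_log_pos N" form shown in this document.

```python
import re


def binlog_window(exec_master_log_pos, last_error):
    """Return (binlog_file, start, stop) for the transaction that failed."""
    match = re.search(r"at master log ([^\s,]+), end_log_pos (\d+)", last_error)
    if match is None:
        raise ValueError("no binlog file / end_log_pos found in the error text")
    return match.group(1), int(exec_master_log_pos), int(match.group(2))


def mysqlbinlog_command(binlog_file, start, stop):
    """Build the argument list for decoding the events between start and stop."""
    return [
        "mysqlbinlog", "--no-defaults",
        f"--start-position={start}", f"--stop-position={stop}",
        "-vv", "--base64-output=decode-rows", binlog_file,
    ]


# Values copied from the SHOW SLAVE STATUS output above.
last_error = (
    "Worker 1 failed executing transaction "
    "'eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4' at master log mysql-bin.000001, "
    "end_log_pos 1052."
)
binlog_file, start, stop = binlog_window(805, last_error)
print(" ".join(mysqlbinlog_command(binlog_file, start, stop)))
```

The command is meant to be run on the master, in the directory that holds the binlog files. Returning a list rather than a single string keeps the arguments usable with `subprocess.run` without any shell quoting.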
(2) Parse the master's binlog to confirm the transaction ID
# mysqlbinlog --no-defaults --start-position=805 --stop-position=1052 -vv --base64-output=decode-rows mysql-bin.000001
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 805
#210527 22:24:43 server id 16773 end_log_pos 870 CRC32 0x54ffe61c GTID last_committed=3 sequence_number=4 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= 'eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4'/*!*/;
# at 870
#210527 22:24:43 server id 16773 end_log_pos 949 CRC32 0x5521a9c1 Query thread_id=3 exec_time=0 error_code=0
SET TIMESTAMP=1622125483/*!*/;
SET @@session.pseudo_thread_id=3/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=2, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=45/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 949
#210527 22:24:43 server id 16773 end_log_pos 1004 CRC32 0xc1b251b0 Table_map: `zhili7`.`testerr` mapped to number 108
# at 1004
#210527 22:24:43 server id 16773 end_log_pos 1052 CRC32 0x63f07e60 Write_rows: table id 108 flags: STMT_END_F
### INSERT INTO `zhili7`.`testerr`
### SET
### @1=2 /* INT meta=0 nullable=0 is_null=0 */
### @2='The Err' /* STRING(40) meta=65064 nullable=1 is_null=0 */
ROLLBACK /* added by mysqlbinlog */ /*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
Between offsets 805 and 1052 there is a single transaction, eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4. It must be skipped before replication can resume.
(3) Skip the replication error
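Reading the GTID out of the mysqlbinlog output can also be scripted. The sketch below scans decoded binlog text for `SET @@SESSION.GTID_NEXT` statements and ignores the trailing `AUTOMATIC` line that mysqlbinlog appends. The function name `gtids_in_dump` is hypothetical, and the sample text is an excerpt of the output shown above.

```python
import re

GTID_NEXT_RE = re.compile(
    r"SET @@SESSION\.GTID_NEXT\s*=\s*'([^']+)'", re.IGNORECASE
)


def gtids_in_dump(dump_text):
    """Return the distinct GTIDs set in a mysqlbinlog dump, in order of appearance."""
    seen = []
    for value in GTID_NEXT_RE.findall(dump_text):
        # 'AUTOMATIC' is added by mysqlbinlog itself and is not a transaction.
        if value.upper() != "AUTOMATIC" and value not in seen:
            seen.append(value)
    return seen


sample = """\
# at 805
#210527 22:24:43 server id 16773 end_log_pos 870 CRC32 0x54ffe61c GTID last_committed=3 sequence_number=4 rbr_only=yes
SET @@SESSION.GTID_NEXT= 'eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4'/*!*/;
# at 870
ROLLBACK /* added by mysqlbinlog */ /*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
"""
print(gtids_in_dump(sample))
```

If the list contains more than one GTID, the offset window covers several transactions and each one should be inspected before deciding what to skip.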
mysql> stop slave;
Query OK, 0 rows affected (0.01 sec)
mysql> set gtid_next='eef85c8f-bef5-11eb-a336-00e0ed90dfcb:4';
Query OK, 0 rows affected (0.00 sec)
mysql> begin;commit;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
mysql> set gtid_next='AUTOMATIC';
Query OK, 0 rows affected (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Checking master version
Master_Host: 172.21.167.73
Master_User: Repl
Master_Port: 20201
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 1083
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 971
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 805
Relay_Log_Space: 1456
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 979
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 16773
Master_UUID: eef85c8f-bef5-11eb-a336-00e0ed90dfcb
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-4
Executed_Gtid_Set: eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-4
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
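The check performed by eye on the output above can be expressed as a small Python sketch. It parses the text of `show slave status\G` into a dictionary and reports whether both replication threads are running without errors. The function names `parse_status` and `replication_healthy` are hypothetical, and the parser assumes every field fits on one line, which does not hold for GTID sets that wrap across several lines.

```python
def parse_status(text):
    """Turn the 'Field: value' lines of SHOW SLAVE STATUS\\G into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (row separators, prompts) are skipped
            fields[key.strip()] = value.strip()
    return fields


def replication_healthy(fields):
    """True when both threads run and no I/O or SQL error is recorded."""
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and fields.get("Last_IO_Errno") == "0"
        and fields.get("Last_SQL_Errno") == "0"
    )


# Excerpt of the status output captured after the skip.
sample = """\
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                Last_IO_Errno: 0
               Last_SQL_Errno: 0
           Retrieved_Gtid_Set: eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-4
            Executed_Gtid_Set: eef85c8f-bef5-11eb-a336-00e0ed90dfcb:1-4
"""
status = parse_status(sample)
print(replication_healthy(status), status["Executed_Gtid_Set"])
```

In the output above, Executed_Gtid_Set has advanced to 1-4, which shows that the empty transaction was recorded under the skipped GTID.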