Parameters for skipping errors in MySQL master-slave replication: slave_skip_errors, sql_slave_skip_counter, slave_exec_mode
Skipping replication errors: slave_skip_errors, slave_exec_mode
Skipping replication errors: sql_slave_skip_counter
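1 Introduction
During MySQL master-slave replication, the slave may, for a variety of reasons, fail while executing a SQL statement from the binlog. By default the slave stops the replication SQL thread and applies no further changes until the user resolves the problem.
slave-skip-errors defines the error numbers that the slave may skip automatically during replication. When one of the listed errors occurs, the slave skips the failing statement and carries on with the statements that follow.
slave_skip_errors accepts four kinds of value: off, all, a comma-separated list of error codes, and ddl_exist_errors.
The default is off. You can list specific error codes or choose all. MySQL 5.6, MySQL Cluster NDB 7.3 and later versions added the shorthand value ddl_exist_errors, which stands for a fixed set of error codes (1007,1008,1050,1051,1054,1060,1061,1068,1094,1146).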
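Some of these error codes mean the following:
1007: database already exists; CREATE DATABASE fails
1008: database does not exist; DROP DATABASE fails
1050: table already exists; CREATE TABLE fails
1051: table does not exist; DROP TABLE fails
1054: unknown column
1060: duplicate column name
1061: duplicate key name
1068: multiple primary keys defined
1094: unknown thread ID
1146: table does not exist
1053: server shutdown in progress (for example, the master went down during replication)
1062: primary key conflict, Duplicate entry '%s' for key %d
Syntax in my.cnf: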
slave_skip_errors=1062,1053
slave_skip_errors=all
slave_skip_errors=ddl_exist_errors
Syntax as a mysqld startup option:
--slave-skip-errors=1062,1053
--slave-skip-errors=all
--slave-skip-errors=ddl_exist_errors
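To make the accepted values concrete, here is a minimal Python sketch of how a value such as 1062,1053, all or ddl_exist_errors could be turned into a decision about one error number. The function names parse_slave_skip_errors and is_skipped are made up for illustration; this is not MySQL's own parsing code, only a model of the rules described above.

```python
# Error codes that ddl_exist_errors stands for, as listed in the text above.
DDL_EXIST_ERRORS = frozenset(
    {1007, 1008, 1050, 1051, 1054, 1060, 1061, 1068, 1094, 1146}
)


def parse_slave_skip_errors(value):
    """Return None for 'all', otherwise a frozenset of error codes to skip."""
    text = value.strip().lower()
    if text in ("", "off"):
        return frozenset()          # nothing is skipped
    if text == "all":
        return None                 # None stands for "every error is skipped"
    if text == "ddl_exist_errors":
        return DDL_EXIST_ERRORS
    # Otherwise expect a comma-separated list of numeric error codes.
    return frozenset(int(code) for code in text.split(",") if code.strip())


def is_skipped(value, errno):
    """Would an error with this number be skipped under the given setting?"""
    codes = parse_slave_skip_errors(value)
    return codes is None or errno in codes


print(is_skipped("1062,1053", 1062))
print(is_skipped("ddl_exist_errors", 1062))
```

The second call is the interesting one: ddl_exist_errors does not include 1062, so a duplicate-key error would still stop replication under that setting.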
Checking the value of the parameter from within the database:
mysql> show variables like 'slave_skip%';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| slave_skip_errors | 1007 |
+-------------------+-------+
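3 Worked example
3.1 Test setup
Set up MySQL master-slave replication, then write data directly on the slave so that master and slave diverge.
3.2 Prepare the test table
Create the table on the master: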
create table replication (c1 int not null primary key, c2 varchar(10));
3.3 Prepare the test data
Insert the base rows on the master:
mysql> insert into replication values (1, 'test1');
mysql> insert into replication values (2, 'test2');
At this point the replication table holds two rows on both the master and the slave.
3.4 Run the test
Insert a row on the slave:
mysql> insert into replication values (3, 'test3');
Then run the same statement on the master:
mysql> insert into replication values (3, 'test3');
Check the replication status on the slave:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.222
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 16700
Relay_Log_File: mysql-relay-bin.000003
Relay_Log_Pos: 16595
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table: mysql.ibbackup_binlog_marker
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: mysql.backup_%
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '3' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'insert into replication values (3, 'test3')'
Skip_Counter: 0
Exec_Master_Log_Pos: 16425
Relay_Log_Space: 17544
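As shown, the SQL thread has stopped: Slave_SQL_Running: No
The error number is: Last_Errno: 1062
The error message is: Last_Error: Error 'Duplicate entry '3' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'insert into replication values (3, 'test3')'
Adding the following option to my.cnf and restarting the slave makes it skip this error, so replication continues.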
[mysqld]
slave_skip_errors=1062
The test procedure is the same as above; readers can verify it themselves.
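When checking many slaves, reading the status output by eye gets tedious. The following is a small Python sketch that parses the vertical (\G) output of show slave status into a dictionary and reports the error number when the SQL thread has stopped. The helper names parse_slave_status and sql_thread_error are hypothetical, and the sample text is a shortened copy of the output shown above.

```python
def parse_slave_status(text):
    """Parse the vertical (\\G) output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the "*** 1. row ***" banner and lines without a key.
        if not line or line.startswith("*") or ":" not in line:
            continue
        # Split on the first colon only; error messages contain colons too.
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def sql_thread_error(status):
    """Return the error number if the SQL thread stopped on an error, else None."""
    if status.get("Slave_SQL_Running") == "No" and status.get("Last_Errno", "0") != "0":
        return int(status["Last_Errno"])
    return None


sample = """\
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '3' for key 'PRIMARY'' on query. Default database: 'test'.
Skip_Counter: 0
"""

print(sql_thread_error(parse_slave_status(sample)))
```

Splitting on the first colon only is deliberate, since the Last_Error text itself contains colons.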
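4 Notes on replication errors when restoring a slave from a backup
MySQL Enterprise Backup (meb) provides online hot backup. If DDL statements run while the backup is being taken, a slave restored from that master backup may hit errors that stop replication. The reason is that the slave is first restored from the backup files, which already contain the DDL executed during the backup, but replication then starts not from the final position of the full backup but from the position just before the DDL. If re-executing that DDL on the slave causes no conflict, replication continues; if it does cause a conflict, replication stops. The way to resolve this conflict is to add one line to my.cnf: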
[mysqld]
slave_skip_errors=ddl_exist_errors
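5 Caveats
5.1 This is a global, static parameter and cannot be changed at runtime. Add it to my.cnf and restart the MySQL server for it to take effect.
5.2 Be aware that enabling this parameter carelessly can easily leave the master and slave databases out of sync. Decide based on your actual workload: if the requirements on data integrity are not very strict, this option can indeed reduce the maintenance burden.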
****************************************************************************************
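Introduction to sql_slave_skip_counter:
Excerpted from the official MySQL documentation (reading the English original is strongly recommended; the notes between the quoted passages are the author's own interpretation).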
SET GLOBAL sql_slave_skip_counter Syntax:
SET GLOBAL sql_slave_skip_counter = N
This statement skips the next N events from the master. This is useful for recovering from replication stops caused by a statement.
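This skips N events. Note that the unit is the event, not the transaction; the two are equivalent only when a transaction consists of a single statement.
For example, a transaction made up of several events, such as BEGIN; INSERT; UPDATE; DELETE; COMMIT;, is by no means a single event.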
This statement is valid only when the slave threads are not running. Otherwise, it produces an error.
When using this statement, it is important to understand that the binary log is actually organized as a sequence of groups known as event groups. Each event group consists of a sequence of events.
For transactional tables, an event group corresponds to a transaction.
For nontransactional tables, an event group corresponds to a single SQL statement.
When you use SET GLOBAL sql_slave_skip_counter to skip events and the result is in the middle of a group, the slave continues to skip events until it reaches the end of the group. Execution then starts with the next event group.
In other words, if N lands inside an event group, the slave keeps skipping until that whole group has been skipped and resumes at the next event group.
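The rule quoted above can be modelled in a few lines of Python. This is only a simplified sketch of the documented behaviour, not the server's implementation: the function name resume_index and the list-of-lists representation of event groups are assumptions made for illustration.

```python
def resume_index(groups, start, n):
    """Index of the first event executed after skipping n events from `start`.

    `groups` is a list of event groups, each a list of event names.
    Events are indexed globally, in binlog order, starting at 0.
    """
    if n <= 0:
        return start  # nothing to skip
    # Flatten to (group_number, event) pairs so positions are easy to count.
    flat = [(g, ev) for g, group in enumerate(groups) for ev in group]
    pos = min(start + n, len(flat))
    # If the counter ran out in the middle of a group, keep skipping to its end.
    while 0 < pos < len(flat) and flat[pos][0] == flat[pos - 1][0]:
        pos += 1
    return pos


groups = [
    ["BEGIN", "Table_map", "Delete_rows", "Table_map", "Write_rows", "Xid"],
    ["BEGIN", "Table_map", "Write_rows", "Xid"],
]

# Skipping a single event still consumes the whole first group (6 events).
print(resume_index(groups, 0, 1))
```

With N=1 the counter runs out right after BEGIN, which is inside the first group, so skipping continues to the end of that group and execution resumes at index 6, the BEGIN of the second group.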
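Using sql_slave_skip_counter with transactional tables:
1. Skipping replication error 1032 (update/delete error)
Skipping a transaction made up of a single SQL statement:
Manually delete two rows on the slave: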
DELETE FROM `edusoho_e`.`t1` WHERE `id` = '9';
DELETE FROM `edusoho_e`.`t1` WHERE `id` = '11';
When the master later changes those two rows, the slave reports errors and replication breaks:
INSERT INTO `edusoho_e`.`t1` (`xname`, `address`, `hobby`) VALUES ('孙权', '吴国', '妹妹');
UPDATE `edusoho_e`.`t1` SET xname='游戏' WHERE id=7;
UPDATE `edusoho_e`.`t1` SET age=40 WHERE id=11; # fails on the slave
DELETE FROM `edusoho_e`.`t1` WHERE age=40; # fails on the slave
INSERT INTO `edusoho_e`.`t1` (`xname`, `address`, `hobby`) VALUES ('曹丕', '魏国', '甄姬');
DELETE FROM `edusoho_e`.`t1` WHERE id=1;
UPDATE `edusoho_e`.`t1` SET hobby='Games' WHERE id=3;
Checking the replication status on the slave shows the error:
mysql> show slave status\G;
*************************** 1. row ***************************
Read_Master_Log_Pos: 2176
Exec_Master_Log_Pos: 874
Last_Errno: 1032
Last_Error: Could not execute Update_rows event on table edusoho_e.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000002, end_log_pos 1127
Slave_IO_Running: Yes
Slave_SQL_Running: No
On the master, check which operations sit at that position:
mysql> show binlog events in 'mysql-bin.000002' from 874;
+------------------+------+-------------+-----------+-------------+---------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+------+-------------+-----------+-------------+---------------------------------+
| mysql-bin.000002 | 874 | Query | 2 | 956 | BEGIN |
| mysql-bin.000002 | 956 | Table_map | 2 | 1017 | table_id: 213 (edusoho_e.t1) |
| mysql-bin.000002 | 1017 | Update_rows | 2 | 1127 | table_id: 213 flags: STMT_END_F |
| mysql-bin.000002 | 1127 | Xid | 2 | 1158 | COMMIT /* xid=437 */ |
| mysql-bin.000002 | 1158 | Query | 2 | 1240 | BEGIN |
| mysql-bin.000002 | 1240 | Table_map | 2 | 1301 | table_id: 213 (edusoho_e.t1) |
| mysql-bin.000002 | 1301 | Delete_rows | 2 | 1407 | table_id: 213 flags: STMT_END_F |
| mysql-bin.000002 | 1407 | Xid | 2 | 1438 | COMMIT /* xid=446 */ |
| mysql-bin.000002 | 1438 | Query | 2 | 1520 | BEGIN |
| mysql-bin.000002 | 1520 | Table_map | 2 | 1581 | table_id: 213 (edusoho_e.t1) |
| mysql-bin.000002 | 1581 | Write_rows | 2 | 1644 | table_id: 213 flags: STMT_END_F |
| mysql-bin.000002 | 1644 | Xid | 2 | 1675 | COMMIT /* xid=455 */ |
| mysql-bin.000002 | 1675 | Query | 2 | 1757 | BEGIN |
| mysql-bin.000002 | 1757 | Table_map | 2 | 1818 | table_id: 213 (edusoho_e.t1) |
| mysql-bin.000002 | 1818 | Delete_rows | 2 | 1880 | table_id: 213 flags: STMT_END_F |
| mysql-bin.000002 | 1880 | Xid | 2 | 1911 | COMMIT /* xid=464 */ |
| mysql-bin.000002 | 1911 | Query | 2 | 1993 | BEGIN |
| mysql-bin.000002 | 1993 | Table_map | 2 | 2054 | table_id: 213 (edusoho_e.t1) |
| mysql-bin.000002 | 2054 | Update_rows | 2 | 2145 | table_id: 213 flags: STMT_END_F |
| mysql-bin.000002 | 2145 | Xid | 2 | 2176 | COMMIT /* xid=473 */ |
+------------------+------+-------------+-----------+-------------+---------------------------------+
On the slave, skip the first failing Update_rows event:
mysql> set global sql_slave_skip_counter=1;
mysql> start slave sql_thread;
mysql> show slave status\G;
Slave_IO_Running: Yes
Slave_SQL_Running: No
Exec_Master_Log_Pos: 1158
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table edusoho_e.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000002, end_log_pos 1407
The first event group was skipped successfully.
On the slave, go on to skip the second failing event, Delete_rows:
mysql> set global sql_slave_skip_counter=1;
mysql> start slave sql_thread;
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_SQL_Errno: 0
Last_SQL_Error:
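The second event group was skipped successfully.
Note:
Although the replication failure has been skipped, the healthy replication status is only temporary. The data missing on the slave must be filled in as soon as possible; otherwise any further change the master makes to rows that do not exist on the slave will break replication again. In the author's view, if the data difference is not too large, you can consider repairing it with the pt-table-checksum and pt-table-sync tools. If the data volume is large and the differences are many, rebuilding the slave is the better choice, because the tools lock tables and will affect production traffic to some degree. Weigh the options for your own situation.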
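To illustrate what "filling in the missing data" involves, here is a deliberately naive Python sketch that compares two snapshots of the same table keyed by primary key. It is not how pt-table-checksum works internally and is only practical for small tables; the function name diff_rows and the sample rows are invented for the example.

```python
def diff_rows(master_rows, slave_rows):
    """Compare two {primary_key: row} snapshots of the same table."""
    missing = sorted(k for k in master_rows if k not in slave_rows)
    extra = sorted(k for k in slave_rows if k not in master_rows)
    different = sorted(
        k for k in master_rows
        if k in slave_rows and master_rows[k] != slave_rows[k]
    )
    return {
        "missing_on_slave": missing,   # rows to copy from the master
        "extra_on_slave": extra,       # rows that exist only on the slave
        "different": different,        # same key, different column values
    }


# Hypothetical snapshots: the slave lost ids 9 and 11 and has a stale id 7.
master = {7: ("game", 30), 9: ("a", 20), 11: ("b", 40)}
slave = {7: ("old", 30), 12: ("z", 1)}

print(diff_rows(master, slave))
```

Each of the three lists calls for a different repair: missing rows are copied over, extra rows are reviewed and usually deleted, and differing rows are overwritten with the master's version.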
Skipping a transaction made up of multiple SQL statements (events):
Manually delete one row on the slave:
DELETE FROM `edusoho_e`.`t1` WHERE `id` = '7';
On the master, run a transaction made up of multiple SQL statements:
BEGIN;
DELETE FROM `edusoho_e`.`t1` WHERE `id` = '7';
INSERT INTO `edusoho_e`.`t1` (`xname`, `address`, `hobby`) VALUES ('懒死', '不知道', '吃了睡睡了吃');
COMMIT;
Because the slave has already deleted the row with id=7, checking the replication status on the slave shows the error:
mysql> show slave status\G;
*************************** 1. row ***************************
Read_Master_Log_Pos: 7219
Exec_Master_Log_Pos: 6840
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table edusoho_e.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000002, end_log_pos 7049
On the master, check which operations sit at that position:
mysql> show binlog events in 'mysql-bin.000002' from 6840;
+------------------+------+-------------+-----------+-------------+---------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+------+-------------+-----------+-------------+---------------------------------+
| mysql-bin.000002 | 6840 | Query | 2 | 6922 | BEGIN |
| mysql-bin.000002 | 6922 | Table_map | 2 | 6983 | table_id: 213 (edusoho_e.t1) |
| mysql-bin.000002 | 6983 | Delete_rows | 2 | 7049 | table_id: 213 flags: STMT_END_F |
| mysql-bin.000002 | 7049 | Table_map | 2 | 7110 | table_id: 213 (edusoho_e.t1) |
| mysql-bin.000002 | 7110 | Write_rows | 2 | 7188 | table_id: 213 flags: STMT_END_F |
| mysql-bin.000002 | 7188 | Xid | 2 | 7219 | COMMIT /* xid=825 */ |
+------------------+------+-------------+-----------+-------------+---------------------------------+
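As shown, this transaction consists of two SQL statements (row events).
What happens if sql_slave_skip_counter=N is used to skip a transaction made up of multiple SQL statements?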
mysql> set global sql_slave_skip_counter=1;
mysql> start slave sql_thread;
mysql> show slave status\G;
*************************** 1. row ***************************
Read_Master_Log_Pos: 7219
Exec_Master_Log_Pos: 7219
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_Errno: 0
Last_Error:
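Notice the problem? When sql_slave_skip_counter is used on a transaction made up of multiple SQL statements (events), the show binlog events output from the master tells us that if only the failing statement had been skipped, Exec_Master_Log_Pos should be 7110. Instead it is 7219, which means the whole event group was skipped, even though the Write_rows data at 7110 is data we need. So, just as with a single-statement transaction, replication is running again but the data is still inconsistent, and you need to fill in the missing data or rebuild the slave promptly.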
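The jump to 7219 can be reproduced from the binlog listing itself. The Python sketch below takes the (Pos, Event_type, End_log_pos) columns of the show binlog events output above and collects everything from Exec_Master_Log_Pos up to the Xid event that closes the group. It assumes the group ends with an Xid event, which holds for this InnoDB transaction but is not a general rule; skipped_group is a made-up helper name.

```python
# (pos, event_type, end_log_pos) rows copied from the SHOW BINLOG EVENTS output above.
EVENTS = [
    (6840, "Query", 6922),
    (6922, "Table_map", 6983),
    (6983, "Delete_rows", 7049),
    (7049, "Table_map", 7110),
    (7110, "Write_rows", 7188),
    (7188, "Xid", 7219),
]


def skipped_group(events, exec_pos):
    """Return the events from exec_pos through the Xid that closes the group."""
    group = []
    for pos, event_type, end_log_pos in events:
        if pos < exec_pos:
            continue
        group.append((pos, event_type, end_log_pos))
        if event_type == "Xid":
            break  # assumed end of the transaction's event group
    return group


group = skipped_group(EVENTS, 6840)
resume_pos = group[-1][2]  # End_log_pos of the closing Xid event
lost = [event_type for _, event_type, _ in group if event_type.endswith("_rows")]
print(resume_pos, lost)
```

The list of lost row events makes the cost explicit: the failing Delete_rows is skipped, but so is the Write_rows that the slave actually needed.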
2. For a transaction made up of multiple SQL statements (events), skipping only one event rather than the whole event group:
Manually delete one row on the slave:
DELETE FROM `edusoho_e`.`t1` WHERE `id` = '17';
On the master, run a transaction made up of multiple SQL statements:
BEGIN;
DELETE FROM `edusoho_e`.`t1` WHERE `id` = '17';
INSERT INTO `edusoho_e`.`t1` (`xname`, `address`, `hobby`) VALUES ('我是谁', '不知道', '吃了睡睡了吃');
COMMIT;
Because the slave has already deleted the row with id=17, checking the replication status on the slave shows the error:
Exec_Master_Log_Pos: 120
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table edusoho_e.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000004, end_log_pos 341
On the master, check which operations sit at that position:
mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=120 mysql-bin.000004
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 120
#190507 13:52:05 server id 2 end_log_pos 202 CRC32 0x0ca0c280 Query thread_id=3 exec_time=0 error_code=0
SET TIMESTAMP=1557208325/*!*/;
SET @@session.pseudo_thread_id=3/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=2, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 202
#190507 13:52:05 server id 2 end_log_pos 263 CRC32 0x20d2e89d Table_map: `edusoho_e`.`t1` mapped to number 216
# at 263
#190507 13:52:05 server id 2 end_log_pos 341 CRC32 0xbec6fd45 Delete_rows: table id 216 flags: STMT_END_F
### DELETE FROM `edusoho_e`.`t1`
### WHERE
### @1=17
### @2='懒死'
### @3='不知道'
### @4=1
### @5='吃了睡睡了吃'
### @6=18
# at 341
#190507 13:52:05 server id 2 end_log_pos 402 CRC32 0xa37bc5c9 Table_map: `edusoho_e`.`t1` mapped to number 216
# at 402
#190507 13:52:05 server id 2 end_log_pos 483 CRC32 0x0d774707 Write_rows: table id 216 flags: STMT_END_F
### INSERT INTO `edusoho_e`.`t1`
### SET
### @1=21
### @2='我是谁'
### @3='不知道'
### @4=1
### @5='吃了睡睡了吃'
### @6=18
# at 483
#190507 13:52:05 server id 2 end_log_pos 514 CRC32 0x8c333b30 Xid = 411
COMMIT /*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
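As shown, the Delete_rows event (at 263) is the part we need to skip, while the second event (Write_rows at 402) must be kept.
This is where the slave_exec_mode variable comes in. For a detailed description of slave_exec_mode, please refer to the official MySQL documentation.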
mysql> set global slave_exec_mode='IDEMPOTENT';
mysql> start slave sql_thread;
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Exec_Master_Log_Pos: 514
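Checking the edusoho_e.t1 table on the slave shows that the row with id=21 has been replicated, so the data is now consistent.
3. Skipping primary key conflict error 1062 (Duplicate entry):
On the slave, first insert a row with an explicit id: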
INSERT INTO `edusoho_e`.`t1` (`id`,`xname`, `address`, `hobby`, `age`) VALUES (19,'小玩子', '明朝', '皇后', '25');
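Because the slave already holds the primary key value id=19 that the master is about to generate, the slave reports an error once the master runs: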
INSERT INTO `edusoho_e`.`t1` (`id`,`xname`, `address`, `hobby`, `age`) VALUES (19,'朱棣', '明朝', '皇帝', '36');
Checking the replication status on the slave shows that replication has failed:
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1062
Last_Error: Could not execute Write_rows event on table edusoho_e.t1; Duplicate entry '19' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000002, end_log_pos 7425
Exec_Master_Log_Pos: 7219
Check the master binlog:
mysql> show binlog events in 'mysql-bin.000002' from 7219;
+------------------+------+------------+-----------+-------------+---------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+------+------------+-----------+-------------+---------------------------------+
| mysql-bin.000002 | 7219 | Query | 2 | 7301 | BEGIN |
| mysql-bin.000002 | 7301 | Table_map | 2 | 7362 | table_id: 213 (edusoho_e.t1) |
| mysql-bin.000002 | 7362 | Write_rows | 2 | 7425 | table_id: 213 flags: STMT_END_F |
| mysql-bin.000002 | 7425 | Xid | 2 | 7456 | COMMIT /* xid=893 */ |
+------------------+------+------------+-----------+-------------+---------------------------------+
Question:
Since this row already exists on the slave, if we delete it on the slave, will replication catch up by itself? (Answer: no. The slave threads must be restarted):
DELETE FROM `edusoho_e`.`t1` WHERE `id` = '19';
mysql> stop slave;
mysql> start slave user='repliter' password='123456';
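On verification, the row has been replicated to the slave.
Aside:
The above covers the author's tests with single-statement and multi-statement transactions, the deliberately induced 1032 and 1062 replication errors and how to resolve them, and the usage and skipping scope of sql_slave_skip_counter and slave_exec_mode. These were only pre-deployment tests and have not been proven in production. They are offered partly as a reference for fellow practitioners and partly as a record of the author's own notes and troubleshooting approach, so that when a real problem occurs there is a direction to follow instead of panic. Please point out any mistakes or misunderstandings in the comments below; corrections are much appreciated.
Also, these tests of sql_slave_skip_counter and slave_exec_mode cover transactional tables only. For nontransactional tables the two behave somewhat differently; please look that up yourself.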
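slave_exec_mode=IDEMPOTENT is a very useful parameter in a MySQL replication setup. Once you run set global slave_exec_mode=IDEMPOTENT on the slave, its SQL thread runs in idempotent mode: replication is not interrupted on primary key or unique key conflicts for inserts, or on row-not-found errors for updates and deletes (it takes effect immediately, without even restarting the slave SQL thread). By contrast, blunt ways of skipping errors such as sql_slave_skip_counter=N and slave-skip-errors=N can break master-slave consistency. The official documentation describes this only briefly, and I had long wondered how slave_exec_mode=IDEMPOTENT preserves consistency when replication hits an error, for example whether a primary key conflict is simply skipped or the row is overwritten. Today I ran an experiment on Percona 5.7 (row-format binlog). I omit the procedure and go straight to the conclusions:
1. insert
In this case an insert into statement behaves on the slave like replace into, although it is not actually rewritten as replace into. There are two cases:
a. MySQL runs with autocommit and a single insert into ... is issued
For example, an insert like this: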
insert into test set c1='a',c2='b';
If this insert into hits a primary key conflict when executed on the slave, it is turned into a delete followed by an insert:
delete from test where c1='old_value' and c2='old_value';
insert into test set c1='a',c2='b';
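If it hits a conflict on a non-primary unique key, it is turned into an update:
update test set c1='a',c2='b' where c1='old_value' and c2='old_value';
b. With an explicitly started transaction (begin...insert into...commit;)
For example, SQL like this: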
begin;
......
insert into test set c1='a',c2='b';
......
commit;
Here, if an insert into inside begin...commit hits either a primary key conflict or a unique key conflict on the slave, it is turned into a delete followed by an insert:
begin;
......
delete from test where c1='old_value' and c2='old_value';
insert into test set c1='a',c2='b';
......
commit;
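2. update
When the row to update does not exist on the slave, the update is skipped and not executed.
3. delete
As with update, the slave skips the delete and does nothing.
Note: tables need a primary key when idempotent mode is used
Idempotent mode is not a cure-all. It does not make DDL idempotent, nor does it cover errors caused by differing column lengths (for example char(20) on the master and char(10) on the slave). A further limitation is that insert is idempotent only on tables with a primary key: the idempotent behaviour relies on the primary key to detect an existing row on the slave and overwrite it. Without a primary key, the slave can end up with more duplicate rows than the master even in idempotent mode.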
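The behaviour summarised above can be captured in a toy model. The Python sketch below applies row events to a dictionary keyed by primary key, in either STRICT or IDEMPOTENT mode. It is a simplification of the observations in this section, not the server's logic: ReplicationError, apply_event and the event tuple layout are all invented for illustration, and unique-key conflicts are not modelled.

```python
class ReplicationError(Exception):
    """Raised where a real slave would stop the SQL thread."""

    def __init__(self, errno, message):
        super().__init__(message)
        self.errno = errno


def apply_event(table, event, mode="STRICT"):
    """Apply one (kind, key, row) event to `table`, a dict of primary key -> row."""
    kind, key, row = event
    idempotent = mode.upper() == "IDEMPOTENT"
    if kind == "insert":
        if key in table and not idempotent:
            raise ReplicationError(1062, f"Duplicate entry '{key}' for key 'PRIMARY'")
        table[key] = row  # in IDEMPOTENT mode this overwrites the existing row
    elif kind in ("update", "delete"):
        if key not in table:
            if idempotent:
                return  # row not found: the event is skipped
            raise ReplicationError(1032, "Can't find record")
        if kind == "update":
            table[key] = row
        else:
            del table[key]
    else:
        raise ValueError(f"unknown event kind: {kind}")


# The slave has already lost id=17; the master deletes 17 and inserts 21.
slave = {1: "row1"}
for ev in [("delete", 17, None), ("insert", 21, "row21")]:
    apply_event(slave, ev, mode="IDEMPOTENT")
print(sorted(slave))
```

In IDEMPOTENT mode the missing delete is ignored and the insert of id=21 still lands, which matches the earlier experiment; in STRICT mode the first event would raise error 1032 and nothing after it would be applied.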
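Setting slave_exec_mode to IDEMPOTENT makes the slave skip errors 1032 (record not found) and 1062 (duplicate key) and record them in the error log.
slave_exec_mode and slave_skip_errors serve a similar purpose, but slave_exec_mode can be set dynamically online, whereas slave_skip_errors must be added to the configuration file and takes effect only after a restart.
On the slave: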
mysql> select * from testdb1.student;
+------+------+-------+-------+
| id | name | class | score |
+------+------+-------+-------+
| 1 | a | 1 | 45 |
| 2 | b | 1 | 46 |
| 3 | c | 2 | 89 |
| 4 | d | 2 | 90 |
| 5 | e | 3 | 67 |
| 6 | f | 3 | 87 |
| 7 | g | 4 | 77 |
| 8 | h | 4 | 91 |
+------+------+-------+-------+
8 rows in set (0.00 sec)
mysql> delete from testdb1.student where id >5;
Query OK, 3 rows affected (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.00 sec)
# Before changing the parameter
On the master:
mysql> delete from testdb1.student where id >7;
Query OK, 1 row affected (0.00 sec)
Check the status on the slave:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.56.91
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: ray-bin.000008
Read_Master_Log_Pos: 1272
Relay_Log_File: ray-relay-bin.000003
Relay_Log_Pos: 1226
Relay_Master_Log_File: ray-bin.000008
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table testdb1.student; Can't find record in 'student', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log ray-bin.000008, end_log_pos 1241
Skip_Counter: 0
Exec_Master_Log_Pos: 1065
Relay_Log_Space: 1957
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table testdb1.student; Can't find record in 'student', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log ray-bin.000008, end_log_pos 1241
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: 840f94e0-8ea0-11e5-af92-080027a94012
Master_Info_File: /data/3307/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 151126 12:42:37
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
mysql> stop slave;
Query OK, 0 rows affected (0.78 sec)
mysql> set global sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.33 sec)
# Change the parameter on the slave
mysql> show variables like '%slave_exec_mode%';
+-----------------+--------+
| Variable_name | Value |
+-----------------+--------+
| slave_exec_mode | STRICT |
+-----------------+--------+
1 row in set (0.00 sec)
mysql> set global slave_exec_mode=idempotent;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like '%slave_exec_mode%';
+-----------------+------------+
| Variable_name | Value |
+-----------------+------------+
| slave_exec_mode | IDEMPOTENT |
+-----------------+------------+
1 row in set (0.00 sec)
Delete data on the master:
mysql> delete from testdb1.student where id >6;
Query OK, 1 row affected (0.01 sec)
mysql> commit;
Query OK, 0 rows affected (0.00 sec)
Check the status and the error log on the slave:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.56.91
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: ray-bin.000008
Read_Master_Log_Pos: 1479
Relay_Log_File: ray-relay-bin.000004
Relay_Log_Pos: 488
Relay_Master_Log_File: ray-bin.000008
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1479
Relay_Log_Space: 1972
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: 840f94e0-8ea0-11e5-af92-080027a94012
Master_Info_File: /data/3307/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
[root@ray ~]# tail -20f /data/3307/data/mysql_ray.err
2015-11-26 12:50:29 12127 [Warning] Slave SQL: Could not execute Delete_rows event on table testdb1.student; Can't find record in 'student', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log ray-bin.000008, end_log_pos 1655, Error_code: 1032
Note: the positions may not match exactly because the output was excerpted; this can be ignored.