MySQLReplication常见错误处理方法

最新推荐文章于 2022-06-23 15:58:16 发布

yxy369

最新推荐文章于 2022-06-23 15:58:16 发布

阅读量1k

点赞数

分类专栏： MySQL

本文链接：https://blog.csdn.net/yxy369/article/details/95175519

版权

MySQL 专栏收录该内容

6 篇文章 0 订阅

订阅专栏

一、删除一条记录：master上删除一条记录，slave上找不到对应记录而报错。

原因：主从之间数据不一致。

解决方法：直接跳过

mysql> stop slave;

mysql> set global sql_slave_skip_counter=1;

mysql> start slave;

二、重复主键：在master上插入一条记录，slave报主键重复错误。

原因：主从之间数据不一致。

解决方法：在slave上根据主键查找到相应记录并删除，然后再开启复制。

mysql> stop slave;

mysql> delete from t1 where id=?;

mysql> start slave;

mysql> show slave status\G;

三、找不到要更新的记录：在master上更新记录，但在slave上找不到对应的记录。

原因：主从之间数据不一致。

解决方法：在master上，用MySQLbinlog工具分析出错的binlog日志，在master上查找到更新后的记录，然后将该记录插入到slave中。

四、slave的中继日志损坏：

原因：slave意外宕机。

解决方法：重新定位binlog日志和POS点，然后重新同步。

mysql> stop slave;

mysql> show slave status\G;

以Relay_master_Log_File参数和Exec_master_Log_Pos参数为准。

mysql> change master to ...

mysql> start slave;

mysql> show slave status\G;

五、主库中无主键大表更新操作：

环境：CentOS6.9 、MySQL5.6 、一主两从、row格式binlog。

现象：show slave status命令显示两台从机的Relay_master_Log_File和Exec_master_Log_Pos一直不变并且相同，Slave_IO_Running和Slave_SQL_Running值为Yes，Seconds_Behind_Master值很大并且不断增长，Slave_SQL_Running_State为Reading event from the relay log。

分析：通过Slave_IO_Running和Slave_SQL_Running的状态可以判断两台从机的复制进程并没有错误或终止，查看主从服务器的MySQL error文件也没有发现错误信息，由于两台从机的Relay_master_Log_File和Exec_master_Log_Pos值完全相同，可以初步判断是由于主服务器的某一操作导致从服务器的复制延迟。

用mysqlbinlog命令根据Relay_master_Log_File和Exec_master_Log_Pos查看主服务器上的binlog日志，定位导致延迟的操作。发现是对一个表大量的删除记录操作，在主库上打开该表显示为空表，说明已经全部删除完成，并且发现该表没有主键。在从库上打开同一个表，发现有近500万行记录。

在从库上执行：

select * from information_schema.INNODB_TRX\G;

可以看到trx_started显示该事务已经启动很久了，并且还在运行。

原因：这个问题在MySQL的技术支持文档中有专门的描述：

When the SQL thread applies the change from a row based event, it will have to locate the exact row that was updated. With a primary key, this is trivial as only one row can possible have the same value for the primary key.

However if there is no primary key on the table on the replication slave side, the SQL thread must search the entire table to locate the row to update or delete. The search is repeated for each updated row. This search is both very labor intensive (usually using close to one CPU at 100%) and slow causing the slave to fall behind.

For InnoDB tables, the hidden key used for the clustered index for tables without a primary key cannot be used to avoid searching the entire table for the rows to update or delete. The hidden key is unique only to each MySQL instance, so the replication master and replication slave will not in general have the same values for the hidden key for the same row.

For cascading/chained replication (for example master to slave 1 to slave 2) where master and slave1 both use binlog_format = MIXED, (affects 5.7 and earlier) can cause a statement based event using a temporary table to be converted to a row based on slave 1 causing slave 2 to fall behind if the table does not have a primary key.

解决方法：确保主库中的每一个表都有主键。临时解决方案是在从机执行如下命令：

SET GLOBAL slave_rows_search_algorithms = 'INDEX_SCAN,HASH_SCAN';

解决方案的官方说明：

The best solution is to ensure all tables have a primary key. This not only ensures the SQL thread can easily locate rows to update or delete, but is also considered best practice as it ensures all rows are unique. If there is no natural primary key for the table, a potential solution is to add an auto-increment unsigned integer column as the primary key.

Note: For InnoDB tables, there must always be a unique NOT NULL key for all tables as it is required for the clustered index. So adding an explicit "dummy" column will not add to the overall storage requirements as it will merely replace the hidden key.

If it is not possible to add a primary key immediately, a short term solution is to change the search algorithm used by the replication slave to locate the rows changed by row based events. The search algorithm is set using the slave_rows_search_algorithms option which is available in MySQL 5.6 and later. The default value is to use an index scan if possible, otherwise a table scan. However for tables without a primary key using a hash scan which causes the SQL thread to temporarily cache hashes to reduce the overhead of searching the whole table. The value of slave_rows_search_algorithms can be changed dynamically using:

mysql> SET GLOBAL slave_rows_search_algorithms = 'INDEX_SCAN,HASH_SCAN';
Query OK, 0 rows affected (0.01 sec)

INDEX_SCAN,HASH_SCAN is the default value in MySQL 8.0.2 and later.

It is not necessary to restart neither the whole MySQL instance nor the replication threads for the change to take effect. Setting the value to INDEX_SCAN,HASH_SCAN or INDEX_SCAN,HASH_SCAN,TABLE_SCAN makes no difference. Ensure to update your MySQL configuration file as well to make the change persist when MySQL is restarted.

One thing to be aware of when using hash scans that the hashes are only reused within one row based events. (Each row based event may have changes to several rows in the same table originating from the same SQL statement.) The maximum size of a row based event is controlled with the binlog_row_event_max_size option on the replication master. The default max event size is 8kB. This means that switching to hash scans only improves the performance of the SQL thread when:

Several rows fit into one row based event. It may help to increase the value of binlog_row_event_max_size on the replication master, if you perform updates or deletes on large rows (e.g. with blob or text data). binlog_row_event_max_size can only be set in the MySQL configuration file and a restart is required after setting a new value.
One statement changes several rows.

There is no significant drawbacks of enabling hash scans.

Even if enabling hash scans improves the performance enough for the replication slave to keep up, it is strongly recommended to keep working towards adding a primary key to all tables.