1. 系统环境
系统版本
mysql> select @@version;
+------------+
| @@version |
+------------+
| 5.7.26-log |
+------------+
主库
mysql> select @@port;
+--------+
| @@port |
+--------+
| 3307 |
+--------+
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000006 | 3277 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
创建主从复制的用户
grant replication slave on *.* to repl@'10.0.0.%' identified by '123';
____________________________________________________________________________
从库
mysql> select @@port;
+--------+
| @@port |
+--------+
| 3308 |
+--------+
构建主从关系
mysql> CHANGE MASTER TO
-> MASTER_HOST='10.0.0.51',
-> MASTER_USER='repl',
-> MASTER_PASSWORD='123',
-> MASTER_PORT=3307,
-> MASTER_LOG_FILE='mysql-bin.000006',
-> MASTER_LOG_POS=3277,
-> MASTER_CONNECT_RETRY=10;
Query OK, 0 rows affected, 2 warnings (0.05 sec)
设置从库延时复制主库等待的时间300秒
mysql> change master to master_delay=300;
Query OK, 0 rows affected (0.00 sec)
开启主从复制
mysql> start slave;
Query OK, 0 rows affected (0.06 sec)
检查主从状态是否开启成功
show slave status\G;
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
检查延时时间设置
SQL_Delay: 300
2. 故障模拟
一主一从,主库建库建表且插入数据,然后误删库。
3. 解决思路
1)在主库删库后300秒内,发现这个危险操作
2)停止从库的SQL线程
3)截取relay-log的起始点位置
起点:在show slave status后,Relay_Log_File的Relay_Log_Pos
终点:在show relaylog events in ‘Relay_Log_File’后,删库操作的Anonymous_Gtid号
4)恢复截取的日志到从库
5)原来的从库变成新的主库
或
5)将截取的数据通过mysqldump以库为单位备份,在主库恢复,主从关系不变
4. 实际操作
1)模拟故障
主库操作
mysql> create database delay charset utf8mb4;
Query OK, 1 row affected (0.10 sec)
mysql> use delay;
Database changed
mysql> create table t1 (id int);
Query OK, 0 rows affected (0.31 sec)
mysql> insert into t1 values(1),(2),(3);
Query OK, 3 rows affected (0.27 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> commit;
Query OK, 0 rows affected (0.00 sec)
mysql> drop database delay;
Query OK, 1 row affected (0.31 sec)
2)恢复数据
从库操作
1)停止SQL线程
mysql> stop slave sql_thread;
Query OK, 0 rows affected (0.00 sec)
2)找relay-log的起点
mysql> show slave status\G;
Relay_Log_File: db01-relay-bin.000002
Relay_Log_Pos: 320
3)找relay-log的终点
| db01-relay-bin.000002 | 320 | Anonymous_Gtid | 7 | 3342 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| db01-relay-bin.000002 | 385 | Query | 7 | 3455 | create database delay charset utf8mb4 |
| db01-relay-bin.000002 | 498 | Anonymous_Gtid | 7 | 3520 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| db01-relay-bin.000002 | 563 | Query | 7 | 3620 | use `delay`; create table t1 (id int) |
| db01-relay-bin.000002 | 663 | Anonymous_Gtid | 7 | 3685 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| db01-relay-bin.000002 | 728 | Query | 7 | 3758 | BEGIN |
| db01-relay-bin.000002 | 801 | Table_map | 7 | 3804 | table_id: 176 (delay.t1) |
| db01-relay-bin.000002 | 847 | Write_rows | 7 | 3854 | table_id: 176 flags: STMT_END_F |
| db01-relay-bin.000002 | 897 | Xid | 7 | 3885 | COMMIT /* xid=1123 */ |
| db01-relay-bin.000002 | 928 | Anonymous_Gtid | 7 | 3950 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| db01-relay-bin.000002 | 993 | Query | 7 | 4045 | drop database delay
由图可知,删库前的postion是928.
因此起始点是320和928.
3)截取日志
[root@db01 ~]# mysqlbinlog --start-position=320 --stop-position=928 /data/3308/data/db01-relay-bin.000002 > /tmp/relay.sql
4)恢复数据
mysql> source /tmp/relay.sql
5)检查数据
mysql> select * from delay.t1;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
+------+
3 rows in set (0.00 sec)
6)从库身份解除
stop slave;
reset slave all;
在主库
reset master;
或将数据备份到主库
6-a)备份delay库的数据
[root@db01 ~]# mysqldump -S /data/3308/mysql.sock -B delay --master-data=2 --triggers --single-transaction -R -E >/tmp/delay.sql
6-b)在主库恢复数据并确认
mysql> set sql_log_bin=0;
mysql> source /tmp/delay.sql
mysql> select * from delay.t1;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
+------+
3 rows in set (0.00 sec)