问题来源
报警群收到线上服务报警:Error 1213 (40001): Deadlock found when trying to get lock; try restarting transaction。
排查过程
- 查看日志系统,找到报错的代码位置。
-
在mysql服务器执行命令“ show engine innodb status; ”。发现具体死锁原因如下:
2023-08-15 05:58:18 140196042598144
*** (1) TRANSACTION:
TRANSACTION 28808, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1128, 3 row lock(s), undo log entries 1
MySQL thread id 5975, OS thread handle 140195455186688, query id 703983 172.16.26.26 root update
INSERT INTO `table1` (`a`,`b`,`be_deleted`) VALUES (4218, 0, 0),
(4218, 0, 0),
(4218, 0, 0),
(4218, 0, 0)*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 57 page no 232 n bits 264 index idx_a_b of table `table1` trx id 28808 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 57 page no 232 n bits 264 index idx_a_b of table `table1` trx id 28808 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;*** (2) TRANSACTION:
TRANSACTION 28809, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1128, 3 row lock(s), undo log entries 1
MySQL thread id 5971, OS thread handle 140195456243456, query id 703985 172.16.26.27 root update
INSERT INTO `table1` (`a`,`b`,`be_deleted`) VALUES (4218, 0, 0),
(4218, 0, 0),
(4218, 0, 0),
(4218, 0, 0)*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 57 page no 232 n bits 264 index idx_a_b of table `table1` trx id 28809 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 57 page no 232 n bits 264 index idx_a_b of table `table1` trx id 28809 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;*** WE ROLL BACK TRANSACTION (2)
------------
- 根据日志和代码进行复现。
-
建表语句
CREATE TABLE `table1` ( `id` bigint unsigned NOT NULL AUTO_INCREMENT, `a` bigint NOT NULL, `b` bigint NOT NULL DEFAULT '0' COMMENT '记录ID', `be_deleted` tinyint DEFAULT '0' COMMENT '默认为0,为1时软删除', PRIMARY KEY (`id`), KEY `idx_a_b` (`a`,`b`) ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='表1';
-
案例1。条件:a和 b在数据库中没有。按照T的顺序执行后,在T6得到复现。备注:我们业务是假删除,所以代码中的delete对应的sql是update。
-
事物1
--- T1 START TRANSACTION; --- T2 UPDATE table1 SET `be_deleted`=1 WHERE a=1 and b=2; --- T5 INSERT INTO table1 (a,b) values(1,2);
-
事物2
--- T3 START TRANSACTION; --- T4 UPDATE table1 SET `be_deleted`=1 WHERE a=3 and b=4; --- T6 INSERT INTO table1 (a,b) values(3,4);
-
-
案例2。条件:数据库中有a=3并且b=4的数据。这个时候能执行成功。
-
事物1
--- T1 START TRANSACTION; --- T2 UPDATE table1 SET `be_deleted`=1 WHERE a=1 and b=2; --- T5 INSERT INTO table1 (a,b) values(1,2); --- T8 COMMIT;
- 事物2
--- T3 START TRANSACTION; --- T4 UPDATE table1 SET `be_deleted`=1 WHERE a=3 and b=4; --- T6 INSERT INTO table1 (a,b) values(3,4); --- T7 COMMIT;
-
-
执行到案例1,当执行到T5。在数据库执行sql“X_NAME,LOCK_MODE,LOCK_DATA FROM performance_schema.data_locks;”查看当前锁情况如下:
+-----------+------------+--------------------+------------------------+ | LOCK_TYPE | INDEX_NAME | LOCK_MODE | LOCK_DATA | +-----------+------------+--------------------+------------------------+ | TABLE | NULL | IX | NULL | | RECORD | idx_a | X | supremum pseudo-record | | RECORD | idx_a | X,INSERT_INTENTION | supremum pseudo-record | | TABLE | NULL | IX | NULL | | RECORD | idx_a | X | supremum pseudo-record | +-----------+------------+--------------------+------------------------+
-
可以看到锁的数据范围是 supremum pseudo-record即锁住了当前最大索引以及后续的值。
结论
-
在innodb中,以索引为条件进行update时,使用的是Next-Key Locks锁。如果对应的索引值不存在,则锁定的数据是supremum pseudo-record,也就是大于当前索引的最大值。参考官网文档Next-Key Locks说明。
Next-Key Locks的范围说明:
(negative infinity, 10] (10, 11] (11, 13] (13, 20] (20, positive infinity)
For the last interval, the next-key lock locks the gap above the largest value in the index and the “supremum” pseudo-record having a value higher than any value actually in the index. The supremum is not a real index record, so, in effect, this next-key lock locks only the gap following the largest index value.
在最后一个区间内,next-key锁定了索引中最大值上方的间隙,并且“至高”伪记录具有比索引中任何实际值都要高的值。至高并不是一个真正的索引记录,因此实际上,这个next-key锁只锁定了紧随最大索引值之后的间隙。
-
当update时,条件中的索引值不存在。这个时候如果其他事物执行insert,则触发update持有的Next-Key Locks。
-
所以案例1执行顺序。当两个事物都执行了update,并且修改的索引值不存在。再insert就需要等待ext-Key Locks的释放 。形成两个事物相互等待的情况。导致死锁。