一、问题描述
直播项目出现几次死锁异常报错
二、问题原因
1.发生时机
直播间用户同时退出直播间(有用户主动操作,也有主播结束直播情况),腾讯Im会回调告知用户退出直播间了,然后业务需要对这个直播间的用户更新离开时间,因为近乎同时更新导致死锁
2.错误日志
3.死锁分析
3.1)根据日志可以看到有两个事务T1,T2在执行。T1(4531853712)事务在2024-04-26 20:23:44.179执行,T2(4531853708)事务在2024-04-26 20:23:44.186执行
#T1事务,事务id:45318537121
UPDATE im_group_access_log SET leave_time='2024-04-26 20:23:44.179' WHERE (group_id = 1609377597358144 AND user_id IN (1484749629554720) AND leave_time IS NULL)
#T2事务,事务id:4531853708
UPDATE im_group_access_log SET leave_time='2024-04-26 20:23:44.186' WHERE (group_id = 1609377597358144 AND user_id IN (1569688507121728) AND leave_time IS NULL)
3.2)表中含有idx_group_id和idx_user_id两个非唯一普通索引,及以下的T1/T2的执行计划。注意type是index_merge:使用索引合并来检索行,Extra:多个索引的交集
3.3)T1持有PK的锁,等待idx_group_id索引的锁
RECORD LOCKS space id 4591 page no 28350 n bits 280 index PRIMARY of table `im_group_access_log` trx id 4531853712 lock_mode X locks rec but not gap
RECORD LOCKS space id 4591 page no 33263 n bits 416 index idx_group_id of table `im_group_access_log` trx id 4531853712 lock_mode X locks rec but not gap waiting
3.4)T2持有idx_group_id的锁,等待对应的PK的锁,所以看出T1和T2形成了循环等待
RECORD LOCKS space id 4591 page no 33263 n bits 416 index idx_group_id of table `im_group_access_log` trx id 4531853708 lock_mode X locks rec but not gap
RECORD LOCKS space id 4591 page no 28350 n bits 280 index PRIMARY of table `im_group_access_log` trx id 4531853708 lock_mode X locks rec but not gap waiting
3.5)加锁过程