由DUPLICATE引起的DeadlockLoserDataAccessException死锁

刚接手离职同事业务没几天, 线上就出现了死锁现象。 其业务是接收商品(sku_id)促销消息mq,促销状态有新增、删除、过期等状态,同一sku_id,这些状态可能在同一时间袭来

一、直击死锁现场

发现死锁,第一时间查看日志现场,死锁现场日志如下: 

发现如下死锁关键词

com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException:  Deadlock found when trying to get lock; try restarting transaction

死锁语句如下:

insert into student_sku (sku_id, sku_state, prom_id, prom_price, prom_state, start_time, end_time, create_time, update_time ) values (?, ?, ?, ?, ?, ?, ?, ?, ? ) ON DUPLICATE KEY UPDATE prom_id = ?, prom_price = ?, prom_state = ?, start_time = ?, end_time = ?, update_time = ?

表结构如下,其中有两个唯一索引uniq_promotion_id,uniq_sku_id

CREATE TABLE `student_sku` (
    `id` BIGINT ( 20 ) NOT NULL AUTO_INCREMENT COMMENT '主键',
    `sku_id` BIGINT ( 20 ) NOT NULL DEFAULT '0' COMMENT '商品id',
    `sku_state` TINYINT ( 5 ) NOT NULL DEFAULT '-1' COMMENT '商品上下柜状态,1:上架(可搜索,可购买),0:下架(可通过skuid搜索,不可购买),2:可上架(可通过skuid搜索,不可购买),10:pop 删除(不可搜索,不可购买))',
    `prom_id` BIGINT ( 20 ) NOT NULL DEFAULT '0' COMMENT '促销id',
    `prom_price` DECIMAL ( 15, 2 ) NOT NULL DEFAULT '0.00' COMMENT '促销价格',
    `prom_state` TINYINT ( 5 ) NOT NULL DEFAULT '0' COMMENT '促销状态,0默认值,1促销有效,2无效(包含删除自然结束等)',
    `start_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '促销开始时间',
    `end_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '促销结束时间',
    `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '记录创建时间',
    `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '记录更新时间',
    PRIMARY KEY ( `id` ),
    UNIQUE KEY `uniq_promotion_id` ( `prom_id` ) USING BTREE,
    UNIQUE KEY `uniq_sku_id` ( `sku_id` ) USING BTREE,
    KEY `idx_update_time` ( `update_time` ) USING BTREE 
) ENGINE = INNODB AUTO_INCREMENT = 2797210 DEFAULT CHARSET = utf8 COMMENT = '学生商品表'

二、死锁分析

现象很明显,当执行更新的时候,发现了要更新的数据被别的事务锁了,拿不到了锁,于是就死锁了。

我们再来看SQL语句,student_sku表中两个唯一索引uniq_promotion_id,uniq_sku_id,经过实验,有多个唯一索引,在ON DUPLICATE KEY UPDATE 更新时,最后一个唯一索引作为更新条件,这里uniq_sku_id是更新条件

insert into student_sku (sku_id, sku_state, prom_id, prom_price, prom_state, start_time, end_time, create_time, update_time ) values (?, ?, ?, ?, ?, ?, ?, ?, ? ) ON DUPLICATE KEY UPDATE prom_id = ?, prom_price = ?, prom_state = ?, start_time = ?, end_time = ?, update_time = ?

由于没有开启锁日志记录,只能模拟本地再现 

1、开启MySQL锁记录

set GLOBAL innodb_status_output_locks=ON;

set GLOBAL innodb_status_output=ON;

 2、模拟再现

 事务1

mysql> begin;
Query OK, 0 rows affected (0.02 sec)

mysql> insert into student_sku (sku_id, sku_state, prom_id,prom_price, prom_state, start_time, end_time, create_time, update_time ) values (1112, 2, 1231, 2, 2, now(),  now(),  now(),  now() ) ON DUPLICATE KEY UPDATE prom_id = 1231, prom_price = 2, prom_state = 2, start_time =  now(), end_time = now(), update_time = now();
Query OK, 2 rows affected (0.03 sec)

mysql> 

 事务2

mysql> begin;
Query OK, 0 rows affected (0.02 sec)

mysql> insert into student_sku (sku_id, sku_state, prom_id,prom_price, prom_state, start_time, end_time, create_time, update_time ) values (1114, 2, 1232, 2, 2, now(),  now(),  now(),  now() ) ON DUPLICATE KEY UPDATE prom_id = 1232, prom_price = 2, prom_state = 2, start_time =  now(), end_time = now(), update_time = now();
Query OK, 1 row affected (0.02 sec)

mysql> 

 事务3

mysql> begin;
Query OK, 0 rows affected (0.02 sec)

mysql> insert into student_sku (sku_id, sku_state, prom_id,prom_price, prom_state, start_time, end_time, create_time, update_time ) values (1115, 2, 1233, 2, 2, now(),  now(),  now(),  now() ) ON DUPLICATE KEY UPDATE prom_id = 1233, prom_price = 2, prom_state = 2, start_time =  now(), end_time = now(), update_time = now();
1205 - Lock wait timeout exceeded; try restarting transaction
mysql> 

查看此时锁日志:show engine innodb status

------------
TRANSACTIONS
------------
Trx id counter 225695
Purge done for trx's n:o < 225693 undo n:o < 0 state: running but idle
History list length 39
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421367021271328, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421367021266768, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 225694, ACTIVE 16 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 6211, OS thread handle 139891716085504, query id 714060 100.124.26.230 dev update
insert into student_sku (sku_id, sku_state, prom_id,prom_price, prom_state, start_time, end_time, create_time, update_time ) values (1115, 2, 1233, 2, 2, now(),  now(),  now(),  now() ) ON DUPLICATE KEY UPDATE prom_id = 1233, prom_price = 2, prom_state = 2, start_time =  now(), end_time = now(), update_time = now()
------- TRX HAS BEEN WAITING 16 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 108 page no 4 n bits 72 index uniq_promotion_id of table `commoditypools`.`student_sku` trx id 225694 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

------------------
TABLE LOCK table `commoditypools`.`student_sku` trx id 225694 lock mode IX
RECORD LOCKS space id 108 page no 4 n bits 72 index uniq_promotion_id of table `commoditypools`.`student_sku` trx id 225694 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

RECORD LOCKS space id 108 page no 4 n bits 72 index uniq_promotion_id of table `commoditypools`.`student_sku` trx id 225694 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

---TRANSACTION 225693, ACTIVE 2630 sec
5 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 1
MySQL thread id 6210, OS thread handle 139891166349056, query id 714053 100.124.26.230 dev
TABLE LOCK table `commoditypools`.`student_sku` trx id 225693 lock mode IX
RECORD LOCKS space id 108 page no 4 n bits 72 index uniq_promotion_id of table `commoditypools`.`student_sku` trx id 225693 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

RECORD LOCKS space id 108 page no 4 n bits 72 index uniq_promotion_id of table `commoditypools`.`student_sku` trx id 225693 lock_mode X locks gap before rec
Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 8; hex 80000000000004d0; asc         ;;
 1: len 8; hex 800000000000000b; asc         ;;

RECORD LOCKS space id 108 page no 5 n bits 72 index uniq_sku_id of table `commoditypools`.`student_sku` trx id 225693 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

RECORD LOCKS space id 108 page no 5 n bits 72 index uniq_sku_id of table `commoditypools`.`student_sku` trx id 225693 lock_mode X locks gap before rec
Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 8; hex 800000000000045a; asc        Z;;
 1: len 8; hex 800000000000000b; asc         ;;

---TRANSACTION 225688, ACTIVE 2691 sec
4 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 6208, OS thread handle 139891587360512, query id 714049 100.124.26.230 dev
TABLE LOCK table `commoditypools`.`student_sku` trx id 225688 lock mode IX
RECORD LOCKS space id 108 page no 4 n bits 72 index uniq_promotion_id of table `commoditypools`.`student_sku` trx id 225688 lock_mode X
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 8; hex 80000000000004cf; asc         ;;
 1: len 8; hex 8000000000000008; asc         ;;

RECORD LOCKS space id 108 page no 5 n bits 72 index uniq_sku_id of table `commoditypools`.`student_sku` trx id 225688 lock_mode X
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 8; hex 8000000000000458; asc        X;;
 1: len 8; hex 8000000000000008; asc         ;;

RECORD LOCKS space id 108 page no 3 n bits 72 index PRIMARY of table `commoditypools`.`student_sku` trx id 225688 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
 0: len 8; hex 8000000000000008; asc         ;;
 1: len 6; hex 000000037198; asc     q ;;
 2: len 7; hex 38000002610f7d; asc 8   a };;
 3: len 8; hex 8000000000000458; asc        X;;
 4: len 1; hex 82; asc  ;;
 5: len 8; hex 80000000000004cf; asc         ;;
 6: len 7; hex 80000000000200; asc        ;;
 7: len 1; hex 82; asc  ;;
 8: len 5; hex 99a9720264; asc   r d;;
 9: len 5; hex 99a9720264; asc   r d;;
 10: len 5; hex 99a9717ea5; asc   q~ ;;
 11: len 5; hex 99a9720264; asc   r d;;

3、加锁分析

表中已有数据

 Transaction1

insert into student_sku (sku_id, sku_state, prom_id,prom_price, prom_state, start_time, end_time, create_time, update_time ) values (1112, 2, 1231, 2, 2, now(),  now(),  now(),  now() ) 
ON DUPLICATE KEY UPDATE prom_id = 1231, prom_price = 2, prom_state = 2, start_time =  now(), end_time = now(), update_time = now();

Transaction2 

insert into student_sku (sku_id, sku_state, prom_id,prom_price, prom_state, start_time, end_time, create_time, update_time ) values (1114, 2, 1232, 2, 2, now(),  now(),  now(),  now() ) 
ON DUPLICATE KEY UPDATE prom_id = 1232, prom_price = 2, prom_state = 2, start_time =  now(), end_time = now(), update_time = now();
Query OK, 1 row affected (0.02 sec)

Transaction3

insert into student_sku (sku_id, sku_state, prom_id,prom_price, prom_state, start_time, end_time, create_time, update_time ) values (1115, 2, 1233, 2, 2, now(),  now(),  now(),  now() ) 
ON DUPLICATE KEY UPDATE prom_id = 1233, prom_price = 2, prom_state = 2, start_time =  now(), end_time = now(), update_time = now();

 

numberTransaction1Transaction2Transaction3
1uniq_promotion_id加锁uniq_promotion_id加锁uniq_promotion_id等待加锁,最终超时
2uniq_sku_id加锁uniq_sku_id加锁 
3primary加锁  

 

事物1首先到来,事物1的两个唯一索引uniq_promotion_id和uniq_sku_id插入的列值分别为1112,1231,首先对uniq_promotion_id加锁,然后对uniq_sku_id加锁,由于1112,1231,已经在表中已经存在,故此需要更新该行,更新时候需要根据主键更新,便对primary加锁。事务1对接下来的死锁没有什么影响,这里可以忽略。

事务2到来,事务2的唯一索引uniq_promotion_id插入的列值1232,在表中不存在,因此事务2执行插入操作,接着便对uniq_promotion_id加锁,这里的锁是间隙锁,加锁区间是(1231,+supernum]。

事务3到来,事务3的uniq_promotion_id唯一索引插入的列值1233,在表中也不存在,因此事务3执行插入操作,开始对uniq_promotion_id加锁,而此时发现要插入的值1233,已经被事务2锁定,即事务2占有uniq_promotion_id的锁,事务2没有提交,没有被释放该锁,于是事务3等待锁超时了。此时,若事务2在执行一条插入语句,就死锁了

三、解决方案

 (1)方案一

先根据条件查询,再判断是插入或者根据主键更新。

此方案,先根据已知条件查询,若数据存在,再根据主键更新数据,从缩小锁粒度,减少并发等待时间方面考虑解决死锁问题。

但是先查询,再插入或更新,在并发情况下,并不能保证原子性。比如,有同一个sku_id有两条促销消息,且促销状态不同,同一时刻到来并执行,这个时候查询表中并不存在该sku_id,两条消息都被插入表中,这显然不是我们想要的。

因此,该方案适合,同一条业务数据,不会同一时刻出现多条的情况,即多个相同的数据不存在并发现象

 (2)方案二

增加分布式锁,保证有序执行。此方案,从去除事务并发方面考虑解决死锁问题

由于方案一,在并发执行时候,同一个sku_id,可能会insert进表中多条数据,并不能保证其原子性。因此,无需修改SQL,增加分布式锁(Redis、Zookeeper等)即可,保证其有序性,从而解决事务并发问题

但是,加锁,多少会降低系统的处理速度,但是这也是分布式系统必须考虑的问题

 

 

参考:https://blog.csdn.net/li563868273/article/details/105213266/

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值