概述
死锁是指两个或两个以上的进程在执行过程中,由于竞争资源或者由于彼此通信而造成的一种阻塞的现象,若无外力作用,它们都将无法推进下去。此时称系统处于死锁状态或系统产生了死锁,这些永远在互相等待的进程称为死锁进程。
InnoDB 引擎采取的是 wait-for graph 等待图的方法来自动检测死锁,如果发现死锁会自动回滚一个事务。
一、开启锁监控
一般我们可以通过 show engine innodb status 命令来获取死锁信息,但是它有个限制,只能拿到最近一次的死锁日志.
innodb存储引擎监控分为四种,表监控,表空间监控,锁监控,状态监控.能够在mysql客户端开启监控选项,而后信息将会输出在mysql的错误日志内.innodb_monitor和innodb_lock_monitor会每隔15秒会向错误日志中记录InnoDB监控信息,innodb_table_monitor和innodb_tablespace_monitor是每隔64秒.
以下基于系统表和基于系统参数的两种方式,只要使用二者其中一种方式开启监控即可。
1.对innodb状态监控
**基于表**
create table innodb_monitor (a int) engine=innodb; --开启监控
drop table innodb_monitor; --关闭监控 mysql
**基于系统参数**
set GLOBAL innodb_status_output=ON;
set GLOBAL innodb_status_output=OFF; --关闭
2.对innodb锁监控
**基于表**
create table innodb_lock_monitor (a int) engine=innodb; --开启监控
drop table innodb_lock_monitor; --关闭监控sql
**基于系统参数**
set GLOBAL innodb_status_output=ON;
set GLOBAL innodb_status_output_locks=ON;
set GLOBAL innodb_status_output_locks=OFF; -- 关闭
注意:前提需要开启 innodb_status_output
3.对innod的表监控
**基于表**
create table innodb_table_monitor(a int) engine=innodb; --开启监控
drop tables innodb_table_monitor; --关闭监控spa
注:表空间监控暂不支持通过参数方式配置,并且未来会被废弃。
4.对innodb表空间监控
**基于表**
create table innodb_tablespace_monitor (a int) engine=innodb; --开启监控
drop table innodb_tablespace_monitor; --关闭监控日志
注:表监控暂不支持通过参数方式配置,并且未来会被废弃。
监控复位
- 需要特别注意的一点是:mysql服务重启后,需要重启开启相应监控,才会生效。换句话说,服务重启后,之前配置的所有监控都被复位,处于关闭状态。
- 基于系统表方式开启的监控,在mysql服务重启后,即使表存在,监控也不会生效。需要重启drop表,再create表,才能使监控生效。
- 基于系统参数方式开启的监控,在mysql服务重启后,相关系统参数值都是OFF。需要重启设置对应的参数,才能使用监控生效。
- 不在停机或重启情况下,mysql每15秒输出一次INNODB运行状态信息到错误日志。这会使用日志变得越来越大。建议在需要的时候开启,不需要的时候关闭掉。
补充知识
1. 查看错误日志输出位置
mysql root@localhost:test> select @@log_error;
+----------------------------------------+
| @@log_error |
|----------------------------------------|
| /usr/local/mysql/data/mysqld.local.err |
+----------------------------------------+
2. 查看历史日志开启状态与输出位置
mysql root@localhost:test> show VARIABLES like 'general%';
+------------------+---------------------------------------+
| Variable_name | Value |
|------------------+---------------------------------------|
| general_log | ON |
| general_log_file | /usr/local/mysql/data/yerba-buena.log |
+------------------+---------------------------------------+
二.分析死锁日志
有了前面一篇mysql锁机制的了解,再来具体分析一下死锁日志,以及具体执行SQL的加锁过程,会对日后解决死锁有很大的帮助,在分析看死锁日志案例案分析之前,最好先了解清楚mysql innoDB引擎下各种加锁机制逻辑,然后再看这边死锁日志分析会更能认识到死锁最真实的问题,从而加以解决.
下面的案例都是因为执行循序不一致导致死锁,实际生产环境产生的死锁,往往是并发造成的,情况会比这更加难以分析重现,下面的案例相对比较简单,最好能够参透,真正的了解是能够重现。
死锁案例1
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-05-15 14:53:58 0x2260
*** (1) TRANSACTION:
TRANSACTION 353791, ACTIVE 44 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 20, OS thread handle 14368, query id 7885 localhost ::1 root updating
delete from tkey where morekey = 5
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4880 page no 5 n bits 80 index key of table `other`.`tkey` trx id 353791 lock_mode X waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 4; hex 80000005; asc ;;
1: len 4; hex 80000005; asc ;;
*** (2) TRANSACTION:
TRANSACTION 353790, ACTIVE 48 sec inserting
mysql tables in use 1, locked 1
5 lock struct(s), heap size 1136, 5 row lock(s), undo log entries 3
MySQL thread id 19, OS thread handle 8800, query id 7887 localhost ::1 root executing
insert tkey select 3,1,4
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 4880 page no 5 n bits 80 index key of table `other`.`tkey` trx id 353790 lock_mode X
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 4; hex 80000005; asc ;;
1: len 4; hex 80000005; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4880 page no 5 n bits 80 index key of table `other`.`tkey` trx id 353790 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 4; hex 80000005; asc ;;
1: len 4; hex 80000005; asc ;;
*** WE ROLL BACK TRANSACTION (1)
以上一份死锁日志,下面来分析一下:
innoDB一共有四种类型的行锁:记录锁,间隙锁,Next-key 锁和插入意向锁,在死锁日志里面分别对应一下几种形式:
- 记录锁(LOCK_REC_NOT_GAP): lock_mode X locks rec but not gap
- 间隙锁(LOCK_GAP): lock_mode X locks gap before rec
- Next-key 锁(LOCK_ORNIDARY): lock_mode X
- 插入意向锁(LOCK_INSERT_INTENTION): lock_mode X locks gap before rec insert intention
单单从死锁日志可以看出
Transaction 1 在等待的是next-key锁,delete 操作被阻塞.
Transaction 2 持有next-key锁,在等待插入意向锁,insert操作被阻塞.
所以Transaction 1 等待的是Transaction2 的next-key锁。
从日志上能了解的信息非常有限,要具体分析死锁如何造成的还是得要具体结合SQL执行情况分析
下图是造成死锁的SQL执行情况以及分析:
仅仅通过死锁日志逆推死锁具体的SQL执行情况,往往会比较复杂,因为死锁日志反应的只是两个事物物在等待的锁的相关信息,实际造成死锁的事物可能不止两个、三个可能有四五条,要能具体分析问题出在哪里就需要对程序代码SQL执行比较熟悉,并且需要比较丰富加锁经验,对加锁的流程要有深刻理解,才能比较好的解决死锁问题.
本案例 重点需要理解一下加锁的顺序问题。
死锁案例2
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-05-18 16:03:21 0x1c3c
*** (1) TRANSACTION:
TRANSACTION 353798, ACTIVE 27 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 24, OS thread handle 6676, query id 7945 localhost ::1 root updating
delete from goods where count = 30
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4883 page no 3 n bits 80 index PRIMARY of table `other`.`goods` trx id 353798 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
0: len 4; hex 80000001; asc ;;
1: len 6; hex 000000056605; asc f ;;
2: len 7; hex 3a000002be04de; asc : ;;
3: len 2; hex 7878; asc xx;;
4: len 4; hex 80000014; asc ;;
5: len 4; hex 80000001; asc ;;
6: len 4; hex 80000001; asc ;;
*** (2) TRANSACTION:
TRANSACTION 353797, ACTIVE 55 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 23, OS thread handle 7228, query id 7946 localhost ::1 root updating
delete from goods where count = 50
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 4883 page no 3 n bits 80 index PRIMARY of table `other`.`goods` trx id 353797 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
0: len 4; hex 80000001; asc ;;
1: len 6; hex 000000056605; asc f ;;
2: len 7; hex 3a000002be04de; asc : ;;
3: len 2; hex 7878; asc xx;;
4: len 4; hex 80000014; asc ;;
5: len 4; hex 80000001; asc ;;
6: len 4; hex 80000001; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4883 page no 3 n bits 80 index PRIMARY of table `other`.`goods` trx id 353797 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
0: len 4; hex 80000001; asc ;;
1: len 6; hex 000000056605; asc f ;;
2: len 7; hex 3a000002be04de; asc : ;;
3: len 2; hex 7878; asc xx;;
4: len 4; hex 80000014; asc ;;
5: len 4; hex 80000001; asc ;;
6: len 4; hex 80000001; asc ;;
*** WE ROLL BACK TRANSACTION (1)
下图为分析情况:
死锁案例3
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-05-18 16:29:18 0x1c3c
*** (1) TRANSACTION:
TRANSACTION 353812, ACTIVE 55 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 2
MySQL thread id 24, OS thread handle 6676, query id 7984 localhost ::1 root executing
insert goods select 10,'ss',30,10,10
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4883 page no 5 n bits 80 index package of table `other`.`goods` trx id 353812 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 8000000b; asc ;;
1: len 4; hex 8000000b; asc ;;
*** (2) TRANSACTION:
TRANSACTION 353811, ACTIVE 159 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 6 row lock(s), undo log entries 3
MySQL thread id 23, OS thread handle 7228, query id 7985 localhost ::1 root executing
insert goods select 2,'zz',30,2,4
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 4883 page no 5 n bits 80 index package of table `other`.`goods` trx id 353811 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 8000000b; asc ;;
1: len 4; hex 8000000b; asc ;;
Record lock, heap no 6 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 8000000b; asc ;;
1: len 4; hex 8000000f; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4883 page no 5 n bits 80 index package of table `other`.`goods` trx id 353811 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 80000005; asc ;;
1: len 4; hex 80000005; asc ;;
*** WE ROLL BACK TRANSACTION (2)
具体执行如下图:
死锁案例4
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-05-18 17:38:02 0x1c3c
*** (1) TRANSACTION:
TRANSACTION 353830, ACTIVE 5 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 25, OS thread handle 12272, query id 8009 localhost ::1 root executing
insert goods select 5,'G5',30,5,5
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4883 page no 3 n bits 80 index PRIMARY of table `other`.`goods` trx id 353830 lock_mode X locks rec but not gap waiting
Record lock, heap no 9 PHYSICAL RECORD: n_fields 7; compact format; info bits 32
0: len 4; hex 80000005; asc ;;
1: len 6; hex 000000056620; asc f ;;
2: len 7; hex 2a000002ae1239; asc * 9;;
3: len 2; hex 4735; asc G5;;
4: len 4; hex 8000001e; asc ;;
5: len 4; hex 80000005; asc ;;
6: len 4; hex 80000005; asc ;;
*** (2) TRANSACTION:
TRANSACTION 353829, ACTIVE 21 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 23, OS thread handle 7228, query id 8008 localhost ::1 root executing
insert goods select 5,'G5',30,5,5
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 4883 page no 3 n bits 80 index PRIMARY of table `other`.`goods` trx id 353829 lock mode S
Record lock, heap no 9 PHYSICAL RECORD: n_fields 7; compact format; info bits 32
0: len 4; hex 80000005; asc ;;
1: len 6; hex 000000056620; asc f ;;
2: len 7; hex 2a000002ae1239; asc * 9;;
3: len 2; hex 4735; asc G5;;
4: len 4; hex 8000001e; asc ;;
5: len 4; hex 80000005; asc ;;
6: len 4; hex 80000005; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 4883 page no 3 n bits 80 index PRIMARY of table `other`.`goods` trx id 353829 lock_mode X locks rec but not gap waiting
Record lock, heap no 9 PHYSICAL RECORD: n_fields 7; compact format; info bits 32
0: len 4; hex 80000005; asc ;;
1: len 6; hex 000000056620; asc f ;;
2: len 7; hex 2a000002ae1239; asc * 9;;
3: len 2; hex 4735; asc G5;;
4: len 4; hex 8000001e; asc ;;
5: len 4; hex 80000005; asc ;;
6: len 4; hex 80000005; asc ;;
*** WE ROLL BACK TRANSACTION (2)
下图为执行SQL分析:
更多死锁案例可以关注下面这个GitHub项目:
https://github.com/aneasystone/mysql-deadlocks
三.锁的优化建议
锁如果利用不好,会给业务造成大量的卡顿现象,在了解了锁相关的一些知识点后,我们可以有意识的去避免锁带来的一些问题。
- 合理设计索引,让 InnoDB 在索引键上面加锁的时候尽可能准确,尽可能的缩小锁定范围,避免造成不必要的锁定而影响其他 Query 的执行。
- 尽可能减少基于范围的数据检索过滤条件,避免因为间隙锁带来的负面影响而锁定了不该锁定的记录。(降低加锁的级别,非唯一索引条件的修改操作会产生间隙锁,而这是导致大多死锁的主要原因)
- 尽量控制事务的大小,减少锁定的资源量和锁定时间长度。
- 在业务环境允许的情况下,尽量使用较低级别的事务隔离,以减少 MySQL 因为实现事务隔离级别所带来的附加成本。
四.其他需要了解的(待完善)
MySQL DELETE 删除语句加锁分析
delete 语句的加锁有三种情况:
- 记录存在且有效:加记录X锁(lock_mode X locks rec but not gap);
- 记录存在但无效(被标记为删除):加 Next-key 锁(lock_mode X);
- 记录不存在:加 gap 锁(locks gap before rec);
具体参考下面这篇文章:
http://www.fordba.com/lock-analyse-of-delete.html