一 背景:
数据库 阿里云RDS mysql 5.6 隔离级别: RC
听小伙伴说系统最近在报死锁。奇怪的是update 还执行了。
二 相关准备
看表,主要是看表的所以。数据脱敏处理下,我们称为log表。
发现有索引userid,orderid. 两个单独索引。
再看下 数据量 count下。真大 已经1亿多条了。
看日志,SHOW ENGINE INNODB STATUS ,没权限,找dba执行完了导出来给一份。
相关知识:首先关于innodb的锁,推荐官网:
其次,加锁过程,推荐大神何博士的文章:http://hedengcheng.com/?p=771
贴一下结论:结论:若id列上有非唯一索引,那么对应的所有满足SQL查询条件的记录,都会被加锁。同时,这些记录在主键索引上的记录,也会被加锁。网上找个示意图
这里,我们有个概念,就是表锁跟行锁的区别,对于innodb的行锁不同情况不一样,今天要分析的case就是锁住了锁索引。
一些数据结构:
以下引自阿里数据库团队(阿里云数据库内核团队真心厉害,因为你懂原理是一会事,看懂源码是另一会事,能改内核源码做优化是顶级的)
在 Innodb 内部用一个 unsiged long 类型数据表示锁的类型, 如图所示,最低的 4 个 bit 表示 lock_mode, 5-8 bit 表示 lock_type, 剩下的高位 bit 表示行锁的类型。
record_lock type | lock_type | lock_mode |
lock_mode 描述了锁的基本类型,在代码中的定义如下:
/* Basic lock modes */
enum lock_mode {
LOCK_IS = 0, /* intention shared */
LOCK_IX, /* intention exclusive */
LOCK_S, /* shared */
LOCK_X, /* exclusive */
LOCK_AUTO_INC, /* locks the auto-inc counter of a table
in an exclusive mode */
LOCK_NONE, /* this is used elsewhere to note consistent read */
LOCK_NUM = LOCK_NONE, /* number of lock modes */
LOCK_NONE_UNSET = 255
};
#define LOCK_MODE_MASK 0xFUL /* mask used to extact lock type from the
type_mode field in a lock*/
lock_type 占用 5-8 bit 位,目前只用了 5 和 6 位,大小为 16 和 32 ,表示 LOCK_TABLE 和 LOCK_REC,使用宏定义 #define LOCK_TYPE_MASK 0xF0UL
来获取值。
record_lock_type 对于 LOCK_TABLE 类型来说都是空的,对于 LOCK_REC 目前值有:
#define LOCK_WAIT 256 /* 表示正在等待锁 */
#define LOCK_ORDINARY 0 /* 表示 next-key lock ,锁住记录本身和记录之前的 gap*/
#define LOCK_GAP 512 /* 表示锁住记录之前 gap(不锁记录本身) */
#define LOCK_REC_NOT_GAP 1024 /* 表示锁住记录本身,不锁记录前面的 gap */
#define LOCK_INSERT_INTENTION 2048 /* 插入意向锁 */
#define LOCK_CONV_BY_OTHER 4096 /* 表示锁是由其它事务创建的(比如隐式锁转换) */
使用位操作来设置和判断是否设置了对应的值。
静态数据结构
对于每个锁对象,有两个存在的纬度:一个是事务纬度,每个事务都可以获得锁结构和等待某些锁。另一个是全局纬度,所有的锁都保存在 Lock_sys->hash 哈希表中。无论是表锁还是行锁,都是用结构 lock_t 来描述:
/** Lock struct; protected by lock_sys->mutex */
struct lock_t {
trx_t* trx; /*!< transaction owning the
lock */
UT_LIST_NODE_T(lock_t)
trx_locks; /*!< list of the locks of the
transaction */
ulint type_mode; /*!< lock type, mode, LOCK_GAP or
LOCK_REC_NOT_GAP,
LOCK_INSERT_INTENTION,
wait flag, ORed */
hash_node_t hash; /*!< hash chain node for a record
lock */
dict_index_t* index; /*!< index for a record lock */
union {
lock_table_t tab_lock;/*!< table lock */
lock_rec_t rec_lock;/*!< record lock */
} un_member; /*!< lock details */
};
对于每个变量的意义注释已经说的比较清楚了,其中 type_mode 就是第一小节中 lock_type | type_mode,两个锁是否冲突就是使用它们各自的 type_mode 根据锁兼容矩阵来判断的,后面会详细说。
变量 hash 是 Inodb 中构造 Hash 表需要,当锁插入到 Lock_sys->hash 中,Hash 值相同就形成链表,使用变量 hash 相连。
un_member 表示 lock_t 不是表锁就是行锁,看下行锁的结构:
/** Record lock for a page */
struct lock_rec_t {
ulint space; /*!< space id */
ulint page_no; /*!< page number */
ulint n_bits; /*!< number of bits in the lock
bitmap; NOTE: the lock bitmap is
placed immediately after the
lock struct */
};
[space, page_no] 可以确定锁对应哪个页,参考下上个月月报最后两个小节,页上每行数据紧接着存放,内部使用一个 heap_no 来表示是第几行数据。因此[space, page_no, heap_no]可以唯一确定一行。Innodb 使用位图来表示锁具体锁住了那几行,在函数 lock_rec_create 中为 lock_t 分配内存空间的时候,会在对象地址后分配一段内存空间(当前行数 + 64)用来保存位图。n_bits 表示位图大小。
锁创建完成后首先会插入到全局 Hash 表中,然后放到对应的事务的锁链表中。相同(space,page_no)的锁会被 Hash 到同一个桶里,使用 lock_t->hash 串成链表。
实际产生的锁是在事务中,对应每个行锁或者表锁,有一个锁结构lock_t。有两种途径查询锁:第一种方式通过trx_t对象的trx_locks链表进行遍历可得到某个事务所持有或者正在等待的锁信息;另一种方式中全局变量lock_sys本质是一个哈希表,根据记录所在页进行哈希查询得到lock_rec_t,再扫描lock位图判断,最终得到该行记录是否有锁。
三 分析
先看下死锁日志:
------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-07-28 16:55:55 7fa4396ba700
*** (1) TRANSACTION:
TRANSACTION 77124411358, ACTIVE 0.015 sec fetching rows
mysql tables in use 3, locked 3
LOCK WAIT 520 lock struct(s), heap size 63016, 3 row lock(s)
LOCK BLOCKING MySQL thread id: 4034245 block 4034802
MySQL thread id 4034802, OS thread handle 0x7fa4105e7700, query id 113785089726 10.10.10.10 djsycrm Searching rows for update
UPDATE log SET flush=2 WHERE uid=457861679367545 and orderid=6840
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1335 page no 1049191 n bits 576 index `idx_uid` of table `XXXX`.`log` trx id 77124411358 lock_mode X locks rec but not gap waiting
Record lock, heap no 462 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 8; hex 8001a06c3ac14579; asc l: Ey;;
1: len 8; hex 0000000006452bee; asc E+ ;;
*** (2) TRANSACTION:
TRANSACTION 77124411351, ACTIVE 0.022 sec fetching rows
mysql tables in use 3, locked 3
531 lock struct(s), heap size 63016, 8 row lock(s)
MySQL thread id 4034245, OS thread handle 0x7fa4396ba700, query id 113785089680 10.10.10.10 djsycrm Searching rows for update
UPDATE log SET flush=2 WHERE uid=457861679367545 and orderid=5392
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1335 page no 1049191 n bits 576 index `idx_uid` of table `XXXXX`.`log` trx id 77124411351 lock_mode X locks rec but not gap
Record lock, heap no 462 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 8; hex 8001a06c3ac14579; asc l: Ey;;
1: len 8; hex 0000000006452bee; asc E+ ;;
Record lock, heap no 466 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 8; hex 8001a06c3ac14579; asc l: Ey;;
1: len 8; hex 00000000064532bd; asc E2 ;;
Record lock, heap no 474 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 8; hex 8001a06c3ac14579; asc l: Ey;;
1: len 8; hex 0000000006454275; asc EBu;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1335 page no 1256651 n bits 232 index `PRIMARY` of table `XXXXX`.`log` trx id 77124411351 lock_mode X locks rec but not gap waiting
Record lock, heap no 11 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
0: len 8; hex 0000000006454275; asc EBu;;
1: len 6; hex 0011f4d34089; asc @ ;;
2: len 7; hex eb0002c0320110; asc 2 ;;
3: len 8; hex 8001a06c3ac14579; asc l: Ey;;
4: len 8; hex 82a5c7d505c7d108; asc ;;
5: len 4; hex 5d3d4fda; asc ]=O ;;
6: len 4; hex 5d3d4fda; asc ]=O ;;
7: len 4; hex 80000000; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 4; hex 80000000; asc ;;
10: len 2; hex 7b7d; asc {};;
11: len 4; hex 80000001; asc ;;
12: len 4; hex 80000000; asc ;;
13: len 0; hex ; asc ;;
14: len 1; hex 30; asc 0;;
*** WE ROLL BACK TRANSACTION (1)
数据本身被处理过,但是对于理解问题还是有帮助的,能定位主要信息有 1.只显示最近两条事务的信息 2.只显示事务最近执行的一条语句。如文中推荐的做法,配合 general log 和 binlog 进行排查。
先看下日志信息:
有两个事务,对应的SQL,索引信息,锁模式,页面信息及对应的行号(你能找到记录),等待的锁:
UPDATE log SET flush=2 WHERE uid=457861679367545 and orderid=6840
UPDATE log SET flush=2 WHERE uid=457861679367545 and orderid=5392
where条件的 uid 相同,orderid不同。都是索引。
我们看下执行计划:explain
id: 1
select_type: SIMPLE
table: log
type: index_merge
possible_keys: idx_orderid,idx_uid
key: idx_orderid,idx_uid
key_len: 8,8
ref: NULL
rows: 1
Extra: Using intersect(idx_orderid,idx_uid); Using where
是两个索引的交集。后来我查库发现:通过索引复合条件有多条,就是说update锁住的记录数也就不是一条。
那么在并发的情况下,会产生这种情况。
事务1 针对orderid的索引加锁成功,等待uid对应的数据再加锁。
事务2针对uid 索引加锁成功,等待orderid。加锁顺序不同导致的互相等待产生死锁。
上面的死锁日志,可以看出:事务1 等待Record lock, heap no 462 PHYSICAL RECORD:
恰恰是事务2 持有的锁。
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1335 page no 1049191 n bits 576 index `idx_uid` of table `XXXXX`.`log` trx id 77124411351 lock_mode X locks rec but not gap
Record lock, heap no 462 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
至于事务2 所期待的锁:这里日志信息还是不足。应该结合binlog 来排查。
最后的日志是mysql 进行了牺牲,放弃了事务1 *** WE ROLL BACK TRANSACTION (1)
根据事务1 的SQL去查一下,的确没执行,所以说小伙伴的描述不一定准确,要实地验证才行。
**********************************************************************************
这里谈论MySQL的死锁检测,目前仅讨论InnoDB的处理。
当InnoDB事务尝试获取(请求)加一个锁,并且需要等待时,InnoDB会进行死锁检测.
正常的流程如下:
1.InnoDB的初始化一个事务,当事务尝试获取(请求)加一个锁,并且需要等待时(wait_lock),innodb会开始进行死锁检测(deadlock_mark)
2.进入到lock_deadlock_check_and_resolve ,名字很明显了,要检测死锁和解决死锁
3.检测死锁过程中,也是有计数器来进行限制的
4.死锁检测的逻辑之一是等待图的处理过程,如果通过锁的信息和事务等待链构造出一个图,如果图中出现回路,就认为发生了死锁。
5.死锁的回滚,内部代码的处理逻辑之一是比较undo的数量
阿里云有个英文版的,过程介绍更加详细:
https://topic.alibabacloud.com/a/talk-about-mysql-deadlock-one_1_41_30042817.html
我补充一点,整理自mysql 5.6.45版本。主要是lock_rec_lock 函数调用到死锁检测函数lock_deadlock_check_and_resolve。对于行数据的加锁是由函数 lock_rec_lock 完成,简单点来看,主要的参数是 mode(锁类型),block(包含该行的 buffer 数据页),heap_no(具体哪一行)。就可以确定加什么样的锁,以及在哪一行加。前面的何博士哪一篇介绍清楚了。
源码如下:
/*********************************************************************//**
Tries to lock the specified record in the mode requested. If not immediately
possible, enqueues a waiting lock request. This is a low-level function
which does NOT look at implicit locks! Checks lock compatibility within
explicit locks. This function sets a normal next-key lock, or in the case
of a page supremum record, a gap type lock.
@return DB_SUCCESS, DB_SUCCESS_LOCKED_REC, DB_LOCK_WAIT, DB_DEADLOCK,
or DB_QUE_THR_SUSPENDED */
static
dberr_t
lock_rec_lock(
/*==========*/
ibool impl, /*!< in: if TRUE, no lock is set
if no wait is necessary: we
assume that the caller will
set an implicit lock */
ulint mode, /*!< in: lock mode: LOCK_X or
LOCK_S possibly ORed to either
LOCK_GAP or LOCK_REC_NOT_GAP */
const buf_block_t* block, /*!< in: buffer block containing
the record */
ulint heap_no,/*!< in: heap number of record */
dict_index_t* index, /*!< in: index of record */
que_thr_t* thr) /*!< in: query thread */
{
ut_ad(lock_mutex_own());
ut_ad((LOCK_MODE_MASK & mode) != LOCK_S
|| lock_table_has(thr_get_trx(thr), index->table, LOCK_IS));
ut_ad((LOCK_MODE_MASK & mode) != LOCK_X
|| lock_table_has(thr_get_trx(thr), index->table, LOCK_IX));
ut_ad((LOCK_MODE_MASK & mode) == LOCK_S
|| (LOCK_MODE_MASK & mode) == LOCK_X);
ut_ad(mode - (LOCK_MODE_MASK & mode) == LOCK_GAP
|| mode - (LOCK_MODE_MASK & mode) == LOCK_REC_NOT_GAP
|| mode - (LOCK_MODE_MASK & mode) == 0);
ut_ad(dict_index_is_clust(index) || !dict_index_is_online_ddl(index));
/* We try a simplified and faster subroutine for the most
common cases */
switch (lock_rec_lock_fast(impl, mode, block, heap_no, index, thr)) {
case LOCK_REC_SUCCESS:
return(DB_SUCCESS);
case LOCK_REC_SUCCESS_CREATED:
return(DB_SUCCESS_LOCKED_REC);
case LOCK_REC_FAIL:
return(lock_rec_lock_slow(impl, mode, block,
heap_no, index, thr));
}
ut_error;
return(DB_ERROR);
}
/*********************************************************************//**
This is a fast routine for locking a record in the most common cases:
there are no explicit locks on the page, or there is just one lock, owned
by this transaction, and of the right type_mode. This is a low-level function
which does NOT look at implicit locks! Checks lock compatibility within
explicit locks. This function sets a normal next-key lock, or in the case of
a page supremum record, a gap type lock.
@return whether the locking succeeded */
UNIV_INLINE
enum lock_rec_req_status
lock_rec_lock_fast(
/*===============*/
ibool impl, /*!< in: if TRUE, no lock is set
if no wait is necessary: we
assume that the caller will
set an implicit lock */
ulint mode, /*!< in: lock mode: LOCK_X or
LOCK_S possibly ORed to either
LOCK_GAP or LOCK_REC_NOT_GAP */
const buf_block_t* block, /*!< in: buffer block containing
the record */
ulint heap_no,/*!< in: heap number of record */
dict_index_t* index, /*!< in: index of record */
que_thr_t* thr) /*!< in: query thread */
{
lock_t* lock;
trx_t* trx;
enum lock_rec_req_status status = LOCK_REC_SUCCESS;
ut_ad(lock_mutex_own());
ut_ad((LOCK_MODE_MASK & mode) != LOCK_S
|| lock_table_has(thr_get_trx(thr), index->table, LOCK_IS));
ut_ad((LOCK_MODE_MASK & mode) != LOCK_X
|| lock_table_has(thr_get_trx(thr), index->table, LOCK_IX));
ut_ad((LOCK_MODE_MASK & mode) == LOCK_S
|| (LOCK_MODE_MASK & mode) == LOCK_X);
ut_ad(mode - (LOCK_MODE_MASK & mode) == LOCK_GAP
|| mode - (LOCK_MODE_MASK & mode) == 0
|| mode - (LOCK_MODE_MASK & mode) == LOCK_REC_NOT_GAP);
ut_ad(dict_index_is_clust(index) || !dict_index_is_online_ddl(index));
DBUG_EXECUTE_IF("innodb_report_deadlock", return(LOCK_REC_FAIL););
lock = lock_rec_get_first_on_page(block);
trx = thr_get_trx(thr);
if (lock == NULL) {// 对应的没有锁
if (!impl) {//需要加锁
/* Note that we don't own the trx mutex. */
/*创建一个锁*/
lock = lock_rec_create(
mode, block, heap_no, index, trx, FALSE);
}
//返回成功
status = LOCK_REC_SUCCESS_CREATED;
} else {
trx_mutex_enter(trx);
if (lock_rec_get_next_on_page(lock) /*页上只有一个锁*/
|| lock->trx != trx /*拥有锁的事务不是当前事务*/
|| lock->type_mode != (mode | LOCK_REC) /* 已有锁和要加的锁模式是否相同 */
|| lock_rec_get_n_bits(lock) <= heap_no) { /* 已有锁的 n_bits 是否满足 heap_no */
status = LOCK_REC_FAIL;
} else if (!impl) {
/* If the nth bit of the record lock is already set
then we do not set a new lock bit, otherwise we do
set 如果之前没有设置,那么只需要设置一下 bitmap 就可以了*/
if (!lock_rec_get_nth_bit(lock, heap_no)) {
lock_rec_set_nth_bit(lock, heap_no);
status = LOCK_REC_SUCCESS_CREATED;
}
}
trx_mutex_exit(trx);
}
return(status);
}
/*********************************************************************//**
This is the general, and slower, routine for locking a record. This is a
low-level function which does NOT look at implicit locks! Checks lock
compatibility within explicit locks. This function sets a normal next-key
lock, or in the case of a page supremum record, a gap type lock.
@return DB_SUCCESS, DB_SUCCESS_LOCKED_REC, DB_LOCK_WAIT, DB_DEADLOCK,
or DB_QUE_THR_SUSPENDED */
static
dberr_t
lock_rec_lock_slow(
/*===============*/
ibool impl, /*!< in: if TRUE, no lock is set
if no wait is necessary: we
assume that the caller will
set an implicit lock */
ulint mode, /*!< in: lock mode: LOCK_X or
LOCK_S possibly ORed to either
LOCK_GAP or LOCK_REC_NOT_GAP */
const buf_block_t* block, /*!< in: buffer block containing
the record */
ulint heap_no,/*!< in: heap number of record */
dict_index_t* index, /*!< in: index of record */
que_thr_t* thr) /*!< in: query thread */
{
trx_t* trx;
dberr_t err = DB_SUCCESS;
ut_ad(lock_mutex_own());
ut_ad((LOCK_MODE_MASK & mode) != LOCK_S
|| lock_table_has(thr_get_trx(thr), index->table, LOCK_IS));
ut_ad((LOCK_MODE_MASK & mode) != LOCK_X
|| lock_table_has(thr_get_trx(thr), index->table, LOCK_IX));
ut_ad((LOCK_MODE_MASK & mode) == LOCK_S
|| (LOCK_MODE_MASK & mode) == LOCK_X);
ut_ad(mode - (LOCK_MODE_MASK & mode) == LOCK_GAP
|| mode - (LOCK_MODE_MASK & mode) == 0
|| mode - (LOCK_MODE_MASK & mode) == LOCK_REC_NOT_GAP);
ut_ad(dict_index_is_clust(index) || !dict_index_is_online_ddl(index));
DBUG_EXECUTE_IF("innodb_report_deadlock", return(DB_DEADLOCK););
trx = thr_get_trx(thr);
trx_mutex_enter(trx);
if (lock_rec_has_expl(mode, block, heap_no, trx)) {
/* The trx already has a strong enough lock on rec: do
nothing */
} else if (lock_rec_other_has_conflicting(
static_cast<enum lock_mode>(mode),
block, heap_no, trx)) {
/* If another transaction has a non-gap conflicting
request in the queue, as this transaction does not
have a lock strong enough already granted on the
record, we have to wait. */
err = lock_rec_enqueue_waiting(
mode, block, heap_no, index, thr);
} else if (!impl) {
/* Set the requested lock on the record, note that
we already own the transaction mutex. */
lock_rec_add_to_queue(
LOCK_REC | mode, block, heap_no, index, trx, TRUE);
err = DB_SUCCESS_LOCKED_REC;
}
trx_mutex_exit(trx);
return(err);
}
关于死锁检测的代码:
/********************************************************************//**
Checks if a joining lock request results in a deadlock. If a deadlock is
found this function will resolve the dadlock by choosing a victim transaction
and rolling it back. It will attempt to resolve all deadlocks. The returned
transaction id will be the joining transaction id or 0 if some other
transaction was chosen as a victim and rolled back or no deadlock found.
@return id of transaction chosen as victim or 0 */
static
trx_id_t
lock_deadlock_check_and_resolve(
/*============================*/
const lock_t* lock, /*!< in: lock the transaction is requesting */
const trx_t* trx) /*!< in: transaction */
{
trx_id_t victim_trx_id;
ut_ad(trx != NULL);
ut_ad(lock != NULL);
ut_ad(lock_mutex_own());
assert_trx_in_list(trx);
/* Try and resolve as many deadlocks as possible. */
do {
lock_deadlock_ctx_t ctx;
/* Reset the context. */
ctx.cost = 0;
ctx.depth = 0;
ctx.start = trx;
ctx.too_deep = FALSE;
ctx.wait_lock = lock;
ctx.mark_start = lock_mark_counter;
//死锁检测
victim_trx_id = lock_deadlock_search(&ctx);
/* Search too deep, we rollback the joining transaction. 如果死锁检测的深度太深,需要回滚当前的事务 */
if (ctx.too_deep) {
ut_a(trx == ctx.start);
ut_a(victim_trx_id == trx->id);
if (!srv_read_only_mode) {
lock_deadlock_joining_trx_print(trx, lock);
}
MONITOR_INC(MONITOR_DEADLOCK);
// 如果需要回滚的是其他事务,那么调用lock_deadlock_trx_rollback进行回
} else if (victim_trx_id != 0 && victim_trx_id != trx->id) {
ut_ad(victim_trx_id == ctx.wait_lock->trx->id);
lock_deadlock_trx_rollback(&ctx);
lock_deadlock_found = TRUE;
MONITOR_INC(MONITOR_DEADLOCK);
}
} while (victim_trx_id != 0 && victim_trx_id != trx->id);
/* If the joining transaction was selected as the victim. */
if (victim_trx_id != 0) {
ut_a(victim_trx_id == trx->id);
//打印回滚事务
lock_deadlock_fputs("*** WE ROLL BACK TRANSACTION (2)\n");
lock_deadlock_found = TRUE;
}
return(victim_trx_id);
}
四 方案
死锁的发生与加锁顺序关联密切。而加锁顺序又跟使用的索引密切相关。 innodb在RR级别下,容易产生gap锁的问题。
针对本文分析的索引导致并发死锁问题。
两种方案:1.程序加锁。外面拦住并发。key:uid+orderid.
2 ,调整索引结构,从两个独立索引改为联合索引,保证一致性。
还有个相关的事,尽量让dba归档数据。
每次看mysql都觉得博大精深。如果能让你有所收获,就值得码字。
这篇分析的也很好:
https://colinback.github.io/szyblogs/database/2018/05/20/innodb-kernel-8/
mysql 官网源码下载地址: