MySQL45讲读书笔记 40讲insert语句的锁为什么这么多

一 序

  本文属于极客时间MySQL45讲读书笔记系列,本文老师主要讲了特殊的insert 情况。

MySQL对自增主键锁做了优化,尽量在申请到自增id以后,就释放自增锁。还有些insert语句是属于“特殊情况”的,在执行过程中需要给其他资源加锁,或者无法在申请到自增id以后就立马释放自增锁。

二 insert … select 语句

我们先从昨天的问题说起吧。表t和t2的表结构、初始化数据语句如下,今天的例子我们还是针对这两个表展开。

CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `c` (`c`)
) ENGINE=InnoDB;

insert into t values(null, 1,1);
insert into t values(null, 2,2);
insert into t values(null, 3,3);
insert into t values(null, 4,4);

create table t2 like t

现在,我们一起来看看为什么在可重复读隔离级别下,binlog_format=statement时执行:

insert into t2(c,d) select c,d from t;

这个语句时,需要对表t的所有行和间隙加锁呢?

其实,这个问题我们需要考虑的还是日志和数据的一致性。我们看下这个执行序列:

图1 并发insert场景

实际的执行效果是,如果session B先执行,由于这个语句对表t主键索引加了(-∞,1]这个next-key lock,会在语句执行完成后,才允许session A的insert语句执行。

但如果没有锁的话,就可能出现session B的insert语句先执行,但是后写入binlog的情况。于是,在binlog_format=statement的情况下,binlog里面就记录了这样的语句序列:

insert into t values(-1,-1,-1);
insert into t2(c,d) select c,d from t;

这个语句到了备库执行,就会把id=-1这一行也写到表t2中,出现主备不一致。

insert 循环写入

当然了,执行insert … select 的时候,对目标表也不是锁全表,而是只锁住需要访问的资源。

如果现在有这么一个需求:要往表t2中插入一行数据,这一行的c值是表t中c值的最大值加1。

此时,我们可以这么写这条SQL语句 :

insert into t2(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);

这个语句的加锁范围,就是表t索引c上的(4,supremum]这个next-key lock和主键索引上id=4这一行。

它的执行流程也比较简单,从表t中按照索引c倒序,扫描第一行,拿到结果写入到表t2中。

因此整条语句的扫描行数是1。

这个语句执行的慢查询日志(slow log),如下图所示:

图2 慢查询日志--将数据插入表t2

通过这个慢查询日志,我们看到Rows_examined=1,正好验证了执行这条语句的扫描行数为1。

那么,如果我们是要把这样的一行数据插入到表t中的话:

insert into t(c,d)  (select c+1, d from t force index(c) order by c desc limit 1);

语句的执行流程是怎样的?扫描行数又是多少呢?

这时候,我们再看慢查询日志就会发现不对了。

图3 慢查询日志--将数据插入表t

可以看到,这时候的Rows_examined的值是5。

我在前面的文章中提到过,希望你都能够学会用explain的结果来“脑补”整条语句的执行过程。今天,我们就来一起试试。

如图4所示就是这条语句的explain结果。

图4 explain结果

从Extra字段可以看到“Using temporary”字样,表示这个语句用到了临时表。也就是说,执行过程中,需要把表t的内容读出来,写入临时表。

图中rows显示的是1,我们不妨先对这个语句的执行流程做一个猜测:如果说是把子查询的结果读出来(扫描1行),写入临时表,然后再从临时表读出来(扫描1行),写回表t中。那么,这个语句的扫描行数就应该是2,而不是5。

所以,这个猜测不对。实际上,Explain结果里的rows=1是因为受到了limit 1 的影响。

从另一个角度考虑的话,我们可以看看InnoDB扫描了多少行。如图5所示,是在执行这个语句前后查看Innodb_rows_read的结果。

图5 查看 Innodb_rows_read变化

可以看到,这个语句执行前后,Innodb_rows_read的值增加了4。因为默认临时表是使用Memory引擎的,所以这4行查的都是表t,也就是说对表t做了全表扫描。

这样,我们就把整个执行过程理清楚了:

  1. 创建临时表,表里有两个字段c和d。

  2. 按照索引c扫描表t,依次取c=4、3、2、1,然后回表,读到c和d的值写入临时表。这时,Rows_examined=4。

  3. 由于语义里面有limit 1,所以只取了临时表的第一行,再插入到表t中。这时,Rows_examined的值加1,变成了5。

也就是说,这个语句会导致在表t上做全表扫描,并且会给索引c上的所有间隙都加上共享的next-key lock。所以,这个语句执行期间,其他事务不能在这个表上插入数据。

至于这个语句的执行为什么需要临时表,原因是这类一边遍历数据,一边更新数据的情况,如果读出来的数据直接写回原表,就可能在遍历过程中,读到刚刚插入的记录,新插入的记录如果参与计算逻辑,就跟语义不符。

由于实现上这个语句没有在子查询中就直接使用limit 1,从而导致了这个语句的执行需要遍历整个表t。它的优化方法也比较简单,就是用前面介绍的方法,先insert into到临时表temp_t,这样就只需要扫描一行;然后再从表temp_t里面取出这行数据插入表t1。

当然,由于这个语句涉及的数据量很小,你可以考虑使用内存临时表来做这个优化。使用内存临时表优化时,语句序列的写法如下:

create temporary table temp_t(c int,d int) engine=memory;
insert into temp_t  (select c+1, d from t force index(c) order by c desc limit 1);
insert into t select * from temp_t;
drop table temp_t;

insert 唯一键冲突

前面的两个例子是使用insert … select的情况,接下来我要介绍的这个例子就是最常见的insert语句出现唯一键冲突的情况。

对于有唯一键的表,插入数据时出现唯一键冲突也是常见的情况了。我先给你举一个简单的唯一键冲突的例子。

图6 唯一键冲突加锁

这个例子也是在可重复读(repeatable read)隔离级别下执行的。可以看到,session B要执行的insert语句进入了锁等待状态。

也就是说,session A执行的insert语句,发生主键冲突的时候,并不只是简单地报错返回,还在冲突的索引上加了锁。我们前面说过,一个next-key lock就是由它右边界的值定义的。这时候,session A持有索引c上的(5,10]共享next-key lock(读锁)。

至于为什么要加这个读锁,其实我也没有找到合理的解释。从作用上来看,这样做可以避免这一行被别的事务删掉。

这里,我就先和你分享一个经典的死锁场景。

图7 唯一键冲突--死锁

在session A执行rollback语句回滚的时候,session C几乎同时发现死锁并返回。

这个死锁产生的逻辑是这样的:

  1. 在T1时刻,启动session A,并执行insert语句,此时在索引c的c=5上加了记录锁。注意,这个索引是唯一索引,因此退化为记录锁(可以回顾下之前介绍的加锁规则)。

  2. 在T2时刻,session B要执行相同的insert语句,发现了唯一键冲突,加上读锁;同样地,session C也在索引c上,c=5这一个记录上,加了读锁。

  3. T3时刻,session A回滚。这时候,session B和session C都试图继续执行插入操作,都要加上写锁。两个session都要等待对方的行锁,所以就出现了死锁。

这个流程的状态变化图如下所示。

图8 状态变化图--死锁

insert into … on duplicate key update

上面这个例子是主键冲突后直接报错,如果是改写成

insert into t values(11,10,10) on duplicate key update d=100; 

的话,就会给索引c上(5,10] 加一个排他的next-key lock(写锁)。

insert into … on duplicate key update 这个语义的逻辑是,插入一行数据,如果碰到唯一键约束,就执行后面的更新语句。

小结

今天这篇文章,我和你介绍了几种特殊情况下的insert语句。

insert … select 是很常见的在两个表之间拷贝数据的方法。你需要注意,在可重复读隔离级别下,这个语句会给select的表里扫描到的记录和间隙加读锁。

而如果insert和select的对象是同一个表,则有可能会造成循环写入。这种情况下,我们需要引入用户临时表来做优化。

insert 语句如果出现唯一键冲突,会在冲突的唯一值上加共享的next-key lock(S锁)。因此,碰到由于唯一键约束导致报错后,要尽快提交或回滚事务,避免加锁时间过长。

老师留个问题:关于两个表复制数据。

这里有个坑:insert into ...from table where 一定要加索引,否则就是锁全表影响业务了。

还有真不推荐这个,不太可控。老老实实的找个低峰期,单线程的跑吧,批量操作,比如一次2000条。

关于加锁的差异,MySQL官网做了介绍:

14.7.3 Locks Set by Different SQL Statements in InnoDB

locking read, an UPDATE, or a DELETE generally set record locks on every index record that is scanned in the processing of the SQL statement. It does not matter whether there are WHERE conditions in the statement that would exclude the row. InnoDB does not remember the exact WHERE condition, but only knows which index ranges were scanned. The locks are normally next-key locks that also block inserts into the “gap” immediately before the record. However, gap locking can be disabled explicitly, which causes next-key locking not to be used. For more information, see Section 14.7.1, “InnoDB Locking”. The transaction isolation level also can affect which locks are set; see Section 14.7.2.1, “Transaction Isolation Levels”.

If a secondary index is used in a search and index record locks to be set are exclusive, InnoDB also retrieves the corresponding clustered index records and sets locks on them.

If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not unnecessarily scan many rows.

InnoDB sets specific types of locks as follows.

  • SELECT ... FROM is a consistent read, reading a snapshot of the database and setting no locks unless the transaction isolation level is set to SERIALIZABLE. For SERIALIZABLE level, the search sets shared next-key locks on the index records it encounters. However, only an index record lock is required for statements that lock rows using a unique index to search for a unique row.

  • For SELECT ... FOR UPDATE or SELECT ... LOCK IN SHARE MODE, locks are acquired for scanned rows, and expected to be released for rows that do not qualify for inclusion in the result set (for example, if they do not meet the criteria given in the WHERE clause). However, in some cases, rows might not be unlocked immediately because the relationship between a result row and its original source is lost during query execution. For example, in a UNION, scanned (and locked) rows from a table might be inserted into a temporary table before evaluation whether they qualify for the result set. In this circumstance, the relationship of the rows in the temporary table to the rows in the original table is lost and the latter rows are not unlocked until the end of query execution.

  • SELECT ... LOCK IN SHARE MODE sets shared next-key locks on all index records the search encounters. However, only an index record lock is required for statements that lock rows using a unique index to search for a unique row.

  • SELECT ... FOR UPDATE sets an exclusive next-key lock on every record the search encounters. However, only an index record lock is required for statements that lock rows using a unique index to search for a unique row.

    For index records the search encounters, SELECT ... FOR UPDATE blocks other sessions from doing SELECT ... LOCK IN SHARE MODE or from reading in certain transaction isolation levels. Consistent reads ignore any locks set on the records that exist in the read view.

  • UPDATE ... WHERE ... sets an exclusive next-key lock on every record the search encounters. However, only an index record lock is required for statements that lock rows using a unique index to search for a unique row.

  • When UPDATE modifies a clustered index record, implicit locks are taken on affected secondary index records. The UPDATE operation also takes shared locks on affected secondary index records when performing duplicate check scans prior to inserting new secondary index records, and when inserting new secondary index records.

  • DELETE FROM ... WHERE ... sets an exclusive next-key lock on every record the search encounters. However, only an index record lock is required for statements that lock rows using a unique index to search for a unique row.

  • INSERT sets an exclusive lock on the inserted row. This lock is an index-record lock, not a next-key lock (that is, there is no gap lock) and does not prevent other sessions from inserting into the gap before the inserted row.

    Prior to inserting the row, a type of gap lock called an insert intention gap lock is set. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap. Suppose that there are index records with values of 4 and 7. Separate transactions that attempt to insert values of 5 and 6 each lock the gap between 4 and 7 with insert intention locks prior to obtaining the exclusive lock on the inserted row, but do not block each other because the rows are nonconflicting.

    If a duplicate-key error occurs, a shared lock on the duplicate index record is set. This use of a shared lock can result in deadlock should there be multiple sessions trying to insert the same row if another session already has an exclusive lock. This can occur if another session deletes the row. Suppose that an InnoDB table t1 has the following structure:

    CREATE TABLE t1 (i INT, PRIMARY KEY (i)) ENGINE = InnoDB;

    Now suppose that three sessions perform the following operations in order:

    Session 1:

    START TRANSACTION;
    INSERT INTO t1 VALUES(1);

    Session 2:

    START TRANSACTION;
    INSERT INTO t1 VALUES(1);

    Session 3:

    START TRANSACTION;
    INSERT INTO t1 VALUES(1);

    Session 1:

    ROLLBACK;

    The first operation by session 1 acquires an exclusive lock for the row. The operations by sessions 2 and 3 both result in a duplicate-key error and they both request a shared lock for the row. When session 1 rolls back, it releases its exclusive lock on the row and the queued shared lock requests for sessions 2 and 3 are granted. At this point, sessions 2 and 3 deadlock: Neither can acquire an exclusive lock for the row because of the shared lock held by the other.

    A similar situation occurs if the table already contains a row with key value 1 and three sessions perform the following operations in order:

    Session 1:

    START TRANSACTION;
    DELETE FROM t1 WHERE i = 1;

    Session 2:

    START TRANSACTION;
    INSERT INTO t1 VALUES(1);

    Session 3:

    START TRANSACTION;
    INSERT INTO t1 VALUES(1);

    Session 1:

    COMMIT;

    The first operation by session 1 acquires an exclusive lock for the row. The operations by sessions 2 and 3 both result in a duplicate-key error and they both request a shared lock for the row. When session 1 commits, it releases its exclusive lock on the row and the queued shared lock requests for sessions 2 and 3 are granted. At this point, sessions 2 and 3 deadlock: Neither can acquire an exclusive lock for the row because of the shared lock held by the other.

  • INSERT ... ON DUPLICATE KEY UPDATE differs from a simple INSERT in that an exclusive lock rather than a shared lock is placed on the row to be updated when a duplicate-key error occurs. An exclusive index-record lock is taken for a duplicate primary key value. An exclusive next-key lock is taken for a duplicate unique key value.

  • REPLACE is done like an INSERT if there is no collision on a unique key. Otherwise, an exclusive next-key lock is placed on the row to be replaced.

  • INSERT INTO T SELECT ... FROM S WHERE ... sets an exclusive index record lock (without a gap lock) on each row inserted into T. If the transaction isolation level is READ COMMITTED, or innodb_locks_unsafe_for_binlog is enabled and the transaction isolation level is not SERIALIZABLEInnoDB does the search on S as a consistent read (no locks). Otherwise, InnoDB sets shared next-key locks on rows from SInnoDB has to set locks in the latter case: During roll-forward recovery using a statement-based binary log, every SQL statement must be executed in exactly the same way it was done originally.

    CREATE TABLE ... SELECT ... performs the SELECT with shared next-key locks or as a consistent read, as for INSERT ... SELECT.

    When a SELECT is used in the constructs REPLACE INTO t SELECT ... FROM s WHERE ... or UPDATE t ... WHERE col IN (SELECT ... FROM s ...)InnoDB sets shared next-key locks on rows from table s.

  • InnoDB sets an exclusive lock on the end of the index associated with the AUTO_INCREMENT column while initializing a previously specified AUTO_INCREMENT column on a table.

    With innodb_autoinc_lock_mode=0InnoDB uses a special AUTO-INC table lock mode where the lock is obtained and held to the end of the current SQL statement (not to the end of the entire transaction) while accessing the auto-increment counter. Other clients cannot insert into the table while the AUTO-INC table lock is held. The same behavior occurs for “bulk inserts” with innodb_autoinc_lock_mode=1. Table-level AUTO-INC locks are not used with innodb_autoinc_lock_mode=2. For more information, See Section 14.6.1.6, “AUTO_INCREMENT Handling in InnoDB”.

    InnoDB fetches the value of a previously initialized AUTO_INCREMENT column without setting any locks.

  • If a FOREIGN KEY constraint is defined on a table, any insert, update, or delete that requires the constraint condition to be checked sets shared record-level locks on the records that it looks at to check the constraint. InnoDB also sets these locks in the case where the constraint fails.

  • LOCK TABLES sets table locks, but it is the higher MySQL layer above the InnoDB layer that sets these locks. InnoDB is aware of table locks if innodb_table_locks = 1 (the default) and autocommit = 0, and the MySQL layer above InnoDB knows about row-level locks.

    Otherwise, InnoDB's automatic deadlock detection cannot detect deadlocks where such table locks are involved. Also, because in this case the higher MySQL layer does not know about row-level locks, it is possible to get a table lock on a table where another session currently has row-level locks. However, this does not endanger transaction integrity, as discussed in Section 14.7.5.2, “Deadlock Detection and Rollback”.

  • LOCK TABLES acquires two locks on each table if innodb_table_locks=1 (the default). In addition to a table lock on the MySQL layer, it also acquires an InnoDB table lock. Versions of MySQL before 4.1.2 did not acquire InnoDB table locks; the old behavior can be selected by setting innodb_table_locks=0. If no InnoDB table lock is acquired, LOCK TABLES completes even if some records of the tables are being locked by other transactions.

    In MySQL 5.7, innodb_table_locks=0 has no effect for tables locked explicitly with LOCK TABLES ... WRITE. It does have an effect for tables locked for read or write by LOCK TABLES ... WRITE implicitly (for example, through triggers) or by LOCK TABLES ... READ.

  • All InnoDB locks held by a transaction are released when the transaction is committed or aborted. Thus, it does not make much sense to invoke LOCK TABLES on InnoDB tables in autocommit=1 mode because the acquired InnoDB table locks would be released immediately.

  • You cannot lock additional tables in the middle of a transaction because LOCK TABLES performs an implicit COMMIT and UNLOCK TABLES.

关于insert的介绍,老师认为上面官方不准的有bug。"INSERT ... ON DUPLICATE KEY UPDATE differs from a simple INSERT in that an exclusive lock rather than a shared lock is placed on the row to be updated when a duplicate-key error occurs. An exclusive index-record lock is taken for a duplicate primary key value. An exclusive next-key lock is taken for a duplicate unique key value."认为如果冲突的是主键索引,就加记录锁,唯一索引才加next-key lock。但实际上,这两类索引冲突加的都是next-key lock。

   这里我们可以看看insert的出现死锁的例子。session1 ROLLBACK,释放锁。session 2 获得S锁,申请插入意向X锁,等待session3.      session 3 获得S锁,申请插入意向X锁,等待session2. 因为插入意向X锁和S锁是不相容的。这也是一种典型的锁升级导致的死锁。如果session1执行COMMIT的话,则另外两个线程都会因为duplicate key失败。

   我不明白,就是为啥还要申请插入意向锁,找了淘宝的数据库月报,对此有解释:

因为ROLLBACK时原记录回滚时是被标记删除的。而我们尝试插入的记录和这个标记删除的记录是相邻的(键值相同),根据插入意向锁的规则,插入位置的下一条记录上如果存在与插入意向X锁冲突的锁时,则需要获取插入意向X锁。

参考:

https://dev.mysql.com/doc/refman/5.7/en/innodb-locks-set.html

http://mysql.taobao.org/monthly/2016/01/01/ 

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值