浅谈InnoDB的MVCC策略

最新推荐文章于 2024-01-02 09:03:42 发布

源de爸讲数据库

最新推荐文章于 2024-01-02 09:03:42 发布

阅读量425

点赞数

分类专栏： MySQL 文章标签： mysql

本文链接：https://blog.csdn.net/leonpenn/article/details/72730218

版权

MySQL 专栏收录该内容

24 篇文章 2 订阅

订阅专栏

InnoDB 是采用了多版本并发控制（MVCC）的一套存储引擎，它在表空间内一个被称为“rollback segment”的地方，记录了有变更的数据行的旧版本（修改前）信息，用来支持一些事务功能：例如一致性和回滚。InnoDB利用“rollback segment”里的这些信息实现了事务回滚过程中的undo操作；而在一致性读的过程中，这些信息也被用来构建行数据的修改前的值。

InnoDB is a multi-versioned storage engine: it keeps information about old versions of changed rows, to support transactional features such as concurrency and rollback. This information is stored in the tablespace in a data structure called a rollback segment (after an analogous data structure in Oracle). InnoDB uses the information in the rollback segment to perform the undo operations needed in a transaction rollback. It also uses the information to build earlier versions of a row for a consistent read.

MVCC在InnoDB的隔离级别为Repeatable Read或Read Commited时才生效。另外两个隔离级别不兼容是因为Read UnCommited要求存储引擎总是去读取最新的数据行（而不会根据当前的事务版本去选择性读取），而Serializable则要求事务串行化执行（完全没有并发的概念）。

InnoDB会自动为每个数据行增加3个字段：

1、DB_TRX_ID（6 byte）标示了最近对该数据行有insert或update操作的事务的transaction identifier。delete操作被视为update，内部会设定一个特殊的标志位用来标示删除状态。

2、DB_ROLL_PTR（7 byte）称为roll pointer。指向rollback segment内的undo log记录。一旦数据行有所更新，undo log 包含了重构修改前数据所需要的必要信息。

3、DB_ROW_ID（6 byte）包含了row ID，当有新数据行insert时单调递增。

Internally, InnoDB adds three fields to each row stored in the database. A 6-byte DB_TRX_ID field indicates the transaction identifier for the last transaction that inserted or updated the row. Also, a deletion is treated internally as an update where a special bit in the row is set to mark it as deleted. Each row also contains a 7-byte DB_ROLL_PTR field called the roll pointer. The roll pointer points to an undo log record written to the rollback segment. If the row was updated, the undo log record contains the information necessary to rebuild the content of the row before it was updated. A 6-byte DB_ROW_ID field contains a row ID that increases monotonically as new rows are inserted. If InnoDB generates a clustered index automatically, the index contains row ID values. Otherwise, the DB_ROW_ID column does not appear in any index.

由此，Repeatable Read隔离级别下，MVCC对增删改查的实现方法：

Select：
InnoDB只查找版本号小于或等于当前事务版本号的数据行。确保事务读取的行，要么是在事务开始前就存在的，要么是事务自身插入或者修改过的。
行的删除版本要么未定义，要么大于当前事务的版本。确保事务读取到的行，在事务开始之前未被删除。

只有符合上述两条的数据行才会被返回作为查询结果。

Insert：
InnoDB为新插入的行保存当前SVN作为行版本号。
Delete：
InnoDB为删除的行保存当前SVN作为行删除标识。
Update：
InnoDB对Update的操作会转义为Insert+Delete：插入一行新记录，保存当前SVN作为行版本号，同时保存当前SVN到原来的行作为行删除标识。

Tips：针对有行删除标识的数据行，InnoDB会在后台开启一个线程执行清理操作，将行删除版本号小于当前系统版本号的数据行删除。这就是1个purge的过程。

一致性非锁定读（Consistent Nonlocking Reads）是指InnoDB使用多版本并发控制的方式向query提供数据库在某个时间点的快照数据，提高了并发性。

在两种不同的隔离级别下，对于快照的定义有所不同。Repeatable Read：读取事务开始时的行数据版本；Read Commited：总是读取最新的快照数据版本。

A consistent read means that InnoDB uses multi-versioning to present to a query a snapshot of the database at a point in time.

The query sees the changes made by transactions that committed before that point of time, and no changes made by later or uncommitted transactions.

The exception to this rule is that the query sees the changes made by earlier statements within the same transaction.

This exception causes the following anomaly: If you update some rows in a table, a SELECT sees the latest version of the updated rows, but it might also see older versions of any rows. If other sessions simultaneously update the same table, the anomaly means that you might see the table in a state that never existed in the database.

注意，快照是在第一次select ... for update;之前读取所有已提交数据，创建read view 生成快照，而不是开启事务的时候。

参考资料：

（1）https://dev.mysql.com/doc/refman/5.5/en/innodb-multi-versioning.html

（2）《高性能MySQL》第1章

（3）http://www.web520.cn/archives/29979

（4）http://coding-geek.com/how-databases-work/

（5）https://segmentfault.com/a/1190000008459057

源de爸讲数据库

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
浅谈InnoDB的MVCC策略

关于InnoDB的并发控制InnoDB 存储引擎默认隔离级别为可重复读（Repeatable Read），该隔离级别下加行锁采用的是next-key locking 策略。关于 next-key locking 策略InnoDB支持行锁（锁定字段含有索引的情况下，否则走表锁），但锁定方式并非简单的锁定指定行上的索引，而是分为3种锁定算法：1）记录锁（Record L
复制链接

扫一扫

专栏目录