数据库实现不同隔离级别的方法

neo4j是怎么实现read-committed的?

用了MVCC的snapshot。实现细节不明,但是可以确定的是代码里出现了和timestamp类似的TransactionId。

关系型数据库(relational database)是怎么实现repeatable read的?外键(foreign key)在复制snapshot的时候是怎么处理的?

Concurrency control mechanisms
The main categories of concurrency control mechanisms are:

  • Optimistic - Delay the checking of whether a transaction meets the isolation and other integrity rules (e.g., serializability and recoverability) until its end, without blocking any of its (read, write) operations (“…and be optimistic about the rules being met…”), and then abort a transaction to prevent the violation, if the desired rules are to be violated upon its commit. An aborted transaction is immediately restarted and re-executed, which incurs an obvious overhead (versus executing it to the end only once). If not too many transactions are aborted, then being optimistic is usually a good strategy.
  • Pessimistic - Block an operation of a transaction, if it may cause violation of the rules, until the possibility of violation disappears. Blocking operations is typically involved with performance reduction.
  • Semi-optimistic - Block operations in some situations, if they may cause violation of some rules, and do not block in other situations while delaying rules checking (if needed) to transaction’s end, as done with optimistic.
可能会用到的关键技术:SnapShot Isolation; MVCC

When an MVCC database needs to update a piece of data, it will not overwrite the original data item with new data, but instead creates a newer version of the data item. Thus there are multiple versions stored. The version that each transaction sees depends on the isolation level implemented. The most common isolation level implemented with MVCC is snapshot isolation. With snapshot isolation, a transaction observes a state of the data as of when the transaction started.

MVCC introduces the challenge of how to remove versions that become obsolete and will never be read. Some databases split the storage blocks into two parts: the data part and an undo log. The data part always keeps the last committed version. The undo log enables the recreation of older versions of data. The main inherent limitation of this approach is that when there are update-intensive workloads, the undo log part runs out of space and then transactions are aborted as they cannot be given their snapshot

Implementation of MVCC:

MVCC uses timestamps (TS), and incrementing transaction IDs, to achieve transactional consistency. MVCC ensures **a transaction (T) never has to wait to Read a database object § ** by maintaining several versions of the object. Each version of object P has both a Read Timestamp (RTS) and a Write Timestamp (WTS) which lets a particular transaction Ti read the most recent version of the object which precedes the transaction’s Read Timestamp RTS(Ti).

If transaction Ti wants to Write to object P, and there is also another transaction Tk happening to the same object, the Read Timestamp RTS(Ti) must precede the Read Timestamp RTS(Tk), i.e., RTS(Ti) < RTS(Tk), for the object Write Operation (WTS) to succeed. A Write cannot complete if there are other outstanding transactions with an earlier Read Timestamp (RTS) to the same object. Like standing in line at the store, you cannot complete your checkout transaction until those in front of you have completed theirs.

To restate; every object § has a Timestamp (TS), however if transaction Ti wants to Write to an object, and the transaction has a Timestamp (TS) that is earlier than the object’s current Read Timestamp, TS(Ti) < RTS§, then the transaction is aborted and restarted. (This is because a later transaction already depends on the old value.) Otherwise, Ti creates a new version of object P and sets the read/write timestamp TS of the new version to the timestamp of the transaction TS ← TS(Ti).

这里的意思是做事情要排队。数据自己有一个时间标记,事物写它时,事物的时间标记需要晚于数据的时间标记。感觉这里细节讲的不太够啊?

snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed at the time it started), and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made(就是和其他已经提交的update没有冲突。不需要理会没有提交的那些update) since that snapshot.

MVCC中文教程:
数据库中同时存在多个版本的数据,并不是整个数据库的多个版本,而是某一条记录的多个版本同时存在,在某个事务对其进行操作的时候,需要查看这一条记录的隐藏列事务版本id,比对事务id并根据事物隔离级别去判断读取哪个版本的数据。

事务每次开启前,都会从数据库获得一个自增长的事务ID (TransactionId),可以从事务ID判断事务的执行先后顺序。这就是事务版本号。

roll_pointer: 指向回滚段的undo日志。
undo log,回滚日志,用于记录数据被修改前的信息在表记录修改之前,会先把数据拷贝到undo log里,如果事务回滚,即可以通过undo log来还原数据。

多个事务并行操作某一行数据时,不同事务对该行数据的修改会产生多个版本,然后通过回滚指针(roll_pointer),连成一个链表,这个链表就称为版本链。

Read View是事务执行SQL语句时,产生的读视图。它主要是用来做可见性判断的,即判断当前事务可见哪个版本的数据。

  • m_ids:当前系统中那些活跃(未提交)的读写事务ID, 它数据结构为一个List。
  • min_limit_id:表示在生成ReadView时,当前系统中活跃的读写事务中最小的事务id,即m_ids中的最小值。
  • max_limit_id:表示生成ReadView时,系统中应该分配给下一个事务的id值。
  • creator_trx_id: 创建当前read view的事务ID

transactional control的经典书籍

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值