目录
几点关键的背景
正确的B+树并发控制机制需要满足以下几点要求:
正确的读操作:
- R.1 不会读到一个处于中间状态的键值对:读操作访问中的键值对正在被另一个写操作修改
- R.2 不会找不到一个存在的键值对:读操作正在访问某个树节点,这个树节点上的键值对同时被另一个写操作(分裂/合并操作)移动到另一个树节点,导致读操作没有找到目标键值对
正确的写操作:
- W.1 两个写操作不会同时修改同一个键值对
无死锁:
- D.1 不会出现死锁:两个或多个线程发生永久堵塞(等待),每个线程都在等待被其他线程占用并堵塞了的资源
一些名词
- SL (Shared Lock): 共享锁 — 加锁
- SU (Shared Unlock): 共享锁 — 解锁
- XL (Exclusive Lock): 互斥锁 — 加锁
- XU (Exclusive Unlock): 互斥锁 — 解锁
- SXL (Shared Exclusive Lock): 共享互斥锁 — 加锁
- SXU (Shared Exclusive Unlock): 共享互斥锁 — 解锁
- R.1/R.2/W.1/D.1: 并发机制需要满足的正确性要求
- safe nodes:判断依据为该节点上的当前操作是否会影响祖先节点。以传统B+树为例:(1) 对于插入操作,当键值对的数量小于M时,插入操作不会触发分裂操作,该节点属于safe node;反之当键值对数量等于M时,该节点属于unsafe node;(2)对于删除操作,当键值对的数量大于M/2时,不会触发合并操作,该节点属于safe node;反之当键值对数量等于M/2时,该节点属于unsafe node。当然,对于MySQL而言,一个节点是否是安全节点取决于键值对的大小和页面剩余空间大小等多个因素,详细代码可查询MySQL5.7的btr_cur_will_modify_tree()函数。
| S|SX| X|
–+–+–+–+
S| o| o| x|
–+–+–+–+
SX| o| x| x|
–+–+–+–+
X| x| x| x|
–+–+–+–+
.
S锁和X锁与之前的逻辑相同,没有做变动
SX与SX和X互斥,与S共享,内部定义为RW_SX_LATCH,根据描述,在加上SX锁之后,不会影响读操作,但阻塞写操作
SMO过程
MySQL5.6 SMO分析
MySQL 5.6 SMO代码分析
row_ins_clust_index_entry
| | ==> //乐观插入
| | ==> err = row_ins_clust_index_entry_low(0, BTR_MODIFY_LEAF, index, n_uniq, entry, n_ext, thr);
| | ==> mtr_start(&mtr);
| |
| |
| | ==> //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE, mode, &cursor, 0, __FILE__, __LINE__, &mtr);
| | ==> //加Index锁
| | ==> mtr_x_lock(dict_index_get_lock(index), mtr);
| | ==> mtr_s_lock(dict_index_get_lock(index), mtr);
| | ==> //获取根节点的逻辑地址
| | ==> space = dict_index_get_space(index);
| | ==> page_no = dict_index_get_page(index);
| | ==> //非叶子节点,搜索时不加锁,叶子节点搜索时加s\x锁
| | ==> if (height != 0) {rw_latch = RW_NO_LATCH;}
| | ==> else{rw_latch = latch_mode;}
| | ==>
| | ==> block = buf_page_get_gen(space, zip_size, page_no, rw_latch, guess, buf_mode, file, line, mtr);
| | ==> if(height == ULINT_UNDEFINED) 确定树高
| | ==> // if(height == 0 && 乐观插入),释放index lock
| | ==> mtr_release_s_latch_at_savepoint(dict_index_get_lock(index))
| | ==> //page内的搜索,http://blog.itpub.net/7728585/viewspace-2144744/
| | ==> page_cur_search_with_match
| | ==> /* Go to the child node */
| | ==> page_no = btr_node_ptr_get_child_page_no(node_ptr, offsets);
| | ==> //乐观插入
| | ==> err = btr_cur_optimistic_insert(flags, &cursor, &offsets, &offsets_heap, entry, &insert_rec, &big_rec, n_ext, thr, &mtr)
| | ==> //判断空间是否足够
| | ==> dict_index_get_space_reserve
| | ==> mtr_commit(&mtr);
| | ==> //悲观插入
| | ==> if (err == DB_FAIL)row_ins_clust_index_entry_low(0, BTR_MODIFY_TREE, index, n_uniq, entry, n_ext, thr)
| | ==> mtr_start(&mtr);
| |
| |
| | ==> //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE, mode, &cursor, 0, __FILE__, __LINE__, &mtr);
| | ==> //尝试乐观插入
| | ==> err = btr_cur_optimistic_insert(flags, &cursor, &offsets, &offsets_heap,
entry, &insert_rec, &big_rec, n_ext, thr, &mtr);
| | ==> //悲观插入
| | ==> err = btr_cur_pessimistic_insert(flags, &cursor, &offsets, &offsets_heap,
entry, &insert_rec, &big_rec, n_ext, thr, &mtr);
| | ==> btr_page_split_and_insert(flags, cursor, offsets, heap, entry, n_ext, mtr)
| | ==> //修改树结构
| | ==> btr_attach_half_pages
| | ==> btr_insert_on_non_leaf_level(flags, index, level + 1, node_ptr_upper, mtr)
| | ==> 这里是修改父节点,可能会出现进一步的SMO
| | ==> btr_insert_on_non_leaf_level_func
| | ==> //释放index lock
| | ==> mtr_memo_release(mtr, dict_index_get_lock(cursor->index), MTR_MEMO_X_LOCK);
| | ==> //移动数据到新Page
| | ==> page_zip_copy_recs
| | ==> page_delete_rec_list_start
| | ==> //从老Page删除数据
| | ==> page_delete_rec_list_end
MySQL 5.6 SMO过程分析
乐观插入:
- 加index s 锁(路径不加锁)
- 叶子节点加X锁
- 释放index Lock
- mod leaf
- 释放叶子节点x锁
悲观插入:
- 加index x 锁(路径不加锁)
- 叶子节点加X锁
- mod tree
- 释放index Lock
- mod leaf
- 释放叶子节点x锁
MySQL5.7 SMO分析
MySQL 5.7 SMO代码分析
row_ins_clust_index_entry
| | ==> //乐观插入
| | ==> err = row_ins_clust_index_entry_low(0, BTR_MODIFY_LEAF, index, n_uniq, entry, n_ext, thr);
| | ==> mtr_start(&mtr);
| |
| |
| | ==> //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_pcur_open(index, entry, PAGE_CUR_LE, mode, &pcur, &mtr);
| | ==> btr_pcur_open_low(dict_index_t* index, /*!< in: index */
ulint level, /*!< in: level in the btree */ (here is 0)
const dtuple_t* tuple, /*!< in: tuple on which search done */
page_cur_mode_t mode, /*!< in: PAGE_CUR_L, ...; */
ulint latch_mode,/*!< in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /*!< in: memory buffer for persistent cursor */
const char* file, /*!< in: file name */
ulint line, /*!< in: line where called */
mtr_t* mtr);
| | ==> //1300行的大函数 -_-!, //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_cur_search_to_nth_level(index, level, tuple, mode, latch_mode, btr_cursor, 0, file, line, mtr);
| | ==> //Store the position of the tree latch we push to mtr in memo stack for locks,
| | ==> //so that we know how to release it when we have latched leaf node(s)
| | ==> savepoint = mtr_set_savepoint(mtr);
| | ==> //如果是BTR_MODIFY_TREE,大概率加sx锁,其他场景加s锁:https://yq.aliyun.com/articles/41087
| | ==> mtr_sx_lock(dict_index_get_lock(index), mtr);
| | ==> upper_rw_latch = RW_X_LATCH;
| | ==> mtr_s_lock(dict_index_get_lock(index), mtr);
| | ==> upper_rw_latch = RW_S_LATCH;
| | ==> //获取根节点的加锁姿势
| | ==> root_leaf_rw_latch = btr_cur_latch_for_root_leaf(latch_mode);
| | ==> //获取根节点的逻辑地址
| | ==> page_id_t page_id(space, dict_index_get_page(index));
| | ==> Loop:
| | ==> rw_latch = RW_NO_LATCH;
| | ==> //如果是路径节点或根节点,在(乐观插入) or (目标是非叶子节点、且到达了该层)的场景下
| | ==> //对于root节点:rw_latch = RW_SX_LATCH; 对于其他节点rw_latch = upper_rw_latch;
| | ==> //如果是叶子节点,在乐观插入场景下,rw_latch = latch_mode
| | ==> tree_savepoints[n_blocks] = mtr_set_savepoint(mtr);
| | ==> //对于悲观插入,将page读入bp时,如果page还未到达目标层,读入时不加LATCH
| | ==> block = buf_page_get_gen(page_id, page_size, rw_latch, guess, buf_mode, file, line, mtr);
| | ==> //读入的block维持在数组tree_blocks中
| | ==> tree_blocks[n_blocks] = block;
| | ==> //Btree只有一个Root节点时,有单独处理逻辑
| | ==> upper_rw_latch = root_leaf_rw_latch;
| | ==> //一些加锁之前的类似死锁判断
| | ==> buf_block_dbg_add_level
| | ==> //在root节点这层获取树高
| | ==> height = btr_page_get_level(page, mtr);
| | ==> //如果是叶子节点,且是悲观插入,对当前page,及当前page的左右节点加上X LATCH
| | ==> btr_cur_latch_leaves
| | ==> //如果是叶子节点,且是乐观插入,释放索引锁和路径上的S锁
| | ==> mtr_release_s_latch_at_savepoint(dict_index_get_lock(index))
| | ==> mtr_release_block_at_savepoint
| | ==> //page内查找
| | ==> page_cur_search_with_match_bytes
| | ==> //如果还未到达目标层
| | ==> if (level != height) {
| | ==> height--;
| | ==> //判断当前Page是否会发送结构改变
| | ==> btr_cur_will_modify_tree
| | ==> //如果当前Page是安全的,释放之前的Lock
| | ==> mtr_release_block_at_savepoint
| | ==> //如果下一次就是目标层,对路径上Unsafe节点加X锁
| | ==> mtr_block_x_latch_at_savepoint
| | ==> /* Go to the child node */
| | ==> page_id.reset(space, btr_node_ptr_get_child_page_no(node_ptr, offsets));
| | ==> n_blocks++;
| | ==> goto loop
| | ==> }
| | ==> //乐观插入
| | ==> err = btr_cur_optimistic_insert(flags, &cursor, &offsets, &offsets_heap, entry, &insert_rec, &big_rec, n_ext, thr, &mtr)
| | ==> //判断空间是否足够
| | ==> dict_index_get_space_reserve
| | ==> mtr_commit(&mtr);
| | ==> //悲观插入
| | ==> if (err == DB_FAIL)row_ins_clust_index_entry_low(0, BTR_MODIFY_TREE, index, n_uniq, entry, n_ext, thr)
| | ==> mtr_start(&mtr);
| |
| |
| | ==> //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_pcur_open(index, entry, PAGE_CUR_LE, mode, &pcur, &mtr);
| | ==> //尝试乐观插入
| | ==> err = btr_cur_optimistic_insert(flags, &cursor, &offsets, &offsets_heap,
entry, &insert_rec, &big_rec, n_ext, thr, &mtr);
| | ==> //悲观插入
| | ==> err = btr_cur_pessimistic_insert(flags, &cursor, &offsets, &offsets_heap,
entry, &insert_rec, &big_rec, n_ext, thr, &mtr);
| | ==> btr_page_split_and_insert(flags, cursor, offsets, heap, entry, n_ext, mtr)
| | ==> //修改树结构
| | ==> btr_attach_half_pages
| | ==> btr_insert_on_non_leaf_level(flags, index, level + 1, node_ptr_upper, mtr)
| | ==> 这里是修改父节点,可能会出现进一步的SMO
| | ==> btr_insert_on_non_leaf_level_func
| | ==> //移动数据到新Page
| | ==> page_zip_copy_recs
| | ==> page_delete_rec_list_start
| | ==> //从老Page删除数据
| | ==> page_delete_rec_list_end
MySQL 5.7 SMO过程分析
For 悲观插入:
- 加Index SX Lock,在SMO上实现串行。
- 对Root 加SX Lock(貌似)
- 对路径上可能unsafe的节点加x锁
- 对叶子节点加x锁
- Mod tree
- 释放索引Lock
- mod leaf
- 释放节点Lock和路径Lock
For 乐观插入:
- 加Index S Lock
- 对路径加S Lock
- 对叶子节点加S or X Lock
- 释放索引Lock和路径Lock
- mod leaf
- 释放节点lock
并行SMO
两种情况:
计算节点快,存储节点慢:Force Apply
计算节点慢,存储节点快:版本号