背景
自MySQL 5.7以来,组提交大面积应用,已经不断地得到优化。但网上有关组提交的实现机制,却还不够详细。故障多的时候,往往会发生一些模棱两可的揣测和猜疑。因此,笔者有了从自己的角度,去分析组提交实现的动机。
源码分析
以“last_committed”为入口,搜索MySQL 5.7.24源码,很快可以定位到关键类Transaction_dependency_tracker。梳理一下该类的虚函数实现以及调用位置,基本就理解了大部分实现。如下:
以此类为基础,再梳理调用过程,结合前面文章《MySQL的after_sync与after_commit性能之争源码剖析与情景测试》的分析,基本就可以梳理出具体的实现细节。基本实现
通过类Commit_order_trx_dependency_tracker的两个成员变量来基本实现:m_max_committed_transaction,保存已提交的最大事务号;m_transaction_counter,保存已prepare的最大事务号。获取last_committed与sequence_number值
如上图所示,实现该功能的是get_dependency函数,把last_committed与sequence_number的绝对值转换成相对值。而这个转换恰巧是在Binlog的flush阶段MYSQL_BIN_LOG::write_gtid函数中,生成GTID事件之前。更新m_max_committed_transaction值
update_max_committed函数的调用发生在commit队列处理函数MYSQL_BIN_LOG::process_commit_stage_queue中,遍历每个线程提交最开始的时候。线程获取m_max_committed_transaction值
对于非事务型语句(如DDL)来说,在binlog_prepare中获取last_committed绝对值。而对于事务型语句来说,则在提交的函数MYSQL_BIN_LOG::commit中,执行ordered_commit函数之前获取last_committed绝对值。sequence_number值累加
step()函数的调用也是在Binlog的flush阶段,binlog_cache_data::flush函数中,不过是在MYSQL_BIN_LOG::write_gtid函数之前,即把绝对值转换成相对值之前。具体实现为,sequence_number每次加1。
梳理出来的流程如下所示:
案例分析
初看时,想当然的会以为,两个DDL进入了同一组导致了死锁。然后就会着重分析为什么在主库被分配了同一组,而在从库却又冲突了。从上图的分析可以了解,获取last_committed值是在执行ordered_commit函数之前。这个时候,线程所有持有的MDL锁并未释放,所以两者存在冲突,根本就不可能同时进入提交。但仔细回头一看,就会发现不对,两个语句根本就不是DDL,而是DCL呀!为了了解DCL中关于MDL锁的释放位置,就在测试环境抓取分别抓取它们的debug日志来分析。flush privileges
...
// 第一部分3343 T@5: | | | | | >MDL_context::release_transactional_locks
3344 T@5: | | | | | | >MDL_context::release_locks_stored_before
3345 T@5: | | | | | | <:release_locks_stored_before>
3346 T@5: | | | | | | >MDL_context::release_locks_stored_before
3347 T@5: | | | | | | | info: found lock to release ticket=0x7f3c64012410
3348 T@5: | | | | | | | >MDL_context::release_lock
3349 T@5: | | | | | | | | enter: db=mysql name=proxies_priv
3350 T@5: | | | | | | | <:release_lock>
3351 T@5: | | | | | | | info: found lock to release ticket=0x7f3c64012320
3352 T@5: | | | | | | | >MDL_context::release_lock
3353 T@5: | | | | | | | | enter: db=mysql name=db
3354 T@5: | | | | | | | <:release_lock>
3355 T@5: | | | | | | | info: found lock to release ticket=0x7f3c6400e860
3356 T@5: | | | | | | | >MDL_context::release_lock
3357 T@5: | | | | | | | | enter: db=mysql name=user
3358 T@5: | | | | | | | <:release_lock>
3359 T@5: | | | | | | <:release_locks_stored_before>
3360 T@5: | | | | | <:release_transactional_locks>
3361 T@5: | | | |
...
// 第二部分5073 T@5: | | | | | >MDL_context::release_transactional_locks
5074 T@5: | | | | | | >MDL_context::release_locks_stored_before
5075 T@5: | | | | | | <:release_locks_stored_before>
5076 T@5: | | | | | | >MDL_context::release_locks_stored_before
5077 T@5: | | | | | | | info: found lock to release ticket=0x7f3c640308e0
5078 T@5: | | | | | | | >MDL_context::release_lock
5079 T@5: | | | | | | | | enter: db=mysql name=procs_priv
5080 T@5: | | | | | | | <:release_lock>
5081 T@5: | | | | | | | info: found lock to release ticket=0x7f3c640273e0
5082 T@5: | | | | | | | >MDL_context::release_lock
5083 T@5: | | | | | | | | enter: db=mysql name=columns_priv
5084 T@5: | | | | | | | <:release_lock>
5085 T@5: | | | | | | | info: found lock to release ticket=0x7f3c6400e860
5086 T@5: | | | | | | | >MDL_context::release_lock
5087 T@5: | | | | | | | | enter: db=mysql name=tables_priv
5088 T@5: | | | | | | | <:release_lock>
5089 T@5: | | | | | | <:release_locks_stored_before>
5090 T@5: | | | | | <:release_transactional_locks>
5091 T@5: | | | |
...
// 第三部分5641 T@5: | | | | | >restore_backup_open_tables_state
5642 T@5: | | | | | | >MDL_context::rollback_to_savepoint
5643 T@5: | | | | | | | >MDL_context::release_locks_stored_before
5644 T@5: | | | | | | | <:release_locks_stored_before>
5645 T@5: | | | | | | | >MDL_context::release_locks_stored_before
5646 T@5: | | | | | | | | info: found lock to release ticket=0x7f3c6400e860
5647 T@5: | | | | | | | | >MDL_context::release_lock
5648 T@5: | | | | | | | | | enter: db=mysql name=servers
5649 T@5: | | | | | | | | <:release_lock>
5650 T@5: | | | | | | | <:release_locks_stored_before>
5651 T@5: | | | | | | <:rollback_to_savepoint>
5652 T@5: | | | | |
5653 T@5: | | | | | >my_hash_free
5654 T@5: | | | | | | enter: hash: 0x7f3c64002eb0
5655 T@5: | | | | |
5656 T@5: | | | | | info: unlocking servers_cache
5657 T@5: | | | |
...
// 第四部分5764 T@5: | | | | >trans_commit_stmt
5765 T@5: | | | | | debug: add_unsafe_rollback_flags: 0
5766 T@5: | | | | | >MYSQL_BIN_LOG::commit
5767 T@5: | | | | | | info: query='flush privileges'
...
以上分别截取了三部分:第一部分,对mysql.proxies_priv、mysql.db以及mysql.user释放MDL锁;第二部分,对mysql.procs_priv、mysql.columns_priv以及mysql.tables_priv释放MDL锁;第三部分,对mysql.servers释放MDL锁;第四部分,提交。由此,可以发现,MDL锁在进入提交之前就已经释放了。为了进一步确认,于是对函数MDL_context::acquire_lock设置断点,调试结果如下:
因此,可以发现,获取的都是SR锁。grant privileges
...
// 第一部分2477 T@4: | | | | | >trans_commit_stmt
2478 T@4: | | | | | | debug: add_unsafe_rollback_flags: 0
2479 T@4: | | | | | | >MYSQL_BIN_LOG::commit
2480 T@4: | | | | | | | info: query='grant process on *.* to 'slave'@'192.168.10.%''
...
2724 T@4: | | | | | | | | | >THD::enter_cond
2725 T@4: | | | | | | | | | | THD::enter_stage: 'Waiting for semi-sync ACK from slave' /data/mysql-5.7.24/plugin/semisync/semisync_master.cc:735
2726 T@4: | | | | | | | | | | >PROFILING::status_change
2727 T@4: | | | | | | | | | | <:status_change>
2728 T@4: | | | | | | | | | <:enter_cond>
...
2816 T@4: | | | | |
2817 T@4: | | | | | >trans_commit_implicit
...
2843 T@4: | | | | |
// 第二部分2950 T@4: | | | | | >MDL_context::release_transactional_locks
2951 T@4: | | | | | | >MDL_context::release_locks_stored_before
2952 T@4: | | | | | | | info: found lock to release ticket=0x7f3c7000e9b0
2953 T@4: | | | | | | | >MDL_context::release_lock
2954 T@4: | | | | | | | | enter: db= name=
2955 T@4: | | | | | | | <:release_lock>
2956 T@4: | | | | | | <:release_locks_stored_before>
2957 T@4: | | | | | | >MDL_context::release_locks_stored_before
2958 T@4: | | | | | | | info: found lock to release ticket=0x7f3c7000e760
2959 T@4: | | | | | | | >MDL_context::release_lock
2960 T@4: | | | | | | | | enter: db=mysql name=db
2961 T@4: | | | | | | | <:release_lock>
2962 T@4: | | | | | | | info: found lock to release ticket=0x7f3c7000ea10
2963 T@4: | | | | | | | >MDL_context::release_lock
2964 T@4: | | | | | | | | enter: db=mysql name=user
2965 T@4: | | | | | | | <:release_lock>
2966 T@4: | | | | | | <:release_locks_stored_before>
2967 T@4: | | | | | <:release_transactional_locks>
...
以上分别截取了两部分:第一部分执行提交;第二部分,对mysql.db和mysql.user释放锁。依照上面的方式,同样进行调试,结果如下:
此时,可以发现,上述debug日志中的第一个锁为IX类型的GLOBAL锁,其它两个都是SW锁。死锁分析
从以上的分析,基本可以认定flush privileges和grant privileges的执行过程有很大的不同,如果SW锁和SR锁不兼容的话,基本可以认定,前者先执行,后者可以并行,反之则不行。为此,进行一下测试(让语句处理ACK等待状态),先flush privileges再grant privileges,
mysql> show processlist;
+-----+------+-----------+------+---------+------+--------------------------------------+------------------------------------------------+| Id | User | Host | db | Command | Time | State | Info |
+-----+------+-----------+------+---------+------+--------------------------------------+------------------------------------------------+| 196 | root | localhost | NULL | Query | 2449 | Waiting for semi-sync ACK from slave | flush privileges |
| 197 | root | localhost | NULL | Query | 2430 | checking permissions | grant process on *.* to 'slave'@'192.168.10.%' |
| 198 | root | localhost | NULL | Query | 0 | starting | show processlist |
+-----+------+-----------+------+---------+------+--------------------------------------+------------------------------------------------+3 rows in set (0.00 sec)
线程197的堆栈信息如下:
反之,先执行grant privileges再flush privileges,
mysql> show processlist;
+-----+-------+--------------------+------+------------------+------+---------------------------------------------------------------+------------------------------------------------+| Id | User | Host | db | Command | Time | State | Info |
+-----+-------+--------------------+------+------------------+------+---------------------------------------------------------------+------------------------------------------------+| 196 | root | localhost | NULL | Query | 12 | Waiting for table level lock | flush privileges |
| 197 | root | localhost | NULL | Query | 23 | Waiting for semi-sync ACK from slave | grant process on *.* to 'slave'@'192.168.10.%' |
| 198 | root | localhost | NULL | Query | 0 | starting | show processlist |
| 199 | slave | 192.168.10.4:39721 | NULL | Binlog Dump GTID | 133 | Master has sent all binlog to slave; waiting for more updates | NULL |
+-----+-------+--------------------+------+------------------+------+---------------------------------------------------------------+------------------------------------------------+4 rows in set (0.00 sec)
线程196的堆栈信息如下:
通过以上的堆栈信息,可以惊讶地发现,线程196已经进入了mysql_lock_tables函数,即已经拿到了MDL锁,足以说明SR锁和SW锁是兼容的。通过上面的函数调用,不难发现,它们正在等待的,是表锁【InnoDB实现的表锁,存储引擎层只定义了表锁实现接口】。通过对比debug日志,不难发现,该锁的释放时间与MDL锁是紧贴在MDL锁前面的,因而以上的结论是成立的,只是对象由MDL锁变成了InnoDB表锁。故障复现
主库设置:注意:binlog_group_commit_sync_delay最大值为100万微妙,即1s。
mysql> show variables like '%group_commit%';
+-----------------------------------------+---------+| Variable_name | Value |
+-----------------------------------------+---------+| binlog_group_commit_sync_delay | 1000000 |
| binlog_group_commit_sync_no_delay_count | 1000 |
+-----------------------------------------+---------+2 rows in set (0.01 sec)
执行脚本:在主库先执行flush privileges再执行grant privileges以保证进入同一组;然后线程1睡眠0.5s以保证flush privileges先获取sequence_number值,即在从库先提交;最后,在从库让先执行的线程后提交。
import threading
import time
import os
def thread_1():
os.system('date +%M:%S')
time.sleep(0.5)
os.system("/usr/local/mysql/bin/mysql -uroot -ps3cret -e\"grant process on *.* to 'slave'@'192.168.10.%';\"")
os.system('date +%M:%S')
def thread_2():
os.system('date +%M:%S')
os.system("/usr/local/mysql/bin/mysql -uroot -ps3cret -e 'flush privileges'")
os.system('date +%M:%S')
t1 = threading.Thread(target=thread_2)
t1.start()
t2 = threading.Thread(target=thread_1)
t2.start()
从库现象:结论
flush privileges设计的提交前就释放锁的机制导致了死锁,依次类推,针对mysql.user表的DML语句与flush privileges配合,也有可能产生死锁。
总结
组提交的实现依赖于两阶段提交,因而对于非事务性语句来说,锁的释放可能会出现在事务提交前,因而可能会出现本来冲突的两个语句获得同一个last_committed,即被分配到了同一组,进而导致了并行复制死锁。