Lock_grant锁实际上是一个读写锁,每执行一条sql都要读取执行check_grant来检查权限是否匹配,这时候加的是读锁,而读锁是可以并发的。大多数情况下加锁开销是很不明显的,但是在MySQL将大多数只读场景下的锁都消除掉后,LOCK_grant开始变的比较突出了。
5.7里主要消除掉了几种会影响到只读性能的锁:
#为InnoDB表消除THR_LOCK
#MDL LOCK维护改成LOCK FREE的算法
#为AUTO-COMMIT的SELECT缓存Read View来减少trx_sys->mutex冲突
Stewart Smith report了一个Bug#72829 来描述LOCK_grant在power机型的影响,在他的测试中,LOCK_grant去除掉后,居然有50的性能提升。我相信这是和平台相关的。。。因为我的机器上很难达到这样的提升。。。。
我也在bug上跟着吐槽啦,MySQL 5.7.5的只读测试中,LOCK_grant直接排到第一位了:
root@performance_schema 05:37:18>SELECT COUNT_STAR, SUM_TIMER_WAIT, AVG_TIMER_WAIT, EVENT_NAME FROM events_waits_summary_global_by_event_name where COUNT_STAR > 0 and EVENT_NAME like 'wait/synch/%' order by SUM_TIMER_WAIT desc limit 20;
+------------+----------------+----------------+---------------------------------------------------+
| COUNT_STAR | SUM_TIMER_WAIT | AVG_TIMER_WAIT | EVENT_NAME |
+------------+----------------+----------------+---------------------------------------------------+
| 26086077 | 17068952969200 | 654000 | wait/synch/rwlock/sql/LOCK_grant |
| 78512461 | 16715117127344 | 212768 | wait/synch/sxlock/innodb/hash_table_locks |
| 130770402 | 15605301433540 | 119028 | wait/synch/mutex/sql/THD::LOCK_query_plan |
| 52314260 | 12553580466128 | 239800 | wait/synch/mutex/sql/LOCK_table_cache |
| 78446571 | 9825008327284 | 125132 | wait/synch/mutex/sql/THD::LOCK_thd_data |
| 26175928 | 6455036525220 | 246340 | wait/synch/sxlock/innodb/index_tree_rw_lock |
| 52299088 | 5974745787424 | 114232 | wait/synch/mutex/sql/THD::LOCK_thd_query |
| 7568 | 1095036672 | 144316 | wait/synch/mutex/innodb/flush_list_mutex |
| 7656 | 885055584 | 115540 | wait/synch/mutex/innodb/buf_pool_mutex |
| 7603 | 638888240 | 83712 | wait/synch/mutex/sql/LOCK_global_system_variables |
| 242 | 354649376 | 1465396 | wait/synch/sxlock/innodb/dict_operation_lock |
| 3445 | 351421232 | 101588 | wait/synch/mutex/innodb/dict_sys_mutex |
| 1602 | 339106848 | 211460 | wait/synch/mutex/innodb/innobase_share_mutex |
| 1602 | 321693008 | 200560 | wait/synch/mutex/sql/LOCK_open |
| 2672 | 315876768 | 118156 | wait/synch/mutex/sql/LOCK_plugin |
| 1050 | 308335712 | 293428 | wait/synch/mutex/sql/LOCK_connection_count |
| 1602 | 235623120 | 146932 | wait/synch/mutex/innodb/file_format_max_mutex |
| 1281 | 217084400 | 169168 | wait/synch/mutex/sql/LOCK_thd_list |
| 983 | 209930512 | 213204 | wait/synch/mutex/sql/LOCK_status |
| 1380 | 199285136 | 144316 | wait/synch/mutex/sql/LOCK_user_conn |
+------------+----------------+----------------+---------------------------------------------------+
由于只加的LOCK_grant读锁,该锁的并发开销主要来自Cache失效以及原子操作。
解决:
解法也比较简单:创建32个读写锁,当需要加读锁时,根据当前线程的thread_id,模到对应的读写锁对象,加上读锁; 当需要加写锁时,则把32个读写锁对象的写锁都加上。
显然加写锁的代价提升了,但是谁会没事儿把类似GRANT语句作为主要的负载呢。
在我的优化版本的5.6分支上,加了该改进后,可以将只读QPS从10w推进到10.5w QPS