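A common situation in a database: you run a SQL statement and it just sits there waiting. This is usually caused by a lock wait. The steps below reproduce the situation.
I. Prerequisites:
1. Create the table student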
CREATE TABLE `student` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(20) DEFAULT NULL,
`class` int(11) DEFAULT NULL,
`age` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
);
2. MySQL version: 5.6
II. Simulating the lock wait
1. Disable autocommit in both session 1 and session 2:
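The result set shown later for session 1 contains two rows, but the original steps do not show how they were inserted. A minimal sketch for loading them, assuming the values are exactly those in that output:

```sql
-- Sample rows matching the output of session 1 below.
-- id is listed explicitly so the values line up with the walkthrough.
INSERT INTO student (id, name, class, age) VALUES
  (1, 'bruce', 1, 30),
  (2, 'iori',  1, 31);
COMMIT;
```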
set autocommit=0;
2. In session 1, run the following SQL:
mysql> select * from student for update;
+----+-------+-------+------+
| id | name | class | age |
+----+-------+-------+------+
| 1 | bruce | 1 | 30 |
| 2 | iori | 1 | 31 |
+----+-------+-------+------+
2 rows in set (0.00 sec)
3. In session 2, run the following SQL:
mysql> drop tables student;
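The statement hangs here indefinitely, and no lock wait timeout error shows up (innodb_lock_wait_timeout only covers InnoDB row locks, not this kind of wait).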
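The hang above is a metadata lock wait. Such waits are bounded by the server variable lock_wait_timeout, whose default is 31536000 seconds (one year), which is why the statement appears never to time out. A small sketch of how one might inspect and lower it for a single session; the value 10 is only an assumption for illustration:

```sql
-- Inspect the timeout (in seconds) that governs metadata lock waits.
SHOW VARIABLES LIKE 'lock_wait_timeout';

-- Assumed value for illustration: give up after 10 seconds, this session only.
SET SESSION lock_wait_timeout = 10;

-- With session 1 still holding its transaction open, this is expected to
-- fail with ERROR 1205 (Lock wait timeout exceeded) instead of hanging.
DROP TABLE student;
```

Lowering the timeout only makes the DDL fail fast. It does not remove the blocking transaction, which still has to be found as described in the next section.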
III. Diagnosing the problem
1. When you hit this kind of problem, first run show full processlist to see which statements are executing:
mysql> show full processlist;
+----+------+-----------+------+---------+------+---------------------------------+-----------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------+------+---------+------+---------------------------------+-----------------------+
| 5 | root | localhost | test | Sleep | 225 | | NULL |
| 6 | root | localhost | test | Query | 211 | Waiting for table metadata lock | drop tables student |
| 7 | root | localhost | NULL | Query | 0 | init | show full processlist |
+----+------+-----------+------+---------+------+---------------------------------+-----------------------+
3 rows in set (0.00 sec)
The drop table statement is waiting for a metadata lock.
2. Check innodb_trx: one transaction has been running the whole time and has never finished
mysql> select * from information_schema.innodb_trx\G;
*************************** 1. row ***************************
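On a busy server the full process list can be long. As a sketch, the same information can be filtered through information_schema.processlist so that only the sessions stuck on a metadata lock are returned:

```sql
-- List only the sessions currently waiting for a table metadata lock,
-- longest wait first.
SELECT id, user, host, db, time, state, info
FROM information_schema.processlist
WHERE state = 'Waiting for table metadata lock'
ORDER BY time DESC;
```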
trx_id: 6208
trx_state: RUNNING
trx_started: 2016-03-17 13:57:46
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 2
trx_mysql_thread_id: 5
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 0
trx_lock_structs: 2
trx_lock_memory_bytes: 360
trx_rows_locked: 3
trx_rows_modified: 0
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 10000
trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
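ERROR:
No query specified
(This error is harmless: \G already terminates the statement, so the trailing ; is sent as an empty query.)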
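Since trx_mysql_thread_id corresponds to the Id column of the process list, the two views can be joined to spot open transactions whose connection is idle. A minimal sketch, where the alias trx_age_sec is just an illustrative name:

```sql
-- Open InnoDB transactions together with the connection that owns them.
-- A row with command = 'Sleep' and a large trx_age_sec is a transaction
-- that was left open by an idle session.
SELECT t.trx_id,
       t.trx_started,
       TIMESTAMPDIFF(SECOND, t.trx_started, NOW()) AS trx_age_sec,
       t.trx_rows_locked,
       p.id   AS processlist_id,
       p.user,
       p.host,
       p.command,
       p.time
FROM information_schema.innodb_trx AS t
JOIN information_schema.processlist AS p
  ON p.id = t.trx_mysql_thread_id
ORDER BY t.trx_started;
```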
3. Run show engine innodb status\G; and check the TRANSACTIONS section:
------------
TRANSACTIONS
------------
Trx id counter 6209
Purge done for trx's n:o < 5138 undo n:o < 0 state: running but idle
History list length 233
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 7, OS thread handle 0x7f5831691700, query id 131 localhost root init
show engine innodb status
---TRANSACTION 0, not started
MySQL thread id 6, OS thread handle 0x7f58316d2700, query id 107 localhost root Waiting for table metadata lock
drop tables student
---TRANSACTION 6208, ACTIVE 1183 sec
2 lock struct(s), heap size 360, 3 row lock(s)
MySQL thread id 5, OS thread handle 0x7f5831713700, query id 96 localhost root cleaning up
Trx read view will not see trx with id >= 6209, sees < 6209
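This transaction holds locks and has been active for 1183 seconds, so it deserves close attention.
4. Combining the findings from steps 2 and 3, the connection with trx_mysql_thread_id=5 is the prime suspect for the lock wait: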
+----+------+-----------+------+---------+------+---------------------------------+-----------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------+------+---------+------+---------------------------------+-----------------------+
| 5 | root | localhost | test | Sleep | 225 | | NULL |
5. Kill connection 5; the drop table in session 2 then succeeds:
mysql> drop tables student;
Query OK, 0 rows affected (28 min 13.52 sec)
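The blocker can be released in two ways. This sketch assumes the thread id 5 from the walkthrough; on another server the id must be taken from the process list:

```sql
-- Option 1: if you still control session 1, end its transaction cleanly.
COMMIT;   -- or ROLLBACK;

-- Option 2: from any other session with sufficient privileges, terminate
-- the blocking connection. 5 is the Id shown by show full processlist here.
KILL 5;
```

Killing the connection rolls back its open transaction, so option 1 is preferable whenever the owner of the session can be reached.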
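6. Finding the root cause
The hard part of this kind of problem is that there is no way to see which commands the session executed, and the binlog has no record of them either.
We can focus on the Host column of show full processlist.
In our simulation it is localhost, which means the statements were run locally, so we can ask the people with local access what they were doing.
When the connection comes from Tomcat, the column usually shows ip:port. With that information we can locate the corresponding Tomcat instance, look for the matching errors in its logs, and track down the problem.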
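Two queries can help with this step. The first simply extracts the client address from the Host column. The second is an assumption beyond the original article: on 5.6, if the Performance Schema is enabled with its default statement instrumentation, events_statements_current may still hold the last statement an idle connection executed. The id 5 is the one from this walkthrough.

```sql
-- Client address and port of the suspect connection.
-- For a local connection host is just 'localhost' and both columns show it.
SELECT id,
       SUBSTRING_INDEX(host, ':', 1)  AS client_host,
       SUBSTRING_INDEX(host, ':', -1) AS client_port
FROM information_schema.processlist
WHERE id = 5;

-- Assumption: performance_schema is ON and statement events are collected.
-- Shows the most recent statement recorded for processlist id 5.
SELECT th.processlist_id, es.sql_text
FROM performance_schema.threads AS th
JOIN performance_schema.events_statements_current AS es
  ON es.thread_id = th.thread_id
WHERE th.processlist_id = 5;
```

If the second query returns nothing, statement collection is probably not enabled, and the Host column remains the main lead.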