环境介绍:
contos 6.9 虚拟机 + 5.6.42版本数据库 数据库架构为:PXC
问题描述:
我执行的操作:如下存储过程,目的是为了把冷数据归档到历史表 msc_sm_bill_back_20191120
然后再将msc_sm_bill表里的对应数据删除!
DELIMITER $$
create procedure msc_bill_backup()
begin
DECLARE id_tmp int(10);
DECLARE done INT DEFAULT FALSE;
DECLARE my_cursor CURSOR FOR (
select id from msc_sm_bill where create_time < '2019-09-01'
);
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
OPEN my_cursor;
my_loop: LOOP
FETCH my_cursor into id_tmp;
IF done THEN
LEAVE my_loop;
END IF;
insert INTO msc_sm_bill_back_20191120 select * from msc_sm_bill where id=id_tmp;
COMMIT;
END LOOP my_loop;
CLOSE my_cursor;
END $$
DELIMITER ;
然后执行该存储过程:
call msc_bill_backup;
然后查看进程状态,发现很多对该表的insert操作被阻塞,状态为:wsrep in pre-commit stage
root@localhost : msc1 20:46:34>show processlist;
33144754 | app_msgcenter | 10.1.1.1:54742 | msc1 | Query| 3 | wsrep in pre-commit stage | insert into msc_sm_bill ( system_source_id,business_id,
我们知道 wsrep in pre-commit stage这种状态的事务是没有commit成功的,这意味着很多
线程已经在执行节点发出commit,但将该SQL发送到其他节点是处于独立异步地进行
certification test、事务插入待执行队列的状态,也就是说处于等待排队的状态,
分析原因:
该存储过程是根据游标一条一条的执行,然后本身MySQL默认自动提交,你又显示的
commit了(并且commit是在循环里面),如下所示,也就是说每insert一条数据,就需要commit两次,如下所示:
insert INTO msc_sm_bill_back_20191120 select * from msc_sm_bill where id=id_tmp;
COMMIT;
1.然后我们知道commit的时候会触发purg 线程去刷新脏数据,很耗资源的,尤其是io资源,
2.在MySQL rr隔离级别下,insert INTO msc_sm_bill_back_20191120 select * from msc_sm_bill where id=1,是会把msc_sm_bill 表id=1的数据给锁定的,但是这个锁定时间也
是非常短暂的,并且业务不会更新历史数据,也就是说id=1的数据被锁了,也不会有业务去更
新的,怎么会导致阻塞呢?
由于io等待严重,导致事务执行时间变长,然后由于insert的时候 也会上间隙锁,也就是第一insert 也可能阻塞第二次insert,那么阶梯式儿的阻塞就导致很多处于wsrep in pre-commit stage状态的事务;
总起来说这次问题原因:
1.存储过程写的有问题,频繁提交,消耗资源,io等待严重
2.当时业务insert的量比较大,处于等待最后的事务可能超多50秒而timeout,程序报错
如下查询所示,确实是再 上午11点执行的,那个时候是高峰期,量比较大!
select count(*) , DATE_FORMAT(create_time,'%H') from msc_sm_bill where create_time>'2019-11-20' group by DATE_FORMAT(create_time,'%H')
11606
23407
47108
58709
371210
315611
77312
53313
46414
62015
62316
127817
56318
52119
47920
42521
12122
然后关于insert INTO msc_sm_bill_back_20191120 select * from msc_sm_bill where id=1 会锁msc_sm_bill表数据的验证:
二:关于insert into t_bak select * from t where id=11 锁的区别:
rr隔离级别下,这样操作是非常危险的,他会锁id=11这条数据,但是隔离级别(read commited)
下,就不会锁定任何数据,只是会加表的元数据锁!MySQL默认隔离级别是rr,但是
Oracle默认隔离级别是read commited;
实验一:隔离级别REPEATABLE-READ,
会话1:
root@localhost : liuwenhe 17:10:44>select @@global.tx_isolation,@@tx_isolation;
+-----------------------+-----------------+
| @@global.tx_isolation | @@tx_isolation |
+-----------------------+-----------------+
| REPEATABLE-READ | REPEATABLE-READ |
+-----------------------+-----------------+
root@localhost : liuwenhe 17:33:00>start transaction;
Query OK, 0 rows affected (0.00 sec)
root@localhost : liuwenhe 17:36:21>insert into t_bak select * from t where id=11;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
会话2 尝试更新id=11的数据,发现被锁,但是更新其他的不会被锁,
前提是id有索引
root@localhost : liuwenhe 17:10:44>select @@global.tx_isolation,@@tx_isolation;
+-----------------------+-----------------+
| @@global.tx_isolation | @@tx_isolation |
+-----------------------+-----------------+
| REPEATABLE-READ | REPEATABLE-READ |
+-----------------------+-----------------+
root@localhost : liuwenhe 17:36:43>update t set name='li' where id=11; --等待
root@localhost : liuwenhe 17:36:43>update t set name='li' where id=12 --成功
会话3查询锁的信息,发现确实有锁等待
root@localhost : information_schema 17:41:16>select * from INNODB_LOCKS;
+--------------+-------------+-----------+-----------+----------------+------------+------------+-----------+----------+-----------+
| lock_id | lock_trx_id | lock_mode | lock_type | lock_table | lock_index | lock_space | lock_page | lock_rec | lock_data |
+--------------+-------------+-----------+-----------+----------------+------------+------------+-----------+----------+-----------+
| 6904:244:3:4 | 6904 | X | RECORD | `liuwenhe`.`t` | PRIMARY | 244 | 3 | 4 | 12 |
| 6897:244:3:4 | 6897 | S | RECORD | `liuwenhe`.`t` | PRIMARY | 244 | 3 | 4 | 12 |
+--------------+-------------+-----------+-----------+----------------+------------+------------+-----------+----------+-----------+
2 rows in set, 1 warning (0.00 sec)
实验2 在READ-COMMITTED隔离级别下,是没有问题的, 不会锁定任何数据
会话1:在READ-UNCOMMITTED和READ-COMMITTED隔离级别级别下执行insert into t_bak select * from t 都不会锁任何数据,如下只写出来READ-COMMITTED级别下的现象!
where id=11; 具体如下:
root@localhost : (none) 17:45:53>select @@global.tx_isolation,@@tx_isolation;
+-----------------------+------------------+
| @@global.tx_isolation | @@tx_isolation |
+-----------------------+------------------+
| READ-COMMITTED | READ-COMMITTED |
+-----------------------+------------------+
1 row in set, 2 warnings (0.00 sec)
root@localhost : (none) 17:45:57>start transaction;
Query OK, 0 rows affected (0.00 sec)
root@localhost : (none) 17:46:03>use liuwenhe
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
root@localhost : liuwenhe 17:46:11>insert into t_bak select * from t where id=11;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
会话2:发现可以正常更新
root@localhost : liuwenhe 17:42:11>update t set name='li' where id=11;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
但是你尝试alter表结构会阻塞
root@localhost : liuwenhe 17:47:32>alter table t add key idx_name (name);
会话3:发现等待元数据锁 Waiting for table metadata lock
root@localhost : information_schema 17:53:30>show processlist;
+----+-------------+-----------+--------------------+---------+---------+--------------------------------------------------------+---------------------------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+-------------+-----------+--------------------+---------+---------+--------------------------------------------------------+---------------------------------------+-----------+---------------+
| 3 | system user | | NULL | Connect | 2271468 | Waiting for master to send event | NULL | 0 | 0 |
| 4 | system user | | NULL | Connect | 607512 | Slave has read all relay log; waiting for more updates | NULL | 0 | 0 |
| 14 | root | localhost | liuwenhe | Query | 72 |
Waiting for table metadata lock | alter table t add key idx_name (name) | 0 | 0 |
| 17 | root | localhost | information_schema | Query | 0 | starting | show processlist | 0 | 0 |
| 20 | root | localhost | liuwenhe | Sleep | 450 | | NULL | 0 | 1 |
+----+-------------+-----------+--------------------+---------+---------+--------------------------------------------------------+---------------------------------------+-----------+---------------+
5 rows in set (0.00 sec)
经实验Oracle也不会锁定任何数据,因为Oracle也是read commited的隔离级别!但是
和MySQL不同的是 Oracle不会阻塞 alter ,具体如下:
Oracle:
会话1 :
SQL> insert into test2 select * from test1 where id=13;
1 row created.
会话2 执行alter 和update 都可以成功!
SQL> update test1 set id =14 where id=12;
1 row updated.
SQL> alter table test1 add name varchar2(10);
Table altered.