参考文章:https://blog.csdn.net/m0_37827567/article/details/82979767
大数据表数据变更SQL回滚
MySQL执行数据量很大的表数据变更时,由于耗时太长,中途停止变更SQL,执行新的SQL,会发现之前执行的SQL进程状态为Killed,即使重启MySQL也没有用,进程依旧是以Killed状态继续运行。并且对应的表也被锁住,查询非常慢。这是由于如果大SQL执行被中途停止,MySQL本身有数据保护机制,会对之前执行的结果进行回滚,所有回滚操作都完成该进程才会停止,并且回滚期间会进行锁表操作。
查看MySQL运行的进程:show processlist;
停止正在运行的进程:kill #{pid} (pid通过show processlist 名称可以查询到)
查看被锁的表: show OPEN TABLES where In_use > 0;
查询正在运行的SQL事务情况:
SELECT * FROM information_schema.INNODB_TRX\G;
SELECT trx_state,trx_started,trx_mysql_thread_id,trx_query,trx_operation_state,trx_tables_in_use,trx_tables_locked,
trx_lock_memory_bytes,trx_rows_locked,trx_rows_modified,trx_isolation_level
FROM information_schema.INNODB_TRX;
trx_rows_locked: 事务锁住的行数
rx_rows_modified: 事务更改的行数,当数值为0时,锁将会释放
rx_rows_modified / trx_rows_locked
查看MySQL锁相关的记录表
desc information_schema.innodb_locks;
+————-+———————+——+—–+———+——-+
| Field | Type | Null | Key | Default | Extra |
+————-+———————+——+—–+———+——-+
| lock_id | varchar(81) | NO | | | |#锁ID
| lock_trx_id | varchar(18) | NO | | | |#拥有锁的事务ID
| lock_mode | varchar(32) | NO | | | |#锁模式
| lock_type | varchar(32) | NO | | | |#锁类型
| lock_table | varchar(1024) | NO | | | |#被锁的表
| lock_index | varchar(1024) | YES | | NULL | |#被锁的索引
| lock_space | bigint(21) unsigned | YES | | NULL | |#被锁的表空间号
| lock_page | bigint(21) unsigned | YES | | NULL | |#被锁的页号
| lock_rec | bigint(21) unsigned | YES | | NULL | |#被锁的记录号
| lock_data | varchar(8192) | YES | | NULL | |#被锁的数据
+————-+———————+——+—–+———+——-+
10 rows in set (0.00 sec)
desc information_schema.innodb_lock_waits;
+——————-+————-+——+—–+———+——-+
| Field | Type | Null | Key | Default | Extra |
+——————-+————-+——+—–+———+——-+
| requesting_trx_id | varchar(18) | NO | | | |#请求锁的事务ID
| requested_lock_id | varchar(81) | NO | | | |#请求锁的锁ID
| blocking_trx_id | varchar(18) | NO | | | |#当前拥有锁的事务ID
| blocking_lock_id | varchar(81) | NO | | | |#当前拥有锁的锁ID
+——————-+————-+——+—–+———+——-+
4 rows in set (0.00 sec)
desc information_schema.innodb_trx ;
+—————————-+———————+——+—–+———————+——-+
| Field | Type | Null | Key | Default | Extra |
+—————————-+———————+——+—–+———————+——-+
| trx_id | varchar(18) | NO | | | |#事务ID
| trx_state | varchar(13) | NO | | | |#事务状态:
| trx_started | datetime | NO | | 0000-00-00 00:00:00 | |#事务开始时间;
| trx_requested_lock_id | varchar(81) | YES | | NULL | |#innodb_locks.lock_id
| trx_wait_started | datetime | YES | | NULL | |#事务开始等待的时间
| trx_weight | bigint(21) unsigned | NO | | 0 | |#
| trx_mysql_thread_id | bigint(21) unsigned | NO | | 0 | |#事务线程ID
| trx_query | varchar(1024) | YES | | NULL | |#具体SQL语句
| trx_operation_state | varchar(64) | YES | | NULL | |#事务当前操作状态
| trx_tables_in_use | bigint(21) unsigned | NO | | 0 | |#事务中有多少个表被使用
| trx_tables_locked | bigint(21) unsigned | NO | | 0 | |#事务拥有多少个锁
| trx_lock_structs | bigint(21) unsigned | NO | | 0 | |#
| trx_lock_memory_bytes | bigint(21) unsigned | NO | | 0 | |#事务锁住的内存大小(B)
| trx_rows_locked | bigint(21) unsigned | NO | | 0 | |#事务锁住的行数
| trx_rows_modified | bigint(21) unsigned | NO | | 0 | |#事务更改的行数
| trx_concurrency_tickets | bigint(21) unsigned | NO | | 0 | |#事务并发票数
| trx_isolation_level | varchar(16) | NO | | | |#事务隔离级别
| trx_unique_checks | int(1) | NO | | 0 | |#是否唯一性检查
| trx_foreign_key_checks | int(1) | NO | | 0 | |#是否外键检查
| trx_last_foreign_key_error | varchar(256) | YES | | NULL | |#最后的外键错误
| trx_adaptive_hash_latched | int(1) | NO | | 0 | |#
| trx_adaptive_hash_timeout | bigint(21) unsigned | NO | | 0 | |#
+—————————-+———————+——+—–+———————+——-+
22 rows in set (0.01 sec)
大数据表数据变更注意事项
线上发布需要对两张大数据表进行数据订正,一张表的数据量是2000W+,另一张表的数据量是 3600W+,为了加快数据订正速度,同时执行了这两张表的订正SQL,结果SQL命令运行了5个小时后,因为MySQL内存溢出重启了。
查看MySQL运行日志,发现有错误日志:[Warning] InnoDB: Over 67 percent of the buffer pool is occupied by lock heaps or the adaptive hash index! Check that your transactions do not set too many row locks. Your buffer pool size is 127 MB. Maybe you should make the buffer pool bigger?. Starting the InnoDB Monitor to print diagnostics, including lock heap and hash index sizes.
这是是由于更新时事务锁住了大量的表数据,导致占满了 InnoDB Buffer Pool 缓存空间,因此在执行大量数据订正的SQL之前,最好先调大 InnoDB Buffer Pool 缓存空间,避免由于缓存空间不足导致变更执行失败。
# 查看MySQL缓冲区大小,单位:M
mysql> SELECT @@innodb_buffer_pool_size/1024/1024;
+-------------------------------------+
| @@innodb_buffer_pool_size/1024/1024 |
+-------------------------------------+
| 128.00000000 |
+-------------------------------------+
1 row in set (0.00 sec)
# 修改 /etc/my.cnf ,
vim /etc/my.cnf
# 设置 innodb的buffer pool 最大为1G
innodb_buffer_pool_size=1073741824
# 重启MySQL
systemctl restart mysqld.service
其他常用排查命令
-- 查看持有MySQL连接的host 和user 信息
SELECT user,substring_index(host, ':',1) AS host_name,state,count(*) FROM information_schema.processlist GROUP BY state,host_name;
+-------+---------------+---------------------------------------------------------------+----------+
| user | host_name | state | count(*) |
+-------+---------------+---------------------------------------------------------------+----------+
| root | 10.0.21.4 | | 1 |
| root | HZApp01 | | 2 |
| root | HZApp02 | | 3 |
| root | HZMycat | | 7 |
| root | localhost | executing | 1 |
| slave | HZMysqlBackup | Master has sent all binlog to slave; waiting for more updates | 1 |
+-------+---------------+---------------------------------------------------------------+----------+
6 rows in set (0.00 sec)
-- 查看MySQL当前总连接数信息,Threads_connected 为当前正在运行的连接数
SHOW STATUS LIKE 'Threads%';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| Threads_cached | 0 |
| Threads_connected | 56 |
| Threads_created | 3440 |
| Threads_running | 2 |
+-------------------+-------+
4 rows in set (0.01 sec)
-- 查看MySQL设置的最大连接数
SHOW VARIABLES LIKE '%max_connections%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| max_connections | 500 |
+-----------------+-------+
1 row in set (0.00 sec)