列出几种常用场景,并进行分析实战测试
前提
* pt-kill必须获得的权限
1. PROCESS , SUPER
2. 否则,你只能看到打印出kill id,但是实际上并没有被kill掉
特殊、 打印出执行时间超过3秒的connection,仅仅打印,不kill
- 每2秒循环一次,超过10秒就退出pt-kill程序
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --busy-time=3 --interval 2 --run-time=10
重点注意: 这里的--busy-time=3,指的是Command=Query的连接,其他的并不会被匹配哦 , 所以一般情况下删除的都是比较安全的用户thread
重点注意2: ddl,dml,select,都是属于Command=Query
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 492777 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 492765 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Sleep | 0 | | NULL |
| 1053 | dbadmin | localhost | heartbeat_db | Query | 1 | altering table | alter table heartbeat add column ts2 date |
| 1055 | pt_kill | xx.xxx.126.166:63167 | NULL | Sleep | 1 | | NULL |
| 1056 | dbadmin | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 492716 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 492704 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Sleep | 0 | | NULL |
| 1053 | dbadmin | localhost | heartbeat_db | Query | 3 | updating | update heartbeat set ts = '2018-09-06 00:07:58' |
| 1055 | pt_kill | xx.xxx.126.166:63167 | NULL | Sleep | 0 | | NULL |
| 1056 | dbadmin | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 492613 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 492601 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Sleep | 0 | | NULL |
| 1053 | dbadmin | localhost | heartbeat_db | Query | 3 | User sleep | select 1,sleep(4) |
| 1055 | pt_kill | xx.xxx.126.166:63167 | NULL | Sleep | 1 | | NULL |
| 1056 | dbadmin | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 492740 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 492728 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Sleep | 0 | | NULL |
| 1053 | dbadmin | localhost | heartbeat_db | Query | 4 | starting | rollback |
| 1055 | pt_kill | xx.xxx.126.166:63167 | NULL | Sleep | 0 | | NULL |
| 1056 | dbadmin | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
一、打印出sleep时间超过3秒的connection,仅仅打印,不kill
- 每2秒循环一次,无限循环下去
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --idle-time=3 --interval 2
- 每2秒循环一次,超过10秒就退出pt-kill程序
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --idle-time=3 --interval 2 --run-time=10
二、kill掉query语句中带有sleep关键字(不区分大小写)的connection, 且Time超过3秒
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --interval 2 --match-info "(?i-xsm:(sleep))" --busy-time=3 --kill --victims all
三、kill掉非系统用户的select开头,且执行时间超过3秒的 connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))" --ignore-user="root|repl" --busy-time=3 --kill --victims all
四、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒的 connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))" --ignore-user="root|repl" --busy-time=3 --kill --victims all
五、kill掉指定特征的query语句
- kill掉非系统用户,且query语句中同时包含heartbeat 和 where ,且heartbeat在前where在后,且执行时间超过3秒的 connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info ".*heartbeat.*where.*" --ignore-user="root|repl" --busy-time=3 --kill --victims all
# 2018-11-15T10:38:03 KILL 1053 (Query 27 sec) select *,sleep(10) from heartbeat where id < '1000000000'
# 2018-11-15T10:38:05 KILL 1053 (Query 29 sec) select *,sleep(10) from heartbeat where id < '1000000000'
# 2018-11-15T10:38:07 KILL 1053 (Query 31 sec) select *,sleep(10) from heartbeat where id < '1000000000'
# 2018-11-15T10:38:09 KILL 1053 (Query 33 sec) select *,sleep(10) from heartbeat where id < '1000000000'
六、kill掉非系统库的select开头,且执行时间超过3秒的 connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))" --ignore-db="mysql|information_schema" --ignore-user="root|repl" --busy-time=3 --kill --victims all
七、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒,且不是被locked住 的 connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))" --ignore-state="Locked" --ignore-user="root|repl" --busy-time=3 --kill --victims all
八、kill掉非系统用户,指定state(Locked、login、Updating、Sorting for order等状态),且执行时间超过3秒 的 connection
- 指定Locked的connection删除掉
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))" --match-state="Locked" --ignore-user="root|repl" --busy-time=3 --kill --victims all
九、kill掉非系统用户,指定Command(Query、Sleep、Binlog Dump、Connect等状态),且执行时间超过3秒 的 connection
- kill掉指定Connect的command connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))" --match-command="Connect" --ignore-user="root|repl" --busy-time=3 --kill --victims all
十、kill掉指定来源host ip ,且select开头的,且执行时间超过3s的connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))" --ignore-host="x.x.x.x" --ignore-user="root|repl" --busy-time=3 --kill --victims all
十一、kill掉非系统用户,Command=Sleep,且空闲时间为3s的connection
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl" --idle-time=3 --kill --victims all
十二、kill掉非系统用户,指定特征的query,在后台运行,并打印日志
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info ".*heartbeat.*where.*" --ignore-user="root|repl" --busy-time=3 --daemonize --log='/root/kill.log' --kill --victims all
十三、--victims的用法
- 背景
pid:1103 , 是最早开启事务的空闲进程 T1
pid:1095 , 是第二早开启事务的ddl进程 T2
pid:502 , 是最后一个开启事务的dml进程 T3
事务顺序是: T1 锁住了 T2,T3, T2锁住了T3 , T3被T1,T2锁住
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 507504 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 507492 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Query | 328 | Waiting for table metadata lock | insert into heartbeat(ts) values('2018-11-15 13:43:50') |
| 1095 | dbadxxx | localhost | heartbeat_db | Query | 329 | Waiting for table metadata lock | alter table heartbeat add column ts3 date |
| 1103 | dbadxxx | localhost | heartbeat_db | Sleep | 341 | | NULL |
| 1104 | dbadxxx | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
6 rows in set (0.00 sec)
- --match-command="Query|Sleep" --victims oldest --busy-time=3
只kill最老的command为Query|Sleep的最老的链接
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl" --match-command="Query|Sleep" --busy-time=3 --victims oldest
Enter MySQL password:
# 2018-11-15T13:49:07 KILL 1103 (Sleep 330 sec) NULL
# 2018-11-15T13:49:09 KILL 1103 (Sleep 332 sec) NULL
- --match-command="Query|Sleep" --victims all --busy-time=3
kill 所有command="Query|Sleep" 的所有链接
Enter MySQL password:
# 2018-11-15T13:48:41 KILL 1103 (Sleep 304 sec) NULL
# 2018-11-15T13:48:41 KILL 1095 (Query 292 sec) alter table heartbeat add column ts3 date
# 2018-11-15T13:48:41 KILL 502 (Query 291 sec) insert into heartbeat(ts) values('2018-11-15 13:43:50')
# 2018-11-15T13:48:41 KILL 1104 (Sleep 262 sec) NULL
- --match-command="Query" --victims all --busy-time=3
kill 所有 command="Query"(默认不填也就是Query)的所有链接
Enter MySQL password:
# 2018-11-15T13:46:59 KILL 1095 (Query 190 sec) alter table heartbeat add column ts3 date
# 2018-11-15T13:46:59 KILL 502 (Query 189 sec) insert into heartbeat(ts) values('2018-11-15 13:43:50')
- --match-command="Query" --victims oldest --busy-time=3
kill 所有 command="Query"(默认不填也就是Query)的最老的链接
Enter MySQL password:
# 2018-11-15T13:59:01 KILL 1095 (Query 912 sec) alter table heartbeat add column ts3 date
十四、kill掉非系统用户,指定库的所有链接
举例一、kill掉非root和repl账号,使用open_db或xf_loupan_db的所有链接
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl" --match-db="open_db|xf_loupan_db" --victims all --kill
dbadmin:xf_loupan_db> select *,sleep(100) from loupan_grade_photo;
ERROR 2013 (HY000): Lost connection to MySQL server during query
dbadmin:xf_loupan_db> select *,sleep(100) from loupan_grade_photo;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 245
Current database: xf_loupan_db
ERROR 2013 (HY000): Lost connection to MySQL server during query
Enter MySQL password:
# 2018-12-26T12:34:46 KILL 243 (Query 7 sec) select *,sleep(100) from loupan_grade_photo
# 2018-12-26T12:35:00 KILL 245 (Query 1 sec) select *,sleep(100) from loupan_grade_photo
特别注意: 这里面只针对db这一列是open_db或xf_loupan_db才会有效果,如果这个用户拥有多个库的权限
那么他在xx库去 select * from xf_loupan_db.loupan_grade_photo 是不会生效的
所以,要保证这个库账号只能对应一个库,才会有效果
比如这样是没有效果的:
root:(none)> show processlist;
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
| 248 | root | localhost | NULL | Query | 0 | starting | show processlist |
| 252 | pt_kill | xx.xxx.126.166:33088 | NULL | Sleep | 1 | | NULL |
| 257 | dbadmin | localhost | sys | Query | 19 | User sleep | select *,sleep(100) from xf_loupan_db.loupan_grade_photo |
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
3 rows in set (0.00 sec)
十五、kill掉指定用户的所有链接
一般我们什么时候使用它呢? 一般是在迁移某个DB,需要将这个DB上的账号链接全部清理掉,防止长连接
举例:kill掉所有lc_rx 和 rc_ronly 两个账号的所有链接
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-user="lc_rx|rc_ronly" --victims all --kill