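As a full-time ops engineer, you run into all kinds of problems at any hour. Tonight I received an urgent alert: disk usage on one of our database servers had climbed rapidly from 80% to 90%. Our database servers all have disks larger than 2 TB, so a jump like that had to be serious, and I got online right away to deal with it.
Here is what I found: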
    [root@mysql-node1 tmp]# ls
    #sql_8cc3_0.MYD  #sql_8cc3_0.MYI  #sql_8cc3_10.MYD  #sql_8cc3_10.MYI  #sql_8cc3_5.MYD  #sql_8cc3_5.MYI
    [root@mysql-node1 tmp]# du -sh *
    36K     hsperfdata_root
    346G    #sql_8cc3_0.MYD
    4.0K    #sql_8cc3_0.MYI
    336G    #sql_8cc3_10.MYD
    4.0K    #sql_8cc3_10.MYI
    340G    #sql_8cc3_5.MYD
    4.0K    #sql_8cc3_5.MYI
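From general knowledge I could tell these were files generated by MySQL: the `#sql` prefix marks internal temporary tables, and .MYD/.MYI are MyISAM data and index files. Three of them had grown to roughly 340 GB each. So I looked at the database: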
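To keep an eye on how much space such files occupy while investigating, something like the following might help. It is only a sketch: the function name `sql_tmp_bytes` is made up for illustration, it assumes GNU find (for `-printf`), and it assumes the temporary files live directly in the directory you pass in (you can confirm the location with `SHOW VARIABLES LIKE 'tmpdir';`).

```shell
# Sum the size in bytes of MySQL temporary-table files ('#sql*') in a directory.
# Hypothetical helper; assumes GNU find and a flat tmpdir layout.
sql_tmp_bytes() {
    dir=${1:-/tmp}
    find "$dir" -maxdepth 1 -type f -name '#sql*' -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Running `sql_tmp_bytes /tmp` every few seconds shows whether the files are still growing.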
    mysql> show processlist;
    +------------+-----------------+--------------------+--------+---------+----------+------------------------+------------------------------------------------------------------------------------------+
    | Id         | User            | Host               | db     | Command | Time     | State                  | Info                                                                                     |
    +------------+-----------------+--------------------+--------+---------+----------+------------------------+------------------------------------------------------------------------------------------+
    | 1          | event_scheduler | localhost          | NULL   | Daemon  | 54024745 | Waiting on empty queue | NULL                                                                                     |
    | 2912394659 | nginxs_rw       | 172.17.11.99:12936 | nginxs | Execute | 12508    | Sending data           | select month(a.blog_date), count(distinct b.usname) as android, count(distinct c.usname) |
    | 2912395083 | nginxs_rw       | 172.17.11.99:34020 | nginxs | Execute | 12051    | Sending data           | select month(a.blog_date), count(distinct b.usname) as android, count(distinct c.usname) |
    | 2912402122 | root            | localhost          | nginxs | Query   | 0        | init                   | show processlist                                                                         |
    +------------+-----------------+--------------------+--------+---------+----------+------------------------+------------------------------------------------------------------------------------------+
    10 rows in set (0.00 sec)
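On a busy server the process list can be long, so it may be easier to filter it than to read it by eye. The sketch below assumes the tab-separated output that `mysql -B` prints for `SHOW PROCESSLIST` (columns Id, User, Host, db, Command, Time, State, Info, with a header row); the function name `long_running` and the one-hour default threshold are my own choices.

```shell
# Read tab-separated SHOW PROCESSLIST output on stdin and print
# "Id<TAB>Time<TAB>User" for statements running longer than min_secs.
# Hypothetical helper; column positions assume the standard processlist layout.
long_running() {
    min_secs=${1:-3600}
    awk -F '\t' -v min="$min_secs" '
        NR == 1 { next }                            # skip the header row
        $5 == "Sleep" || $5 == "Daemon" { next }    # idle connections, event scheduler
        $6 + 0 > min + 0 { print $1 "\t" $6 "\t" $2 }
    '
}
```

Usage might look like `mysql -B -e 'SHOW FULL PROCESSLIST' | long_running 3600`.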
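The fix:
Both of the nginxs_rw statements above had been running for more than 12,000 seconds (about three and a half hours) in the "Sending data" state. From experience I quickly concluded that the JOINs in these two statements were the problem, and immediately contacted the people responsible to confirm that the statements could be killed.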
    mysql> kill 2912394659;
    Query OK, 0 rows affected (0.00 sec)

    mysql> kill 2912395083;
    Query OK, 0 rows affected (0.00 sec)
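Once these two statements were killed, the database immediately started releasing the temporary files and disk usage returned to normal.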
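When more than a couple of threads need to go, typing the ids by hand gets error-prone. A minimal sketch, assuming you already have the thread ids one per line: the hypothetical `kill_statements` helper below only prints the statements so they can be reviewed before anything is executed.

```shell
# Turn a list of thread ids (one per line on stdin) into KILL statements.
# Non-numeric lines are ignored so stray header text never reaches the output.
kill_statements() {
    awk '/^[0-9]+$/ { print "KILL " $0 ";" }'
}
```

After checking the output, it could be piped into `mysql`. If you would rather end only the running statement and keep the client connection, MySQL also offers `KILL QUERY <id>`.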
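Summary:
On a production database, try to put upper limits on temporary files and runaway queries. Below are a few commonly used parameters. Note that tmp_table_size and max_heap_table_size only bound in-memory temporary tables (anything larger spills to disk), while max_join_size makes the server reject SELECTs that it estimates would examine more rows than the limit.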
    tmp_table_size = 256M
    max_heap_table_size = 256M
    thread_cache_size = 64
    myisam_sort_buffer_size = 32M
    myisam_max_sort_file_size = 10G
    max_join_size = 268435456
    innodb_online_alter_log_max_size = 134217728
    innodb_sort_buffer_size = 1048576
    max_allowed_packet = 128M
    max_binlog_size = 256M
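To change a value at runtime (a plain SET affects only the current session; use SET GLOBAL for new connections, and also put the setting in my.cnf so it survives a restart):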
    mysql> set max_join_size=268435456;
    Query OK, 0 rows affected (0.00 sec)
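One catch when applying the other parameters this way: size suffixes such as 256M are accepted in option files and at server startup, but not in a runtime SET, which needs a plain number or an expression. A small conversion helper can bridge the two; `to_bytes` below is a hypothetical name, and the sketch handles only the K, M and G suffixes.

```shell
# Convert a my.cnf-style size (e.g. 256M, 10G, 1048576) to a plain number.
# Hypothetical helper; supports only K/M/G suffixes (upper or lower case).
to_bytes() {
    value=$1
    case $value in
        *[Kk]) echo $(( ${value%?} * 1024 )) ;;
        *[Mm]) echo $(( ${value%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${value%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$value" ;;
    esac
}
```

For example, `mysql -e "SET GLOBAL tmp_table_size = $(to_bytes 256M)"` would send 268435456 to the server.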
Reposted from: https://blog.51cto.com/nginxs/1933625