Problem:
For example, df -lh reports 325G used on /data, while du -sh finds only 266G, a gap of about 59G.
[root@localhost /]# df -lh
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        40G   14G   24G  37% /
tmpfs            16G     0   16G   0% /dev/shm
/dev/vdb        394G  325G   49G  87% /data
[root@localhost /]# du -sh /data
266G /data
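The gap (325G - 266G, roughly 59G here) can also be computed rather than read off the two outputs by eye. The following is a minimal sketch: the function name df_du_gap_kb is made up for illustration, and it assumes the directory passed in is the mount point of its own filesystem, as /data is above.

```shell
#!/bin/bash
# Hypothetical helper: print how many KiB df counts as used beyond what du can find.
df_du_gap_kb() {
    local dir=$1 used_df used_du
    # -P keeps each filesystem on one line; column 3 is "Used", in KiB because of -k.
    used_df=$(df -Pk "$dir" | awk 'NR==2 {print $3}')
    # -x keeps du on one filesystem; column 1 is the total in KiB.
    used_du=$(du -skx "$dir" 2>/dev/null | awk '{print $1}')
    echo $(( used_df - used_du ))
}

# Usage: df_du_gap_kb /data
```

A small difference is normal (filesystem metadata, files changing during the scan); a gap of tens of gigabytes is what points at deleted files that are still open.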
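Cause:
Files that have already been deleted are still held open by running programs, so their space has not actually been released. du walks the directory tree and no longer sees these files, while df reads the filesystem's own usage counters and still includes them.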
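The effect is easy to reproduce on a scratch file. The following is a small Linux-only sketch, not taken from the affected server; the variable names are illustrative, and the {demo_fd} syntax assumes bash 4.1 or later.

```shell
#!/bin/bash
# Demo (Linux only): a deleted file stays allocated while a descriptor is open on it.
demo_file=$(mktemp)                   # scratch file; the name is picked by mktemp
exec {demo_fd}>"$demo_file"           # hold the file open (bash picks a free descriptor)
echo "still here" >&"$demo_fd"
rm -f "$demo_file"                    # removes the directory entry only
# /proc/self/fd/N is resolved by readlink itself, which inherits the descriptor.
# On Linux the link target is the old path followed by " (deleted)".
link_target=$(readlink "/proc/self/fd/$demo_fd")
echo "$link_target"
# The content is still readable, so its blocks are still counted by df.
leftover=$(cat "/proc/self/fd/$demo_fd")
echo "$leftover"
exec {demo_fd}>&-                     # closing the last descriptor frees the space
```

The same thing happens when a log file is removed while tail or a monitoring agent still has it open: the name disappears, the blocks do not.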
Solution:
1. Run lsof |grep deleted to see which deleted files have not been released yet:
[root@localhost /]# lsof |grep deleted
zabbix_ag 1698 zabbix 1w REG 252,1 1048637 1441797 /tmp/zabbix_agentd.log.old (deleted)
zabbix_ag 1698 zabbix 2w REG 252,1 1048637 1441797 /tmp/zabbix_agentd.log.old (deleted)
zabbix_ag 1703 zabbix 2w REG 252,1 1048637 1441797 /tmp/zabbix_agentd.log.old (deleted)
tail 3003 cc.Liu 3r REG 252,16 3606512687 20972229 /data/applog/service/2017_04_25/live_service_bootstrap_info.log (deleted)
ilogtail_ 3316 root cwd DIR 252,1 0 1973408 /home/cha123 (deleted)
ilogtail_ 3317 root 31r REG 252,16 14319951610 20974265 /data/applog/service/2017_01_25/live_service_bootstrap_info.log (deleted)
tailf 17518 tcollector 4r REG 252,16 9373625578 20973019 /data/applog/service/2017_01_25/live_service_bootstrap_info.log (deleted)
tailf 26502 root 4r REG 252,16 9208694487 20972852 /data/applog/service/2017_01_25/live_service_bootstrap_info.log (deleted)
tail 3595 cc.Liu 3r REG 252,16 3606512687 20972229 /data/applog/service/2017_04_25/live_service_bootstrap_info.log (deleted)
grep 9630 root 1w REG 252,1 0 1482667 /tmp/deleted_file
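2. Sort by size (column 7 of the lsof output) to find the largest unreleased files. This assumes the output of step 1 was saved to a file named deleted_file:
sort -nr -k 7 deleted_file > sort_deleted_file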
more sort_deleted_file
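The same deleted file can appear several times in the list, once per process or descriptor holding it; the three zabbix_ag entries above all point at inode 1441797. To estimate how much space is really tied up, each device and inode pair should be counted once. A minimal sketch, assuming the nine-column lsof layout shown above (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME); the function name sum_deleted_bytes is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: total bytes held by deleted regular files, counting each file once.
# $1 = file containing saved "lsof | grep deleted" output.
sum_deleted_bytes() {
    awk '$NF == "(deleted)" && $5 == "REG" {
             key = $6 ":" $8              # device + inode identifies one file
             if (!(key in seen)) { seen[key] = 1; total += $7 }
         }
         END { printf "%.0f\n", total }' "$1"
}

# Usage: sum_deleted_bytes deleted_file
```

If the lsof version in use prints an extra column (for example a thread ID), the field numbers in the awk program need to be shifted accordingly.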
3. Write a script that kills the processes holding the files, so that the space is released:
#!/bin/bash
FILENAME='deleted_file'
process=("zabbix_ag"
"tail")
for i in "${process[@]}"
do
    # grep matches anywhere in the line, so "tail" also selects tailf and ilogtail_
    /usr/sbin/lsof | grep deleted | grep "${i}" | grep -v 'grep' >> "/tmp/${FILENAME}"
done
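# Column 2 of the lsof output is the PID; sort -u avoids killing the same PID twice
awk '{print $2}' "/tmp/${FILENAME}" | sort -u | while read -r LINE
do
    # Quote the variable: with an unquoted empty $LINE, [ -n ] is always true
    if [ -n "$LINE" ]; then
        kill -9 "$LINE"
        echo "PID $LINE is killed!"
    fi
done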
/bin/rm "/tmp/${FILENAME}"
if [ -f "/tmp/${FILENAME}" ]; then
    # rm failed, so at least empty the list
    cat /dev/null > "/tmp/${FILENAME}"
else
    echo 'file is deleted!'
fi
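Before killing anything it can help to print exactly which PIDs would be hit. Because grep "${i}" in the script above matches anywhere in the line, the pattern tail also selects tailf and ilogtail_. The sketch below compares the COMMAND column exactly instead; the function name pids_holding_deleted is hypothetical, and it works on saved lsof output in the layout shown above.

```shell
#!/bin/bash
# Hypothetical helper: list the unique PIDs of one command that hold deleted files.
# $1 = file with saved lsof output, $2 = exact COMMAND column value (e.g. tail).
pids_holding_deleted() {
    awk -v name="$2" '$1 == name && $NF == "(deleted)" {print $2}' "$1" | sort -un
}

# Usage: pids_holding_deleted deleted_file tail
```

Note that lsof truncates the COMMAND column to nine characters by default, which is why zabbix_agentd appears as zabbix_ag; the name passed in has to be the truncated form.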