1. Symptoms
A routine inspection found errors in the alert log:
$ tail -500f alert_ywjk.log
Mon Feb 05 09:57:38 2018
Non critical error ORA-48180 caught while writing to trace file
"/u01/app/oradba/diag/rdbms/ywjk/ywjk/trace/ywjk_ora_28302.trc"
Error message: Linux-x86_64 Error: 28: No space left on device
Additional information: 1
Writing to the above trace file is disabled for now on...
OS Audit file could not be created; failing after 6 retries
Logging in to the database also failed:
$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Mon Feb 5 09:28:18 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
ERROR:
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 28: No space left on device
Additional information: 9925
ORA-01075: you are currently logged on
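2. Analysis
The error message "No space left on device" can mean the filesystem has run out of inodes rather than disk blocks. df -i shows inode usage, and here it shows that the root filesystem has used 100% of its inodes, so the excess small files have to be deleted.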
$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/vg_szzjjk07-lv_root 6553600 6553600 0 100% /
tmpfs 10294742 4 10294738 1% /dev/shm
/dev/sda2 128016 29 127987 1% /boot
/dev/sda1 0 0 0 - /boot/efi
/dev/mapper/vg_szzjjk07-LogVol03 161914880 2834 161912046 1% /data
/dev/mapper/vg_szzjjk07-LogVol02 5242880 12 5242868 1% /oracle
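Note: whether df -h or df -i reports 100% usage, the fix is essentially the same: delete files.
If df -h is full, delete large files that are no longer needed, because large files consume disk capacity.
If df -i is full, delete the excess small files, because a large number of files consumes the inodes.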
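The two checks in the note above can be combined into one pass. The following is a minimal sketch, assuming a Linux system with GNU df; the function names filter_full and check_fs and the default threshold of 90 are illustrative choices, not part of the original procedure.

```shell
#!/bin/sh
# filter_full LABEL THRESHOLD (hypothetical helper): read df -P style output
# on stdin and print every mount point whose use% column (field 5) is at or
# above THRESHOLD. Rows with a non-numeric percentage ("-") are skipped.
filter_full() {
    awk -v label="$1" -v t="$2" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)
            if (pct ~ /^[0-9]+$/ && pct + 0 >= t + 0)
                printf "%s %s%% %s\n", label, pct, $6
        }'
}

# check_fs [THRESHOLD] (hypothetical helper): report filesystems that are
# short on disk blocks or on inodes. -P keeps each filesystem on one line
# even when the device name is long.
check_fs() {
    threshold="${1:-90}"
    df -P  | filter_full space "$threshold"
    df -Pi | filter_full inode "$threshold"
}
```

Calling check_fs 90 prints one line per problem filesystem, labelled space or inode, so it is clear which kind of cleanup is needed.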
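3. Resolution
Use the following loop to find which top-level directory holds the most files:
for i in /*/; do echo "$i"; find "$i" | wc -l; done
The audit directory turned out to be the culprit: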
/u01/app/oradba/admin/ywjk/adump
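The directory search above usually has to be repeated one level at a time until the offending directory is found. The sketch below wraps that step in a function that sorts its output; count_entries is a hypothetical helper name, and the code assumes a find that supports -xdev (GNU and BSD find both do).

```shell
#!/bin/sh
# count_entries DIR (hypothetical helper): print "COUNT PATH" for each
# immediate subdirectory of DIR, largest count first. Each count includes
# the subdirectory itself. -xdev stops find from descending into other
# filesystems mounted below the subdirectory being counted.
count_entries() {
    dir="${1%/}"                       # "/" becomes "", so the glob is /*/
    for d in "$dir"/*/; do
        [ -d "$d" ] || continue        # skip the literal pattern if nothing matched
        n=$(find "$d" -xdev | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$d"
    done | sort -rn
}
```

Starting with count_entries / and then calling it again on the top entry narrows the search level by level.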
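Check the naming format of the audit files:
-rw-r----- 1 oradba oinstall 895 Feb 3 01:28 ywjk_ora_15740_20180203012842333186143795.aud
Using a text editor, generate one delete command per day, covering October 1, 2017 through January 31, 2018: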
rm -rf ywjk_ora_*_20171001*.aud
rm -rf ywjk_ora_*_20171002*.aud
rm -rf ywjk_ora_*_20171003*.aud
rm -rf ywjk_ora_*_20171004*.aud
...
rm -rf ywjk_ora_*_20180131*.aud
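The per-day rm commands above rely on the shell expanding each glob, which can fail with "Argument list too long" when a single day matches a very large number of files. A possible alternative, sketched below, selects files by the date embedded in the file name and deletes them one at a time. purge_audit_range is a hypothetical helper name, and the code assumes the file name format shown earlier (timestamp after the last underscore, starting with YYYYMMDD) and a find that supports -maxdepth.

```shell
#!/bin/sh
# purge_audit_range DIR FROM TO (hypothetical helper): delete audit files
# named ywjk_ora_<pid>_<timestamp>.aud in DIR whose timestamp starts with a
# day between FROM and TO (both YYYYMMDD, inclusive).
purge_audit_range() {
    find "$1" -maxdepth 1 -type f -name 'ywjk_ora_*_*.aud' |
    awk -v from="$2" -v to="$3" '{
        n = split($0, parts, "_")        # timestamp is the last "_" field
        day = substr(parts[n], 1, 8)     # its first 8 characters are YYYYMMDD
        if (day >= from && day <= to) print
    }' |
    while IFS= read -r f; do
        rm -f -- "$f"                    # one file per call, no long argument list
    done
}
```

For the range used in this incident the call would be purge_audit_range /u01/app/oradba/admin/ywjk/adump 20171001 20180131. Replacing rm -f with echo first gives a dry run of what would be removed.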
Check inode usage again:
$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/vg_szzjjk07-lv_root 6553600 4629762 1923838 71% /
tmpfs 10294742 4 10294738 1% /dev/shm
/dev/sda2 128016 29 127987 1% /boot
/dev/sda1 0 0 0 - /boot/efi
/dev/mapper/vg_szzjjk07-LogVol03 161914880 2837 161912043 1% /data
/dev/mapper/vg_szzjjk07-LogVol02 5242880 12 5242868 1% /oracle
Check the alert log:
$ tail -5000f alert_ywjk.log
Mon Feb 05 10:01:24 2018
OS Audit file could not be created; failing after 6 retries
Mon Feb 05 10:24:03 2018
Thread 1 cannot allocate new log, sequence 8680
Private strand flush not complete
Current log# 9 seq# 8679 mem# 0: /data/ywjk/ywjk/redo09.log
Thread 1 advanced to log sequence 8680 (LGWR switch)
Current log# 10 seq# 8680 mem# 0: /data/ywjk/ywjk/redo10.log
Mon Feb 05 10:24:58 2018
Archived Log entry 8678 added for thread 1 sequence 8679 ID 0x5ea53704 dest 1:
The database now accepts connections normally.