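Incident:
1. August 1: the archive log destination filled up, the database hung, and users could not log in to the application system.
Resolution: on August 1 the archived log files were deleted.
Assessment of the resolution:
The root problem was not addressed: by August 15 the 500 GB archive directory was at 100% again, with only 3 GB free.
Cause of the failure:
Analysis of the database background log: alert_ybzdb1.log
1. Wednesday, July 6, 2011, 22:17:34: the archive log space filled up (the archiver reports the write failure at 22:17:46). From then on, no database update operation could be executed.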
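To put a number on that assessment, here is a minimal sketch of a usage check that could flag the archive directory before it fills. The function names and the 80% threshold are illustrative assumptions, not part of the original procedure; the mount point /arch is taken from the alert log.

```python
import shutil


def percent_used(total_bytes: float, free_bytes: float) -> float:
    """Return the used share of a filesystem as a percentage."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * (total_bytes - free_bytes) / total_bytes


def archive_dir_alert(path: str, threshold_pct: float = 80.0) -> bool:
    """Return True when the filesystem holding `path` is at or above the threshold.

    The 80% default is an assumed value; choose one that leaves enough
    time to back up and remove archived logs.
    """
    usage = shutil.disk_usage(path)
    return percent_used(usage.total, usage.free) >= threshold_pct


if __name__ == "__main__":
    # Figures from the report: a 500 GB directory with 3 GB left.
    print(round(percent_used(500, 3), 1))
```

With the August 15 figures the check gives about 99.4% used, which `df` rounds to 100%. On the server it would be called as `archive_dir_alert("/arch")` from a scheduled job.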
Wed Jul 6 22:17:34 2011
Thread 1 advanced to log sequence 6002 (LGWR switch)
Current log# 2 seq# 6002 mem# 0: /dev/rredo_2_512m_01
Current log# 2 seq# 6002 mem# 1: /dev/rredo_2_512m_02
Wed Jul 6 22:17:46 2011
ARC1: Encountered disk I/O error 19502
Wed Jul 6 22:17:46 2011
ARC1: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/1_6001_702581932.dbf' (error 19502)
(ybzdb1)
Wed Jul 6 22:17:47 2011
Errors in file /oracle/app/admin/ybzdb/bdump/ybzdb1_arc1_1098138.trc:
ORA-19502: write error on file "/arch/1_6001_702581932.dbf", blockno 380929 (blocksize=512)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/arch/1_6001_702581932.dbf", blockno 380929 (blocksize=512)
......
......
......
Thursday, July 7, 08:32:32: the failure persists.
Thu Jul 7 08:32:32 2011
Errors in file /oracle/app/admin/ybzdb/bdump/ybzdb1_arc0_1114392.trc:
ORA-19502: write error on file "/arch/1_6001_702581932.dbf", blockno 399361 (blocksize=512)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/arch/1_6001_702581932.dbf", blockno 380929 (blocksize=512)
Thursday, July 7, 08:38:42: the failure is cleared; the database can archive again and business operations run normally.
Thu Jul 7 08:38:42 2011
Thread 1 advanced to log sequence 6004 (LGWR switch)
Current log# 1 seq# 6004 mem# 0: /dev/rredo_1_512m_01
Current log# 1 seq# 6004 mem# 1: /dev/rredo_1_512m_02
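The failed archive writes can be pulled out of the alert log mechanically rather than by eye. The following is a rough sketch that assumes the layout shown in the excerpts above (a timestamp line followed by the message lines); the function name and the regular expression are illustrative and have not been checked against other Oracle versions.

```python
import re
from datetime import datetime

# Timestamp layout used in the alert log excerpts, e.g. "Wed Jul 6 22:17:46 2011".
TIMESTAMP = "%a %b %d %H:%M:%S %Y"

# Matches the archiver message that names the file it failed to write.
CLOSE_RE = re.compile(
    r"^(ARC\d+): Closing local archive destination \S+: '([^']+)' \(error (\d+)\)"
)


def failed_archives(lines):
    """Yield (timestamp, archiver, file path, error code) per failed archive write."""
    current = None
    for raw in lines:
        line = raw.strip()
        try:
            # A line that parses as a timestamp dates the messages that follow it.
            current = datetime.strptime(line, TIMESTAMP)
            continue
        except ValueError:
            pass
        match = CLOSE_RE.match(line)
        if match:
            yield current, match.group(1), match.group(2), int(match.group(3))


sample = """Wed Jul 6 22:17:46 2011
ARC1: Encountered disk I/O error 19502
Wed Jul 6 22:17:46 2011
ARC1: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/1_6001_702581932.dbf' (error 19502)
""".splitlines()

for ts, archiver, path, error in failed_archives(sample):
    print(ts, archiver, path, error)
```

Running the same function over the whole of alert_ybzdb1.log would list every sequence whose archiving failed, together with the time of the failure.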
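July 7 to July 21: archiving ran normally, and log switches advanced from sequence 6004 to sequence 6910. The database switches the online log group first and then archives the previous group, so after the switch to 6910 it was archiving sequence 6909.
In total, 453 GB of archived logs were written.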
Thu Jul 21 20:15:32 2011
Thread 1 advanced to log sequence 6910 (LGWR switch)
Current log# 1 seq# 6910 mem# 0: /dev/rredo_1_512m_01
Current log# 1 seq# 6910 mem# 1: /dev/rredo_1_512m_02
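The 453 GB figure follows from the sequence numbers. A small sketch of the arithmetic, assuming each online redo log is 512 MB (inferred from the device names such as /dev/rredo_1_512m_01) and that every log was full when archived; an archived log can be smaller when a switch happens early, so this is an upper bound. The function name is illustrative.

```python
REDO_LOG_MB = 512  # assumed from the device names /dev/rredo_*_512m_*


def archived_gb(first_seq: int, last_seq: int, log_mb: int = REDO_LOG_MB) -> float:
    """Upper-bound volume in GB of the logs archived between two sequence numbers."""
    if last_seq < first_seq:
        raise ValueError("last_seq must not be smaller than first_seq")
    return (last_seq - first_seq) * log_mb / 1024


print(archived_gb(6004, 6910))  # 906 logs * 512 MB = 453.0 GB
```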
Thursday, July 21, 2011, 20:15:51: archiving of sequence 6909 fails because the archive directory is full again.
Thu Jul 21 20:15:51 2011
ARC0: Encountered disk I/O error 19502
Thu Jul 21 20:15:51 2011
ARC0: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/1_6909_702581932.dbf' (error 19502)
(ybzdb1)
Thu Jul 21 20:15:52 2011
Errors in file /oracle/app/admin/ybzdb/bdump/ybzdb1_arc0_1114392.trc:
ORA-19502: write error on file "/arch/1_6909_702581932.dbf", blockno 587777 (blocksize=512)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/arch/1_6909_702581932.dbf", blockno 585729 (blocksize=512)
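Friday, July 22, 08:59: archiving returns to normal, about 20 minutes later than the 08:38 recovery on July 7. What were the maintenance staff doing after arriving at the office at 08:30?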
Fri Jul 22 08:59:03 2011
Errors in file /oracle/app/admin/ybzdb/bdump/ybzdb1_arc1_1098138.trc:
ORA-16038: log 3 sequence# 6909 cannot be archived
ORA-00001: unique constraint (.) violated
Fri Jul 22 08:59:24 2011
kcrrdmx: Successful archiving of previously failed ORL
Archiver process freed from errors. No longer stopped
Fri Jul 22 08:59:24 2011
Thread 1 advanced to log sequence 6912 (LGWR switch)
Current log# 3 seq# 6912 mem# 0: /dev/rredo_3_512m_01
Current log# 3 seq# 6912 mem# 1: /dev/rredo_3_512m_02
Sunday, July 31, 15:16: online redo log switches have advanced from sequence 6912 to 7515, and 301 GB has been archived. Archiving of sequence 7514 fails.
Sun Jul 31 15:16:07 2011
Thread 1 advanced to log sequence 7515 (LGWR switch)
Current log# 3 seq# 7515 mem# 0: /dev/rredo_3_512m_01
Current log# 3 seq# 7515 mem# 1: /dev/rredo_3_512m_02
Sun Jul 31 15:16:26 2011
ARC1: Encountered disk I/O error 19502
Sun Jul 31 15:16:26 2011
ARC1: Closing local archive destination LOG_ARCHIVE_DEST_1: '/arch/1_7514_702581932.dbf' (error 19502)
(ybzdb1)
Sun Jul 31 15:16:27 2011
Errors in file /oracle/app/admin/ybzdb/bdump/ybzdb1_arc1_1098138.trc:
ORA-19502: write error on file "/arch/1_7514_702581932.dbf", blockno 532481 (blocksize=512)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576
ORA-19502: write error on file "/arch/1_7514_702581932.dbf", blockno 532481 (blocksize=512)
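The sequence numbers and timestamps also give a rough archive generation rate, which shows how soon a 500 GB directory fills if nothing is removed. This is a sketch under the same assumption of full 512 MB logs; the start time is the recovery at 08:59:24 on July 22 and the end time is the switch to sequence 7515. The function name is illustrative.

```python
from datetime import datetime


def gb_per_day(first_seq, last_seq, start, end, log_mb=512):
    """Average archive volume per day between two log switches (upper bound)."""
    gb = (last_seq - first_seq) * log_mb / 1024
    days = (end - start).total_seconds() / 86400
    return gb / days


rate = gb_per_day(
    6912, 7515,
    datetime(2011, 7, 22, 8, 59, 24),
    datetime(2011, 7, 31, 15, 16, 7),
)
days_to_fill = 500 / rate  # 500 GB archive directory, starting empty

print(f"{rate:.1f} GB/day, about {days_to_fill:.1f} days to fill 500 GB")
```

The result is a little over 32 GB per day, so an emptied 500 GB directory lasts roughly two weeks. That matches the interval between the cleanup on August 1 and the directory being full again on August 15.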
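Monday, August 1, 10:26: the database resumes normal operation only after maintenance staff free up space. For as long as archiving is blocked, the database cannot perform DML operations.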
Mon Aug 1 10:26:04 2011
ORA-16038: log 2 sequence# 7514 cannot be archived
ORA-00001: unique constraint (.) violated
Mon Aug 1 10:26:04 2011
Errors in file /oracle/app/admin/ybzdb/bdump/ybzdb1_arc1_1098138.trc:
ORA-16038: log 2 sequence# 7514 cannot be archived
ORA-00001: unique constraint (.) violated
Mon Aug 1 10:26:20 2011
kcrrdmx: Successful archiving of previously failed ORL
Archiver process freed from errors. No longer stopped
Mon Aug 1 10:26:21 2011
Error stack returned to user:
ORA-03113: end-of-file on communication channel
Mon Aug 1 10:26:21 2011
Thread 1 advanced to log sequence 7517 (LGWR switch)
Current log# 2 seq# 7517 mem# 0: /dev/rredo_2_512m_01
Current log# 2 seq# 7517 mem# 1: /dev/rredo_2_512m_02
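The three periods in which archiving was blocked can be totalled from the timestamps in the excerpts above. In this sketch each window starts at the first archiver I/O error and ends at the recovery entry (the 08:38:42 log switch on July 7, and the "Successful archiving" messages on July 22 and August 1); the variable and function names are illustrative.

```python
from datetime import datetime, timedelta

# (first archiver error, recovery) pairs read from alert_ybzdb1.log.
OUTAGES = [
    (datetime(2011, 7, 6, 22, 17, 46), datetime(2011, 7, 7, 8, 38, 42)),
    (datetime(2011, 7, 21, 20, 15, 51), datetime(2011, 7, 22, 8, 59, 24)),
    (datetime(2011, 7, 31, 15, 16, 26), datetime(2011, 8, 1, 10, 26, 20)),
]


def total_downtime(windows):
    """Sum the length of every (start, end) window."""
    return sum((end - start for start, end in windows), timedelta())


for start, end in OUTAGES:
    print(start, "->", end, "blocked for", end - start)
print("total:", total_downtime(OUTAGES))
```

The windows come to roughly 10, 13 and 19 hours, a little over 42 hours in total during which no DML could run.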
Morning of August 15:
# df -k
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 2097152 2005712 5% 6193 2% /
/dev/hd2