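Users reported that the company's application was raising an ORA-01115 error during operation,
as follows: ORA-01115: IO error reading block from file 10 (block # 24413)
At first I assumed the file had a bad block, but when I checked the alert log I found that the instance had restarted unexpectedly.
Part of the log follows: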
 
  Tue Jul 13 14:38:23 2010
Thread 2 advanced to log sequence 952 (LGWR switch)
  Current log# 4 seq# 952 mem# 0: +DG1/klir/onlinelog/redo04a.log
  Current log# 4 seq# 952 mem# 1: +RECOVER/klir/onlinelog/redo04b.log
Tue Jul 13 14:38:28 2010
Archived Log entry 6533 added for thread 2 sequence 951 ID 0x38918de5 dest 1:
Archived Log entry 6534 added for thread 2 sequence 951 ID 0x38918de5 dest 2:
Tue Jul 13 14:43:58 2010
Deleted Oracle managed file +RECOVER/klir/flashback/log_685.953.722545985
Tue Jul 13 15:03:27 2010
Detected change in CPU count to 8
Tue Jul 13 15:03:27 2010
Restarting dead background process MMON
Tue Jul 13 15:03:27 2010
MMON started with pid=250, OS id=4252
Tue Jul 13 15:04:54 2010
Process m001 died, see its trace file
Tue Jul 13 15:04:55 2010
Process PZ99 died, see its trace file
Exception [type: SIGFPE, Integer divide by zero] [ADDR:0x2AB252CA5A87] [PC:0x2AB252CA5A87, skgxp_setup_sliding_window()+499] [flags: 0x0, count: 1]
Errors in file /u02/app/oracle/diag/rdbms/klir/klir2/trace/klir2_m000_4262.trc  (incident=29442):
ORA-07445: exception encountered: core dump [skgxp_setup_sliding_window()+499] [SIGFPE] [ADDR:0x2AB252CA5A87] [PC:0x2AB252CA5A87] [Integer divide by zero] []
Incident details in: /u02/app/oracle/diag/rdbms/klir/klir2/incident/incdir_29442/klir2_m000_4262_i29442.trc
Tue Jul 13 15:04:56 2010
Trace dumping is performing id=[cdmp_20100713150456]
Process m001 died, see its trace file
Tue Jul 13 15:05:27 2010
Detected change in CPU count to 8
Tue Jul 13 15:08:08 2010
Process m000 died, see its trace file
Process m001 died, see its trace file
Process m002 died, see its trace file
Tue Jul 13 15:08:11 2010
Process PZ99 died, see its trace file
Process m001 died, see its trace file
Process PZ99 died, see its trace file
Process m000 died, see its trace file
Process m002 died, see its trace file
Process m001 died, see its trace file
Process m001 died, see its trace file
Tue Jul 13 15:08:29 2010
.....
Tue Jul 13 15:22:41 2010
RCBG started with pid=40, OS id=8246
replication_dependency_tracking turned off (no async multimaster replication found)
WARNING: AQ_TM_PROCESSES is set to 0. System operation                     might be adversely affected.
Completed: ALTER DATABASE OPEN
Tue Jul 13 15:29:02 2010
Deleted Oracle managed file +RECOVER/klir/flashback/log_715.983.722546351
Tue Jul 13 15:31:37 2010
Deleted Oracle managed file +RECOVER/klir/flashback/log_716.984.722546361
Tue Jul 13 15:31:40 2010
Starting background process SMCO
Tue Jul 13 15:31:40 2010
SMCO started with pid=98, OS id=9690
Tue Jul 13 15:33:23 2010
Deleted Oracle managed file +RECOVER/klir/flashback/log_717.985.722546371
Tue Jul 13 15:40:37 2010
Deleted Oracle managed file +RECOVER/klir/flashback/log_718.986.722546381
Tue Jul 13 15:45:18 2010
Deleted Oracle managed file +RECOVER/klir/flashback/log_719.987.722546399
Tue Jul 13 15:45:22 2010
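    Going through the alert log, the last entry before the trouble is the deletion of a flashback log file at 14:43:58. The entries that follow, from 15:03:27 onward, show background processes dying, after which the instance was restarted. That reminded me that the system's flashback space was already more than 90% used, so my guess is that the shortage of flashback space caused the instance to restart. I had noticed the space problem many days earlier and had been meaning to schedule downtime to turn off database flashback, but I did not expect it to cause trouble today.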
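    To line up the timestamps instead of eyeballing them, a small script helps. The following is only a rough sketch in Python: the function names `parse_alert_log` and `gap_before_first_failure` are my own, the list of "failure" messages is an assumption based on the lines seen in this excerpt, and the sample lines are copied from the log above.

```python
from datetime import datetime

# Timestamp lines in this alert log look like "Tue Jul 13 14:43:58 2010".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"

# Assumption: these substrings mark the start of trouble in this excerpt.
FAILURE_MARKERS = (
    "Restarting dead background process",
    "died, see its trace file",
    "ORA-07445",
)


def parse_alert_log(text):
    """Return (timestamp, message) pairs; each message gets the latest timestamp seen."""
    events = []
    current_ts = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        try:
            current_ts = datetime.strptime(line, TS_FORMAT)
            continue  # a pure timestamp line carries no message
        except ValueError:
            pass
        if current_ts is not None:
            events.append((current_ts, line))
    return events


def gap_before_first_failure(events):
    """Seconds from the last flashback log deletion to the first failure message."""
    last_delete = None
    for ts, msg in events:
        if any(marker in msg for marker in FAILURE_MARKERS):
            if last_delete is None:
                return None
            return (ts - last_delete).total_seconds()
        if msg.startswith("Deleted Oracle managed file") and "/flashback/" in msg:
            last_delete = ts
    return None


sample = """\
Tue Jul 13 14:43:58 2010
Deleted Oracle managed file +RECOVER/klir/flashback/log_685.953.722545985
Tue Jul 13 15:03:27 2010
Detected change in CPU count to 8
Tue Jul 13 15:03:27 2010
Restarting dead background process MMON
Tue Jul 13 15:04:54 2010
Process m001 died, see its trace file
"""

events = parse_alert_log(sample)
gap = gap_before_first_failure(events)
print(f"{gap:.0f} seconds between last flashback deletion and first failure")
```

    On the sample lines this reports 1169 seconds. The parsing relies on the English day and month names in the log, so it assumes the script runs under the default C/English locale.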
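    With the cause more or less identified, I promptly took the system down and turned off the database's flashback feature. After the database was restarted, the system ran normally.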