The following errors appeared in the alert log, after which the database crashed.
Thu Aug 16 20:21:25 2012
Detected change in CPU count to 8
Thu Aug 16 20:23:29 2012
Process J000 died, see its trace file
Thu Aug 16 20:23:29 2012
kkjcre1p: unable to spawn jobq slave process
Thu Aug 16 20:23:29 2012
Errors in file /db/oracle10g/admin/benguo/bdump/benguo_cjq0_3249.trc:
Thu Aug 16 20:25:25 2012
Detected change in CPU count to 8
Thu Aug 16 12:25:55 2012
Errors in file /db/oracle10g/admin/benguo/udump/benguo_ora_31075.trc:
ORA-00600: Message 600 not found; No message file for product=RDBMS, facility=ORA; arguments: [keltnfy-ldmInit] [46] [1]
Thu Aug 16 20:28:25 2012
Determining CPU socket count failed!
Detected change in CPU count to 1
Thu Aug 16 20:29:15 2012
Process J000 died, see its trace file
Thu Aug 16 20:29:15 2012
kkjcre1p: unable to spawn jobq slave process
Thu Aug 16 20:29:15 2012
Errors in file /db/oracle10g/admin/benguo/bdump/benguo_cjq0_3249.trc:
Thu Aug 16 20:29:26 2012
Detected change in CPU count to 8
Thu Aug 16 20:29:50 2012
OER 7451 in Load Indicator : Error Code = Linux-x86_64 Error: 11086: Unknown system error
Thu Aug 16 20:30:26 2012
Determining CPU socket count failed!
Detected change in CPU count to 1
Thu Aug 16 20:31:26 2012
Detected change in CPU count to 8
Thu Aug 16 22:22:49 2012
Errors in file /db/oracle10g/admin/benguo/bdump/benguo_lgwr_3241.trc:
ORA-00471: DBWR process terminated with error
Instance terminated by DBW0, pid = 3239
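An ORA-00600 [keltnfy-ldmInit] [46] [1] error is usually caused by a hostname mismatch, which can prevent the database from starting. However, I checked /etc/hosts on the database server against the output of hostname, and the two are consistent. When I went to look at the corresponding trace files, I found that they no longer existed, which puzzled me greatly.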
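The hostname comparison described above can also be scripted. What follows is only a minimal sketch, assuming standard /etc/hosts syntax; `hosts_names` and `check_hostname` are hypothetical helper names, not part of any Oracle tooling.

```python
def hosts_names(hosts_text):
    """Map every hostname/alias in /etc/hosts-style text to its IP address."""
    names = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, aliases = fields[0], fields[1:]
        for name in aliases:
            # keep the first entry seen for a given name
            names.setdefault(name.lower(), ip)
    return names


def check_hostname(hostname, hosts_text):
    """Return the IP listed for hostname, or None if it has no entry."""
    return hosts_names(hosts_text).get(hostname.lower())
```

On the server itself this could be called as `check_hostname(socket.gethostname(), open("/etc/hosts").read())`; a result of `None` would mean the hostname has no entry in the file.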
[oracle@server127 bdump]$ uptime
10:31:30 up 24 days, 23:33, 3 users, load average: 2.12, 2.29, 2.55
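The system had not been rebooted either. Fortunately, the database could still be started normally with startup. I wonder whether the crash is related to CPU messages such as "Detected change in CPU count to 8" and "Determining CPU socket count failed".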
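To see how often the reported CPU count flips, the alert log can be reduced to a list of (timestamp, count) pairs. This is a rough sketch that assumes the layout shown in the excerpt above (a timestamp line followed by message lines) and an English locale for the day and month names; `cpu_count_changes` is a hypothetical helper.

```python
import re
from datetime import datetime

# Timestamp layout as it appears in the alert log, e.g. "Thu Aug 16 20:21:25 2012".
# Assumption: day and month names are in English.
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
CPU_RE = re.compile(r"Detected change in CPU count to (\d+)")


def cpu_count_changes(alert_text):
    """Return (timestamp, cpu_count) pairs in the order they appear."""
    changes = []
    current_ts = None
    for line in alert_text.splitlines():
        line = line.strip()
        try:
            current_ts = datetime.strptime(line, TS_FORMAT)
            continue  # this line was a timestamp, not a message
        except ValueError:
            pass
        match = CPU_RE.search(line)
        if match:
            changes.append((current_ts, int(match.group(1))))
    return changes
```

Run against the first excerpt, this would list the alternation between 8 and 1 together with the time of each change, which makes it easier to line up against the J000 failures.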
Before long the server crashed again:
Fri Aug 17 10:46:02 2012
Process J000 died, see its trace file
Fri Aug 17 10:46:02 2012
kkjcre1p: unable to spawn jobq slave process
Fri Aug 17 10:46:02 2012
Errors in file /db/oracle10g/admin/benguo/bdump/benguo_cjq0_19052.trc:
Fri Aug 17 10:48:47 2012
Errors in file /db/oracle10g/admin/benguo/udump/benguo_ora_19187.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode
Fri Aug 17 10:48:48 2012
Errors in file /db/oracle10g/admin/benguo/bdump/benguo_lgwr_19044.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode
Instance terminated by CKPT, pid = 19046
[oracle@server127 bdump]$ oerr ora 01242
01242, 00000, "data file suffered media failure: database in NOARCHIVELOG mode"
// *Cause: The database is in NOARCHIVELOG mode and a database file was
// detected as inaccessible due to media failure.
// *Action: Restore accessibility to the file mentioned in the error stack
// and restart the instance.
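The ORA- codes worth passing to oerr, and the trace files they point to, can be pulled out of an alert log excerpt mechanically. The sketch below assumes that each "Errors in file ...:" line is followed directly by its ORA- lines, as in the excerpts above; `ora_errors_by_trace` is a hypothetical helper.

```python
import re

FILE_RE = re.compile(r"Errors in file (\S+?):?$")
ORA_RE = re.compile(r"\b(ORA-\d{5})\b")


def ora_errors_by_trace(alert_text):
    """Map each trace file path to the ORA- codes logged right after it."""
    result = {}
    current = None
    for line in alert_text.splitlines():
        line = line.strip()
        match = FILE_RE.match(line)
        if match:
            current = match.group(1)
            result.setdefault(current, [])
            continue
        codes = ORA_RE.findall(line)
        if codes and current is not None:
            result[current].extend(codes)
        elif not codes:
            # any non-ORA line (for example a timestamp) ends the error stack
            current = None
    return result
```

Trace files that map to an empty list, such as the cjq0 ones here, had no ORA- line written to the alert log, so the trace file itself would be the only place to look.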
I have previously run into ORA-01242 errors caused by bad disk sectors:
http://blog.itpub.net/post/43172/527958
Metalink gives the following explanation:
The File suffered media failure as before that there was some I/O error in writing to the datafile as seen in the alert.log. The root-cause is that the datafile was locked by an OS-tool making a filesystem backup, like Netbackup or ArcServ. The RDBMS could not open the datafile and failed accordingly .
The instance will crash in NOARCHIVELOG-mode, while in ARCHIVELOG-mode, the instance will remain running, but the datafile will be put OFFLINE and will require recovery.
Solution
If the Media recovery is required then
-- restore the old backup of the datafile
-- recover the datafile/tablespace
If there was no logswitch after the failure then the file can be recovered from the current redo log and no need to restore the old backup , so just recover database/tablespace will do
Also make sure that the backup window does not exceed and does not clash with the db open time
Online backup should be recommended , to avoid these problems
This database does not have NetBackup, though, so the cause is probably still the disk.
The following errors appeared in the Linux system log:
end_request: I/O error, dev sr0, sector 6979968
Buffer I/O error on device sr0, logical block 872496
sr 1:0:0:0: SCSI error: return code = 0x08000002
sr0: Current: sense key: Medium Error
Add. Sense: No seek complete
end_request: I/O error, dev sr0, sector 0
Buffer I/O error on device sr0, logical block 0
Buffer I/O error on device sr0, logical block 1
Buffer I/O error on device sr0, logical block 2
Buffer I/O error on device sr0, logical block 3
Buffer I/O error on device sr0, logical block 4
sr 1:0:0:0: SCSI error: return code = 0x08000002
sr0: Current: sense key: Medium Error
Add. Sense: No seek complete
end_request: I/O error, dev sr0, sector 0
printk: 3 messages suppressed.
Buffer I/O error on device sr0, logical block 0
sr 1:0:0:0: SCSI error: return code = 0x08000002
sr0: Current: sense key: Medium Error
Add. Sense: No seek complete
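So it appears the disk does have a problem, which caused the database to shut down unexpectedly, although on the Linux side these messages may also be caused by a bug: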
https://bbs.archlinux.org/viewtopic.php?pid=740977
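To check which devices the kernel is actually complaining about, the I/O error lines can be counted per device. This is a small sketch that matches only the two message forms seen in the excerpt above; `io_errors_by_device` is a hypothetical helper. Note that on Linux sr0 normally names the first SCSI optical drive, so it is worth confirming separately whether the disks holding the datafiles report errors as well.

```python
import re
from collections import Counter

# Matches the two kernel message forms seen in the excerpt:
#   end_request: I/O error, dev sr0, sector 6979968
#   Buffer I/O error on device sr0, logical block 872496
IO_RE = re.compile(
    r"(?:end_request: I/O error, dev|Buffer I/O error on device) (\w+),"
)


def io_errors_by_device(syslog_text):
    """Count kernel I/O error lines per device name."""
    counts = Counter()
    for line in syslog_text.splitlines():
        match = IO_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

If the resulting counter contains only sr0, the log excerpt alone does not show errors on any other block device.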
Source: ITPUB Blog, http://blog.itpub.net/25362835/viewspace-1059203/. Please credit the source when reposting.