ORA-600 [1616] and ORA-569 Errors

ckawt40802

The database that hit the errors is Oracle 10.2.0.4 RAC for HP-UX.

One node of the RAC reported ORA-600 [1616] and ORA-569 errors:

Sat Apr 7 03:03:09 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed to acquire global enqueue.
Sat Apr 7 03:04:14 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed to acquire global enqueue.
Sat Apr 7 03:04:15 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed to acquire global enqueue.
.
.
.
Sat Apr 7 03:06:35 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed to acquire global enqueue.
Sat Apr 7 03:07:42 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed to acquire global enqueue.
Sat Apr 7 03:07:42 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_mmon_27483.trc:
ORA-00600: internal error code, arguments: [1616], [30], [1], [], [], [], [], []
Sat Apr 7 03:07:44 2012
Trace dumping is performing id=[cdmp_20120407030744]
Sat Apr 7 03:07:44 2012
ORA-600 encountered when generating server alert SMG-3503
Sat Apr 7 03:07:57 2012
Reconfiguration started (old inc 24, new inc 26)
List of nodes:
1
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sat Apr 7 03:08:03 2012
LMS 0: 1 GCS shadows cancelled, 0 closed
Sat Apr 7 03:08:03 2012
LMS 1: 1 GCS shadows cancelled, 0 closed
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Sat Apr 7 03:08:13 2012
Reconfiguration started (old inc 26, new inc 28)
List of nodes:
0 1
Nested reconfiguration detected.
Global Resource Directory frozen
Communication channels reestablished
* domain 0 valid = 0 according to instance 0
Sat Apr 7 03:08:13 2012
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Sat Apr 7 03:08:17 2012
LMS 1: 0 GCS shadows cancelled, 0 closed
Sat Apr 7 03:08:17 2012
LMS 0: 0 GCS shadows cancelled, 0 closed
Set master node info
Sat Apr 7 03:08:39 2012
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Sat Apr 7 03:08:51 2012
LMS 0: 7542 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:51 2012
LMS 1: 7472 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:52 2012
LMS 0: 7481 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:52 2012
LMS 1: 7522 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:52 2012
LMS 0: 7450 GCS shadows traversed, 4001 replayed
.
.
.
Sat Apr 7 03:08:52 2012
LMS 1: 7550 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:52 2012
LMS 0: 6396 GCS shadows traversed, 3431 replayed
Sat Apr 7 03:08:52 2012
LMS 1: 5339 GCS shadows traversed, 2878 replayed
Sat Apr 7 03:08:52 2012
Submitted all GCS remote-cache requests
Sat Apr 7 03:08:52 2012
Instance recovery: looking for dead threads
Fix write in gcs resources
Sat Apr 7 03:08:52 2012
Beginning instance recovery of 1 threads
Reconfiguration complete
Sat Apr 7 03:08:53 2012
parallel recovery started with 7 processes
Sat Apr 7 03:08:53 2012
Started redo scan
Sat Apr 7 03:08:53 2012
Completed redo scan
2550 redo blocks read, 152 data blocks need recovery
Sat Apr 7 03:08:54 2012
Started redo application at
Thread 1: logseq 75561, block 256542
Sat Apr 7 03:08:54 2012
Recovery of Online Redo Log: Thread 1 Group 1 Seq 75561 Reading mem 0
Mem# 0: +DATA/orcl/onlinelog/group_1.261.611597975
Sat Apr 7 03:08:54 2012
Completed redo application
Sat Apr 7 03:08:54 2012
Completed instance recovery at
Thread 1: logseq 75561, block 259092, scn 52674829295
149 data blocks read, 137 data blocks written, 2550 redo blocks read
Switch log for thread 1 to sequence 75562

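ORA-600 [1616] is not documented anywhere on MOS, but judging from the error messages, this ORA-600 was most likely triggered by the ORA-569 errors, and ORA-569 is clearly tied to the state of the other instance. The log also shows that shortly after the problem began, the other node was shut down and restarted.

So let's check the other node's log for the same period: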
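The ordering of errors in an alert log like the one above can be extracted mechanically. The following is a minimal Python sketch, assuming the 10g-style layout seen here (a timestamp line followed by one or more message lines). The function names `parse_alert_log` and `ora_timeline` are hypothetical helpers, not part of any Oracle tool, and `SAMPLE` is a short fragment copied from the node 2 log.

```python
import re
from collections import Counter
from datetime import datetime

# Timestamp header in the 10g alert-log style, e.g. "Sat Apr 7 03:03:09 2012".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")
ORA_RE = re.compile(r"ORA-\d{5}")


def parse_alert_log(text):
    """Group log lines into (timestamp, [message lines]) entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TS_RE.match(line):
            ts = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            entries.append((ts, []))
        elif entries:
            # Message line: attach it to the most recent timestamp.
            entries[-1][1].append(line)
    return entries


def ora_timeline(entries):
    """Return (timestamp, ORA code) pairs in log order."""
    return [
        (ts, code)
        for ts, lines in entries
        for line in lines
        for code in ORA_RE.findall(line)
    ]


# Fragment copied from the node 2 alert log above.
SAMPLE = """\
Sat Apr 7 03:06:35 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed to acquire global enqueue.
Sat Apr 7 03:07:42 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed to acquire global enqueue.
Sat Apr 7 03:07:42 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_mmon_27483.trc:
ORA-00600: internal error code, arguments: [1616], [30], [1], [], [], [], [], []
"""

timeline = ora_timeline(parse_alert_log(SAMPLE))
counts = Counter(code for _, code in timeline)
for ts, code in timeline:
    print(ts.strftime("%H:%M:%S"), code)
```

On this fragment the ORA-00600 is the last entry and shares its timestamp with the final ORA-00569, which is consistent with the reading that the internal error followed from the enqueue failures.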

Thu Apr 5 23:06:35 2012
Thread 1 advanced to log sequence 75470
Current log# 6 seq# 75470 mem# 0: +DATA/orcl/onlinelog/group_6.258.611597995
Thu Apr 5 23:59:38 2012
Thread 1 advanced to log sequence 75471
Current log# 2 seq# 75471 mem# 0: +DATA/orcl/onlinelog/group_2.260.611597981
Fri Apr 6 00:20:12 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:15 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:15 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:16 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:16 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:17 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:17 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:18 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:18 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:19 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:19 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:20 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:20 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:20 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:21 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:21 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:22 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:22 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:23 2012
Some indexes or index [sub]partitions of table ORCLLJ.OLD_000 have been marked unusable
.
.
.
Fri Apr 6 10:55:20 2012
Some indexes or index [sub]partitions of table ORCLTJ.OLD_000 have been marked unusable
Fri Apr 6 10:55:22 2012
Some indexes or index [sub]partitions of table ORCLTJ.OLD_000 have been marked unusable
Fri Apr 6 10:58:54 2012
Thread 1 advanced to log sequence 75530
Current log# 6 seq# 75530 mem# 0: +DATA/orcl/onlinelog/group_6.258.611597995
Fri Apr 6 11:20:34 2012
Thread 1 advanced to log sequence 75531
Current log# 2 seq# 75531 mem# 0: +DATA/orcl/onlinelog/group_2.260.611597981
.
.
.
Fri Apr 6 16:53:40 2012
Thread 1 advanced to log sequence 75543
Current log# 2 seq# 75543 mem# 0: +DATA/orcl/onlinelog/group_2.260.611597981
Fri Apr 6 17:07:15 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select increment$,minvalue,m...","Typecheck","kgghteInit")
Fri Apr 6 17:07:22 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")
Fri Apr 6 17:07:23 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")
Fri Apr 6 17:07:40 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")
Fri Apr 6 17:07:41 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")
Fri Apr 6 17:15:53 2012
Thread 1 advanced to log sequence 75544
Current log# 5 seq# 75544 mem# 0: +DATA/orcl/onlinelog/group_5.259.611597989
Fri Apr 6 17:34:28 2012
Thread 1 advanced to log sequence 75545
Current log# 1 seq# 75545 mem# 0: +DATA/orcl/onlinelog/group_1.261.611597975
.
.
.
Sat Apr 7 03:00:22 2012
Thread 1 advanced to log sequence 75561
Current log# 1 seq# 75561 mem# 0: +DATA/orcl/onlinelog/group_1.261.611597975
Sat Apr 7 03:03:18 2012
WARNING: ran out of shared pool for GES resource object.
Sat Apr 7 03:03:49 2012
WARNING: ran out of shared pool for GES resource object.
Sat Apr 7 03:04:20 2012
WARNING: ran out of shared pool for GES resource object.
Sat Apr 7 03:04:51 2012
WARNING: ran out of shared pool for GES resource object.
Sat Apr 7 03:05:22 2012
WARNING: ran out of shared pool for GES resource object.
Sat Apr 7 03:05:54 2012
WARNING: ran out of shared pool for GES resource object.
Sat Apr 7 03:06:24 2012
WARNING: ran out of shared pool for GES resource object.
Sat Apr 7 03:07:42 2012
WARNING: ran out of shared pool for GES resource object.
Sat Apr 7 03:07:44 2012
Trace dumping is performing id=[cdmp_20120407030744]
Sat Apr 7 03:07:45 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_lmd0_15494.trc:
ORA-00600: internal error code, arguments: [kjrpar:noalloc], [1], [385], [0x60000000000C09B8], [0xC0000002137539D8], [0xC000000213752748], [], []
Sat Apr 7 03:07:46 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_lmd0_15494.trc:
ORA-00600: internal error code, arguments: [kjrpar:noalloc], [1], [385], [0x60000000000C09B8], [0xC0000002137539D8], [0xC000000213752748], [], []
Sat Apr 7 03:07:54 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl1_lmd0_15494.trc:
ORA-00600: internal error code, arguments: [kjrpar:noalloc], [1], [385], [0x60000000000C09B8], [0xC0000002137539D8], [0xC000000213752748], [], []
Sat Apr 7 03:07:54 2012
LMD0: terminating instance due to error 482
Sat Apr 7 03:07:54 2012
Shutting down instance (abort)
License high water mark = 56
Sat Apr 7 03:07:58 2012
Instance terminated by LMD0, pid = 15494
Sat Apr 7 03:07:59 2012
Instance terminated by USER, pid = 8118
Sat Apr 7 03:08:05 2012
Starting ORACLE instance (normal)
Sat Apr 7 03:08:05 2012
Specified value of sga_max_size is too small, bumping to 6627000320
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 lan901 192.168.0.0 configured from OCR for use as a cluster interconnect
Interface type 1 lan900 198.5.1.0 configured from OCR for use as a public interface
Picked latch-free SCN scheme 3
Autotune of undo retention is turned off.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
My pid is 8415, priority is -179, running with SCHED_NOAGE policy 8.
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.3.0.
System parameters with non-default values:
processes = 300
sessions = 335
timed_statistics = FALSE
sga_max_size = 6627000320
lock_sga = FALSE
shared_pool_size = 4294967296
large_pool_size = 67108864
java_pool_size = 33554432
streams_pool_size = 16777216
shared_pool_reserved_size= 263402290
spfile = +DATA/orcl/spfileorcl.ora
sga_target = 0
control_files = +DATA/orcl/controlfile/current1.ora, +ORCL_DATA/orcl/controlfile/current2.ora
db_block_size = 8192
db_cache_size = 2147483648
db_writer_processes = 2
compatible = 10.2.0.2.0
log_archive_dest_1 = LOCATION=/u01/app/oradata/orcl_1_arch1/arch2tape
log_archive_dest_2 = LOCATION=/u01/app/oradata/orcl_1_arch2/arch2tape
log_archive_format = ORCL_%t_%s_%r.ARC
log_buffer = 52428800
log_checkpoint_timeout = 1800
db_file_multiblock_read_count= 8
cluster_database = TRUE
cluster_database_instances= 2
db_create_file_dest = +ORCL_DATA
db_create_online_log_dest_1= +DATA
thread = 1
fast_start_mttr_target = 600
instance_number = 1
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 10800
remote_login_passwordfile= EXCLUSIVE
db_domain =
dispatchers = (PROTOCOL=TCP)(DISPATCHERS=1)
max_shared_servers = 32
max_dispatchers = 2
local_listener =
remote_listener = LISTENERS_ORCL
job_queue_processes = 10
background_dump_dest = /u01/app/oracle/admin/ORCL/bdump
user_dump_dest = /u01/app/oracle/admin/ORCL/udump
max_dump_file_size = 52428800
core_dump_dest = /u01/app/oracle/admin/ORCL/cdump
audit_file_dest = /u01/app/oracle/admin/ORCL/adump
hpux_sched_noage = 178
open_links = 30
db_name = ORCL
open_cursors = 300
star_transformation_enabled= TRUE
pga_aggregate_target = 1073741824
Cluster communication is configured to use the following interface(s) for this instance
192.168.0.111
Sat Apr 7 03:08:10 2012
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
PMON started with pid=2, OS id=8596
DIAG started with pid=4, OS id=8606
PSP0 started with pid=6, OS id=8608
LMON started with pid=8, OS id=8614
LMD0 started with pid=10, OS id=8616
LMS0 started with pid=12, OS id=8618
LMS1 started with pid=3, OS id=8620
MMAN started with pid=14, OS id=8623
DBW0 started with pid=16, OS id=8625
DBW1 started with pid=5, OS id=8644
LGWR started with pid=7, OS id=8657
CKPT started with pid=18, OS id=8659
SMON started with pid=20, OS id=8661
RECO started with pid=22, OS id=8663
CJQ0 started with pid=24, OS id=8665
MMON started with pid=26, OS id=8667
Sat Apr 7 03:08:12 2012
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
MMNL started with pid=28, OS id=8669

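The alert log from node 1 is enough to pin down the cause. Starting at about 00:20 on April 6, the day before the failure, a large number of partition maintenance operations were run, leaving many partitioned indexes marked unusable. Those operations loaded a large amount of partition metadata into the shared pool, and later that day (from 17:07) Oracle began reporting ORA-4031 shared pool allocation failures. While node 2 was reporting ORA-569, node 1 was logging WARNING: ran out of shared pool for GES resource object, which indicates that the shared pool shortage on node 1 caused the global enqueue requests on node 2 to fail.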
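The timing relationship between the two nodes can be checked with a little arithmetic. The sketch below is in Python and uses only the timestamps visible in the two excerpts (entries elided with dots are left out). The 15-second tolerance `WINDOW` is an arbitrary assumption for illustration, and the comparison assumes the clocks on the two nodes were reasonably synchronized.

```python
from datetime import datetime, timedelta


def t(hms):
    """Build a timestamp on 2012-04-07 from an HH:MM:SS string."""
    return datetime.strptime("2012-04-07 " + hms, "%Y-%m-%d %H:%M:%S")


# ORA-00569 entries visible in the node 2 excerpt.
node2_ora569 = [t(x) for x in ("03:03:09", "03:04:14", "03:04:15",
                               "03:06:35", "03:07:42")]

# "ran out of shared pool for GES resource object" warnings on node 1.
node1_ges = [t(x) for x in ("03:03:18", "03:03:49", "03:04:20", "03:04:51",
                            "03:05:22", "03:05:54", "03:06:24", "03:07:42")]


def nearest_gap(ts, others):
    """Smallest absolute time difference between ts and any entry in others."""
    return min(abs(ts - o) for o in others)


gaps = [nearest_gap(ts, node1_ges) for ts in node2_ora569]

WINDOW = timedelta(seconds=15)  # assumed tolerance, not an Oracle setting
all_close = all(g <= WINDOW for g in gaps)

for ts, g in zip(node2_ora569, gaps):
    print(ts.strftime("%H:%M:%S"), "nearest GES warning:", int(g.total_seconds()), "s")
```

Every visible ORA-00569 on node 2 has a GES warning on node 1 within the assumed window, which supports (but does not by itself prove) the causal link described above.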

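Node 1 then ran so short of resources that it raised ORA-600 [kjrpar:noalloc], which crashed the instance. For a more detailed description of this error, see: http://yangtingkun.net/?p=620.

Both errors were ultimately caused by shared pool exhaustion on the other RAC node. To avoid them, the underlying ORA-4031 problem has to be resolved.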
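To illustrate how much warning the ORA-4031 errors gave, the intervals between the key events in the node 1 alert log can be computed directly. This is a minimal Python sketch. The three timestamps are copied from the log above, and `elapsed` is a hypothetical helper introduced only for this example.

```python
from datetime import datetime

FMT = "%a %b %d %H:%M:%S %Y"  # alert-log timestamp layout


def elapsed(start, end):
    """Return the interval between two log timestamps as (hours, minutes, seconds)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    hours, rem = divmod(int(delta.total_seconds()), 3600)
    minutes, seconds = divmod(rem, 60)
    return hours, minutes, seconds


first_unusable_index = "Fri Apr 6 00:20:12 2012"  # first "marked unusable" message
first_ora4031 = "Fri Apr 6 17:07:15 2012"         # first ORA-04031 from SMON
instance_abort = "Sat Apr 7 03:07:54 2012"        # LMD0 terminates the instance

maintenance_to_4031 = elapsed(first_unusable_index, first_ora4031)
warning_lead_time = elapsed(first_ora4031, instance_abort)

print("maintenance start to first ORA-04031: %dh %dm %ds" % maintenance_to_4031)
print("first ORA-04031 to instance abort:   %dh %dm %ds" % warning_lead_time)
```

In this incident the first ORA-04031 appeared roughly ten hours before the instance was terminated, so an alert on ORA-04031 in the alert log would have left time to act before the RAC-level errors occurred.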

From "ITPUB Blog", link: http://blog.itpub.net/4227/viewspace-732598/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
