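The database behind a customer's business system hit the following fault: the business system slowed down, recovered on its own a few minutes later, and the system logs showed the error messages below.
The alert log on node 2 reports: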
Trace dumping is performing id=[cdmp_20090119104811]
Mon Jan 19 10:49:22 2009
Waiting for clusterware split-brain resolution
Mon Jan 19 10:59:35 2009
Errors in file /opt/app/oracle/admin/postdb/bdump/postdb2_lmon_5598.trc:
ORA-29740: evicted by member 0, group incarnation 16
Mon Jan 19 10:59:35 2009
LMON: terminating instance due to error 29740
Mon Jan 19 10:59:35 2009
Errors in file /opt/app/oracle/admin/postdb/bdump/postdb2_lms1_5606.trc:
ORA-29740: evicted by member , group incarnation
Mon Jan 19 10:59:35 2009
Errors in file /opt/app/oracle/admin/postdb/bdump/postdb2_lms3_5614.trc:
ORA-29740: evicted by member , group incarnation
Mon Jan 19 10:59:35 2009
Errors in file /opt/app/oracle/admin/postdb/bdump/postdb2_pmon_5592.trc:
ORA-29740: evicted by member , group incarnation
Mon Jan 19 10:59:35 2009
Errors in file /opt/app/oracle/admin/postdb/bdump/postdb2_lms2_5610.trc:
ORA-29740: evicted by member , group incarnation
Mon Jan 19 10:59:35 2009
Errors in file /opt/app/oracle/admin/postdb/bdump/postdb2_lmd0_5600.trc:
ORA-29740: evicted by member , group incarnation
Mon Jan 19 10:59:35 2009
Errors in file /opt/app/oracle/admin/postdb/bdump/postdb2_lms0_5602.trc:
ORA-29740: evicted by member , group incarnation
Mon Jan 19 10:59:35 2009
System state dump is made for local instance
System State dumped to trace file /opt/app/oracle/admin/postdb/bdump/postdb2_diag_5594.trc
Mon Jan 19 10:59:37 2009
Trace dumping is performing id=[cdmp_20090119105935]
Mon Jan 19 10:59:40 2009
Instance terminated by LMON, pid = 5598
Mon Jan 19 11:00:08 2009
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 eth0 1.1.1.0 configured from OCR for use as a cluster interconnect
Interface type 1 eth1 10.194.7.128 configured from OCR for use as a public interface
Picked latch-free SCN scheme 1
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.1.0.
System parameters with non-default values:
processes = 500
sessions = 555
__shared_pool_size = 1325400064
__large_pool_size = 16777216
__java_pool_size = 16777216
__streams_pool_size = 0
spfile = +DATA1/postdb/spfilepostdb.ora
nls_language = SIMPLIFIED CHINESE
nls_territory = CHINA
sga_target = 5888802816
control_files = +DATA1/postdb/controlfile/current.274.671538401, +FLASH_RECAVORY/postdb/controlfile/current.262.671538401
db_block_size = 8192
__db_cache_size = 4513071104
compatible = 10.2.0.1.0
log_archive_dest_1 = LOCATION=+DATA1/postdb/
log_archive_format = %t_%s_%r.dbf
db_file_multiblock_read_count= 16
cluster_database = TRUE
cluster_database_instances= 2
db_create_file_dest = +DATA1
db_recovery_file_dest = +FLASH_RECAVORY
db_recovery_file_dest_size= 26214400000
thread = 2
instance_number = 2
undo_management = AUTO
undo_tablespace = UNDOTBS2
remote_login_passwordfile= EXCLUSIVE
db_domain =
dispatchers = (PROTOCOL=TCP) (SERVICE=postdbXDB)
remote_listener = LISTENERS_POSTDB
job_queue_processes = 20
background_dump_dest = /opt/app/oracle/admin/postdb/bdump
user_dump_dest = /opt/app/oracle/admin/postdb/udump
core_dump_dest = /opt/app/oracle/admin/postdb/cdump
audit_file_dest = /opt/app/oracle/admin/postdb/adump
db_name = postdb
open_cursors = 300
pga_aggregate_target = 2310012928
Cluster communication is configured to use the following interface(s) for this instance
1.1.1.2
Mon Jan 19 11:00:09 2009
cluster interconnect IPC version:Oracle UDP/IP
IPC Vendor 1 proto 2
The alert log on node 1 reports:
Recovery of Online Redo Log: Thread 2 Group 3 Seq 197 Reading mem 0
Mem# 0 errs 0: +DATA1/postdb/onlinelog/group_3.266.671539695
Mem# 1 errs 0: +FLASH_RECAVORY/postdb/onlinelog/group_3.265.671539695
Mon Jan 19 10:59:43 2009
Completed redo application
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
Completed instance recovery at
Thread 2: logseq 197, block 45836, scn 41611174
51 data blocks read, 54 data blocks written, 520 redo blocks read
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 19 10:59:43 2009
WARNING: inbound connection timed out (ORA-3136)
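When ORA-29740 occurs, the database instance on one of the cluster's nodes is evicted from the cluster, which directly brings that instance down. Errors like those above usually appear when resources are tight or the network is congested: under those conditions IPC timeouts can occur, and RAC then evicts the problem node and triggers a round of reconfiguration.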
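To see how long the instance stalled before it was evicted, the timestamps in the node 2 alert log can be compared. The following is a minimal Python sketch, not an Oracle tool: the function name `eviction_timeline`, the `MARKERS` table, and the shortened `SAMPLE` text are illustrative assumptions, and the three message strings are copied from the node 2 excerpt. It assumes each message is preceded by a timestamp line in the format shown in the log.

```python
import re
from datetime import datetime

# Alert-log timestamps look like "Mon Jan 19 10:49:22 2009".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
TS_PATTERN = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$"
)

# Messages of interest, copied from the node 2 excerpt (names are hypothetical).
MARKERS = {
    "split_brain_wait": "Waiting for clusterware split-brain resolution",
    "lmon_terminate": "LMON: terminating instance due to error 29740",
    "restart": "Starting ORACLE instance",
}


def eviction_timeline(alert_log_text):
    """Return {marker_name: datetime} for the first occurrence of each marker."""
    events = {}
    current_ts = None
    for raw in alert_log_text.splitlines():
        line = raw.strip()
        if TS_PATTERN.match(line):
            # Remember the most recent timestamp line.
            current_ts = datetime.strptime(line, TS_FORMAT)
            continue
        for name, text in MARKERS.items():
            if text in line and name not in events and current_ts is not None:
                events[name] = current_ts
    return events


# Shortened copy of the node 2 log: only timestamps and marker lines are kept.
SAMPLE = """\
Mon Jan 19 10:49:22 2009
Waiting for clusterware split-brain resolution
Mon Jan 19 10:59:35 2009
LMON: terminating instance due to error 29740
Mon Jan 19 11:00:08 2009
Starting ORACLE instance (normal)
"""

timeline = eviction_timeline(SAMPLE)
stall = timeline["lmon_terminate"] - timeline["split_brain_wait"]
outage = timeline["restart"] - timeline["lmon_terminate"]
print(f"split-brain wait before eviction: {stall}")  # 0:10:13
print(f"eviction to restart: {outage}")              # 0:00:33
```

On this excerpt the sketch reports roughly ten minutes between the split-brain wait message and the LMON termination, and about half a minute until the instance starts again, which is consistent with the slowdown lasting a few minutes. The timestamps record when each message was written, so the intervals are approximate.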
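This problem is a bug in Oracle 10g. It can be found on Oracle's official site as Bug 5190596, and a patch is available as of 10.2.0.3.
After RAC was upgraded to 10.2.0.4, the system has run stably and the error has not recurred.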
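The startup banner in the node 2 log reports version 10.2.0.1.0, which is below the 10.2.0.3 level this post names for the patch. A small Python sketch of that comparison follows. The helper names `parse_version` and `has_patch_level` are hypothetical, and the required level is taken only from this post, so actual patch applicability should still be confirmed against Oracle's own notes for Bug 5190596.

```python
def parse_version(text):
    """Turn '10.2.0.1.0' (or the banner form '10.2.0.1.0.') into a tuple of ints."""
    return tuple(int(part) for part in text.strip().rstrip(".").split("."))


def has_patch_level(running, required):
    """True if `running` is at or above `required`; shorter versions are zero-padded."""
    a, b = parse_version(running), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Patch level named in this post (an assumption, not an official cutoff).
FIX_LEVEL = "10.2.0.3"

print(has_patch_level("10.2.0.1.0.", FIX_LEVEL))  # False: version in the startup banner
print(has_patch_level("10.2.0.4.0", FIX_LEVEL))   # True: version after the upgrade
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "10.2.0.10" sorting before "10.2.0.3".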
From the ITPUB blog: http://blog.itpub.net/25369863/viewspace-688248/. If you repost this article, please credit the source; otherwise legal liability will be pursued.