On 2012/3/22, one instance of the RAC cluster restarted on its own. The alert log recorded the following errors:
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lmon_7858.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:57 2012
LMON: terminating instance due to error 481
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lms1_7865.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_pmon_7852.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lms2_7875.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lms3_7882.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lms0_7862.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lmd0_7860.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:57 2012
System state dump is made for local instance
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_smon_7899.trc:
ORA-00481: LMON process terminated with error
System State dumped to trace file /data/oracle/app/oracle/admin/wap/bdump/wap1_diag_7854.trc
Thu Mar 22 09:48:58 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lgwr_7892.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:58 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_ckpt_7896.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:58 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_dbw2_7890.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:58 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_dbw0_7886.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:49:03 2012
Instance terminated by LMON, pid = 7858
Thu Mar 22 09:49:08 2012
Starting ORACLE instance (normal)
Thu Mar 22 09:49:19 2012
LICENSE_MAX_SESSION = 0
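With this many near-identical entries, it can help to condense the alert log into one record per failing background process. The following is a minimal Python sketch, not an Oracle tool. It assumes the 10g-style layout seen above (a timestamp line, then "Errors in file ...", then an ORA- line) and trace file names of the form instance_process_pid.trc. The function name parse_alert_log and the SAMPLE text are made up for illustration.

```python
import re

# Timestamp lines look like "Thu Mar 22 09:48:57 2012".
STAMP_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$")
TRACE_RE = re.compile(r"^Errors in file (\S+?):?$")
ORA_RE = re.compile(r"^(ORA-\d{5}): (.+)$")
# Assumed naming convention: <instance>_<process>_<ospid>.trc
NAME_RE = re.compile(r"([^/_]+)_([a-z0-9]+)_(\d+)\.trc$")


def parse_alert_log(text):
    """Return one dict per 'Errors in file' entry that is followed by an ORA- error."""
    events = []
    stamp = None
    trace = None
    for raw in text.splitlines():
        line = raw.strip()
        if STAMP_RE.match(line):
            stamp, trace = line, None
        elif (m := TRACE_RE.match(line)):
            trace = m.group(1)
        elif (m := ORA_RE.match(line)) and trace:
            name = NAME_RE.search(trace)
            events.append({
                "time": stamp,
                "trace": trace,
                "process": name.group(2) if name else None,
                "pid": int(name.group(3)) if name else None,
                "code": m.group(1),
                "message": m.group(2),
            })
            trace = None  # one ORA- line per trace entry in this sketch
    return events


# Illustrative sample copied from the first lines of the log above.
SAMPLE = """\
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lmon_7858.trc:
ORA-00481: LMON process terminated with error
Thu Mar 22 09:48:57 2012
LMON: terminating instance due to error 481
Thu Mar 22 09:48:57 2012
Errors in file /data/oracle/app/oracle/admin/wap/bdump/wap1_lms1_7865.trc:
ORA-00481: LMON process terminated with error
"""

if __name__ == "__main__":
    for event in parse_alert_log(SAMPLE):
        print(event["time"], event["process"], event["pid"], event["code"])
```

Run against the full log, the first record is the LMON entry. That is where to start reading, since every other process is only reporting that LMON died.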
The LMON process is described as follows:
LMON: Global Enqueue Service Monitor
The official Oracle documentation describes it this way:
Global Enqueue Service Monitor (LMON)
The background LMON process monitors the entire cluster to manage global resources. LMON manages instance deaths and the associated recovery for any failed instance. In particular, LMON handles the part of recovery associated with global resources. LMON-provided services are also known as Cluster Group Services.
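In short, LMON monitors the global enqueues and global resources in the cluster, manages instance membership, handles instance failures, and carries out the corresponding recovery of the cluster's global enqueues.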
The related trace (.trc) files contain the following errors:
*** 2012-03-22 09:33:26.808
Begin DRM(359)
sent syncr inc 40 lvl 1001 to 0 (40,0/31/0)
sent synca inc 40 lvl 1001 (40,0/31/0)
sent syncr inc 40 lvl 1002 to 0 (40,0/34/0)
sent synca inc 40 lvl 1002 (40,0/34/0)
sent syncr inc 40 lvl 1003 to 0 (40,0/36/0)
sent synca inc 40 lvl 1003 (40,0/36/0)
sent syncr inc 40 lvl 1004 to 0 (40,0/38/0)
sent synca inc 40 lvl 1004 (40,0/38/0)
sent syncr inc 40 lvl 1005 to 0 (40,0/31/0)
sent synca inc 40 lvl 1005 (40,0/31/0)
sent syncr inc 40 lvl 1006 to 0 (40,0/34/0)
sent synca inc 40 lvl 1006 (40,0/34/0)
sent syncr inc 40 lvl 1007 to 0 (40,0/36/0)
sent synca inc 40 lvl 1007 (40,0/36/0)
sent syncr inc 40 lvl 1008 to 0 (40,0/38/0)
sent synca inc 40 lvl 1008 (40,0/38/0)
sent syncr inc 40 lvl 1009 to 0 (40,0/31/0)
sent synca inc 40 lvl 1009 (40,0/31/0)
sent syncr inc 40 lvl 1010 to 0 (40,0/34/0)
sent synca inc 40 lvl 1010 (40,0/34/0)
sent syncr inc 40 lvl 1011 to 0 (40,0/36/0)
sent synca inc 40 lvl 1011 (40,0/36/0)
sent syncr inc 40 lvl 1012 to 0 (40,0/38/0)
sent synca inc 40 lvl 1012 (40,0/38/0)
*** 2012-03-22 09:48:34.088
kjfcdrmrfg: SYNC TIMEOUT (10229449, 10228548, 900), step 31
Submitting asynchronized dump request [28]
KJC Communication Dump:
state 0x5 flags 0x0 mode 0x0 inst 0 inc 40
nrcv 5 nsp 5 nrcvbuf 1000
reg_msg: sz 456 cur 487 (s:0 i:487) max 1305 ini 8750
big_msg: sz 8240 cur 410 (s:0 i:410) max 1049 ini 1934
rsv_msg: sz 8240 cur 0 (s:0 i:0) max 0 tot 1000
rcvr: id 3 orapid 16 ospid 7875
rcvr: id 4 orapid 18 ospid 7882
rcvr: id 1 orapid 12 ospid 7862
rcvr: id 2 orapid 14 ospid 7865
rcvr: id 0 orapid 10 ospid 7860
send proxy: id 3 ndst 1 (1:3 )
send proxy: id 4 ndst 1 (1:4 )
send proxy: id 1 ndst 1 (1:1 )
send proxy: id 2 ndst 1 (1:2 )
send proxy: id 0 ndst 1 (1:0 )
GES resource limits:
ges resources: cur 0 max 0 ini 81170
ges enqueues: cur 0 max 0 ini 126705
ges cresources: cur 7929 max 8690
gcs resources: cur 5233667 max 6037674 ini 6381355
gcs shadows: cur 5505422 max 6285054 ini 6381355
KJCTS state: seq-check:no timeout:yes waitticks:0x3 highload no
GES destination context:
GES remote instance per receiver context:
GES destination context:
Dest 1 rcvr 0 inc 40 state 0x10041 tstate 0x0
batch-type quick bmsg 0x0000000000000000 tmout 0x8d0e650 msg_in_batch 0
tkt total 1000 avl 750 sp_rsv 249 max_sp_rsv 250
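The key facts in the trace excerpt above are the time at which DRM began, the time of the SYNC TIMEOUT line, and the step number. The Python sketch below pulls them out. Reading the three numbers in parentheses as (current tick, start tick, limit) is an assumption, since the trace itself does not label them. The function name find_sync_timeouts and the SAMPLE text are illustrative only.

```python
import re
from datetime import datetime

STAMP_RE = re.compile(r"^\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")
DRM_RE = re.compile(r"^Begin DRM\((\d+)\)")
TIMEOUT_RE = re.compile(r"SYNC TIMEOUT \((\d+), (\d+), (\d+)\), step (\d+)")


def find_sync_timeouts(trace_text):
    """Return one dict per SYNC TIMEOUT line, with timing relative to 'Begin DRM'."""
    results = []
    last_stamp = None   # most recent '*** <timestamp>' line
    drm_id = None
    drm_start = None    # timestamp in effect when 'Begin DRM(n)' was logged
    for raw in trace_text.splitlines():
        line = raw.strip()
        m = STAMP_RE.match(line)
        if m:
            last_stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            continue
        m = DRM_RE.match(line)
        if m:
            drm_id, drm_start = int(m.group(1)), last_stamp
            continue
        m = TIMEOUT_RE.search(line)
        if m:
            first, second, limit, step = (int(g) for g in m.groups())
            elapsed = None
            if drm_start is not None and last_stamp is not None:
                elapsed = (last_stamp - drm_start).total_seconds()
            results.append({
                "drm_id": drm_id,
                "step": step,
                "limit": limit,               # assumed to be the timeout value
                "tick_gap": first - second,   # assumed: current tick minus start tick
                "elapsed_seconds": elapsed,
            })
    return results


# Illustrative sample: the four relevant lines from the trace above.
SAMPLE = """\
*** 2012-03-22 09:33:26.808
Begin DRM(359)
*** 2012-03-22 09:48:34.088
kjfcdrmrfg: SYNC TIMEOUT (10229449, 10228548, 900), step 31
"""

if __name__ == "__main__":
    for hit in find_sync_timeouts(SAMPLE):
        print(hit)
```

For this trace the gap between the first two numbers is 901, and the wall-clock time between "Begin DRM(359)" and the timeout is about 907 seconds. Both figures are consistent with the third number, 900, being a timeout of roughly 15 minutes, although that reading remains an inference.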
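The trace shows that the DRM operation failed at step 31 with a sync timeout. Searching support.oracle.com for related information turned up the following explanation: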
Bug 6500033 LMON crash the instance with ORA-481 due to DRM sync timeout
This note gives a brief overview of bug 6500033.
The content was last updated on: 10-JUL-2009
Affects:
  Product (Component): Oracle Server (Rdbms)
  Range of versions believed to be affected: Versions BELOW 11.2
  Versions confirmed as being affected:
    - 10.2.0.4
    - 10.2.0.3
  Platforms affected: Generic (all / most platforms affected)
Fixed:
  This issue is fixed in:
    - 11.2.0.1 (Base Release)
    - 11.1.0.7 (Server Patch Set)
    - 10.2.0.5 (Server Patch Set)
    - 10.2.0.4.1 (Patch Set Update)
    - 10.2.0.4 Patch 18 on Windows Platforms
    - 10.2.0.4 RAC Recommended Patch Bundle #2
Symptoms:
  - Instance May Crash
Related To:
  - RAC (Real Application Clusters) / OPS
Description
LMON can crash the instance with ORA-481 due to DRM sync timeout.
DIAG dumping Systemstate dump may be aborted due to log file size limit
while in server mode which can cause a DRM sync timeout when
lmon unsuccessfully tries to freeze it.
Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. For questions about this bug please consult Oracle Support.
References
Bug:6500033 (This link will only work for PUBLISHED bugs)
Note:245840.1 Information on the sections in this article
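The "Fixed" list in the note can be turned into a quick version check. The Python sketch below uses only the release numbers quoted above. The function name has_fix_by_version is hypothetical, and the check cannot see one-off patches (the Windows Patch 18 or the RAC Recommended Patch Bundle #2), so a False result only means that the version string alone does not prove the fix is present.

```python
# First release on each branch that contains the fix, per the note quoted above.
FIXED_IN = {
    (10, 2): (10, 2, 0, 4, 1),  # 10.2.0.4.1 PSU (10.2.0.5 is later on the same branch)
    (11, 1): (11, 1, 0, 7, 0),  # 11.1.0.7 patch set
}


def parse_version(version):
    """Turn '10.2.0.4' into a 5-part tuple (10, 2, 0, 4, 0) for comparison."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (5 - len(parts))
    return tuple(parts[:5])


def has_fix_by_version(version):
    """True/False if the note settles it from the version alone, None otherwise."""
    v = parse_version(version)
    if v[:2] >= (11, 2):
        return True   # the note lists only versions below 11.2 as affected
    fixed = FIXED_IN.get(v[:2])
    if fixed is None:
        return None   # branch not mentioned in the "Fixed" list
    return v >= fixed


if __name__ == "__main__":
    for candidate in ("10.2.0.3", "10.2.0.4", "10.2.0.4.1", "10.2.0.5", "11.1.0.6", "11.2.0.1"):
        print(candidate, has_fix_by_version(candidate))
```

An instance reporting 10.2.0.4 or 10.2.0.3, the two versions the note confirms as affected, returns False here unless one of the one-off patches has been applied.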
The best way to resolve bug 6500033 is to disable DRM. Exactly which parameters to change, and how, is covered extensively online.
Introductions to DRM and related topics:
http://www.banping.com/2009/08/26/rac_drm/
http://blog.csdn.net/inthirties/article/details/4875535
Source: ITPUB blog, http://blog.itpub.net/16719800/viewspace-719364/. If you repost this article, please credit the source; otherwise legal action may be taken.