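The failure occurred at around 17:10 on September 21.
The alert*.log files under crs show nothing relevant for that time window.
Before the failure, crsd.log contains a large number (nearly two hundred lines) of entries like these: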
2010-09-21 15:25:44.592: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [27943392] retval lht [-27] Signal CV.
2010-09-21 15:31:13.508: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [27943392] retval lht [-27] Signal CV.
2010-09-21 16:41:19.219: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [27943392] retval lht [-27] Signal CV.
2010-09-21 16:42:49.423: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [30417968] retval lht [-27] Signal CV.
2010-09-21 16:43:02.080: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [30417968] retval lht [-27] Signal CV.
2010-09-21 16:43:51.942: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [30417968] retval lht [-27] Signal CV.
2010-09-21 16:44:04.586: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [30417968] retval lht [-27] Signal CV.
2010-09-21 16:44:54.440: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [30417968] retval lht [-27] Signal CV.
2010-09-21 17:00:19.072: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [27936096] retval lht [-27] Signal CV.
After that, the following begins (judging from this, did the VIP resource go offline first? Is it the VIP?):
2010-09-21 17:08:56.373: [ CRSAPP][2899] CheckResource error for ora.ggdb2.vip error code = 1
2010-09-21 17:08:56.377: [ CRSRES][2899] In stateChanged, ora.ggdb2.vip target is ONLINE
2010-09-21 17:08:56.377: [ CRSRES][2899] ora.ggdb2.vip on ggdb1 went OFFLINE unexpectedly
2010-09-21 17:08:56.377: [ CRSRES][2899] StopResource: setting CLI values
2010-09-21 17:08:56.380: [ CRSRES][2899] Attempting to stop `ora.ggdb2.vip` on member `ggdb1`
2010-09-21 17:08:56.776: [ CRSRES][2899] Stop of `ora.ggdb2.vip` on member `ggdb1` succeeded.
2010-09-21 17:08:56.777: [ CRSRES][2899] ora.ggdb2.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2010-09-21 17:08:56.787: [ CRSRES][2899] ora.ggdb2.vip failed on ggdb1 relocating.
2010-09-21 17:08:56.875: [ CRSRES][2899] Attempting to start `ora.ggdb2.vip` on member `ggdb2`
2010-09-21 17:09:05.163: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [27930144] retval lht [-27] Signal CV.
2010-09-21 17:09:05.552: [ CRSAPP][2902] CheckResource error for ora.ggdb1.vip error code = 1
2010-09-21 17:09:05.555: [ CRSRES][2902] In stateChanged, ora.ggdb1.vip target is ONLINE
2010-09-21 17:09:05.556: [ CRSRES][2902] ora.ggdb1.vip on ggdb1 went OFFLINE unexpectedly
2010-09-21 17:09:05.556: [ CRSRES][2902] StopResource: setting CLI values
2010-09-21 17:09:05.559: [ CRSRES][2902] Attempting to stop `ora.ggdb1.vip` on member `ggdb1`
2010-09-21 17:09:05.941: [ CRSRES][2902] Stop of `ora.ggdb1.vip` on member `ggdb1` succeeded.
2010-09-21 17:09:05.942: [ CRSRES][2902] ora.ggdb1.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2010-09-21 17:09:05.953: [ CRSRES][2902] ora.ggdb1.vip failed on ggdb1 relocating.
2010-09-21 17:09:06.002: [ CRSRES][2902] StopResource: setting CLI values
2010-09-21 17:09:06.004: [ CRSRES][2902] Attempting to stop `ora.ggdb1.LISTENER_ggdb1.lsnr` on member `ggdb1`
2010-09-21 17:09:06.140: [ CRSRES][2899] Start of `ora.ggdb2.vip` on member `ggdb2` failed.
2010-09-21 17:10:22.949: [ CRSRES][2902] Stop of `ora.ggdb1.LISTENER_ggdb1.lsnr` on member `ggdb1` succeeded.
2010-09-21 17:10:22.950: [ CRSRES][2902] StopResource: setting CLI values
2010-09-21 17:10:22.952: [ CRSRES][2902] Attempting to stop `ora.ggggg.ggggg1.inst` on member `ggdb1`
2010-09-21 17:10:35.516: [ CRSRES][2902] Stop of `ora.ggggg.ggggg1.inst` on member `ggdb1` succeeded.
2010-09-21 17:10:35.531: [ CRSRES][2902] Attempting to start `ora.ggdb1.vip` on member `ggdb2`
2010-09-21 17:10:44.692: [ CRSRES][2902] Start of `ora.ggdb1.vip` on member `ggdb2` failed.
2010-09-21 17:10:44.726: [ CRSRES][2902] Attempting to start `ora.ggdb1.vip` on member `ggdb2`
2010-09-21 17:10:53.993: [ CRSRES][2902] Start of `ora.ggdb1.vip` on member `ggdb2` failed.
2010-09-21 17:11:21.331: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [16923792] retval lht [-27] Signal CV.
2010-09-21 17:20:12.632: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [27931488] retval lht [-27] Signal CV.
2010-09-21 17:40:13.439: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [16927056] retval lht [-27] Signal CV.
2010-09-21 18:00:14.248: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [30418928] retval lht [-27] Signal CV.
2010-09-21 18:01:25.074: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [30416048]
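To check which resource went offline first, the crsd.log excerpt can be reduced to a per-resource timeline. Below is a minimal Python sketch, assuming the line layout seen in the excerpt above (`timestamp: [ MODULE][thread] message`). The function names `resource_timeline` and `first_unexpected_offline` and the `SAMPLE` list are made up for illustration and are not part of any Oracle tool.

```python
import re

# Assumed layout, taken from the excerpt above:
# 2010-09-21 17:08:56.377: [ CRSRES][2899] ora.ggdb2.vip on ggdb1 went OFFLINE unexpectedly
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(?P<module>\w+)\]\[(?P<tid>\d+)\]\s*(?P<msg>.*)$"
)
# Resource names in the excerpt all start with "ora."
RESOURCE_RE = re.compile(r"ora\.[\w.]+")


def resource_timeline(lines):
    """Return (timestamp, resource, message) for CRSRES/CRSAPP lines that name a resource."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("module") not in ("CRSRES", "CRSAPP"):
            continue  # skip OCRSRV noise and lines in another format
        res = RESOURCE_RE.search(m.group("msg"))
        if res:
            # rstrip guards against a sentence-ending dot being captured
            events.append((m.group("ts"), res.group(0).rstrip("."), m.group("msg")))
    return events


def first_unexpected_offline(events):
    """Return [(resource, timestamp)] for the first unexpected OFFLINE of each resource."""
    first = {}
    for ts, res, msg in events:
        if "went OFFLINE unexpectedly" in msg and res not in first:
            first[res] = ts
    # Timestamps are fixed-width, so string order equals time order
    return sorted(first.items(), key=lambda kv: kv[1])


# A few lines copied from the excerpt above, for illustration only
SAMPLE = [
    "2010-09-21 17:08:56.373: [ CRSAPP][2899] CheckResource error for ora.ggdb2.vip error code = 1",
    "2010-09-21 17:08:56.377: [ CRSRES][2899] ora.ggdb2.vip on ggdb1 went OFFLINE unexpectedly",
    "2010-09-21 17:09:05.163: [ OCRSRV][34]th_select_handler: Failed to retrieve procctx from ht. constr = [27930144] retval lht [-27] Signal CV.",
    "2010-09-21 17:09:05.556: [ CRSRES][2902] ora.ggdb1.vip on ggdb1 went OFFLINE unexpectedly",
]

if __name__ == "__main__":
    for res, ts in first_unexpected_offline(resource_timeline(SAMPLE)):
        print(ts, res)
```

Running the same functions over the full crsd.log (for example `resource_timeline(open("crsd.log"))`) should list every resource that dropped unexpectedly in time order, which makes it easier to see whether the VIPs were the first to go.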
ocssd.log for the same period shows:
[ CSSD]2010-09-21 17:09:05.688 [9] >TRACE: clssgmClientConnectMsg: Connect from con(600000000003f980) proc(6000000000cd9d50) pid() proto(10:2:1:1)
[ CSSD]2010-09-21 17:09:06.167 [9] >TRACE: clssgmClientConnectMsg: Connect from con(6000000000cf7ec0) proc(6000000000cd96f0) pid() proto(10:2:1:1)
[ CSSD]2010-09-21 17:10:12.244 [9] >TRACE: clssgmClientConnectMsg: Connect from con(600000000003f680) proc(6000000000cd9a20) pid() proto(10:2:1:1)
[ CSSD]2010-09-21 17:10:19.596 [9] >TRACE: clssgmClientConnectMsg: Connect from con(6000000000cf8040) proc(6000000000cd9a20) pid() proto(10:2:1:1)
[ CSSD]2010-09-21 17:10:23.258 [9] >TRACE: clssgmClientConnectMsg: Connect from con(6000000000cf7ec0) proc(6000000000c90060) pid() proto(10:2:1:1)
[ CSSD]2010-09-21 17:10:32.419 [12] >TRACE: clscsendx: (600000000003cc80) Connection not active
[ CSSD]2010-09-21 17:10:32.419 [12] >TRACE: clssgmSendClient: Send failed rc 6, con (600000000003cc80), client (6000000000041dd0), proc (0000000000000000)
[ CSSD]2010-09-21 17:10:32.419 [12] >TRACE: clscsendx: (600000000003cd40) Connection not active
[ CSSD]2010-09-21 17:10:32.419 [12] >TRACE: clssgmSendClient: Send failed rc 6, con (600000000003cd40), client (6000000000041ef0), proc (0000000000000000)
[ CSSD]2010-09-21 17:10:32.421 [12] >TRACE: clscsendx: (600000000003d1c0) Connection not active
[ CSSD]2010-09-21 17:10:32.421 [12] >TRACE: clssgmSendClient: Send failed rc 6, con (600000000003d1c0), client (6000000000042010), proc (0000000000000000)
[ CSSD]2010-09-21 17:10:32.421 [12] >TRACE: clscsendx: (6000000000cf7680) Connection not active
[ CSSD]2010-09-21 17:10:32.421 [12] >TRACE: clssgmSendClient: Send failed rc 6, con (6000000000cf7680), client (6000000000043a80), proc (0000000000000000)
[ CSSD]2010-09-21 17:10:32.422 [12] >TRACE: clscsendx: (600000000003c8c0) Connection not active
[ CSSD]2010-09-21 17:10:32.422 [12] >TRACE: clssgmSendClient: Send failed rc 6, con (600000000003c8c0), client (6000000000041c20), proc (0000000000000000)
[ CSSD]2010-09-21 17:11:21.358 [9] >TRACE: clssgmClientConnectMsg: Connect from con(6000000000cf7e00) proc(6000000000c90170) pid() proto(10:2:1:1)
[ CSSD]2010-09-21 17:11:21.758 [9] >TRACE: clssgmClientConnectMsg: Connect from con(600000000003c800) proc(6000000000cd9800) pid() proto(10:2:1:1)
[ CSSD]2010-09-21 17:20:12.658 [9] >TRACE: clssgmClientConnectMsg: Connect from con(6000000000cf7e00) proc(60000000000fe160) pid() proto(10:2:1:1)
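To line the ocssd.log entries up against the crsd.log timestamps, it can help to count CSSD trace messages per minute and per function. This is only a rough Python sketch under the assumption that lines follow the `[ CSSD]timestamp [thread] >LEVEL: function: message` layout seen in the excerpt above; `CSSD_RE`, `count_by_minute` and `SAMPLE` are hypothetical names chosen here.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# [ CSSD]2010-09-21 17:10:32.419 [12] >TRACE: clscsendx: (600000000003cc80) Connection not active
# The leading "[" is optional because a pasted line may have lost it.
CSSD_RE = re.compile(
    r"^\[?\s*CSSD\](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<tid>\d+)\] >(?P<level>\w+):\s+(?P<func>\w+):\s*(?P<msg>.*)$"
)


def count_by_minute(lines):
    """Count CSSD trace lines per (minute, function name)."""
    counts = Counter()
    for line in lines:
        m = CSSD_RE.match(line.strip())
        if m:
            minute = m.group("ts")[:16]  # "YYYY-MM-DD HH:MM"
            counts[(minute, m.group("func"))] += 1
    return counts


# A few lines copied from the excerpt above, for illustration only
SAMPLE = [
    "CSSD]2010-09-21 17:09:05.688 [9] >TRACE: clssgmClientConnectMsg: Connect from con(600000000003f980) proc(6000000000cd9d50) pid() proto(10:2:1:1)",
    "[ CSSD]2010-09-21 17:10:32.419 [12] >TRACE: clscsendx: (600000000003cc80) Connection not active",
    "[ CSSD]2010-09-21 17:10:32.419 [12] >TRACE: clssgmSendClient: Send failed rc 6, con (600000000003cc80), client (6000000000041dd0), proc (0000000000000000)",
]

if __name__ == "__main__":
    for (minute, func), n in sorted(count_by_minute(SAMPLE).items()):
        print(minute, func, n)
```

In the excerpt, the "Send failed rc 6" burst is stamped 17:10:32, which falls between the crsd.log "Attempting to stop" (17:10:22.952) and "Stop ... succeeded" (17:10:35.516) entries for `ora.ggggg.ggggg1.inst`; a per-minute count like this is one way to check whether that overlap holds across the whole log.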
[This post was last edited by eigrpeigrp on 2010-9-27 14:58]