[Oracle 11g R2 (11.2.0.4.0)] Case Analysis 7: Cluster Reconfiguration Caused by Loss of the Local Heartbeat

(1) cssdagent.log or cssdmonitor.log (11gR2)

2013-10-15 20:20:49.426: [USRTHRD] [1548] (:CLSN00111:)clsnproc_needreboot:
Impending reboot at 50% of limit 27945; disk timeout 27742, network timeout
27945, last heartbeat from CSSD at epoch seconds 1381839635.115, 14311
milliseconds ago based on invariant clock 3906743916; now polling at 100 ms
2013-10-15 20:20:56.117: [USRTHRD] [1548] (:CLSN00111:)clsnproc_needreboot:
Impending reboot at 75% of limit 27945; disk timeout 27742, network timeout
27945, last heartbeat from CSSD at epoch seconds 1381839635.115, 21002
milliseconds ago based on invariant clock 3906743916; now polling at 100 ms
2013-10-15 20:21:00.347: [USRTHRD] [1548] (:CLSN00111:)clsnproc_needreboot:
Impending reboot at 90% of limit 27945; disk timeout 27742, network timeout
27945, last heartbeat from CSSD at epoch seconds 1381839635.115, 25232
milliseconds ago based on invariant clock 3906743916; now polling at 100 ms

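The log above shows that cssdagent starts writing these messages once ocssd.bin has continuously missed local heartbeats for 50% of misscount. It then goes on to reboot the local node directly and records the details in the corresponding reboot advisory.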
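The percentages in these messages can be checked against the elapsed time each message reports. The following is a minimal sketch, not an Oracle tool: the function name `heartbeat_progress` and the regular expression are illustrative assumptions based only on the message layout shown in the excerpt above.

```python
import re

# Assumed message layout, taken from the excerpt above:
# "Impending reboot at <pct>% of limit <ms>; ... <ms> milliseconds ago ..."
PATTERN = re.compile(
    r"Impending reboot at (\d+)% of limit (\d+);.*?(\d+)\s+milliseconds ago",
    re.DOTALL,
)


def heartbeat_progress(message):
    """Return (reported threshold %, limit ms, elapsed ms, actual % of limit)."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("not an 'Impending reboot' message")
    threshold, limit_ms, elapsed_ms = (int(group) for group in match.groups())
    actual_pct = round(100.0 * elapsed_ms / limit_ms, 1)
    return threshold, limit_ms, elapsed_ms, actual_pct


# First message from the excerpt, joined into a single string.
SAMPLE = (
    "Impending reboot at 50% of limit 27945; disk timeout 27742, network timeout "
    "27945, last heartbeat from CSSD at epoch seconds 1381839635.115, 14311 "
    "milliseconds ago based on invariant clock 3906743916; now polling at 100 ms"
)

if __name__ == "__main__":
    print(heartbeat_progress(SAMPLE))
```

Applied to the three messages, the elapsed times of 14311, 21002 and 25232 ms work out to roughly 51.2%, 75.2% and 90.3% of the 27945 ms limit, so each message is written shortly after its threshold is crossed.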
(2) After the reboot, the GI alert.log contains the following messages:

2013-10-15 20:27:24.746
[ohasd(2294128)]CRS-8011:reboot advisory message from host: ****, component:
cssagent, with time stamp: L-2013-10-15-20:21:03.134
[ohasd(2294128)]CRS-8013:reboot advisory message text: Rebooting after limit
27945 exceeded; disk timeout 27742, network timeout 27945, last heartbeat from CSSD at epoch seconds 1381839635.115, 28019 milliseconds ago based on invariant clock value of 3906743916
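The reboot advisory can be cross-checked against the epoch value it quotes. The sketch below assumes that the node clock is UTC+8 and that the `L-` prefix on the advisory time stamp denotes local time; neither is stated in the log, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the node runs at UTC+8; adjust for the cluster's real time zone.
LOCAL_TZ = timezone(timedelta(hours=8))


def last_heartbeat_local(epoch_seconds):
    """Convert the 'epoch seconds' value from the message to local time."""
    return datetime.fromtimestamp(epoch_seconds, tz=LOCAL_TZ)


def gap_ms(advisory_stamp, epoch_seconds):
    """Milliseconds between the last CSSD heartbeat and the reboot advisory."""
    advisory = datetime.strptime(
        advisory_stamp, "%Y-%m-%d-%H:%M:%S.%f"
    ).replace(tzinfo=LOCAL_TZ)
    delta = advisory - last_heartbeat_local(epoch_seconds)
    return round(delta.total_seconds() * 1000)


if __name__ == "__main__":
    print(last_heartbeat_local(1381839635.115).strftime("%Y-%m-%d %H:%M:%S"))
    print(gap_ms("2013-10-15-20:21:03.134", 1381839635.115))
```

Under that assumption the last heartbeat falls at 20:20:35 local time, and the gap to the advisory time stamp is 28019 ms, which matches the "28019 milliseconds ago" in the CRS-8013 text and is just over the 27945 ms limit.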

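When cssdagent/cssdmonitor detects that the local heartbeat has been lost, it concludes that the ocssd.bin process itself has a problem, so it reboots the local node directly without notifying the remote nodes. In other words, some time (misscount/2) after the local node restarts, messages like the following appear in ocssd.log on the remote node, and the cluster is then reconfigured. However, these messages do not mean that the earlier node reboot was caused by missing network heartbeats (NHB).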

2013-10-15 20:20:51.585
[cssd(54723786)]CRS-1612:Network communication with node **** (2) missing for 50% of timeout interval. Removal of this node from cluster in 14.992 seconds
2013-10-15 20:20:59.635
[cssd(54723786)]CRS-1611:Network communication with node **** (2) missing for 75% of timeout interval. Removal of this node from cluster in 6.942 seconds
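To place the remote node's warnings on the same timeline as the local reboot, the remaining time in each message can be added to its time stamp. This is a small sketch; `removal_deadline` is a hypothetical helper, and the values are copied from the two messages above.

```python
from datetime import datetime, timedelta


def removal_deadline(stamp, seconds_left):
    """Time at which the remote node expects to evict the silent node."""
    seen = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    return seen + timedelta(seconds=seconds_left)


if __name__ == "__main__":
    at_50 = removal_deadline("2013-10-15 20:20:51.585", 14.992)
    at_75 = removal_deadline("2013-10-15 20:20:59.635", 6.942)
    print(at_50, at_75)
```

Both warnings point to the same removal time, 20:21:06.577, which is later than the reboot advisory time stamp of 20:21:03.134 recorded on the local node. The local reboot had therefore already been triggered before the network timeout on the remote node expired.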