Oracle RAC CSS Timeout Computation and the misscount and disktimeout Parameters

http://blog.csdn.net/tianlesoftware/article/details/6728885

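1. Overview

An earlier article:

       Some RAC concepts and principles

       http://blog.csdn.net/tianlesoftware/article/details/5331067

       mentioned that OCSSD is the most critical Clusterware process: if it fails, the node reboots. This process provides the CSS (Cluster Synchronization Services) service. CSS uses several heartbeat mechanisms to monitor the cluster state in real time and provides basic cluster services such as split-brain protection.

       CSS has two heartbeat mechanisms: the Network Heartbeat over the private network, and the Disk Heartbeat through the Voting Disk.

       Each heartbeat has a maximum allowed delay. For the Disk Heartbeat this delay is called IOT (I/O Timeout); for the Network Heartbeat it is called MC (Misscount). Both parameters are in seconds, and by default IOT is greater than MC. By default Oracle determines both values automatically, and adjusting them is not recommended.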

The parameter values can be viewed with the following commands:

$ crsctl get css disktimeout

$ crsctl get css misscount

For example:

[oracle@rac1 ~]$ crsctl get css disktimeout

200

[oracle@rac1 ~]$ crsctl get css misscount

60

These are the default values of the two parameters.
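The default relationship described above (the disk heartbeat timeout is larger than the network heartbeat timeout) can be checked with a small script. This is only a minimal sketch: the function name check_css_timeouts is made up for illustration and is not part of Oracle Clusterware, and it assumes both values are whole seconds, as in the example output.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): compares the two CSS timeouts.
#   $1 = misscount in seconds, $2 = disktimeout in seconds
check_css_timeouts() {
    misscount=$1
    disktimeout=$2
    if [ "$disktimeout" -gt "$misscount" ]; then
        echo "OK: disktimeout ($disktimeout s) > misscount ($misscount s)"
    else
        echo "WARNING: disktimeout ($disktimeout s) <= misscount ($misscount s)"
        return 1
    fi
}

# Values taken from the example output above.
check_css_timeouts 60 200
```

On a cluster node the arguments could be supplied as "$(crsctl get css misscount)" and "$(crsctl get css disktimeout)", assuming crsctl prints only the number, as it does in the example above.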

2. Related MOS Notes

How to start/stop the 10g CRS ClusterWare [ID 309542.1]

10g RAC: Steps To Increase CSS Misscount, Reboottime and Disktimeout [ID 284752.1]

CSS Timeout Computation in Oracle Clusterware [ID 294430.1]

RAC Assurance Support Team: RAC and Oracle Clusterware Starter Kit and Best Practices (Generic) [ID 810394.1]

2.1 Steps to modify CSS Misscount:

  1) Shut down CRS on all but one node. For exact steps use Note 309542.1

  2) Execute crsctl as root to modify the misscount:

    $ORA_CRS_HOME/bin/crsctl set css misscount N

    where N is the maximum i/o latency to the voting disk + 1 second

  3) Reboot the node where adjustment was made

  4) Start all other nodes shutdown in step 1
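The arithmetic in step 2 (maximum voting disk I/O latency plus 1 second) can be sketched as follows. The function name suggest_misscount and the sample latencies are hypothetical, and the sketch assumes latencies measured in whole seconds; how those latencies are collected is outside the scope of the note.

```shell
#!/bin/sh
# Hypothetical helper: given observed voting-disk I/O latencies in whole
# seconds, print the largest one plus 1 second.
suggest_misscount() {
    max=0
    for latency in "$@"; do
        if [ "$latency" -gt "$max" ]; then
            max=$latency
        fi
    done
    echo $((max + 1))
}

# Made-up sample latencies (seconds); real values would come from monitoring.
suggest_misscount 12 41 27
```

The printed value is what would be passed to crsctl set css misscount in step 2.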

With Patch:4896338 for 10.2.0.1 there are two additional settings that can be tuned. This change is incorporated into the 10.2.0.2 and 10.1.0.6 patchsets.

The following are only relevant on 10.2.0.1 with Patch:4896338. In addition to MissCount, CSS now has two more parameters:

  1) reboottime (default 3 seconds) - the amount of time allowed for a node to complete a reboot after the CSS daemon has been evicted. (I.e. how long does it take for the machine to completely shut down when you do a reboot)

  2) disktimeout (default 200 seconds) - the maximum amount of time allowed for a voting file I/O to complete; if this time is exceeded the voting disk will be marked as offline. Note that this is also the amount of time that will be required for initial cluster formation, i.e. when no nodes have previously been up and in a cluster.

      $CRS_HOME/bin/crsctl set css reboottime N [-force]  (N is seconds)

      $CRS_HOME/bin/crsctl set css disktimeout N [-force]  (N is seconds)

Confirm the new css misscount setting via ocrdump

2.2 CSS Timeout Computation in Oracle Clusterware

2.2.1 MISSCOUNT DEFINITION AND DEFAULT VALUES
       The CSS misscount parameter represents the maximum time, in seconds, that a network heartbeat can be missed before entering into a cluster reconfiguration to evict the node. The following are the default values for the misscount parameter and their respective versions when using Oracle Clusterware* in seconds:

       *CSS misscount default value when using vendor (non-Oracle) clusterware is 600 seconds. This is to allow the vendor clusterware ample time to resolve any possible split brain scenarios.

       On AIX platforms with HACMP starting with 10.2.0.3 BP#1, the misscount is 30. This is documented in Note 551658.1

2.2.2 CSS HEARTBEAT MECHANISMS AND THEIR INTERRELATIONSHIP
       The synchronization services component (CSS) of the Oracle Clusterware maintains two heartbeat mechanisms

1.) the disk heartbeat to the voting device and

2.) the network heartbeat across the interconnect which establish and confirm valid node membership in the cluster.

       Both of these heartbeat mechanisms have an associated timeout value. The disk heartbeat has an internal i/o timeout interval (DTO Disk TimeOut), in seconds, where an i/o to the voting disk must complete. The misscount parameter (MC), as stated above, is the maximum time, in seconds, that a network heartbeat can be missed. The disk heartbeat i/o timeout interval is directly related to the misscount parameter setting. There has been some variation in this relationship between versions as described below:

9.x.x.x: NOTE, MISSCOUNT WAS A DIFFERENT ENTITY IN THIS RELEASE

10.1.0.2: No one should be on this version

10.1.0.3: DTO = MC - 15 seconds

10.1.0.4: DTO = MC - 15 seconds

10.1.0.4 + unpublished Bug 3306964: DTO = MC - 3 seconds

10.1.0.4 with CRS II Merge patch: DTO = Disktimeout (defaults to 200 seconds) normally, OR Misscount seconds only during initial cluster formation or slightly before reconfiguration

10.1.0.5: IOT = MC - 3 seconds

10.2.0.1 + fix for unpublished Bug 4896338: IOT = Disktimeout (defaults to 200 seconds) normally, OR Misscount seconds only during initial cluster formation or slightly before reconfiguration

10.2.0.2: Same as above (10.2.0.1 with Patch Bug:4896338)

10.1 - 11.1: During node join and leave (reconfiguration) in a cluster we need to reconfigure; in that particular case we use the Short Disk TimeOut (SDTO), which is in all versions SDTO = MC - ...
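The per-version rules listed above can be turned into a small lookup. This is a hedged sketch and not an Oracle utility: the function name disk_io_timeout and its arguments are invented for illustration, only the base releases 10.1.0.3, 10.1.0.4, 10.1.0.5 and 10.2.0.2 are modeled (the patched variants are left out), and the phase value "formation" stands for both initial cluster formation and the moment slightly before reconfiguration.

```shell
#!/bin/sh
# Hypothetical helper: print the disk heartbeat I/O timeout in seconds
# for a base release, following the version list above.
#   $1 = release, $2 = misscount (MC), $3 = disktimeout,
#   $4 = phase: "normal" (default) or "formation"
disk_io_timeout() {
    release=$1
    mc=$2
    disktimeout=$3
    phase=${4:-normal}
    case "$release" in
        10.1.0.3|10.1.0.4)
            echo $((mc - 15)) ;;
        10.1.0.5)
            echo $((mc - 3)) ;;
        10.2.0.2)
            # disktimeout normally; misscount during the "formation" phase
            if [ "$phase" = "formation" ]; then
                echo "$mc"
            else
                echo "$disktimeout"
            fi ;;
        *)
            echo "unknown release: $release" >&2
            return 1 ;;
    esac
}

# Sample calls with misscount 60 and disktimeout 200.
disk_io_timeout 10.1.0.3 60 200
disk_io_timeout 10.1.0.5 60 200
disk_io_timeout 10.2.0.2 60 200
disk_io_timeout 10.2.0.2 60 200 formation
```

With a misscount of 60 and a disktimeout of 200, the four calls print 45, 57, 200 and 60 respectively.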

Source: ITPUB blog, link: http://blog.itpub.net/24867586/viewspace-711638/. If you repost this article, please credit the source; otherwise legal action will be pursued.
