The hangcheck-timer Module for I/O Fencing with Oracle 9i/10g/11gR1 on Linux

This article describes the hangcheck-timer module used for I/O fencing with Oracle 9i/10g/11gR1 on Linux. Hangcheck-timer is a kernel-level I/O fencing mechanism provided by Linux.

1. Official documentation

Reference, from My Oracle Support (MOS):

9i, 10g, and 11gR1 RAC [ID 726833.1]

The hangcheck-timer module is required to run a supported configuration in Oracle Real Application Clusters environments on Linux, with Oracle releases 9i, 10g, or 11gR1 RAC. This note identifies and outlines the requirements needed to configure hangcheck-timer in an Oracle Enterprise Linux, Red Hat Linux, or SUSE Linux environment.

Note: The hangcheck timer is not required starting with Oracle Clusterware 11gR2.

Starting with release 9.2.0.2, Oracle RAC environments required a new I/O fencing model, the hangcheck-timer module. This module was implemented to replace the Watchdog module, which provided similar fencing functionality. Hangcheck-timer was subsequently delivered as part of the standard kernel distribution for Linux kernel releases 2.4 and above.

Hangcheck-timer should be loaded at boot time. It monitors the Linux kernel for long operating system hangs that could affect the reliability of a RAC node. It runs in kernel mode and uses the Time Stamp Counter (TSC) to catch scheduling delays or node hangs. It does this by setting a timer and, when the timer fires, checking whether it was delayed by more than the allowed margin of error. If the elapsed time exceeds the allowed (hangcheck_tick + hangcheck_margin) seconds, the machine is restarted. Hangcheck-timer will not cause reboots to occur due to CPU starvation.

Hangcheck-timer requires three configuration parameters:

(1) hangcheck_tick - defines how often, in seconds, the hangcheck-timer checks the node for hangs. The default value is 60 seconds.

(2) hangcheck_margin - defines how much margin is allowed, in seconds, between expected scheduling and real scheduling time. The default value is 180 seconds.

(3) hangcheck_reboot - determines if the hangcheck-timer restarts the node if the kernel fails to respond within the sum of the hangcheck_tick and hangcheck_margin parameter values. If the value of hangcheck_reboot is equal to or greater than 1, then the hangcheck-timer module restarts the system. If the hangcheck_reboot parameter is set to zero, then the hangcheck-timer module will not reboot the node, even if a hang is detected. The default value varies by kernel version. In the 2.4 kernel, the default is 1. In 2.6 kernels, the default is 0.

When hangcheck_reboot=1 and the following condition holds, hangcheck-timer reboots the system: system hang time > (hangcheck_tick + hangcheck_margin)
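The rule above can be modelled with a small shell function. This is only a sketch of the documented condition, not the kernel module's code; the function name hangcheck_outcome and its three result strings are made up for illustration.

```shell
# Hypothetical helper modelling the documented rule (not the kernel code).
# Usage: hangcheck_outcome HANG_SECONDS TICK MARGIN REBOOT
hangcheck_outcome() {
    hang=$1
    tick=$2
    margin=$3
    reboot=$4
    if [ "$hang" -le $((tick + margin)) ]; then
        # The hang stayed within hangcheck_tick + hangcheck_margin.
        echo "none"
    elif [ "$reboot" -ge 1 ]; then
        # Past the margin and hangcheck_reboot >= 1: the node is restarted.
        echo "reboot"
    else
        # Past the margin but hangcheck_reboot = 0: only a message is logged.
        echo "log-only"
    fi
}
```

For example, `hangcheck_outcome 12 1 10 1` prints `reboot`, because 12 seconds is more than 1 + 10.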

All hangcheck-timer default values should be explicitly overridden when loading the kernel module, based on the Oracle release as follows:

9i (assuming the default "oracle misscount" setting of 220 seconds):

hangcheck_tick=30 hangcheck_margin=180 hangcheck_reboot=1

10g/11gR1 (assuming a "CSS misscount" setting of 30 or 60 seconds):

hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1

You must always ensure that the Cluster misscount setting is greater than the sum of the settings for hangcheck_tick + hangcheck_margin.
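As a quick sanity check of this rule, the recommended 9i values give 30 + 180 = 210, which is below a misscount of 220, and the 10g/11gR1 values give 1 + 10 = 11, which is below a misscount of 30. A minimal sketch of the same check as a shell function follows; misscount_ok is a hypothetical name, and the misscount value has to be supplied by the caller.

```shell
# Hypothetical helper: succeeds (exit status 0) only when
# misscount > hangcheck_tick + hangcheck_margin.
# Usage: misscount_ok MISSCOUNT TICK MARGIN
misscount_ok() {
    misscount=$1
    tick=$2
    margin=$3
    [ "$misscount" -gt $((tick + margin)) ]
}
```

A possible use is `misscount_ok 30 1 10 && echo "settings are consistent"`.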

When running Oracle Clusterware on Linux, hangcheck-timer should always be configured on each RAC cluster node, as the functionality of this module is required to provide I/O fencing to ensure no stray writes will occur from an evicted node in a RAC cluster. To verify if the hangcheck-timer module is running on a node, execute as the root or oracle user:

# /sbin/lsmod | grep hangcheck

hangcheck-timer 2672 0

If the hangcheck-timer module is loaded (running) you will see output similar to the above. When hangcheck-timer is not loaded, no output is generated, and the command prompt is returned to the user.
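For use in scripts, the same check can be wrapped in a function that reads lsmod output on standard input. This is a sketch: hangcheck_loaded is a hypothetical name, and matching both hangcheck-timer and hangcheck_timer is an assumption made because some kernels list the module with an underscore.

```shell
# Hypothetical helper: succeeds if the lsmod output on stdin lists the
# hangcheck-timer module (hyphen or underscore spelling).
hangcheck_loaded() {
    grep -Eq '^hangcheck[-_]timer[[:space:]]'
}
```

A possible use is `/sbin/lsmod | hangcheck_loaded && echo "hangcheck-timer is loaded"`.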

In an Oracle Enterprise Linux, Red Hat 4/5, or SUSE 9/10 environment the hangcheck-timer module is loaded using the modprobe command:

# modprobe hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1

To ensure the module is loaded at boot time, you should also place the same command in the appropriate local startup script (e.g. /etc/rc.d/rc.local, or /etc/init.d/boot.local). In earlier releases, hangcheck-timer was loaded using insmod in place of modprobe. Consult your release-specific documentation to determine which initialization method is required.
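One way to add the command to a startup script without duplicating it on repeated runs is sketched below. The function name add_hangcheck_to_rc is hypothetical, the /sbin/modprobe path is an assumption, and the parameter values are the 10g/11gR1 ones; adjust them for 9i.

```shell
# Hypothetical helper: append the modprobe line to the given startup
# script unless a hangcheck-timer modprobe line is already present.
# Usage: add_hangcheck_to_rc /etc/rc.d/rc.local
add_hangcheck_to_rc() {
    rc_file=$1
    # /sbin/modprobe is an assumed path; values are for 10g/11gR1.
    line='/sbin/modprobe hangcheck-timer hangcheck_tick=1 hangcheck_margin=10 hangcheck_reboot=1'
    if ! grep -q 'modprobe hangcheck-timer' "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

Run as root, `add_hangcheck_to_rc /etc/rc.d/rc.local` would add the line once and leave the file unchanged on later calls.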

Hangcheck-timer logs to the system messages log when a failure is detected and a node restart is initiated by the module:

(1) When hangcheck-timer reboots the node, it may leave the message "Hangcheck: hangcheck is restarting the machine" in /var/log/messages.

(2) If you see the following message in /var/log/messages: "Hangcheck: hangcheck value past margin!", a reboot was required but was not performed, because hangcheck_reboot was not set to 1. If this message is seen, you must reload the hangcheck module as described earlier in this note, with the hangcheck_reboot value set to 1.
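The two messages can be searched for with a small function that reads log text on standard input. This is a sketch: hangcheck_log_status and its result strings are hypothetical, and it reports the "past margin" case first because that is the one that calls for action.

```shell
# Hypothetical helper: classify log text read from stdin using the two
# messages quoted above.
hangcheck_log_status() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'Hangcheck: hangcheck value past margin!'; then
        # A reboot was required but hangcheck_reboot was not set to 1.
        echo "missed-reboot"
    elif printf '%s\n' "$log" | grep -q 'Hangcheck: hangcheck is restarting the machine'; then
        # The module initiated a node restart.
        echo "rebooted"
    else
        echo "clean"
    fi
}
```

A possible use is `hangcheck_log_status < /var/log/messages`, run as a user who can read that file.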

Note:

Bug 6125546 can prevent hangcheck-timer from rebooting the node on RHEL4 (fixed in 2.6.9.56 or RHEL4.6).

Source: php中文网
