VIPs Often Go Offline Unexpectedly and Relocate to Another Node (Doc ID 1297867.1)

In this Document

 Symptoms
 Cause
 Solution


APPLIES TO:

Oracle Database - Enterprise Edition - Version 10.2.0.5 and later
Information in this document applies to any platform.

SYMPTOMS


VIPs often go offline unexpectedly, with messages like the following in crsd.log:

2011-02-17 15:11:16.437: [ CRSAPP][11321]32CheckResource error for ora.node02.vip error code = 1
2011-02-17 15:11:16.441: [ CRSRES][11321]32In stateChanged, ora.node02.vip target is ONLINE
2011-02-17 15:11:16.441: [ CRSRES][11321]32 ora.node02.vip on node02 went OFFLINE unexpectedly
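To see how often this happens, the log can be searched for the "went OFFLINE unexpectedly" message. The following is a minimal sketch, not an Oracle-supplied tool: the function name vip_offline_events is hypothetical, and the path to crsd.log is passed in as an argument because its location depends on your installation.

```shell
#!/bin/sh
# Sketch only: print every crsd.log line that reports a VIP resource
# going OFFLINE unexpectedly. vip_offline_events is a hypothetical name.
vip_offline_events() {
    logfile=$1   # path to crsd.log (location depends on the installation)
    # Match lines such as:
    #   ... ora.node02.vip on node02 went OFFLINE unexpectedly
    grep 'ora\..*\.vip on .* went OFFLINE unexpectedly' "$logfile"
}
```

For example, `vip_offline_events /path/to/crsd.log | wc -l` gives the number of such events, and the timestamps in the output can be compared with network monitoring data.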


VIP tracing can be enabled by using the following commands:

#crsctl debug log res "ora.node01.vip:5"
#crsctl debug log res "ora.node02.vip:5"


The following error messages (the "checked failed" lines) can be seen in the generated VIP trace under CRS_HOME/log/node02:

2011-02-18 15:32:39.481: [ RACG][1] [4587556][1][ora.node02.vip]: Fri Feb 18 15:32:37 GMT+08:00 2011 [ 8257768 ] About to execute command: /usr/sbin/ping -S 192.168.220.36 -c 1 -w 1 192.168.220.33
Fri Feb 18 15:32:39 GMT+08:00 2011 [ 8257768 ]  IsIfAlive: RX packets checked if=en1 failed

2011-02-18 15:32:39.481: [ RACG][1] [4587556][1][ora.node02.vip]: Fri Feb 18 15:32:39 GMT+08:00 2011 [ 8257768 ] Interface en1 checked failed (host=node02)
Fri Feb 18 15:32:39 GMT+08:00 2011 [ 8257768 ] IsIfAlive: end for if=en1
Fri Feb 18 15:32:39 GMT+08:00 2011 [ 8257768 ] checkIf: end for if=en1

 

You can reset the VIP tracing to the default level by using the following commands:

#crsctl debug log res "ora.node01.vip:0"
#crsctl debug log res "ora.node02.vip:0"

 

CAUSE

The issue can be caused by poor network performance: the VIP check pings the gateway from the public IP, and the reply does not arrive in time.

See "man ping" on AIX:

-S hostname/IP addr
Uses the IP address as the source address in outgoing ping packets.

-c Count
Specifies the number of echo requests, as indicated by the Count
variable, to be sent (and received).

-w timeout
This option works only with the -c option. It causes ping to wait
for a maximum of 'timeout' seconds for a reply (after sending the
last packet).


So the following command checks whether 1 packet sent from 192.168.220.36 to 192.168.220.33 receives a reply within 1s.

ping -S 192.168.220.36 -c 1 -w 1 192.168.220.33
==>192.168.220.36 is the public IP, 192.168.220.33 is the gateway.


If the network is slow, the reply to the above "ping" command takes longer than 1s, the interface check fails, and the VIP goes offline unexpectedly and relocates to another node.
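To confirm that slow ping replies are the cause, the same check can be repeated by hand and the failures counted. The following is a rough sketch under stated assumptions: probe_gateway, PING_CMD and PROBE_INTERVAL are hypothetical names introduced here, the ping flags are the AIX ones quoted above (other platforms may use different options), and the default ping path is taken from the trace. A failed probe can also mean the gateway is unreachable, not only that the reply was late.

```shell
#!/bin/sh
# Sketch only: repeat the VIP check's ping and count the probes that fail.
# PING_CMD and PROBE_INTERVAL are hypothetical knobs, not Oracle settings.
probe_gateway() {
    src_ip=$1        # public IP used as the source address
    gateway=$2       # gateway address that the check pings
    count=$3         # number of probes to send
    timeout=${4:-1}  # seconds to wait for each reply
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # Same form as the command in the trace: one packet, bounded wait.
        if ! ${PING_CMD:-/usr/sbin/ping} -S "$src_ip" -c 1 -w "$timeout" "$gateway" >/dev/null 2>&1; then
            failures=$((failures + 1))
            # Timestamp each failure on stderr so it can be matched to the logs.
            echo "$(date): no reply within ${timeout}s" >&2
        fi
        i=$((i + 1))
        sleep "${PROBE_INTERVAL:-1}"
    done
    echo "$failures of $count probes failed"
}
```

For example, `probe_gateway 192.168.220.36 192.168.220.33 60 1` sends 60 probes about one second apart with the same 1s limit. Running it again with a timeout of 3 shows whether a longer wait would have avoided the failures.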

SOLUTION

To resolve the issue, please contact your network administrator to tune your network and ensure that the ping reply arrives within 1s.

If you can't improve the network performance, please use the following temporary workaround (which is not recommended):

1. Stop all node applications.
% srvctl stop nodeapps -n <hostname>

2. Back up, then modify, the racgvip script.

Change:
# timeout of ping in number of loops (1 sec)
PING_TIMEOUT=" -c 1 -w 1"

To:
# timeout of ping in number of loops (3 sec)
PING_TIMEOUT=" -c 1 -w 3"

3. Start the node applications and other necessary resources.
% srvctl start nodeapps -n <hostname>
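Step 2 above could be scripted along the following lines. This is only a sketch: raise_ping_timeout is a hypothetical helper, the ".bak" backup name is an arbitrary choice, and the path to racgvip is passed in as an argument because the note does not state it. It assumes the PING_TIMEOUT line has exactly the form shown above, and it leaves the "(1 sec)" comment line unchanged.

```shell
#!/bin/sh
# Sketch only: back up racgvip, then raise the -w value in PING_TIMEOUT.
# raise_ping_timeout is a hypothetical helper, not part of Oracle Clusterware.
raise_ping_timeout() {
    script=$1             # path to the racgvip script (depends on the installation)
    new_timeout=${2:-3}   # new ping wait in seconds
    backup="$script.bak"  # backup name chosen for this sketch

    # Keep a copy of the original, preserving mode and timestamps.
    cp -p "$script" "$backup" || return 1

    # Rewrite only the PING_TIMEOUT assignment, reading from the backup.
    sed "s/^PING_TIMEOUT=\" -c 1 -w [0-9][0-9]*\"/PING_TIMEOUT=\" -c 1 -w $new_timeout\"/" \
        "$backup" > "$script.new" || return 1

    # Overwrite in place so the script keeps its owner and permissions.
    cat "$script.new" > "$script" && rm -f "$script.new"
}
```

After running, for example, `raise_ping_timeout /path/to/racgvip 3`, compare the result with `diff /path/to/racgvip.bak /path/to/racgvip` before restarting the node applications. Only the PING_TIMEOUT line should differ.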

