root.sh Fails to Start HAIP as Default Gateway is Configured for Private Network VLAN (Doc ID 1366211)

Reposted 2016-05-31 22:34:06

Applies to:

Oracle Server - Enterprise Edition - Version 11.2.0.2 and later
Information in this document applies to any platform.

Symptoms

When installing 11.2.0.2 Grid Infrastructure on a two-node RAC cluster with a VLAN configured for the underlying network, root.sh fails with:

......
Start of resource "ora.cluster_interconnect.haip" failed
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'db1'
CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
Start action for HAIP aborted
CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'db1' failed
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'db1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'db1' succeeded
CRS-4000: Command Start failed, or completed with errors.
Failed to start Oracle Clusterware stack
Failed to start High Availability IP at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1043.
/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed


This happens on both nodes.

Changes

New installation

Cause

The problem occurs because IP address 10.1.15.254/24 is configured on the Cisco switch as the default gateway for the private network VLAN. As a result, HAIP retrieves MAC address e8:b7:48:e3:10:d4, which is associated with 10.1.15.254/24, instead of the real MAC address 00:10:3e:14:8e:19 of the private network adapter (10.1.15.30). HAIP treats this as a MAC address conflict and fails to start.

orarootagent_root.log shows:

2011-09-20 01:34:29.591: [ USRTHRD][1099024704] {0:0:167} HAIP: initializing to 1 interfaces
2011-09-20 01:34:29.592: [ USRTHRD][1099024704] {0:0:167} HAIP: configured to use 1 interfaces
2011-09-20 01:34:29.595: [ USRTHRD][1099024704] {0:0:167} HAIP: Updating member info HAIP1;10.1.15.0#0
2011-09-20 01:34:29.595: [ USRTHRD][1099024704] {0:0:167} InitializeHaIps[ 0] infList 'inf eth1, ip 10.1.15.30, sub 10.1.15.0'
2011-09-20 01:34:29.596: [ USRTHRD][1099024704] {0:0:167} Error in getting Key SYSTEM.network.haip.group.cluster_interconnect.interface.valid in OCR
2011-09-20 01:34:29.598: [ CLSINET][1099024704] failed to open OLR HAIP subtype SYSTEM.network.haip.group.cluster_interconnect.interface.valid key, rc=4
2011-09-20 01:34:29.598: [ USRTHRD][1099024704] {0:0:167} HAIP reset on new modified startup, ipSize 0 != numInf 1
2011-09-20 01:34:29.598: [ USRTHRD][1099024704] {0:0:167} HAIP: starting inf 'eth1', suggestedIp '', assignedIp ''
2011-09-20 01:34:29.598: [ USRTHRD][1099024704] {0:0:167} Thread:[NetHAWork]start {
2011-09-20 01:34:29.598: [ USRTHRD][1099024704] {0:0:167} Thread:[NetHAWork]start }
2011-09-20 01:34:29.598: [ USRTHRD][1119660352] {0:0:167} [NetHAWork] thread started
2011-09-20 01:34:29.598: [ USRTHRD][1119660352] {0:0:167} Arp::sCreateSocket {
2011-09-20 01:34:29.627: [ USRTHRD][1119660352] {0:0:167} Arp::sCreateSocket }
2011-09-20 01:34:29.627: [ USRTHRD][1119660352] {0:0:167} Starting Probe for ip 169.254.12.247
2011-09-20 01:34:29.627: [ USRTHRD][1119660352] {0:0:167} Transitioning to Probe State
2011-09-20 01:34:30.115: [ USRTHRD][1119660352] {0:0:167} Arp::sProbe {
2011-09-20 01:34:30.115: [ USRTHRD][1119660352] {0:0:167} Arp::sSend: sending type 1
2011-09-20 01:34:30.115: [ USRTHRD][1119660352] {0:0:167} Arp::sProbe }
2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} PROBE: got conflicting source ip 169.254.12.247, addr e8:b7:48:e3:10:d4
2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} PROBE: conflict detected src { 169.254.12.247, e8:b7:48:e3:10:d4 }, target { 0.0.0.0, 00:10:3e:14:8e:19 }
2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} Starting Probe for ip 169.254.38.147
2011-09-20 01:34:30.116: [ USRTHRD][1119660352] {0:0:167} Transitioning to Probe State
2011-09-20 01:34:30.760: [ USRTHRD][1119660352] {0:0:167} Arp::sProbe {
2011-09-20 01:34:30.760: [ USRTHRD][1119660352] {0:0:167} Arp::sSend: sending type 1
2011-09-20 01:34:30.760: [ USRTHRD][1119660352] {0:0:167} Arp::sProbe }
2011-09-20 01:34:30.762: [ USRTHRD][1119660352] {0:0:167} PROBE: got conflicting source ip 169.254.38.147, addr e8:b7:48:e3:10:d4
2011-09-20 01:34:30.762: [ USRTHRD][1119660352] {0:0:167} PROBE: conflict detected src { 169.254.38.147, e8:b7:48:e3:10:d4 }, target { 0.0.0.0, 00:10:3e:14:8e:19 }
...
<< repeated 10 times with a different HAIP IP each time, then the start is aborted:

2011-09-20 01:34:35.459: [ USRTHRD][1119660352] {0:0:167} Rate limiting attempts, numConflict 10
2011-09-20 01:35:29.501: [ AGFW][1113356608] {0:0:167} Created alert : (:CRSAGF00113:) : Aborting the command: start for resource: ora.cluster_interconnect.haip 1 1
2011-09-20 01:35:35.708: [ora.cluster_interconnect.haip][1115457856] {0:0:167} [start] Start of HAIP aborted
2011-09-20 01:35:35.709: [ AGENT][1115457856] {0:0:167} UserErrorException: Locale is
2011-09-20 01:35:35.709: [ora.cluster_interconnect.haip][1115457856] {0:0:167} [start] clsnUtils::error Exception type=2 string=
CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
Start action for HAIP aborted
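
To check for the same pattern in another environment, the conflicting MAC addresses can be extracted from the agent log. The following is a minimal sketch, not an Oracle-supplied tool: the function name conflict_macs is hypothetical, and it assumes the log lines use the same "PROBE: got conflicting source ip ..., addr ..." format as the excerpt above.

```shell
# Hypothetical helper: reads orarootagent_root.log text on stdin and
# prints each distinct MAC address reported in a PROBE conflict line.
conflict_macs() {
    sed -n 's/.*PROBE: got conflicting source ip [0-9.]*, addr \([0-9A-Fa-f:]*\).*/\1/p' |
        sort -u
}

# Example usage (the log location depends on the installation):
#   conflict_macs < /path/to/orarootagent_root.log
```

If a single MAC address is printed even though every probe used a different 169.254.x.x address, one device is answering for all of them, which matches the behavior described in this note.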


The network configuration shows that MAC address e8:b7:48:e3:10:d4 is not defined on any of the host's physical network interfaces:

$ /sbin/ifconfig -a

eth0 Link encap:Ethernet HWaddr 00:10:3E:58:3E:E7
     inet addr:10.2.14.30 Bcast:10.2.14.255 Mask:255.255.255.0
     inet6 addr: fe80::216:3eff:fe58:3ee7/64 Scope:Link
     UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
     RX packets:4273973 errors:0 dropped:0 overruns:0 frame:0
     TX packets:3176416 errors:0 dropped:0 overruns:0 carrier:0
     collisions:0 txqueuelen:1000
     RX bytes:4309493182 (4.0 GiB) TX bytes:2326925399 (2.1 GiB)


eth1 Link encap:Ethernet HWaddr 00:10:3E:14:8E:19
     inet addr:10.1.15.30 Bcast:10.1.15.255 Mask:255.255.255.0
     inet6 addr: fe80::216:3eff:fe14:8e19/64 Scope:Link
     UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
     RX packets:1441782 errors:0 dropped:0 overruns:0 frame:0
     TX packets:1156267 errors:0 dropped:0 overruns:0 carrier:0
     collisions:0 txqueuelen:1000
     RX bytes:935044730 (891.7 MiB) TX bytes:682093588 (650.4 MiB)
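
The same check can be scripted instead of reading the output by eye. This is a sketch under stated assumptions: the function name mac_is_local is hypothetical, and it assumes the older net-tools ifconfig format shown above, where each interface line contains "HWaddr <MAC>" (newer ifconfig versions label the field differently).

```shell
# Hypothetical helper: succeeds (exit status 0) if the MAC address given
# as $1 appears as an HWaddr in the `ifconfig -a` output read from stdin.
# The comparison is case-insensitive, because the HAIP log prints MAC
# addresses in lowercase while ifconfig prints them in uppercase.
mac_is_local() {
    grep -qiF -- "HWaddr $1"
}

# Example usage:
#   /sbin/ifconfig -a | mac_is_local e8:b7:48:e3:10:d4 \
#       && echo "MAC belongs to this host" \
#       || echo "MAC is not on this host"
```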



According to the network administrator, MAC address e8:b7:48:e3:10:d4 is associated with IP 10.1.15.254/24, which was created as the gateway IP for the private network VLAN on the Cisco switch.

#show int Vlan15
Vlan15 is up, line protocol is up
Hardware is EtherSVI, address is e8b7.48e3.10d4 (bia e8b7.48e3.10d4)
Description: Cluster
Internet address is 10.1.15.254/24
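
The switch prints the MAC address in Cisco dotted notation (e8b7.48e3.10d4), while the HAIP log uses colon-separated notation (e8:b7:48:e3:10:d4). A small conversion makes the two directly comparable. This is an illustrative sketch; the function name cisco_to_colon_mac is hypothetical.

```shell
# Hypothetical helper: converts a MAC address from Cisco dotted form
# (xxxx.xxxx.xxxx) to lowercase colon-separated form (xx:xx:xx:xx:xx:xx).
cisco_to_colon_mac() {
    printf '%s\n' "$1" |
        tr -d '.' |                      # drop the dots
        tr '[:upper:]' '[:lower:]' |     # normalize case
        sed 's/\(..\)/\1:/g; s/:$//'     # add a colon after each byte, trim the last one
}

# Example usage:
#   cisco_to_colon_mac e8b7.48e3.10d4
```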

Solution

It is recommended to run the private network on dedicated switches. If a VLAN is used for the private network on a Cisco switch, the private network VLAN does not need a gateway.

After removing the gateway IP 10.1.15.254/24 from the Cisco switch, deconfigure the failed Grid Infrastructure installation.

As the root user, on every node except the last:

# $GRID_HOME/crs/install/rootcrs.pl -deconfig -force

On the last node:
# $GRID_HOME/crs/install/rootcrs.pl -deconfig -force -lastnode


Then rerun root.sh as the root user:

# $GRID_HOME/root.sh
