11gR2: fixing a CRS startup failure caused by an incorrect private IP (subnet) change

--oifcfg usage help
[root@rac1 grid]# oifcfg -help

Name:

    oifcfg - Oracle Interface Configuration Tool.

Usage:  oifcfg iflist [-p [-n]]

    oifcfg setif {-node <nodename> | -global} {<if_name>/<subnet>:<if_type>}...

    oifcfg getif [-node <nodename> | -global] [ -if <if_name>[/<subnet>] [-type <if_type>] ]

    oifcfg delif {{-node <nodename> | -global} [<if_name>[/<subnet>]] [-force] | -force}

    oifcfg [-help]

    <nodename> - name of the host, as known to a communications network

    <if_name>  - name by which the interface is configured in the system

    <subnet>   - subnet address of the interface

    <if_type>  - type of the interface { cluster_interconnect | public }

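Pay particular attention to one thing here: setif takes a subnet address, not a host IP. If the subnet is entered incorrectly, CRS will fail to start.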

--/etc/hosts configuration on my machine

[root@rac1 grid]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

#node1

192.168.8.221   rac1 rac1.oracle.com

192.168.8.242   rac1-vip 

172.168.0.18    rac1-priv

#node2

192.168.8.223   rac2 rac2.oracle.com

192.168.8.244   rac2-vip

172.168.0.19    rac2-priv

#scan-ip

192.168.8.245   rac-cluster rac-cluster-scan


--Check the IP configuration

[root@rac1 grid]# oifcfg getif

eth0  192.168.8.0  global  public

eth1  172.168.0.0  global  cluster_interconnect

[root@rac1 grid]# ifconfig eth1

eth1      Link encap:Ethernet  HWaddr 08:00:27:72:5A:8F 

          inet addr:172.168.0.18  Bcast:172.168.255.255  Mask:255.255.0.0

          

Note that the mask here is 255.255.0.0, not 255.255.255.0. So when the private IP in the hosts file is changed from 172.168.0.18 to 172.168.8.18, the subnet passed to setif needs care, as the following shows:

Since ifconfig eth1 reports a mask of 255.255.0.0:

[root@rac1 grid]# ipcalc -bnm 172.168.0.18 255.255.0.0

NETMASK=255.255.0.0

BROADCAST=172.168.255.255

NETWORK=172.168.0.0

[root@rac1 grid]# ipcalc -bnm 172.168.8.18 255.255.0.0

NETMASK=255.255.0.0

BROADCAST=172.168.255.255

NETWORK=172.168.0.0

As shown above, with a mask of 255.255.0.0 the subnet (NETWORK=172.168.0.0) stays the same even though the IP has changed.

If the mask had been 255.255.255.0 instead:

[root@rac1 grid]# ipcalc -bnm 172.168.0.18 255.255.255.0

NETMASK=255.255.255.0

BROADCAST=172.168.0.255

NETWORK=172.168.0.0

[root@rac1 grid]# ipcalc -bnm 172.168.8.18 255.255.255.0

NETMASK=255.255.255.0

BROADCAST=172.168.8.255

NETWORK=172.168.8.0

In this case the subnet does change, so this point deserves extra care.
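
The subnet that setif expects is simply the IP ANDed with the netmask, octet by octet. The sketch below reproduces the NETWORK values from the ipcalc runs above in plain POSIX shell, for hosts where ipcalc is not installed. The function name network_of is made up for this illustration and is not part of any Oracle tool.

```shell
#!/bin/sh
# network_of IP MASK
# Print the network (subnet) address: each octet of IP ANDed with the
# matching octet of MASK. Runs in a subshell so IFS changes stay local.
network_of() (
    IFS=.
    # Split both dotted quads into eight positional parameters.
    set -- $1 $2
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# Same IP, two different masks (values taken from the ipcalc runs above).
network_of 172.168.8.18 255.255.0.0      # prints 172.168.0.0
network_of 172.168.8.18 255.255.255.0    # prints 172.168.8.0
```

Running this check with the mask that ifconfig actually reports, before calling oifcfg setif, would have shown that the subnet was still 172.168.0.0.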

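Unfortunately, this is exactly where I slipped up: when changing the private IP, I mistakenly entered the subnet as 172.168.8.0. The failure is reproduced below.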

--Delete the private interface configuration

[grid@rac1 ~]$ oifcfg delif -global eth1

PRIF-31: Failed to delete the specified network interface because it is the last private interface

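In 11.2.0.2 and later, the last private interface cannot be deleted directly. To remove it, first add a new one, then restart CRS, and only then delete the old private interface entry.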
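
The ordering described above (add first, restart CRS, then delete) can be written down as a small dry-run helper. This is only a sketch: plan_private_subnet_change is a hypothetical name, it prints commands rather than running them, and it leaves out the OS-level steps (NIC address, /etc/hosts). It also assumes that when the old and new subnets are identical, the entry stored by oifcfg does not need to be touched, which is the situation in this article with mask 255.255.0.0.

```shell
#!/bin/sh
# plan_private_subnet_change IF_NAME OLD_SUBNET NEW_SUBNET
# Print (do not execute) the command order for moving the cluster
# interconnect of IF_NAME from OLD_SUBNET to NEW_SUBNET.
plan_private_subnet_change() {
    if_name=$1
    old=$2
    new=$3
    if [ "$old" = "$new" ]; then
        # Assumption: same subnet means the stored entry is already right.
        echo "# subnet unchanged ($old): leave the stored entry as it is"
        return 0
    fi
    echo "oifcfg setif -global $if_name/$new:cluster_interconnect"
    echo "crsctl stop crs     # as root, on every node"
    echo "crsctl start crs    # as root, on every node"
    echo "oifcfg delif -global $if_name/$old"
}

# The case from this article: the IP changes but the subnet does not.
plan_private_subnet_change eth1 172.168.0.0 172.168.0.0
```

Feeding the helper with subnets computed from the real netmask, rather than typed by hand, is the point of the exercise.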

--Check the interface configuration

[grid@rac1 ~]$ oifcfg getif -global

eth0  192.168.8.0  global  public

eth1  172.168.0.0  global  cluster_interconnect

--Add the new private configuration (note: this is the mistake; the correct subnet is 172.168.0.0)

[grid@rac1 ~]$ oifcfg setif -global eth1/172.168.8.0:cluster_interconnect

--Check the configuration after the change

[grid@rac1 ~]$ oifcfg getif -global

eth0  192.168.8.0  global  public

eth1  172.168.0.0  global  cluster_interconnect

eth1  172.168.8.0  global  cluster_interconnect

--Delete the old configuration

[grid@rac1 ~]$ oifcfg delif -global eth1/172.168.0.0

--Verify again:

[grid@rac1 ~]$ oifcfg getif -global

eth0  192.168.8.0  global  public

eth1  172.168.8.0  global  cluster_interconnect

--As root, stop clusterware on all nodes

[root@rac1 ~]# crsctl stop crs -f

[root@rac2 ~]# crsctl stop crs -f

--When CRS is started again, the log shows the following errors:

[/u01/app/11.2.0/grid/bin/orarootagent.bin(3196)]CRS-5818:Aborted command 'start' for resource 'ora.cluster_interconnect.haip'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0/grid/log/rac1/agent/ohasd/orarootagent_root/orarootagent_root.log.

2016-06-01 06:47:30.200:

[ohasd(3017)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.cluster_interconnect.haip'. Details at (:CRSPE00111:) {0:0:2} in /u01/app/11.2.0/grid/log/rac1/ohasd/ohasd.log.

2016-06-01 06:48:30.227:

....

[/u01/app/11.2.0/grid/bin/orarootagent.bin(17207)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:5:44} in /u01/app/11.2.0/grid/log/rac1/agent/crsd/orarootagent_root/orarootagent_root.log.

--crsctl check crs also reports errors

[root@rac1 rac1]# crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4535: Cannot communicate with Cluster Ready Services

CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

CRS-4534: Cannot communicate with Event Manager
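
When this check is repeated many times during recovery, it can help to reduce the output to just the components that are not online. The filter below is a minimal sketch that assumes the message format shown above, where healthy lines end in "is online". crs_not_online is a hypothetical helper name, and the sample input is copied from the output above.

```shell
#!/bin/sh
# crs_not_online
# Read `crsctl check crs` output on stdin and print the message code of
# every CRS-nnnn line that does not end in "is online".
crs_not_online() {
    awk -F: '/^CRS-[0-9]+:/ && !/is online *$/ { print $1 }'
}

# Sample input copied from the failing check above.
crs_not_online <<'EOF'
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
EOF
```

On a live node the same filter could be fed with `crsctl check crs | crs_not_online`; empty output would mean every reported component is online.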

Recovery steps:

--Stop CRS on both nodes

[root@rac1 rac1]# crsctl stop crs -f

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'

CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'

CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.crf' on 'rac1'

CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'

CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'

CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed

CRS-4133: Oracle High Availability Services has been stopped.

--Start CRS in exclusive mode (HAIP will report a startup error at this point; it can be ignored)

[root@rac1 rac1]# crsctl start crs -excl -nocrs

CRS-4123: Oracle High Availability Services has been started.

CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'

CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'

CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'

CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'

CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded

CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.cssd' on 'rac1'

CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'

CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded

CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded

CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'

CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded

CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:

Start action for HAIP aborted. For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0/grid/log/rac1/agent/ohasd/orarootagent_root/orarootagent_root.

CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'rac1' failed

CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'

CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded

CRS-4000: Command Start failed, or completed with errors.

--Back up the CRS configuration (the GPnP profile)

[root@rac1 ~]# mkdir /home/oracle/gpnp

[root@rac1 ~]# export GPNPDIR=/home/oracle/gpnp

[root@rac1 ~]# gpnptool get -o=$GPNPDIR/profile.xml

Resulting profile written to "/home/oracle/gpnp/profile.xml".

Success.

[root@rac1 ~]# cat /home/oracle/gpnp/profile.xml

+DlT/meivHNPx1yzXh/Lh5gpB6w=jUZ0IQfYt5dvupziJf8nPo/KtWu2aPl3nl0ute/RAPYYkIOw3ZvDqHuREggvNsgDKGv28mLeDVzmt0N1aU0QprVrg3Rxlt1R3AFxREukvqawQ4BLwbiEo2yoBcBNhP1AQV7ZVdgQqX9FYntVcKNZeP7pMnMpJcmG2Cp87iop05U=

--View the CRS configuration

[root@rac1 ~]# gpnptool get

Warning: some command line parameters were defaulted. Resulting command line:

         /u01/app/11.2.0/grid/bin/gpnptool.bin get -o-

+DlT/meivHNPx1yzXh/Lh5gpB6w=jUZ0IQfYt5dvupziJf8nPo/KtWu2aPl3nl0ute/RAPYYkIOw3ZvDqHuREggvNsgDKGv28mLeDVzmt0N1aU0QprVrg3Rxlt1R3AFxREukvqawQ4BLwbiEo2yoBcBNhP1AQV7ZVdgQqX9FYntVcKNZeP7pMnMpJcmG2Cp87iop05U=

Success.

--Edit the backed-up CRS configuration

--Make a working copy of the profile

[root@rac1 ~]# cp $GPNPDIR/profile.xml $GPNPDIR/p.xml

--Get the current profile sequence number

[root@rac1 ~]# gpnptool getpval -p=$GPNPDIR/p.xml -prf_sq -o-

29

--Get the identifiers of the public and private networks (these do not match the actual NIC names; they can be looked up in the profile)

[root@rac1 ~]# gpnptool getpval -p=$GPNPDIR/p.xml -net -o-

net1 net2

--Update the sequence number in the profile (old value plus 1, i.e. 29+1=30) and set the private network to its correct subnet (172.168.0.0):

[root@rac1 ~]# gpnptool edit -p=$GPNPDIR/p.xml -o=$GPNPDIR/p.xml -ovr -prf_sq=30 -net2:net_ip=172.168.0.0

Resulting profile written to "/home/oracle/gpnp/p.xml".

Success.
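
The arithmetic in this step is easy to get wrong by hand, so the command line can be assembled from the values read back with getpval. The sketch below only prints the gpnptool edit command used above and does not run it. build_gpnp_edit_cmd is a hypothetical helper, and the network id (net2 here) still has to be identified from the profile as described earlier.

```shell
#!/bin/sh
# build_gpnp_edit_cmd PROFILE CURRENT_SEQ NET_ID SUBNET
# Print the gpnptool edit command that bumps the profile sequence number
# by one and sets the subnet of the given network id.
build_gpnp_edit_cmd() {
    profile=$1
    cur_seq=$2
    net_id=$3
    subnet=$4
    next_seq=$(( $cur_seq + 1 ))    # old value plus 1, as in the step above
    echo "gpnptool edit -p=$profile -o=$profile -ovr -prf_sq=$next_seq -$net_id:net_ip=$subnet"
}

# Values from this article: sequence 29, private network net2.
build_gpnp_edit_cmd /home/oracle/gpnp/p.xml 29 net2 172.168.0.0
```

With the values shown, the printed line matches the edit command above, including -prf_sq=30.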

--Re-sign the profile with the private key

[root@rac1 ~]# gpnptool sign -p=$GPNPDIR/p.xml -o=$GPNPDIR/p.xml -ovr -w=cw-fs:peer

Resulting profile written to "/home/oracle/gpnp/p.xml".

Success.

--Write the profile back to CRS

[root@rac1 ~]# gpnptool put -p=$GPNPDIR/p.xml

Success.

--Verify the configuration now held by CRS

[root@rac1 ~]# gpnptool find -c=rac-cluster    # rac-cluster: the cluster name, here the same as the SCAN name in the hosts file

Found 1 instances of service 'gpnp'.

    mdns:service:gpnp._tcp.local.://rac1:44022/agent=gpnpd,cname=rac-cluster,host=rac1,pid=23802/gpnpd h:rac1 c:rac-cluster

[root@rac1 ~]# gpnptool rget -h=rac1    # rac1: the hostname of node 1

Warning: some command line parameters were defaulted. Resulting command line:

         /u01/app/11.2.0/grid/bin/gpnptool.bin rget -h=rac1 -o-

Found 1 gpnp service instance(s) to rget profile from.

RGET from tcp://rac1:44022 (mdns:service:gpnp._tcp.local.://rac1:44022/agent=gpnpd,cname=rac-cluster,host=rac1,pid=23802/gpnpd h:rac1 c:rac-cluster):

0dwyjB220ul3DWEmv5pAz1GzH4w=fuboD8S5uj1LH7A/Wdg321x6QGfQ4wkzSj/yXk9SnTVYuGwi2E9+XXaVk/pos8pVHqiChsuiWwGhjXZxnIuJrMrRF+t06PGqGlBxf0JQ557OmT1WZOvgsb1QPbRjb2tSqaazDIfG+y0ps0nNZMO5E4d2zITqmcBRUkV5UBnrvj8=

Success.

--Start the crsd process

[root@rac1 ~]# crsctl start res ora.crsd -init

CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded

CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.asm' on 'rac1'

CRS-2676: Start of 'ora.asm' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.crsd' on 'rac1'

CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded

--Check the private network configuration (a warning is shown)

[root@rac1 ~]# oifcfg getif

eth0  192.168.8.0  global  public

eth1  172.168.0.0  global  cluster_interconnect

Only in OCR: eth1  172.168.8.0  global  cluster_interconnect

PRIF-30: Network information in OCR and GPnP profile differs
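
The "Only in OCR" line is what identifies the stale entry that still has to be corrected. The filter below is a small sketch for picking such entries out of the output, assuming the line format shown above. ocr_only_entries is a hypothetical helper name, and the sample input is copied from the output above.

```shell
#!/bin/sh
# ocr_only_entries
# Read `oifcfg getif` output on stdin and print the entries that are
# flagged as present only in the OCR (not in the GPnP profile).
ocr_only_entries() {
    sed -n 's/^Only in OCR: *//p'
}

# Sample input copied from the output above.
ocr_only_entries <<'EOF'
eth0  192.168.8.0  global  public
eth1  172.168.0.0  global  cluster_interconnect
Only in OCR: eth1  172.168.8.0  global  cluster_interconnect
PRIF-30: Network information in OCR and GPnP profile differs
EOF
```

Empty output from `oifcfg getif | ocr_only_entries` would indicate that no such mismatch is being reported.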

--Correct the private network configuration

[root@rac1 ~]# oifcfg setif -global eth1/172.168.0.0:cluster_interconnect

--Check again; the warning is gone

[root@rac1 ~]# oifcfg getif

eth0  192.168.8.0  global  public

eth1  172.168.0.0  global  cluster_interconnect

--Restart CRS

[root@rac1 ~]# crsctl stop crs -f

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'

CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'

CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'

CRS-2673: Attempting to stop 'ora.asm' on 'rac1'

CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'

CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'

CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'

CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'

CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed

CRS-4133: Oracle High Availability Services has been stopped.

--Start CRS on both nodes

[root@rac1 ~]# crsctl start crs

CRS-4123: Oracle High Availability Services has been started.

[root@rac2 ~]# crsctl start crs

CRS-4123: Oracle High Availability Services has been started.

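--Verify CRS (this check was apparently run right after startup, while some daemons were still coming up; the crs_stat output that follows shows the resources online)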

[root@rac1 ~]# crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4535: Cannot communicate with Cluster Ready Services

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online

[root@rac2 ~]# crsctl check crs

CRS-4638: Oracle High Availability Services is online

CRS-4535: Cannot communicate with Cluster Ready Services

CRS-4529: Cluster Synchronization Services is online

CRS-4534: Cannot communicate with Event Manager

[root@rac1 ~]# crs_stat -t

Name           Type           Target    State     Host       

------------------------------------------------------------

ora.DATADG.dg  ora....up.type ONLINE    ONLINE    rac1       

ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac1       

ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac1       

ora....EMDG.dg ora....up.type ONLINE    ONLINE    rac1       

ora.asm        ora.asm.type   ONLINE    ONLINE    rac1       

ora.cvu        ora.cvu.type   ONLINE    ONLINE    rac1       

ora.gsd        ora.gsd.type   OFFLINE   OFFLINE              

ora....network ora....rk.type ONLINE    ONLINE    rac1       

ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    rac1       

ora.ons        ora.ons.type   ONLINE    ONLINE    rac1       

ora.orcl.db    ora....se.type ONLINE    ONLINE    rac1       

ora....taf.svc ora....ce.type ONLINE    ONLINE    rac1       

ora....SM1.asm application    ONLINE    ONLINE    rac1       

ora....C1.lsnr application    ONLINE    ONLINE    rac1       

ora.rac1.gsd   application    OFFLINE   OFFLINE              

ora.rac1.ons   application    ONLINE    ONLINE    rac1       

ora.rac1.vip   ora....t1.type ONLINE    ONLINE    rac1       

ora....SM2.asm application    ONLINE    ONLINE    rac2       

ora....C2.lsnr application    ONLINE    ONLINE    rac2       

ora.rac2.gsd   application    OFFLINE   OFFLINE              

ora.rac2.ons   application    ONLINE    ONLINE    rac2       

ora.rac2.vip   ora....t1.type ONLINE    ONLINE    rac2       

ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac1

Source: "ITPUB blog", link: http://blog.itpub.net/29812844/viewspace-2112502/. If you repost this article, please credit the source; otherwise legal action may be taken.

Reposted from: http://blog.itpub.net/29812844/viewspace-2112502/
