Oracle 11.2.0.1.0 node 1 clusterware fails to start with CRS-4535 and CRS-4530

Environment: VirtualBox VM, RHEL 5.6, Oracle Database 11.2.0.1.0

Problem: After starting node 1, the clusterware did not come up. The virtual machine configuration checked out fine, but checking the cluster status showed the following errors:

[root@node1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Event Manager is online


Troubleshooting steps:

1. Check the cluster status, as shown above.

2. Check the crsd and cssd log files.

The log directories normally live under <Grid installation home>/log/<node hostname>/, with one subdirectory per daemon; go into the corresponding subdirectory to read the logs, for example as sketched below.
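
A minimal way to follow the two logs (a sketch only; it assumes the Grid home is /u01/app/11.2.0/grid and the node hostname is node1, so adjust both to your environment):

[root@node1 ~]# GRID_HOME=/u01/app/11.2.0/grid                 # assumed Grid installation home
[root@node1 ~]# tail -100 $GRID_HOME/log/node1/crsd/crsd.log   # CRS daemon log
[root@node1 ~]# tail -100 $GRID_HOME/log/node1/cssd/ocssd.log  # CSS daemon log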

3. crsd.log under the crsd directory keeps reporting the error below, CSS is not ready, so the problem most likely lies with the cssd service:

2014-07-18 17:07:04.063: [  CRSRTI][1547711200] CSS is not ready. Received status 3 from CSS. Waiting for good status .. 
2014-07-18 17:07:05.065: [ CSSCLNT][1547711200]clssscConnect: gipc request failed with 29 (0x16)
2014-07-18 17:07:05.065: [ CSSCLNT][1547711200]clsssInitNative: connect failed, rc 29
2014-07-18 17:07:05.066: [  CRSRTI][1547711200] CSS is not ready. Received status 3 from CSS. Waiting for good status .. 

4. Next, check the cssd log, ocssd.log under the cssd directory. It contains a lot of output, but there are errors related to an IP address:

2014-07-18 17:17:58.018: [GIPCXCPT][2517008128]gipcmodGipcPassInitializeNetwork: failed to find any interfaces in clsinet, ret gipcretFail (1)
2014-07-18 17:17:58.019: [GIPCGMOD][2517008128]gipcmodGipcPassInitializeNetwork: EXCEPTION[ ret gipcretFail (1) ]  failed to determine host from clsinet, using default
2014-07-18 17:17:58.019: [GIPCXCPT][2517008128]gipcShutdownF: skipping shutdown, count 3, from [ clsinet.c : 1732], ret gipcretSuccess (0)
2014-07-18 17:17:58.021: [GIPCXCPT][2517008128]gipcShutdownF: skipping shutdown, count 2, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
2014-07-18 17:17:58.021: [GIPCGMOD][2517008128]gipcmodGipcPassInitializeNetwork: using host information node1
2014-07-18 17:17:58.022: [GIPCXCPT][2517008128]gipcmodNetworkProcessBind: failed to bind endp 0x818c170 [000000000000007d] { gipcEndpoint : localAddr 'gipc://node1:nm_scan-cluster#192.0.2.130', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x0, usrFlags 0x0 }, addr 0x818b3e0 [000000000000007f] { gipcAddress : name 'gipc://node1:nm_scan-cluster#192.0.2.130', objFlags 0x0, addrFlags 0x1 }
2014-07-18 17:17:58.022: [GIPCXCPT][2517008128]gipcmodNetworkProcessBind: slos op  :  sgipcnTcpBind
2014-07-18 17:17:58.022: [GIPCXCPT][2517008128]gipcmodNetworkProcessBind: slos dep :  Cannot assign requested address (99)
2014-07-18 17:17:58.022: [GIPCXCPT][2517008128]gipcmodNetworkProcessBind: slos loc :  bind
2014-07-18 17:17:58.022: [GIPCXCPT][2517008128]gipcmodNetworkProcessBind: slos info:  addr '192.0.2.130:0'
2014-07-18 17:17:58.022: [GIPCXCPT][2517008128]gipcBindF [gipcInternalEndpoint : gipcInternal.c : 416]: EXCEPTION[ ret gipcretAddressNotAvailable (39) ]  failed to bind endp 0x818c170 [000000000000007d] { gipcEndpoint : localAddr 'gipc://node1:nm_scan-cluster#192.0.2.130', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x0, usrFlags 0x0 }, addr 0x81906d0 [0000000000000084] { gipcAddress : name 'gipc://node1:nm_scan-cluster', objFlags 0x0, addrFlags 0x0 }, flags 0x0
2014-07-18 17:17:58.022: [GIPCXCPT][2517008128]gipcInternalEndpoint: failed to bind address to endpoint name 'gipc://node1:nm_scan-cluster', ret gipcretAddressNotAvailable (39)
2014-07-18 17:17:58.022: [GIPCXCPT][2517008128]gipcEndpointF [clsssclsnrsetup : clsssc.c : 2743]: EXCEPTION[ ret gipcretAddressNotAvailable (39) ]  failed endp create ctx 0x808a630 [0000000000000011] { gipcContext : traceLevel 2, fieldLevel 0x0, numDead 0, numPending 0, numZombie 0, numObj 4, objFlags 0x0 }, name 'gipc://node1:nm_scan-cluster', flags 0x0
2014-07-18 17:17:58.022: [    CSSD][2517008128]clsssclsnrsetup: gipcEndpoint failed, rc 39
2014-07-18 17:17:58.022: [    CSSD][2517008128]clssnmOpenGIPCEndp: failed to listen on gipc addr gipc://node1:nm_scan-cluster- ret 39
2014-07-18 17:17:58.022: [    CSSD][2517008128]clssscmain: failed to open gipc endp
2014-07-18 17:17:58.096: [    CSSD][1136195904]clssscSelect: cookie accept request 0x7ebe878
2014-07-18 17:17:58.096: [    CSSD][1136195904]clssgmAllocProc: (0x81938e0) allocated
2014-07-18 17:17:58.097: [    CSSD][1136195904]clssgmClientConnectMsg: properties of cmProc 0x81938e0 - 1,2,3,4
2014-07-18 17:17:58.097: [    CSSD][1136195904]clssgmClientConnectMsg: Connect from con(0xfd) proc(0x81938e0) pid(4051) version 11:2:1:4, properties: 1,2,3,4
2014-07-18 17:17:58.097: [    CSSD][1136195904]clssgmClientConnectMsg: msg flags 0x0000
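
The key line is "slos dep : Cannot assign requested address (99)": ocssd is trying to bind a listening endpoint to 192.0.2.130, and the OS rejects it because that address is not configured on any local interface. A quick way to confirm this from the shell (the address is taken from the log above):

[root@node1 ~]# ip addr show | grep 192.0.2.130   # no output means the address is not assigned to any interface
[root@node1 ~]# ifconfig -a                       # list all interfaces, including those that are down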

5. Check the host's IP addresses; neither network interface is up:

[root@node1 ~]# ifconfig -a
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1632 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1632 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1973544 (1.8 MiB)  TX bytes:1973544 (1.8 MiB)

6. Check the NIC configuration files. The eth0 and eth1 configuration files are wrong; after correcting them, restarting the network service still fails:

[root@node1 cssd]# service network restart
Shutting down loopback interface:                          [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface eth0:  e1000 device eth0 does not seem to be present, delaying initialization.
                                                           [FAILED]
Bringing up interface eth1:  e1000 device eth1 does not seem to be present, delaying initialization.
                                                           [FAILED]
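
For reference, a static public-interface configuration on RHEL 5 normally looks like the sketch below (the address and MAC are taken from the working ifconfig output in step 10; treat this as an illustration, not the exact file used here):

[root@node1 cssd]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
HWADDR=08:00:27:B0:C1:BB
IPADDR=192.0.2.130
NETMASK=255.255.255.0
ONBOOT=yes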

7. Check the kernel boot messages for the NICs; the output is abnormal:

[root@node1 cssd]# dmesg | grep eth
[root@node1 cssd]# dmesg | grep e1000
e1000: Unknown parameter `irq'
e1000: Unknown parameter `irq'
e1000: Unknown parameter `irq'
e1000: Unknown parameter `irq'
e1000: Unknown parameter `irq'
e1000: Unknown parameter `irq'
e1000: Unknown parameter `irq'
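
The "Unknown parameter `irq'" messages mean the e1000 module is being loaded with an option it does not accept, so the module fails to load and the eth0/eth1 devices are never created, which is why the network scripts report that the devices do not seem to be present. This can be cross-checked as follows (a sketch against the stock RHEL 5 e1000 driver):

[root@node1 cssd]# lsmod | grep e1000      # no output: the driver is not loaded
[root@node1 cssd]# modinfo -p e1000        # list the parameters the driver actually accepts; irq is not one of them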

8. Searching the web for the error above turned up a foreign site that pointed at /etc/modprobe.conf. Checking that file reveals the problem: a line reading options e1000 irq=4

[root@node1 cssd]# cat /etc/modprobe.conf

alias scsi_hostadapter ata_piix
alias scsi_hostadapter1 ahci
alias net-pf-10 off
alias ipv6 off
options ipv6 disable=1
remove snd-intel8x0 { /usr/sbin/alsactl store 0 >/dev/null 2>&1 || : ; }; /sbin/modprobe -r --ignore-remove snd-intel8x0
alias eth0 e1000
options e1000 irq=4
alias snd-card-0 snd-intel8x0
options snd-card-0 index=0
options snd-intel8x0 index=0
alias eth1 e1000

9. Not knowing what this line was for, I checked /etc/modprobe.conf on a freshly installed machine; its contents are shown below and do not include the options e1000 irq=4 line:

[root@redhat ~]# cat /etc/modprobe.conf
alias scsi_hostadapter ata_piix
alias scsi_hostadapter1 ahci
alias net-pf-10 off
alias ipv6 off
options ipv6 disable=1
options snd-intel8x0 index=0
remove snd-intel8x0 { /usr/sbin/alsactl store 0 >/dev/null 2>&1 || : ; }; /sbin/modprobe -r --ignore-remove snd-intel8x0
alias eth0 e1000
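
A minimal way to remove the offending line and reload the driver without a full reboot (a sketch; the reboot in step 10 achieves the same result and also re-runs the network init scripts):

[root@node1 ~]# sed -i.bak '/^options e1000 irq/d' /etc/modprobe.conf   # drop the bad option line, keeping a backup copy
[root@node1 ~]# modprobe -r e1000                                       # unload the driver if it is loaded (ignore an error if it is not)
[root@node1 ~]# modprobe e1000                                          # reload it without the rejected parameter
[root@node1 ~]# service network restart                                 # bring eth0/eth1 back up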

10. After deleting that line and rebooting the VM, the IP addresses come up normally and the cluster starts normally:

[root@node1 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 08:00:27:B0:C1:BB  
          inet addr:192.0.2.130  Bcast:192.0.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:617 errors:0 dropped:0 overruns:0 frame:0
          TX packets:45 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:48756 (47.6 KiB)  TX bytes:6011 (5.8 KiB)


eth1      Link encap:Ethernet  HWaddr 08:00:27:22:0E:25  
          inet addr:10.10.10.101  Bcast:10.10.10.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:699 errors:0 dropped:0 overruns:0 frame:0
          TX packets:54 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:55458 (54.1 KiB)  TX bytes:6402 (6.2 KiB)

[root@node1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[root@node1 bin]# ./crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    node1       
ora.FLASH.dg   ora....up.type ONLINE    ONLINE    node1       
ora.GRIDDG.dg  ora....up.type ONLINE    ONLINE    node1       
ora....ER.lsnr ora....er.type ONLINE    ONLINE    node1       
ora....N1.lsnr ora....er.type ONLINE    ONLINE    node1       
ora.asm        ora.asm.type   ONLINE    ONLINE    node1       
ora.devdb.db   ora....se.type ONLINE    OFFLINE               
ora.eons       ora.eons.type  ONLINE    ONLINE    node1       
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE               
ora....network ora....rk.type ONLINE    ONLINE    node1       
ora....SM1.asm application    ONLINE    ONLINE    node1       
ora....E1.lsnr application    ONLINE    ONLINE    node1       
ora.node1.gsd  application    OFFLINE   OFFLINE               
ora.node1.ons  application    ONLINE    ONLINE    node1       
ora.node1.vip  ora....t1.type ONLINE    ONLINE    node1       
ora.node2.vip  ora....t1.type ONLINE    ONLINE    node1       
ora.oc4j       ora.oc4j.type  OFFLINE   OFFLINE               
ora.ons        ora.ons.type   ONLINE    ONLINE    node1       
ora....ry.acfs ora....fs.type ONLINE    ONLINE    node1       
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    node1       
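
Note that the database resource ora.devdb.db still shows OFFLINE in the output above. If it does not come up on its own, it can be started manually (run as the database software owner; devdb is the database name taken from the resource list):

[oracle@node1 ~]$ srvctl start database -d devdb
[oracle@node1 ~]$ srvctl status database -d devdb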


Summary: the cluster could not start because the network interfaces failed to come up. It pays to understand more about how Red Hat Linux and Oracle Clusterware work.
