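During a cutover, a colleague found that database node 1 would not start, and rebooting the server and similar steps did not help. I logged in to the database server over VPN and checked the log files: ocssd.log kept repeating the entries below, indicating that communication with node 2 was failing.
...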
[ CSSD]2012-05-18 13:32:43.406 [3020401568] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115330) LATS(75442116) Disk lastSeqNo(115330)
[ CSSD]2012-05-18 13:32:43.407 [3037719456] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115330) LATS(75442126) Disk lastSeqNo(115330)
[ CSSD]2012-05-18 13:32:43.894 [3029060512] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115331) LATS(75442606) Disk lastSeqNo(115331)
[ CSSD]2012-05-18 13:32:44.410 [3020401568] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115331) LATS(75443126) Disk lastSeqNo(115331)
[ CSSD]2012-05-18 13:32:44.414 [3037719456] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115331) LATS(75443126) Disk lastSeqNo(115331)
[ CSSD]2012-05-18 13:32:44.898 [3029060512] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115332) LATS(75443616) Disk lastSeqNo(115332)
[ CSSD]2012-05-18 13:32:45.414 [3020401568] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115332) LATS(75444126) Disk lastSeqNo(115332)
[ CSSD]2012-05-18 13:32:45.418 [3037719456] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115332) LATS(75444136) Disk lastSeqNo(115332)
[ CSSD]2012-05-18 13:32:45.902 [3029060512] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115333) LATS(75444616) Disk lastSeqNo(115333)
[ CSSD]2012-05-18 13:32:46.418 [3020401568] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115333) LATS(75445136) Disk lastSeqNo(115333)
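To see at a glance what the repeated entries say, a small script can tally them. This is only a minimal sketch in Python, not an Oracle tool: the function names `unwrap` and `summarize` are made up for illustration, and the regular expression assumes nothing beyond the entry layout visible in the excerpt above.

```python
import re
from collections import defaultdict

# One disk-heartbeat trace entry: timestamp, node number, write counter.
ENTRY = re.compile(
    r"\[\s*CSSD\](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
    r".*?clssnmReadDskHeartbeat: node\((?P<node>\d+)\) is down\."
    r".*?wrtcnt\((?P<wrtcnt>\d+)\)"
)


def unwrap(text):
    """Rejoin entries that were wrapped mid-line (continuations do not start with '[')."""
    return re.sub(r"\n(?!\[)", "", text)


def summarize(text):
    """Return {node: (entry_count, first_wrtcnt, last_wrtcnt)} for 'is down' entries."""
    seen = defaultdict(list)
    for match in ENTRY.finditer(unwrap(text)):
        seen[int(match.group("node"))].append(int(match.group("wrtcnt")))
    return {node: (len(vals), vals[0], vals[-1]) for node, vals in seen.items()}


# First and last entries of the excerpt; the first one is wrapped on purpose.
SAMPLE = """\
[ CSSD]2012-05-18 13:32:43.406 [3020401568] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(11
5330) LATS(75442116) Disk lastSeqNo(115330)
[ CSSD]2012-05-18 13:32:46.418 [3020401568] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(2) wrtcnt(115333) LATS(75445136) Disk lastSeqNo(115333)
"""

print(summarize(SAMPLE))  # {2: (2, 115330, 115333)}
```

In the excerpt, wrtcnt climbs from 115330 to 115333 within a few seconds. Assuming wrtcnt is the write counter of node 2's disk heartbeat, that would suggest node 2 was still writing to the voting disk even though node 1 kept reporting it as down.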
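My first guess was that the gateway or the private IP was unreachable, or that /etc/hosts was wrong, but ping worked everywhere and /etc/hosts was identical on both nodes.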
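The /etc/hosts comparison can be done mechanically rather than by eye. Below is a minimal sketch in Python, assuming the contents of both files have already been copied to one place. The hostnames rac1-priv and rac2-priv and the helper names are hypothetical; this post only shows rac1, rac2 and the IP addresses.

```python
def parse_hosts(text):
    """Map each hostname or alias to its IP, skipping blank lines and comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name] = ip
    return mapping


def hosts_differences(text_a, text_b):
    """Return {name: (ip_in_a, ip_in_b)} for names that differ or exist on one side only."""
    a, b = parse_hosts(text_a), parse_hosts(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


NODE1 = """\
10.210.88.20   rac1
10.210.88.30   rac2
172.168.88.20  rac1-priv   # hypothetical private hostname
172.168.88.30  rac2-priv   # hypothetical private hostname
"""
# Simulated typo on the second node, for illustration only.
NODE2_TYPO = NODE1.replace("172.168.88.30", "172.168.88.31")

print(hosts_differences(NODE1, NODE1))       # {}
print(hosts_differences(NODE1, NODE2_TYPO))  # {'rac2-priv': ('172.168.88.30', '172.168.88.31')}
```

An empty result corresponds to what was observed here: the two files were identical, so the cause had to be elsewhere.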
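Solution: shut down the servers of both nodes, then restart the services. Done.
Reference: CRS can not Start After Node Reboot [ID 733260.1]
During the restart, the listener and VIP on node 1 failed to start because the network interface numbering had been changed.
The network interface information on node 1 is as follows: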
rac1:/opt/oracle/product/10gR2/crs/log/rac1 # ifconfig
eth5 Link encap:Ethernet HWaddr 78:1D:BA:32:27:E0
inet addr:10.210.88.20 Bcast:10.210.88.63 Mask:255.255.255.192
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2943 errors:0 dropped:0 overruns:0 frame:0
TX packets:70559 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:281598 (274.9 Kb) TX bytes:62667960 (59.7 Mb)
Interrupt:58
eth8 Link encap:Ethernet HWaddr 78:1D:BA:32:27:E1
inet addr:172.168.88.20 Bcast:172.168.88.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:44045 errors:0 dropped:0 overruns:0 frame:0
TX packets:9178 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:22212445 (21.1 Mb) TX bytes:8441029 (8.0 Mb)
Interrupt:66
Node 2 is as follows:
eth0 Link encap:Ethernet HWaddr 78:1D:BA:32:27:FB
inet addr:10.210.88.30 Bcast:10.210.88.63 Mask:255.255.255.192
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3970437 errors:0 dropped:0 overruns:0 frame:0
TX packets:3158804 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:745352710 (710.8 Mb) TX bytes:1140165970 (1087.3 Mb)
Interrupt:58
eth0:1 Link encap:Ethernet HWaddr 78:1D:BA:32:27:FB
inet addr:10.210.88.31 Bcast:10.210.88.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:58
eth1 Link encap:Ethernet HWaddr 78:1D:BA:32:27:FC
inet addr:172.168.88.30 Bcast:172.168.88.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:81384 errors:0 dropped:0 overruns:0 frame:0
TX packets:54268 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:16139161 (15.3 Mb) TX bytes:29211877 (27.8 Mb)
Interrupt:74
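Because the interface names did not match those on node 2, I edited /etc/udev/rules.d/30-net_persistent_names.rules to make the two nodes the same, and that is what kept the VIP and listener from starting. The change was unnecessary: I did not know the details of this environment, and the local operations team had no implementation documentation, which led to the needless step. I restored /etc/udev/rules.d/30-net_persistent_names.rules, restarted the network interfaces, and started the node applications again, after which everything came up. The output below shows the failed start, followed by the final resource status.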
rac1:/opt/oracle/product/10gR2/crs/bin # ./srvctl start nodeapps -n rac1
rac1:ora.rac1.vip:eth5:eth8: error fetching interface information: Device not found
rac1:ora.rac1.vip:checkIf: interface eth5:eth8 is down
rac1:ora.rac1.vip:Invalid parameters, or failed to bring up VIP (host=rac1)
CRS-1006: No more members to consider
CRS-0215: Could not start resource 'ora.rac1.vip'.
rac1:ora.rac1.vip:eth5:eth8: error fetching interface information: Device not found
rac1:ora.rac1.vip:checkIf: interface eth5:eth8 is down
rac1:ora.rac1.vip:Invalid parameters, or failed to bring up VIP (host=rac1)
CRS-0215: Could not start resource 'ora.rac1.LISTENER_RAC1.lsnr'.
rac1:/opt/oracle/product/10gR2/crs/bin # ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.orcl.db application ONLINE ONLINE rac1
ora....l1.inst application ONLINE ONLINE rac1
ora....l2.inst application ONLINE ONLINE rac2
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
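The "Device not found" errors above name eth5 and eth8, which are the interface names shown in node 1's ifconfig output. A quick pre-check before starting nodeapps could compare the names the VIP expects with what ifconfig reports. The following is a minimal sketch in Python. It assumes the colon-separated string in the error (eth5:eth8) is the interface list registered for the VIP, the helper names are hypothetical, and the AFTER_RENAME text is an illustrative reconstruction, not output captured in this post.

```python
import re


def interface_names(ifconfig_text):
    """Return the set of interface names that open a stanza in `ifconfig` output."""
    return set(re.findall(r"^(\S+)\s+Link encap:", ifconfig_text, flags=re.MULTILINE))


def missing_interfaces(expected, ifconfig_text):
    """Return names from the colon-separated `expected` string that ifconfig does not list."""
    present = interface_names(ifconfig_text)
    return [name for name in expected.split(":") if name not in present]


# Abbreviated from node 1's ifconfig output in this post.
ORIGINAL = """\
eth5 Link encap:Ethernet HWaddr 78:1D:BA:32:27:E0
inet addr:10.210.88.20 Bcast:10.210.88.63 Mask:255.255.255.192
eth8 Link encap:Ethernet HWaddr 78:1D:BA:32:27:E1
inet addr:172.168.88.20 Bcast:172.168.88.255 Mask:255.255.255.0
"""
# Hypothetical state after renaming the NICs to match node 2 (eth0/eth1).
AFTER_RENAME = ORIGINAL.replace("eth5", "eth0").replace("eth8", "eth1")

print(missing_interfaces("eth5:eth8", ORIGINAL))      # []
print(missing_interfaces("eth5:eth8", AFTER_RENAME))  # ['eth5', 'eth8']
```

A non-empty list would have flagged the renamed interfaces before `srvctl start nodeapps` was attempted.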
Source: ITPUB博客 (ITPUB Blog), http://blog.itpub.net/22779291/viewspace-730190/. If you repost this, please credit the source; otherwise legal liability will be pursued.