Yesterday, while changing the VIPs of a 10g RAC database, I ran into an odd problem: on node 1, the pre-modification VIP and the post-modification VIP ended up coexisting on the host's NIC, while node 2 was fine. Reviewing the whole procedure, the only difference between the two nodes was that node 1's VIP could never be stopped, so the VIP modification on node 1 was performed while its VIP was still running. To verify whether this was the cause, I ran the following experiment.
Approach:
1. Stop the DB, ASM, and nodeapps on both nodes
2. Start only node 1's VIP
3. With node 1's VIP running, modify the VIPs of both nodes
4. Start nodeapps on both nodes and check whether node 1 carries both the old and the new VIP
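Before walking through the steps, here is a quick way to confirm whether a VIP has actually been released (a minimal sketch; the resource and interface names follow this cluster, so adjust them to your own environment):

# CRS view of the VIP resource -- should show OFFLINE after a stop
crs_stat ora.hcn1.vip
# OS view -- no output means the address is no longer plumbed on the NIC
/sbin/ifconfig | grep 192.168.100.102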
Experiment:
Step 1: Node information before the modification (node 1)
Last login: Mon May 20 22:32:08 2013 from 192.168.100.1
[root@hcn1 ~]# ifconfig -a
eth0 Link encap:Ethernet HWaddr 08:00:27:82:1E:93
inet addr:192.168.100.101 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:262 errors:0 dropped:0 overruns:0 frame:0
TX packets:238 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:31738 (30.9 KiB) TX bytes:33024 (32.2 KiB)
eth0:1 Link encap:Ethernet HWaddr 08:00:27:82:1E:93
inet addr:192.168.100.102 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth1 Link encap:Ethernet HWaddr 08:00:27:DF:DD:4A
inet addr:10.105.1.105 Bcast:10.105.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:32736 errors:0 dropped:0 overruns:0 frame:0
TX packets:38979 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14878394 (14.1 MiB) TX bytes:23924051 (22.8 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:4872 errors:0 dropped:0 overruns:0 frame:0
TX packets:4872 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6052152 (5.7 MiB) TX bytes:6052152 (5.7 MiB)
---- At this point eth0 carries only the pre-modification VIP (eth0:1)
[oracle@hcn1 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE hcn1
ora....N1.lsnr application ONLINE ONLINE hcn1
ora.hcn1.gsd application ONLINE ONLINE hcn1
ora.hcn1.ons application ONLINE ONLINE hcn1
ora.hcn1.vip application ONLINE ONLINE hcn1
ora....SM2.asm application ONLINE ONLINE hcn2
ora....N2.lsnr application ONLINE ONLINE hcn2
ora.hcn2.gsd application ONLINE ONLINE hcn2
ora.hcn2.ons application ONLINE ONLINE hcn2
ora.hcn2.vip application ONLINE ONLINE hcn2
ora.hcndb.db application ONLINE ONLINE hcn2
ora....b1.inst application ONLINE ONLINE hcn1
ora....b2.inst application ONLINE ONLINE hcn2
[oracle@hcn1 ~]$ srvctl config nodeapps -n hcn1 -a
VIP exists.: /hcn1-vip/192.168.100.102/255.255.255.0/eth0
[oracle@hcn2 ~]$ srvctl config nodeapps -n hcn2 -a
VIP exists.: /hcn2-vip/192.168.100.104/255.255.255.0/eth0
Step 2: Stop the DB, ASM, and nodeapps on both nodes
[oracle@hcn1 ~]$ srvctl stop database -d hcndb
[oracle@hcn1 ~]$ srvctl stop asm -n hcn1
[oracle@hcn1 ~]$ srvctl stop asm -n hcn2
[oracle@hcn1 ~]$ srvctl stop nodeapps -n hcn1
[oracle@hcn1 ~]$ srvctl stop nodeapps -n hcn2
[oracle@hcn1 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application OFFLINE OFFLINE
ora....N1.lsnr application OFFLINE OFFLINE
ora.hcn1.gsd application OFFLINE OFFLINE
ora.hcn1.ons application OFFLINE OFFLINE
ora.hcn1.vip application OFFLINE OFFLINE
ora....SM2.asm application OFFLINE OFFLINE
ora....N2.lsnr application OFFLINE OFFLINE
ora.hcn2.gsd application OFFLINE OFFLINE
ora.hcn2.ons application OFFLINE OFFLINE
ora.hcn2.vip application OFFLINE OFFLINE
ora.hcndb.db application OFFLINE OFFLINE
ora....b1.inst application OFFLINE OFFLINE
ora....b2.inst application OFFLINE OFFLINE
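With everything OFFLINE, it is worth double-checking at the OS level that neither VIP is still plumbed before going any further (a hedged sketch using the two VIP addresses of this cluster):

/sbin/ifconfig | grep -E '192.168.100.102|192.168.100.104'
# no output here means both VIPs have really been released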
Step 3: Start node 1's VIP resource
[oracle@hcn1 ~]$ crs_start ora.hcn1.vip
Attempting to start `ora.hcn1.vip` on member `hcn1`
Start of `ora.hcn1.vip` on member `hcn1` succeeded.
[oracle@hcn1 ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application OFFLINE OFFLINE
ora....N1.lsnr application OFFLINE OFFLINE
ora.hcn1.gsd application OFFLINE OFFLINE
ora.hcn1.ons application OFFLINE OFFLINE
ora.hcn1.vip application ONLINE ONLINE hcn1
ora....SM2.asm application OFFLINE OFFLINE
ora....N2.lsnr application OFFLINE OFFLINE
ora.hcn2.gsd application OFFLINE OFFLINE
ora.hcn2.ons application OFFLINE OFFLINE
ora.hcn2.vip application OFFLINE OFFLINE
ora.hcndb.db application OFFLINE OFFLINE
ora....b1.inst application OFFLINE OFFLINE
ora....b2.inst application OFFLINE OFFLINE
Step 4: Modify the VIPs
In 10g, srvctl modify nodeapps -A must be run as root from the CRS home's bin directory, hence the ./srvctl below:
./srvctl modify nodeapps -n hcn1 -A 192.168.100.122/255.255.255.0/eth0
./srvctl modify nodeapps -n hcn2 -A 192.168.100.124/255.255.255.0/eth0
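The new settings can be verified right away with the same config command used in step 1 (commands only, run from the same directory):

./srvctl config nodeapps -n hcn1 -a
./srvctl config nodeapps -n hcn2 -a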
Step 5: Start the VIP
[root@hcn1 ~]# ifconfig -a
eth0 Link encap:Ethernet HWaddr 08:00:27:82:1E:93
inet addr:192.168.100.101 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1335 errors:0 dropped:0 overruns:0 frame:0
TX packets:1072 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:135015 (131.8 KiB) TX bytes:134194 (131.0 KiB)
eth0:1 Link encap:Ethernet HWaddr 08:00:27:82:1E:93
inet addr:192.168.100.102 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth1 Link encap:Ethernet HWaddr 08:00:27:DF:DD:4A
inet addr:10.105.1.105 Bcast:10.105.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:43775 errors:0 dropped:0 overruns:0 frame:0
TX packets:61145 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:19368474 (18.4 MiB) TX bytes:46122338 (43.9 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:10771 errors:0 dropped:0 overruns:0 frame:0
TX packets:10771 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6463861 (6.1 MiB) TX bytes:6463861 (6.1 MiB)
---- After the modification the old VIP (eth0:1) is still present; note that node 1's nodeapps have not yet been started
Next, update /etc/hosts with the new VIP addresses and then start the VIP.
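The /etc/hosts entries on both nodes would change along these lines (hypothetical lines built from the addresses used above; the old VIP entries are replaced, not kept alongside):

192.168.100.122   hcn1-vip
192.168.100.124   hcn2-vip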
[oracle@hcn1 bin]$ srvctl start nodeapps -n hcn1
[oracle@hcn1 bin]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application OFFLINE OFFLINE
ora....N1.lsnr application ONLINE ONLINE hcn1
ora.hcn1.gsd application ONLINE ONLINE hcn1
ora.hcn1.ons application ONLINE ONLINE hcn1
ora.hcn1.vip application ONLINE ONLINE hcn1
ora....SM2.asm application OFFLINE OFFLINE
ora....N2.lsnr application OFFLINE OFFLINE
ora.hcn2.gsd application OFFLINE OFFLINE
ora.hcn2.ons application OFFLINE OFFLINE
ora.hcn2.vip application OFFLINE OFFLINE
ora.hcndb.db application OFFLINE OFFLINE
ora....b1.inst application OFFLINE OFFLINE
ora....b2.inst application OFFLINE OFFLINE
[root@hcn1 ~]# ifconfig -a
eth0 Link encap:Ethernet HWaddr 08:00:27:82:1E:93
inet addr:192.168.100.101 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1582 errors:0 dropped:0 overruns:0 frame:0
TX packets:1262 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:157531 (153.8 KiB) TX bytes:157824 (154.1 KiB)
eth0:1 Link encap:Ethernet HWaddr 08:00:27:82:1E:93
inet addr:192.168.100.102 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth0:2 Link encap:Ethernet HWaddr 08:00:27:82:1E:93
inet addr:192.168.100.122 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth1 Link encap:Ethernet HWaddr 08:00:27:DF:DD:4A
inet addr:10.105.1.105 Bcast:10.105.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:44508 errors:0 dropped:0 overruns:0 frame:0
TX packets:61950 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:19558274 (18.6 MiB) TX bytes:46505098 (44.3 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:11927 errors:0 dropped:0 overruns:0 frame:0
TX packets:11927 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6499518 (6.1 MiB) TX bytes:6499518 (6.1 MiB)
---- After starting nodeapps on node 1, the new VIP (eth0:2) appears, and the old VIP (eth0:1) is still present
Conclusion:
Clearly, modifying the VIP address while the VIP resource has not been fully stopped can leave the old and new VIPs coexisting on the NIC, and at that point the old VIP still answers pings.
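For instance, a plain ping of the old address still gets answers at this point (illustrative command only):

ping -c 3 192.168.100.102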
Workaround:
Restart the network service; after the restart the old VIP disappears. If it still persists after a restart, reconfigure the NIC settings and then restart it again.
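A minimal sketch of that cleanup, assuming a RHEL-style system like the one above (run as root; eth0:1 is the stale alias from this test):

# drop just the stale VIP alias
/sbin/ifconfig eth0:1 down
# or restart networking as a whole
service network restart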
The takeaway: the next time a VIP is modified, make sure the VIP resource has been completely stopped before making the change.