1. Preparation
Confirm the current RAC status
[grid@db1 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.eons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.db.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.oc4j
      1        OFFLINE OFFLINE
ora.scan1.vip
      1        ONLINE  ONLINE       db1
IP configuration before the change
[grid@db1 ~]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       odd.up.com odd localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
192.168.1.161   db1.up.com db1
192.168.1.162   db2.up.com db2
10.0.1.161      db1-priv.up.com db1-priv
10.0.1.162      db2-priv.up.com db2-priv
192.168.1.163   db1-vip.up.com db1-vip
192.168.1.164   db2-vip.up.com db2-vip
192.168.1.165   db-cluster
|               | IP before change | IP after change |
|---------------|------------------|-----------------|
| db1 Public IP | 192.168.1.161    | 20.0.1.161      |
| db2 Public IP | 192.168.1.162    | 20.0.1.162      |
| db1-vip IP    | 192.168.1.163    | 20.0.1.163      |
| db2-vip IP    | 192.168.1.164    | 20.0.1.164      |
| db1-priv IP   | 10.0.1.161       | 100.0.1.161     |
| db2-priv IP   | 10.0.1.162       | 100.0.1.162     |
| db-cluster    | 192.168.1.165    | 20.0.1.165      |
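Before touching any network settings, it is worth taking a backup first (the summary at the end of this article makes the same point). A minimal sketch, assuming the Grid home used throughout this article; the output file names are just examples:

# Force a manual OCR backup (run as root)
/u01/app/11.2.0/grid/bin/ocrconfig -manualbackup
# Record the current interface registration and hosts file
/u01/app/11.2.0/grid/bin/oifcfg getif > /root/oifcfg.before
cp /etc/hosts /root/hosts.before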
2. Detailed Steps
2.1. Preparation Before the Change
Stop the database and listeners on both nodes, then stop CRS.
1. Disable and stop the database on all nodes (both nodes):
   [root@db1 ~]# srvctl disable database -d db
   [root@db1 ~]# srvctl stop database -d db

2. Disable and stop the LISTENER on all nodes (both nodes):
   [root@db1 ~]# srvctl disable listener
   [root@db1 ~]# srvctl stop listener

3. Disable and stop the VIPs on all nodes. (Note: a. when operating on a VIP, supply the VIP name configured in /etc/hosts; b. only root can disable a VIP resource.) (run on node 1):
   [root@db1 ~]# srvctl disable vip -i "db1-vip"
   [root@db1 ~]# srvctl disable vip -i "db2-vip"
   [root@db1 ~]# srvctl stop vip -n db1
   [root@db1 ~]# srvctl stop vip -n db2

4. Disable and stop the SCAN listener on all nodes (run on one node):
   [root@db1 ~]# srvctl disable scan_listener
   [root@db1 ~]# srvctl stop scan_listener

5. Disable and stop the SCAN on all nodes (run on node 1):
   [root@db1 ~]# srvctl disable scan
   [root@db1 ~]# srvctl stop scan

6. Stop the cluster stack on each node.
   On node 1:
   [root@db1 ~]# /u01/app/11.2.0/grid/bin/crsctl stop crs
   On node 2:
   [root@db2 ~]# /u01/app/11.2.0/grid/bin/crsctl stop crs
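Before touching the NICs, it is worth confirming on each node that the stack really is down; a quick check:

# Run on both db1 and db2; with the stack stopped, crsctl should report
# that it cannot contact the CRS/CSS/EVM daemons
/u01/app/11.2.0/grid/bin/crsctl check crs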
Edit the /etc/hosts file on both nodes
20.0.1.161      db1.up.com db1
20.0.1.162      db2.up.com db2
10.0.1.161      db1-priv.up.com db1-priv
10.0.1.162      db2-priv.up.com db2-priv
20.0.1.163      db1-vip.up.com db1-vip
20.0.1.164      db2-vip.up.com db2-vip
20.0.1.165      db-cluster
Note: in this first pass, do not change the private IPs yet.
Change the NIC IP addresses with OS commands
Run the same commands on node 1 and node 2:
[root@db1 ~]# system-config-network
[root@db1 ~]# ifdown eth0
[root@db1 ~]# ifup eth0
[root@db1 ~]# ifconfig | grep inet
inet addr:20.0.1.161  Bcast:20.0.1.255  Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe6c:3749/64 Scope:Link
inet addr:10.0.1.161  Bcast:10.0.1.255  Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe25:bf57/64 Scope:Link
inet addr:127.0.0.1  Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
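system-config-network is interactive; on a RHEL-style system the same change can also be made non-interactively by editing the ifcfg file directly. A sketch for node 1, using the addresses from this example (use 20.0.1.162 on node 2; keep your own DEVICE/HWADDR settings):

# /etc/sysconfig/network-scripts/ifcfg-eth0  (sketch, adjust to your system)
DEVICE=eth0
BOOTPROTO=static
IPADDR=20.0.1.161
NETMASK=255.255.255.0
ONBOOT=yes

followed by the same ifdown/ifup pair shown above.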
2.2. Changing the Public IP
Start CRS on both nodes, then change the public IP with oifcfg.
Start CRS on node 1:
[root@db1 bin]# /u01/app/11.2.0/grid/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
Start CRS on node 2:
[root@db2 bin]# /u01/app/11.2.0/grid/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
On node 1:
[root@db1 bin]# ./oifcfg delif -global eth0
[root@db1 bin]# ./oifcfg setif -global eth0/20.0.1.0:public
[root@db1 bin]# ./oifcfg getif
eth1  10.0.1.0  global  cluster_interconnect
eth0  20.0.1.0  global  public
Verify on node 2:
[root@db2 ~]# /u01/app/11.2.0/grid/bin/oifcfg getif
eth1  10.0.1.0  global  cluster_interconnect
eth0  20.0.1.0  global  public
As you can see, node 2 has picked up the change as well.
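As a cross-check, oifcfg can also list the subnets the OS itself currently reports per interface; eth0 should now show the 20.0.1.0 network:

# Lists interfaces and subnets as seen by the OS (not the cluster registration)
[root@db1 ~]# /u01/app/11.2.0/grid/bin/oifcfg iflist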
Change the VIPs on both nodes. The database is not running at this point; if it were, it would have to be shut down first.
[root@db1 ~]# srvctl config vip -n db1
VIP exists.:db1
VIP exists.: /db1-vip/20.0.1.163/255.255.255.0/eth0
[root@db1 ~]# srvctl config vip -n db2
VIP exists.:db2
VIP exists.: /db2-vip/20.0.1.164/255.255.255.0/eth0
Checking the VIPs now shows they have already been updated automatically.
If the VIPs still show the old addresses at this point, run the following:
As root, stop the listener and VIP services, then modify the VIPs:
srvctl stop listener -n db1
srvctl stop listener -n db2
srvctl stop vip -n db1
srvctl stop vip -n db2
srvctl modify nodeapps -n db1 -A 20.0.1.163/255.255.255.0/eth0
srvctl modify nodeapps -n db2 -A 20.0.1.164/255.255.255.0/eth0
Restart the VIPs and listeners:
srvctl start vip -n db1
srvctl start vip -n db2
srvctl start listener -n db1
srvctl start listener -n db2
Then verify the VIP status again.
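For example, using the same commands as earlier:

[root@db1 ~]# srvctl config vip -n db1
[root@db1 ~]# srvctl status vip -n db1
[root@db1 ~]# srvctl config vip -n db2
[root@db1 ~]# srvctl status vip -n db2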
Update the local_listener parameter
In my test this step was not needed; the database version is 11.2.0.3. After completing all the changes and starting the database, the alert log shows the following:
Completed: ALTER DATABASE OPEN
Wed Mar 19 12:51:52 2014
ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=db1-vip)(PORT=1521))))' SCOPE=MEMORY SID='db1';
Wed Mar 19 12:52:05 2014
It registered automatically.
SQL> show parameter local_listener
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
local_listener                       string
First check local_listener on both nodes. In my RAC environment it is not set. If it is set, you will see the listener address still pointing at the old VIP, and it must be updated to the new VIP addresses with the following commands:
alter system set local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=20.0.1.163)(PORT=1521))))' scope=both sid='db1';
alter system set local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=20.0.1.164)(PORT=1521))))' scope=both sid='db2';
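It can be worth checking remote_listener at the same time; it normally points at the SCAN name and only needs the same treatment if it was set to a literal IP. A sketch, assuming the oracle OS user can connect as SYSDBA via OS authentication:

# Assumption: OS-authenticated SYSDBA access as the oracle user
su - oracle -c 'sqlplus -S / as sysdba <<EOF
show parameter local_listener
show parameter remote_listener
EOF'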
2.3. Changing the SCAN IP
Proceed as follows:
[root@db1 ~]# srvctl config scan
SCAN name: db-cluster, Network: 1/192.168.1.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /192.168.1.165/192.168.1.165
[root@db1 ~]# srvctl status scan
SCAN VIP scan1 is disabled
SCAN VIP scan1 is not running
[root@db1 ~]# srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is disabled
SCAN listener LISTENER_SCAN1 is not running
[root@db1 ~]# srvctl modify scan -n db-cluster
[root@db1 ~]# srvctl config scan
SCAN name: db-cluster, Network: 1/192.168.1.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /db-cluster/20.0.1.165
[root@db1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       odd.up.com odd localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
20.0.1.161      db1.up.com db1
20.0.1.162      db2.up.com db2
10.0.1.161      db1-priv.up.com db1-priv
10.0.1.162      db2-priv.up.com db2-priv
20.0.1.163      db1-vip.up.com db1-vip
20.0.1.164      db2-vip.up.com db2-vip
20.0.1.165      db-cluster
[root@db1 ~]# srvctl modify scan -n 20.0.1.165
[root@db1 ~]# srvctl config scan
SCAN name: 20.0.1.165, Network: 1/192.168.1.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /db-cluster/20.0.1.165
Notice that the subnet is still 192.168.1.0, so this change alone is clearly not enough.
Testing shows that the SCAN's subnet is derived from the USR_ORA_SUBNET attribute of the ora.net1.network resource, so before modifying the SCAN, set that attribute to the new network number:
[root@db1 ~]# crsctl modify res "ora.net1.network" -attr "USR_ORA_SUBNET=20.0.1.0"
[root@db1 ~]# srvctl modify scan -n db-cluster
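The attribute change itself can be verified directly on the resource; -f prints the full attribute list:

# Confirm USR_ORA_SUBNET now carries the new network number
[root@db1 ~]# crsctl status res ora.net1.network -f | grep USR_ORA_SUBNET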
This updates the value for db-cluster. srvctl only offers an option to modify the SCAN configuration by name; presumably Oracle resolves the name (via DNS or /etc/hosts) to obtain the corresponding IP and apply the configuration.
[root@db1 ~]# srvctl config scan
SCAN name: db-cluster, Network: 1/20.0.1.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /db-cluster/20.0.1.165
Node 2 has picked up the change as well:
[root@db2 ~]# srvctl config scan
SCAN name: db-cluster, Network: 1/20.0.1.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /db-cluster/20.0.1.165
Start the SCAN and SCAN listener
[root@db1 ~]# srvctl enable scan
[root@db1 ~]# srvctl start scan
[root@db1 ~]# srvctl enable scan_listener
[root@db1 ~]# srvctl start scan_listener
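To confirm the SCAN came up on the new address:

[root@db1 ~]# srvctl status scan
[root@db1 ~]# srvctl status scan_listener
# Resolves via the updated /etc/hosts entry
[root@db1 ~]# ping -c 2 db-cluster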
Re-enable autostart for the services that were disabled in the preparation step
[root@db1 ~]# srvctl enable vip -i "db1-vip"
[root@db1 ~]# srvctl enable vip -i "db2-vip"
[root@db1 ~]# srvctl start vip -n db1
[root@db1 ~]# srvctl start vip -n db2
[root@db1 ~]# srvctl enable listener
[root@db1 ~]# srvctl start listener
As you can see, the cluster status is back to normal:
[root@db1 ~]# crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.eons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.db.db
      1        OFFLINE OFFLINE
      2        OFFLINE OFFLINE
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.oc4j
      1        OFFLINE OFFLINE
ora.scan1.vip
      1        ONLINE  ONLINE       db1
2.4. Changing the Private IP
Next, change the private IPs.
Both nodes show as Active:
[root@db1 ~]# olsnodes -s
db1     Active
db2     Active
First change the IP address of eth1 at the OS level, then make the changes below.
Now update the entries in /etc/hosts:
[root@db1 ~]# vim /etc/hosts
[root@db1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       odd.up.com odd localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
20.0.1.161      db1.up.com db1
20.0.1.162      db2.up.com db2
20.0.1.163      db1-vip.up.com db1-vip
20.0.1.164      db2-vip.up.com db2-vip
20.0.1.165      db-cluster
100.0.1.161     db1-priv.up.com db1-priv
100.0.1.162     db2-priv.up.com db2-priv
You can see that the interconnect is still registered with the old subnet:
[root@db1 ~]# oifcfg getif
eth1  10.0.1.0  global  cluster_interconnect
eth0  20.0.1.0  global  public
Register the new subnet on the same NIC:
[root@db1 ~]# oifcfg setif -global eth1/100.0.1.0:cluster_interconnect
[root@db1 ~]# oifcfg getif
eth1  10.0.1.0   global  cluster_interconnect
eth0  20.0.1.0   global  public
eth1  100.0.1.0  global  cluster_interconnect
[root@db1 ~]# oifcfg delif -global eth1/10.0.1.0
[root@db1 ~]# oifcfg getif
eth0  20.0.1.0   global  public
eth1  100.0.1.0  global  cluster_interconnect
Reference commands
When keeping the same NIC:
oifcfg getif
oifcfg setif -global eth1/100.0.1.0:cluster_interconnect
oifcfg delif -global eth1/10.0.1.0
oifcfg getif

When moving to a new NIC:
oifcfg getif
oifcfg setif -global eth3/100.0.1.0:cluster_interconnect
oifcfg delif -global eth1
oifcfg getif
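The interface registration made with oifcfg -global is stored in the GPnP profile; if in doubt, the profile can be dumped to confirm the cluster_interconnect entry now carries 100.0.1.0. A sketch, run as the grid owner from the Grid home used in this article:

# Dumps the GPnP profile XML; check the Network entries for the interconnect subnet
[grid@db1 ~]$ /u01/app/11.2.0/grid/bin/gpnptool get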
Start the database
[root@db1 ~]# srvctl enable database -d db
[root@db1 ~]# srvctl start database -d db
Verify the cluster status again.
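Once the instances are open, the interconnect actually in use can also be confirmed from the database. A sketch, assuming the oracle OS user can connect as SYSDBA via OS authentication:

# Assumption: OS-authenticated SYSDBA access as the oracle user
su - oracle -c 'sqlplus -S / as sysdba <<EOF
select inst_id, name, ip_address from gv\$cluster_interconnects;
EOF'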
During the procedure, because I changed the private IP with oifcfg without first changing the NIC's IP address at the OS level, restarting CRS failed with the following errors:
2014-03-19 12:13:25.446: [ CRSMAIN][1624389360] Checking the OCR device
2014-03-19 12:13:25.449: [ CRSMAIN][1624389360] Connecting to the CSS Daemon
2014-03-19 12:13:25.477: [ CRSMAIN][1624389360] Initializing OCR
2014-03-19 12:13:25.483: [ OCRAPI][1624389360]clsu_get_private_ip_addr: Calling clsu_get_private_ip_addresses to get first private ip
2014-03-19 12:13:25.483: [ OCRAPI][1624389360]Check namebufs
2014-03-19 12:13:25.483: [ OCRAPI][1624389360]Finished checking namebufs
2014-03-19 12:13:25.485: [ GIPC][1624389360] gipcCheckInitialization: possible incompatible non-threaded init from [clsinet.c : 3229], original from [clsss.c : 5011]
2014-03-19 12:13:25.490: [ GPnP][1624389360]clsgpnp_Init: [at clsgpnp0.c:404] gpnp tracelevel 3, component tracelevel 0
2014-03-19 12:13:25.490: [ GPnP][1624389360]clsgpnp_Init: [at clsgpnp0.c:534] '/u01/app/11.2.0/grid' in effect as GPnP home base.
2014-03-19 12:13:25.501: [ GIPC][1624389360] gipcCheckInitialization: possible incompatible non-threaded init from [clsgpnp0.c : 680], original from [clsss.c : 5011]
2014-03-19 12:13:25.504: [ GPnP][1624389360]clsgpnp_InitCKProviders: [at clsgpnp0.c:3866] Init gpnp local security key providers (2) fatal if both fail
2014-03-19 12:13:25.505: [ GPnP][1624389360]clsgpnp_InitCKProviders: [at clsgpnp0.c:3869] Init gpnp local security key proveders 1 of 2: file wallet (LSKP-FSW)
2014-03-19 12:13:25.506: [ GPnP][1624389360]clsgpnpkwf_initwfloc: [at clsgpnpkwf.c:398] Using FS Wallet Location : /u01/app/11.2.0/grid/gpnp/db1/wallets/peer/
2014-03-19 12:13:25.506: [ GPnP][1624389360]clsgpnp_InitCKProviders: [at clsgpnp0.c:3891] Init gpnp local security key provider 1 of 2: file wallet (LSKP-FSW) OK
2014-03-19 12:13:25.506: [ GPnP][1624389360]clsgpnp_InitCKProviders: [at clsgpnp0.c:3897] Init gpnp local security key proveders 2 of 2: OLR wallet (LSKP-CLSW-OLR)
[ CLWAL][1624389360]clsw_Initialize: OLR initlevel [30000]
2014-03-19 12:13:25.527: [ GPnP][1624389360]clsgpnp_InitCKProviders: [at clsgpnp0.c:3919] Init gpnp local security key provider 2 of 2: OLR wallet (LSKP-CLSW-OLR) OK
2014-03-19 12:13:25.527: [ GPnP][1624389360]clsgpnp_getCK: [at clsgpnp0.c:1950] <Get gpnp security keys (wallet) for id:1,typ;7. (2 providers - fatal if all fail)
2014-03-19 12:13:25.527: [ GPnP][1624389360]clsgpnpkwf_getWalletPath: [at clsgpnpkwf.c:498] req_id=1 ck_prov_id=1 wallet path: /u01/app/11.2.0/grid/gpnp/db1/wallets/peer/
2014-03-19 12:13:25.598: [ GPnP][1624389360]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/db1/wallets/peer/cwallet.sso'
2014-03-19 12:13:25.598: [ GPnP][1624389360]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
2014-03-19 12:13:25.598: [ GPnP][1624389360]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
2014-03-19 12:13:25.608: [ GPnP][1624389360]clsgpnp_getCK: [at clsgpnp0.c:1950] <Get gpnp security keys (wallet) for id:1,typ;4. (2 providers - fatal if all fail)
2014-03-19 12:13:25.608: [ GPnP][1624389360]clsgpnpkwf_getWalletPath: [at clsgpnpkwf.c:498] req_id=1 ck_prov_id=1 wallet path: /u01/app/11.2.0/grid/gpnp/db1/wallets/peer/
2014-03-19 12:13:25.671: [ GPnP][1624389360]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/db1/wallets/peer/cwallet.sso'
2014-03-19 12:13:25.672: [ GPnP][1624389360]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
2014-03-19 12:13:25.672: [ GPnP][1624389360]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
2014-03-19 12:13:25.672: [ GPnP][1624389360]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=23803, tl=3, f=0
2014-03-19 12:13:25.770: [ OCRAPI][1624389360]clsu_get_private_ip_addresses: no ip addresses found.
2014-03-19 12:13:25.770: [GIPCXCPT][1624389360] gipcShutdownF: skipping shutdown, count 2, from [ clsinet.c : 1732], ret gipcretSuccess (0)
2014-03-19 12:13:25.774: [GIPCXCPT][1624389360] gipcShutdownF: skipping shutdown, count 1, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
[ OCRAPI][1624389360]a_init_clsss: failed to call clsu_get_private_ip_addr (7)
2014-03-19 12:13:25.782: [ OCRAPI][1624389360]a_init:13!: Clusterware init unsuccessful : [44]
2014-03-19 12:13:25.783: [ CRSOCR][1624389360] OCR context init failure. Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]
2014-03-19 12:13:25.783: [ CRSD][1624389360][PANIC] CRSD exiting: Could not init OCR, code: 44
3. Summary
In my testing I was using virtual machines, and every time I restarted the network the hostname got reset, so each time I had to fix the hostname in /etc/sysconfig/network and then run the hostname command.
The order of operations for changing the private IP is exactly the reverse of 10gR2. In 10gR2 you first stop CRS, then change the hosts file and the physical IP, then start CRS and register the new private subnet with oifcfg. Keep this in mind: following the 10gR2 private-network procedure here will leave the CRS stack unable to start, so take a backup before you begin.