Environment: Oracle 10g RAC + ASM + AIX
Nodes: 192.168.5.15 (db01), 192.168.5.16 (db02)
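Two days ago a colleague told me the database would not start and asked me to take a look. crs_stat showed that db02 (the .16 machine) was online while the primary node db01 was offline. I shut everything down with crs_stop -all and restarted it with crs_start -all, which produced messages about resources that were missing or had failed. I had never seen these messages before. They also pointed to a VIP failure, so I checked the IP configuration and found that the VIP on db01 was gone, while db02 was still fine. I configured a virtual IP with the AIX command smitty mkinetvi, then stopped and restarted CRS again, but the problem remained.
After more than an hour of searching, I finally ran lspv and found that 4 of the original PVs had disappeared. The partitions were gone, so of course the database could not start. I went to the machine room to check whether a fibre channel card or a fibre cable had been knocked loose, but everything looked normal. I then connected to the disk array with the IBM400 array management software. It showed a warning light and reported a "logical path" error, so it really did look like a disk array problem. We contacted the storage vendor, who sent an engineer to check and fix it. He did not explain much about how: he restarted the fibre channel switch and reseated the fibre channel card, and that was it. I still do not know what actually went wrong.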
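Today my colleague told me the partitions were back. I checked with lspv and, sure enough, they had all returned. hdisk2 through hdisk9 are raw devices and have no PVID.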
# lspv
hdisk0          00cc85bf3d2db424                    rootvg          active
hdisk1          00cc85bf404044eb                    rootvg          active
hdisk2          none                                None
hdisk3          none                                None
hdisk4          none                                None
hdisk5          none                                None
hdisk6          none                                None
hdisk7          none                                None
hdisk8          none                                None
hdisk9          none                                None
hdisk10         none                                None
hdisk11         none                                None
hdisk12         00cc85bf8266c2a8                    datavg          active
#
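Since the missing PVs were only noticed after an hour of looking elsewhere, a quick filter over lspv output would have helped. The following is a minimal sketch, not part of the original troubleshooting session; the function name disks_without_pvid is made up for illustration, and it assumes lspv prints the disk name in the first column and "none" in the second column when there is no PVID, as in the listing above.

```shell
# Hypothetical helper: list the disks that lspv reports without a PVID.
# Reads lspv-style output on stdin and prints one disk name per line.
disks_without_pvid() {
    awk '$1 ~ /^hdisk[0-9]+$/ && $2 == "none" { print $1 }'
}
```

On a live node it would be used as `lspv | disks_without_pvid`. Saving that list while the cluster is healthy gives a baseline to compare against when disks seem to have vanished.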
I then started the services with crs_start -all and got the following errors:
bash-3.00$ crs_start -all
Attempting to start `ora.db01.vip` on member `db01`
Attempting to start `ora.db02.vip` on member `db02`
Start of `ora.db02.vip` on member `db02` succeeded.
Attempting to start `ora.db02.ASM2.asm` on member `db02`
Start of `ora.db01.vip` on member `db01` failed.
Attempting to start `ora.db01.vip` on member `db02`
Start of `ora.db01.vip` on member `db02` succeeded.
db02 : CRS-1019: Resource ora.db01.ASM1.asm (application) cannot run on db02
db02 : CRS-1019: Resource ora.db01.ASM1.asm (application) cannot run on db02
db02 : CRS-1019: Resource ora.db01.LISTENER_DB01.lsnr (application) cannot run on db02
db02 : CRS-1019: Resource ora.db01.ASM1.asm (application) cannot run on db02
Start of `ora.db02.ASM2.asm` on member `db02` succeeded.
Attempting to start `ora.GASDB.GASDB2.inst` on member `db02`
Start of `ora.GASDB.GASDB2.inst` on member `db02` succeeded.
Attempting to start `ora.db02.LISTENER_DB02.lsnr` on member `db02`
Start of `ora.db02.LISTENER_DB02.lsnr` on member `db02` succeeded.
Attempting to start `ora.racdb.racdb2.inst` on member `db02`
Start of `ora.racdb.racdb2.inst` on member `db02` succeeded.
CRS-1002: Resource 'ora.db02.ons' is already running on member 'db02'
CRS-1002: Resource 'ora.GASDB.db' is already running on member 'db01'
Attempting to start `ora.db01.gsd` on member `db01`
Attempting to start `ora.db01.ons` on member `db01`
Attempting to start `ora.db02.gsd` on member `db02`
Attempting to start `ora.racdb.db` on member `db01`
Start of `ora.racdb.db` on member `db01` succeeded.
Start of `ora.db01.gsd` on member `db01` succeeded.
Start of `ora.db02.gsd` on member `db02` succeeded.
Start of `ora.db01.ons` on member `db01` succeeded.
CRS-0223: Resource 'ora.GASDB.GASDB1.inst' has placement error.
CRS-0223: Resource 'ora.GASDB.db' has placement error.
CRS-0223: Resource 'ora.db01.ASM1.asm' has placement error.
CRS-0223: Resource 'ora.db01.LISTENER_DB01.lsnr' has placement error.
CRS-0223: Resource 'ora.db02.ons' has placement error.
CRS-0223: Resource 'ora.racdb.racdb1.inst' has placement error.
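The crs_start output is long, and the lines that matter are the CRS-0223 placement errors. As a rough sketch (the function name placement_errors is invented here, and the pattern assumes the exact message wording shown in the log above), the affected resources can be pulled out like this:

```shell
# Hypothetical helper: print the resource names from CRS-0223
# "has placement error" lines found in crs_start output on stdin.
placement_errors() {
    sed -n "s/^CRS-0223: Resource '\([^']*\)' has placement error.*/\1/p"
}
```

For example, `crs_start -all 2>&1 | placement_errors` would leave only the resource names. In the log above, most of them belong to db01, which fits the failed VIP start on that node.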
bash-3.00$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.inst application    OFFLINE   OFFLINE
ora....B2.inst application    ONLINE    ONLINE    db02
ora.GASDB.db   application    ONLINE    ONLINE    db01
ora....SM1.asm application    OFFLINE   OFFLINE
ora....01.lsnr application    OFFLINE   OFFLINE
ora.db01.gsd   application    ONLINE    ONLINE    db01
ora.db01.ons   application    ONLINE    ONLINE    db01
ora.db01.vip   application    ONLINE    ONLINE    db02
ora....SM2.asm application    ONLINE    ONLINE    db02
ora....02.lsnr application    ONLINE    ONLINE    db02
ora.db02.gsd   application    ONLINE    ONLINE    db02
ora.db02.ons   application    ONLINE    ONLINE    db02
ora.db02.vip   application    ONLINE    ONLINE    db02
ora.racdb.db   application    ONLINE    ONLINE    db01
ora....b1.inst application    OFFLINE   OFFLINE
ora....b2.inst application    ONLINE    ONLINE    db02
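The telling row in this table is ora.db01.vip, which is ONLINE but hosted on db02. A small filter can flag that kind of relocation. This is only a sketch under stated assumptions: the function name displaced_vips is made up, it assumes VIP resources are named ora.<node>.vip as in this cluster, and it skips any name that crs_stat -t has abbreviated with dots.

```shell
# Hypothetical helper: report VIP resources that are ONLINE on a node
# other than the one in their name. Reads `crs_stat -t` output on stdin.
displaced_vips() {
    awk '$1 ~ /^ora\..*\.vip$/ && $4 == "ONLINE" {
        n = split($1, parts, ".")       # ora.<node>.vip -> parts[2] is the node
        if (n == 3 && parts[2] != $5)
            print $1 " is on " $5 ", expected " parts[2]
    }'
}
```

Used as `crs_stat -t | displaced_vips`, it prints nothing when every VIP runs on its home node.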
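So it was still a VIP error. Had I misconfigured the virtual IP? The VIP on db02 had no problem, so I compared the IP configuration on the two nodes. Sure enough, I had configured it incorrectly.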
--15
bash-3.00# ifconfig -a
en0: flags=5e080863,c0 ,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 192.168.5.15 netmask 0xffffff00 broadcast 192.168.5.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en1: flags=5e080863,c0 ,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 10.168.5.15 netmask 0xff000000 broadcast 10.255.255.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
vi0: flags=84000041
        inet 192.168.5.17 netmask 0xffffff00
lo0: flags=e08084b >
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1/0
        tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
--16
bash-3.00# ifconfig -a
en0: flags=5e080863,c0 ,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 192.168.5.16 netmask 0xffffff00 broadcast 192.168.5.255
        inet 192.168.5.18 netmask 0xffffff00 broadcast 192.168.5.255
        inet 192.168.5.17 netmask 0xffffff00 broadcast 192.168.5.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en1: flags=5e080863,c0 ,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 10.168.5.16 netmask 0xff000000 broadcast 10.255.255.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
lo0: flags=e08084b >
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1/0
        tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
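The difference between the two nodes is where the VIP address sits: on db02 the VIPs are aliases on en0, while on db01 the address 192.168.5.17 is on the separate virtual interface vi0. A minimal sketch for checking this without reading the whole listing follows; the function name iface_for_ip is invented for illustration, and it assumes the usual ifconfig -a layout with an interface header line followed by inet lines.

```shell
# Hypothetical helper: print the interface(s) that carry a given IPv4
# address. Reads `ifconfig -a` output on stdin; the address is $1.
iface_for_ip() {
    awk -v ip="$1" '
        /^[a-z]+[0-9]+:/ { iface = $1; sub(/:.*/, "", iface) }  # remember current interface
        $1 == "inet" && $2 == ip { print iface }                # address found on it
    '
}
```

On db01, `ifconfig -a | iface_for_ip 192.168.5.17` would have printed vi0 rather than en0, which is the misconfiguration described above.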
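I then googled for solutions but could not find a step-by-step fix anywhere. Still, since this was a VIP problem, I figured that fixing the IP configuration should be enough.
The resolution steps were as follows: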
1. ----ping db01, db02, db01_vip and db02_vip; all of them respond
2. ----Stop the racdb database service
bash-3.00$ crs_stop ora.racdb.db
Attempting to stop `ora.racdb.db` on member `db01`
Stop of `ora.racdb.db` on member `db01` succeeded.
3. ----Start the node applications on db01 with srvctl; the following messages appear
bash-3.00$ srvctl start nodeapps -n db01
db01:ora.db01.vip:IP:192.168.5.17 is not configured as alias (host=db01)
db01:ora.db01.vip:IP:192.168.5.17 is not configured as alias (host=db01)
CRS-0215: Could not start resource 'ora.db01.LISTENER_DB01.lsnr'.
4. ---Check CRS
bash-3.00$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
5. ---Check the VIP
crs_stat -p ora.db01.vip
6. ---Stop all services
#crs_stop -all
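7. ---On db01, delete the address from the virtual interface vi0 and add it as an IP alias on en0
#ifconfig vi0 192.168.5.17 delete
8. ---On db02, delete the 192.168.5.17 address from en0
#ifconfig vi0 192.168.5.17 delete
(This is the command as recorded. In the db02 ifconfig output earlier, the address was on en0, so the interface name here was probably meant to be en0.)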
9. ---Run ifconfig -a on both nodes to check the IPs
--15
# ifconfig -a
en0: flags=5e080863,c0 ,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 192.168.5.15 netmask 0xffffff00 broadcast 192.168.5.255
        inet 192.168.5.17 netmask 0xffffff00 broadcast 192.168.5.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en1: flags=5e080863,c0 ,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 10.168.5.15 netmask 0xff000000 broadcast 10.255.255.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
vi0: flags=84000041
lo0: flags=e08084b >
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1/0
        tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
--16
bash-3.00# ifconfig -a
en0: flags=5e080863,c0 ,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 192.168.5.16 netmask 0xffffff00 broadcast 192.168.5.255
        inet 192.168.5.18 netmask 0xffffff00 broadcast 192.168.5.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en1: flags=5e080863,c0 ,64BIT,CHECKSUM_OFFLOAD(ACTIVE),PSEG,LARGESEND,CHAIN>
        inet 10.168.5.16 netmask 0xff000000 broadcast 10.255.255.255
        tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
vi0: flags=84000000<64BIT>
lo0: flags=e08084b >
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
        inet6 ::1/0
        tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
10. ---Restart the services
#crs_start -all
bash-3.00$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....B1.inst application    ONLINE    ONLINE    db01
ora....B2.inst application    ONLINE    ONLINE    db02
ora.GASDB.db   application    ONLINE    ONLINE    db01
ora....SM1.asm application    ONLINE    ONLINE    db01
ora....01.lsnr application    ONLINE    ONLINE    db01
ora.db01.gsd   application    ONLINE    ONLINE    db01
ora.db01.ons   application    ONLINE    ONLINE    db01
ora.db01.vip   application    ONLINE    ONLINE    db01
ora....SM2.asm application    ONLINE    ONLINE    db02
ora....02.lsnr application    ONLINE    ONLINE    db02
ora.db02.gsd   application    ONLINE    ONLINE    db02
ora.db02.ons   application    ONLINE    ONLINE    db02
ora.db02.vip   application    ONLINE    ONLINE    db02
ora.racdb.db   application    ONLINE    ONLINE    db01
ora....b1.inst application    ONLINE    ONLINE    db01
ora....b2.inst application    ONLINE    ONLINE    db02
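Step 7 mentions adding the IP alias on en0 but the post does not record the command that did it. The sketch below prints (and deliberately does not run) the db01 side of the sequence so it can be reviewed first. Everything here is an assumption for illustration: the function name print_vip_fix_plan is made up, and the `ifconfig en0 alias ...` line is the standard AIX way to add an alias, not necessarily what was typed that day. The netmask 255.255.255.0 corresponds to the 0xffffff00 shown by ifconfig.

```shell
# Hypothetical helper: print the command sequence for moving a VIP
# address from vi0 to an alias on the public interface of db01.
# Arguments: VIP address, public interface, netmask. Nothing is executed.
print_vip_fix_plan() {
    vip=$1
    pub_if=$2
    netmask=$3
    echo "crs_stop -all"                                    # step 6
    echo "ifconfig vi0 $vip delete"                         # step 7, as recorded
    echo "ifconfig $pub_if alias $vip netmask $netmask"     # step 7, assumed alias command
    echo "ifconfig -a"                                      # step 9
    echo "crs_start -all"                                   # step 10
}
```

Calling `print_vip_fix_plan 192.168.5.17 en0 255.255.255.0` lists the five commands in order. The ifconfig lines need root, and the crs_* commands were run from the bash prompt used for the Oracle commands above.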
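Problem solved at last! The main lesson from this incident is that understanding the RAC architecture and its concepts really matters: once you have a grasp of them, you can see where the problem lies and fix it.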
From the ITPUB blog, link: http://blog.itpub.net/3090/viewspace-672035/. Please credit the source when reposting; otherwise legal action may be taken.