A Case of OCR Corruption Caused by an Oracle Upgrade

Which just goes to show how much backups matter for a production database...


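Our standby RAC cluster went down unexpectedly, so we investigated the hardware and the Oracle side in parallel. It turned out that the database was at 10.2.0.3 while CRS was still at 10.2.0.1. I could not say for sure that the mismatch was the cause, but as a process of elimination I decided to upgrade CRS and see what happened. The attempt left me in a cold sweat...

The upgrade itself went smoothly. CRS started and the node services came up normally on every node, but the database never appeared. crs_stat showed that none of the instances or services had started. Starting an instance by hand with sqlplus worked and the database opened, but starting it with srvctl failed with the following errors: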

[oracle@bj15-75 ~]$ srvctl start database -d membj
PRKP-1001 : Error starting instance membj1 on node bj15-74
CRS-0212: Resource 'ora.membj.membj1.inst' is not registered.
PRKP-1001 : Error starting instance membj2 on node bj15-75
CRS-0212: Resource 'ora.membj.membj2.inst' is not registered.
PRKP-1001 : Error starting instance membj3 on node bj15-77
CRS-0212: Resource 'ora.membj.membj3.inst' is not registered.
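
To compare the failing resources against what crs_stat lists, it helps to pull the resource names out of the error text first. The following is only a minimal sketch: the function name `unregistered_resources` is my own invention, and it assumes the CRS-0212 message has exactly the wording shown above.

```shell
# Read srvctl output on stdin and print the name of every resource
# that a CRS-0212 line reports as "not registered".
# Assumption: the message format matches the output shown in this post.
unregistered_resources() {
    sed -n "s/^CRS-0212: Resource '\([^']*\)' is not registered.*/\1/p"
}

# Possible usage on a cluster node (not run here):
#   srvctl start database -d membj 2>&1 | unregistered_resources
# Each printed name can then be looked up in the crs_stat output.
```

If a name printed here also shows up in crs_stat, you are looking at the same contradiction described in this post: the resource is visible but cannot be started.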

Yet the crs_stat output clearly listed these resources... With a sinking feeling I searched Metalink right away and found this note:

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3
This problem can occur on any platform.

Symptoms
The database and/or instances are not starting up using the srvctl command, reporting the following errors when invoked, e.g.:

PRKP-1001: Error starting instance inslo1 on node rias-ins-dba01
CRS-0212: Resource 'ora.inslo.inslo1.inst' is not registered.

The errors suggest that the clusterware does not know about the resource, because it is not registered in the OCR.
Cause
The issue is caused by corruption of the database and/or instance entries in the OCR.
The following output shows that the resource is not registered in the clusterware, yet CRS can still get its status using the crs_stat command, which blocks any update, start, or stop of this resource.

PRKP-1001: Error starting instance inslo1 on node rias-ins-dba01
CRS-0212: Resource 'ora.inslo.inslo1.inst' is not registered.
Solution
Because this resource entry is corrupted in the OCR, you can simply remove the resource, along with all of its corrupted information, from the OCR using the "srvctl remove" command, and then add the resource again, which should make it work again.

1. Removing the resource:
srvctl remove database -d

2. Add the resources again:
srvctl add database -d -o
srvctl add instance -d -i -n

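So the upgrade had corrupted my OCR? Then the first thing to do was restore the OCR. I used ocrconfig to restore the most recent OCR backup taken before the upgrade. That backup turned out to be bad as well, and it sent me on a long detour: the restored OCR still showed the problem above, which led me to believe that the backups themselves were all damaged.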
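
For reference, an OCR restore with ocrconfig on 10.2 usually follows the outline below. This is a sketch from memory rather than a record of the exact commands I ran. The helper name `print_ocr_restore_plan` is made up, and the backup path is whatever `ocrconfig -showbackup` reports on your cluster. The function only prints the plan.

```shell
# Print (do not execute) a typical OCR restore plan for a given backup file.
# Assumptions: the commands are run as root, and "crsctl stop crs" /
# "crsctl start crs" are repeated on every node of the cluster.
print_ocr_restore_plan() {
    backup=$1
    echo "ocrconfig -showbackup"        # list the automatic OCR backups
    echo "crsctl stop crs"              # stop CRS on all nodes first
    echo "ocrconfig -restore $backup"   # restore the chosen backup file
    echo "crsctl start crs"             # bring CRS back up on all nodes
    echo "cluvfy comp ocr -n all"       # check OCR integrity across nodes
}

# Example: print_ocr_restore_plan /path/from/showbackup/week.ocr
```

Listing the backups first matters here, because picking an older copy rather than the most recent one is what eventually got this cluster back.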

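So I had to try the approach from the note. But after running srvctl remove database, crs_stat still listed the original database resources. Adding a new database resource failed with an "already exists" error. The OCR was apparently damaged badly enough that the entries could not be removed by normal means. The only option left seemed to be to dd the OCR out and edit it by hand. There were too many entries, though: I had gotten lucky with that once before, but this time every attempt failed. Next I tried removing the instances instead. The first two nodes succeeded, but the last one reported an error: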

[oracle@bj15-74 ~]$ srvctl remove instance -i membj1 -d membj
Remove instance membj1 from the database membj? (y/[n]) y
PRKP-1075 : Instance membj1 is the last preferred instance for service membjapp.

I tried removing the service:

[oracle@bj15-74 ~]$ srvctl remove service -s membjapp -d membj
service membjapp is running

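That failed too, even though the service was clearly not running...

Out of options, I decided to try my luck by restoring last week's OCR backup. If that failed, the only thing left would be to reinstall CRS...

After the restore, I first removed the instances and services on the first two nodes and then re-registered the services for those two nodes. When I then tried to remove the instance on the third node, it worked. Afterwards I reconfigured the service with dbca, and that worked too!

So apart from the problem with the OCR backup, writing OCR information with srvctl still seems less reliable than doing it with dbca. This is the second time I have run into entries that could not be cleared...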

Source: ITPUB Blog, link: http://blog.itpub.net/79686/viewspace-1018846/. Please credit the source when reposting; otherwise legal action may be pursued.

