RMAN Backup Fails with ORA-00235

This is my first blog post written in English.

Today I got a ticket from the EBR team (a third-party backup team) saying that the backup job had failed with ORA-00235:

……
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 10/12/2011 02:38:13
ORA-00235: controlfile fixed table inconsistent due to concurrent update
RMAN-06031: could not translate database keyword
Recovery Manager complete.

So I went to the Veritas NetBackup log directory to check the backup logs:

au11qap830tels2:SANL01P1:/usr/openv/netbackup/logs/user_ops/dbext/oracle>ls -lrt
total 1000
-rw-rw-rw-   1 root     root        3302 Oct  9 23:13 progress.1318162395.13914.log.Z
-rw-rw-rw-   1 root     root      120859 Oct 10 11:48 progress.1318165604.231.log
-rw-rw-rw-   1 root     root      107600 Oct 11 06:49 progress.1318248053.7838.log
-rw-rw-rw-   1 root     root      102098 Oct 11 23:10 progress.1318334454.10590.log
-rw-rw-rw-   1 root     root        8139 Oct 12 02:38 progress.1318347478.12109.log
-rw-rw-rw-   1 root     root      121113 Oct 12 16:57 progress.1318362511.1274.log

There are two backup log files from today (2011-10-12). One is from the failed backup and the other is from a successful backup:

FAILED BACKUP LOG:

au11qap830tels2:SANL01P1:/usr/openv/netbackup/logs/user_ops/dbext/oracle>tail -20 progress.1318347478.12109.log
INF - released channel: ch06
INF - released channel: ch07
INF - released channel: ch08
INF - released channel: ch09
INF - released channel: ch10
INF - released channel: ch11
INF - released channel: ch12
INF - released channel: ch13
INF - released channel: ch14
INF - released channel: ch15
INF - RMAN-00571: ===========================================================
INF - RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
INF - RMAN-00571: ===========================================================
INF - RMAN-03002: failure of backup command at 10/12/2011 02:38:13
INF - ORA-00235: controlfile fixed table inconsistent due to concurrent update
INF - RMAN-06031: could not translate database keyword
INF - Recovery Manager complete.
INF - logout
INF - End of Recovery Manager output.
INF - End Oracle Recovery Manager.
au11qap830tels2:SANL01P1:/usr/openv/netbackup/logs/user_ops/dbext/oracle>

SUCCESSFUL BACKUP LOG:

au11qap830tels2:SANL01P1:/usr/openv/netbackup/logs/user_ops/dbext/oracle>tail -20 progress.1318362511.1274.log
INF - released channel: ch09
INF - released channel: ch10
INF - released channel: ch11
INF - released channel: ch12
INF - released channel: ch13
INF - released channel: ch14
INF - released channel: ch15
INF - allocated channel: ch00
INF - channel ch00: starting full datafile backupset
INF - including current controlfile in backupset
INF - piece handle=ctrl_uapmou8hk_s108889_p1_t764355124 comment=API Version 2.0,MMS Version 5.0.0.0
INF - channel ch00: backup set complete, elapsed time: 00:03:06
INF - Starting Control File and SPFILE Autobackup at 12-OCT-11
INF - piece handle=c-3411474590-20111012-12 comment=API Version 2.0,MMS Version 5.0.0.0
INF - Finished Control File and SPFILE Autobackup at 12-OCT-11
INF - released channel: ch00
INF - Recovery Manager complete.
INF - logout
INF - End of Recovery Manager output.
INF - End Oracle Recovery Manager.
au11qap830tels2:SANL01P1:/usr/openv/netbackup/logs/user_ops/dbext/oracle>
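
Telling the two logs apart by eye is fine for one day, but the same check can be scripted. Below is a minimal Python sketch, not something that ships with NetBackup or RMAN. The helper name `error_codes` is my own, and it only assumes that errors show up as `RMAN-nnnnn:` or `ORA-nnnnn:` in the progress log, as in the output above.

```python
import re

# Matches error codes such as "ORA-00235:" or "RMAN-03002:".
ERROR_CODE = re.compile(r"\b(ORA|RMAN)-(\d{5}):")
# Banner lines that only frame the error stack, not real errors.
BANNER = {"RMAN-00571", "RMAN-00569"}

def error_codes(lines):
    """Return the error codes found in progress-log lines, in order (hypothetical helper)."""
    codes = []
    for line in lines:
        match = ERROR_CODE.search(line)
        if match:
            code = f"{match.group(1)}-{match.group(2)}"
            if code not in BANNER:
                codes.append(code)
    return codes

# Lines copied from the tail of the failed backup log.
failed_tail = [
    "INF - released channel: ch15",
    "INF - RMAN-00571: ===========================================================",
    "INF - RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============",
    "INF - RMAN-03002: failure of backup command at 10/12/2011 02:38:13",
    "INF - ORA-00235: controlfile fixed table inconsistent due to concurrent update",
    "INF - RMAN-06031: could not translate database keyword",
    "INF - Recovery Manager complete.",
]
print(error_codes(failed_tail))  # ['RMAN-03002', 'ORA-00235', 'RMAN-06031']
```

An empty list means no error stack was found in the lines given, which is what the tail of the successful log would return.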

The backup failed with ORA-00235 at 02:38 a.m., and re-running the backup job later succeeded.
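
Since a later re-run succeeded, one possible workaround is to retry automatically when this specific error shows up. The following is only a sketch under my own assumptions: `run_with_retry` and the `run_backup` callback are hypothetical names, and how the real backup job is launched (and its output captured) is left to the caller.

```python
import time

def run_with_retry(run_backup, attempts=3, wait_seconds=600, sleep=time.sleep):
    """Call run_backup() until its output no longer contains ORA-00235.

    run_backup is a caller-supplied function (hypothetical) that starts the
    backup and returns its output as a string.
    Returns (attempt_number, output) for the last attempt made.
    """
    output = ""
    for attempt in range(1, attempts + 1):
        output = run_backup()
        if "ORA-00235" not in output:
            return attempt, output   # clean run, or a different error: do not retry
        if attempt < attempts:
            sleep(wait_seconds)      # wait for the burst of control file updates to pass
    return attempts, output

# Stand-in for the real job: fails once with ORA-00235, then succeeds.
outputs = iter([
    "ORA-00235: controlfile fixed table inconsistent due to concurrent update",
    "Recovery Manager complete.",
])
print(run_with_retry(lambda: next(outputs), wait_seconds=0))
# (2, 'Recovery Manager complete.')
```

The default wait of 600 seconds is an arbitrary value for illustration, not a recommendation from Oracle or Veritas.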

The error occurs because a control file fixed table became inconsistent due to a concurrent update.

When we run an RMAN backup without a recovery catalog, the backup information is stored only in the control file, so RMAN reads the control file to get information such as the SCN.

When the database is going through a high rate of change, it triggers redo log switches, and each log switch triggers a checkpoint.

The checkpoint writes the newest SCN to the control file.

So the SCN no longer matches what RMAN read the first time, and ORA-00235 is raised.

From the NetBackup log, we can see that the error happened at 10/12/2011 02:38:13.

From the log history, we can also see several log switches shortly before 02:38:13:

TO_CHAR(FIRST_TIME,  SEQUENCE#
------------------- ----------
2011-10-12 00:30:18      59728
2011-10-12 00:31:47      59729
2011-10-12 01:28:08      59730
2011-10-12 01:30:30      59731
2011-10-12 02:29:23      59732
2011-10-12 02:34:07      59733
2011-10-12 02:34:45      59734
2011-10-12 03:38:52      59735
2011-10-12 03:40:28      59736
2011-10-12 04:43:04      59737
2011-10-12 04:44:56      59738
====================================
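
To make "log switches shortly before the failure" concrete, here is a small Python sketch that counts the switches in a window before the failure time. The rows are copied from the listing above. The 10-minute window and the function name `switches_before` are my own choices for illustration, and I read the RMAN timestamp 10/12/2011 as 12 October 2011, which matches the log file dates.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# (first_time, sequence#) rows copied from the log history listing.
LOG_HISTORY = [
    ("2011-10-12 01:28:08", 59730),
    ("2011-10-12 01:30:30", 59731),
    ("2011-10-12 02:29:23", 59732),
    ("2011-10-12 02:34:07", 59733),
    ("2011-10-12 02:34:45", 59734),
    ("2011-10-12 03:38:52", 59735),
]

def switches_before(failure_time, rows, minutes=10):
    """Return sequence numbers whose first_time is within `minutes` before the failure."""
    end = datetime.strptime(failure_time, FMT)
    start = end - timedelta(minutes=minutes)
    return [seq for ts, seq in rows if start <= datetime.strptime(ts, FMT) <= end]

print(switches_before("2011-10-12 02:38:13", LOG_HISTORY))
# [59732, 59733, 59734]
```

Three log switches fall in the ten minutes before the backup failed, two of them within 40 seconds of each other.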

So here are the root cause and the solution:

++++++++++++++++
+CAUSE:
++++++++++++++++
As each redo log is archived, the control file is updated with the latest SCN of the redo log switch. If this happens very frequently, the control file is never released and made available to RMAN for the resync.
 
+++++++++++++++
+SOLUTION
+++++++++++++++
(1) Run the backup at a time when the control file is not being updated frequently.
 
(2) Reduce the frequency of checkpoints.
(2.1) Increase the size of the redo log files. Since the redo log files are already 4 GB, this option is not recommended.
(2.2) Increase the value of fast_start_mttr_target from 300 to 600.
 
 