UNDO相关问题总结（六）

最新推荐文章于 2024-04-08 13:44:36 发布

oradh

最新推荐文章于 2024-04-08 13:44:36 发布

阅读量1.4k

点赞数

分类专栏： oralce undo相关知识文章标签： oracle 数据库 undo dba 4000

本文链接：https://blog.csdn.net/oradh/article/details/25373069

版权

oralce undo相关知识专栏收录该内容

10 篇文章 0 订阅

订阅专栏

本次主题也是undo相关问题的最后一篇，也可以归纳为数据库的非常规恢复，主要描述的数据库出现ORA-600 [4xxx]错误的处理方案。

问题描述

ORA-600 [4xxx] 错误基本都与回滚段相关，下面简单描述几种（仅供参考）

ORA-600 [4000] [a] ：这个错误表示oracle尝试撤销数据字典表（数据字典表ITL中存在未提交事务），而根据ITL中的内容寻找undo segment number时，出现错误和失败，其中参数 a 表示undo segment number，数据库无法启动（一般都是bootstrap$自举对象撤销时失败），一般需要通过bbed工具修改数据字典，undo相关问题中这类问题最为严重，处理也最为复杂。
ORA-600 [4193] [a] [b] : A mismatch has been detected between Redo records and Rollback (Undo) records.We are validating the Undo block sequence number in the undo block against the Redo block sequence number relating to the change being applied.This error is reported when this validation fails。这类问题处理相对简单，处理方法见后面。
ORA-600 [4194] [a] [b] : A mismatch has been detected between Redo records and rollback (Undo) records. We are validating the Undo record number relating to the change being applied against the maximum undo record number recorded in the undo block.This error is reported when the validation fails。这类问题处理相对简单，处理方法见后面。
ORA-600 [4194] : While backing out an undo record (i.e. at the time of rollback) we found a transaction id mis-match indicating either a corruption in the rollback segment or corruption in an object which the rollback segment is trying to apply undo records on.This would indicate a corrupted rollback segment。这类问题处理相对简单，处理方法见后面。
ORA-600 [4097] : XID involved and seq# from undo header,the database crashed and system rollback segment has problems with transaction table. when accessing a rollback segment header to see if a transaction has been committed you observe the XID given is in the future of the transaction tracle。这类问题处理相对简单，处理方法见后面。

解决方法（仅供参考）

1.将回滚段管理方式改为手工，然后尝试启动数据库，如果不成功则尝试步骤2

create pfile='xxxx' from spfile; 在pfile中添加如下条目

UNDO_MANAGEMENT = MANUAL

undo_tablespace=system
2.尝试设置一些event（10513 等）尝试启动数据库，未成功则执行步骤3

在pfile中添加事件条目

event='10513 trace name context forever,level 2 :10512 trace name context forever,level 1: 10511 trace name context forever,level 2:10510 trace name context forever,level 1' scope=spfile;

事件说明

10510 : Turn off SMON check to offline pending offline rollbacksegment
10511 : Turn off SMON check to cleanup undo dictionary
10512 : Turn off SMON check to shrink rollback segments
10513 : Turn off SMON rollback uncommitted transaction(禁止smon进程进行事务回滚)

3.使用_offline_rollback_segments/_corrupted_rollback_segments屏蔽回滚段，尝试启动数据库，未成功执行步骤4

在pfile中添加问题回滚段，例如: _offline_rollback_segments=('_SYSSMU1_3780397527$','_SYSSMU2_2232571081$')

回滚段名称获取方式

dba_rollback_segments,v$rollname （数据库打开状态）
strings system01.dbf | grep _SYSSMU | cut -d $ -f 1 | sort -u （数据库关闭状态）

4.如果依然不能open数据库,考虑使用bbed工具提交事务，修改回滚段状态等操作，未成功执行步骤5

ORA-600 [4000] 错误一般需要使用bbed修改数据字典，修改前一定要备份！
5.如果依然还不能open数据库，考虑使用dul，抽取数据

注意事项：

数据库启动完成以后，一定要记得完成扫尾工作，将有问题的回滚段offline 并drop掉，新建回滚表空间，将原来的 undo表空间切换至新建的，将undo管理还原为自动模式，最后以spfile启动数据库。、
如果问题解决过程中使用了bbed修改了数据字典，最好检查一下数据字典的一致性（hcheck.sql脚本），观察是否有其它600错误出现，最后判断是否需要重建冲数据。

MOS 文档

ORA-600 [4000] [a]

VERSIONS:

version 6.0 to 9.2

DESCRIPTION:

This has the potential to be a very serious error.

It means that Oracle has tried to find an undo segment number in the

dictionary cache and failed

ARGUMENTS:

Arg [a] Undo segment number

FUNCTIONALITY:

KERNEL TRANSACTION UNDO

IMPACT:

INSTANCE FAILURE - Instance will not restart

STATEMENT FAILURE

SUGGESTIONS:

As per Note 1371820.8, this can be seen when executing DML on tables residing

in tablespaces transported from another database.

It is fixed in 8.1.7.4, 9.0.1.4 and 9.2.0.1 The workaround however is to

create more rollback segments in the target database until the highest

rollback segment number (select max(US#) from sys.undo$;) is at least

as high as in equivalent max(US#) from the source database.

It has also been seen where memory has been corrupted so try shutting

down and restarting the instance.

If the database will not start contact Oracle Support Services

immediately, providing the alert.log and associated trace files.

Known Bugs

NB Bug Fixed Description
16761566 11.2.0.3.9, 11.2.0.3.BP22, 11.2.0.4, 12.1.0.2, 12.2.0.0 Instance fails to start with ORA-600 [4000] [usn#]
13910190 11.2.0.3.BP15, 11.2.0.4, 12.1.0.1 ORA-600 [4000] from plugged in tablespace in Exadata
14741727 11.2.0.2.9, 11.2.0.2.BP19, 11.2.0.3.BP12, 11.2.0.3.BP13, 12.1.0.1 Fixes for bug 12326708 and 14624146 can cause problems - backout fix
+ 10425010 11.2.0.3, 12.1.0.1 Stale data blocks may be returned by Exadata FlashCache
* 9145541 11.1.0.7.4, 11.2.0.1.2, 11.2.0.2, 12.1.0.1 OERI[25027]/OERI[4097]/OERI[4000]/ORA-1555 in plugged datafile after CREATE CONTROLFILE in 11g
12353983 ORA-600 [4000] with XA in RAC
7687856 11.2.0.1 ORA-600 [4000] from DML on transported ASSM tablespace
2917441 11.1.0.6 OERI [4000] during startup
3115733 9.2.0.5, 10.1.0.2 OERI[4000] / index corruption can occur during index coalesce
2959556 9.2.0.5, 10.1.0.2 STARTUP after an ORA-701 fails with OERI[4000]
1371820 8.1.7.4, 9.0.1.4, 9.2.0.1 OERI:4506 / OERI:4000 possible against transported tablespace
+ 434596 7.3.4.2, 8.0.3.0 ORA-600[4000] from altering storage of BOOTSTRAP$

ORA-600 [4193] [a] [b]

VERSIONS:

versions 6.0 to 10.1

DESCRIPTION:

A mismatch has been detected between Redo records and Rollback (Undo)

records.

We are validating the Undo block sequence number in the undo block against

the Redo block sequence number relating to the change being applied.

This error is reported when this validation fails.

ARGUMENTS:

Arg [a] Undo record seq number

Arg [b] Redo record seq number

FUNCTIONALITY:

KERNEL TRANSACTION UNDO

IMPACT:

PROCESS FAILURE

POSSIBLE ROLLBACK SEGMENT CORRUPTION

SUGGESTIONS:

This error may indicate a rollback segment corruption.

This may require a recovery from a database backup depending on

the situation.

For further analysis, please submit the trace files and alert.log to

Oracle Support Services.

Known Issues:

Known Bugs

NB Bug Fixed Description
14034244 11.2.0.3.BP09, 11.2.0.4, 12.1.0.1 Lost write type corruption using ASM in 11.2.0.3
8240762 10.2.0.5, 11.1.0.7.10, 11.2.0.1 Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] / SMON may spin to recover transaction

ORA-600 [4194] [a] [b]

VERSIONS:

versions 6.0 to 10.1

DESCRIPTION:

A mismatch has been detected between Redo records and rollback (Undo)

records.

We are validating the Undo record number relating to the change being

applied against the maximum undo record number recorded in the undo block.

This error is reported when the validation fails.

ARGUMENTS:

Arg [a] Maximum Undo record number in Undo block

Arg [b] Undo record number from Redo block

FUNCTIONALITY:

Kernel Transaction Undo called from Cache layer

IMPACT:

PROCESS FAILURE

POSSIBLE ROLLBACK SEGMENT CORRUPTION

SUGGESTIONS:

This error may indicate a rollback segment corruption.

This may require a recovery from a database backup depending on

the situation.

If the Known Issues section below does not help in terms of identifying

a solution, please submit the trace files and alert.log to Oracle

Support Services for further analysis.

Known Issues:

Known Bugs

NB Bug Fixed Description
8240762 10.2.0.5, 11.1.0.7.10, 11.2.0.1 Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] / SMON may spin to recover transaction
3210520 9.2.0.5, 10.1.0.2 OERI[kjccqmg:esm] / OERI[4194] / corruption possible in RAC
+ 792610 8.0.6.0, 8.1.6.0 Rollback segment corruption OERI:4194 can occur if block checking detects a corrupt block

ORA-600 [4137]

VERSIONS:

versions 7.0 to 10.1

DESCRIPTION:

While backing out an undo record (i.e. at the time of rollback) we found a

transaction id mis-match indicating either a corruption in the rollback

segment or corruption in an object which the rollback segment is trying to

apply undo records on. This would indicate a corrupted rollback segment.

FUNCTIONALITY:

Kernel Transaction Undo Recovery

IMPACT:

POSSIBLE PHYSICAL CORRUPTION in Rollback segments

SUGGESTIONS:

Signalled during rollback (also rollback for consistent read).

The consistency check that compares the transaction id of the

transaction being rolled back against the transaction id in

undo block being applied is failing.

A possible cause is a lost write to the undo segment.

The main approach is to identify the file containing the bad undo

segment block and treat it as if the file is corrupt. Consult

the trace file for this information.

If in archivelog mode, restore the file & roll forward.

If in Noarchivelog mode, restore from a cold backup taken before

the error was reported.

Alternatively, you can look at dba_rollback_segs data dictionary view.

If the status column that describes what state the rollback segment is

currently in is "needs recovery" then lookup the following article

for posible solution.

Note:28812.1

Rollback Segment Needs Recovery

If the Known Issues section below does not help in terms of identifying

a solution, please submit the trace files and alert.log to Oracle

Support Services for further analysis.

Known Issues：

Known Bugs

NB Bug Fixed Description
8240762 10.2.0.5, 11.1.0.7.10, 11.2.0.1 Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] / SMON may spin to recover transaction
671491 8.1.6.0 Rollback Segment corruption possible if RBS has > 32767 extents

Step by step to resolve ORA-600 4194 4193 4197 on database crash

Oracle Database - Enterprise Edition - Version 9.2.0.1 to 11.2.0.3 [Release 9.2 to 11.2]
Information in this document applies to any platform.
***Checked for relevance on 04-Dec-2013***

SYMPTOMS

The following error is occurring in the alert.log right before the database crashes.

ORA-00600: internal error code, arguments: [4194], [#], [#], [], [], [], [], []

This error indicates that a mismatch has been detected between redo records and rollback (undo) records.

ARGUMENTS:

Arg [a] - Maximum Undo record number in Undo block
Arg [b] - Undo record number from Redo block

Since we are adding a new undo record to our undo block, we would expect that the new record number is equal to the maximum record number in the undo block plus one. Before Oracle can add a new undo record to the undo block it validates that this is correct. If this validation fails, then an ORA-600 [4194] will be triggered.

CHANGES

This issue generally occurs when there is a power outage or hardware failure that initially crashes the database. On startup, the database does the normal roll forward (redo) and then rollback (undo), this is where the error is generated on the rollback.

CAUSE

This also can be cause by the following defect

Bug 8240762 Abstract: Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] after SHRINK

Details:
Undo corruption may be caused after a shrink and the same undo block may be used
for two different transactions causing several internal errors like:
ORA-600 [4193] / ORA-600 [4194] for new transactions
ORA-600 [4137] for a transaction rollback

SOLUTION

Best practice to create a new undo tablespace.
This method includes segment check.

Create pfile from spfile to edit
>create pfile from spfile;

1. Shutdown the instance

2. set the following parameters in the pfile
    undo_management = manual
    event = '10513 trace name context forever, level 2'

3. >startup restrict pfile=<initsid.ora>

4. >select tablespace_name, status, segment_name from dba_rollback_segs where status != 'OFFLINE';

This is critical - we are looking for all undo segments to be offline - System will always be online.

If any are 'PARTLY AVAILABLE' or 'NEEDS RECOVERY' - Please open an issue with Oracle Support or update the current SR. There are many options from this moment and Oracle Support Analyst can offer different solutions for the bad undo segments.

If all offline then continue to the next step

5. Create new undo tablespace - example
>create undo tablespace <new undo tablespace> datafile <datafile> size 2000M;

6. Drop old undo tablespace
>drop tablespace <old undo tablespace> including contents and datafiles;

7. >shutdown immediate;

8 >startup mount;

9 modify the pfile with the new undo tablespace name
>alter system set undo_tablespace = '<new tablespace>' scope=pfile;

10. >shutdown immediate;

11. >startup;

        Startup using the normal spfile
================

The reason we create a new undo tablespace first is to use new undo segment numbers that are higher then the current segments being used. This way when a transaction goes to do block clean-out the reference to that undo segment does not exist and continues with the block clean-out.

oradh

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
UNDO相关问题总结（六）

本次主题也是undo相关问题的最后一篇，也可以归纳为数据库的非常规恢复，主要描述的数据库出现ORA-600 [4xxx]错误的处理方案。问题描述ORA-600 [4xxx] 错误基本都与回滚段相关，下面简单描述几种（仅供参考）ORA-600 [4000] [a] ：这个错误表示oracle尝试撤销数据字典表（数据字典表ITL中存在未提交事务），而根据ITL
复制链接

扫一扫

专栏目录

NB	Bug	Fixed	Description
	16761566	11.2.0.3.9, 11.2.0.3.BP22, 11.2.0.4, 12.1.0.2, 12.2.0.0	Instance fails to start with ORA-600 [4000] [usn#]
	13910190	11.2.0.3.BP15, 11.2.0.4, 12.1.0.1	ORA-600 [4000] from plugged in tablespace in Exadata
	14741727	11.2.0.2.9, 11.2.0.2.BP19, 11.2.0.3.BP12, 11.2.0.3.BP13, 12.1.0.1	Fixes for bug 12326708 and 14624146 can cause problems - backout fix
+	10425010	11.2.0.3, 12.1.0.1	Stale data blocks may be returned by Exadata FlashCache
*	9145541	11.1.0.7.4, 11.2.0.1.2, 11.2.0.2, 12.1.0.1	OERI[25027]/OERI[4097]/OERI[4000]/ORA-1555 in plugged datafile after CREATE CONTROLFILE in 11g
	12353983		ORA-600 [4000] with XA in RAC
	7687856	11.2.0.1	ORA-600 [4000] from DML on transported ASSM tablespace
	2917441	11.1.0.6	OERI [4000] during startup
	3115733	9.2.0.5, 10.1.0.2	OERI[4000] / index corruption can occur during index coalesce
	2959556	9.2.0.5, 10.1.0.2	STARTUP after an ORA-701 fails with OERI[4000]
	1371820	8.1.7.4, 9.0.1.4, 9.2.0.1	OERI:4506 / OERI:4000 possible against transported tablespace
+	434596	7.3.4.2, 8.0.3.0	ORA-600[4000] from altering storage of BOOTSTRAP$

UNDO相关问题总结（六）

SYMPTOMS

CHANGES

CAUSE

SOLUTION

“相关推荐”对你有帮助么？