Oracle 8103 error: [Data Recovery] ORA-8103 Explained in Detail

期货常小识

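ORA-8103 is an error that we Database Consultants run into regularly, so it is important to understand what causes it.

Put simply, ORA-8103 has two main causes:

1. The block type of the data block is invalid, or the block type that was read differs from what Oracle expects. For example, Oracle expects the block to be of type data (type=6), but it is actually something else.

2. The data_object_id stored in the data block does not match the data_object_id in the data dictionary.

For ORA-8103 problems we recommend the following measures first:

The best way to diagnose ORA-08103 is to generate an ERROR STACK TRACE for the 8103 error. The trace records the OBJ and OBJD of the object that raised the error, which makes it easier to locate the object that may be corrupted.

The catch is that a foreground process hitting ORA-08103 usually does not write a trace file on its own, so we have to set an EVENTS entry manually that triggers ERRORSTACK on 8103: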

ALTER SYSTEM SET EVENTS '8103 TRACE NAME ERRORSTACK LEVEL 3';
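An instance-wide errorstack event keeps producing traces for every ORA-8103 until it is removed, so it may help to scope it. The following is a minimal sketch rather than an official procedure: it sets the same event for the current session only, and shows how the event could be switched off again once a trace has been captured.

```sql
-- Session-level variant: only the current session dumps an errorstack on ORA-8103.
ALTER SESSION SET EVENTS '8103 trace name errorstack level 3';

-- ... re-run the failing statement here, then collect the trace file ...

-- Switch the event off again, first for the session, then instance-wide
-- (the second statement undoes the ALTER SYSTEM command shown above).
ALTER SESSION SET EVENTS '8103 trace name errorstack off';
ALTER SYSTEM SET EVENTS '8103 trace name errorstack off';
```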

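The approach to resolving it includes:

1. Use the OBJD and the DBA to identify the specific table name and block number.

2. If conditions allow, run analyze .. validate structure on the table.

3. If conditions allow, run dbms_space_admin.ASSM_TABLESPACE_VERIFY on the tablespace that holds the table.

4. If conditions allow, move the table or the affected partition to try to work around the problem.

5. If conditions allow, move the table or partition to an MSSM tablespace to work around the problem.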
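Step 1 can be sketched with a few dictionary queries. The substitution variables &obj, &objd, &dba, &rfile and &block are placeholders for values read from the trace file, and the sketch assumes a smallfile tablespace and a DBA expressed in decimal. Treat it as an illustration, not a complete procedure.

```sql
-- OBJ in the trace is an object_id, OBJD is a data_object_id.
-- No row for the OBJD can itself be a hint: TRUNCATE and MOVE assign a new data_object_id.
SELECT owner, object_name, subobject_name, object_type, object_id, data_object_id
FROM   dba_objects
WHERE  object_id = &obj
   OR  data_object_id = &objd;

-- Split the DBA into relative file number and block number.
-- A hex DBA can be converted first, e.g. TO_NUMBER('01c0008a', 'XXXXXXXX') (made-up value).
SELECT dbms_utility.data_block_address_file(&dba)  AS rfile#,
       dbms_utility.data_block_address_block(&dba) AS block#
FROM   dual;

-- Find the segment whose extent currently covers that block.
-- relative_fno is only unique within a tablespace, so check tablespace_name in the output.
SELECT owner, segment_name, partition_name, segment_type, tablespace_name
FROM   dba_extents
WHERE  relative_fno = &rfile
AND    &block BETWEEN block_id AND block_id + blocks - 1;
```

If the segment found through dba_extents is not the object that the OBJD points to, that mismatch is worth recording before any repair is attempted.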

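The tablespace-level checks can be run as follows; oradebug tracefile_name then shows the trace file of the current session, which holds the verification output:

execute dbms_space_admin.tablespace_verify('&tablespace_name')
oradebug setmypid
oradebug tracefile_name

execute dbms_space_admin.assm_tablespace_verify('&tablespace_name',dbms_space_admin.TS_VERIFY_BITMAPS)
oradebug setmypid
oradebug tracefile_name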
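Steps 2, 4 and 5 of the list can be sketched as below. The names &owner, &table_name, &partition_name, &index_name and &mssm_tablespace are placeholders, and the MSSM tablespace is assumed to exist already (created with SEGMENT SPACE MANAGEMENT MANUAL). Note that MOVE has to read the existing blocks, so it may itself raise ORA-8103 on a damaged segment.

```sql
-- Step 2: validate the table and, with CASCADE, its indexes.
-- Appending ONLINE avoids blocking DML while the check runs.
ANALYZE TABLE &owner..&table_name VALIDATE STRUCTURE CASCADE;

-- Step 4: rewrite the segment into newly formatted blocks.
ALTER TABLE &owner..&table_name MOVE;
-- For a partitioned table, move only the affected partition instead:
ALTER TABLE &owner..&table_name MOVE PARTITION &partition_name;

-- Step 5: the same move, but into an MSSM tablespace.
ALTER TABLE &owner..&table_name MOVE TABLESPACE &mssm_tablespace;

-- A move leaves the table's indexes UNUSABLE; rebuild each one afterwards.
ALTER INDEX &owner..&index_name REBUILD;
```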

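Depending on what analyze validate structure returns, we can draw some preliminary conclusions:

If analyze validate structure no longer reports ORA-8103 after the buffer cache has been flushed, then:

It may be perfectly normal behaviour: the earlier ORA-8103 occurred because the object was being DROPped or TRUNCATEd while a SELECT was reading it. In this case the call stack usually shows the process trying to access the segment header of that segment. See BUG 7441661 for more information.

Alternatively, the problem may exist only in the buffer cache and not on disk. If flushing the buffer cache makes the error go away, this is generally the case, and the cause is often a bug in buffer cache management.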
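The flush-and-recheck test described above could look like the following sketch. &owner and &table_name are placeholders, and because flushing the buffer cache forces subsequent reads to go to disk, it is assumed here that the performance impact on the instance is acceptable.

```sql
-- Discard cached copies so that the next read fetches the blocks from disk.
ALTER SYSTEM FLUSH BUFFER_CACHE;

-- Repeat the check. If ORA-8103 is gone, the on-disk blocks are probably fine
-- and the earlier error came from a concurrent DROP/TRUNCATE or from the cache.
ANALYZE TABLE &owner..&table_name VALIDATE STRUCTURE CASCADE;
```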

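If analyze validate structure still reports ORA-8103 after the buffer cache has been flushed, then:

If a dump of the affected block shows that it is logically complete and consistent (this can also be verified with bbed or dbv), the cause may be a Lost Write rather than the block having been reformatted and reused by another object.

An important way to test for a Lost Write is to run recover/blockrecover on the block. If recovery repairs the block, the ORA-8103 was caused by a Lost Write; if it does not, the cause is, with roughly 99% likelihood, a bug.

A common scenario is copying the database with a third-party tool while the database is open; bugs in such tools can copy an older version of a block into the new target database.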
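Dumping the block mentioned above can be sketched as follows. &file_id, &block_id, &owner and &table_name are placeholders, and the v$diag_info query assumes 11g or later. In the dump, the block header's type field and seg/obj field correspond to the two causes of ORA-8103 described at the start of the article.

```sql
-- Write a formatted dump of one block to the session trace file.
ALTER SYSTEM DUMP DATAFILE &file_id BLOCK &block_id;

-- Locate that trace file (11g and later).
SELECT value FROM v$diag_info WHERE name = 'Default Trace File';

-- seg/obj in the dump is hexadecimal; print the dictionary's data_object_id
-- in hex as well so the two values can be compared directly.
SELECT owner, object_name, subobject_name, data_object_id,
       TO_CHAR(data_object_id, 'FMXXXXXXXX') AS data_object_id_hex
FROM   dba_objects
WHERE  owner = UPPER('&owner')
AND    object_name = UPPER('&table_name');

-- Block recovery itself is an RMAN operation and is not shown here.
```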

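Another possibility is an inconsistency at the extent level. The same data block or extent may belong to two segments at once, so that one segment's data is overwritten by the other. Recovery cannot repair this, because the logical corruption lies in the tablespace-level extent information. Check the dba_extents, dba_segments and dba_free_space views to determine whether the problem block belongs to more than one object, or whether it appears in dba_extents/dba_segments and in dba_free_space at the same time: a used extent should not appear in dba_free_space and a free extent should not appear in dba_extents, once objects in the recyclebin have been excluded. This kind of logical extent inconsistency is called extent overlap, and in the vast majority of cases it is caused by a bug in Oracle's Space Management layer.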
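One way to run this check for a single suspect block is sketched below. &file_id, &block_id and &tablespace_name are placeholders. A healthy block is expected to return exactly one row from the first query; the second query lists recyclebin objects so that they can be discounted, since their space is generally also reported as free.

```sql
-- Everything that claims the block: allocated extents and free-space ranges.
SELECT 'EXTENT' AS source, owner, segment_name, partition_name, segment_type,
       block_id, blocks
FROM   dba_extents
WHERE  file_id = &file_id
AND    &block_id BETWEEN block_id AND block_id + blocks - 1
UNION ALL
SELECT 'FREE' AS source, NULL, NULL, NULL, NULL,
       block_id, blocks
FROM   dba_free_space
WHERE  file_id = &file_id
AND    &block_id BETWEEN block_id AND block_id + blocks - 1;

-- Dropped objects still held in the recyclebin of that tablespace.
SELECT owner, object_name, original_name, type
FROM   dba_recyclebin
WHERE  ts_name = UPPER('&tablespace_name');
```

Two EXTENT rows for different segments, or an EXTENT row together with a FREE row for a segment that is not in the recyclebin, would point to the extent overlap described above.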

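Pinning down the OBJD is extremely important when diagnosing ORA-8103. It has to be said that tying an ORA-8103 error to a specific bug is difficult, because it usually requires redo dumps to find out which opcodes produced the later objd or block type mismatch. In some bugs we have found that a possibly stale variable left the objd structure improperly cleared, so the objd on the block was wrong afterwards; the symptom may be ORA-8103 or ORA-1410. This in turn triggered further logical corruption, which made it very hard to trace the original problem through TRACE or REDO LOG DUMP. That is why, although there are several ORA-8103 bug notes for versions such as 10.2.0.4, those bugs were never closed as a real software bug and most simply petered out: the traces and redo dumps from the customer site could not always pinpoint the real problem. It is also why we say that analysing and handling logical corruption is far more complex than handling physical corruption.

Maclean's experience is that in an environment with a large number of Oracle databases, a few logical or physical corrupt blocks a year is quite normal. Physical corruption can be resolved 99% of the time as long as proper backups are taken. For logical corruption there is not much to be done beyond applying the latest patches and enabling db_block_checking and db_block_checksum.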
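Checking and enabling the two parameters might look like the sketch below. The values FULL and MEDIUM are assumptions chosen for illustration, not recommendations from the article; stricter settings cost more CPU, and SCOPE = BOTH assumes the instance was started with an spfile.

```sql
-- Current settings.
SELECT name, value
FROM   v$parameter
WHERE  name IN ('db_block_checking', 'db_block_checksum');

-- Checksums are verified when blocks are read and written (OFF, TYPICAL or FULL).
ALTER SYSTEM SET db_block_checksum = FULL SCOPE = BOTH;

-- Logical consistency checks on block changes (OFF, LOW, MEDIUM or FULL).
ALTER SYSTEM SET db_block_checking = MEDIUM SCOPE = BOTH;
```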

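It is worth noting that if you read through some of the ORA-8103 bug notes, you will find that LOBs, APPEND INSERT, PARALLEL INSERT, exchange partition, split partition, advanced compression and HCC (Hybrid Columnar Compression) are high-risk operations that often trigger ORA-8103, yet in practice we cannot simply give them up.

Known bugs related to ORA-8103: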

| Bug | Fixed | Description |
|---|---|---|
| 13910420 | 11.2.0.3.BP09, 12.1.0.0 | ORA-8103 during insert / update of basicfile LOB in assm segment using space search cache |
| 13725395 | 11.2.0.3.BP07, 11.2.0.4, 12.1.0.0 | ORA-600 [kdzhFindHeadPiece: unnewed > 1] from load into HCC table |
| 13700577 | 11.2.0.3.BP07, 11.2.0.4, 12.1.0.0 | PQ slave dies with ORA-600 [kdblddr_2] |
| 12747437 | 12.1.0.0 | ORA-600 [ktspfmdb:objdchk_kcbnew_3] after purging single consumer queue table |
| 12582839 | 11.2.0.3, 12.1.0.0 | ORA-8103/ORA-600 [3020] on RMAN recovered locally managed tablespace |
| 12321309 | 12.1.0.0 | ORA-600 / ORA-8103 UNUSABLE state of partitioned index is not carried across by TABLESPACE transport using DataPump |
| 11937253 | 11.2.0.2.6, 11.2.0.2.BP11, 11.2.0.3, 12.1.0.0 | A Parallel query fails with ORA-8103 on an Active Dataguard Environment. |
| 11850492 | 11.2.0.3, 12.1.0.0 | ORA-8103 ORA-600 ORA-3113 on temporary tables using INDEX FAST FULL SCAN and DIRECT read |
| 10385812 | 11.2.0.3, 12.1.0.0 | ORA-1410 or ORA-8103 by queries with DIRECT READ while concurrent DIRECT INSERT |
| 10329146 | 11.2.0.1.BP10, 11.2.0.2.2, 11.2.0.2.BP03, 11.2.0.2.GIBUNDLE02, 11.2.0.2.GIPSU02, 11.2.0.3, 12.1.0.0 | Lost write in ASM with multiple DBWs and a disk is offlined and then onlined |
| 10209232 | 11.1.0.7.7, 11.2.0.1.BP08, 11.2.0.2.1, 11.2.0.2.BP02, 11.2.0.2.GIBUNDLE01, 11.2.0.3, 12.1.0.0 | ORA-1578 / ORA-600 [3020] Corruption. Misplaced Blocks and Lost Write in ASM |
| 10136415 | 11.2.0.3, 12.1.0.0 | ORA-8103 on Partitioned IOT after partition maintenance |
| 9965085 | 11.2.0.3, 12.1.0.0 | ORA-1578 / ORA-8103 Temporary table block corruption / space wastage from PDML |
| 9659614 | 10.2.0.5.3, 11.2.0.2, 11.2.0.3.5, 11.2.0.3.BP05, 12.1.0.0 | Large trace file for ORA-8103 |
| 9651350 | 11.2.0.2.2, 11.2.0.2.BP05, 11.2.0.3, 12.1.0.0 | Large redo dump and ORA-308 might be raised due to ORA-8103 |
| 9275027 | 11.2.0.2, 12.1.0.0 | ORA-600 [kcbnew_3] can occur after TRUNCATE / DROP |
| 9272086 | 11.1.0.7.4, 11.2.0.1.2, 11.2.0.1.BP06, 11.2.0.2, 12.1.0.0 | ORA-8103 by a query on DBA_EXTENTS. Trace file with Block type: 0x44=NGLOB: Extent Map |
| 8754670 | 11.2.0.2, 12.1.0.0 | IMP-17 / ORA-8103 transporting a large dictionary managed tablespace |
| 8740993 | 11.1.0.7.8, 11.2.0.2, 12.1.0.0 | ORA-1410 / ORA-8103 on ADG STANDBY during table scan after DROP/TRUNCATE/SHRINK in PRIMARY |
| 8725282 | 11.2.0.1.BP08, 11.2.0.2, 12.1.0.0 | Corruption from cross platform transport of tablespace with securefile objects |
| 8716064 | 11.2.0.2, 12.1.0.0 | Analyze Table Validate Structure fails on ADG standby with several errors |
| 8597106 | 11.2.0.1.BP06, 11.2.0.2, 12.1.0.0 | Lost Write in ASM when normal redundancy is used |
| 8428523 | 11.2.0.2, 12.1.0.0 | Alter Table Rename causes wrong results/ora-8103/hangs on ADG Standby. |
| 7710827 | 11.2.0.2, 12.1.0.0 | Index rebuild or Merge partition causes wrong results in concurrent reads instead of ORA-8103 |
| 7519406 | 10.2.0.5.1, 11.2.0.1.2, 11.2.0.1.BP06, 11.2.0.2, 12.1.0.0 | Larger trace than needed for ORA-8103 under kteinicnt1 |
| 12330911 | 12.1 | EXADATA LSI firmware for lost writes |
| 8876094 | 11.1.0.7.2, 11.2.0.2 | ORA-8103 by DBA_UNDO_EXTENTS or DBMS_SPACE_ADMIN.TABLESPACE_VERIFY on Block type: 0x25 |
| 9167831 | 11.2.0.2 | ORA-8103 instead of ORA-1410 |
| 7650993 | 11.1.0.7.1, 11.2.0.1 | ORA-8103 in a select at ADG standby database from table stored in ASSM tablespace |
| 7432556 | 11.1.0.7.1, 11.2.0.1 | ORA-8103 by Parallel Query on Partitioned Tables in BIGFILE Tablespaces |
| 7390324 | 11.2.0.1 | ANALYZE signals OERI [kcbgtcr_12]/ORA-8103 on bitmap index |
| 7117200 | 11.2.0.1 | ORA-8103 after TSPITR/PLUGIN tablespace from a restored Level 1 Backup |
| 8825048 | 11.1.0.7.3 | ORA-308/ORA-27037 when dumping archived log for ORA-8103. Dump when event 10736 level 4 is set |
| 6337376 | 11.1.0.7 | OERI:kcbgcur_3 / ORA-8103 after truncating a partition table with LOBs |
| 9711472 | 11.1.0.6 | ORA-8103 on operations for a partitioned LOB if any different partition is dropped |
| 5637976 | 10.2.0.4, 11.1.0.6 | ORA-8103/ORA-1410 from concurrent INSERT / export on ASSM tables |
| 5083393 | 10.2.0.4, 11.1.0.6 | DBA_FREE_SPACE FILE_ID / REL_FNO may be wrong |
| 4592596 | 10.2.0.4, 11.1.0.6 | Corruption (ORA-1410 / ORA-8103) from multi-table insert with direct load |
| 6864586 | 10.2.0.5 | ORA-8103 on partitioned table with a LOB column during analyze table with concurrent add/drop partition. |
| 3569503 | 9.2.0.6, 10.2.0.4 | PQ may signal a false ORA-8103 under load |
| 13618170 | | ORA-8103 for create index online when the fix of bug 10027403 is installed |
| 3966709 | 9.2.0.7, 10.1.0.4, 10.2.0.1 | Range/object reuse prematurely (ORA-8103) |
| 3868753 | 9.2.0.7, 10.1.0.5, 10.2.0.1 | Concurrent export / INSERT of ASSM segment can fail with ORA-1410 / ORA-8103 |
| 5523799 | | Various OERI (eg kcbgtcr_12) using ASSM managed segments - superseded |
| 6047085 | | Linux x86-64: SGA corruption / crash following any ORA-7445 |
| 3785200 | 9.2.0.6, 10.1.0.2 | Corruption possible in automatic space managed segments |
| 3083560 | 9.2.0.5, 10.1.0.2 | ORA-1410 / ORA-8103 from direct path export if concurrent DML occurs |
| 2619867 | 9.2.0.3, 10.1.0.2 | OERI:[KCBGTCR_12] / ORA-8103 / ORA-1410 SELECTing from bitmap managed segment |
| 2551000 | 9.2.0.4, 10.1.0.2 | False ORA-1410 / ORA-8103 possible from ANALYZE COMPUTE/ESTIMATE STATISTICS |
| 2333731 | 9.2.0.2 | ORA-8103 possible in PQ slave |
| 2105419 | 9.0.1.3, 9.2.0.1 | ORA-8103 possible from PQ on bitmap managed segments with concurrent inserts |
| 1998455 | 8.1.7.3, 9.0.1.3, 9.2.0.1 | OERI:KCBGTCR_4 possible from long running DDL if referenced object dropped/truncated |
| 1804299 | 9.0.1.1, 9.2.0.1 | Rollback of Direct load can corrupt BITMAP managed segments / ORA-8103 |
| 1698789 | 9.2.0.1 | Wrong results, ORA-1410, ORA-8103, OERI:25012 on SELECT of UNSCOPED REF with ROWID |
| 1504967 | 9.2.0.1 | ORA-8103 possible on READ ONLY standby after TRUNCATE on primary |
| 1400739 | 8.1.7.1, 9.0.1.0 | Block corruption/OERI:2023 /ORA-8103 can occur if TRUNCATE is interrupted (Ctrl-C) |
| 1283521 | 8.1.7.0 | ORA-8103 can occur on TRUNCATED cluster table |
| 589855 | 7.3.3.6, 7.3.4.1 | ORA:1578 or ORA:8103 selecting invalid ROWID |
| 1053863 | 8.0.5.2, 8.0.6.2 | NCR: ORA-8103 / corrupt read possible using async IO |

For more information on ORA-8103, see:

MOS Note 268302.1, ORA-8103 Diagnostics and Solution
