上篇博客刚说完数据文件中有坏块,会影响rman的备份,导致备份失败。今天就遇到了生产环境由于坏块导致rman备份失败,已经有20几天没有备份完成了。唯一不同的是,这些坏块的位置,经查询,不属于任何对象,也就是格式化了,在未使用的块上面。即如何格式化不属于任何段的损坏块。下面就模拟这一报错,及响应的解决方案。
故障模拟
1、此块不属于任何对象;
2、此块必须是格式化未使用的坏块,如果是未格式化(从未使用过的空块)的坏块,rman备份不会报错
本次环境:oracle 11.2.0.4 asm standalone环境
创建必须表:
[oracle@11gasm ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Wed Nov 11 12:34:05 2020
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> conn zhuo/zhuo
SQL> create table test as select * from zhuo;
Table created.
查询block的分布:
SQL> conn / as sysdba
Connected.
SQL> select file_id,block_id,block_id+blocks-1
2 from dba_extents
3 where segment_name ='TEST' AND owner='ZHUO';
FILE_ID BLOCK_ID BLOCK_ID+BLOCKS-1
---------- ---------- -----------------
5 384 391
5 392 399
5 400 407
5 408 415
5 416 423
5 424 431
5 432 439
5 440 447
5 448 455
5 456 463
5 464 471
5 472 479
5 480 487
5 488 495
5 496 503
5 504 511
5 512 639
17 rows selected.
SQL> select * from dba_free_space where file_id=5;
TABLESPACE_NAME FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------------------------------ ---------- ---------- ---------- ---------- ------------
ZHUO 5 640 4156555264 507392 5
ZHUO 5 508032 1206910976 147328 5
此时表test占用的区都是连续分布的,dba_free_space里面查到的空闲空间也是连续分布。
制造格式化,但未使用的块:
1)删除块上的数据,但不释放空间
SQL> delete from zhuo.test;
18333 rows deleted.
SQL> commit;
Commit complete.
SQL> select file_id,block_id,block_id+blocks-1
2 from dba_extents
3 where segment_name ='TEST' AND owner='ZHUO';
FILE_ID BLOCK_ID BLOCK_ID+BLOCKS-1
---------- ---------- -----------------
5 384 391
5 392 399
5 400 407
5 408 415
5 416 423
5 424 431
5 432 439
5 440 447
5 448 455
5 456 463
5 464 471
5 472 479
5 480 487
5 488 495
5 496 503
5 504 511
5 512 639
17 rows selected.
SQL> select * from dba_free_space where file_id=5;
TABLESPACE_NAME FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------------------------------ ---------- ---------- ---------- ---------- ------------
ZHUO 5 640 4156555264 507392 5
ZHUO 5 508032 1206910976 147328 5
dba_free_space里面的空间分布没有变化。
2)制造一个坏块.
此处是选择一个块,修改rdba的地址,为什么选择387块。因为387号块原来属于表zhuo,现在数据已经删除,现在此块不属于任何对象,
此处选择直接拷贝别的块到此块处,就相当于rdba已经修改。
SQL> @asm
Enter value for filesystem_filename: /tmp/385.dbf
Enter value for asm_datafile_name: +datadg/orcl/datafile/ZHUO.266.1024792071
Enter value for block_id_in_datafile: 387
3)处理此块
SQL> set autot off
SQL> set lines 150
SQL> col segment_name for a15
SQL> col owner for a20
SQL> SELECT tablespace_name, segment_type, owner, segment_name
2 FROM dba_extents
3 WHERE file_id = &AFN
4 and &BL between block_id AND block_id+blocks-1;
Enter value for afn: 5
old 3: WHERE file_id = &AFN
new 3: WHERE file_id = 5
Enter value for bl: 387
old 4: and &BL between block_id AND block_id+blocks-1
new 4: and 387 between block_id AND block_id+blocks-1
TABLESPACE_NAME SEGMENT_TYPE OWNER SEGMENT_NAME
------------------------------ ------------------ -------------------- ---------------
ZHUO TABLE ZHUO TEST
SQL> drop table zhuo.test;
Table dropped.
SQL> set autot off
SQL> set lines 150
SQL> col segment_name for a15
SQL> col owner for a20
SQL> SELECT tablespace_name, segment_type, owner, segment_name
2 FROM dba_extents
3 WHERE file_id = &AFN
4 and &BL between block_id AND block_id+blocks-1;
Enter value for afn: 5
old 3: WHERE file_id = &AFN
new 3: WHERE file_id = 5
Enter value for bl: 387
old 4: and &BL between block_id AND block_id+blocks-1
new 4: and 387 between block_id AND block_id+blocks-1
no rows selected
SQL> select * from dba_free_space where file_id=5;
TABLESPACE_NAME FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------------------------------ ---------- ---------- ---------- ---------- ------------
ZHUO 5 640 4156555264 507392 5
ZHUO 5 508032 1206910976 147328 5
ZHUO 5 384 65536 8 5 --属于此块
ZHUO 5 392 65536 8 5
ZHUO 5 400 65536 8 5
ZHUO 5 408 65536 8 5
ZHUO 5 416 65536 8 5
ZHUO 5 424 65536 8 5
ZHUO 5 432 65536 8 5
ZHUO 5 440 65536 8 5
ZHUO 5 448 65536 8 5
ZHUO 5 456 65536 8 5
ZHUO 5 464 65536 8 5
ZHUO 5 472 65536 8 5
ZHUO 5 480 65536 8 5
ZHUO 5 488 65536 8 5
ZHUO 5 496 65536 8 5
ZHUO 5 504 65536 8 5
ZHUO 5 512 1048576 128 5
19 rows selected.
请注意:有时候会查到,此块也不在dba_free_Space空间里面。
4)rman 备份出现报错,错误出现
RMAN> backup datafile 5 format '/home/oracle/zhuo.bak';
Starting backup at 10-NOV-20
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=+DATADG/orcl/datafile/zhuo.266.1024792071
channel ORA_DISK_1: starting piece 1 at 10-NOV-20
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/10/2020 23:18:19
ORA-19566: exceeded limit of 0 corrupt blocks for file +DATADG/orcl/datafile/zhuo.266.1024792071
坏块识别
1)坏块检测
[oracle@11gasm ~]$ dbv file=+datadg/orcl/datafile/ZHUO.266.1024792071 userid=sys/oracle
DBVERIFY: Release 11.2.0.4.0 - Production on Tue Nov 10 23:18:03 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = +datadg/orcl/datafile/ZHUO.266.1024792071
Page 387 is marked corrupt
Corrupt block relative dba: 0x01400183 (file 5, block 387) --坏块类型:block misplaced
Bad header found during dbv:
Data in bad block:
type: 0 format: 2 rdba: 0x00000181
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0xa681
computed block checksum: 0x0 --checksum值是正确的
DBVERIFY - Verification complete
Total Pages Examined : 655360
Total Pages Processed (Data) : 485
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 654872
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 2
Total Pages Marked Corrupt : 1 --有一个物理坏块
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 0 (0.0)
有坏块,下面需要确定坏块类型。根据不同的坏块类型,进行不同的处理。
- 方法一:参考mos
Step 3 - Check whether block is part of any object - For Small Number of Corrupted Blocks
Query dba_extents and cross check the block does not belong to any object :
SQL> select segment_name, segment_type, owner
from dba_extents
where file_id = <Absolute file number>
and <corrupted block number> between block_id
and block_id + blocks -1;
If it doesn't belong to an object, double check if it does exists in dba_free_space :
SQL> Select * from dba_free_space where file_id= <Absolute file number>
and <corrupted block number> between block_id and block_id + blocks -1;
If the block cannot be found on DBA_FREE_SPACE nor DBA_EXTENTS, the block might be a file space usage bitmap and cannot be reformatted.
1)查询坏块属于对象:
SQL> set autot off
SQL> set lines 150
SQL> col segment_name for a15
SQL> col owner for a20
SQL> SELECT tablespace_name, segment_type, owner, segment_name
2 FROM dba_extents
3 WHERE file_id = &AFN
4 and &BL between block_id AND block_id+blocks-1;
Enter value for afn: 5
old 3: WHERE file_id = &AFN
new 3: WHERE file_id = 5
Enter value for bl: 387
old 4: and &BL between block_id AND block_id+blocks-1
new 4: and 387 between block_id AND block_id+blocks-1
no rows selected
不属于任何对象。
注意:如果是段头损坏或者位图块,此语句查不出来。
所以单靠这一个语句,根本确定不了这个块是否未未格式化的块。还得进一步佐证。
2)查询dba_free_space,是否属于free块
SQL> select * from dba_free_space where file_id=5;
TABLESPACE_NAME FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------------------------------ ---------- ---------- ---------- ---------- ------------
ZHUO 5 640 4156555264 507392 5
ZHUO 5 508032 1206910976 147328 5
ZHUO 5 384 65536 8 5
ZHUO 5 392 65536 8 5
ZHUO 5 400 65536 8 5
ZHUO 5 408 65536 8 5
ZHUO 5 416 65536 8 5
ZHUO 5 424 65536 8 5
ZHUO 5 432 65536 8 5
ZHUO 5 440 65536 8 5
ZHUO 5 448 65536 8 5
ZHUO 5 456 65536 8 5
ZHUO 5 464 65536 8 5
ZHUO 5 472 65536 8 5
ZHUO 5 480 65536 8 5
ZHUO 5 488 65536 8 5
ZHUO 5 496 65536 8 5
ZHUO 5 504 65536 8 5
ZHUO 5 512 1048576 128 5
19 rows selected.
此块属于free空间了,所以此块属于未格式化坏块。
3)当然进一步的,如果dba_extents和dba_free_space都查询不出来,可以用dump来确认,排除段头,位图块的坏块的能.If the block cannot be found on DBA_FREE_SPACE nor DBA_EXTENTS, the block might be a file space usage bitmap and cannot be reformatted.
2个视图里面都没有此块,根据mos,该块可能是文件空间使用位图,因此无法重新格式化。
dump结果:
*** 2020-05-19 19:04:33.402
Start dump data blocks tsn: 5 file#:5 minblk 131 maxblk 131 –dump的是131块
Block dump from cache:
Dump of buffer cache at level 4 for tsn=5 rdba=20971651
Block dump from disk:
buffer tsn: 5 rdba: 0x01400084 (5/132) –实际却是132块,所以属于写紊乱了
scn: 0x0000.000ba23c seq: 0x02 flg: 0x04 tail: 0xa23c0602
frmt: 0x02 chkval: 0xe412 type: 0x06=trans data –块类型
Hex dump of corrupt header 4 = CORRUPT
Dump of memory from 0x00007FFFF4CF3200 to 0x00007FFFF4CF3214
7FFFF4CF3200 0000A206 01400084 000BA23C 04020000 […@.<…]
7FFFF4CF3210 0000E412 […]
Hex dump of corrupt block
Dump of memory from 0x00007FFFF4CF3214 to 0x00007FFFF4CF51FC
7FFFF4CF3210 00000001 00013AE2 000BA236 […:…6…]
7FFFF4CF3220 00000000 00320003 01400080 0000FFFF […2…@…]
7FFFF4CF3230 00000000 00000000 00000000 00008000 […]
7FFFF4CF3240 000BA236 00000000 00000000 00000000 [6…]
7FFFF4CF3250 00000000 00000000 00000000 00000000 […]
type: 0x06=trans data 。type是准的。
说明这个块之前是数据块,从而排除了段头,位图块的可能。
通过上面3种方式的确认,此块一定属于未格式化的块。
补充:块类型,关于block type,有如下几种类型:
01 undo segment header
02 undo data block
03 save undo header
04 save undo data block
05 data segment header
06 data block
07 temp data block(无ITL)
08 sort key
09 sort run
10 segment free list block
11 datafile header block
2. 方法二:文件高水位分析此块
select *
from (select /*+ ordered use_hash(a,b,c) */
a.file_id,
a.file_name,
a.filesize,
b.freesize,
(a.filesize - b.freesize) usedsize,
c.hwmsize,
c.hwmsize - (a.filesize - b.freesize) unsedsize_belowhwm,
a.filesize - c.hwmsize canshrinksize
from (select file_id,
file_name,
round(bytes / 1024 / 1024) filesize
from dba_data_files) a,
(select file_id, round(sum(dfs.bytes) / 1024 / 1024) freesize
from dba_free_space dfs
group by file_id) b,
(select file_id, round(max(block_id) * 8 / 1024) HWMsize
from dba_extents
group by file_id) c
where a.file_id = b.file_id
and a.file_id = c.file_id
order by unsedsize_belowhwm desc)
where file_id in (select file_id
from dba_data_files
where file_id=5)
order by file_id;
FILE_ID FILE_NAME FILESIZE FREESIZE USEDSIZE HWMSIZE UNSEDSIZE_BELOWHWM CANSHRINKSIZE
---------- -------------------------------------------------- ---------- ---------- ---------- ---------- ------------------ -------------
5 +DATADG/orcl/datafile/zhuo.266.1024792071 5120 5117 3 2 -1 5118
不难看出目前该数据文件的HWM大概在2M左右;而前面报错的坏块编号387*8192/1024/1024 大概是3M左右;显然这是有问题的。
—此处主要模拟坏块出现在已格式化,但未使用的块里面。
原因
格式化未使用的块出现坏块
解决方案
对于这个问题,我们可以其实不做任何处理,等业务表的数据逐渐增加;会自动去格式化处理掉这部分
坏块,不过这样会影响备份。也就是说在坏块处理好之前备份可能都无法正常进行,除非指定allow corrption参数。
那么针对这种简单问题其实有3种比较简单的处理方法:
1、通过resize文件来完成
2、通过创建测试表,allocate extent到指定文件,并往表中插入大量数据,格式化坏块后,再drop table
3、通过bbed copy 空块直接覆盖即可(copy 覆盖需要修改rdba地址)。
bbed copy
通过bbed copy 空块直接覆盖即可(copy 覆盖需要修改rdba地址)。
- 坏块检测
语法:
dbv userid={system/<password>} file={full path filename} logfile={output filename}
[oracle@11gasm ~]$ dbv file='+datadg/orcl/datafile/ZHUO.266.1024792071' userid=sys/oracle
DBVERIFY: Release 11.2.0.4.0 - Production on Tue Nov 10 20:39:04 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = +datadg/orcl/datafile/ZHUO.266.1024792071
Page 387 is marked corrupt
Corrupt block relative dba: 0x01400183 (file 5, block 387)
Bad header found during dbv:
Data in bad block:
type: 0 format: 2 rdba: 0x00000182
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0xa682
computed block checksum: 0x0
DBVERIFY - Verification complete
Total Pages Examined : 655360
Total Pages Processed (Data) : 243
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 139
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 654977
Total Pages Marked Corrupt : 1
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 0 (0.0)
file 5 block 387出现坏块
- 检查坏块对象
SQL> set autot off
SQL> set lines 150
SQL> col segment_name for a15
SQL> col owner for a20
SQL> SELECT tablespace_name, segment_type, owner, segment_name
2 FROM dba_extents
3 WHERE file_id = &AFN
4 and &BL between block_id AND block_id+blocks-1;
Enter value for afn: 5
old 3: WHERE file_id = &AFN
new 3: WHERE file_id = 5
Enter value for bl: 387
old 4: and &BL between block_id AND block_id+blocks-1
new 4: and 387 between block_id AND block_id+blocks-1
no rows selected
不属于任何对象
SQL> select * from dba_free_space where file_id=5;
TABLESPACE_NAME FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------------------------------ ---------- ---------- ---------- ---------- ------------
ZHUO 5 384 4158652416 507648 5
ZHUO 5 508032 1206910976 147328 5
进一步有free视图确认。387在未格式化块里面
- 从asm拷贝正常的空块到os
这里选择的空块有2个原则:1、必须是空块;2、尽量选择临近的空块,这样rdba好计算
SQL> @file
Enter value for asm_datafile_name: +datadg/orcl/datafile/ZHUO.266.1024792071
Enter value for block_id_in_datafile: 386
Enter value for number_of_blocks: 1
Enter value for filesystem_filename: /tmp/386.dbf
PL/SQL procedure successfully completed.
[root@11gasm tmp]# ll
total 52
-rw-r----- 1 grid oinstall 16384 Nov 10 20:25 386.dbf
drwx------ 2 gdm gdm 4096 Nov 10 20:13 orbit-gdm
drwx------ 2 grid oinstall 4096 Oct 31 2019 pulse-1a7xzG3NlK0S
drwx------. 2 root root 4096 Oct 31 2019 pulse-5oFdltOC75CE
drwx------. 2 gdm gdm 4096 Nov 10 20:13 pulse-s8dLH8uq2bYS
drwx------ 2 oracle oinstall 4096 Oct 31 2019 pulse-zHzcYhCkV3tu
[root@11gasm tmp]# chown oracle:oinstall 386.dbf
bbed查看块结构
[oracle@11gasm ~]$ bbed
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Tue Nov 10 20:55:39 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/tmp386.dbf'
BBED-00303: unable to open file '/tmp386.dbf'
BBED> set filename '/tmp/386.dbf'
FILENAME /tmp/386.dbf
BBED> map /v
File: /tmp/386.dbf (0)
Block: 1 Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (00)
BBED> v
DBVERIFY - Verification starting
FILE = /tmp/386.dbf
BLOCK = 1
Block 1 is corrupt
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Bad header found during verification
Data in bad block:
type: 0 format: 2 rdba: 0x00000182
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0xa682
computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 1
Total Blocks Influx : 0
Message 531 not found; product=RDBMS; facility=BBED
可以看到,拷贝出来正常的块,在verify校验的时候,报错是坏块。具体原因位置,所以asm不能在os层面由此方法判断是否为坏块。
----map也不能看结构。如果结构忘记,此处我们就可以参照其他正常好的快结构,来修改
例如:
BBED> map /v
File: /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf (5)
Block: 231 Dba:0x014000e7
------------------------------------------------------------
KTB Data Block (Table/Cluster)
struct kcbh, 20 bytes @0
ub1 type_kcbh @0
ub1 frmt_kcbh @1
ub1 spare1_kcbh @2
ub1 spare2_kcbh @3
ub4 rdba_kcbh @4 ----rdba地址
。。。。。。。。
BBED> p rdba_kcbh
ub4 rdba_kcbh @4 0x014000e7
BBED> d /v offset 4 count 4
File: /u01/app/oracle/oradata/ZHUO/datafile/o1_mf_zhuo_gxdcfr5s_.dbf (5)
Block: 231 Offsets: 4 to 7 Dba:0x014000e7
-------------------------------------------------------
e7004001 l ..@.
<16 bytes per line>
---注意字节序。
rdba的偏移量为offset 4。
4. 修改asm中的rdba
BBED> show all
FILE# 0
BLOCK# 1
OFFSET 0
DBA 0x00000000 (0 0,1)
FILENAME /tmp/386.dbf
BIFILE bifile.bbd
LISTFILE
BLOCKSIZE 8192
MODE Browse
EDIT Unrecoverable
IBASE Dec
OBASE Dec
WIDTH 80
COUNT 512
LOGFILE log.bbd
SPOOL No
BBED> d /v offset 4 count 4
File: /tmp/386.dbf (0)
Block: 1 Offsets: 4 to 7 Dba:0x00000000
-------------------------------------------------------
82010000 l ....
<16 bytes per line>
BBED> m /x 83 offset 4 ---修改值为82+1,因为是上一个相邻的块拷贝
BBED-00215: editing not allowed in BROWSE mode
BBED> set mode edit
MODE Edit
BBED> m /x 83 offset 4
File: /tmp/386.dbf (0)
Block: 1 Offsets: 4 to 7 Dba:0x00000000
------------------------------------------------------------------------
83010000
<32 bytes per line>
BBED> sum apply
Check value for File 0, Block 1:
current = 0xa683, required = 0xa683
BBED> v
DBVERIFY - Verification starting
FILE = /tmp/386.dbf
BLOCK = 1
Block 1 is corrupt
Corrupt block relative dba: 0x00000001 (file 0, block 1)
Bad header found during verification
Data in bad block:
type: 0 format: 2 rdba: 0x00000183
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0xa683
computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing (Index): 0
Total Blocks Empty : 0
Total Blocks Marked Corrupt : 1
Total Blocks Influx : 0
Message 531 not found; product=RDBMS; facility=BBED
----依然报坏块,但是我们确定修改正确了,所以直接从os拷贝回asm。
- os块拷贝回asm
SQL> @asm
Enter value for filesystem_filename: /tmp/386.dbf
Enter value for asm_datafile_name: +datadg/orcl/datafile/ZHUO.266.1024792071
Enter value for block_id_in_datafile: 387
PL/SQL procedure successfully completed.
- 坏块检测
[oracle@11gasm ~]$ dbv file='+datadg/orcl/datafile/ZHUO.266.1024792071' userid=sys/oracle
DBVERIFY: Release 11.2.0.4.0 - Production on Tue Nov 10 21:04:03 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = +datadg/orcl/datafile/ZHUO.266.1024792071
DBVERIFY - Verification complete
Total Pages Examined : 655360
Total Pages Processed (Data) : 243
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 139
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 654978
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 0 (0.0)
坏块修复。。
allocate extent
- 查找受影响的块并验证其是否属于任何段的
SQL> SELECT e.owner, e.segment_type, e.segment_name, e.partition_name, c.file#
2 , greatest(e.block_id, c.block#) corr_start_block#
3 , least(e.block_id+e.blocks-1, c.block#+c.blocks-1) corr_end_block#
4 , least(e.block_id+e.blocks-1, c.block#+c.blocks-1)
5 - greatest(e.block_id, c.block#) + 1 blocks_corrupted
6 , null description
7 FROM dba_extents e, v$database_block_corruption c
8 WHERE e.file_id = c.file#
9 AND e.block_id <= c.block# + c.blocks - 1
10 AND e.block_id + e.blocks - 1 >= c.block#
11 UNION
12 SELECT s.owner, s.segment_type, s.segment_name, s.partition_name, c.file#
13 , header_block corr_start_block#
14 , header_block corr_end_block#
15 , 1 blocks_corrupted
16 , 'Segment Header' description
17 FROM dba_segments s, v$database_block_corruption c
18 WHERE s.header_file = c.file#
19 AND s.header_block between c.block# and c.block# + c.blocks - 1
20 UNION
21 SELECT null owner, null segment_type, null segment_name, null partition_name, c.file#
22 , greatest(f.block_id, c.block#) corr_start_block#
23 , least(f.block_id+f.blocks-1, c.block#+c.blocks-1) corr_end_block#
24 , least(f.block_id+f.blocks-1, c.block#+c.blocks-1)
25 - greatest(f.block_id, c.block#) + 1 blocks_corrupted
26 , 'Free Block' description
27 FROM dba_free_space f, v$database_block_corruption c
28 WHERE f.file_id = c.file#
29 AND f.block_id <= c.block# + c.blocks - 1
30 AND f.block_id + f.blocks - 1 >= c.block#
31 ORDER BY file#, corr_start_block#;
OWNER SEGMENT_TYPE SEGMENT_NAME PARTITION_NAME FILE# CORR_START_BLOCK# CORR_END_BLOCK# BLOCKS_CORRUPTED DESCRIPTION
-------------------- ------------------ ------------------------------ ------------------------------ ---------- ----------------- --------------- ---------------- --------------
5 387 387 1 Free Block
- 以非 SYS 或 SYSTEM(用户)的用户身份创建一个虚拟表
SQL> conn zhuo/zhuo
ERROR:
ORA-28002: the password will expire within 6 days
Connected.
SQL> create table s (
2 n number,
3 c varchar2(4000)
4 ) nologging tablespace zhuo ;
Table created.
SQL> select segment_name,tablespace_name from user_segments
2 where segment_name='S';
no rows selected
SQL> select table_name ,tablespace_name from user_tables where table_name='S' ;
TABLE_NAME TABLESPACE_NAME
------------------------------ ------------------------------
S ZHUO
- 在虚拟表上创建触发器,一旦重新使用坏块,该触发器便会引发异常
SQL> conn / as sysdba
Connected.
SQL> CREATE OR REPLACE TRIGGER corrupt_trigger
2 AFTER INSERT ON zhuo.s
3 REFERENCING OLD AS p_old NEW AS new_p
4 FOR EACH ROW
5 DECLARE
6 corrupt EXCEPTION;
7 BEGIN
8 IF (dbms_rowid.rowid_block_number(:new_p.rowid)=&blocknumber)
9 and (dbms_rowid.rowid_relative_fno(:new_p.rowid)=&filenumber) THEN
10 RAISE corrupt;
11 END IF;
12 EXCEPTION
13 WHEN corrupt THEN
14 RAISE_APPLICATION_ERROR(-20000, 'Corrupt block has been formatted');
15 END;
16 /
Enter value for blocknumber: 387
old 8: IF (dbms_rowid.rowid_block_number(:new_p.rowid)=&blocknumber)
new 8: IF (dbms_rowid.rowid_block_number(:new_p.rowid)=387)
Enter value for filenumber: 5
old 9: and (dbms_rowid.rowid_relative_fno(:new_p.rowid)=&filenumber) THEN
new 9: and (dbms_rowid.rowid_relative_fno(:new_p.rowid)=5) THEN
Trigger created.
- 为受影响的数据文件中的表分配空间。
SQL> Select BYTES from dba_free_space where file_id=5 and 387 between block_id and block_id + blocks -1;
BYTES
----------
65536
SQL> alter table zhuo.s allocate extent (SIZE 65536);
Table altered.
。。。。。。
SQL> SELECT tablespace_name, segment_type, owner, segment_name
2 FROM dba_extents
3 WHERE file_id = &AFN
4 and &BL between block_id AND block_id+blocks-1;
Enter value for afn: 5
old 3: WHERE file_id = &AFN
new 3: WHERE file_id = 5
Enter value for bl: 387
old 4: and &BL between block_id AND block_id+blocks-1
new 4: and 387 between block_id AND block_id+blocks-1
no rows selected
。。。
SQL> BEGIN
2 for i in 1..10000 loop
3 EXECUTE IMMEDIATE 'alter table zhuo.s allocate extent (SIZE 65536)';
4 end loop;
5 end ;
6 /
BEGIN
*
ERROR at line 1:
ORA-01653: unable to extend table ZHUO.S by 128 in tablespace ZHUO
ORA-06512: at line 3
原则:继续分配空间直到坏块成为zhuo.s 的一部分 — 使用以下查询进行检查:
SQL> SELECT tablespace_name, segment_type, owner, segment_name
2 FROM dba_extents
3 WHERE file_id = &AFN
4 and &BL between block_id AND block_id+blocks-1;
Enter value for afn: 5
old 3: WHERE file_id = &AFN
new 3: WHERE file_id = 5
Enter value for bl: 387
old 4: and &BL between block_id AND block_id+blocks-1
new 4: and 387 between block_id AND block_id+blocks-1
TABLESPACE_NAME SEGMENT_TYPE OWNER SEGMENT_NAME
------------------------------ ------------------ -------------------- ------------------------------
ZHUO TABLE ZHUO S
SQL> select * from dba_free_space;
TABLESPACE_NAME FILE_ID BLOCK_ID BYTES BLOCKS RELATIVE_FNO
------------------------------ ---------- ---------- ---------- ---------- ------------
SYSTEM 1 46336 354418688 43264 1
SYSTEM 1 45784 327680 40 1
SYSAUX 2 22144 447741952 54656 2
SYSAUX 2 21992 196608 24 2
UNDOTBS1 3 44672 1048576 128 3
UNDOTBS1 3 44416 1048576 128 3
UNDOTBS1 3 43648 5242880 640 3
UNDOTBS1 3 4480 319815680 39040 3
UNDOTBS1 3 3104 786432 96 3
UNDOTBS1 3 3088 65536 8 3
UNDOTBS1 3 2048 7340032 896 3
UNDOTBS1 3 440 589824 72 3
UNDOTBS1 3 408 196608 24 3
UNDOTBS1 3 280 65536 8 3
UNDOTBS1 3 168 65536 8 3
USERS 4 128 4194304 512 4
- 向虚拟表中插入数据以格式化块
SQL> BEGIN
2 FOR i IN 1..10000000 LOOP
3 INSERT /*+ APPEND */ INTO zhuo.s select i, lpad('REFORMAT',3092, 'R') from dual;
4 commit ;
5 END LOOP;
6 END;
7 /
BEGIN
*
ERROR at line 1:
ORA-20000: Corrupt block has been formatted
ORA-06512: at "SYS.CORRUPT_TRIGGER", line 10
ORA-04088: error during execution of trigger 'SYS.CORRUPT_TRIGGER'
ORA-06512: at line 3
- 通过运行 DBV 和 Rman 备份确定数据文件中的坏块情况
[oracle@11gasm ~]$ dbv file=+datadg/orcl/datafile/ZHUO.266.1024792071 userid=sys/oracle
DBVERIFY: Release 11.2.0.4.0 - Production on Wed Nov 11 05:01:42 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = +datadg/orcl/datafile/ZHUO.266.1024792071
DBVERIFY - Verification complete
Total Pages Examined : 655360
Total Pages Processed (Data) : 654079
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 1279
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 2
Total Pages Marked Corrupt : 0
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 0 (0.0)
- 删除第 4 步中创建的虚拟表
SQL> drop table zhuo.s purge;
Table dropped.
- 执行手动日志切换和检查点
执行两次日志切换和检查点,以便将在内存中格式化的块写入到磁盘并使 dbverify 不再报告错误
SQL> alter system switch logfile;
System altered.
SQL> alter system checkpoint;
System altered.
- 删除第 6 步中创建的触发器
SQL> select * from dba_objects where object_name='CORRUPT_TRIGGER';
no rows selected
resize
太简单了,不测了。
参考
dbms拷贝asm参考:https://www.xifenfei.com/2015/04/dbms_diskgroup%E6%8B%B7%E8%B4%9Dblockdatafile.html
坏块修复参考:
http://www.killdb.com/2020/03/19/oracle-backup-failed-due-to-corrupt-block/
http://www.killdb.com/2012/11/07/%e5%a6%82%e4%bd%95%e4%bf%ae%e5%a4%8d%e6%9c%aa%e6%a0%bc%e5%bc%8f%e5%8c%96%e7%9a%84%e5%9d%8f%e5%9d%97%ef%bc%9f/
如何格式化不属于任何段的损坏块 (Doc ID 1526163.1)
How to reformat corrupt blocks which are not part of any segment?
$ dbv file=/oracle/SID/system_1/system.data1
DBVERIFY: Release 10.2.0.5.0 - Production on Fri Jul 11 08:04:18 2014
Copyright © 1982, 2007, Oracle. All rights reserved.
DBVERIFY - Verification starting : FILE = /oracle/SID/system_1/system.data1
Block Checking: DBA = 67121421, Block Type = Save undo data block
ERROR: SAVE Undo Block Corrupted. Error Code = 50
kts4subck: record (3) seq# (0), split flag (0)
and total pieces(0)
…
REM Many such corruptions.
select * from v$database_block_corruption order by 1,2,3,4 ;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
1 12557 1 1 UNKNOWN
1 12561 1 1 UNKNOWN
1 12589 1 1 UNKNOWN
1 12593 1 1 UNKNOWN
…
REM Verify that you don’t have table with a name TEST1.
drop table test1;
create table test1 (n number, v varchar2(2048)) tablespace system;
REM You may have to disable auto extend. Check all the files for that tablespace.
select file_id, autoextensible from dba_data_files where tablespace_name=‘SYSTEM’;
REM If there any file with auto extend on, switch it off.
alter database datafile auto extend off ;
REM
select count(*) from dba_Free_space where tablespace_name=‘SYSTEM’;
set serveroutput on size 100000
REM following PL/SQL block allocate space in the tablespace. <<< 用下面这个PLSQL块
declare
begin
for c1 in (
select bytes from dba_free_space
where tablespace=‘SYSTEM’
order by bytes desc )
loop
dbms_output.put_line (‘alter table test1 allocate extent ( size ‘|| c1.bytes ||’)’ );
execute immediate ‘alter table test1 allocate extent ( size ‘|| c1.bytes ||’)’;
end loop;
end;
/
alter table test1 allocate extent ( size 99426304)
alter table test1 allocate extent ( size 71548928)
alter table test1 allocate extent ( size 38182912)
…
REM In our case, corruption in file_id=1
select distinct f.file_id, f.block_id, f.bytes, f.blocks
from dba_free_space f join v$database_block_corruption c
on (c.block# between f.block_id and f.block_id + f.blocks -1
and f.file_id =c.file#)
where f.file_id=1
/
no rows selected
select * from dba_free_space where tablespace_name=‘SYSTEM’ order by bytes
/
no rows selected.
alter table test1 nologging;
– fill up the table until there are space errors. <<< 把空间填满
declare
begin
while true
loop
insert /*+ append */ into test1
select n, lpad(n, 1900, ‘x’) from ( select level n from dual connect by level <=100000);
commit;
end loop;
end;
/
Finally, validate the data file.
rman target /
rman> backup validate check logical datafile 1; <<< 验证
– succeeded
drop table test1;
<<<如果表空间有多个文件,为了覆盖到坏块,需要指定文件,但是又担心文件过大,可以设置最大大小。参考下列语法。
alter table test1 allocate extent ( size datafile ‘’);
Also alter the table to avoid further allocations.
select count(*) extcnt from dba_extents where segment_name=‘TEST11’
alter table test1 storage (maxextents ) ;