场景描述:
在undo表空间丢失后,可能导致数据库无法启动。在解决了undo数据文件丢失后做了一个总结。
已知无法启动的场景:
当有事务没有提交,然后异常停止数据库(shutdown abort或服务器停机),会导致数据库无法启动。
说明:在 oracle11.2.0.3 上做的验证。预备知识
1 block校验
1.1 checksum校验
将数据块以双字节分组,进行异或运算,所得的值为0,则满足checksum校验。如果不为0则表示该块有问题。
当对数据块进行修改后,需要重新调整,使checksum为0。
调整方法:
先计算checksum,然后和chkval_kcbh进行异或,并将值赋给chkval_kcbh。
[oracle@testlfy ~]$ bbed filename='/opt/oracle/oradata/orcl/system01.dbf' password=blockedit mode=edit BBED: Release 2.0.0.0.0 - Limited Production on Fri Feb 28 01:41:53 2014 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. ************* !!! For Oracle Internal Use only !!! *************** BBED> set block 225 BLOCK# 225
BBED> p kcbh struct kcbh, 20 bytes @0 ub1 type_kcbh @0 0x06 ub1 frmt_kcbh @1 0xa2 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x004000e1 ub4 bas_kcbh @8 0x0010abf2 ub2 wrp_kcbh @12 0x0000 ub1 seq_kcbh @14 0x01 ub1 flg_kcbh @15 0x06 (KCBHFDLC, KCBHFCKV) ub2 chkval_kcbh @16 0x8724 ub2 spare3_kcbh @18 0x0000
BBED> p kcbh.chkval_kcbh ub2 chkval_kcbh @16 0x8724
BBED> dump File: /opt/oracle/oradata/orcl/system01.dbf (0) Block: 225 Offsets: 16 to 527 Dba:0x00000000 ------------------------------------------------------------------------ 24870000 01000000 0f000000 f1ab1000 00000000 01f80300 00000000 00002100 18000000 25024000 22000f00 01200000 f2ab1000 00011500 02003c00 fa0f811c 811c0000 15008e1f fb100300 04000500 06000700 08000900 0a00ffff fa0f3b10 ae157115 2f15ed14 ab146914 2714e613 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <32 bytes per line>
BBED> m /x 2488 File: /opt/oracle/oradata/orcl/system01.dbf (0) Block: 225 Offsets: 16 to 527 Dba:0x00000000 ------------------------------------------------------------------------ 24880000 01000000 0f000000 f1ab1000 00000000 01f80300 00000000 00002100 18000000 25024000 22000f00 01200000 f2ab1000 00011500 02003c00 fa0f811c 811c0000 15008e1f fb100300 04000500 06000700 08000900 0a00ffff fa0f3b10 ae157115 2f15ed14 ab146914 2714e613 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <32 bytes per line>
BBED> p kcbh.chkval_kcbh ub2 chkval_kcbh @16 0x8724 |
如果通过bbed,在修改完成后,直接运行sum apply是checksum校验。
BBED> sum apply Check value for File 0, Block 225: current = 0x8724, required = 0x8724 |
1.2 tailchk校验
tailchk = kcbh.bas_kcbh(后四位)+ kcbh.type_kcbh + kcbh.seq_kcbh
BBED> p tailchk ub4 tailchk @8188 0xabf20601
BBED> p kcbh struct kcbh, 20 bytes @0 ub1 type_kcbh @0 0x06 ub1 frmt_kcbh @1 0xa2 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x004000e1 ub4 bas_kcbh @8 0x0010abf2 ub2 wrp_kcbh @12 0x0000 ub1 seq_kcbh @14 0x01 ub1 flg_kcbh @15 0x06 (KCBHFDLC, KCBHFCKV) ub2 chkval_kcbh @16 0x8724 ub2 spare3_kcbh @18 0x0000 |
1.3 事务校验
事务校验主要关注ITL槽位状态和数据行lock状态。
ITL槽事务状态
BBED> p ktbbh struct ktbbh, 72 bytes @20 ub1 ktbbhtyp @20 0x01 (KDDBTDATA) union ktbbhsid, 4 bytes @24 ub4 ktbbhsg1 @24 0x0000004a ub4 ktbbhod1 @24 0x0000004a struct ktbbhcsc, 8 bytes @28 ub4 kscnbas @28 0x0010b84e ub2 kscnwrp @32 0x0000 sb2 ktbbhict @36 2 ub1 ktbbhflg @38 0x02 (NONE) ub1 ktbbhfsl @39 0x00 ub4 ktbbhfnx @40 0x00000000 struct ktbbhitl[0], 24 bytes @44 struct ktbitxid, 8 bytes @44 ub2 kxidusn @44 0x0001 ub2 kxidslt @46 0x000b ub4 kxidsqn @48 0x0000026b struct ktbituba, 8 bytes @52 ub4 kubadba @52 0x014001a7 ub2 kubaseq @56 0x007d ub1 kubarec @58 0x37 ub2 ktbitflg @60 0x8000 (KTBFCOM) union _ktbitun, 2 bytes @62 sb2 _ktbitfsc @62 0 ub2 _ktbitwrp @62 0x0000 ub4 ktbitbas @64 0x0010afe9 struct ktbbhitl[1], 24 bytes @68 struct ktbitxid, 8 bytes @68 ub2 kxidusn @68 0x0001 ub2 kxidslt @70 0x001b ub4 kxidsqn @72 0x00000271 struct ktbituba, 8 bytes @76 ub4 kubadba @76 0x0140042d ub2 kubaseq @80 0x0092 ub1 kubarec @82 0x03 ub2 ktbitflg @84 0x2001 (KTBFUPB) union _ktbitun, 2 bytes @86 sb2 _ktbitfsc @86 0 ub2 _ktbitwrp @86 0x0000 ub4 ktbitbas @88 0x0010b84f |
ktbitflg说明:
前4个bit的含义
KTBFCOM BIT 0x8000 transaction is committed
KTBFIBI BIT 0x4000 rollback of this uba gives a BI of the itl
KTBFUPB BIT 0x2000 commit time is upper bound
* For a cleaned-out committed itl, 0x2000 indicates the commit time is an upperbound time.
* For a uncleaned-out active itl, 0x2000 indicates that the itl has been delayed-logged cleaned out with an scn base set, row locks are not cleared
KTBFTAC BIT 0x1000 this xac is active as of ktbbhcsc
前四个bit都为0,表示事务是活动的,或延迟清理的。
剩下的12bit表示被锁定的行数
KTBFLKC BIT 0x0FFF lock count mask for flag
行事务状态
在2c后面的字节就是行上的ITL操作,ITL槽上状态置为提交时,这个标记为要置为0。
BBED> p *kdbr[10] rowdata[6616] ------------- ub1 rowdata[6616] @7485 0x2c
BBED> dump File: /opt/oracle/oradata/orcl/system01.dbf (0) Block: 665 Offsets: 7485 to 7996 Dba:0x00000000 ------------------------------------------------------------------------ 2c000a03 c2052502 c10202c1 020fce64 64646464 64646464 64646464 64018001 8002c115 02c10220 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 01802c00 0a03c205 1302c102 02c1020f ce646464 64646464 64646464 64646401 8002c102 02c11502 c102202d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d01 802c000a 03c2043f 02c10202 c10202c5 1502c102 018002c3 0204c304 011e202d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d01 802c000a 03c2043d 02c10202 c1020fce 64646464 64646464 64646464 64640180 02c10202 c11503c2 0c12202d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d01 802c000a 03c2040b 02c10201 800fce64 64646464 64646464 64646464 64018002 c10202c1 0b02c102 202d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d01802c 000a03c2 035c02c1 0202c102 05c4051e 324402c1 02018002 c30204c3 02010220 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 01802c00 0a03c203 5802c102 02c10206 c50a6464 646402c1 02018002 c11502c1 04202d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d2d2d 2d2d0180 2c000a03 c2035702 c1020180 0fce6464
<32 bytes per line>
BBED> |
1.4 坏块标识
在块头中,seq_kcbh(占用1字节,块头偏移14)有着特殊的含义,如果该值为0xff,则表示该块被标记为corruption。
查询数据库坏块
select * from v$database_block_corruption; |
2 回滚段
2.1 回滚段信息
SQL> desc undo$ Name Null? Type ----------------------------------------- -------- ---------------------------- US# NOT NULL NUMBER NAME NOT NULL VARCHAR2(30) USER# NOT NULL NUMBER FILE# NOT NULL NUMBER BLOCK# NOT NULL NUMBER SCNBAS NUMBER SCNWRP NUMBER XACTSQN NUMBER UNDOSQN NUMBER INST# NUMBER STATUS$ NOT NULL NUMBER TS# NUMBER UGRP# NUMBER KEEP NUMBER OPTIMAL NUMBER FLAGS NUMBER SPARE1 NUMBER SPARE2 NUMBER SPARE3 NUMBER SPARE4 VARCHAR2(1000) SPARE5 VARCHAR2(1000) SPARE6 DATE |
Status值说明:
1 = INVALID
2 = AVAILABLE
3 = IN USE
4 = OFFLINE
5 = NEED RECOVERY
6 = PARTLY AVAILABLE (contains in-doubt txs)
当出现删除回滚表空间,出现错误提示“ORA-01548: active rollback segment '_SYSSMU1_1240252155$' found, terminate”,可以将status修改为1。将回滚段置为非法状态。这样再配合其他操作,可以删除回滚表空间。
2.2 查找回滚段所在数据块
要想删除回滚段状态,需要知道回滚段存储的位置。
通过查询SQL,可以知道rollback segment所在的数据块。
select t.name, t.status$, t.file#, t.block#, dbms_rowid.rowid_relative_fno(rowid) file#, dbms_rowid.rowid_block_number(rowid) blk# from undo$ t where file# = 3; |
3 跟踪不能启动过程
通过下面的命令,生成跟踪文件,在文件中查找数据库不能正常启动的原因。
startup mount oradebug setmypid oradebug EVENT 10046 TRACE NAME CONTEXT FOREVER, LEVEL 12 oradebug TRACEFILE_NAME alter database open; |
一般不能启动,原因是事务没有完成
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/720091/viewspace-1096615/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/720091/viewspace-1096615/