19c ora-600 internal error code [kkshhcdel:wrong-bucket]
Oracle 19c(19.9) RAC on linux, 应用执行一个update sql时报错ora-600, 错误日志如下
ORA-00600: internal error code, arguments: [kkshhcdel:wrong-bucket], [0x58445A790], [0x000000000], [0x67DFB5820], [], [], [], [], [], [], [], []
----- Current SQL Statement for this session (sql_id=1jkhptdvurr4q) -----
update XXX_LOG set sendstatus=:1 ,ERROR_INFO=:2 ,process_time=sysdate where rowid = chartorowid(:3 )
[TOC00003-END]
[TOC00004]
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
...
dbgePostErrorKGE()+ call dbgexProcessError() 7FFFF7F436D0 7FFFF7F488A0
1853 000000001 000000000
000000000 ? 000000082 ?
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 7FFFF7F839A0 7FFFF7E0AA88
71 000000258 000000000 ?
000000000 ? 000000082 ?
kgeadse()+447 call dbkePostKGE_kgsf() 7FFFF7F839A0 7FFFF7E0AA88
000000258 000000000 ?
000000000 ? 000000082 ?
kgerinv_internal()+ call kgeadse() 7FFFF7F839A0 ? 7FFFF7E0AA88
44 000000258 ? 013D772D0
000000000 000000003
kgerinv()+40 call kgerinv_internal() 7FFFF7F839A0 ? 7FFFF7E0AA88 ?
000000258 ? 013D772D0 ?
000000000 ? 000000003 ?
kgeasnmierr()+146 call kgerinv() 7FFFF7F839A0 ? 7FFFF7E0AA88 ?
000000258 ? 013D772D0 ?
000000000 ? 000000003 ?
kkshhcdel()+1375 call kgeasnmierr() 7FFFF7F839A0 ? 7FFFF7E0AA88 ?
000000258 ? 013D772D0 ?
000000002 58445A790
kksfbc()+17471 call kkshhcdel() 7FFFF7F839A0 ? 7FFFF7E0AA88 ?
58445A7E0 ? 67DFB5800 ?
000000002 ? 58445A790 ?
opiexe()+2895 call kksfbc() 7FFFF268C4A8 7FFFF7E0AA88 ?
58445A7E0 ? 67DFB5800 ?
000000002 ? 58445A790 ?
kpoal8()+2387 call opiexe() 000000049 7FFFF7E0AA88 ?
7FFFFFFF7FE0 67DFB5800 ?
000000002 ? 58445A790 ?
opiodr()+1202 call kpoal8() 00000005E 000000026
7FFFFFFFBA40 67DFB5800 ?
000000002 ? 58445A790 ?
kkshhcdel – kernel compile shared objects (cursor) cursor hash table hash chain child delete
kksfbc – kernel compile shared objects (cursor) find bound cursor
An Ora-600 [kkshindel:wrong-bucket] might have been seen if we have concurrent access to the same cursor across multiple sessions. This has now be addressed.
应该是在SQL解析阶段,多个session一致性访问相同的sql cursor时, 获取了hash chain后,遍历hash table时,查找hash bucket时出错。还是发生在shared pool内存中, 判断问题是SQL级,影响该SQL, 建议尝试flush shared pool。
目前oracle中未发现完全匹配bug,但和Bug 31213034 相似. 本案例在flush shared pool恢复正常。
BTW, 19c RU选择时,建议19.11以上,相对问题少一些。
ORA-00600: internal error code, arguments: [ktubko_1] Instance terminated
前不久广州某客户Oracle 11.2.0.2 on Linux一套虚机环境,在线扩容了cpu和memory后数据库出现在ora-600 [ktubko_1] 反复重启, 一个事务回滚相关的内部错误,当然在线修改硬件资源这个操作是不建议的,这里记录一下ora-600 [ktubko_1]应急处理方法。
ktubko – kernel transaction undo get undo record to rollback transaction, non-cr only
DB Alert Log
Mon Oct 10 09:41:09 2022
Dumping diagnostic data in directory=[cdmp_20221010094109], requested by (instance=1, osid=77246 (SMON)), summary=[incident=55874085].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORACLE Instance anbob01 (pid = 23) - Error 600 encountered while recovering transaction (2489, 1) on object 389809.
Errors in file /oracle/diag/rdbms/anbob/anbob01/trace/anbob01_smon_77246.trc:
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
Mon Oct 10 09:41:11 2022
Sweep [inc][55874085]: completed
Sweep [inc2][55874085]: completed
Mon Oct 10 09:41:26 2022
Thread 1 advanced to log sequence 71838 (LGWR switch)
Current log# 2 seq# 71838 mem# 0: +DATA/dslrprd1/redo02.log
Mon Oct 10 09:41:26 2022
LNS: Standby redo logfile selected for thread 1 sequence 71838 for destination LOG_ARCHIVE_DEST_2
Mon Oct 10 09:41:37 2022
Archived Log entry 29265 added for thread 1 sequence 71837 ID 0x38838837 dest 1:
Mon Oct 10 09:43:50 2022
Thread 1 advanced to log sequence 71839 (LGWR switch)
Current log# 3 seq# 71839 mem# 0: +DATA/dslrprd1/redo03.log
Mon Oct 10 09:43:50 2022
LNS: Standby redo logfile selected for thread 1 sequence 71839 for destination LOG_ARCHIVE_DEST_2
Mon Oct 10 09:44:00 2022
Archived Log entry 29267 added for thread 1 sequence 71838 ID 0x38838837 dest 1:
Mon Oct 10 09:44:11 2022
Errors in file /oracle/diag/rdbms/anbob/anbob01/trace/anbob01_smon_77246.trc (incident=55874086):
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/anbob/anbob01/incident/incdir_55874086/anbob01_smon_77246_i55874086.trc
Mon Oct 10 09:44:13 2022
PMON (ospid: 77198): terminating the instance due to error 474
Mon Oct 10 09:44:13 2022
opiodr aborting process unknown ospid (110899) as a result of ORA-1092
Mon Oct 10 09:44:13 2022
System state dump requested by (instance=1, osid=77198 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/diag/rdbms/anbob/anbob01/trace/anbob01_diag_77208.trc
Mon Oct 10 09:44:13 2022
opiodr aborting process unknown ospid (115704) as a result of ORA-1092
Mon Oct 10 09:44:13 2022
ORA-1092 : opitsk aborting process
Instance terminated by PMON, pid = 77198
Smon Trace file
*** ACTION NAME:() 2022-10-10 09:32:01.607
SMON[TX-RECO]:about to recover undo segment 1036 status:6 inst:1
SMON[TX-RECO]: mark undo segment 1036 as available status:2 ret:0
SMON[TX-RECO]:about to recover undo segment 1276 status:6 inst:1
Dead transaction 0x04fc.01b.000068f5 recovered by SMON
SMON[TX-RECO]: mark undo segment 1276 as available status:2 ret:0
SMON[TX-RECO]:about to recover undo segment 1303 status:6 inst:1
Dead transaction 0x0517.00e.0000354d recovered by SMON
SMON[TX-RECO]: mark undo segment 1303 as available status:2 ret:0
SMON[TX-RECO]:about to recover undo segment 2489 status:6 inst:2
Incorrect next uba in kturCurrBackoutOneChg while backing out xid: 0x09b9.001.00008634 uba: 0x0881036d.0a9c.35
Undo record:
ktubu redo: slt: 1 rci: 52 opc: 10.22 objn: 389809 objd: 389809 tsn: 13
Undo type: Regular undo Undo type: Last buffer split: No
Tablespace Undo: No
0x00000000
index undo for leaf key operations
KTB Redo
op: 0x03 ver: 0x01
compat bit: 4 (post-11) padding: 1
op: Z
Dump kdilk : itl=42, kdxlkflg=0x1 sdc=0 indexid=0x100df6ea block=0x10cdfc32
(kdxlpu): purge leaf row
key :(19): 0d 38 36 30 30 30 32 32 38 39 34 33 32 38 04 44 53 4c 52
Undo block: tsn 0x5 rdba: 0x881036d
Dump of buffer cache at level 4 for tsn=5, rdba=142672749
BH (0x65fa90938) file#: 34 rdba: 0x0881036d (34/66413) class: 4994 ba: 0x65e03a000
set: 61 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
dbwrid: 0 obj: -1 objn: 0 tsn: 5 afn: 34 hint: f
hash: [0x95ac76658,0x95ac76658] lru: [0x65fa90b98,0x91cfab5e0]
ckptq: [NULL] fileq: [NULL] objq: [0x8ac156690,0x8ac156690] objaq: [0x65fa91618,0x8ac156680]
st: SCURRENT md: NULL tch: 1 atm: 542555381,542562694
flags: affinity_lock
LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
cr pin refcnt: 0 sh pin refcnt: 0
Data block dump: tsn: 0xd rdba: 0x10cdfc32
Dump of buffer cache at level 3 for tsn=13, rdba=281934898
BH (0x65fb68060) file#: 67 rdba: 0x10cdfc32 (67/916530) class: 1 ba: 0x65f290000
set: 65 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
dbwrid: 1 obj: 389809 objn: 389809 tsn: 13 afn: 67 hint: f
hash: [0x90b7cf590,0x90b7cf590] lru: [0x65fb682c0,0x91cfad1e0]
ckptq: [NULL] fileq: [NULL] objq: [0x892283d50,0x892283d50] objaq: [0x892283d40,0x892283d40]
st: XCURRENT md: NULL tch: 1 atm: 542561348,542562739
flags: affinity_lock
LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
cr pin refcnt: 0 sh pin refcnt: 0
buffer tsn: 13 rdba: 0x10cdfc32 (67/916530)
scn: 0x011d.b929e3fd seq: 0x04 flg: 0x04 tail: 0xe3fd0604
frmt: 0x02 chkval: 0xec67 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x000000065F290000 to 0x000000065F292000
...
Block header dump: 0x10cdfc32
Object id on Block? Y
seg/obj: 0x5f2b1 csc: 0x11d.b929e3fd itc: 131 flg: E typ: 2 - INDEX
brn: 0 bdba: 0x10cdfa00 ver: 0x01 opc: 0
inc: 0 exflg: 0
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x041c.002.0002fdf3 0x00c0ed7b.5f77.01 CB-- 0 scn 0x011d.b8c1f170
0x02 0x07ef.008.000510b1 0x08814853.5132.1a C--- 0 scn 0x011d.b8c66bc2
0x03 0x07f3.018.00083fd1 0x08806b85.658e.0f C--- 0 scn 0x011d.b8c66be7
0x04 0x0403.003.0014f1d4 0x00c09259.a7ae.13 C--- 0 scn 0x011d.b8c0a5be
0x05 0x07f3.01d.00084158 0x08806b85.658e.16 C--- 0 scn 0x011d.b8c66c11
0x06 0x0403.00c.0014f07c 0x00c0d7fa.a7b0.01 C--- 0 scn 0x011d.b8c1340b
0x07 0x07f3.019.00084187 0x08806b85.658e.1f C--- 0 scn 0x011d.b8c66c3f
0x08 0x07ff.01a.000fee92 0x08c070e5.96c9.1f C--- 0 scn 0x011d.b8c71373
...
0x28 0x0934.013.0000931e 0x08c088d5.0bb5.23 C-U- 0 scn 0x011d.b8be4f62
0x29 0x09ba.003.000009b4 0x08c00f5c.00df.24 C-U- 0 scn 0x011d.b8be4f92
0x2a 0x09b9.001.00008634 0x0881036d.0a9c.35 C-U- 0 scn 0x0114.bacea0d2
0x2b 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
...
0x7e 0x0886.005.00009a2b 0x08c1d30c.0c13.28 C-U- 0 scn 0x011d.b8be5224
0x7f 0x0ae4.004.00000ab4 0x11c03372.00c9.40 C-U- 0 scn 0x011d.b8be5371
0x80 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x81 0x0afa.010.0000010a 0x11c03d6f.004e.44 C-U- 0 scn 0x011d.b8be5320
0x82 0x0ac7.000.00000904 0x11c03f90.009e.1e C-U- 0 scn 0x011d.b8be5165
0x83 0x0b13.010.000000e7 0x11c4e6ca.0052.0a C-U- 0 scn 0x011d.b8be5873
Leaf block dump
===============
header address 27366329468=0x65f290c7c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 2
kdxconro 93
kdxcofbo 222=0xde
kdxcofeo 2425=0x979
kdxcoavs 2203
kdxlespl 0
kdxlende 0
kdxlenxt 269401599=0x100ebdff
kdxleprv 269401093=0x100ebc05
kdxledsz 6
kdxlebksz 4936
row#0[2614] flag: ------, lock: 0, len=27, data:(6): 0f 6b 2b ca 00 00
...
ORACLE Instance dslrprd11 (pid = 23) - Error 600 encountered while recovering transaction (2489, 1) on object 389809.
*** 2022-10-10 09:32:02.138
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+36 call kgdsdst() 000000000 ? 000000000 ?
7FFFD22095D8 ? 000000001 ?
000000001 ? 000000002 ?
ksedst1()+98 call skdstdst() 000000000 ? 000000000 ?
7FFFD22095D8 ? 000000001 ?
000000000 ? 000000002 ?
ksedst()+34 call ksedst1() 000000000 ? 000000001 ?
7FFFD22095D8 ? 000000001 ?
000000000 ? 000000002 ?
dbkedDefDump()+2741 call ksedst() 000000000 ? 000000001 ?
7FFFD22095D8 ? 000000001 ?
000000000 ? 000000002 ?
ksedmp()+36 call dbkedDefDump() 000000003 ? 000000000 ?
7FFFD22095D8 ? 000000001 ?
000000000 ? 000000002 ?
kturRecoverTxn()+17 call ksedmp() 000000003 ? 000000000 ?
293 7FFFD22095D8 ? 000000001 ?
000000000 ? 000000002 ?
kturRecoverUndoSegm call kturRecoverTxn() 7FFFD220EEB8 ? 000000001 ?
ent()+1202 000000001 ? 000000000 ?
000000010 ? 2200000004 ?
kturRecoverActiveTx call kturRecoverUndoSegm 0000009B9 ? 7FFFD220F058 ?
ns()+3684 ent() 000000001 ? 000000000 ?
00000FFFF ? 100000010 ?
ktprbeg()+323 call kturRecoverActiveTx 000000010 ? 000000000 ?
ns() 0000009B9 ? 000000000 ?
00000FFFF ? 100000010 ?
ktmmon()+13554 call ktprbeg() 000000000 ? 000000000 ?
0000009B9 ? 000000000 ?
00000FFFF ? 100000010 ?
ktmSmonMain()+207 call ktmmon() 0600304B8 ? 000000000 ?
0000009B9 ? 000000000 ?
00000FFFF ? 100000010 ?
ksbrdp()+971 call ktmSmonMain() 0600304B8 ? 000000000 ?
000000000 ? 000000000 ?
00000FFFF ? 100000010 ?
opirip()+618 call ksbrdp() 0600304B8 ? 000000000 ?
000000000 ? 000000000 ?
00000FFFF ? 100000010 ?
opidrv()+598 call opirip() 000000032 ? 000000004 ?
7FFFD2211B08 ? 000000000 ?
00000FFFF ? 100000010 ?
sou2o()+98 call opidrv() 000000032 ? 000000004 ?
7FFFD2211B08 ? 000000000 ?
00000FFFF ? 100000010 ?
opimai_real()+261 call sou2o() 7FFFD2211AE0 ? 000000032 ?
000000004 ? 7FFFD2211B08 ?
00000FFFF ? 100000010 ?
ssthrdmain()+252 call opimai_real() 000000000 ? 7FFFD2211CD0 ?
000000004 ? 7FFFD2211B08 ?
00000FFFF ? 100000010 ?
main()+196 call ssthrdmain() 000000003 ? 7FFFD2211CD0 ?
000000001 ? 000000000 ?
00000FFFF ? 100000010 ?
__libc_start_main() call main() 000000003 ? 7FFFD2211E70 ?
+253 000000001 ? 000000000 ?
00000FFFF ? 100000010 ?
_start()+36 call __libc_start_main() 000A077E4 ? 000000001 ?
7FFFD2211E68 ? 000000000 ?
00000FFFF ? 100000010 ?
--------------------- Binary Stack Dump ---------------------
Dump continued from file: /oracle/diag/rdbms/dslrprd1new/anbob01/trace/anbob01_smon_77246.trc
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+36 call kgdsdst() 000000000 ? 000000000 ?
7FFFD2201058 ? 000000001 ?
000000001 ? 000000002 ?
ksedst1()+98 call skdstdst() 000000000 ? 000000000 ?
7FFFD2201058 ? 000000001 ?
000000000 ? 000000002 ?
ksedst()+34 call ksedst1() 000000000 ? 000000001 ?
7FFFD2201058 ? 000000001 ?
000000000 ? 000000002 ?
dbkedDefDump()+2741 call ksedst() 000000000 ? 000000001 ?
7FFFD2201058 ? 000000001 ?
000000000 ? 000000002 ?
ksedmp()+36 call dbkedDefDump() 000000003 ? 000000002 ?
7FFFD2201058 ? 000000001 ?
000000000 ? 000000002 ?
ksfdmp()+64 call ksedmp() 000000003 ? 000000002 ?
7FFFD2201058 ? 000000001 ?
000000000 ? 000000002 ?
dbgexPhaseII()+1764 call ksfdmp() 000000003 ? 000000002 ?
7FFFD2201058 ? 000000001 ?
000000000 ? 000000002 ?
dbgexProcessError() call dbgexPhaseII() 7FBF06D98710 ? 7FBF06B51798 ?
+2651 7FFFD220D3D0 ? 000000001 ?
000000000 ? 000000002 ?
dbgeExecuteForError call dbgexProcessError() 7FBF06D98710 ? 7FBF06B51798 ?
()+83 000000001 ? 000000000 ?
100000000 ? 000000002 ?
dbgePostErrorKGE()+ call dbgeExecuteForError 7FBF06D98710 ? 7FBF06B51798 ?
2131 () 000000001 ? 000000001 ?
000000000 ? 000000002 ?
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 000000000 ? 7FBF06920498 ?
63 000000258 ? 7FBF06B51798 ?
100000000 ? 000000002 ?
kgeadse()+390 call dbkePostKGE_kgsf() 00B7C7EA0 ? 7FBF06920498 ?
000000258 ? 7FBF06B51798 ?
100000000 ? 000000002 ?
kgerinv_internal()+ call kgeadse() 00B7C7EA0 ? 7FBF06920498 ?
45 000000258 ? 100000000 ?
000000000 ? 000000002 ?
kgerinv()+33 call kgerinv_internal() 00B7C7EA0 ? 7FBF06920498 ?
7FFFD21FD890 ? 000000258 ?
000000000 ? 000000002 ?
kgeasnmierr()+143 call kgerinv() 00B7C7EA0 ? 7FBF06920498 ?
7FFFD21FD890 ? 000000002 ?
000000000 ? 000000002 ?
kturCurrBackoutOneC call kgeasnmierr() 00B7C7EA0 ? 7FBF06920498 ?
hg()+5986 7FFFD21FD890 ? 000000002 ?
000000000 ? 000000000 ?
kturRecoverTxn()+15 call kturCurrBackoutOneC 7FFFD220EBC8 ? 7FFFD220E5D4 ?
227 hg() 000000004 ? 7FFFD220EEB8 ?
7FFFD220EAD0 ? 7FFFD220E5DC ?
kturRecoverUndoSegm call kturRecoverTxn() 7FFFD220EEB8 ? 000000001 ?
ent()+1202 000000001 ? 000000000 ?
000000010 ? 2200000004 ?
kturRecoverActiveTx call kturRecoverUndoSegm 0000009B9 ? 7FFFD220F058 ?
ns()+3684 ent() 000000001 ? 000000000 ?
00000FFFF ? 100000010 ?
ktprbeg()+323 call kturRecoverActiveTx 000000010 ? 000000000 ?
ns() 0000009B9 ? 000000000 ?
00000FFFF ? 100000010 ?
ktmmon()+13554 call ktprbeg() 000000000 ? 000000000 ?
0000009B9 ? 000000000 ?
00000FFFF ? 100000010 ?
ktmSmonMain()+207 call ktmmon() 0600304B8 ? 000000000 ?
0000009B9 ? 000000000 ?
00000FFFF ? 100000010 ?
ksbrdp()+971 call ktmSmonMain() 0600304B8 ? 000000000 ?
000000000 ? 000000000 ?
00000FFFF ? 100000010 ?
opirip()+618 call ksbrdp() 0600304B8 ? 000000000 ?
000000000 ? 000000000 ?
00000FFFF ? 100000010 ?
opidrv()+598 call opirip() 000000032 ? 000000004 ?
7FFFD2211B08 ? 000000000 ?
00000FFFF ? 100000010 ?
sou2o()+98 call opidrv() 000000032 ? 000000004 ?
7FFFD2211B08 ? 000000000 ?
00000FFFF ? 100000010 ?
Note:
kgeasnmierr – kernel generic error assert named internal error
kturCurrBackoutOneChg – kernel transaction undo rollback current read; backout change (rollback)
kturRecoverTxn() kernel transaction undo rollback recover transaction
kturRecoverUndoSegment – kernel transaction undo rollback recover undo segment
kturRecoverActiveTxns – kernel transaction undo rollback recover active transactions
ktprbeg() kernel transaction parallel recovery begin
ktmmon()+13554 kernel transaction transaction monitor (smon) tx monitor background timeout action (smon)
ktmSmonMain()+207 kernel transaction transaction monitor (smon) main smon routine
从trace文件中可以看到是obj# 389809的索引段在level block 的index undo for leaf key operations时purge leaf row出现了不正确的uba,rollback segment编号为2489, 事务表状态为6 PARTLY AVAILABLE( like DBA_ROLLBACK_SEGS.status or undo$.status)。
/* segment status (see KTS.H): */
/* 1 = INVALID, 2 = AVAILABLE, 3 = IN USE, 4 = OFFLINE, 5 = NEED RECOVERY,
* 6 = PARTLY AVAILABLE (contains in-doubt txs)
PARTLY AVAILABLE means it contains active transactions that SMON has not yet finished recovering or it contains in-doubt transactions(分布式事务)
根据ora600 和call stack 不难定位 匹配bug 14604955
Rediscovery Notes
ORA-00600 [ktubko_1] error.Top Call Stack:
… <- kgeasnmierr <- kturCurrBackoutOneChg <- kturRecoverTxnMain <- kturRecoverTxn <- kturRecoverUndoSegment
<- kturRecoverActiveTxns <- ktprbeg <- ktmmon <- ktmSmonMain <- ksbrdp <- opirip <- …
解决方法
对于undo segment问题通常紧急救援是屏蔽rollseg, drop, 重建对象。当然如果有备份建议备份还原。
可以先屏蔽报错中提示的rollback segment 使用_OFFLINE_ROLLBACK_SEGMENTS或 _CORRUPTED_ROLLBACK_SEGMENTS, 需要重启实例。
*********************************************************************
ORACLE will not support a database that has used the
_CORRUPTED_ROLLBACK_SEGMENTS parameter.
*********************************************************************
UNDO SEGMENT
查找UNDO SEGMENT NAME
select segment_name from dba_rollback_segs where segment_id=2489;
-- _SYSSMU2489_xxxxxx$
-- 增加pfile ,启动数据库
*._OFFLINE_ROLLBACK_SEGMENTS=_SYSSMU2489_xxxxxx$
-- 可以dump undo header查看rollseg上的活动事务,通常如果不是非常高的并发,1个undo segment中算法均分是1个ACTIVE TX。 可以dump rollseg header查看。
ALTER SYSTEM DUMP UNDO HEADER '_SYSSMU2489_xxxx$';
当然因为有ACTIVE tx, 可以直接使用_CORRUPTED_ROLLBACK_SEGMENTS.
_OFFLINE_ROLLBACK_SEGMENTS或 _CORRUPTED_ROLLBACK_SEGMENTS参数
If the datafile related to rollback segment get corrupted, the first method go for the recovery from backup.
If your backup is not working at all, so last unsupported option to choose the parameter _OFFLINE_ROLLBACK_SEGMENTS in init.ora. _OFFLINE_ROLLBACK_SEGMENTS is a unsupported init.ora parameter which can allow you to cause logical database corruption. It is rarely required in Oracle7.3 onwards due to deferred transaction recovery.
When you required this parameter to set _OFFLINE_ROLLBACK_SEGMENTS ?
When you start a database, SMON will recover all the uncommitted transaction. To do the recovery it will check the rollback segment header first to read the transaction table. But if any reason SMON is unable to read the header block of any online rollback segments then you required this parameter.
Unreadable rollback segment header may cause for block corruption, datafile corruption or datafile unavailable.If tablespace is online but the rollback segment header block get corrupted, you need this parameter to start the database.Because you have to stop SMON from performing transaction recovery for specific rollback segments. While starting the database SMON will not check the transaction table only for those rollback segments under the parameter _OFFLINE_ROLLBACK_SEGMENTS . You should choose this parameter only when all the transaction related to this rollback segment is INACTIVE.
How to know the transaction is in inactive mode ?
Do the following steps
===============
- Start the database with _OFFLINE_ROLLBACK_SEGMENTS first.
- Find the header file number and header block number from the following query.
select segment_name,header_file,header_block from dba_segments where segment_name like '';
- dump the block with following command.
alter system dump datafile n block n ;
-
observe for the uncommitted transaction from block dump like below. Observe the status field.
If it is ‘9’ means transaction committed. but if the status value is ’10’ then it is an uncommitted transaction.So this rollback segment is not a good candidate for _OFFLINE_ROLLBACK_SEGMENTS. -
After checking the step 4, if you find all the status field is ‘9’, then drop the rollback segment immediately.
-
Stop the database, take the parameter out from init.ora and start the database. So your dictionary is perfect.
you can continue your production job with this database. -
After checking the step 4, if you find any status field is ’10’ , then you should immediately shutdown the database.
Start the database with “_corrupted_rollback_segments”. And put the name of those rollback segment names where did you find the value of status=’10’. -
There is a logical corruption. So you should drop the rollback segment And do a complete export and import of this database to make it supported.
What is _CORRUPTED_ROLLBACK_SEGMENTS ?
_CORRUPTED_ROLLBACK_SEGMENTS is more dangerous parameter than _OFFLINE_ROLLBACK_SEGMENTS.
It basically prevents access to the listed rollback segments headers and assumes all transactions in them are committed. This can very easily cause logical database corruption. Opening any database with any rollback segment listed in this parameter instantly makes the database unsupported. This should
only EVER be used after a full backup has been taken as as part of a last attempt to salvage data from a database.
Let me make a chart for more details DATAFILE TRANSACTION
_offline_rollback_segments ONLINE/OFFLINE/RECOVERY INACTIVE
_corrupted_rollback_segments ONLINE/OFFLINE/RECOVERY ACTIVE
ONLINE ACTIVE * you can use this option, but better option to choose _corrupted_rollback_segments
IMPORTANT: If this corruption happened with system rollback segment, there is no way to use any of this parameter. You must have to recover database from old backup.
Differences between these parameters
There are differences between these two parameters. If the rollback segment is listed in the _offline parameter list, the transaction table is still read accessible. This is important for delayed block cleanout. If a select statement reads a data block with an open ITL which points to the transaction table of the _offline rollback segment, the table is still checked. If the transaction is committed, delayed block clean out occurs. If the transaction is uncommitted, it will generate ORA-1578. It will not allow to select the block.
If the rollback segment is listed in the _corrupted parameter list, the transaction table is not read accessible. All transactions are assumed COMMITTED and delayed block cleanout will occur but there is no consistent read view of uncommitted transactions (logical corruption).
reference
https://www.parnassusdata.com/en/node/206
Related Posts:
Lots of Long transaction caused by database link, and undo hdr show DBA for that slot is 0x00000000
https://www.anbob.com/archives/2738.html
Troubleshooting large amount of undo/redo generated due to dblink usage
https://www.anbob.com/archives/6614.html
Troubleshooting ORA-600 [4187] Undo segment sequence number is exhausted
https://www.anbob.com/archives/6499.html
ORA-00600:[kzrini:!uprofile]
连接数据库报错ORA-00600[kzrini:!uprofile]
,数据库的tab$
表被恶意清空,发现数据库内部存在几个异常对象,存储过程DBMS_SUPPORT_DBMONITORP
和触发器DBMS_SUPPORT_DBMONITOR
,检查了对象内容,创建数据库300天后自动删除tab$
。问题原因是项目同事使用网上私自下载的绿色版数据库安装介质安装的数据库,而这个安装介质里的prvtsupp.plb
文件被恶意篡改,最终导致这个问题的发生。