Oracle 19c(19.9) RAC on linux, 应用执行一个update sql时报错ora-600

19c ora-600 internal error code [kkshhcdel:wrong-bucket]

Oracle 19c(19.9) RAC on linux, 应用执行一个update sql时报错ora-600, 错误日志如下

ORA-00600: internal error code, arguments: [kkshhcdel:wrong-bucket], [0x58445A790], [0x000000000], [0x67DFB5820], [], [], [], [], [], [], [], []

----- Current SQL Statement for this session (sql_id=1jkhptdvurr4q) -----
update XXX_LOG set sendstatus=:1 ,ERROR_INFO=:2 ,process_time=sysdate where rowid = chartorowid(:3 )
[TOC00003-END]

[TOC00004]
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
...
dbgePostErrorKGE()+  call     dbgexProcessError()  7FFFF7F436D0 7FFFF7F488A0
1853                                               000000001 000000000
                                                   000000000 ? 000000082 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   7FFFF7F839A0 7FFFF7E0AA88
71                                                 000000258 000000000 ?
                                                   000000000 ? 000000082 ?
kgeadse()+447        call     dbkePostKGE_kgsf()   7FFFF7F839A0 7FFFF7E0AA88
                                                   000000258 000000000 ?
                                                   000000000 ? 000000082 ?
kgerinv_internal()+  call     kgeadse()            7FFFF7F839A0 ? 7FFFF7E0AA88
44                                                 000000258 ? 013D772D0
                                                   000000000 000000003
kgerinv()+40         call     kgerinv_internal()   7FFFF7F839A0 ? 7FFFF7E0AA88 ?
                                                   000000258 ? 013D772D0 ?
                                                   000000000 ? 000000003 ?
kgeasnmierr()+146    call     kgerinv()            7FFFF7F839A0 ? 7FFFF7E0AA88 ?
                                                   000000258 ? 013D772D0 ?
                                                   000000000 ? 000000003 ?
kkshhcdel()+1375     call     kgeasnmierr()        7FFFF7F839A0 ? 7FFFF7E0AA88 ?
                                                   000000258 ? 013D772D0 ?
                                                   000000002 58445A790
kksfbc()+17471       call     kkshhcdel()          7FFFF7F839A0 ? 7FFFF7E0AA88 ?
                                                   58445A7E0 ? 67DFB5800 ?
                                                   000000002 ? 58445A790 ?
opiexe()+2895        call     kksfbc()             7FFFF268C4A8 7FFFF7E0AA88 ?
                                                   58445A7E0 ? 67DFB5800 ?
                                                   000000002 ? 58445A790 ?
kpoal8()+2387        call     opiexe()             000000049 7FFFF7E0AA88 ?
                                                   7FFFFFFF7FE0 67DFB5800 ?
                                                   000000002 ? 58445A790 ?
opiodr()+1202        call     kpoal8()             00000005E 000000026
                                                   7FFFFFFFBA40 67DFB5800 ?
                                                   000000002 ? 58445A790 ?

kkshhcdel – kernel compile shared objects (cursor) cursor hash table hash chain child delete

kksfbc – kernel compile shared objects (cursor) find bound cursor

An Ora-600 [kkshindel:wrong-bucket] might have been seen if we have concurrent access to the same cursor across multiple sessions. This has now be addressed.

应该是在SQL解析阶段,多个session一致性访问相同的sql cursor时, 获取了hash chain后,遍历hash table时,查找hash bucket时出错。还是发生在shared pool内存中, 判断问题是SQL级,影响该SQL, 建议尝试flush shared pool。

目前oracle中未发现完全匹配bug,但和Bug 31213034 相似. 本案例在flush shared pool恢复正常。

BTW, 19c RU选择时,建议19.11以上,相对问题少一些。

ORA-00600: internal error code, arguments: [ktubko_1] Instance terminated

前不久广州某客户Oracle 11.2.0.2 on Linux一套虚机环境,在线扩容了cpu和memory后数据库出现在ora-600 [ktubko_1] 反复重启, 一个事务回滚相关的内部错误,当然在线修改硬件资源这个操作是不建议的,这里记录一下ora-600 [ktubko_1]应急处理方法。

ktubko – kernel transaction undo get undo record to rollback transaction, non-cr only

DB Alert Log

Mon Oct 10 09:41:09 2022
Dumping diagnostic data in directory=[cdmp_20221010094109], requested by (instance=1, osid=77246 (SMON)), summary=[incident=55874085].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORACLE Instance anbob01 (pid = 23) - Error 600 encountered while recovering transaction (2489, 1) on object 389809.
Errors in file /oracle/diag/rdbms/anbob/anbob01/trace/anbob01_smon_77246.trc:
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
Mon Oct 10 09:41:11 2022
Sweep [inc][55874085]: completed
Sweep [inc2][55874085]: completed
Mon Oct 10 09:41:26 2022
Thread 1 advanced to log sequence 71838 (LGWR switch)
  Current log# 2 seq# 71838 mem# 0: +DATA/dslrprd1/redo02.log
Mon Oct 10 09:41:26 2022
LNS: Standby redo logfile selected for thread 1 sequence 71838 for destination LOG_ARCHIVE_DEST_2
Mon Oct 10 09:41:37 2022
Archived Log entry 29265 added for thread 1 sequence 71837 ID 0x38838837 dest 1:
Mon Oct 10 09:43:50 2022
Thread 1 advanced to log sequence 71839 (LGWR switch)
  Current log# 3 seq# 71839 mem# 0: +DATA/dslrprd1/redo03.log
Mon Oct 10 09:43:50 2022
LNS: Standby redo logfile selected for thread 1 sequence 71839 for destination LOG_ARCHIVE_DEST_2
Mon Oct 10 09:44:00 2022
Archived Log entry 29267 added for thread 1 sequence 71838 ID 0x38838837 dest 1:
Mon Oct 10 09:44:11 2022
Errors in file /oracle/diag/rdbms/anbob/anbob01/trace/anbob01_smon_77246.trc  (incident=55874086):
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/anbob/anbob01/incident/incdir_55874086/anbob01_smon_77246_i55874086.trc
Mon Oct 10 09:44:13 2022
PMON (ospid: 77198): terminating the instance due to error 474
Mon Oct 10 09:44:13 2022
opiodr aborting process unknown ospid (110899) as a result of ORA-1092
Mon Oct 10 09:44:13 2022
System state dump requested by (instance=1, osid=77198 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/diag/rdbms/anbob/anbob01/trace/anbob01_diag_77208.trc
Mon Oct 10 09:44:13 2022
opiodr aborting process unknown ospid (115704) as a result of ORA-1092
Mon Oct 10 09:44:13 2022
ORA-1092 : opitsk aborting process
Instance terminated by PMON, pid = 77198

Smon Trace file

*** ACTION NAME:() 2022-10-10 09:32:01.607
 
SMON[TX-RECO]:about to recover undo segment 1036 status:6 inst:1
SMON[TX-RECO]: mark undo segment 1036 as available status:2 ret:0
SMON[TX-RECO]:about to recover undo segment 1276 status:6 inst:1
Dead transaction 0x04fc.01b.000068f5 recovered by SMON
SMON[TX-RECO]: mark undo segment 1276 as available status:2 ret:0
SMON[TX-RECO]:about to recover undo segment 1303 status:6 inst:1
Dead transaction 0x0517.00e.0000354d recovered by SMON
SMON[TX-RECO]: mark undo segment 1303 as available status:2 ret:0
SMON[TX-RECO]:about to recover undo segment 2489 status:6 inst:2
Incorrect next uba in kturCurrBackoutOneChg while backing out xid: 0x09b9.001.00008634 uba: 0x0881036d.0a9c.35
Undo record:
ktubu redo: slt: 1 rci: 52 opc: 10.22 objn: 389809 objd: 389809 tsn: 13
Undo type:  Regular undo       Undo type:  Last buffer split:  No 
Tablespace Undo:  No 
             0x00000000
index undo for leaf key operations
KTB Redo 
op: 0x03  ver: 0x01  
compat bit: 4 (post-11) padding: 1
op: Z
Dump kdilk : itl=42, kdxlkflg=0x1 sdc=0 indexid=0x100df6ea block=0x10cdfc32
(kdxlpu): purge leaf row
key :(19):  0d 38 36 30 30 30 32 32 38 39 34 33 32 38 04 44 53 4c 52
Undo block: tsn 0x5 rdba: 0x881036d
Dump of buffer cache at level 4 for tsn=5, rdba=142672749
BH (0x65fa90938) file#: 34 rdba: 0x0881036d (34/66413) class: 4994 ba: 0x65e03a000
  set: 61 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
  dbwrid: 0 obj: -1 objn: 0 tsn: 5 afn: 34 hint: f
  hash: [0x95ac76658,0x95ac76658] lru: [0x65fa90b98,0x91cfab5e0]
  ckptq: [NULL] fileq: [NULL] objq: [0x8ac156690,0x8ac156690] objaq: [0x65fa91618,0x8ac156680]
  st: SCURRENT md: NULL tch: 1 atm: 542555381,542562694
  flags: affinity_lock
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
  cr pin refcnt: 0 sh pin refcnt: 0
Data block dump: tsn: 0xd rdba: 0x10cdfc32
Dump of buffer cache at level 3 for tsn=13, rdba=281934898
BH (0x65fb68060) file#: 67 rdba: 0x10cdfc32 (67/916530) class: 1 ba: 0x65f290000
  set: 65 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
  dbwrid: 1 obj: 389809 objn: 389809 tsn: 13 afn: 67 hint: f
  hash: [0x90b7cf590,0x90b7cf590] lru: [0x65fb682c0,0x91cfad1e0]
  ckptq: [NULL] fileq: [NULL] objq: [0x892283d50,0x892283d50] objaq: [0x892283d40,0x892283d40]
  st: XCURRENT md: NULL tch: 1 atm: 542561348,542562739
  flags: affinity_lock
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
  cr pin refcnt: 0 sh pin refcnt: 0
  buffer tsn: 13 rdba: 0x10cdfc32 (67/916530)
  scn: 0x011d.b929e3fd seq: 0x04 flg: 0x04 tail: 0xe3fd0604
  frmt: 0x02 chkval: 0xec67 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x000000065F290000 to 0x000000065F292000
...
Block header dump:  0x10cdfc32
 Object id on Block? Y
 seg/obj: 0x5f2b1  csc: 0x11d.b929e3fd  itc: 131  flg: E  typ: 2 - INDEX
     brn: 0  bdba: 0x10cdfa00 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x041c.002.0002fdf3  0x00c0ed7b.5f77.01  CB--    0  scn 0x011d.b8c1f170
0x02   0x07ef.008.000510b1  0x08814853.5132.1a  C---    0  scn 0x011d.b8c66bc2
0x03   0x07f3.018.00083fd1  0x08806b85.658e.0f  C---    0  scn 0x011d.b8c66be7
0x04   0x0403.003.0014f1d4  0x00c09259.a7ae.13  C---    0  scn 0x011d.b8c0a5be
0x05   0x07f3.01d.00084158  0x08806b85.658e.16  C---    0  scn 0x011d.b8c66c11
0x06   0x0403.00c.0014f07c  0x00c0d7fa.a7b0.01  C---    0  scn 0x011d.b8c1340b
0x07   0x07f3.019.00084187  0x08806b85.658e.1f  C---    0  scn 0x011d.b8c66c3f
0x08   0x07ff.01a.000fee92  0x08c070e5.96c9.1f  C---    0  scn 0x011d.b8c71373
...
0x28   0x0934.013.0000931e  0x08c088d5.0bb5.23  C-U-    0  scn 0x011d.b8be4f62
0x29   0x09ba.003.000009b4  0x08c00f5c.00df.24  C-U-    0  scn 0x011d.b8be4f92
0x2a   0x09b9.001.00008634  0x0881036d.0a9c.35  C-U-    0  scn 0x0114.bacea0d2
0x2b   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
...
0x7e   0x0886.005.00009a2b  0x08c1d30c.0c13.28  C-U-    0  scn 0x011d.b8be5224
0x7f   0x0ae4.004.00000ab4  0x11c03372.00c9.40  C-U-    0  scn 0x011d.b8be5371
0x80   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x81   0x0afa.010.0000010a  0x11c03d6f.004e.44  C-U-    0  scn 0x011d.b8be5320
0x82   0x0ac7.000.00000904  0x11c03f90.009e.1e  C-U-    0  scn 0x011d.b8be5165
0x83   0x0b13.010.000000e7  0x11c4e6ca.0052.0a  C-U-    0  scn 0x011d.b8be5873
Leaf block dump
===============
header address 27366329468=0x65f290c7c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 2
kdxconro 93
kdxcofbo 222=0xde
kdxcofeo 2425=0x979
kdxcoavs 2203
kdxlespl 0
kdxlende 0
kdxlenxt 269401599=0x100ebdff
kdxleprv 269401093=0x100ebc05
kdxledsz 6
kdxlebksz 4936
row#0[2614] flag: ------, lock: 0, len=27, data:(6):  0f 6b 2b ca 00 00
...
ORACLE Instance dslrprd11 (pid = 23) - Error 600 encountered while recovering transaction (2489, 1) on object 389809.

*** 2022-10-10 09:32:02.138
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFFD22095D8 ? 000000001 ?
                                                   000000001 ? 000000002 ?
ksedst1()+98         call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFFD22095D8 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksedst()+34          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFFD22095D8 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbkedDefDump()+2741  call     ksedst()             000000000 ? 000000001 ?
                                                   7FFFD22095D8 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksedmp()+36          call     dbkedDefDump()       000000003 ? 000000000 ?
                                                   7FFFD22095D8 ? 000000001 ?
                                                   000000000 ? 000000002 ?
kturRecoverTxn()+17  call     ksedmp()             000000003 ? 000000000 ?
293                                                7FFFD22095D8 ? 000000001 ?
                                                   000000000 ? 000000002 ?
kturRecoverUndoSegm  call     kturRecoverTxn()     7FFFD220EEB8 ? 000000001 ?
ent()+1202                                         000000001 ? 000000000 ?
                                                   000000010 ? 2200000004 ?
kturRecoverActiveTx  call     kturRecoverUndoSegm  0000009B9 ? 7FFFD220F058 ?
ns()+3684                     ent()                000000001 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
ktprbeg()+323        call     kturRecoverActiveTx  000000010 ? 000000000 ?
                              ns()                 0000009B9 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
ktmmon()+13554       call     ktprbeg()            000000000 ? 000000000 ?
                                                   0000009B9 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
ktmSmonMain()+207    call     ktmmon()             0600304B8 ? 000000000 ?
                                                   0000009B9 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
ksbrdp()+971         call     ktmSmonMain()        0600304B8 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
opirip()+618         call     ksbrdp()             0600304B8 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
opidrv()+598         call     opirip()             000000032 ? 000000004 ?
                                                   7FFFD2211B08 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
sou2o()+98           call     opidrv()             000000032 ? 000000004 ?
                                                   7FFFD2211B08 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
opimai_real()+261    call     sou2o()              7FFFD2211AE0 ? 000000032 ?
                                                   000000004 ? 7FFFD2211B08 ?
                                                   00000FFFF ? 100000010 ?
ssthrdmain()+252     call     opimai_real()        000000000 ? 7FFFD2211CD0 ?
                                                   000000004 ? 7FFFD2211B08 ?
                                                   00000FFFF ? 100000010 ?
main()+196           call     ssthrdmain()         000000003 ? 7FFFD2211CD0 ?
                                                   000000001 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
__libc_start_main()  call     main()               000000003 ? 7FFFD2211E70 ?
+253                                               000000001 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
_start()+36          call     __libc_start_main()  000A077E4 ? 000000001 ?
                                                   7FFFD2211E68 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
 

--------------------- Binary Stack Dump ---------------------

Dump continued from file: /oracle/diag/rdbms/dslrprd1new/anbob01/trace/anbob01_smon_77246.trc
ORA-00600: internal error code, arguments: [ktubko_1], [], [], [], [], [], [], [], [], [], [], []

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFFD2201058 ? 000000001 ?
                                                   000000001 ? 000000002 ?
ksedst1()+98         call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFFD2201058 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksedst()+34          call     ksedst1()            000000000 ? 000000001 ?
                                                   7FFFD2201058 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbkedDefDump()+2741  call     ksedst()             000000000 ? 000000001 ?
                                                   7FFFD2201058 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksedmp()+36          call     dbkedDefDump()       000000003 ? 000000002 ?
                                                   7FFFD2201058 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksfdmp()+64          call     ksedmp()             000000003 ? 000000002 ?
                                                   7FFFD2201058 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbgexPhaseII()+1764  call     ksfdmp()             000000003 ? 000000002 ?
                                                   7FFFD2201058 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbgexProcessError()  call     dbgexPhaseII()       7FBF06D98710 ? 7FBF06B51798 ?
+2651                                              7FFFD220D3D0 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbgeExecuteForError  call     dbgexProcessError()  7FBF06D98710 ? 7FBF06B51798 ?
()+83                                              000000001 ? 000000000 ?
                                                   100000000 ? 000000002 ?
dbgePostErrorKGE()+  call     dbgeExecuteForError  7FBF06D98710 ? 7FBF06B51798 ?
2131                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbkePostKGE_kgsf()+  call     dbgePostErrorKGE()   000000000 ? 7FBF06920498 ?
63                                                 000000258 ? 7FBF06B51798 ?
                                                   100000000 ? 000000002 ?
kgeadse()+390        call     dbkePostKGE_kgsf()   00B7C7EA0 ? 7FBF06920498 ?
                                                   000000258 ? 7FBF06B51798 ?
                                                   100000000 ? 000000002 ?
kgerinv_internal()+  call     kgeadse()            00B7C7EA0 ? 7FBF06920498 ?
45                                                 000000258 ? 100000000 ?
                                                   000000000 ? 000000002 ?
kgerinv()+33         call     kgerinv_internal()   00B7C7EA0 ? 7FBF06920498 ?
                                                   7FFFD21FD890 ? 000000258 ?
                                                   000000000 ? 000000002 ?
kgeasnmierr()+143    call     kgerinv()            00B7C7EA0 ? 7FBF06920498 ?
                                                   7FFFD21FD890 ? 000000002 ?
                                                   000000000 ? 000000002 ?
kturCurrBackoutOneC  call     kgeasnmierr()        00B7C7EA0 ? 7FBF06920498 ?
hg()+5986                                          7FFFD21FD890 ? 000000002 ?
                                                   000000000 ? 000000000 ?
kturRecoverTxn()+15  call     kturCurrBackoutOneC  7FFFD220EBC8 ? 7FFFD220E5D4 ?
227                           hg()                 000000004 ? 7FFFD220EEB8 ?
                                                   7FFFD220EAD0 ? 7FFFD220E5DC ?
kturRecoverUndoSegm  call     kturRecoverTxn()     7FFFD220EEB8 ? 000000001 ?
ent()+1202                                         000000001 ? 000000000 ?
                                                   000000010 ? 2200000004 ?
kturRecoverActiveTx  call     kturRecoverUndoSegm  0000009B9 ? 7FFFD220F058 ?
ns()+3684                     ent()                000000001 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
ktprbeg()+323        call     kturRecoverActiveTx  000000010 ? 000000000 ?
                              ns()                 0000009B9 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
ktmmon()+13554       call     ktprbeg()            000000000 ? 000000000 ?
                                                   0000009B9 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
ktmSmonMain()+207    call     ktmmon()             0600304B8 ? 000000000 ?
                                                   0000009B9 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
ksbrdp()+971         call     ktmSmonMain()        0600304B8 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
opirip()+618         call     ksbrdp()             0600304B8 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
opidrv()+598         call     opirip()             000000032 ? 000000004 ?
                                                   7FFFD2211B08 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?
sou2o()+98           call     opidrv()             000000032 ? 000000004 ?
                                                   7FFFD2211B08 ? 000000000 ?
                                                   00000FFFF ? 100000010 ?

Note:

kgeasnmierr – kernel generic error assert named internal error
kturCurrBackoutOneChg – kernel transaction undo rollback current read; backout change (rollback)
kturRecoverTxn() kernel transaction undo rollback recover transaction
kturRecoverUndoSegment – kernel transaction undo rollback recover undo segment
kturRecoverActiveTxns – kernel transaction undo rollback recover active transactions
ktprbeg() kernel transaction parallel recovery begin
ktmmon()+13554 kernel transaction transaction monitor (smon) tx monitor background timeout action (smon)
ktmSmonMain()+207 kernel transaction transaction monitor (smon) main smon routine

从trace文件中可以看到是obj# 389809的索引段在level block 的index undo for leaf key operations时purge leaf row出现了不正确的uba,rollback segment编号为2489, 事务表状态为6 PARTLY AVAILABLE( like DBA_ROLLBACK_SEGS.status or undo$.status)。

/* segment status (see KTS.H): */
/* 1 = INVALID, 2 = AVAILABLE, 3 = IN USE, 4 = OFFLINE, 5 = NEED RECOVERY,
* 6 = PARTLY AVAILABLE (contains in-doubt txs)

PARTLY AVAILABLE means it contains active transactions that SMON has not yet finished recovering or it contains in-doubt transactions(分布式事务)

根据ora600 和call stack 不难定位 匹配bug 14604955
Rediscovery Notes
ORA-00600 [ktubko_1] error.

Top Call Stack:
… <- kgeasnmierr <- kturCurrBackoutOneChg <- kturRecoverTxnMain <- kturRecoverTxn <- kturRecoverUndoSegment
<- kturRecoverActiveTxns <- ktprbeg <- ktmmon <- ktmSmonMain <- ksbrdp <- opirip <- …

解决方法

对于undo segment问题通常紧急救援是屏蔽rollseg, drop, 重建对象。当然如果有备份建议备份还原。
可以先屏蔽报错中提示的rollback segment 使用_OFFLINE_ROLLBACK_SEGMENTS或 _CORRUPTED_ROLLBACK_SEGMENTS, 需要重启实例。

*********************************************************************

ORACLE will not support a database that has used the
_CORRUPTED_ROLLBACK_SEGMENTS parameter.

*********************************************************************

UNDO SEGMENT

查找UNDO SEGMENT NAME

select segment_name from dba_rollback_segs where segment_id=2489;
--  _SYSSMU2489_xxxxxx$

-- 增加pfile ,启动数据库
*._OFFLINE_ROLLBACK_SEGMENTS=_SYSSMU2489_xxxxxx$

-- 可以dump undo header查看rollseg上的活动事务,通常如果不是非常高的并发,1个undo segment中算法均分是1个ACTIVE TX。  可以dump rollseg header查看。

ALTER SYSTEM DUMP UNDO HEADER '_SYSSMU2489_xxxx$';

当然因为有ACTIVE tx, 可以直接使用_CORRUPTED_ROLLBACK_SEGMENTS.

image-20230916194245669

_OFFLINE_ROLLBACK_SEGMENTS或 _CORRUPTED_ROLLBACK_SEGMENTS参数

If the datafile related to rollback segment get corrupted, the first method go for the recovery from backup.

If your backup is not working at all, so last unsupported option to choose the parameter _OFFLINE_ROLLBACK_SEGMENTS in init.ora. _OFFLINE_ROLLBACK_SEGMENTS is a unsupported init.ora parameter which can allow you to cause logical database corruption. It is rarely required in Oracle7.3 onwards due to deferred transaction recovery.

When you required this parameter to set _OFFLINE_ROLLBACK_SEGMENTS ?

When you start a database, SMON will recover all the uncommitted transaction. To do the recovery it will check the rollback segment header first to read the transaction table. But if any reason SMON is unable to read the header block of any online rollback segments then you required this parameter.

Unreadable rollback segment header may cause for block corruption, datafile corruption or datafile unavailable.If tablespace is online but the rollback segment header block get corrupted, you need this parameter to start the database.Because you have to stop SMON from performing transaction recovery for specific rollback segments. While starting the database SMON will not check the transaction table only for those rollback segments under the parameter _OFFLINE_ROLLBACK_SEGMENTS . You should choose this parameter only when all the transaction related to this rollback segment is INACTIVE.

How to know the transaction is in inactive mode ?

Do the following steps

===============

  1. Start the database with _OFFLINE_ROLLBACK_SEGMENTS first.
  2. Find the header file number and header block number from the following query.
select segment_name,header_file,header_block from dba_segments where segment_name like '';
  1. dump the block with following command.
alter system dump datafile n   block n ;
  1. observe for the uncommitted transaction from block dump like below. Observe the status field.
    If it is ‘9’ means transaction committed. but if the status value is ’10’ then it is an uncommitted transaction.So this rollback segment is not a good candidate for _OFFLINE_ROLLBACK_SEGMENTS.

  2. After checking the step 4, if you find all the status field is ‘9’, then drop the rollback segment immediately.

  3. Stop the database, take the parameter out from init.ora and start the database. So your dictionary is perfect.
    you can continue your production job with this database.

  4. After checking the step 4, if you find any status field is ’10’ , then you should immediately shutdown the database.
    Start the database with “_corrupted_rollback_segments”. And put the name of those rollback segment names where did you find the value of status=’10’.

  5. There is a logical corruption. So you should drop the rollback segment And do a complete export and import of this database to make it supported.

What is _CORRUPTED_ROLLBACK_SEGMENTS ?

_CORRUPTED_ROLLBACK_SEGMENTS is more dangerous parameter than _OFFLINE_ROLLBACK_SEGMENTS.

It basically prevents access to the listed rollback segments headers and assumes all transactions in them are committed. This can very easily cause logical database corruption. Opening any database with any rollback segment listed in this parameter instantly makes the database unsupported. This should
only EVER be used after a full backup has been taken as as part of a last attempt to salvage data from a database.

Let me make a chart for more details DATAFILE TRANSACTION

_offline_rollback_segments ONLINE/OFFLINE/RECOVERY INACTIVE

_corrupted_rollback_segments ONLINE/OFFLINE/RECOVERY ACTIVE

ONLINE ACTIVE * you can use this option, but better option to choose _corrupted_rollback_segments

IMPORTANT: If this corruption happened with system rollback segment, there is no way to use any of this parameter. You must have to recover database from old backup.

Differences between these parameters

There are differences between these two parameters. If the rollback segment is listed in the _offline parameter list, the transaction table is still read accessible. This is important for delayed block cleanout. If a select statement reads a data block with an open ITL which points to the transaction table of the _offline rollback segment, the table is still checked. If the transaction is committed, delayed block clean out occurs. If the transaction is uncommitted, it will generate ORA-1578. It will not allow to select the block.

If the rollback segment is listed in the _corrupted parameter list, the transaction table is not read accessible. All transactions are assumed COMMITTED and delayed block cleanout will occur but there is no consistent read view of uncommitted transactions (logical corruption).

reference

https://www.parnassusdata.com/en/node/206

Related Posts:

Lots of Long transaction caused by database link, and undo hdr show DBA for that slot is 0x00000000

https://www.anbob.com/archives/2738.html

Troubleshooting large amount of undo/redo generated due to dblink usage

https://www.anbob.com/archives/6614.html

Troubleshooting ORA-600 [4187] Undo segment sequence number is exhausted

https://www.anbob.com/archives/6499.html

ORA-00600:[kzrini:!uprofile]

连接数据库报错ORA-00600[kzrini:!uprofile],数据库的tab$表被恶意清空,发现数据库内部存在几个异常对象,存储过程DBMS_SUPPORT_DBMONITORP和触发器DBMS_SUPPORT_DBMONITOR,检查了对象内容,创建数据库300天后自动删除tab$。问题原因是项目同事使用网上私自下载的绿色版数据库安装介质安装的数据库,而这个安装介质里的prvtsupp.plb文件被恶意篡改,最终导致这个问题的发生。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值