数据库版本11.2.0.4,linux,rac环境
一条sql,按照正常执行计划执行会报Oracle内部错误:
SQL> SELECT *
2 FROM (SELECT A.BID,
3 A.B1SUBCASE,
A.B1NO,
4 5 A.B1INGDATE,
6 A.DEALTOTAL,
7 A.CUSTID,
8 A.POSMEMID,
9 A.OUTSTOREROOMID,
C.CUSTID AS C_CUSTID,
10 11 D.STORESID,
12 (SELECT SUM(TRADEMONEY) AS TRADEMONEY
13 FROM TBL_B1ITEMS AA
14 WHERE AA.BID = A.BID) AS SUM_TRADEMONEY,
(SELECT COUNT(IBIID) AS CNTIBIID
15 16 FROM TBL_B1ITEMS AA
17 WHERE AA.BID = A.BID) AS CNTIBIID,
18 (SELECT SUM(DEBIT)
FROM TBL_B1CHECKOUT A1, TBL_DATADICTIONARY A2
19 20 WHERE A1.DATADICTIONARYID = A2.DATADICTIONARYID
21 AND A2.DICTIONARYCODE = 1
22 AND A1.FORECODE IS NULL
AND A1.BID = A.BID) AS DEBIT
23 24 FROM TBL_B1 A
INNER JOIN TBL_B1CASE B
25 26 ON A.B1TYPE = B.B1TYPE
27 AND A.B1SUBCASE = B.B1SUBCASE
28 LEFT JOIN TBL_CUSTOM C
29 ON A.CUSTID = C.CUSTID
LEFT JOIN TBL_STORES D
30 31 ON A.OUTSTOREROOMID = D.STORESID
32 WHERE A.RDATE = TO_DATE('2016-05-11', 'yyyy-mm-dd')
33 AND A.B1SUBCASE IN
(11, 12, 13, 21, 22, 24, 41, 42, 61, 62, 63, 71, 72, 73, 91)
34 35 AND B.AUTOREFER = 1
36 AND A.SERVICSTATUS = 0
AND A.BCOMPLETE = 1
37 38 AND NVL(A.BERRMEMID, 0) <> 1
39 AND NVL(IINVAUTOREFERB1, 0) <> 3
40 ORDER BY A.TIMEORDER, B.POSREFERORDER)
41 WHERE ROWNUM <= 10000;
FROM TBL_B1CHECKOUT A1, TBL_DATADICTIONARY A2
*
第 19 行出现错误:
ORA-00600: 内部错误代码, 参数: [12406], [], [], [], [], [], [], [], [], [], [], []
首先确定这是Oracle的bug无疑,错误的stacktrace如下:
ORA-00600: <C4>?<BF><B4><ED><CE><F3><B4><FA><C2><EB>, <B2><CE><CA><FD>: [12406], [], [], [], [], [], [], [], [], [], [], []
========= Dump for incident 216219 (ORA 600 [12406]) ========
*** 2016-05-11 10:52:10.633
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=gq026hrdjh7sd) -----
SELECT * FROM (SELECT A.BID,A.B1SUBCASE,A.B1NO,A.B1INGDATE, A.DEALTOTAL,A.CUSTID,A.POSMEMID,A.OUTSTOREROOMID, C.CUSTID AS C_CUSTID,D.STORESID, (SELECT SUM(TRADEMONEY) AS TRADEMONEY FROM TBL_B1ITEMS AA WHERE AA.BID=A.BID) AS SUM_
TRADEMONEY, (SELECT COUNT(IBIID) AS CNTIBIID FROM TBL_B1ITEMS AA WHERE AA.BID=A.BID) AS CNTIBIID, (SELECT SUM(DEBIT) FROM TBL_B1CHECKOUT A1,TBL_DATADICTIONARY A2 WHERE A1.DATADICTIONARYID=A2.DATADICTIONARYID AND A2.DICTIONARYCODE=1
AND A1.FORECODE IS NULL AND A1.BID=A.BID) AS DEBIT FROM TBL_B1 A INNER JOIN TBL_B1CASE B ON A.B1TYPE=B.B1TYPE AND A.B1SUBCASE=B.B1SUBCASE LEFT JOIN TBL_CUSTOM C ON A.CUSTID=C.CUSTID LEFT JOIN TBL_STORES D ON A.OUTSTOREROOMID
=D.STORESID WHERE A.RDATE =TO_DATE(:B1 ,'yyyy-mm-dd') AND A.B1SUBCASE IN (11,12,13,21,22,24,41,42,61,62,63,71,72,73,91) AND B.AUTOREFER = 1 AND A.SERVICSTATUS=0 AND A.BCOMPLETE=1 AND NVL(A.BERRMEMID,0)<>1 AND NVL(IINVAUTOREFERB1,0)<
> 3 ORDER BY A.TIMEORDER,B.POSREFERORDER) WHERE ROWNUM<=10000
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
object line object
handle number name
0xdb38ea018 109 procedure PROD.PROC1
0xf49353b10 1 anonymous block
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+41 call kgdsdst() 000000000 ? 000000000 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
ksedst1()+103 call skdstdst() 000000000 ? 000000000 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
ksedst()+39 call ksedst1() 000000000 ? 000000001 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
dbkedDefDump()+2746 call ksedst() 000000000 ? 000000001 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
ksedmp()+41 call dbkedDefDump() 000000003 ? 000000002 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
ksfdmp()+69 call ksedmp() 000000003 ? 000000002 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
dbgexPhaseII()+1764 call ksfdmp() 000000003 ? 000000002 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
dbgexProcessError() call dbgexPhaseII() 7F988CD217A8 ? 7F988CB73F60 ?
+2680 7FFF3AA03078 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
dbgeExecuteForError call dbgexProcessError() 7F988CD217A8 ? 7F988CB73F60 ?
()+88 000000001 ? 000000000 ?
7FFF3A9FE5A0 ? 000000002 ?
dbgePostErrorKGE()+ call dbgeExecuteForError 7F988CD217A8 ? 7F988CB73F60 ?
2136 () 000000001 ? 000000001 ?
000000000 ? 000000002 ?
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 00C0CC9E0 ? 7F988CABB520 ?
71 000000258 ? 000000001 ?
000000000 ? 000000002 ?
kgeade()+351 call dbkePostKGE_kgsf() 00C0CC9E0 ? 7F988CABB520 ?
000000258 ? 000000001 ?
000000000 ? 000000002 ?
kgeriv_int()+125 call kgeade() 00C0CC9E0 ? 00C0CCBA0 ?
7F988CABB520 ? 000000258 ?
000000000 ? 000000002 ?
kgeriv()+17 call kgeriv_int() 00C0CC9E0 ? 00C0CCBA0 ?
7F988CABB520 ? 000000258 ?
000000000 ? 000000002 ?
kgesiv()+115 call kgeriv() 00C0CC9E0 ? 00C0CCBA0 ?
7F988CABB520 ? 000000258 ?
000000000 ? 000000002 ?
ksesic0()+193 call kgesiv() 00C0CC9E0 ? 7F988CABB520 ?
000003076 ? 000000000 ?
7FFF3AA03D30 ? 000000002 ?
qesmaHandleBadRowID call ksesic0() 00C0CC9E0 ? 000000001 ?
()+982 000000001 ? 000000001 ?
7FFF3AA02D30 ? 000000000 ?
qertbFetchByRowID() call qesmaHandleBadRowID 7F988C6642B8 ? 000000001 ?
+9426 () 000000001 ? 000000001 ?
7FFF3AA02D30 ? 000000000 ?
qergsFetch()+837 call qertbFetchByRowID() 7F988C6642B8 ? 7F988C64D560 ?
00957693A ? 000000001 ?
从call stack trace中的这几个函数(qergsFetch->qertbFetchByRowID->qesmaHandleBadRowID)来看是在fetch时查到某个索引,并通过rowid回表时触发了Oracle的Bug。
STAT #140396469154440 id=25 cnt=1 pid=16 pos=2 obj=176780 op='TABLE ACCESS BY INDEX ROWID TBL_B1CASE (cr=1 pr=0 pw=0 time=10 us cost=1 size=12 card=1)'
WAIT #140396469154440: nam='SQL*Net break/reset to client' ela= 3 driver id=1650815232 break?=1 p3=0 obj#=177516 tim=1462949120406296
WAIT #140396469154440: nam='SQL*Net break/reset to client' ela= 62 driver id=1650815232 break?=0 p3=0 obj#=177516 tim=1462949120406373
(END)
同时可以通过obj#得到有问题的索引PROD.XFXB1ITEMS_BID_P,尝试不走这个索引执行sql,发现是可以正常出结果的。
SQL> set lines 200 pages 999
SQL> SELECT *
2 FROM (SELECT A.BID,
3 A.B1SUBCASE,
A.B1NO,
4 5 A.B1INGDATE,
6 A.DEALTOTAL,
7 A.CUSTID,
8 A.POSMEMID,
9 A.OUTSTOREROOMID,
10 C.CUSTID AS C_CUSTID,
11 D.STORESID,
(SELECT /*+full(AA) */ SUM(TRADEMONEY) AS TRADEMONEY
12 13 FROM TBL_B1ITEMS AA
14 WHERE AA.BID = A.BID) AS SUM_TRADEMONEY,
15 (SELECT /*+full(AA) */ COUNT(IBIID) AS CNTIBIID
16 FROM TBL_B1ITEMS AA
WHERE AA.BID = A.BID) AS CNTIBIID,
17 18 (SELECT SUM(DEBIT)
19 FROM TBL_B1CHECKOUT A1, TBL_DATADICTIONARY A2
WHERE A1.DATADICTIONARYID = A2.DATADICTIONARYID
20 21 AND A2.DICTIONARYCODE = 1
22 AND A1.FORECODE IS NULL
23 AND A1.BID = A.BID) AS DEBIT
24 FROM TBL_B1 A
INNER JOIN TBL_B1CASE B
25 26 ON A.B1TYPE = B.B1TYPE
27 AND A.B1SUBCASE = B.B1SUBCASE
28 LEFT JOIN TBL_CUSTOM C
29 ON A.CUSTID = C.CUSTID
LEFT JOIN TBL_STORES D
30 31 ON A.OUTSTOREROOMID = D.STORESID
32 WHERE A.RDATE = TO_DATE('2016-05-11', 'yyyy-mm-dd')
33 AND A.B1SUBCASE IN
34 (11, 12, 13, 21, 22, 24, 41, 42, 61, 62, 63, 71, 72, 73, 91)
AND B.AUTOREFER = 1
35 36 AND A.SERVICSTATUS = 0
37 AND A.BCOMPLETE = 1
38 AND NVL(A.BERRMEMID, 0) <> 1
AND NVL(IINVAUTOREFERB1, 0) <> 3
39 40 ORDER BY A.TIMEORDER, B.POSREFERORDER)
41 WHERE ROWNUM <= 10000;
BID B1SUBCASE B1NO B1INGDATE
---------- ----------- ------------------------------------------------------------------------------------------------------------------------------------------------------ -------------------
DEALTOTAL CUSTID
---------- ----------
POSMEMID OUTSTOREROOMID
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --------------
C_CUSTID STORESID SUM_TRADEMONEY CNTIBIID DEBIT
---------- ---------- -------------- ---------- ----------
4624638 21 040150206457-160511000089 2016/05/11 15:58:57
32
630220 1708
1708 0
4624640 21 040150206457-160511000090 2016/05/11 16:04:35
18
1708
1708 0 18
问题肯定是出在索引PROD.XFXB1ITEMS_BID_P上,这是分区表上的一个普通的全局索引,做了坏块检查,并没发现存在坏块,对整个数据文件都进行了检查,无坏块方面的问题:
RMAN> backup validate datafile '+oradata/XZGdb/datafile/PROD.dbf';
启动 backup 于 2016/05/11 15:03:33
使用目标数据库控制文件替代恢复目录
分配的通道: ORA_DISK_1
通道 ORA_DISK_1: SID=1166 实例=XZGdb1 设备类型=DISK
通道 ORA_DISK_1: 正在启动全部数据文件备份集
通道 ORA_DISK_1: 正在指定备份集内的数据文件
输入数据文件: 文件号=00058 名称=+oradata/XZGdb/datafile/PROD.dbf
通道 ORA_DISK_1: 备份集已完成, 经过时间:00:00:07
数据文件列表
=================
文件状态 标记为损坏 空块 已检查的块 高 SCN
---- ------ -------------- ------------ --------------- ----------
58 OK 0 134018 622080 170429924655
文件名: +oradata/XZGdb/datafile/PROD.dbf
块类型 失败的块 已处理的块
---------- -------------- ----------------
数据 0 178590
索引 0 297832
其他 0 11640
完成 backup 于 2016/05/11 15:03:42
RMAN>
猜测应该可以通过rebuild或者recreate这个索引解决这个问题,于是在非业务高峰期对索引进行了rebuild:
alter index PROD.XFXB1ITEMS_BID_P rebuild online parallel 8 nologging;
alter index PROD.XFXB1ITEMS_BID_P parallel 1;
还是一样的报错,于是尝试recreate:
drop index PROD.XFXB1ITEMS_BID_P;
CREATE INDEX "PROD"."XFXB1ITEMS_BID_P" ON "PROD"."TBL_B1ITEMS" ("BID")
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "PROD" ;
再次执行的时候终于正常了。
查到官方的描述:
ORA-00600: Internal Error Code, Arguments: [12406] (文档 ID 1567438.1) 转到底部转到底部
In this Document
Symptoms
Cause
Solution
APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.2 and later
Information in this document applies to any platform.
SYMPTOMS
ORA-00600 [12406] is reported in 11.2.0.2 environment when running below SQL statement:
SELECT ZELLE, :"SYS_B_0" SHAPE, OBJECTID, LTE800_LABEL_GELB_AKT.SHAPE.points,LTE800_LABEL_GELB_AKT.SHAPE.numpts,LTE800_LABEL_GELB_AKT.SHAPE.entity,LTE800_LABEL_GELB_AKT.SHAPE.minx,LTE800_LABEL_GELB_AKT.SHAPE.miny,LTE800_LABEL_GELB_AKT.SHAPE.maxx,LTE800_LABEL_GELB_AKT.SHAPE.maxy,LTE800_LABEL_GELB_AKT.rowid FROM GEODATA.LTE800_LABEL_GELB_AKT LTE800_LABEL_GELB_AKT WHERE SDE.ST_EnvIntersects(LTE800_LABEL_GELB_AKT.SHAPE,:1,:2,:3,:4) = :"SYS_B_1"
----- Call Stack Trace -----
qesmaHandleBadRowID qertbFetchByRowID opifch2 opiefn opiodr ttcpip opitsk opiino opiodr opidrv
CAUSE
Error is caused due to unpublished Bug 13571154 ORA 600 [12406] - QESMAHANDLEBADROWID [PART]<-- SIGNALING
SOLUTION
This issue will be fixed in version 12.
It has been determined that the fix is too risky to be back ported to 11g.
OR
Drop and recreate indexes on the table used in the query.
一条sql,按照正常执行计划执行会报Oracle内部错误:
SQL> SELECT *
2 FROM (SELECT A.BID,
3 A.B1SUBCASE,
A.B1NO,
4 5 A.B1INGDATE,
6 A.DEALTOTAL,
7 A.CUSTID,
8 A.POSMEMID,
9 A.OUTSTOREROOMID,
C.CUSTID AS C_CUSTID,
10 11 D.STORESID,
12 (SELECT SUM(TRADEMONEY) AS TRADEMONEY
13 FROM TBL_B1ITEMS AA
14 WHERE AA.BID = A.BID) AS SUM_TRADEMONEY,
(SELECT COUNT(IBIID) AS CNTIBIID
15 16 FROM TBL_B1ITEMS AA
17 WHERE AA.BID = A.BID) AS CNTIBIID,
18 (SELECT SUM(DEBIT)
FROM TBL_B1CHECKOUT A1, TBL_DATADICTIONARY A2
19 20 WHERE A1.DATADICTIONARYID = A2.DATADICTIONARYID
21 AND A2.DICTIONARYCODE = 1
22 AND A1.FORECODE IS NULL
AND A1.BID = A.BID) AS DEBIT
23 24 FROM TBL_B1 A
INNER JOIN TBL_B1CASE B
25 26 ON A.B1TYPE = B.B1TYPE
27 AND A.B1SUBCASE = B.B1SUBCASE
28 LEFT JOIN TBL_CUSTOM C
29 ON A.CUSTID = C.CUSTID
LEFT JOIN TBL_STORES D
30 31 ON A.OUTSTOREROOMID = D.STORESID
32 WHERE A.RDATE = TO_DATE('2016-05-11', 'yyyy-mm-dd')
33 AND A.B1SUBCASE IN
(11, 12, 13, 21, 22, 24, 41, 42, 61, 62, 63, 71, 72, 73, 91)
34 35 AND B.AUTOREFER = 1
36 AND A.SERVICSTATUS = 0
AND A.BCOMPLETE = 1
37 38 AND NVL(A.BERRMEMID, 0) <> 1
39 AND NVL(IINVAUTOREFERB1, 0) <> 3
40 ORDER BY A.TIMEORDER, B.POSREFERORDER)
41 WHERE ROWNUM <= 10000;
FROM TBL_B1CHECKOUT A1, TBL_DATADICTIONARY A2
*
第 19 行出现错误:
ORA-00600: 内部错误代码, 参数: [12406], [], [], [], [], [], [], [], [], [], [], []
首先确定这是Oracle的bug无疑,错误的stacktrace如下:
ORA-00600: <C4>?<BF><B4><ED><CE><F3><B4><FA><C2><EB>, <B2><CE><CA><FD>: [12406], [], [], [], [], [], [], [], [], [], [], []
========= Dump for incident 216219 (ORA 600 [12406]) ========
*** 2016-05-11 10:52:10.633
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=gq026hrdjh7sd) -----
SELECT * FROM (SELECT A.BID,A.B1SUBCASE,A.B1NO,A.B1INGDATE, A.DEALTOTAL,A.CUSTID,A.POSMEMID,A.OUTSTOREROOMID, C.CUSTID AS C_CUSTID,D.STORESID, (SELECT SUM(TRADEMONEY) AS TRADEMONEY FROM TBL_B1ITEMS AA WHERE AA.BID=A.BID) AS SUM_
TRADEMONEY, (SELECT COUNT(IBIID) AS CNTIBIID FROM TBL_B1ITEMS AA WHERE AA.BID=A.BID) AS CNTIBIID, (SELECT SUM(DEBIT) FROM TBL_B1CHECKOUT A1,TBL_DATADICTIONARY A2 WHERE A1.DATADICTIONARYID=A2.DATADICTIONARYID AND A2.DICTIONARYCODE=1
AND A1.FORECODE IS NULL AND A1.BID=A.BID) AS DEBIT FROM TBL_B1 A INNER JOIN TBL_B1CASE B ON A.B1TYPE=B.B1TYPE AND A.B1SUBCASE=B.B1SUBCASE LEFT JOIN TBL_CUSTOM C ON A.CUSTID=C.CUSTID LEFT JOIN TBL_STORES D ON A.OUTSTOREROOMID
=D.STORESID WHERE A.RDATE =TO_DATE(:B1 ,'yyyy-mm-dd') AND A.B1SUBCASE IN (11,12,13,21,22,24,41,42,61,62,63,71,72,73,91) AND B.AUTOREFER = 1 AND A.SERVICSTATUS=0 AND A.BCOMPLETE=1 AND NVL(A.BERRMEMID,0)<>1 AND NVL(IINVAUTOREFERB1,0)<
> 3 ORDER BY A.TIMEORDER,B.POSREFERORDER) WHERE ROWNUM<=10000
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
object line object
handle number name
0xdb38ea018 109 procedure PROD.PROC1
0xf49353b10 1 anonymous block
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+41 call kgdsdst() 000000000 ? 000000000 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
ksedst1()+103 call skdstdst() 000000000 ? 000000000 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
ksedst()+39 call ksedst1() 000000000 ? 000000001 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
dbkedDefDump()+2746 call ksedst() 000000000 ? 000000001 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
ksedmp()+41 call dbkedDefDump() 000000003 ? 000000002 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
ksfdmp()+69 call ksedmp() 000000003 ? 000000002 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
dbgexPhaseII()+1764 call ksfdmp() 000000003 ? 000000002 ?
7FFF3A9F9A20 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
dbgexProcessError() call dbgexPhaseII() 7F988CD217A8 ? 7F988CB73F60 ?
+2680 7FFF3AA03078 ? 7FFF3A9F9AF8 ?
7FFF3A9FE5A0 ? 000000002 ?
dbgeExecuteForError call dbgexProcessError() 7F988CD217A8 ? 7F988CB73F60 ?
()+88 000000001 ? 000000000 ?
7FFF3A9FE5A0 ? 000000002 ?
dbgePostErrorKGE()+ call dbgeExecuteForError 7F988CD217A8 ? 7F988CB73F60 ?
2136 () 000000001 ? 000000001 ?
000000000 ? 000000002 ?
dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 00C0CC9E0 ? 7F988CABB520 ?
71 000000258 ? 000000001 ?
000000000 ? 000000002 ?
kgeade()+351 call dbkePostKGE_kgsf() 00C0CC9E0 ? 7F988CABB520 ?
000000258 ? 000000001 ?
000000000 ? 000000002 ?
kgeriv_int()+125 call kgeade() 00C0CC9E0 ? 00C0CCBA0 ?
7F988CABB520 ? 000000258 ?
000000000 ? 000000002 ?
kgeriv()+17 call kgeriv_int() 00C0CC9E0 ? 00C0CCBA0 ?
7F988CABB520 ? 000000258 ?
000000000 ? 000000002 ?
kgesiv()+115 call kgeriv() 00C0CC9E0 ? 00C0CCBA0 ?
7F988CABB520 ? 000000258 ?
000000000 ? 000000002 ?
ksesic0()+193 call kgesiv() 00C0CC9E0 ? 7F988CABB520 ?
000003076 ? 000000000 ?
7FFF3AA03D30 ? 000000002 ?
qesmaHandleBadRowID call ksesic0() 00C0CC9E0 ? 000000001 ?
()+982 000000001 ? 000000001 ?
7FFF3AA02D30 ? 000000000 ?
qertbFetchByRowID() call qesmaHandleBadRowID 7F988C6642B8 ? 000000001 ?
+9426 () 000000001 ? 000000001 ?
7FFF3AA02D30 ? 000000000 ?
qergsFetch()+837 call qertbFetchByRowID() 7F988C6642B8 ? 7F988C64D560 ?
00957693A ? 000000001 ?
从call stack trace中的这几个函数(qergsFetch->qertbFetchByRowID->qesmaHandleBadRowID)来看是在fetch时查到某个索引,并通过rowid回表时触发了Oracle的Bug。
STAT #140396469154440 id=25 cnt=1 pid=16 pos=2 obj=176780 op='TABLE ACCESS BY INDEX ROWID TBL_B1CASE (cr=1 pr=0 pw=0 time=10 us cost=1 size=12 card=1)'
WAIT #140396469154440: nam='SQL*Net break/reset to client' ela= 3 driver id=1650815232 break?=1 p3=0 obj#=177516 tim=1462949120406296
WAIT #140396469154440: nam='SQL*Net break/reset to client' ela= 62 driver id=1650815232 break?=0 p3=0 obj#=177516 tim=1462949120406373
(END)
同时可以通过obj#得到有问题的索引PROD.XFXB1ITEMS_BID_P,尝试不走这个索引执行sql,发现是可以正常出结果的。
SQL> set lines 200 pages 999
SQL> SELECT *
2 FROM (SELECT A.BID,
3 A.B1SUBCASE,
A.B1NO,
4 5 A.B1INGDATE,
6 A.DEALTOTAL,
7 A.CUSTID,
8 A.POSMEMID,
9 A.OUTSTOREROOMID,
10 C.CUSTID AS C_CUSTID,
11 D.STORESID,
(SELECT /*+full(AA) */ SUM(TRADEMONEY) AS TRADEMONEY
12 13 FROM TBL_B1ITEMS AA
14 WHERE AA.BID = A.BID) AS SUM_TRADEMONEY,
15 (SELECT /*+full(AA) */ COUNT(IBIID) AS CNTIBIID
16 FROM TBL_B1ITEMS AA
WHERE AA.BID = A.BID) AS CNTIBIID,
17 18 (SELECT SUM(DEBIT)
19 FROM TBL_B1CHECKOUT A1, TBL_DATADICTIONARY A2
WHERE A1.DATADICTIONARYID = A2.DATADICTIONARYID
20 21 AND A2.DICTIONARYCODE = 1
22 AND A1.FORECODE IS NULL
23 AND A1.BID = A.BID) AS DEBIT
24 FROM TBL_B1 A
INNER JOIN TBL_B1CASE B
25 26 ON A.B1TYPE = B.B1TYPE
27 AND A.B1SUBCASE = B.B1SUBCASE
28 LEFT JOIN TBL_CUSTOM C
29 ON A.CUSTID = C.CUSTID
LEFT JOIN TBL_STORES D
30 31 ON A.OUTSTOREROOMID = D.STORESID
32 WHERE A.RDATE = TO_DATE('2016-05-11', 'yyyy-mm-dd')
33 AND A.B1SUBCASE IN
34 (11, 12, 13, 21, 22, 24, 41, 42, 61, 62, 63, 71, 72, 73, 91)
AND B.AUTOREFER = 1
35 36 AND A.SERVICSTATUS = 0
37 AND A.BCOMPLETE = 1
38 AND NVL(A.BERRMEMID, 0) <> 1
AND NVL(IINVAUTOREFERB1, 0) <> 3
39 40 ORDER BY A.TIMEORDER, B.POSREFERORDER)
41 WHERE ROWNUM <= 10000;
BID B1SUBCASE B1NO B1INGDATE
---------- ----------- ------------------------------------------------------------------------------------------------------------------------------------------------------ -------------------
DEALTOTAL CUSTID
---------- ----------
POSMEMID OUTSTOREROOMID
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --------------
C_CUSTID STORESID SUM_TRADEMONEY CNTIBIID DEBIT
---------- ---------- -------------- ---------- ----------
4624638 21 040150206457-160511000089 2016/05/11 15:58:57
32
630220 1708
1708 0
4624640 21 040150206457-160511000090 2016/05/11 16:04:35
18
1708
1708 0 18
问题肯定是出在索引PROD.XFXB1ITEMS_BID_P上,这是分区表上的一个普通的全局索引,做了坏块检查,并没发现存在坏块,对整个数据文件都进行了检查,无坏块方面的问题:
RMAN> backup validate datafile '+oradata/XZGdb/datafile/PROD.dbf';
启动 backup 于 2016/05/11 15:03:33
使用目标数据库控制文件替代恢复目录
分配的通道: ORA_DISK_1
通道 ORA_DISK_1: SID=1166 实例=XZGdb1 设备类型=DISK
通道 ORA_DISK_1: 正在启动全部数据文件备份集
通道 ORA_DISK_1: 正在指定备份集内的数据文件
输入数据文件: 文件号=00058 名称=+oradata/XZGdb/datafile/PROD.dbf
通道 ORA_DISK_1: 备份集已完成, 经过时间:00:00:07
数据文件列表
=================
文件状态 标记为损坏 空块 已检查的块 高 SCN
---- ------ -------------- ------------ --------------- ----------
58 OK 0 134018 622080 170429924655
文件名: +oradata/XZGdb/datafile/PROD.dbf
块类型 失败的块 已处理的块
---------- -------------- ----------------
数据 0 178590
索引 0 297832
其他 0 11640
完成 backup 于 2016/05/11 15:03:42
RMAN>
猜测应该可以通过rebuild或者recreate这个索引解决这个问题,于是在非业务高峰期对索引进行了rebuild:
alter index PROD.XFXB1ITEMS_BID_P rebuild online parallel 8 nologging;
alter index PROD.XFXB1ITEMS_BID_P parallel 1;
还是一样的报错,于是尝试recreate:
drop index PROD.XFXB1ITEMS_BID_P;
CREATE INDEX "PROD"."XFXB1ITEMS_BID_P" ON "PROD"."TBL_B1ITEMS" ("BID")
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS NOLOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "PROD" ;
再次执行的时候终于正常了。
查到官方的描述:
ORA-00600: Internal Error Code, Arguments: [12406] (文档 ID 1567438.1) 转到底部转到底部
In this Document
Symptoms
Cause
Solution
APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.2 and later
Information in this document applies to any platform.
SYMPTOMS
ORA-00600 [12406] is reported in 11.2.0.2 environment when running below SQL statement:
SELECT ZELLE, :"SYS_B_0" SHAPE, OBJECTID, LTE800_LABEL_GELB_AKT.SHAPE.points,LTE800_LABEL_GELB_AKT.SHAPE.numpts,LTE800_LABEL_GELB_AKT.SHAPE.entity,LTE800_LABEL_GELB_AKT.SHAPE.minx,LTE800_LABEL_GELB_AKT.SHAPE.miny,LTE800_LABEL_GELB_AKT.SHAPE.maxx,LTE800_LABEL_GELB_AKT.SHAPE.maxy,LTE800_LABEL_GELB_AKT.rowid FROM GEODATA.LTE800_LABEL_GELB_AKT LTE800_LABEL_GELB_AKT WHERE SDE.ST_EnvIntersects(LTE800_LABEL_GELB_AKT.SHAPE,:1,:2,:3,:4) = :"SYS_B_1"
----- Call Stack Trace -----
qesmaHandleBadRowID qertbFetchByRowID opifch2 opiefn opiodr ttcpip opitsk opiino opiodr opidrv
CAUSE
Error is caused due to unpublished Bug 13571154 ORA 600 [12406] - QESMAHANDLEBADROWID [PART]<-- SIGNALING
SOLUTION
This issue will be fixed in version 12.
It has been determined that the fix is too risky to be back ported to 11g.
OR
Drop and recreate indexes on the table used in the query.