近期的数据库巡检中,发现一oracle rac数据库的一个节点发生ORA-00600 [kkoipt:incorrect pwj], [0], [0], [0], [2], [1], [1]告警,从trc文件中定位到一条与分区表查询相关的sql语句,本案例的处理过程涉及到修改一个隐含参数_full_pwise_join_enabled,查询MOS,与ORA-00600 [kkoipt:incorrect pwj], [0], [0], [0], [2], [1], [1]告警相关的BUG很多,主要有Bug13722997(参考13722997.8)、Bug17164360/17185558(参考17164360.8/17185558.8)、Bug9929660(参考9929660.8)详细的处理分析过程如下。
操作系统:RedHat RHEL6.6
数据库:Oracle 11.2.0.4 RAC
节点orcl1的ORA600告警,告警信息如下:
操作系统:RedHat RHEL6.6
数据库:Oracle 11.2.0.4 RAC
节点orcl1的ORA600告警,告警信息如下:
14/12/2016 09:20:27 Use ADRCI or Support Workbench to package the incident.
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_64283/orcl1_ora_26060_i64283.trc
ORA-00600: internal error code, arguments: [kkoipt:incorrect pwj], [0], [0], [0], [2], [1], [1], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_26060.trc (incident=64283):
Wed Dec 14 09:20:27 2016
See Note 411.1 at My Oracle Support for error and packaging details.
Incident details in: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_64283/orcl1_ora_26060_i64283.trc
ORA-00600: internal error code, arguments: [kkoipt:incorrect pwj], [0], [0], [0], [2], [1], [1], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_26060.trc (incident=64283):
Wed Dec 14 09:20:27 2016
See Note 411.1 at My Oracle Support for error and packaging details.
orcl1_ora_26060_i64283.trc文件核心错误信息如下:
Dump file /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_64283/orcl1_ora_26060_i64283.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0
System name: Linux
Node name: rac01
Release: 4.1.12-37.4.1.el6uek.x86_64
Version: #2 SMP Tue May 17 07:23:38 PDT 2016
Machine: x86_64
Instance name: orcl1
Redo thread mounted by this instance: 1
Oracle process number: 35
Unix process pid: 26060, image: oracle@rac01
*** 2016-12-14 09:20:27.349
*** SESSION ID:(2282.25537) 2016-12-14 09:20:27.349
*** CLIENT ID:() 2016-12-14 09:20:27.349
*** SERVICE NAME:(orcl) 2016-12-14 09:20:27.349
*** MODULE NAME:() 2016-12-14 09:20:27.349
*** ACTION NAME:() 2016-12-14 09:20:27.349
Dump continued from file: /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_26060.trc
ORA-00600: internal error code, arguments: [kkoipt:incorrect pwj], [0], [0], [0], [2], [1], [1], [], [], [], [], []
========= Dump for incident 64283 (ORA 600 [kkoipt:incorrect pwj]) ========
*** 2016-12-14 09:20:27.349
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=75gnknpgwrw3j) -----
select *
from (select row_.*, rownum NumRow from (
select /*+FIRST_ROWS(100)*/
b.bill_no as BillNo,
round(b.TOTAL_AMOUNT, 2) as TOTAL_AMOUNT,
b.admission_date as AdmissionDate,
b.discharge_date as DischargeDate,
b.hospital_name as HospitalName,
a.reason as Reason,
b.DISCHARGE_DISEASE_NAME as DischargeDiseaseName,
ex.varchar01 as Department
from dw_bill b
left join dw_bill_ex ex
on ex.table_par=b.table_par and b.hisid = ex.billid
left join ad_auditresult a
on a.claim_id = b.id
and a.table_par = b.table_par
where 1=1 and a.table_par >= '20161101' and a.table_par <= '20161214' and a.rule_code='150802' and b.claim_type in (:paramClaimTypes0, :paramClaimTypes1, :paramClaimTypes2) and b.benefit_type in (:paramBenefitTypes0, :paramBenefitTypes1) order by b.bill_no ) row_ where rownum <= 100)
where NumRow > 0;
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0
System name: Linux
Node name: rac01
Release: 4.1.12-37.4.1.el6uek.x86_64
Version: #2 SMP Tue May 17 07:23:38 PDT 2016
Machine: x86_64
Instance name: orcl1
Redo thread mounted by this instance: 1
Oracle process number: 35
Unix process pid: 26060, image: oracle@rac01
*** 2016-12-14 09:20:27.349
*** SESSION ID:(2282.25537) 2016-12-14 09:20:27.349
*** CLIENT ID:() 2016-12-14 09:20:27.349
*** SERVICE NAME:(orcl) 2016-12-14 09:20:27.349
*** MODULE NAME:() 2016-12-14 09:20:27.349
*** ACTION NAME:() 2016-12-14 09:20:27.349
Dump continued from file: /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_26060.trc
ORA-00600: internal error code, arguments: [kkoipt:incorrect pwj], [0], [0], [0], [2], [1], [1], [], [], [], [], []
========= Dump for incident 64283 (ORA 600 [kkoipt:incorrect pwj]) ========
*** 2016-12-14 09:20:27.349
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=75gnknpgwrw3j) -----
select *
from (select row_.*, rownum NumRow from (
select /*+FIRST_ROWS(100)*/
b.bill_no as BillNo,
round(b.TOTAL_AMOUNT, 2) as TOTAL_AMOUNT,
b.admission_date as AdmissionDate,
b.discharge_date as DischargeDate,
b.hospital_name as HospitalName,
a.reason as Reason,
b.DISCHARGE_DISEASE_NAME as DischargeDiseaseName,
ex.varchar01 as Department
from dw_bill b
left join dw_bill_ex ex
on ex.table_par=b.table_par and b.hisid = ex.billid
left join ad_auditresult a
on a.claim_id = b.id
and a.table_par = b.table_par
where 1=1 and a.table_par >= '20161101' and a.table_par <= '20161214' and a.rule_code='150802' and b.claim_type in (:paramClaimTypes0, :paramClaimTypes1, :paramClaimTypes2) and b.benefit_type in (:paramBenefitTypes0, :paramBenefitTypes1) order by b.bill_no ) row_ where rownum <= 100)
where NumRow > 0;
根据
Bug
13722997
相关文章
13722997.8的提示,需要对sql语句做修改,就是分区表不使用别名:
Description
A join query on reference partitioned tables with aliases may raise an internal error.
Workaround
do not use aliases
但是,取消掉分区表的别名,无论是使用sys还是业务用户,sql语句一旦执行立即报错退出,报错与当前报错相同:
ORA-00600: internal error code, arguments: [kkoipt:incorrect pwj], [0], [0], [0], [2], [1], [1], [], [], [], [], []
根据 Bug9929660(参考 9929660.8)的提示,要修改sql语句的并行执行度,但是sql语句相关的分区表上并没有并行,也没有并行相关的hints,因此排除Bug9929660:
Description
Incorrect Join information not invalidated properly in the case of null aware antijoin queries.
Rediscovery Notes:
ORA-600 [kkoipt:incorrect pwj] from Parallel Query in 11.2
Workaround
Change the degree of parallelism so that optimizer do not choose plan involving partition wise join.
Do not let the degree being set exactly match the number of partitions/subpartitions of the tables involved in the join to help avoid this error.
根据BUG 17185558/ 17164360的提示需要修改数据库隐含参数 _full_pwise_join_enabled(修改该参数并不需要重启数据库,但是注意将参数修改写到spfile,保证数据库重启后也生效),参数修改后,重新执行sql语句,报错不再出现:
Description
This bug is only relevant when using Partitioned Tables Rediscovery Notes ORA-600 [kkoipt:incorrect pwj] may occur while querying partitioned table.
The error occurs during the PARSE phase, when the optimizer is evaluating the join order.
Workaround
Set "_full_pwise_join_enabled"=false and re-parse the query.
Description
A join query on reference partitioned tables with aliases may raise an internal error.
Workaround
do not use aliases
但是,取消掉分区表的别名,无论是使用sys还是业务用户,sql语句一旦执行立即报错退出,报错与当前报错相同:
ORA-00600: internal error code, arguments: [kkoipt:incorrect pwj], [0], [0], [0], [2], [1], [1], [], [], [], [], []
根据 Bug9929660(参考 9929660.8)的提示,要修改sql语句的并行执行度,但是sql语句相关的分区表上并没有并行,也没有并行相关的hints,因此排除Bug9929660:
Description
Incorrect Join information not invalidated properly in the case of null aware antijoin queries.
Rediscovery Notes:
ORA-600 [kkoipt:incorrect pwj] from Parallel Query in 11.2
Workaround
Change the degree of parallelism so that optimizer do not choose plan involving partition wise join.
Do not let the degree being set exactly match the number of partitions/subpartitions of the tables involved in the join to help avoid this error.
根据BUG 17185558/ 17164360的提示需要修改数据库隐含参数 _full_pwise_join_enabled(修改该参数并不需要重启数据库,但是注意将参数修改写到spfile,保证数据库重启后也生效),参数修改后,重新执行sql语句,报错不再出现:
Description
This bug is only relevant when using Partitioned Tables Rediscovery Notes ORA-600 [kkoipt:incorrect pwj] may occur while querying partitioned table.
The error occurs during the PARSE phase, when the optimizer is evaluating the join order.
Workaround
Set "_full_pwise_join_enabled"=false and re-parse the query.
在执行SQL_ID为
75gnknpgwrw3j的会话或数据库级别关闭隐含
参数
"_full_pwise_join_enabled"
,
SYS执行命令:
ALTER SYSTEM SET _full_pwise_join_enabled=FALSE SCOPE=BOTH SID='*';
告警日志输出提示:
Fri Dec 16 15:15:46 2016
ALTER SYSTEM SET _full_pwise_join_enabled=FALSE SCOPE=BOTH SID='*';
ALTER SYSTEM SET _full_pwise_join_enabled=FALSE SCOPE=BOTH SID='*';
参数
"_full_pwise_join_enabled"修改后,执行
75gnknpgwrw3j不再报错ORA600
[oracle@rac02 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Dec 16 15:19:01 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> select count(*) from (
2 select *
3 from (select row_.*, rownum NumRow
4 from (select /*+FIRST_ROWS(100)*/
5 dw_bill.bill_no as BillNo,
6 round(dw_bill.TOTAL_AMOUNT, 2) as TOTAL_AMOUNT,
7 dw_bill.admission_date as AdmissionDate,
8 dw_bill.discharge_date as DischargeDate,
9 dw_bill.hospital_name as HospitalName,
10 ad_auditresult.reason as Reason,
11 dw_bill.DISCHARGE_DISEASE_NAME as DischargeDiseaseName,
12 dw_bill_ex.varchar01 as Department
13 from bmi.dw_bill
14 left join bmi.dw_bill_ex
15 on dw_bill_ex.table_par = dw_bill.table_par
16 and dw_bill.hisid = dw_bill_ex.billid
17 left join bmi.ad_auditresult
18 on ad_auditresult.claim_id = dw_bill.id
19 and ad_auditresult.table_par = dw_bill.table_par
20 where 1 = 1
21 and ad_auditresult.table_par >= '20161101'
22 and ad_auditresult.table_par <= '20161214'
23 and ad_auditresult.rule_code = '150802'
24 and dw_bill.claim_type in (2,
25 3,
26 1)
27 and dw_bill.benefit_type in
28 ('01','02')
29 order by dw_bill.bill_no) row_
30 where rownum <= 100)
31 where NumRow > 0);
COUNT(*)
----------
100
SQL> conn bmi/BMI
Connected.
SQL> select count(*) from (
2 select *
3 from (select row_.*, rownum NumRow
4 from (select /*+FIRST_ROWS(100)*/
5 dw_bill.bill_no as BillNo,
6 round(dw_bill.TOTAL_AMOUNT, 2) as TOTAL_AMOUNT,
7 dw_bill.admission_date as AdmissionDate,
8 dw_bill.discharge_date as DischargeDate,
9 dw_bill.hospital_name as HospitalName,
10 ad_auditresult.reason as Reason,
11 dw_bill.DISCHARGE_DISEASE_NAME as DischargeDiseaseName,
12 dw_bill_ex.varchar01 as Department
13 from bmi.dw_bill
14 left join bmi.dw_bill_ex
15 on dw_bill_ex.table_par = dw_bill.table_par
16 and dw_bill.hisid = dw_bill_ex.billid
17 left join bmi.ad_auditresult
18 on ad_auditresult.claim_id = dw_bill.id
19 and ad_auditresult.table_par = dw_bill.table_par
20 where 1 = 1
21 and ad_auditresult.table_par >= '20161101'
22 and ad_auditresult.table_par <= '20161214'
23 and ad_auditresult.rule_code = '150802'
24 and dw_bill.claim_type in (2,
25 3,
26 1)
27 and dw_bill.benefit_type in
28 ('01','02')
29 order by dw_bill.bill_no) row_
30 where rownum <= 100)
31 where NumRow > 0);
COUNT(*)
----------
100
SQL*Plus: Release 11.2.0.4.0 Production on Fri Dec 16 15:19:01 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> select count(*) from (
2 select *
3 from (select row_.*, rownum NumRow
4 from (select /*+FIRST_ROWS(100)*/
5 dw_bill.bill_no as BillNo,
6 round(dw_bill.TOTAL_AMOUNT, 2) as TOTAL_AMOUNT,
7 dw_bill.admission_date as AdmissionDate,
8 dw_bill.discharge_date as DischargeDate,
9 dw_bill.hospital_name as HospitalName,
10 ad_auditresult.reason as Reason,
11 dw_bill.DISCHARGE_DISEASE_NAME as DischargeDiseaseName,
12 dw_bill_ex.varchar01 as Department
13 from bmi.dw_bill
14 left join bmi.dw_bill_ex
15 on dw_bill_ex.table_par = dw_bill.table_par
16 and dw_bill.hisid = dw_bill_ex.billid
17 left join bmi.ad_auditresult
18 on ad_auditresult.claim_id = dw_bill.id
19 and ad_auditresult.table_par = dw_bill.table_par
20 where 1 = 1
21 and ad_auditresult.table_par >= '20161101'
22 and ad_auditresult.table_par <= '20161214'
23 and ad_auditresult.rule_code = '150802'
24 and dw_bill.claim_type in (2,
25 3,
26 1)
27 and dw_bill.benefit_type in
28 ('01','02')
29 order by dw_bill.bill_no) row_
30 where rownum <= 100)
31 where NumRow > 0);
COUNT(*)
----------
100
SQL> conn bmi/BMI
Connected.
SQL> select count(*) from (
2 select *
3 from (select row_.*, rownum NumRow
4 from (select /*+FIRST_ROWS(100)*/
5 dw_bill.bill_no as BillNo,
6 round(dw_bill.TOTAL_AMOUNT, 2) as TOTAL_AMOUNT,
7 dw_bill.admission_date as AdmissionDate,
8 dw_bill.discharge_date as DischargeDate,
9 dw_bill.hospital_name as HospitalName,
10 ad_auditresult.reason as Reason,
11 dw_bill.DISCHARGE_DISEASE_NAME as DischargeDiseaseName,
12 dw_bill_ex.varchar01 as Department
13 from bmi.dw_bill
14 left join bmi.dw_bill_ex
15 on dw_bill_ex.table_par = dw_bill.table_par
16 and dw_bill.hisid = dw_bill_ex.billid
17 left join bmi.ad_auditresult
18 on ad_auditresult.claim_id = dw_bill.id
19 and ad_auditresult.table_par = dw_bill.table_par
20 where 1 = 1
21 and ad_auditresult.table_par >= '20161101'
22 and ad_auditresult.table_par <= '20161214'
23 and ad_auditresult.rule_code = '150802'
24 and dw_bill.claim_type in (2,
25 3,
26 1)
27 and dw_bill.benefit_type in
28 ('01','02')
29 order by dw_bill.bill_no) row_
30 where rownum <= 100)
31 where NumRow > 0);
COUNT(*)
----------
100
关于Full Partition-Wise Joins,参考:http://docs.oracle.com/database/122/VLDBG/partition-wise-joins.htm#VLDBG00402