背景:某监控系统一条SQL在老环境执行3s,迁移到新环境执行了15分钟不出结果,报错ORA-01555
SQL语句如下:
select .....
from SMMMMVIEW.POS_TTTTT cc,
SMMMMVIEW.OOOOOIZATION org,
SMMMMVIEW.TRRRTYPE ty,
SMMMMVIEW.POS_MMMMMMM mechant,
SMMMMVIEW.POS_RRRRRRRR resp
where org.enable_Flag = 'Y'
and substr(cc.trancode, 5) = ty.trancode
and cc.m_merid = mechant.mechantno
and mechant.org_innercode = org.org_Nnnnn_Code
and cc.compcode = resp.ssb_code(+)
and cc.M_Termdate >= '20180625'
and cc.M_Termdate <= '20180625'
--and cc.M_Merid = '304302059450006'
and org.org_Nnnnn_Code like '1132%'
order by cc.m_termdate desc, cc.m_termtime desc;
老环境(3s响应)的执行计划:
Plan hash value: 52026976
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 398 | 149K| 51332 (1)| 00:10:16 |
| 1 | SORT ORDER BY | | 398 | 149K| 51332 (1)| 00:10:16 |
|* 2 | HASH JOIN | | 398 | 149K| 51331 (1)| 00:10:16 |
| 3 | TABLE ACCESS FULL | TRRRTYPE | 169 | 3042 | 5 (0)| 00:00:01 |
|* 4 | HASH JOIN RIGHT OUTER | | 398 | 142K| 51325 (1)| 00:10:16 |
| 5 | TABLE ACCESS FULL | POS_RRRRRRRR | 150 | 2850 | 3 (0)| 00:00:01 |
|* 6 | HASH JOIN | | 398 | 134K| 51322 (1)| 00:10:16 |
|* 7 | HASH JOIN | | 152 | 9120 | 6904 (1)| 00:01:23 |
|* 8 | INDEX RANGE SCAN | IDX_OOOOOIZATION_1 | 29 | 638 | 2 (0)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | POS_MMMMMMM | 5057 | 187K| 6901 (1)| 00:01:23 |
| 10 | TABLE ACCESS BY INDEX ROWID| POS_TTTTT | 418K| 114M| 44414 (1)| 00:08:53 |
|* 11 | INDEX RANGE SCAN | POS_TTTTT_IDX2 | 418K| | 1850 (1)| 00:00:23 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TY"."TRANCODE"=SUBSTR("CC"."TRANCODE",5))
4 - access("CC"."COMPCODE"="RESP"."SSB_CODE"(+))
6 - access("CC"."M_MERID"="MECHANT"."MECHANTNO")
7 - access("MECHANT"."ORG_INNERCODE"="ORG"."org_Nnnnn_Code")
8 - access("ORG"."org_Nnnnn_Code" LIKE '1132%' AND "ORG"."ENABLE_FLAG"='Y')
filter("ORG"."org_Nnnnn_Code" LIKE '1132%' AND "ORG"."ENABLE_FLAG"='Y')
9 - filter("MECHANT"."ORG_INNERCODE" LIKE '1132%')
11 - access("CC"."M_TERMDATE"='20180621')
新环境(执行15分钟不出结果报错ORA-01555)的执行计划
Plan hash value: 185538578
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 772 | 50507 (1)| 00:10:07 |
| 1 | SORT ORDER BY | | 2 | 772 | 50507 (1)| 00:10:07 |
| 2 | NESTED LOOPS | | 2 | 772 | 50506 (1)| 00:10:07 |
| 3 | NESTED LOOPS | | 2 | 772 | 50506 (1)| 00:10:07 |
|* 4 | HASH JOIN OUTER | | 2 | 736 | 50504 (1)| 00:10:07 |
| 5 | NESTED LOOPS | | 2 | 694 | 50501 (1)| 00:10:07 |
| 6 | NESTED LOOPS | | 418K| 694 | 50501 (1)| 00:10:07 |
|* 7 | HASH JOIN | | 1 | 60 | 7141 (1)| 00:01:26 |
|* 8 | INDEX RANGE SCAN | IDX_OOOOOIZATION_1 | 1 | 22 | 2 (0)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | POS_MMMMMMM | 338 | 12844 | 7139 (1)| 00:01:26 |
|* 10 | INDEX RANGE SCAN | POS_TTTTT_IDX2 | 418K| | 1203 (1)| 00:00:15 |
|* 11 | TABLE ACCESS BY INDEX ROWID| POS_TTTTT | 3 | 861 | 43360 (1)| 00:08:41 |
| 12 | TABLE ACCESS FULL | POS_RRRRRRRR | 152 | 3192 | 3 (0)| 00:00:01 |
|* 13 | INDEX UNIQUE SCAN | PK_TRRRTYPE | 1 | | 0 (0)| 00:00:01 |
| 14 | TABLE ACCESS BY INDEX ROWID | TRRRTYPE | 1 | 18 | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CC"."COMPCODE"="RESP"."SSB_CODE"(+))
7 - access("MECHANT"."ORG_INNERCODE"="ORG"."org_Nnnnn_Code")
8 - access("ORG"."org_Nnnnn_Code" LIKE '1132%' AND "ORG"."ENABLE_FLAG"='Y')
filter("ORG"."ENABLE_FLAG"='Y' AND "ORG"."org_Nnnnn_Code" LIKE '1132%')
9 - filter("MECHANT"."ORG_INNERCODE" LIKE '1132%')
10 - access("CC"."M_TERMDATE"='20180625')
11 - filter("CC"."M_MERID"="MECHANT"."MECHANTNO")
13 - access("TY"."TRANCODE"=SUBSTR("CC"."TRANCODE",5))
两个执行计划的起点是一样的 ID=7 都是使用的HASH连接 出现分歧是在ID=6 (老环境hash 新环境nest loop) 原因是ID=7返回的rows 差异(老环境152 新环境为1) 而新环境rows为1的根本原因是 ID=8的索引范围扫描返回的结果rows为1
下面查一下过滤条件org_Nnnnn_Code列分布情况:
select count(1),substr(org_Nnnnn_Code,1,4)
from SMMMMVIEW.OOOOOIZATION org
group by substr(org_Nnnnn_Code,1,4) order by 1;
1 1027
...
1 1000
2 1031
...
2 1026
3 1037
...
31 1113
33 1132
34 1130
...
85 1109
由上面的结果可知:org_Nnnnn_Code列的分布不均匀。过滤条件里面"org_Nnnnn_Code" LIKE '1132%' 应该返回33行。这个SQL是页面查询SQL,org_Nnnnn_Code是类似机构号参数。每次传入的值都不固定
可能第一次传入一个让 "org_Nnnnn_Code" LIKE 'xxxx%' 返回1行的值。导致了这个SQL生成了错误的执行计划,由于绑定变量窥探,以后传入的每个值都按照第一次SQL的执行计划运行。所以当传入这个值的时候 SQL执行变慢了……
其实ORA-01555报错和这个没有直接关系,但是SQL如果能在 undo_retention的时间内出结果,肯定就不会出现ORA-01555
而优化SQL的关键就是让优化器知道 我每次传入的值不一样 而直方图的作用正是如此
1.查一下相关表和表上面索引的统计信息
-------------------------table----------------------
select num_rows,last_analyzed, stale_stats
from dba_tab_statistics
where table_name in ('POS_TTTTT',
'OOOOOIZATION',
'TRRRTYPE',
'POS_MMMMMMM',
'POS_RRRRRRRR');
38938532 2018/6/26 1:42:56 NO
152 2018/6/23 3:13:39 NO
323524 2018/6/23 3:13:25 NO
190 2018/6/24 3:01:24 NO
1505 2018/6/23 3:12:52 NO
------------------------index----------------------
select last_analyzed, stale_stats
from dba_ind_statistics
where table_name in ('POS_TTTTT',
'OOOOOIZATION',
'TRRRTYPE',
'POS_MMMMMMM',
'POS_RRRRRRRR');
2018/6/24 3:01:24 NO
2018/6/23 3:12:52 NO
2018/6/23 3:12:52 NO
2018/6/23 3:12:52 NO
2018/6/23 3:13:27 NO
2018/6/23 3:13:27 NO
2018/6/23 3:13:39 NO
2018/6/26 1:43:00 NO
2018/6/26 1:43:03 NO
2018/6/26 1:43:14 NO
2018/6/26 1:43:17 NO
----------------------histogram------------------
select table_name, column_name, num_distinct,histogram
from dba_tab_col_statistics a
where owner = 'SMMMMVIEW'
and table_name in ('OOOOOIZATION');
OOOOOIZATION ORG_ID 1505 NONE
OOOOOIZATION ORG_CODE 1492 NONE
OOOOOIZATION org_Nnnnn_Code 1505 NONE
OOOOOIZATION ORG_NAME_CN 1501 NONE
OOOOOIZATION PARENT_CODE 107 NONE
OOOOOIZATION ORG_TYPE 1 NONE
OOOOOIZATION ORG_LEVEL 4 NONE
OOOOOIZATION ADDRESS 524 NONE
OOOOOIZATION CREATE_DATE 668 NONE
OOOOOIZATION ENABLE_FLAG 2 NONE
OOOOOIZATION PARENT_BRANCHCODE 41 NONE
我们发现直方图缺失!下面对表OOOOOIZATION收集直方图
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(
OWNNAME=>'SMMMMVIEW',
TABNAME=>'OOOOOIZATION',
ESTIMATE_PERCENT=>100,
METHOD_OPT=>'FOR ALL COLUMNS SIZE AUTO',
CASCADE=>TRUE);
END;
/
注意:千万不要对全库收集直方图
查询直方图信息:
select table_name, column_name, num_distinct,histogram
from dba_tab_col_statistics a
where owner = 'SMMMMVIEW'
and table_name in ('OOOOOIZATION');
table_name column_name num_distinct histogram
OOOOOIZATION ORG_ID 1505 HEIGHT BALANCED
OOOOOIZATION ORG_CODE 1492 NONE
OOOOOIZATION org_Nnnnn_Code 1505 HEIGHT BALANCED
OOOOOIZATION ORG_NAME_CN 1501 HEIGHT BALANCED
OOOOOIZATION PARENT_CODE 107 FREQUENCY
OOOOOIZATION ORG_TYPE 1 FREQUENCY
OOOOOIZATION ORG_LEVEL 4 FREQUENCY
OOOOOIZATION ADDRESS 524 NONE
OOOOOIZATION CREATE_DATE 668 NONE
OOOOOIZATION ENABLE_FLAG 2 FREQUENCY
OOOOOIZATION PARENT_BRANCHCODE 41 FREQUENCY
收集直方图之后SQL重新解析就会生成正确的执行计划,关键是怎么做才能让SQL每次跑都重新解析呢?
提供 /*+ TO_CHAR(SYSDATE,'YYYYMMDD HH24:MI:SS') */ 临时解决绑定变量窥探的问题
Oracle 11g 中,自适应特性从一定程度解决了绑定变量窥探所导致的问题。
我们迁移数据库之后都会做收集统计信息的操作,但基本上不会去收集直方图,因为直方图的收集太消耗资源
所以当迁移之后SQL出现问题,一般的思路可以先考虑直方图缺失的问题