前两天遇到的SQL调优问题。
关键字:索引使用效率 谓词筛选率。
[@more@]问题描述:
某SQL执行时间不能忍受,2小时以上未出结果。
环境:
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Productio
NLSRTL Version 10.2.0.4.0 – Production
SQL:
select count(distinct(case
when c.sumdutypaid <> 0 then
c.claimno
else
null
end)),
count(distinct(case
when c.sumdutypaid = 0 then
c.claimno
else
null
end)),
sum(c.sumdutypaid)
FROM (SELECT claimno, sumdutypaid
FROM CCIC.PRPLCOMPENSATE
WHERE SUBSTR(FLAG, 4, 1) = '0'
AND underwriteflag in ('1', '3')
and (SUBSTR(flag, 6, 1) <> 'p' or SUBSTR(flag, 6, 1) is null)
AND compensatetype <> '3'
AND compensateno like 'CDD%') c,
(SELECT claimno
FROM ccic.prplclaimtoplatformsuccess
WHERE uploaddate between
to_date('2011-06-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') and
to_date('2011-06-30 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
AND requesttype = '82'
AND comflag = 'J') s
WHERE s.claimno = c.claimno;
基础数据:
表CCIC.PRPLCOMPENSATE数据量1000W+,4G多。
表ccic.prplclaimtoplatformsuccess较小,不计。
处理过程:
1.首先查执行计划
SQL> set autotrace traceonly
SQL> select * from dual;
Execution Plan
----------------------------------------------------------
ERROR:
ORA-00604: error occurred at recursive SQL level 2
ORA-16000: database open for read-only access
SP2-0612: Error generating AUTOTRACE EXPLAIN report
Statistics
----------------------------------------------------------
24 recursive calls
0 db block gets
8 consistent gets
4 physical reads
0 redo size
511 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
报错。该库是将一个生产的DG库启动为READ_ONLY模式供查询使用。由于不能做写操作,set autotrace和explain plan for then select * from table(dbms_xplan.display)的方式都获取不到执行计划,因为这两种方法都是将执行计划写入一个表中,再从该表读出展示(个人分析应该是这个原因)。故使用sql_trace生成后台trace来获取执行计划,具体操作略。
以下是执行计划,第一个是完整sql,第二个是s子表,第三个是c子表,执行过程的统计信息(parse等)对于一、三无意义(因为时间太长没有执行完,手动中止了)。
*******************************************************************************
select count (distinct (case when c.sumdutypaid <>0 then c.claimno else null end)),
count (distinct (case when c.sumdutypaid = 0 then c.claimno else null end)),
sum(c.sumdutypaid)
FROM (SELECT claimno,sumdutypaid FROM CCIC.PRPLCOMPENSATE
WHERE SUBSTR(FLAG, 4, 1) = '0'
AND underwriteflag in ('1', '3')
and (SUBSTR(flag, 6, 1) <> 'p' or SUBSTR(flag, 6, 1) is null)
AND compensatetype <> '3'
AND compensateno like 'CDD%') C,
(SELECT claimno FROM ccic.prplclaimtoplatformsuccess
WHERE uploaddate between
to_date('2011-06-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') and
to_date('2011-06-30 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
AND requesttype = '82'
AND comflag = 'J' )s
WHERE s.claimno = c.claimno
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.11 6.59 890 1168 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.11 6.59 890 1168 0 0
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Rows Row Source Operation
------- ---------------------------------------------------
0 SORT GROUP BY (cr=0 pr=0 pw=0 time=152 us)
0 HASH JOIN (cr=0 pr=0 pw=0 time=40 us)
0 TABLE ACCESS BY INDEX ROWID PRPLCOMPENSATE (cr=0 pr=0 pw=0 time=32 us)
1218 INDEX RANGE SCAN PK_LCOMPENSATE (cr=15 pr=14 pw=0 time=38117 us)(object id 50498)
0 TABLE ACCESS FULL PRPLCLAIMTOPLATFORMSUCCESS (cr=0 pr=0 pw=0 time=0 us)
*******************************************************************************
SELECT count(1) FROM ccic.prplclaimtoplatformsuccess
WHERE uploaddate between
to_date('2011-06-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') and
to_date('2011-06-30 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
AND requesttype = '82'
AND comflag = 'J'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 1.86 15.61 35214 66794 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 1.86 15.61 35214 66794 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=66794 pr=35214 pw=0 time=15611750 us)
2876 TABLE ACCESS FULL PRPLCLAIMTOPLATFORMSUCCESS (cr=66794 pr=35214 pw=0 time=14404921 us)
*******************************************************************************
SELECT count(1) FROM CCIC.PRPLCOMPENSATE
WHERE SUBSTR(FLAG, 4, 1) = '0'
AND underwriteflag in ('1', '3')
and (SUBSTR(flag, 6, 1) <> 'p' or SUBSTR(flag, 6, 1) is null)
AND compensatetype <> '3'
AND compensateno like 'CDD%'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 8.14 190.72 55941 358126 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 8.14 190.73 55941 358126 0 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Rows Row Source Operation
------- ---------------------------------------------------
0 SORT AGGREGATE (cr=0 pr=0 pw=0 time=64 us)
0 TABLE ACCESS BY INDEX ROWID PRPLCOMPENSATE (cr=0 pr=0 pw=0 time=38 us)
362388 INDEX RANGE SCAN PK_LCOMPENSATE (cr=2896 pr=2864 pw=0 time=20315365 us)(object id 50498)
*******************************************************************************
2.初步判断:
·c子表返回结果集可能太大,执行较长时间没有执行完,这个子查询的消耗比较大。
·s子表返回2000多行,这个很小,也很快,不会是性能瓶颈。
·c和s等值连接时,只要s作驱动表,无论NL还是HashJoin都是效率比较好的。不过目前是c作驱动表,而c又比较大,故效率很差。
调整s作驱动表,sql及执行计划如下:
*******************************************************************************
select/*+ ORDERED */ count (distinct (case when c.sumdutypaid <>0 then c.claimno else null end)),
count (distinct (case when c.sumdutypaid = 0 then c.claimno else null end)),
sum(c.sumdutypaid)
FROM (SELECT claimno FROM ccic.prplclaimtoplatformsuccess
WHERE uploaddate between
to_date('2011-06-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') and
to_date('2011-06-30 23:59:59', 'YYYY-MM-DD HH24:MI:SS')
AND requesttype = '82'
AND comflag = 'J' )s,
(SELECT claimno,sumdutypaid FROM CCIC.PRPLCOMPENSATE
WHERE SUBSTR(FLAG, 4, 1) = '0'
AND underwriteflag in ('1', '3')
and (SUBSTR(flag, 6, 1) <> 'p' or SUBSTR(flag, 6, 1) is null)
AND compensatetype <> '3'
AND compensateno like 'CDD%') c
WHERE s.claimno = c.claimno
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.11 0.40 27 149 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.18 2.33 3711 5807 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.29 2.73 3738 5956 0 0
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS
Rows Row Source Operation
------- ---------------------------------------------------
0 SORT GROUP BY (cr=0 pr=0 pw=0 time=38 us)
0 HASH JOIN (cr=0 pr=0 pw=0 time=11 us)
0 TABLE ACCESS FULL PRPLCLAIMTOPLATFORMSUCCESS (cr=0 pr=0 pw=0 time=5 us)
0 TABLE ACCESS BY INDEX ROWID PRPLCOMPENSATE (cr=0 pr=0 pw=0 time=0 us)
0 INDEX RANGE SCAN PK_LCOMPENSATE (cr=0 pr=0 pw=0 time=0 us)(object id 50498)
*******************************************************************************
但是执行的仍然很慢。
3.进一步判断:
c子查询的效率低是主要因素。重点分析c子查询。
将c子查询的数据量跑出,为90W+记录,数据量不算大,但是跑的时间很长。
c子查询中,对PK的索引进行index range scan,即使首先使用条件(compensateno like 'CDD%'),选出数据后再使用其它条件过滤,经查询,符合条件(compensateno like 'CDD%')的记录有将近1KW条,
SQL>SELECT count(1) FROM CCIC.PRPLCOMPENSATE WHERE compensateno like 'CDD%';
COUNT(1)
-------------
9763692
根据统计信息中该标的总记录数为1KW多条(SQL如下),那么使用全表扫描要优于使用索引(结果集占整个表记录的比例很高)
SQL>select TABLE_NAME,OWNER,LAST_ANALYZED,NUM_ROWS from dba_tables where OWNER='CCIC' and TABLE_NAME='PRPLCOMPENSATE';
但是这张表有4000多M(SQL如下),全表扫描的消耗也会很大
select 'PRPLCOMPENSATE' segment_name,
'CCIC' owner,
sum(bytes) / 1024 / 1024 || 'M' capacity
from dba_segments
where segment_name = 'PRPLCOMPENSATE'
and owner = 'CCIC';
4.最后的判断:
最根本的原因:对PK的使用不好(不是PK不好,而是本sql对PK的使用不好),对PK的使用条件(compensateno like 'CDD%')不能筛选掉大多数数据,c的结果应该是90多w记录,而使用PK筛选得到900多W记录。这与我们使用索引的初衷是不相符的(我们希望使用索引能够尽可能多的过滤掉数据,这样才够快)。
另外一个佐证,执行以下sql
SELECT count(1) FROM CCIC.PRPLCOMPENSATE WHERE compensateno like 'C8%';
形式跟SELECT count(1) FROM CCIC.PRPLCOMPENSATE WHERE compensateno like 'CDD%';一样,但是条件的筛选度高,速度1-2s
10:32:56 SQL>SELECT count(1) FROM CCIC.PRPLCOMPENSATE WHERE compensateno like 'C8%';
COUNT(1)
-------------
9763692
10:32:58 SQL>SELECT count(1) FROM CCIC.PRPLCOMPENSATE WHERE compensateno like 'C8%';
COUNT(1)
-------------
9763692
10:32:59 SQL>
5.优化建议:
因为最主要的原因是使用索引的条件筛选率低,所以解决方法也从这里入手。
1-提高使用PK的效率。进一步挖掘业务逻辑的关联性,从而改变对PK的使用。比如c子查询中还有条件SUBSTR(FLAG, 4, 1) = '0',那么这个条件在业务含义上是否能进一步精确对PK的使用?比如根据业务逻辑凡是满足SUBSTR(FLAG, 4, 1) = '0'的纪录,compensateno都是以CCDXXXXX开头,那么相应可以改写compensateno like 'CDD%' 为compensateno like 'CDDXXXXX%',这样可能对PK的使用更精确,能够过滤更多的数据。
2-谓词条件不变的情况下使用新的更高效的索引。如使用组合索引,跟据c子查询中的若干条件确定一个组合索引,使用该组合索引提高筛选率。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/21129591/viewspace-1052095/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/21129591/viewspace-1052095/