通过将exists改写成in或这inner join优化sql。
Sql_id : 056bs9dzz8mwy
问题简述:逻辑读高。
Sql 文本:
SELECT A.*, a.rowid FROM WBANK.WD_BANK_BASEINFOMATION A WHERE EXISTS (SELECT 1 FROM (select KEYWORD, TYPECODE, INNERCODE, COUNT(*) FROM WBANK.WD_BANK_BASEINFOMATION WHERE SUBSTR(TYPECODE, 1, 3) = '001' GROUP BY KEYWORD, TYPECODE, INNERCODE HAVING COUNT(*) <> 1) B WHERE A.KEYWORD = B.KEYWORD AND A.TYPECODE = B.TYPECODE AND A.INNERCODE = B.INNERCODE); |
执行计划:
Execution Plan ---------------------------------------------------------- Plan hash value: 1318914978
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 130 | 7930K (1)| 39:39:10 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | WD_BANK_BASEINFOMATION | 2640K| 327M| 6249 (2)| 00:01:53 | |* 3 | FILTER | | | | | | | 4 | SORT GROUP BY NOSORT| | 1 | 47 | 3 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | IDX_WD_B_BI | 1 | 47 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter( EXISTS (SELECT 0 FROM "WBANK"."WD_BANK_BASEINFOMATION" "WD_BANK_BASEINFOMATION" WHERE "TYPECODE"=:B1 AND "KEYWORD"=:B2 AND "INNERCODE"=:B3 AND SUBSTR("TYPECODE",1,3)='001' GROUP BY "KEYWORD","TYPECODE","INNERCODE" HAVING COUNT(*)<>1)) 3 - filter(COUNT(*)<>1) 5 - access("KEYWORD"=:B1 AND "TYPECODE"=:B2 AND "INNERCODE"=:B3) filter("INNERCODE"=:B1 AND SUBSTR("TYPECODE",1,3)='001')
Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 2329554 consistent gets 13 physical reads 0 redo size 2507 bytes sent via SQL*Net to client 513 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed |
可以发现逻辑读高达 200 多万。
刚看到这个 sql 的时候猜想会不会逻辑有问题,导致结果集为空。跑了一遍发现结果集确实为空。子查询的 innercode 列全部为 null 。根据条件 A.INNERCODE = B.INNERCODE 外部表(虽然是同一张表)是不会有匹配结果的。转念一想如果子查询 innercode 列有非空的,那就不会有问题了。当然了还是要询问开发结果集与该列为空是否有必然联系,如果有联系的话可以利用该逻辑关系改写 sql 。当然,这是后话了。
看一下数据分布:
SQL> select count(*) from WBANK.WD_BANK_BASEINFOMATION;
COUNT(*) ---------- 2645546
SQL> select count(*) from (select KEYWORD, TYPECODE, INNERCODE, COUNT(*) 2 FROM WBANK.WD_BANK_BASEINFOMATION 3 WHERE SUBSTR(TYPECODE, 1, 3) = '001' 4 GROUP BY KEYWORD, TYPECODE, INNERCODE 5 HAVING COUNT(*) <> 1);
COUNT(*) ---------- 128 |
外层结果集是全表数据 260 多万。子查询结果集只有 128 条。而根据 oracle 对 exists 的处理,会以外部结果集为驱动,也就是说要执行 260 多万次,这显然是不合理的。