Oracle 子查询展开(subquery unnesting)

一、什么叫子查询
       查询是从表或视图中获取数据的一个操作,最上层的语句称之为查询,而嵌套在另外一个查询中的语句称之为子查询。

如:

SELECT * 
FROM   sales
WHERE  cust_id IN ( SELECT cust_id 
                    FROM   customers );

在上面的语句中,

( SELECT cust_id 
                    FROM   customers );
我们称之为子查询。
About Queries and Subqueries A query is an operation that retrieves data from one or more tables or views. In this reference, a top-level SELECT statement is called a
query, and a query nested within another SQL statement is called a subquery.


二、什么叫子查询展开以及为什么要展开
        就是优化器将嵌套的子查询展开成一个等价的join,然后去优化这个join。上面的语句,不展开的情形是,从sales表中获取的每1条数据,都要代入子查询进行匹配。一般情况下效率都是比较低的。展开的结果可能是将sales表和customers表做一个hash join semi,从而提高效率。因为转换为join后可以从访问路径、连接方法、连接顺序等方面优化整个查询。
       In subquery unnesting, the optimizer transforms a nested query into an equivalent join statement, and then optimizes the join. This transformation enables the optimizer to consider the subquery tables during access path, join method, and join order selection. The optimizer can perform this transformation only if the resulting join statement is guaranteed to return the same rows as the original statement, and if subqueries do not contain aggregate functions such as AVG.


三、展开分类
  1)semi-join
    当子查询前的where条件:exists/in/=any
    CBO会这样转换:SU: Transform an ANY subquery to semi-join or distinct
  2)anti-join
    当子查询前的where条件:not exists/not in/<>all
    CBO会这样转换:SU: Transform ALL/NOTEXISTS subquery into a regular anti-join


四、相关参数(默认为true)

KSPPINM                                  KSPPSTVL   KSPPDESC
---------------------------------------- ---------- --------------------------------------------------
_unnest_subquery                         TRUE       enables unnesting of complex subqueries
_optimizer_unnest_all_subqueries         TRUE       enables unnesting of every type of subquery
hint:unnest/no_unnest


五、限制
        必要条件:对于semi-join和anti-join都有一个共同的限制就是_unnest_subquery和_optimizer_unnest_all_subqueries需要同时为true,而且子查询不能与or在一个查询块内,如:select ... from xxx where exists(subquery) or xxx

 1)semi-join
    A.子查询内只存在一个表的
这种情况属于启发式查询转化,只要满足上述的必要条件,CBO就会做子查询展开,而不去考虑cost。但是实验证明在没有统计信息采用动态采样的情况下,子查询是否展开还是会基于cost的。
这里CBO选择了cost较低的不让子查询展开。

    B.子查询内存在多表关联的
这种情况CBO会将子查询转化成一个内嵌视图后,再与外部查询做半连接,但是必须满足子查询展开后的COST<未子查询展开的COST

 2)anti-join
    A.NOT EXISTS
这种情况需要满足上述的必要条件,并且必须满足子查询展开后的COST<未子查询展开的COST

    B.NOT IN/<>ALL
这种情况除了要需要满足上述的必要条件和子查询展开后的COST<未子查询展开的COST之外,由于对null敏感还需要满足下列条件:
(1)11G前必须连接列上有not null约束或者在sql上写明连接列is not null,CBO会对子查询展开做反连接(ANTI JOIN)
(2)11G以后只需要参数_optimizer_null_aware_antijoin=true,CBO会对子查询展开做一个针对null的反连接(ANTI NA JOIN)

注:11g之后如果_optimizer_null_aware_antijoin=false,但是连接列上有not null约束或者在sql上写明连接列is not null,CBO仍然会对子查询展开做一个正常的反连接(ANTI JOIN)


六、例子

6.1、数据准备,拷贝scott用户下的两张表,规避主外键约束、索引的影响

[oracle@sean ~]$ sqlplus scott/tiger
SQL*Plus: Release 11.2.0.4.0 Production on Mon Jul 10 14:24:28 2017

SCOTT@sean> create table t_emp as select * from emp;
Table created.

SCOTT@sean> create table t_dept as select * from dept;
Table created.

SCOTT@sean> exec dbms_stats.gather_schema_stats(user);
PL/SQL procedure successfully completed.

SCOTT@sean> set linesize 200
SCOTT@sean> set pagesize 200
SCOTT@sean> select * from t_emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10

14 rows selected.

SCOTT@sean> select * from t_dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        
SCOTT@sean> set autotrace traceonly exp

6.2、我们要观察的是以下这条语句,oracle将子查询展开,然后进行HASH JOIN SEMI连接。

SCOTT@sean> select * from t_emp where deptno in (select deptno from t_dept where dname='SALES');

Execution Plan
----------------------------------------------------------
Plan hash value: 741372481

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     5 |   255 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN SEMI    |        |     5 |   255 |     6   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DEPTNO"="DEPTNO")
   3 - filter("DNAME"='SALES')

6.3、使用no_unnest这个hint禁用子查询展开,可见使用了FILTER操作,执行计划总成本为12,高于上面展开的执行计划。

SCOTT@sean> select * from t_emp where deptno in (select /*+ no_unnest */ deptno from t_dept where dname='SALES');

Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     5 |   190 |    12   (0)| 00:00:01 |
|*  1 |  FILTER            |        |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T_DEPT" "T_DEPT"
              WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))
   3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES') 

6.4、子查询和or连用,可见子查询无法展开

SCOTT@sean> select * from t_emp where deptno in (select deptno from t_dept where dname='SALES') or deptno=10;

Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     5 |   190 |     3   (0)| 00:00:01 |
|*  1 |  FILTER            |        |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DEPTNO"=10 OR  EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT"
              WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))
   3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES')

SCOTT@sean> select * from t_emp where deptno in (select /*+ unnest */ deptno from t_dept where dname='SALES') or deptno=10;

Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     5 |   190 |     3   (0)| 00:00:01 |
|*  1 |  FILTER            |        |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DEPTNO"=10 OR  EXISTS (SELECT /*+ UNNEST */ 0 FROM
              "T_DEPT" "T_DEPT" WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))
   3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES')

6.5、NOT IN/<>ALL对null敏感,注意规避和不同优化器版本的选择

SCOTT@sean> select * from t_emp where deptno not in (select deptno from t_dept where dname='SALES');

Execution Plan
----------------------------------------------------------
Plan hash value: 2687964391

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     9 |   459 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN ANTI NA |        |     9 |   459 |     6   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DEPTNO"="DEPTNO")
   3 - filter("DNAME"='SALES')

SCOTT@sean> select value from v$parameter_valid_values where name = 'optimizer_features_enable';

VALUE
--------------------------------------------------------------------------------
......
9.2.0.8
10.1.0
10.1.0.3
10.1.0.4
10.1.0.5
10.2.0.1
10.2.0.2
10.2.0.3
10.2.0.4
10.2.0.5
11.1.0.6
11.1.0.7
11.2.0.1
11.2.0.2
11.2.0.3
11.2.0.4
11.2.0.4.1

32 rows selected.

优化器在10g模式下的情形,列上没有not null约束,也没有显示写明连接列is not null
SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept 
where dname='SALES');

Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     9 |   342 |    12   (0)| 00:00:01 |
|*  1 |  FILTER            |        |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE
              "DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1)))
   3 - filter("DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1))

优化器在10g模式下的情形,写明连接列is not null
--外层限制 not null,无效果
SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept 
where dname='SALES') and deptno is not null;

Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     9 |   342 |    12   (0)| 00:00:01 |
|*  1 |  FILTER            |        |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE
              "DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1)))
   2 - filter("DEPTNO" IS NOT NULL)
   3 - filter("DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1))

--内层限制 not null,无效果
SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept where 
dname='SALES' and deptno is not null);

Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     9 |   342 |    12   (0)| 00:00:01 |
|*  1 |  FILTER            |        |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE
              "DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND LNNVL("DEPTNO"<>:B1)))
   3 - filter("DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND
              LNNVL("DEPTNO"<>:B1))

--内外层限制 not null
SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept where 
dname='SALES' and deptno is not null);

Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     9 |   342 |    12   (0)| 00:00:01 |
|*  1 |  FILTER            |        |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE
              "DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND LNNVL("DEPTNO"<>:B1)))
   3 - filter("DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND
              LNNVL("DEPTNO"<>:B1))


七、注意事项

CBO会选择cost最小的执行计划,所以他会在合适的时候做适当的展开,我们需要注意的就是不要一些限制因素使得子查询无法展开。
比如:
子查询不能与or在一个查询块内,如:select ... from xxx where exists(subquery) or xxx
NOT IN/<>ALL由于对null敏感
......      

三、四、五参考博客 http://blog.itpub.net/31347199/viewspace-2120832/


  • 1
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值