查询是从表或视图中获取数据的一个操作,最上层的语句称之为查询,而嵌套在另外一个查询中的语句称之为子查询。
如:
SELECT *
FROM sales
WHERE cust_id IN ( SELECT cust_id
FROM customers );
在上面的语句中,
( SELECT cust_id
FROM customers );
我们称之为子查询。
About Queries and Subqueries A query is an operation that retrieves data from one or more tables or views. In this reference, a top-level SELECT statement is called a
query, and a query nested within another SQL statement is called a subquery.
二、什么叫子查询展开以及为什么要展开
就是优化器将嵌套的子查询展开成一个等价的join,然后去优化这个join。上面的语句,不展开的情形是,从sales表中获取的每1条数据,都要代入子查询进行匹配。一般情况下效率都是比较低的。展开的结果可能是将sales表和customers表做一个hash join semi,从而提高效率。因为转换为join后可以从访问路径、连接方法、连接顺序等方面优化整个查询。
In subquery unnesting, the optimizer transforms a nested query into an equivalent join statement, and then optimizes the join. This transformation enables the optimizer to consider the subquery tables during access path, join method, and join order selection. The optimizer can perform this transformation only if the resulting join statement is guaranteed to return the same rows as the original statement, and if subqueries do not contain aggregate functions such as AVG.
三、展开分类
1)semi-join
当子查询前的where条件:exists/in/=any
CBO会这样转换:SU: Transform an ANY subquery to semi-join or distinct
2)anti-join
当子查询前的where条件:not exists/not in/<>all
CBO会这样转换:SU: Transform ALL/NOTEXISTS subquery into a regular anti-join
四、相关参数(默认为true)
KSPPINM KSPPSTVL KSPPDESC
---------------------------------------- ---------- --------------------------------------------------
_unnest_subquery TRUE enables unnesting of complex subqueries
_optimizer_unnest_all_subqueries TRUE enables unnesting of every type of subquery
hint:unnest/no_unnest
五、限制
必要条件:对于semi-join和anti-join都有一个共同的限制就是_unnest_subquery和_optimizer_unnest_all_subqueries需要同时为true,而且子查询不能与or在一个查询块内,如:select ... from xxx where exists(subquery) or xxx
1)semi-join
A.子查询内只存在一个表的
这种情况属于启发式查询转化,只要满足上述的必要条件,CBO就会做子查询展开,而不去考虑cost。但是实验证明在没有统计信息采用动态采样的情况下,子查询是否展开还是会基于cost的。
这里CBO选择了cost较低的不让子查询展开。
B.子查询内存在多表关联的
这种情况CBO会将子查询转化成一个内嵌视图后,再与外部查询做半连接,但是必须满足子查询展开后的COST<未子查询展开的COST
2)anti-join
A.NOT EXISTS
这种情况需要满足上述的必要条件,并且必须满足子查询展开后的COST<未子查询展开的COST
B.NOT IN/<>ALL
这种情况除了要需要满足上述的必要条件和子查询展开后的COST<未子查询展开的COST之外,由于对null敏感还需要满足下列条件:
(1)11G前必须连接列上有not null约束或者在sql上写明连接列is not null,CBO会对子查询展开做反连接(ANTI JOIN)
(2)11G以后只需要参数_optimizer_null_aware_antijoin=true,CBO会对子查询展开做一个针对null的反连接(ANTI NA JOIN)
注:11g之后如果_optimizer_null_aware_antijoin=false,但是连接列上有not null约束或者在sql上写明连接列is not null,CBO仍然会对子查询展开做一个正常的反连接(ANTI JOIN)
六、例子
6.1、数据准备,拷贝scott用户下的两张表,规避主外键约束、索引的影响
[oracle@sean ~]$ sqlplus scott/tiger
SQL*Plus: Release 11.2.0.4.0 Production on Mon Jul 10 14:24:28 2017
SCOTT@sean> create table t_emp as select * from emp;
Table created.
SCOTT@sean> create table t_dept as select * from dept;
Table created.
SCOTT@sean> exec dbms_stats.gather_schema_stats(user);
PL/SQL procedure successfully completed.
SCOTT@sean> set linesize 200
SCOTT@sean> set pagesize 200
SCOTT@sean> select * from t_emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
14 rows selected.
SCOTT@sean> select * from t_dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
SCOTT@sean> set autotrace traceonly exp
6.2、我们要观察的是以下这条语句,oracle将子查询展开,然后进行HASH JOIN SEMI连接。
SCOTT@sean> select * from t_emp where deptno in (select deptno from t_dept where dname='SALES');
Execution Plan
----------------------------------------------------------
Plan hash value: 741372481
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 255 | 6 (0)| 00:00:01 |
|* 1 | HASH JOIN SEMI | | 5 | 255 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("DEPTNO"="DEPTNO")
3 - filter("DNAME"='SALES')
6.3、使用no_unnest这个hint禁用子查询展开,可见使用了FILTER操作,执行计划总成本为12,高于上面展开的执行计划。
SCOTT@sean> select * from t_emp where deptno in (select /*+ no_unnest */ deptno from t_dept where dname='SALES');
Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 190 | 12 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T_DEPT" "T_DEPT"
WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))
3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES')
6.4、子查询和or连用,可见子查询无法展开
SCOTT@sean> select * from t_emp where deptno in (select deptno from t_dept where dname='SALES') or deptno=10;
Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 190 | 3 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPTNO"=10 OR EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT"
WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))
3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES')
SCOTT@sean> select * from t_emp where deptno in (select /*+ unnest */ deptno from t_dept where dname='SALES') or deptno=10;
Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 190 | 3 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPTNO"=10 OR EXISTS (SELECT /*+ UNNEST */ 0 FROM
"T_DEPT" "T_DEPT" WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))
3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES')
6.5、NOT IN/<>ALL对null敏感,注意规避和不同优化器版本的选择
SCOTT@sean> select * from t_emp where deptno not in (select deptno from t_dept where dname='SALES');
Execution Plan
----------------------------------------------------------
Plan hash value: 2687964391
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 459 | 6 (0)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 9 | 459 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("DEPTNO"="DEPTNO")
3 - filter("DNAME"='SALES')
SCOTT@sean> select value from v$parameter_valid_values where name = 'optimizer_features_enable';
VALUE
--------------------------------------------------------------------------------
......
9.2.0.8
10.1.0
10.1.0.3
10.1.0.4
10.1.0.5
10.2.0.1
10.2.0.2
10.2.0.3
10.2.0.4
10.2.0.5
11.1.0.6
11.1.0.7
11.2.0.1
11.2.0.2
11.2.0.3
11.2.0.4
11.2.0.4.1
32 rows selected.
优化器在10g模式下的情形,列上没有not null约束,也没有显示写明连接列is not null
SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept
where dname='SALES');
Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 342 | 12 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE
"DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1)))
3 - filter("DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1))
优化器在10g模式下的情形,写明连接列is not null
--外层限制 not null,无效果
SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept
where dname='SALES') and deptno is not null;
Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 342 | 12 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE
"DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1)))
2 - filter("DEPTNO" IS NOT NULL)
3 - filter("DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1))
--内层限制 not null,无效果
SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept where
dname='SALES' and deptno is not null);
Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 342 | 12 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE
"DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND LNNVL("DEPTNO"<>:B1)))
3 - filter("DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND
LNNVL("DEPTNO"<>:B1))
--内外层限制 not null
SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept where
dname='SALES' and deptno is not null);
Execution Plan
----------------------------------------------------------
Plan hash value: 3716950600
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 342 | 12 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| T_EMP | 14 | 532 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T_DEPT | 1 | 13 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE
"DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND LNNVL("DEPTNO"<>:B1)))
3 - filter("DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND
LNNVL("DEPTNO"<>:B1))
七、注意事项
CBO会选择cost最小的执行计划,所以他会在合适的时候做适当的展开,我们需要注意的就是不要一些限制因素使得子查询无法展开。比如:
子查询不能与or在一个查询块内,如:select ... from xxx where exists(subquery) or xxx
NOT IN/<>ALL由于对null敏感
......
三、四、五参考博客 http://blog.itpub.net/31347199/viewspace-2120832/