CBO hint:no_unnest,push_subq,push_pred的用法

最新推荐文章于 2024-09-17 22:21:37 发布

wenhuiqiao

最新推荐文章于 2024-09-17 22:21:37 发布

阅读量1.2k

点赞数

分类专栏： oracle 文章标签： access filter table join loops sql

oracle 专栏收录该内容

39 篇文章 0 订阅

订阅专栏

常常有人把这三个hint搞混，主要是因为对三种重写原理不清楚。特总结如下。（实验环境为10204）
1. no_unnest, unnest
unnest我们称为对子查询展开，顾名思义，就是别让子查询孤单地嵌套(nest)在里面。
所以un_unnest双重否定代表肯定，即让子查询不展开，让它嵌套(nest)在里面。
现做一个简单的实验：
create table hao1 as select * from dba_objects;
create table hao2 as select * from dba_objects;
analyze table hao1 compute statistics;
analyze table hao2 compute statistics;
SQL> select hao1.object_id from hao1 where exists
  2  (select 1 from hao2 where hao1.object_id=hao2.object_id*10);
1038 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2662903432
---------------------------------------------------------------------------
| Id  | Operation       | Name | Rows  | Bytes | Cost (%CPU)| Time    |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT |    |    1 |    8 | 80 (3)| 00:00:01 |
|*  1 |  HASH JOIN SEMI |    |    1 |    8 | 80 (3)| 00:00:01 |
| 2 | TABLE ACCESS FULL| HAO1 | 10662 | 42648 | 40 (3)| 00:00:01 |
| 3 | TABLE ACCESS FULL| HAO2 | 10663 | 42652 | 40 (3)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID"*10)

Statistics
----------------------------------------------------------
      0  recursive calls
      0  db block gets
      352  consistent gets
      0  physical reads
      0  redo size
   18715  bytes sent via SQL*Net to client
   1251  bytes received via SQL*Net from client
      71  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
   1038  rows processed

这里子查询自动展开（unnest），即HAO2和HAO1 hash join在一起。
接下来如果我们不希望HAO2展开，想先让它单独的执行完，然后再来和外部查询进行一种叫做FILTER的操作。
那么我们加入hint no_unnest:
SQL> select hao1.object_id from hao1 where exists
  2  (select /*+no_unnest*/ 1 from hao2 where hao1.object_id=hao2.object_id*10);
1038 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2565749733
---------------------------------------------------------------------------
| Id  | Operation       | Name | Rows  | Bytes | Cost (%CPU)| Time    |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT |    |    1 |    4 | 10750 (1)| 00:01:48 |
|*  1 |  FILTER          |    |    |    |          |       |
| 2 | TABLE ACCESS FULL| HAO1 | 10662 | 42648 | 40 (3)| 00:00:01 |
|*  3 | TABLE ACCESS FULL| HAO2 |    1 |    4 |    2 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "HAO2" "HAO2"
            WHERE "HAO2"."OBJECT_ID"*10=:B1))
3 - filter("HAO2"."OBJECT_ID"*10=:B1)

Statistics
----------------------------------------------------------
      1  recursive calls
      0  db block gets
1369157  consistent gets
      0  physical reads
      0  redo size
   18715  bytes sent via SQL*Net to client
   1251  bytes received via SQL*Net from client
      71  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
   1038  rows processed
这里HAO1和HAO2进行了一种FILTER操作，这个操作在《Cost Based Oracle Fundamental》此书第九章有介绍。他其实很像我们熟悉的neested loop，但它的独特之处在于会维护一个hash table。
举例，如果HAO1里取出object_id=1，那么对于HAO2来说即select 1 from hao2 where hao2.object_id*10=1,如果条件满足，那么对于子查询，输入输出对，即为(1（HAO1.object_id）,1（常量）)。
他存储在hash table里，并且由于条件满足，HAO1.object_id=1被放入结果集。
然后接着从HAO1取出object_id=2，如果子查询依旧条件满足，那么子查询产生另一个输入和输出，即(2,1)，被放入hash table里；并且HAO1.object_id=2被放入结果集。
接着假设HAO1里有重复的object_id，例如我们第三次从HAO1取出的object_id=2,那么由于我们对于子查询来说，已经有输入输出对 (2,1)在hash table里了，所以就不用去再次全表扫描HAO2了，ORACLE非常聪明地知道object_id=2是结果集。这里，filter和neested loop相比,省去了一次全表扫描HAO2。
这个hash table是有大小限制的，当被占满的时候，后续新的HAO1.object_id的FILTER就类似neested loop了。
由此可见，从buffer gets层面上来看，FILTER是应该优于neested loop的，尤其当外部查询需要传递给子查询的输入（此例中为HAO1.object_id）的distinct value非常小时，FILTER就会显得更优。
即使在我这个例子中，HAO1.object_id的distinct value上万，我对比了一下neested loop，FILTER仍然略优：
SQL> select /*+use_nl(hao1 hao2)*/ hao1.object_id from hao1,hao2 where hao1.object_id=hao2.object_id*10;

1038 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 251947914
---------------------------------------------------------------------------
| Id  | Operation       | Name | Rows  | Bytes | Cost (%CPU)| Time    |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT |    | 10663 | 85304 | 404K  (2)| 01:07:23 |
| 1 |  NESTED LOOPS    |    | 10663 | 85304 | 404K  (2)| 01:07:23 |
| 2 | TABLE ACCESS FULL| HAO1 | 10662 | 42648 | 40 (3)| 00:00:01 |
|*  3 | TABLE ACCESS FULL| HAO2 |    1 |    4 | 38 (3)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID"*10)

Statistics
----------------------------------------------------------
      1  recursive calls
      0  db block gets
1503621  consistent gets
      0  physical reads
      0  redo size
   18715  bytes sent via SQL*Net to client
   1251  bytes received via SQL*Net from client
      71  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
   1038  rows processed
FILTER的consistent gets是1369157，neested loop的consistent gets是1503621。
如果我们希望验证我前面的结论，我们可以用distinct value较小的object_type来做个类似的对比试验。
SQL> select hao1.object_id from hao1 where exists
  2  (select /*+no_unnest*/ 1 from hao2 where hao1.object_type=hao2.object_type);
10662 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2565749733
---------------------------------------------------------------------------
| Id  | Operation       | Name | Rows  | Bytes | Cost (%CPU)| Time    |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT |    | 288 |  3168 | 114 (1)| 00:00:02 |
|*  1 |  FILTER          |    |    |    |          |       |
| 2 | TABLE ACCESS FULL| HAO1 | 10662 | 114K| 40 (3)| 00:00:01 |
|*  3 | TABLE ACCESS FULL| HAO2 |    2 | 14 |    2 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "HAO2" "HAO2"
            WHERE "HAO2"."OBJECT_TYPE"=:B1))
3 - filter("HAO2"."OBJECT_TYPE"=:B1)

Statistics
----------------------------------------------------------
      1  recursive calls
      0  db block gets
   17012  consistent gets
      0  physical reads
      0  redo size
   187491  bytes sent via SQL*Net to client
   8302  bytes received via SQL*Net from client
      712  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
   10662  rows processed
可见，同样是HAO1和HAO2的全表扫描后的FILTER操作，却因为传给子查询的输入的distinct value的差别，两者相差的consistent gets却如此巨大，这跟neested loop是完全不一样的。
当然，对于如此的两个全表扫描的结果集，使用hash join是最佳方法。
SQL> select hao1.object_id from hao1 where exists
  2  (select 1 from hao2 where hao1.object_type=hao2.object_type);
10662 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3371915275
-----------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time    |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |    | 10662 | 187K| 81 (4)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT SEMI|    | 10662 | 187K| 81 (4)| 00:00:01 |
| 2 | TABLE ACCESS FULL  | HAO2 | 10663 | 74641 | 40 (3)| 00:00:01 |
| 3 | TABLE ACCESS FULL  | HAO1 | 10662 | 114K| 40 (3)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO1"."OBJECT_TYPE"="HAO2"."OBJECT_TYPE")

Statistics
----------------------------------------------------------
      1  recursive calls
      0  db block gets
      985  consistent gets
      0  physical reads
      0  redo size
   187491  bytes sent via SQL*Net to client
   8302  bytes received via SQL*Net from client
      712  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
   10662  rows processed
所以，什么时候该用no_unnest使得子查询能够独立的执行完毕之后再跟外围的查询做FILTER？
首先，子查询的返回结果集应该较小，然后外围查询的输入的distinct value也应该较小（例如object_type）。

2.push_subq
如果说no_unnest是为了让子查询不展开，独立的完成，那么push_subq就是为了让子查询最先进行join。
所以，这个hint其实是控制的join的顺序。
例如某次在生产库中遇到的一个SQL，简化一下然后模拟一下：
create table hao1 as select * from dba_objects;
create table hao2 as select * from dba_objects;
create table hao3 as select * from dba_objects;
create table hao4 as select * from dba_objects;
create index hao3idx on hao3(object_id);
（analyze all tables。）
select hao1.object_name from
hao1,hao2,hao4
where hao1.object_name like '%a%'
and hao1.object_id+hao2.object_id>50
and hao4.object_type=hao1.object_type
and 11 in
(SELECT  hao3.object_id FROM hao3 WHERE hao1.object_id = hao3.object_id);
对于如上的SQL，其中hao3和hao1在子查询中join，
很明显，如果先让hao1和hao3通过join，结果集估计只有一行，或者没有。
但是，此时CBO做出的执行计划为：
--------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time    |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |       | 89077 |  3131K|  2070M  (1)|999:59:59 |
|*  1 |  FILTER             |       |    |    |          |       |
|*  2 | HASH JOIN       |       |  3234M| 108G| 289K (24)| 00:48:17 |
| 3 | TABLE ACCESS FULL | HAO4 | 36309 | 212K| 126 (3)| 00:00:02 |
| 4 | NESTED LOOPS    |       |  3296K| 94M| 224K  (2)| 00:37:28 |
|*  5 |    TABLE ACCESS FULL| HAO1 |  1816 | 47216 | 126 (3)| 00:00:02 |
|*  6 |    TABLE ACCESS FULL| HAO2 |  1815 |  7260 | 124 (2)| 00:00:02 |
|*  7 | FILTER          |       |    |    |          |       |
|*  8 | INDEX RANGE SCAN  | HAO3IDX |    1 |    4 |    1 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ */ 0 FROM "HAO3" "HAO3" WHERE 11=:B1
            AND "HAO3"."OBJECT_ID"=11))
2 - access("HAO4"."OBJECT_TYPE"="HAO1"."OBJECT_TYPE")
5 - filter("HAO1"."OBJECT_NAME" LIKE '%a%')
6 - filter("HAO1"."OBJECT_ID"+"HAO2"."OBJECT_ID">50)
7 - filter(11=:B1)
8 - access("HAO3"."OBJECT_ID"=11)
由上可见，hao1和hao2，hao4先进行无穷无尽的join之后，最后才跟hao3 join，这是非常坏的plan。
于是，我们希望hao1和hao3所在子查询先join，可以采用push_subq：
select /*+push_subq(@tmp)*/ hao1.object_name from
hao1,hao2,hao4
where hao1.object_name like '%a%'
and hao1.object_id+hao2.object_id>50
and hao4.object_type=hao1.object_type
and 11 in
(SELECT /*+QB_Name(tmp)*/ hao3.object_id FROM hao3 WHERE hao1.object_id = hao3.object_id);
--------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time    |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |       | 161M|  5552M| 14749  (24)| 00:02:28 |
|*  1 |  HASH JOIN          |       | 161M|  5552M| 14748  (24)| 00:02:28 |
| 2 | TABLE ACCESS FULL  | HAO4 | 36309 | 212K| 126 (3)| 00:00:02 |
| 3 | NESTED LOOPS    |       | 164K|  4828K| 11386 (2)| 00:01:54 |
|*  4 | TABLE ACCESS FULL | HAO1 | 91 |  2366 | 126 (3)| 00:00:02 |
|*  5 |    FILTER          |       |    |    |          |       |
|*  6 |    INDEX RANGE SCAN| HAO3IDX |    1 |    4 |    1 (0)| 00:00:01 |
|*  7 | TABLE ACCESS FULL | HAO2 |  1815 |  7260 | 124 (2)| 00:00:02 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO4"."OBJECT_TYPE"="HAO1"."OBJECT_TYPE")
4 - filter("HAO1"."OBJECT_NAME" LIKE '%a%' AND  EXISTS (SELECT /*+
            PUSH_SUBQ QB_NAME ("TMP") */ 0 FROM "HAO3" "HAO3" WHERE 11=:B1 AND
            "HAO3"."OBJECT_ID"=11))
5 - filter(11=:B1)
6 - access("HAO3"."OBJECT_ID"=11)
7 - filter("HAO1"."OBJECT_ID"+"HAO2"."OBJECT_ID">50)

加上hint后，SQL会在1秒以内完成。

3.push_pred
在谈到push_pred这个hint时，首先要搞清楚mergeable view和unmergeable view的区别。
这个在concept上有明确解释：
Mergeable and Unmergeable Views The optimizer can merge a view into a referencing query block when the view has one or more base tables, provided the view does not contain:

set operators (UNION, UNION ALL, INTERSECT, MINUS)
a CONNECT BY clause
a ROWNUM pseudocolumn
- aggregate functions (AVG, COUNT, MAX, MIN, SUM) in the select list

When a view contains one of the following structures, it can be merged into a referencing query block only if complex view merging is enabled (as described below):

a GROUP BY clause
- a DISTINCT operator in the select list

View merging is not possible for a view that has multiple base tables if it is on the right side of an outer join. If a view on the right side of an outer join has only one base table, however, the optimizer can use complex view merging even if an expression in the view can return a non-null value for a NULL. See "Views in Outer Joins" for more information.
这里在最后，我们发现一个unmergeable view的一种情况就是view在outer join的右侧。
对于这种情况，我们熟知的merge hint也无效。
例如：
create or replace view haoview as
select hao1.* from hao1,hao2
where hao1.object_id=hao2.object_id;
那么对于这样一个简单的查询，可见谓词hao3.object_name=haoview.object_name被merge到了view中：
select  hao3.object_name
from hao3,haoview
where hao3.object_name=haoview.object_name
and hao3.object_id=999;
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time    |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT             |       |    1 | 44 | 129 (3)| 00:00:02 |
| 1 |  NESTED LOOPS                |       |    1 | 44 | 129 (3)| 00:00:02 |
|*  2 | HASH JOIN                |       |    1 | 40 | 128 (3)| 00:00:02 |
| 3 | TABLE ACCESS BY INDEX ROWID| HAO3 |    1 | 20 |    2 (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN       | HAO3IDX |    1 |    |    1 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL       | HAO1 | 36311 | 709K| 125 (2)| 00:00:02 |
|*  6 | INDEX RANGE SCAN          | HAO2IDX |    1 |    4 |    1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("HAO3"."OBJECT_NAME"="HAO1"."OBJECT_NAME")
4 - access("HAO3"."OBJECT_ID"=999)
6 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID")
接着，我把haoview放到outer join的右侧，这是haoview就属于unmergeable view了，优化器默认无法将谓词merge进这个haoview中，于是就看到了haoview单独先执行：
select  hao3.object_name
from hao3,haoview
where hao3.object_name=haoview.object_name(+)
and hao3.object_id=999;
----------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time    |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |       |    1 | 86 | 153 (5)| 00:00:02 |
|*  1 |  HASH JOIN OUTER          |       |    1 | 86 | 153 (5)| 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID| HAO3 |    1 | 20 |    2 (0)| 00:00:01 |
|*  3 | INDEX RANGE SCAN       | HAO3IDX |    1 |    |    1 (0)| 00:00:01 |
| 4 | VIEW                      | HAOVIEW | 36309 |  2340K| 150 (4)| 00:00:02 |
|*  5 | HASH JOIN                |       | 36309 | 850K| 150 (4)| 00:00:02 |
| 6 |    INDEX FAST FULL SCAN    | HAO2IDX | 36309 | 141K| 22 (5)| 00:00:01 |
| 7 |    TABLE ACCESS FULL       | HAO1 | 36311 | 709K| 125 (2)| 00:00:02 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO3"."OBJECT_NAME"="HAOVIEW"."OBJECT_NAME"(+))
3 - access("HAO3"."OBJECT_ID"=999)
5 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID")
接着，我们来使用这里的hint push_pred强制优化器将谓词merge进view中，可见到“VIEW PUSHED PREDICATE”：
select /*+push_pred(haoview)*/ hao3.object_name
from hao3,haoview
where hao3.object_name=haoview.object_name(+)
and hao3.object_id=999;
----------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time    |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |       |    1 | 40 | 128 (2)| 00:00:02 |
| 1 |  NESTED LOOPS OUTER       |       |    1 | 40 | 128 (2)| 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID| HAO3 |    1 | 36 |    2 (0)| 00:00:01 |
|*  3 | INDEX RANGE SCAN       | HAO3IDX |    1 |    |    1 (0)| 00:00:01 |
| 4 | VIEW PUSHED PREDICATE    | HAOVIEW |    1 |    4 | 126 (2)| 00:00:02 |
| 5 | NESTED LOOPS             |       |    1 | 24 | 126 (2)| 00:00:02 |
|*  6 |    TABLE ACCESS FULL       | HAO1 |    1 | 20 | 125 (2)| 00:00:02 |
|*  7 |    INDEX RANGE SCAN       | HAO2IDX |    1 |    4 |    1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("HAO3"."OBJECT_ID"=999)
6 - filter("HAO1"."OBJECT_NAME"="HAO3"."OBJECT_NAME")
7 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID")
于是，会有同学问，那么merge hint能否有同样的效果呢？答案是，对于这种unmergeable view来说，merge hint无效。
select /*+merge(haoview)*/ hao3.object_name
from hao3,haoview
where hao3.object_name=haoview.object_name(+)
and hao3.object_id=999;

----------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time    |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |       |    1 | 86 | 153 (5)| 00:00:02 |
|*  1 |  HASH JOIN OUTER          |       |    1 | 86 | 153 (5)| 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID| HAO3 |    1 | 20 |    2 (0)| 00:00:01 |
|*  3 | INDEX RANGE SCAN       | HAO3IDX |    1 |    |    1 (0)| 00:00:01 |
| 4 | VIEW                      | HAOVIEW | 36309 |  2340K| 150 (4)| 00:00:02 |
|*  5 | HASH JOIN                |       | 36309 | 850K| 150 (4)| 00:00:02 |
| 6 |    INDEX FAST FULL SCAN    | HAO2IDX | 36309 | 141K| 22 (5)| 00:00:01 |
| 7 |    TABLE ACCESS FULL       | HAO1 | 36311 | 709K| 125 (2)| 00:00:02 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("HAO3"."OBJECT_NAME"="HAOVIEW"."OBJECT_NAME"(+))
3 - access("HAO3"."OBJECT_ID"=999)
5 - access("HAO1"."OBJECT_ID"="HAO2"."OBJECT_ID")

可见，对于此种身处outger join右侧的view来说，merge hint已经无能为力了。

综上，对于大家比较容易混淆的三个hint：
no_unnest/unnest是针对子查询是否展开的，push_subq是针对子查询的连接顺序的，push_pred则是针对 unmergeable view使用外部查询谓词。