Oracle 驱动表
Oracle驱动表也叫做外部表,也叫外层表,是在多表关联查询中首先遍历的表,驱动表的每一行都要到另一个表中寻找相应的记录,然后计算返回最终数据。
驱动表的概念只在nested loops和hash join时存在。
原则:
1. 驱动表一般是小表,但不绝对,看下边
2. 驱动表一般是通过where条件筛选后剩余行数较少的表。
3. 如果表的一条记录很长,占用几个数据块也适合做驱动表
4. CBO和RBO中,对于驱动表的选择是不同的,CBO中通过对统计信息的参考进行计算来选择驱动表,而RBO中按照既定原则选择驱动表。
5. RBO中,from后边最右边的表为驱动表(from后边表从右向左遍历,where条件从下向上遍历)
6. 涉及驱动表的查询,连接条件的索引很重要,驱动表连接字段可以没有索引,但是被驱动表需要被扫描驱动表经过筛选后剩余条数的遍数,所以被驱动表的连接字段上有一条索引是非常重要的。
分析:
假设a表10行记录,b表1000行记录,两个表都有id列,查询时使用id列进行关联
Select * from a,b where a.id=b.id anda.id=100;
A表作为驱动表比较合适,假设a.id=100只有1行,即使全表扫描a表也就几个块,假设a表占用10个块。
B表的id假如非唯一,如果b表的id列有索引,b表占用100个块,每个块10行记录,id列索引占用10个块,并且id为100有2条记录,在两个块中
那么这条语句的成本(以块计算,下同):
A表(10个块)*b表索引(10个块)+b表id为100的2个块=102个块
如果b表没有索引,成本为:
A表(10个块)*b表(100个块)=1000个块
如果a,b表都没有索引,可以看出不管哪个表作为驱动表,语句的执行成本都是一样的。
如果a,b表id列都有索引,a表id列索引占2个块,成本为:
A表id列索引(2个块)*b表id列索引(10个块)+ b表id为100的2个块=22个块
如果B表的记录很长,可以作为驱动表的情况比较复杂,大家可以自己想象适合的场景。
可以看出,在连接中,如果连接列有索引是多么的重要。
实验支撑
SQL> create table a(id,name) as selectobject_id,object_name from all_objects where rownum < 200;
Table created.
SQL>
SQL> create table b as select * fromall_objects ;
Table created.
SQL> select count(*) from a;
COUNT(*)
----------
199
SQL> select count(*) from b
SQL>
COUNT(*)
----------
89083
SQL>
SQL> execdbms_stats.gather_table_stats('TEST','A');
PL/SQL procedure successfully completed.
SQL>
SQL> execdbms_stats.gather_table_stats('TEST','B');
PL/SQL procedure successfully completed.
两个表都没有索引
Select count(*) from a,b wherea.id=b.object_id
And a.id=53
执行计划:(B表驱动)
SQL> Select count(*) from a,b wherea.id=b.object_id
2 And a.id=53
3 /
COUNT(*)
----------
1
Execution Plan
----------------------------------------------------------
Plan hash value: 319234518
----------------------------------------------------------------------------
| Id | Operation | Name |Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 9 | 420 (1)| 00:00:01 |
| 1| SORT AGGREGATE | | 1 | 9 | | |
|* 2| HASH JOIN | | 1 | 9 | 420 (1)| 00:00:01 |
|* 3| TABLE ACCESS FULL| B | 1 | 5 | 417 (1)| 00:00:01 |
|* 4| TABLE ACCESS FULL| A | 1 | 4 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified byoperation id):
---------------------------------------------------
2- access("A"."ID"="B"."OBJECT_ID")
3- filter("B"."OBJECT_ID"=53)
4- filter("A"."ID"=53)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1506 consistent gets
0 physical reads
0 redo size
542 bytes sent via SQL*Net toclient
543 bytes received via SQL*Netfrom client
2 SQL*Net roundtrips to/fromclient
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
A表作为驱动表
SQL> Select /*+ ordered use_nl(a) */count(*) from a,b where a.id=b.object_id
2 Anda.id=53;
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1397777030
----------------------------------------------------------------------------
| Id | Operation | Name |Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 9 | 420 (1)| 00:00:01 |
| 1| SORT AGGREGATE | | 1 | 9 | | |
|* 2| HASH JOIN | | 1 | 9 | 420 (1)| 00:00:01 |
|* 3| TABLE ACCESS FULL| A | 1 | 4 | 3 (0)| 00:00:01 |
|* 4| TABLE ACCESS FULL| B | 1 | 5 | 417 (1)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified byoperation id):
---------------------------------------------------
2 -access("A"."ID"="B"."OBJECT_ID")
3- filter("A"."ID"=53)
4- filter("B"."OBJECT_ID"=53)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1506 consistent gets
0 physical reads
0 redo size
542 bytes sent via SQL*Net toclient
543 bytes received via SQL*Netfrom client
2 SQL*Net roundtrips to/fromclient
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
发现上面两个语句的代价是一样的
/*+ Ordered use_nl(table_name) */ --使用hint强制表作为驱动表,另外,这里使用的use_nl,但是走的是hash join,说明在没有索引的情况下,oracle优化器更倾向hash join。
执行计划中,hash join下第一个表为驱动表,此处为A表。
表B object_id列有索引的情况
SQL> create index id_b_object_id onb(object_id);
Index created.
SQL> execdbms_stats.gather_table_stats(ownname => 'TEST',TABNAME => 'B',CASCADE=> TRUE);
PL/SQL procedure successfully completed.
SQL>
执行计划:
SQL> Select count(*) from a,b wherea.id=b.object_id
2 And a.id=53;
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3168189658
----------------------------------------------------------------------------------------
| Id | Operation |Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 9 | 4 (0)| 00:00:01 |
| 1| SORT AGGREGATE | | 1 | 9 | | |
| 2| MERGE JOINCARTESIAN| | 1 | 9 | 4 (0)| 00:00:01 |
|* 3| TABLE ACCESS FULL | A | 1 | 4 | 3 (0)| 00:00:01 |
| 4| BUFFER SORT | | 1 | 5 | 1 (0)| 00:00:01 |
|* 5| INDEX RANGE SCAN | ID_B_OBJECT_ID | 1 | 5 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified byoperation id):
---------------------------------------------------
3- filter("A"."ID"=53)
5 -access("B"."OBJECT_ID"=53)
Statistics
----------------------------------------------------------
92 recursive calls
0 db block gets
134 consistent gets
23 physical reads
0 redo size
542 bytes sent via SQL*Net toclient
543 bytes received via SQL*Netfrom client
2 SQL*Net roundtrips to/fromclient
12 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
发现执行计划并没有使用nested loop和hash join,不过走索引后,执行代价明显减少。Merge join发生了排序,如果内存够用还好,不够用就比较耗时了。
强制hash
A表驱动
SQL> Select /*+ use_hash(a,b) */count(*) from a,b where a.id=b.object_id
2 And a.id=53;
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 895278611
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 9 | 4 (0)| 00:00:01 |
| 1| SORT AGGREGATE | | 1 | 9 | | |
|* 2| HASH JOIN | | 1 | 9 | 4 (0)| 00:00:01 |
|* 3| TABLE ACCESS FULL| A | 1 | 4 | 3 (0)| 00:00:01 |
|* 4| INDEX RANGE SCAN | ID_B_OBJECT_ID| 1 | 5 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified byoperation id):
---------------------------------------------------
2- access("A"."ID"="B"."OBJECT_ID")
3- filter("A"."ID"=53)
4- access("B"."OBJECT_ID"=53)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
542 bytes sent via SQL*Net toclient
543 bytes received via SQL*Netfrom client
2 SQL*Net roundtrips to/fromclient
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
--强制使用hash join,a表默认变为了驱动表,执行代价很低,符合要求
B表驱动
SQL> Select /*+ ordered use_hash(b) */count(*) from a,b where a.id=b.object_id
2 And a.id=53;
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 895278611
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 9 | 4 (0)| 00:00:01 |
| 1| SORT AGGREGATE | | 1 | 9 | | |
|* 2| HASH JOIN | | 1 | 9 | 4 (0)| 00:00:01 |
|* 3| TABLE ACCESS FULL| A | 1 | 4 | 3 (0)| 00:00:01 |
|* 4| INDEX RANGE SCAN | ID_B_OBJECT_ID| 1 | 5| 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified byoperation id):
---------------------------------------------------
2- access("A"."ID"="B"."OBJECT_ID")
3- filter("A"."ID"=53)
4- access("B"."OBJECT_ID"=53)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
542 bytes sent via SQL*Net toclient
543 bytes received via SQL*Netfrom client
2 SQL*Net roundtrips to/fromclient
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
发现有索引,并且有统计信息的情况下,无法强制B表作为驱动表,oracle对hint进行了忽略。
删除统计信息试试:
SQL> EXEC dbms_stats.delete_table_stats(user,'B',cascade_parts =>TRUE);
PL/SQL procedure successfully completed
SQL> EXEC dbms_stats.delete_table_stats(user,'A',cascade_parts =>TRUE);
PL/SQL procedure successfully completed
SQL>
--测试发现仍然不能将B表作为驱动表,修改optimizer_mode为rule
alter session set optimizer_mode=rule;
SQL> Select /*+ ordered use_nl(b) */count(*) from a,b where a.id=b.object_id
2 And object_id=53;
--发现仍然不能将B表作为驱动表
强制nested loop
SQL> Select /*+ ordered use_nl(b) */count(*) from a,b where a.id=b.object_id
2 And object_id=53;
COUNT(*)
----------
1
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1183094437
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 26 | 4 (0)| 00:00:01 |
| 1| SORT AGGREGATE | | 1 | 26 | | |
| 2| NESTED LOOPS | | 1 | 26 | 4 (0)| 00:00:01 |
|* 3| TABLE ACCESS FULL| A | 1 | 13 | 3 (0)| 00:00:01 |
|* 4| INDEX RANGE SCAN | ID_B_OBJECT_ID| 1 | 13 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified byoperation id):
---------------------------------------------------
3- filter("A"."ID"=53)
4- access("OBJECT_ID"=53)
Note
-----
-dynamic statistics used: dynamic sampling (level=2)
Statistics
----------------------------------------------------------
10 recursive calls
0 db block gets
73 consistent gets
1 physical reads
0 redo size
542 bytes sent via SQL*Net toclient
543 bytes received via SQL*Netfrom client
2 SQL*Net roundtrips to/fromclient
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
--代价和hash join差不多,另外,即使强制B表作为驱动表,仍然不能将B表作为驱动表。
两个都有索引的情况
SQL> create index id_a_id on a(id);
Index created.
SQL> execdbms_stats.gather_table_stats(user,'A',CASCADE=>TRUE);
PL/SQL procedure successfully completed.
SQL> execdbms_stats.gather_table_stats(user,'B',cascade => true);
PL/SQL procedure successfully completed.
SQL>
SQL> Select /*+ ordered use_nl(b) */count(*) from a,b where a.id=b.object_id
2 And object_id=53;
COUNT(*)
----------
1
1 row selected.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 2751652919
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | 9 | 2 (0)| 00:00:01 |
| 1| SORT AGGREGATE | | 1 | 9 | | |
| 2| NESTED LOOPS | | 1 | 9 | 2 (0)| 00:00:01 |
|* 3| INDEX RANGE SCAN| ID_A_ID | 1 | 4 | 1 (0)| 00:00:01 |
|* 4| INDEX RANGE SCAN| ID_B_OBJECT_ID| 1 | 5 | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified byoperation id):
---------------------------------------------------
3- access("A"."ID"=53)
4- access("OBJECT_ID"=53)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
542 bytes sent via SQL*Net toclient
543 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/fromclient
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
--hint强制不能将B表作为驱动表
代价明显变小,又减少一倍(索引是多么重要)
我这里使用的是12c的库,发现12c对于执行计划的准确性确实有提高,hint作为辅助手段越来越显得必要性很小,这是dba要失业的劲头还是帮助dba减轻负担,~~