值得反复研读的表连接之HASH JOIN方式

两表关联返回少量数据应该走嵌套循环,两表关联返回数据量大的应该走HASH JOIN连接。

HASH JOIN :两表等值关联,将数据量小的表作为驱动表,将数据量大的表作为被驱动表,对驱动表的连接列进行HASH 生成HASH TABLE ,并且也对被驱动表的连接列进行HASH生成HASH TABLE。能够匹配的数据关联上。HASH JOIN 只支持等值连接。

scott@orclpdb1:orclcdb> 
select /*+gather_plan_statistics use_hash(e,d)*/
  2   e.ename, e.job, d.dname
  from emp e, dept d
  4   where e.deptno = d.deptno;

ENAME      JOB       DNAME
---------- --------- --------------
SMITH      CLERK     RESEARCH
ALLEN      SALESMAN  SALES
WARD       SALESMAN  SALES
JONES      MANAGER   RESEARCH
MARTIN     SALESMAN  SALES
BLAKE      MANAGER   SALES
CLARK      MANAGER   ACCOUNTING
SCOTT      ANALYST   RESEARCH
KING       PRESIDENT ACCOUNTING
TURNER     SALESMAN  SALES
ADAMS      CLERK     RESEARCH
JAMES      CLERK     SALES
FORD       ANALYST   RESEARCH
MILLER     CLERK     ACCOUNTING

14 rows selected.

scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  d3uaffp94gd3x, child number 0
-------------------------------------
select /*+gather_plan_statistics use_hash(e,d)*/  e.ename, e.job,
d.dname   from emp e, dept d  where e.deptno = d.deptno

Plan hash value: 615168685

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |     14 |00:00:00.01 |      54 |      5 |       |       |          |
|*  1 |  HASH JOIN         |      |      1 |     14 |     14 |00:00:00.01 |      54 |      5 |  1695K|  1695K| 1079K (0)|
|   2 |   TABLE ACCESS FULL| DEPT |      1 |      4 |      4 |00:00:00.01 |       6 |      5 |       |       |          |
|   3 |   TABLE ACCESS FULL| EMP  |      1 |     14 |     14 |00:00:00.01 |       7 |      0 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("E"."DEPTNO"="D"."DEPTNO")


21 rows selected.

scott@orclpdb1:orclcdb> 

执行计划中离HASH JOIN 最近的表为驱动表,则DEPT为驱动表,EMP为被驱动表。驱动表 扫描一次,被驱动表也扫描一次。

Note: NL中被驱动表需要扫描多次, HASH JOIN 中被驱动表只需扫描一次。

 Used-Mem表示HASH JOIN 消耗的PGA , 当驱动表太大、PGA不能完全容纳驱动表时,驱动表就会溢出到临时表空间,进而产生磁盘HASH JOIN。此刻HASH JOIN 会产生严重的性能问题。而NL无须消耗PGA.

NL 会进行值传递。(NL循环一次,驱动表连接列传值给被驱动表连接列)

HASH JOIN 没有传值的过程。(HASH JOIN 的驱动表与被驱动表的连接列都无须创建索引)

OLTP 环境(一般高并发小事务居多):SQL执行计划多以NL为主。SGA 设置较大,PGA设置较小。(NL不消耗PGA)

OLAP环境 (多数SQL都是大规模ETL),SQL返回结果集很多,SQL执行计划通常以HASH连接为主。往往会消耗大量的PGA,故PGA设置较大。

当两表使用外连接进行关联,如果执行计划走的是HASH JOIN ,想要更改驱动表,我们需要使用swap_join_inputs, 而不是leading.

scott@orclpdb1:orclcdb> 
explain plan for select /*+use_hash(d,e) leading(e)*/
  2   *
  from dept d
  4    left join emp e
  5      on d.deptno = e.deptno;

Explained.

scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3713469723

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    15 |   870 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN OUTER   |      |    15 |   870 |     6   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| DEPT |     4 |    80 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| EMP  |    14 |   532 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("D"."DEPTNO"="E"."DEPTNO"(+))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (2))
---------------------------------------------------------------------------

   1 -  SEL$2BFA4EE4
         U -  leading(e)

   2 -  SEL$2BFA4EE4 / D@SEL$1
         U -  use_hash(d,e)

25 rows selected.

scott@orclpdb1:orclcdb>

从上执行计划中所得,两表关联走的是HASH JOIN , 表明use_hash(d,e)生效,而驱动表为DEPT故设置leading(e) 未生效。因此需要使用swap_join_inputs来更改外连接中HASH JOIN的驱动表

scott@orclpdb1:orclcdb> explain plan for select /*+use_hash(d,e) swap_join_inputs(e)*/
  2   *
  from dept d
  4    left join emp e
  5      on d.deptno = e.deptno;

Explained.

scott@orclpdb1:orclcdb> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3590956717

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |    15 |   870 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT OUTER|      |    15 |   870 |     6   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL   | EMP  |    14 |   532 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL   | DEPT |     4 |    80 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("D"."DEPTNO"="E"."DEPTNO"(+))

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------

   3 -  SEL$2BFA4EE4 / D@SEL$1
         U -  use_hash(d,e)

22 rows selected.

scott@orclpdb1:orclcdb> 

 swap_join_inputs(e) 更改了外连接中的HASH JOIN 的驱动表

Q: 如何优化HASH JOIN ?

  • 避免写SELECT *  FROM 语句(由于HASH JOIN 将驱动表的select 检索列 和 join连接列 放到PGA中,尽可能的减少对PGA的占用,避免驱动表被溢出到临时表空间,从而提升查询性能。)
  • 对表进行拆分。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值