执行计划中的NestLoop对比HashJoin对比

Kingbase 研究院

于 2024-04-03 17:03:00 发布

阅读量889

点赞数 3

文章标签： android

本文链接：https://blog.csdn.net/lyu1026/article/details/137883861

版权

执行计划中的nestloop join 对比hash join

两种join 方式的定义

NESTE LOOP:

在嵌套循环中，内表被外表驱动，外表返回的每一行都要在内表中检索找到与它匹配的行。两个概念：驱动表(外部表)和内部表，这里用表这个次其实不是很准确，外部表和内部表可以是某张表的结果集。

在执行计划中如何区分外部表和内部表。我的通常习惯是安装前后顺序，nestloop join 里面位于上面的是外部表，位于下面的是内部表。一般情况下外部表的loop=1，内部表的loop=外部表的行数。

如下：

根据定义我们可以得到 nest loop 适合于一下两个场景

1、外部表结果集较小，循环次数就少。内部表如果是大表一定要保证大表对应的join字段有索引。

2、外部表结果集较大，内部表结果集较少的场景，同样可以保证nest loop 性能较优，但是如果外部表结果集非常大，循环次数就会很多，就会导致查询缓慢，这时候用hash join 可能效果会更好。

所以总结下来如果要优化nest loop join 有两个方向

1、减少驱动表的结果集，降低循环次数

2、加快内部表的遍历时间，通过加索引等方式提高内部表的查询效率。

但是具体外部表的记录集多大之后就不建议使用nest loop join 了，有说1万的有说10万的，本人觉得还是要根据实际sql，这里应该没有具体限定值。

nest loop 相关内核参数

enable_nestloop 是否使用nestloop join 默认是on

HASH JOIN :

hash join 做大数据集连接时的常用方式，优化器使用两个表中较小的表(或数据源)利用连接键在内存中建立散列表，然后扫描较大的表并探测散列表，找出与散列表匹配的行。
这种方式适用于较小的表完全可以放于内存中的情况，

这样总成本就是访问两个表的成本之和。但是在表很大的情况下并不能完全放入内存，这时优化器会将它分割成若干不同的分区，不能放入内存的部分就把该分区写入磁盘的临时段，此时要有较大的临时段从而尽量提高I/O 的性能。

从这里就可以看到他的特点和hash索引有点相似，只能用于等值连接。另外他还受限于内存设置，而KingBase 对应的内存就是work_mem。需要保证hash数据全部存放在内存中才能保证hash join 性能最优，否则如果用到过多的 disk 就会严重影响性能。

下图是一个hash join的执行计划截图

注意一点就是hash join 不走索引。

不同场景中hash join 和nest loop join 对比。

场景1：小表做join

1、构建测试表：
drop table app_family;
CREATE TABLE app_family (
"family_id" character varying(32 char) NOT NULL,
"application_id" character varying(32 char) NULL,
"family_number" character varying(50 char) ,
"household_register_number" character varying(50 char),
"poverty_reason" character varying(32 char),
CONSTRAINT "pk_app_family_idpk" PRIMARY KEY (family_id));
insert into app_family select generate_series(1,10000),generate_series(1,10000),'aaaa','aaa','bbb' from dual ;
INSERT INTO app_family SELECT *  FROM app_family
CREATE TABLE app_family2 AS SELECT *  FROM app_family
UPDATE app_family2 SET application_id ='a' WHERE  family_id:
:int > 5000;
2、查看nestloop join
SET enable_hashjoin TO off ;
SET enable_mergejoin TO OFF;
EXPLAIN ANALYZE select  t.*
from app_family t LEFT JOIN app_family2 p
on  t.family_id=p.application_id;

QUERY PLAN                                                                                                                 |
---------------------------------------------------------------------------------------------------------------------------+
Nested Loop Left Join  (cost=0.00..1500387.00 rows=10000 width=21) (actual time=0.018..11663.381 rows=10000 loops=1)       |
  Join Filter: ((t.family_id)::text = (p.application_id)::text)                                                            |
  Rows Removed by Join Filter: 99995000                                                                                    |
  ->  Seq Scan on app_family t  (cost=0.00..165.00 rows=10000 width=21) (actual time=0.007..10.015 rows=10000 loops=1)     |
  ->  Materialize  (cost=0.00..247.00 rows=10000 width=3) (actual time=0.000..0.405 rows=10000 loops=10000)                |
        ->  Seq Scan on app_family2 p  (cost=0.00..197.00 rows=10000 width=3) (actual time=0.004..1.035 rows=10000 loops=1)|
Planning Time: 0.331 ms                                                                                                    |
Execution Time: 11667.704 ms        
 可以看到由于没有索引， 内部表loop 一次 耗时1ms左右 10000次循环就是1万ms。 
我们创建索引然后再看看执行计划

CREATE INDEX idx_family2 ON app_family2(application_id);
QUERY PLAN                                                                                                                              |
----------------------------------------------------------------------------------------------------------------------------------------+
Nested Loop Left Join  (cost=0.29..3685.00 rows=10000 width=21) (actual time=0.038..35.024 rows=10000 loops=1)                          |
  ->  Seq Scan on app_family t  (cost=0.00..165.00 rows=10000 width=21) (actual time=0.009..1.083 rows=10000 loops=1)                   |
  ->  Index Only Scan using idx_family2 on app_family2 p  (cost=0.29..0.33 rows=2 width=3) (actual time=0.003..0.003 rows=0 loops=10000)|
        Index Cond: (application_id = (t.family_id)::text)                                                                              |
        Heap Fetches: 0                                                                                                                 |
Planning Time: 0.228 ms                                                                                                                 |
Execution Time: 35.682 ms   

创建索引之后用时35ms。有明显提升。

2、收集hash join 执行计划
QUERY PLAN                                                                                                                 |
---------------------------------------------------------------------------------------------------------------------------+
Hash Right Join  (cost=290.00..513.26 rows=10000 width=21) (actual time=1.779..5.117 rows=10000 loops=1)                   |
  Hash Cond: ((p.application_id)::text = (t.family_id)::text)                                                              |
  ->  Seq Scan on app_family2 p  (cost=0.00..197.00 rows=10000 width=3) (actual time=0.013..0.638 rows=10000 loops=1)      |
  ->  Hash  (cost=165.00..165.00 rows=10000 width=21) (actual time=1.743..1.744 rows=10000 loops=1)                        |
        Buckets: 16384  Batches: 1  Memory Usage: 663kB                                                                    |
        ->  Seq Scan on app_family t  (cost=0.00..165.00 rows=10000 width=21) (actual time=0.012..0.602 rows=10000 loops=1)|
Planning Time: 0.105 ms                                                                                                    |
Execution Time: 5.495 ms                                                                                                   |

再这种情况下可以看到hash join 用时5ms。所以hash join 优势还是很明显的。                                                                                                   |                                                                              |

场景2：大表+小表

1、我们重建app_family，将数据量增加到1000万，然后先收集 nest loop的执行计划。
insert into app_family select generate_series(1,10000000),generate_series(1,10000000),'aaaa','aaa','bbb' from dual ;

QUERY PLAN                                                                                                                                 |
-------------------------------------------------------------------------------------------------------------------------------------------+
Nested Loop Left Join  (cost=0.29..3574032.75 rows=9999565 width=27) (actual time=0.063..33128.886 rows=10000000 loops=1)                  |
  ->  Seq Scan on app_family t  (cost=0.00..174060.65 rows=9999565 width=27) (actual time=0.042..1447.904 rows=10000000 loops=1)           |
  ->  Index Only Scan using idx_family2 on app_family2 p  (cost=0.29..0.32 rows=2 width=3) (actual time=0.003..0.003 rows=0 loops=10000000)|
        Index Cond: (application_id = (t.family_id)::text)                                                                                 |
        Heap Fetches: 0                                                                                                                    |
Planning Time: 0.224 ms                                                                                                                    |
Execution Time: 33724.901 ms      

但是如果我们更换两个表的join顺序
EXPLAIN ANALYZE SELECT   t.*
from app_family2 p LEFT JOIN  app_family t 
on  t.family_id=p.application_id
可以看到小表做驱动表，大表走索引情况下执行效率明显提升用时84ms。    

QUERY PLAN                                                                                                                                |
------------------------------------------------------------------------------------------------------------------------------------------+
Nested Loop Left Join  (cost=0.43..78386.00 rows=10000 width=27) (actual time=0.058..83.462 rows=10000 loops=1)                           |
  ->  Seq Scan on app_family2 p  (cost=0.00..197.00 rows=10000 width=3) (actual time=0.031..1.349 rows=10000 loops=1)                     |
  ->  Index Scan using pk_app_family_idpk on app_family t  (cost=0.43..7.82 rows=1 width=27) (actual time=0.008..0.008 rows=0 loops=10000)|
        Index Cond: ((family_id)::text = (p.application_id)::text)                                                                        |
Planning Time: 0.091 ms                                                                                                                   |
Execution Time: 84.238 ms   

同样的情况下我们看一下hash join 的执行情况

QUERY PLAN                                                                                                                           |
-------------------------------------------------------------------------------------------------------------------------------------+
Hash Right Join  (cost=367412.21..436072.46 rows=9999565 width=27) (actual time=3414.711..6855.453 rows=10000000 loops=1)            |
  Hash Cond: ((p.application_id)::text = (t.family_id)::text)                                                                        |
  ->  Seq Scan on app_family2 p  (cost=0.00..197.00 rows=10000 width=3) (actual time=0.010..0.885 rows=10000 loops=1)                |
  ->  Hash  (cost=174060.65..174060.65 rows=9999565 width=27) (actual time=3413.135..3413.136 rows=10000000 loops=1)                 |
        Buckets: 65536  Batches: 256  Memory Usage: 2826kB                                                                           |
        ->  Seq Scan on app_family t  (cost=0.00..174060.65 rows=9999565 width=27) (actual time=0.018..947.863 rows=10000000 loops=1)|
Planning Time: 0.145 ms                                                                                                              |
Execution Time: 7154.517 ms                                                                                                          |

QUERY PLAN                                                                                                                           |
-------------------------------------------------------------------------------------------------------------------------------------+
Hash Left Join  (cost=367412.21..436072.46 rows=10000 width=27) (actual time=3363.830..4873.281 rows=10000 loops=1)                  |
  Hash Cond: ((p.application_id)::text = (t.family_id)::text)                                                                        |
  ->  Seq Scan on app_family2 p  (cost=0.00..197.00 rows=10000 width=3) (actual time=0.017..0.970 rows=10000 loops=1)                |
  ->  Hash  (cost=174060.65..174060.65 rows=9999565 width=27) (actual time=3362.324..3362.325 rows=10000000 loops=1)                 |
        Buckets: 65536  Batches: 256  Memory Usage: 2826kB                                                                           |
        ->  Seq Scan on app_family t  (cost=0.00..174060.65 rows=9999565 width=27) (actual time=0.011..946.358 rows=10000000 loops=1)|
Planning Time: 0.158 ms                                                                                                              |
Execution Time: 4873.893 ms                                                                                                          |
可以看到hash join 在两种不同join顺序的情况下执行时间有差距，但是不大。 
但是相比较于最优情况下的nest loop join 是有明显差距。
                                                                                                         |                                                                                             |

场景3：两个大表

构造两个1000万的大表，然后对比nest loop 和 hash join
DROP  TABLE  app_family2;
CREATE TABLE app_family2 AS SELECT *  FROM app_family;
UPDATE app_family2 SET application_id ='a' WHERE  family_id::int > 5000;
CREATE INDEX idx_family2 ON app_family2(application_id);

首先看一下nest loop 
QUERY PLAN                                                                                                                                   |
---------------------------------------------------------------------------------------------------------------------------------------------+
Nested Loop Left Join  (cost=0.43..5249466.93 rows=10000000 width=27) (actual time=0.024..37066.301 rows=10000000 loops=1)                   |
  ->  Seq Scan on app_family2 p  (cost=0.00..238135.00 rows=10000000 width=7) (actual time=0.008..1733.272 rows=10000000 loops=1)            |
  ->  Index Scan using pk_app_family_idpk on app_family t  (cost=0.43..0.50 rows=1 width=27) (actual time=0.003..0.003 rows=0 loops=10000000)|
        Index Cond: ((family_id)::text = (p.application_id)::text)                                                                           |
Planning Time: 0.241 ms                                                                                                                      |
Execution Time: 37644.050 ms                
可以看到因为有索引，所以全部的执行时间就是集中在loop 上
和场景1 执行计划基本一样唯一差异的就是loop次数  总计用时37s

然后看一下 hash join

QUERY PLAN                                                                                                                            |
--------------------------------------------------------------------------------------------------------------------------------------+
Hash Left Join  (cost=367412.21..778841.36 rows=10027496 width=27) (actual time=4408.948..9869.481 rows=10000000 loops=1)             |
  Hash Cond: ((p.application_id)::text = (t.family_id)::text)                                                                         |
  ->  Seq Scan on app_family2 p  (cost=0.00..238409.96 rows=10027496 width=2) (actual time=0.039..2061.101 rows=10000000 loops=1)     |
  ->  Hash  (cost=174060.65..174060.65 rows=9999565 width=27) (actual time=4408.762..4408.763 rows=10000000 loops=1)                  |
        Buckets: 65536  Batches: 256  Memory Usage: 2826kB                                                                            |
        ->  Seq Scan on app_family t  (cost=0.00..174060.65 rows=9999565 width=27) (actual time=0.013..1281.885 rows=10000000 loops=1)|
Planning Time: 0.156 ms                                                                                                               |
Execution Time: 10187.486 ms                   

可以看到这种情况下hash join 明显快很多， 他的耗时基本上就是 
hash表生成时间+另一张表扫描时间  。
然后我们在模拟一下work_mem  偏小的情况下 hash join 
SET work_mem=1024 --设置work_mem =1M
QUERY PLAN                                                                                                                            |
--------------------------------------------------------------------------------------------------------------------------------------+
Hash Left Join  (cost=367412.21..778841.36 rows=10027496 width=27) (actual time=4983.689..9894.142 rows=10000000 loops=1)             |
  Hash Cond: ((p.application_id)::text = (t.family_id)::text)                                                                         |
  ->  Seq Scan on app_family2 p  (cost=0.00..238409.96 rows=10027496 width=2) (actual time=0.044..1709.922 rows=10000000 loops=1)     |
  ->  Hash  (cost=174060.65..174060.65 rows=9999565 width=27) (actual time=4983.615..4983.615 rows=10000000 loops=1)                  |
        Buckets: 16384  Batches: 1024  Memory Usage: 703kB                                                                            |
        ->  Seq Scan on app_family t  (cost=0.00..174060.65 rows=9999565 width=27) (actual time=0.008..1278.113 rows=10000000 loops=1)|
Planning Time: 0.165 ms                                                                                                               |
Execution Time: 10204.286 ms                                                                                                          |

可以看到 work_mem 减小后， Buckets 少了， 但是Batches 增加了。
 Buckets 就是用于存放hash值的内存空间，因为work_mem 小了 所以 Buckets 减少。
batchs代表为了执行这个hash join Buckets 被服用了几次。

如果我们增大SET work_mem=1024000 到1G 看一下执行计划
QUERY PLAN                                                                                                                            |
--------------------------------------------------------------------------------------------------------------------------------------+
Hash Left Join  (cost=299055.21..563787.36 rows=10027496 width=27) (actual time=5407.391..8326.606 rows=10000000 loops=1)             |
  Hash Cond: ((p.application_id)::text = (t.family_id)::text)                                                                         |
  ->  Seq Scan on app_family2 p  (cost=0.00..238409.96 rows=10027496 width=2) (actual time=0.040..1301.358 rows=10000000 loops=1)     |
  ->  Hash  (cost=174060.65..174060.65 rows=9999565 width=27) (actual time=5315.162..5315.162 rows=10000000 loops=1)                  |
        Buckets: 16777216  Batches: 1  Memory Usage: 724606kB                                                                         |
        ->  Seq Scan on app_family t  (cost=0.00..174060.65 rows=9999565 width=27) (actual time=0.037..1143.247 rows=10000000 loops=1)|
Planning Time: 0.138 ms                                                                                                               |
Execution Time: 8730.006 ms                                                                                                           |

 可以看到 work_mem 到1G后，内存中   Buckets 增加了很多，但是  Batches 就1次
这说明所有的hash值都存在了内存中， 整体时间耗时8.7s 提升1.3s左右                                                                |                                                                                         |