Postgresql杂谈 14—Postgresql执行计划中的扫描和连接类型

本文链接：https://blog.csdn.net/hyman_c/article/details/119531093

一、全表扫描

全表扫描，也叫顺序扫描，扫描时把表中所有的数据块从头到尾遍历一边，找到复合条件的数据块。全表扫描在在explain中使用Seq Scan表示：

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select * from t1 where id=10;
                                            QUERY PLAN                                             
---------------------------------------------------------------------------------------------------
 Seq Scan on public.t1  (cost=0.00..2.25 rows=1 width=8) (actual time=0.017..0.018 rows=0 loops=1)
   Output: id, name
   Filter: (t1.id = 10)
   Rows Removed by Filter: 100
   Buffers: shared hit=1
 Planning Time: 0.054 ms
 Execution Time: 0.036 ms
(7 rows)

(cost=0.00..2.25 rows=1 width=8) —— 启动（返回第一行）的成本为0个cost，返回所有数据的成本的2.25个cost，返回1行，每行的平均宽度8字节。

顺序扫描一个数据块，cost定为1
随机扫描一个数据块，cost值为4
处理一个数据行的CPU代价，cost值为0.01
处理一个索引行的CPU代价，cost值为0.005
每个操作符的CPU代价，cost值为0.0025

(actual time=0.017..0.018 rows=0 loops=1) —— 实际执行中，启动时间0.017毫秒，返回所有行的时间为0.018毫秒。因为数据在共享内存中命中，所以这里从磁盘返回的行数为0，loops=1表示遍历了1遍。
Output: id, name —— 输出字段id，name
Filter: (t1.id = 10) —— 采用条件过滤，过滤条件为t1.id = 10
Rows Removed by Filter: 100 —— 被过滤条件过滤的行数
Buffers: shared hit=1 —— 在共享内存中命中的块数为1

在数据量比较小的情况下，select语句执行时是采用单线程的全表扫描；当数据量比较大时，select语句执行时可能采用并行查询的方式。

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select * from t1 where id=10;
                                                       QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..13945.04 rows=1 width=8) (actual time=91.579..91.631 rows=0 loops=1)
   Output: id, name
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=5946
   ->  Parallel Seq Scan on public.t1  (cost=0.00..12944.94 rows=1 width=8) (actual time=69.414..69.415 rows=0 loops=3)
         Output: id, name
         Filter: (t1.id = 10)
         Rows Removed by Filter: 366667
         Buffers: shared hit=5946
         Worker 0: actual time=65.028..65.028 rows=0 loops=1
           Buffers: shared hit=1289
         Worker 1: actual time=51.835..51.836 rows=0 loops=1
           Buffers: shared hit=1244
 Planning Time: 0.090 ms
 Execution Time: 91.652 ms
(16 rows)

以上的调试信息表明，除了主master进程之外，额外创建了2个worker进程，共三个进程参与全表扫描。

二、索引扫描

索引扫描，就是通过索引找到目标行的物理位置，然后从表的数据块中把对应的数据读出来。索引扫描在explain中用Index Scan表示：

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select * from t1 where id=100;
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Index Scan using t1_id_idx on public.t1  (cost=0.43..8.45 rows=1 width=12) (actual time=0.015..0.015 rows=0 loops=1)
   Output: id, name
   Index Cond: (t1.id = 100)
   Buffers: shared hit=3
 Planning Time: 0.099 ms
 Execution Time: 0.035 ms
(6 rows)

Index Cond: (t1.id = 100) —— 扫描索引的条件是t1.id = 100

当查询的条数过多时，即时建立了索引，有可能不走索引，如下例子，表中有100W条数据，当查询的数据量过大时不走索引，反而采用了全表扫描的方式：

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select * from t1 where id > 500;
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on public.t1  (cost=0.00..19696.00 rows=1100000 width=12) (actual time=0.011..624.830 rows=1100000 loops=1)
   Output: id, name
   Filter: (t1.id > 500)
   Buffers: shared hit=5946
 Planning Time: 0.095 ms
 Execution Time: 1170.307 ms
(6 rows)

查询的数据量小时，走了索引：

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select * from t1 where id < 500;
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Index Scan using t1_id_idx on public.t1  (cost=0.43..4.45 rows=1 width=12) (actual time=0.005..0.005 rows=0 loops=1)
   Output: id, name
   Index Cond: (t1.id < 500)
   Buffers: shared hit=3
 Planning Time: 0.119 ms
 Execution Time: 0.023 ms
(6 rows)

三、位图扫描

位图扫描是另外一种走索引的方式，是在查询结果行数过多时触发。它先扫描索引，把满足条件的行或块在内存中建立一个位图，再根据位图把表中相应的数据读出来。在位图扫描模式下，如果走了多个索引，会将每个位图进行AND或者OR运算。

位图扫描在Postgresql中用Bitmap Heap Scan表示：

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select * from t1 where id in (10000,999,9999) or id < 500;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.t1  (cost=17.74..29.59 rows=3 width=12) (actual time=0.022..0.023 rows=0 loops=1)
   Output: id, name
   Recheck Cond: ((t1.id = ANY ('{10000,999,9999}'::integer[])) OR (t1.id < 500))
   Buffers: shared hit=12
   ->  BitmapOr  (cost=17.74..17.74 rows=3 width=0) (actual time=0.019..0.019 rows=0 loops=1)
         Buffers: shared hit=12
         ->  Bitmap Index Scan on t1_id_idx  (cost=0.00..13.31 rows=3 width=0) (actual time=0.013..0.014 rows=0 loops=1)
               Index Cond: (t1.id = ANY ('{10000,999,9999}'::integer[]))
               Buffers: shared hit=9
         ->  Bitmap Index Scan on t1_id_idx  (cost=0.00..4.44 rows=1 width=0) (actual time=0.002..0.003 rows=0 loops=1)
               Index Cond: (t1.id < 500)
               Buffers: shared hit=3
 Planning Time: 0.225 ms
 Execution Time: 0.069 ms
(14 rows)

可以看到，当查询条件含有多个查询条件时，采用位图扫描，上述的查询语句形成了2个位图，然后再进行OR。

BitmapOr —— 表明两个位图结果进行OR运算
Recheck Cond: —— 表明在进行BitmapOr后，再次采用该条件检查数据。

四、嵌套循环连接

嵌套循环连接是在两个表进行关联查询时数据扫描的方法。执行时，先确定一个外表、一个内表，遍历外表的每一行，然后以连接条件查询内表，对查询结果进行join。嵌套连接适用于外表的结果集比较小的连接查询，否则查询的效率会很低。

嵌套循环在Postgresql中使用

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select t1.id,t1.name,t2.name from t1,t2 where t1.id=t2.id and t1.id=1;
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.15..26.68 rows=1 width=14) (actual time=0.026..0.093 rows=1 loops=1)
   Output: t1.id, t1.name, t2.name
   Buffers: shared hit=8
   ->  Index Scan using t1_id_idx on public.t1  (cost=0.15..8.17 rows=1 width=9) (actual time=0.012..0.014 rows=1 loops=1)
         Output: t1.id, t1.name
         Index Cond: (t1.id = 1)
         Buffers: shared hit=2
   ->  Seq Scan on public.t2  (cost=0.00..18.50 rows=1 width=9) (actual time=0.010..0.073 rows=1 loops=1)
         Output: t2.id, t2.name
         Filter: (t2.id = 1)
         Rows Removed by Filter: 999
         Buffers: shared hit=6
 Planning Time: 0.103 ms
 Execution Time: 0.122 ms
(14 rows)

Index Scan using t1_id_idx on public.t1 —— 将t1作为外表，扫描索引
Seq Scan on public.t2 —— 将t2作为内表，全表扫描（t2没有建立索引）。

五、散列连接

优化器使用两个表中较小的表，以连接字段的为键在内存中建立散列表，然后扫描另一个表并探测散列表，找出匹配行。如果连接的两个表中，较小的表可以完全放到内存中，这种连接方式将非常高效。但是如果较小的表也不能完全放入到内存中，则表将会被分段放入内存中，而适当调大work_mem参数，可以减少IO次数，提高查询性能。

下面是一个使用散列连接（Hash Join）的例子：

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select t1.id,t1.name,t2.name from t1,t2 where t1.id=t2.id and t1.id<5;
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=8.18..27.95 rows=1 width=14) (actual time=0.049..1.884 rows=4 loops=1)
   Output: t1.id, t1.name, t2.name
   Hash Cond: (t2.id = t1.id)
   Buffers: shared hit=8
   ->  Seq Scan on public.t2  (cost=0.00..16.00 rows=1000 width=9) (actual time=0.012..0.896 rows=1000 loops=1)
         Output: t2.id, t2.name
         Buffers: shared hit=6
   ->  Hash  (cost=8.17..8.17 rows=1 width=9) (actual time=0.022..0.023 rows=4 loops=1)
         Output: t1.id, t1.name
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         Buffers: shared hit=2
         ->  Index Scan using t1_id_idx on public.t1  (cost=0.15..8.17 rows=1 width=9) (actual time=0.006..0.012 rows=4 loops=1)
               Output: t1.id, t1.name
               Index Cond: (t1.id < 5)
               Buffers: shared hit=2
 Planning Time: 0.247 ms
 Execution Time: 1.932 ms
(17 rows)

由于表t1较小，所以是用的t1在内存中建立散列表，扫描t2来探测散列表。

Index Scan using t1_id_idx on public.t1 —— 使用索引扫描t1，将内存中建立哈希表。
Buckets: 1024 Batches: 1 Memory Usage: 9kB —— Hash表含有1024个槽，占用内存9KB。
Seq Scan on public.t2 —— 扫描t2表，探测散列表。

六、合并连接

合并连接适合两个表在连接字段上都建立了索引或者结果已经拍过序，由于不需要在内存中建立Hash表，此时合并连接的性能会由于散列连接，否则Hash连接的性能会较好。以下就是一个合并连接的例子：

stock_analysis_data=# explain (analyze,verbose,buffers,costs,timing) select t1.id,t1.name,t2.name from t1,t2 where t1.id=t2.id and t2.id<10;
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.43..8.73 rows=9 width=14) (actual time=0.015..0.052 rows=9 loops=1)
   Output: t1.id, t1.name, t2.name
   Merge Cond: (t1.id = t2.id)
   Buffers: shared hit=5
   ->  Index Scan using t1_id_idx on public.t1  (cost=0.15..54.16 rows=2200 width=9) (actual time=0.004..0.011 rows=10 loops=1)
         Output: t1.id, t1.name
         Buffers: shared hit=2
   ->  Index Scan using t2_id_btree_inx on public.t2  (cost=0.28..8.43 rows=9 width=9) (actual time=0.003..0.011 rows=9 loops=1)
         Output: t2.id, t2.name
         Index Cond: (t2.id < 10)
         Buffers: shared hit=3
 Planning Time: 0.241 ms
 Execution Time: 0.090 ms
(13 rows)