PostgreSQL in 语法 的优化器处理以及如何优化

本文探讨了PostgreSQL中'in'语句的优化器处理,包括index scan、bitmap scan和seq scan等不同执行计划的适用场景。通过测试不同SQL写法,分析了它们在不同情况下的性能差异。建议根据扫描集大小和索引情况选择合适的查询方式,以提高查询效率。当优化器未选择最优执行计划时,可以调整优化器开关或使用pg_hint_plan插件进行指导。
摘要由CSDN通过智能技术生成

PostgreSQL in 的优化器处理以及如何优化

在使用数据库的过程中,经常会遇到需要匹配多个值的情况。
通常的写法包括:

-- select * from table where id = any(array);  
-- select * from table where id in (values);  
-- select * from table where id=x or id=x or ....;  
-- select * from table where id in (query);  
-- select * from table where id in ( values query );  
-- select * from table t1 join (query or values query) t2 on t1.id=t2.id;  

每种写法会产生多种执行计划的可能,如下:

-- select * from table where id = any(array);  
  优化器可以使用index scan, bitmap scan, seq scan.  
  
-- select * from table where id in (values);  
  优化器可以使用index scan, bitmap scan, seq scan.  
  
-- select * from table where id=x or id=x or ....;  
  优化器可以使用bitmap scan + BitmapOr, seq scan.  
  
-- select * from table where id in (query);  
  优化器可以使用join (merge,hash,nest).  
  
-- select * from table where id in ( values query );  
  优化器可以使用join (merge,hash,nest).  
  
-- select * from table t1 join (query or values query) t2 on t1.id=t2.id;  
  优化器可以使用join (merge,hash,nest).  

SQL优化策略是尽量减少CPU的运算以及page的扫描数量。

下面针对每种SQL,看看对应的可能的执行计划有什么差别,(使用开关来控制执行计划的选择, 如set enable_indexscan=off)
支持的开关如下:

enable_bitmapscan     enable_hashjoin       enable_indexscan      enable_mergejoin      enable_seqscan        enable_tidscan          
enable_hashagg        enable_indexonlyscan  enable_material       enable_nestloop       enable_sort   

开始测试,使用auto_explain输出执行计划:

load 'auto_explain';    
set auto_explain.log_analyze =true;    
set auto_explain.log_buffers =true;    
set auto_explain.log_nested_statements=true;    
set auto_explain.log_timing=true;    
set auto_explain.log_triggers=true;    
set auto_explain.log_verbose=true;    
set auto_explain.log_min_duration=0;    
set client_min_messages ='log';    
set work_mem='8GB';    

测试SQL写法1:

-- select * from table where id = any(array);  
  
do language plpgsql 
$$
  
declare  
  v_id int[];  
begin  
  select array_agg(trunc(random()*100000)) into v_id from generate_series(1,200) t(id);  
  perform * from t_in_test where id = any (v_id);  
end;  

$$
;  

优化器选择1 (index scan):
离散扫描,适合小的扫描集。

LOG:  duration: 2.312 ms  plan:  
Query Text: SELECT * from t_in_test where id = any (v_id)  
Index Scan using t_in_test_pkey on public.t_in_test  (cost=0.43..895.50 rows=200 width=37) (actual time=0.039..2.266 rows=200 loops=1)  
  Output: id, info  
  Index Cond: (t_in_test.id = ANY ('{50836,73414,41071,45604,...省略部分...,76236}'::integer[]))  
  Buffers: shared hit=776  
CONTEXT:  SQL statement "SELECT * from t_in_test where id = any (v_id)"  
PL/pgSQL function inline_code_block line 6 at PERFORM  

优化器选择2 (bitmap scan):
比index scan多了Recheck的开销,以及按照ctid排序的开销。
适合大的扫描集,排序的目的是减少离散扫描,还可以用到块设备的prefetch。

LOG:  duration: 1.602 ms  plan:  
Query Text: SELECT * from t_in_test where id = any (v_id)  
Bitmap Heap Scan on public.t_in_test  (cost=888.55..1711.16 rows=200 width=37) (actual time=0.880..1.563 rows=200 loops=1)  
  Output: id, info  
  Recheck Cond: (t_in_test.id = ANY ('{32635,31123,6282,59640,...省略部分...,87705}'::integer[]))  
  Heap Blocks: exact=184  
  Buffers: shared hit=784  
  ->  Bitmap Index Scan on t_in_test_pkey  (cost=0.00..888.50 rows=200 width=0) (actual time=0.846..0.846 rows=200 loops=1)  
        Index Cond: (t_in_test.id = ANY ('{32635,31123,6282,59640,...省略部分...,87705}'::integer[]))  
        Buffers: shared hit=600  
CONTEXT:  SQL statement "SELECT * from t_in_test where id = any (v_id)"  
PL/pgSQL function inline_code_block line 6 at PERFORM  

优化器选择3 (seq scan):
适合非常庞大的扫描集。

LOG:  duration: 19940.394 ms  plan:  
Query Text: SELECT * from t_in_test where id = any (v_id)  
Seq Scan on public.t_in_test  (cost=0.00..2683354.80 rows=200 width=37) (actual time=4.237..19940.330 rows=199 loops=1)  
  Output: id, info  
  Filter: (t_in_test.id = ANY ('{45867,72450,95153,86233,63073,11016,56010,47158,...省略部分...,90444}'::integer[]))  
  Rows Removed by Filter: 9999801  
  Buffers: shared hit=83334  
CONTEXT:  SQL statement "SELECT * from t_in_test where id = any (v_id)"  
PL/pgSQL function inline_code_block line 6 at PERFORM  

测试SQL写法2:

-- select * from table where id in (values);  
  
do language plpgsql 
$$
  
declare  
  v_where text;  
begin  
  select string_agg(id::text,',') into v_where from (select trunc(random()*100000)::int as id from generate_series(1,200) t(id)) t;  
  execute 'select * from t_in_test where id in ('||v_where||')';  
end;  

$$
;  

优化器选择1 (index scan):

LOG:  duration: 0.919 ms  plan:  
Query Text: select * from t_in_test where id in (8826,2038,72163,29843,76886,37893,5279,64308,...省略部分...,48126,44868)  
Index Scan using t_in_test_pkey on public.t_in_test  (cost=0.43..895.50 rows=200 width=37) (actual time=0.017..0.894 rows=200 loops=1)  
  Output: id, info  
  Index Cond: (t_in_test.id = ANY ('{8826,2038,72163,29843,76886,37893,5279,64308,7370,80216,...省略部分...,48126,44868}'::integer[]))  
  Buffers: shared hit=779  
CONTEXT:  SQL statement "select * from t_in_test where id in (8826,2038,72163,29843,76886,37893,5279,64308,7370,80216,...省略部分...,73366,48126,44868)"  
PL/pgSQL function inline_code_block line 6 at EXECUTE  

优化器选择2 (bitmap scan):

LOG:  duration: 1.012 ms  plan:  
Query Text: select * from t_in_test where id in (17424,80517,35148,38245,93
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值