PostgreSQL SQL OUTLINE插件sr_plan (保存、篡改、固定执行计划)

最新推荐文章于 2023-07-27 20:09:31 发布

weixin_34399060

最新推荐文章于 2023-07-27 20:09:31 发布

阅读量814

点赞数

文章标签：数据库 python git

原文链接：https://yq.aliyun.com/articles/71115

版权

本文介绍了PostgreSQL中的sr_plan插件，该插件类似Oracle的SQL Outline，允许用户保存和篡改查询的执行计划。通过示例展示了如何使用sr_plan收集、修改并应用执行计划，以优化查询性能。

摘要由CSDN通过智能技术生成

背景

功能较为强大的数据库，通常都有query rewrite的功能，比如JOIN时提升或下推条件，调整JOIN顺序等。

例如

create table a(id int, info text);  
create table b(id int, info text);  
create index idx_a_info on a (info);  
create index idx_b_id on b(id);

执行以下查询，我们看看query rewrite如何工作的

postgres=# explain select * from a left join (select id,count(info) from b group by id) b on (a.id=b.id) where a.id=1;  
                                      QUERY PLAN                                        
--------------------------------------------------------------------------------------  
 Nested Loop Left Join  (cost=13.24..180340.52 rows=934 width=27)  
   Join Filter: (a.id = b.id)  
   ->  Seq Scan on a  (cost=0.00..179054.03 rows=1 width=15)  
         Filter: (id = 1)  
   ->  GroupAggregate  (cost=13.24..1265.48 rows=934 width=12)  
         Group Key: b.id  
         ->  Bitmap Heap Scan on b  (cost=13.24..1251.24 rows=981 width=9)  
               Recheck Cond: (id = 1)  
               ->  Bitmap Index Scan on idx_b_id  (cost=0.00..12.99 rows=981 width=0)  
                     Index Cond: (id = 1)  
(10 rows)

这里只提供了a.id=1的条件，这个查询条件被重写，推入子查询中，所以我们看到子查询实际上也过滤了b.id=1的条件。

通过以上例子，我们见识到了QUERY REWRITE的用途，那么是不是所有场景都能rewrite 呢？

例如我们把以上QUERY换一个条件，改写为如下

postgres=# explain select * from a left join (select id,count(info) from b group by id) b on (a.id=b.id) where a.info='test1';  
                                   QUERY PLAN                                      
---------------------------------------------------------------------------------  
 Hash Right Join  (cost=204060.69..204298.22 rows=1 width=27)  
   Hash Cond: (b.id = a.id)  
   ->  HashAggregate  (cost=204057.62..204157.64 rows=10001 width=12)  
         Group Key: b.id  
         ->  Seq Scan on b  (cost=0.00..154056.75 rows=10000175 width=9)  
   ->  Hash  (cost=3.05..3.05 rows=1 width=15)  
         ->  Index Scan using idx_a_info on a  (cost=0.43..3.05 rows=1 width=15)  
               Index Cond: (info = 'test1'::text)  
(8 rows)

从以上执行计划，我们看到这个query rewrite并没有将a.info='test1'间接的推入子查询。

而实际上，PostgreSQL只是根据成本选择了一个执行计划，并不是说它不能推入a.info='test1'的条件，请继续看我在后面sr_plan中的测试，会看到PostgreSQL的CBO还是非常强大的。

另一方面，作为用户，以上QUERY可以改写为如下（或者说这是你期待的query rewrite对吧）

select * from a left join (select id,count(info) from b   
  where exists (select 1 from a where a.id=b.id and a.info='test')  -- 改写QUERY，得到同样结果，但是B的聚合量减少了  
  -- 或者 where id in (select id from a where a.info='test1')  
  -- 或者 还有其他改法，拆掉子查询  
group by id) b on (a.id=b.id) where a.info='test1';

改写后的执行计划如下，b在聚合前，可以使用a的条件过滤掉一些记录，从而减少聚合的量

                                           QUERY PLAN                                             
------------------------------------------------------------------------------------------------  
 Hash Right Join  (cost=1295.06..1318.82 rows=1 width=27)  
   Hash Cond: (b.id = a.id)  
   ->  HashAggregate  (cost=1292.00..1302.00 rows=1000 width=12)  
         Group Key: b.id  
         ->  Nested Loop  (cost=16.44..1287.00 rows=1000 width=9)  
               ->  HashAggregate  (cost=3.05..3.06 rows=1 width=4)  
                     Group Key: a_1.id  
                     ->  Index Scan using idx_a_info on a a_1  (cost=0.43..3.05 rows=1 width=4)  
                           Index Cond: (info = 'test1'::text)  
               ->  Bitmap Heap Scan on b  (cost=13.38..1273.93 rows=1000 width=9)  
                     Recheck Cond: (id = a_1.id)  
                     ->  Bitmap Index Scan on idx_b_id  (cost=0.00..13.13 rows=1000 width=0)  
                           Index Cond: (id = a_1.id)  
   ->  Hash  (cost=3.05..3.05 rows=1 width=15)  
         ->  Index Scan using idx_a_info on a  (cost=0.43..3.05 rows=1 width=15)  
               Index Cond: (info = 'test1'::text)  
(16 rows)

query rewrite是一个比较智能的工作，在某些情况下，可以起到很好的性能优化作用，query rewrite也是许多数据库产品比拼的技术之一。

PostgreSQL这方面还是非常有优势的，请看我在SR_PLAN中的例子，加油。

其实除了query rewrite，PostgreSQL的社区还提供了一个非常强大的插件，sr_plan。

类似于Oracle的sql outline。