3月第六讲pg distinct 改写递归优化

下面SQL1000W行数据,列的选择性很低,只有两个值('1'和'11')都是字符串类型,'1'只有一条数据,'11'有9999999行数据。

慢SQL:

select distinct col from tt;

                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=169247.11..169247.12 rows=1 width=3) (actual time=5082.733..5082.735 rows=2 loops=1)
   Group Key: col
   ->  Seq Scan on tt  (cost=0.00..144247.29 rows=9999929 width=3) (actual time=0.005..275.906 rows=10000000 loops=1)
 Planning Time: 0.365 ms
 Execution Time: 5082.772 ms
(5 行记录)

CTE递归优化:

WITH RECURSIVE t AS (
   (SELECT col FROM tt ORDER BY col LIMIT 1)  
   UNION ALL
   SELECT (SELECT col FROM tt WHERE col > t.col ORDER BY col LIMIT 1)
   FROM t
   WHERE t.col IS NOT NULL
   )
SELECT col FROM t WHERE col IS NOT NULL;

                                                                        QUERY PLAN                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on t  (cost=50.84..52.86 rows=100 width=38) (actual time=0.024..0.079 rows=2 loops=1)
   Filter: (col IS NOT NULL)
   Rows Removed by Filter: 1
   CTE t
     ->  Recursive Union  (cost=0.43..50.84 rows=101 width=38) (actual time=0.022..0.076 rows=3 loops=1)
           ->  Limit  (cost=0.43..0.46 rows=1 width=3) (actual time=0.021..0.021 rows=1 loops=1)
                 ->  Index Only Scan using idx_1_2_tt on tt tt_1  (cost=0.43..260443.37 rows=9999929 width=3) (actual time=0.020..0.020 rows=1 loops=1)
                       Heap Fetches: 0
           ->  WorkTable Scan on t t_1  (cost=0.00..4.84 rows=10 width=38) (actual time=0.017..0.017 rows=1 loops=3)
                 Filter: (col IS NOT NULL)
                 Rows Removed by Filter: 0
                 SubPlan 1
                   ->  Limit  (cost=0.43..0.46 rows=1 width=3) (actual time=0.024..0.024 rows=0 loops=2)
                         ->  Index Only Scan using idx_1_2_tt on tt  (cost=0.43..95149.36 rows=3333310 width=3) (actual time=0.024..0.024 rows=0 loops=2)
                               Index Cond: (col > (t_1.col)::text)
                               Heap Fetches: 0
 Planning Time: 0.096 ms
 Execution Time: 0.096 ms
(18 行记录)

里面的逻辑是:

(SELECT col FROM tt ORDER BY col LIMIT 1)

  根节点通过order b

  • 16
    点赞
  • 19
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值