Group 和 Distinct 列的次序影响查询性能

一、概述

优化拥有大量的分组和去重列的SQL时,这些排序列的次序,也是可以优化的地方。

测试数据结构

kingbase=# select count(distinct txt1 ) txt1, avg(length(txt1))::int ln1, count(distinct txt3 ) txt3 ,avg(length(txt3))::int ln3 from txt01;

 txt1 | ln1  |  txt3   | ln3 
------+------+---------+-----
 1000 | 1000 | 1000000 |  10
(1 行记录)

二、work_mem 满足排序情况

1、Distinct 语句

次序: txt1,txt3

kingbase=# explain (analyse ,buffers ) select /*+ set(work_mem 100MB) */ distinct txt1 ,txt3 from txt01 ;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=269287.33..269687.33 rows=40000 width=64) (actual time=1543.995..1877.527 rows=1000000 loops=1)
   Group Key: txt1, txt3
   Buffers: shared hit=142858
   ->  Seq Scan on txt01  (cost=0.00..227144.22 rows=8428622 width=64) (actual time=0.008..159.858 rows=1000000 loops=1)
         Buffers: shared hit=142858
 Planning Time: 0.081 ms
 Execution Time: 1947.951 ms
(7 行记录)

次序: txt3,txt1

ingbase=# explain (analyse ,buffers ) select /*+ set(work_mem 100MB) */ distinct txt3 ,txt1 from txt01 ;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=269287.33..269687.33 rows=40000 width=64) (actual time=1596.040..1812.380 rows=1000000 loops=1)
   Group Key: txt3, txt1
   Buffers: shared hit=142858
   ->  Seq Scan on txt01  (cost=0.00..227144.22 rows=8428622 width=64) (actual time=0.007..163.399 rows=1000000 loops=1)
         Buffers: shared hit=142858
 Planning Time: 0.075 ms
 Execution Time: 1884.907 ms
(7 行记录)

2、Group by 语句

次序: txt1,txt3

kingbase=# explain (analyse ,buffers ) select /*+ set(work_mem 100MB) */  txt1 ,txt3 from txt01 group by txt1 ,txt3 ;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=269287.33..269687.33 rows=40000 width=64) (actual time=1540.948..1875.917 rows=1000000 loops=1)
   Group Key: txt1, txt3
   Buffers: shared hit=142858
   ->  Seq Scan on txt01  (cost=0.00..227144.22 rows=8428622 width=64) (actual time=0.006..160.419 rows=1000000 loops=1)
         Buffers: shared hit=142858
 Planning Time: 0.084 ms
 Execution Time: 1939.103 ms
(7 行记录)

次序: txt3,txt1

kingbase=# explain (analyse ,buffers ) select /*+ set(work_mem 100MB) */ txt1 ,txt3 from txt01 group by txt3 ,txt1 ;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=269287.33..269687.33 rows=40000 width=64) (actual time=1557.257..1780.662 rows=1000000 loops=1)
   Group Key: txt3, txt1
   Buffers: shared hit=142858
   ->  Seq Scan on txt01  (cost=0.00..227144.22 rows=8428622 width=64) (actual time=0.018..165.221 rows=1000000 loops=1)
         Buffers: shared hit=142858
 Planning Time: 0.330 ms
 Execution Time: 1844.664 ms
(7 行记录)

三、work_mem 不满足排序情况

1、Distinct 语句

次序: txt1,txt3

kingbase=# explain (analyse ,buffers ) select /*+ set(work_mem 1MB) */ distinct txt1 ,txt3 from txt01  ;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=2464313.08..2527527.74 rows=40000 width=64) (actual time=21031.092..22131.259 rows=1000000 loops=1)
   Buffers: shared hit=142858, temp read=125368 written=125369
   ->  Sort  (cost=2464313.08..2485384.63 rows=8428622 width=64) (actual time=21031.089..22002.850 rows=1000000 loops=1)
         Sort Key: txt1, txt3
         Sort Method: external merge  Disk: 1002944kB
         Buffers: shared hit=142858, temp read=125368 written=125369
         ->  Seq Scan on txt01  (cost=0.00..227144.22 rows=8428622 width=64) (actual time=0.039..272.327 rows=1000000 loops=1)
               Buffers: shared hit=142858
 Planning Time: 0.272 ms
 Execution Time: 23648.185 ms
(10 行记录)

次序: txt3,txt1

kingbase=# explain (analyse ,buffers ) select /*+ set(work_mem 1MB) */ distinct txt3 ,txt1 from txt01  ;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=2464313.08..2527527.74 rows=40000 width=64) (actual time=4004.641..4367.218 rows=1000000 loops=1)
   Buffers: shared hit=142858, temp read=125491 written=125492
   ->  Sort  (cost=2464313.08..2485384.63 rows=8428622 width=64) (actual time=4004.639..4239.599 rows=1000000 loops=1)
         Sort Key: txt3, txt1
         Sort Method: external merge  Disk: 1003928kB
         Buffers: shared hit=142858, temp read=125491 written=125492
         ->  Seq Scan on txt01  (cost=0.00..227144.22 rows=8428622 width=64) (actual time=0.011..271.572 rows=1000000 loops=1)
               Buffers: shared hit=142858
 Planning Time: 0.086 ms
 Execution Time: 4457.751 ms
(10 行记录)

2、Group by 语句

次序: txt1,txt3

kingbase=# explain (analyse ,buffers ) select /*+ set(work_mem 1MB) */ txt1 ,txt3 from txt01 group by txt1 ,txt3 ;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Group  (cost=2464313.08..2527527.74 rows=40000 width=64) (actual time=21715.770..22796.166 rows=1000000 loops=1)
   Group Key: txt1, txt3
   Buffers: shared hit=142858, temp read=125368 written=125369
   ->  Sort  (cost=2464313.08..2485384.63 rows=8428622 width=64) (actual time=21715.764..22658.413 rows=1000000 loops=1)
         Sort Key: txt1, txt3
         Sort Method: external merge  Disk: 1002944kB
         Buffers: shared hit=142858, temp read=125368 written=125369
         ->  Seq Scan on txt01  (cost=0.00..227144.22 rows=8428622 width=64) (actual time=0.029..271.335 rows=1000000 loops=1)
               Buffers: shared hit=142858
 Planning Time: 0.285 ms
 Execution Time: 25365.012 ms
(11 行记录)

次序: txt3,txt1

kingbase=# explain (analyse ,buffers ) select /*+ set(work_mem 1MB) */ txt1 ,txt3 from txt01 group by txt3 ,txt1 ;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Group  (cost=2464313.08..2527527.74 rows=40000 width=64) (actual time=4156.296..4541.315 rows=1000000 loops=1)
   Group Key: txt3, txt1
   Buffers: shared hit=142858, temp read=125368 written=125369
   ->  Sort  (cost=2464313.08..2485384.63 rows=8428622 width=64) (actual time=4156.291..4402.265 rows=1000000 loops=1)
         Sort Key: txt3, txt1
         Sort Method: external merge  Disk: 1002944kB
         Buffers: shared hit=142858, temp read=125368 written=125369
         ->  Seq Scan on txt01  (cost=0.00..227144.22 rows=8428622 width=64) (actual time=0.008..270.872 rows=1000000 loops=1)
               Buffers: shared hit=142858
 Planning Time: 0.081 ms
 Execution Time: 4632.567 ms
(11 行记录)

四、总结

次序txt1,txt1txt3,txt1
work_mem满足排序1947.951 ms1884.907 ms
work_mem不足排序25365.012 ms4632.567 ms

字节少数据值多的列,处于排序列的前列,可以带来性能的提升。当work_mem满足排序时,性能差异不大,当work_mem不足时,性能提升较大。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值