While troubleshooting for a business team a while back, I found that the application issued a large number of full-result queries, even though much of the returned data was never used. These full queries caused several problems:
- The result sets were large, so transfers took a long time; slow-query entries kept appearing in the database log and had to be analyzed again and again.
- The application accesses the database through SQLAlchemy. With large result sets, the Python side received data slowly and consumed a lot of memory; under high concurrency, full queries had previously frozen the client outright.

Based on experience, I proposed adding LIMIT to cap the number of rows returned per query, which is also what the community recommends. To my complete surprise, this tiny LIMIT degraded performance a hundredfold and "paralyzed" the production environment twice. It was painful. Let's walk through how this "disaster" actually happened.
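For context on the memory problem: the blow-ups come from the client materializing the entire result set at once. A minimal sketch of the alternative, using the stdlib sqlite3 module as a toy stand-in for the production SQLAlchemy/PostgreSQL stack (table name and sizes here are illustrative only), fetching in fixed-size chunks so client memory stays bounded:

```python
import sqlite3

# Toy stand-in for the production table; the real system uses SQLAlchemy on PostgreSQL
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_a (id INTEGER, c_1 INTEGER)")
conn.executemany("INSERT INTO t_a VALUES (?, ?)",
                 [(i, i % 1000) for i in range(50000)])

cur = conn.execute("SELECT id, c_1 FROM t_a")
rows_seen = 0
while True:
    chunk = cur.fetchmany(4000)  # bounded memory per batch, instead of fetchall()
    if not chunk:
        break
    rows_seen += len(chunk)
print(rows_seen)  # 50000
```

The same idea applies with SQLAlchemy's streaming/chunked fetch options; the point is simply that the client never holds the full result set in memory at once.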
The query SQL
After distilling the business model, the simplified schema and test data look like this:
create table T_A(id int, c_1 int);
create index t_a_id on T_A(id);
create index t_a_c on T_A(c_1);
create table T_B(id int, c_1 int);
create index t_b_id on T_B(id);
insert into T_A select generate_series(1,200000),generate_series(1,1000000);
insert into T_A select generate_series(1,200000),generate_series(1,1000000);
insert into T_A select generate_series(1,200000),generate_series(1,1000000);
insert into T_A select generate_series(1,200000),generate_series(1,1000000);
insert into T_A select generate_series(1,200000),generate_series(1,1000000);
truncate table t_b;
insert into T_B select generate_series(1,1000),generate_series(1,10);
vacuum analyze t_a;
vacuum analyze t_b;
The query itself is:
explain (analyze,buffers,verbose) select T_A.id from T_A left outer join T_B on T_A.id=T_B.id
where (T_A.c_1 = 999 or T_B.c_1=1 ) order by T_A.id ASC limit 4000;
Execution results
We are currently running the PostgreSQL 9.2 branch, so let's test on that version first:
postgres=# explain (analyze,buffers,verbose) select T_A.id from T_A left outer join T_B on T_A.id=T_B.id
where (T_A.c_1 = 999 or T_B.c_1=1 ) order by T_A.id ASC limit 4000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..96110.03 rows=4000 width=4) (actual time=0.023..4798.584 rows=2505 loops=1)
Output: t_a.id
Buffers: shared hit=4977865 read=40181 written=2720
-> Merge Left Join (cost=0.42..12013798.38 rows=500004 width=4) (actual time=0.023..4798.168 rows=2505 loops=1)
Output: t_a.id
Merge Cond: (t_a.id = t_b.id)
Filter: ((t_a.c_1 = 999) OR (t_b.c_1 = 1))
Rows Removed by Filter: 4997495
Buffers: shared hit=4977865 read=40181 written=2720
-> Index Scan using t_a_id on public.t_a (cost=0.00..12000762.83 rows=5000000 width=8) (actual time=0.008..3570.243 rows=5000000 loops=1)
Output: t_a.id, t_a.c_1
Buffers: shared hit=4977856 read=40181 written=2720
-> Materialize (cost=0.00..45.75 rows=1000 width=8) (actual time=0.009..3.625 rows=24976 loops=1)
Output: t_b.id, t_b.c_1
Buffers: shared hit=9
-> Index Scan using t_b_id on public.t_b (cost=0.00..43.25 rows=1000 width=8) (actual time=0.007..0.413 rows=1000 loops=1)
Output: t_b.id, t_b.c_1
Buffers: shared hit=9
Total runtime: 4798.890 ms
(19 rows)
postgres=# select version();
version
-----------------------------------------------------------------------------------------------------------------
PostgreSQL 9.2.24 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
(1 row)
The plan shows that, when handling the OR condition, the optimizer chose an index scan, and scanning table t_a alone took 3.57s.
For comparison, here is the plan without LIMIT:
postgres=# explain (analyze,buffers,verbose) select T_A.id from T_A left outer join T_B on T_A.id=T_B.id
where (T_A.c_1 = 999 or T_B.c_1=1 ) order by T_A.id ASC;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Sort (cost=152278.05..153528.06 rows=500004 width=4) (actual time=2409.779..2410.012 rows=2505 loops=1)
Output: t_a.id
Sort Key: t_a.id
Sort Method: quicksort Memory: 214kB
Buffers: shared hit=3149 read=18983
-> Hash Left Join (cost=27.50..91270.73 rows=500004 width=4) (actual time=0.514..2408.678 rows=2505 loops=1)
Output: t_a.id
Hash Cond: (t_a.id = t_b.id)
Filter: ((t_a.c_1 = 999) OR (t_b.c_1 = 1))
Rows Removed by Filter: 4997495
Buffers: shared hit=3146 read=18983
-> Seq Scan on public.t_a (cost=0.00..72124.00 rows=5000000 width=8) (actual time=0.030..809.089 rows=5000000 loops=1)
Output: t_a.id, t_a.c_1
Buffers: shared hit=3144 read=18980
-> Hash (cost=15.00..15.00 rows=1000 width=8) (actual time=0.463..0.463 rows=1000 loops=1)
Output: t_b.id, t_b.c_1
Buckets: 1024 Batches: 1 Memory Usage: 40kB
Buffers: shared hit=2 read=3
-> Seq Scan on public.t_b (cost=0.00..15.00 rows=1000 width=8) (actual time=0.003..0.198 rows=1000 loops=1)
Output: t_b.id, t_b.c_1
Buffers: shared hit=2 read=3
Total runtime: 2410.330 ms
(22 rows)
This plan uses a sequential scan, and scanning t_a takes only 809ms.
Yes, the sequential scan takes 809ms while the index scan takes 3.57s, and that scan difference accounts for essentially the entire gap between the two statements. You read that right.
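One way to see why the index scan loses so badly is the buffer counts reported in the two plans: the index scan on t_a touches roughly five million shared buffers (a heap fetch per tuple just to evaluate the filter), while the sequential scan reads the table's ~22k blocks once. A quick back-of-the-envelope check, with the numbers copied from the plans above:

```python
# Buffer accesses reported by the two plans (shared hit + read)
index_scan_buffers = 4977856 + 40181  # index scan on t_a, LIMIT plan
seq_scan_buffers = 3144 + 18980       # seq scan on t_a, no-LIMIT plan
ratio = index_scan_buffers / seq_scan_buffers
print(f"index scan touches ~{ratio:.0f}x more buffers")  # ~227x
```

Roughly two orders of magnitude more block accesses, which matches the runtime gap.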
Since we are on PG 9.2, let's also test a PostgreSQL 9.6.8 build:
postgres=# explain analyze select T_A.id from T_A left outer join T_B on T_A.id=T_B.id
where (T_A.c_1 = 999 or T_B.c_1=1 ) order by T_A.id ASC limit 4000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1.75..1992.77 rows=4000 width=4) (actual time=0.024..4631.075 rows=2505 loops=1)
-> Merge Left Join (cost=1.75..248880.44 rows=500004 width=4) (actual time=0.023..4630.797 rows=2505 loops=1)
Merge Cond: (t_a.id = t_b.id)
Filter: ((t_a.c_1 = 999) OR (t_b.c_1 = 1))
Rows Removed by Filter: 4997495
-> Index Scan using t_a_id on t_a (cost=0.43..235843.67 rows=5000000 width=8) (actual time=0.009..3393.133 rows=5000000 loops=1)
-> Materialize (cost=0.28..45.77 rows=1000 width=8) (actual time=0.009..2.465 rows=24976 loops=1)
-> Index Scan using t_b_id on t_b (cost=0.28..43.27 rows=1000 width=8) (actual time=0.008..0.384 rows=1000 loops=1)
Planning time: 0.586 ms
Execution time: 4631.266 ms
(10 rows)
postgres=# select version();
version
----------------------------------------------------------------------------------------------------------------
PostgreSQL 9.6.8 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit
(1 row)
With LIMIT, the execution time is about the same, and the planner still chooses the index scan. (In local tests, PostgreSQL 10 does not show this problem: for one thing it has parallel query, and for another the planner has improved; even with parallel query disabled it picks a hash join rather than an index scan.)
Problem analysis
Looking at the SQL closely, the statement itself is simple, but it contains an OR that spans two different tables:
select T_A.id from T_A left outer join T_B on T_A.id=T_B.id where (T_A.c_1 = 999 or T_B.c_1=1 ) order by T_A.id ASC limit 4000;
After some research, the likely reason the plan degrades is this:
Because of the trailing order by id limit 4000, the optimizer believes it only needs the first 4000 tuples in id order; id carries an index, so it defaults to an index scan on id, assuming that is the fastest way to produce those first rows. Unfortunately, the OR condition has nothing to do with id, so every tuple along the way must be fetched and tested against the filter; the number of blocks read balloons (one heap visit per index entry just to evaluate the condition), and performance collapses. This plan flip only shows up at large data volumes.
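The cost arithmetic makes the planner's choice concrete. PostgreSQL charges a Limit node only the fraction of its input's cost needed to produce the first N rows, roughly startup + (N/rows) × (total − startup). Plugging in the merge-join plan's numbers reproduces the Limit cost the planner printed:

```python
# Numbers taken from the 9.2 merge-join plan above
startup, total = 0.42, 12013798.38  # cost range of the Merge Left Join node
est_rows, limit_n = 500004, 4000    # planner's row estimate and the LIMIT

limit_cost = startup + (limit_n / est_rows) * (total - startup)
print(f"{limit_cost:.1f}")  # 96110.0, matching the Limit node's cost estimate
```

By contrast, the sort-based plan costs ~152k before its first row can be returned (the sort must consume all input), so on paper 96110 wins easily. The real culprit is the row estimate: the planner expects 500004 matching rows, but only 2505 actually qualify, so the "cheap fraction" assumption is wildly off.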
With LIMIT 400000 instead, the plan becomes:
postgres=# explain analyze select T_A.id from T_A left outer join T_B on T_A.id=T_B.id
where (T_A.c_1 = 999 or T_B.c_1=1 ) order by T_A.id ASC limit 400000;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Limit (cost=145439.95..146439.95 rows=400000 width=4) (actual time=1976.531..1977.094 rows=2505 loops=1)
-> Sort (cost=145439.95..146689.96 rows=500004 width=4) (actual time=1976.530..1976.848 rows=2505 loops=1)
Sort Key: t_a.id
Sort Method: quicksort Memory: 214kB
-> Hash Left Join (cost=27.50..91271.62 rows=500004 width=4) (actual time=0.445..1975.517 rows=2505 loops=1)
Hash Cond: (t_a.id = t_b.id)
Filter: ((t_a.c_1 = 999) OR (t_b.c_1 = 1))
Rows Removed by Filter: 4997495
-> Seq Scan on t_a (cost=0.00..72124.00 rows=5000000 width=8) (actual time=0.037..559.274 rows=5000000 loops=1)
-> Hash (cost=15.00..15.00 rows=1000 width=8) (actual time=0.394..0.394 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 48kB
-> Seq Scan on t_b (cost=0.00..15.00 rows=1000 width=8) (actual time=0.004..0.153 rows=1000 loops=1)
Planning time: 1.164 ms
Execution time: 1977.321 ms
(14 rows)
The query returns the same result, but the plan switches from an index scan back to a sequential scan and from a merge join to a hash join, and execution time drops sharply.
In the end, after careful analysis, we advised the business team to append a harmless extra sort column to the ORDER BY, e.g. ORDER BY T_A.id, T_A.c_1.
The resulting plan:
postgres=# explain analyze select T_A.id from T_A left outer join T_B on T_A.id=T_B.id
where (T_A.c_1 = 999 or T_B.c_1=1 ) order by T_A.id ASC, T_A.c_1 ASC limit 4000;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Limit (cost=123686.35..123696.35 rows=4000 width=8) (actual time=1921.081..1921.665 rows=2505 loops=1)
-> Sort (cost=123686.35..124936.36 rows=500004 width=8) (actual time=1921.081..1921.401 rows=2505 loops=1)
Sort Key: t_a.id, t_a.c_1
Sort Method: quicksort Memory: 214kB
-> Hash Left Join (cost=27.50..91271.62 rows=500004 width=8) (actual time=0.402..1919.781 rows=2505 loops=1)
Hash Cond: (t_a.id = t_b.id)
Filter: ((t_a.c_1 = 999) OR (t_b.c_1 = 1))
Rows Removed by Filter: 4997495
-> Seq Scan on t_a (cost=0.00..72124.00 rows=5000000 width=8) (actual time=0.063..565.419 rows=5000000 loops=1)
-> Hash (cost=15.00..15.00 rows=1000 width=8) (actual time=0.330..0.330 rows=1000 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 48kB
-> Seq Scan on t_b (cost=0.00..15.00 rows=1000 width=8) (actual time=0.005..0.117 rows=1000 loops=1)
Planning time: 0.330 ms
Execution time: 1921.997 ms
(14 rows)
This change does not affect the query result, and the plan stays the fast one. Note, however, that you must not create a composite index on T_A(id, c_1); otherwise the plan flips back just as before.
Closing thoughts
This problem hit our production environment twice. Follow-up investigation turned up more cases of the same shape, and after applying the fix, performance no longer regressed. Tests on MySQL 5.5 showed much the same behavior there too.
On further analysis, the best fix would be to transform the OR condition into a UNION ALL, but the PostgreSQL planner currently cannot do that transformation automatically. Happily, the community has noticed the problem and is working on a patch to handle it in the planner, which should improve performance markedly:
Convert join OR clauses into UNION queries
https://commitfest.postgresql.org/18/1001/
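The transformation that patch aims to automate can be sketched by hand today. A caveat before the sketch: UNION deduplicates, so the rewrite below is equivalent to the original OR query only when the result contains no duplicate rows (which is not true of the test data above, where T_A holds each id five times); UNION ALL plus explicit de-duplication logic would be needed in the general case. Here it is checked on a small duplicate-free sqlite3 dataset for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_a (id INTEGER, c_1 INTEGER);
    CREATE TABLE t_b (id INTEGER, c_1 INTEGER);
""")
# Unique ids on both sides, so UNION's deduplication is harmless here
conn.executemany("INSERT INTO t_a VALUES (?, ?)", [(i, i % 1000) for i in range(5000)])
conn.executemany("INSERT INTO t_b VALUES (?, ?)", [(i, i % 10) for i in range(1000)])

# Original formulation: OR across the two sides of an outer join
original = conn.execute("""
    SELECT t_a.id FROM t_a LEFT OUTER JOIN t_b ON t_a.id = t_b.id
    WHERE t_a.c_1 = 999 OR t_b.c_1 = 1
    ORDER BY t_a.id LIMIT 4000
""").fetchall()

# Manual rewrite: one UNION branch per OR arm, so each can use its own index
rewritten = conn.execute("""
    SELECT t_a.id FROM t_a WHERE t_a.c_1 = 999
    UNION
    SELECT t_a.id FROM t_a JOIN t_b ON t_a.id = t_b.id WHERE t_b.c_1 = 1
    ORDER BY id LIMIT 4000
""").fetchall()

print(original == rewritten)  # True on this duplicate-free dataset
```

Each branch of the rewrite has a single indexable predicate, so a planner can drive both from index scans and merge the small results, rather than walking the whole id index testing an unrelated OR filter.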
I don't know the planner internals well, so I don't fully understand the mechanics of this plan flip; explanations from those who do would be very welcome.