PostgreSQL VACUUM 之深入浅出 (二)

本文深入探讨了 PostgreSQL 的 AUTOVACUUM 机制,包括 AUTOVACUUM 的原理、配置参数以及如何精准触发 AUTOVACUUM。通过实验演示了 AUTOVACUUM 触发的条件,如达到死元组阈值和分析阈值,并展示了如何设置表级 AUTOVACUUM 参数以适应不同大小的表。同时,文章还提供了监控 AUTOVACUUM 的方法和调整策略。
摘要由CSDN通过智能技术生成

Python微信订餐小程序课程视频

https://edu.csdn.net/course/detail/36074

Python实战量化交易理财系统

https://edu.csdn.net/course/detail/35475

AUTOVACUUM

AUTOVACUUM 简介

PostgreSQL 提供了 AUTOVACUUM 的机制。

autovacuum 不仅会自动进行 VACUUM,也会自动进行 ANALYZE,以分析统计信息用于执行计划。

在 postgresql.conf 中,autovacuum 参数已默认打开。



|  | autovacuum = on |


autovacuum 打开后,会有一个 autovacuum launcher 进程



|  | $ ps -ef|grep postgres|grep autovacuum|grep -v grep |
|  | postgres 28398 28392 0 Nov13 ? 00:00:19 postgres: autovacuum launcher  |


pg_stat_activity 也可以看到 backend_type 为 autovacuum launcher 的连接:



|  | psql -d alvindb -U postgres |
|  | alvindb=# \x |
|  | Expanded display is on. |
|  | alvindb=# SELECT * FROM pg\_stat\_activity WHERE backend\_type = 'autovacuum launcher'; |
|  | -[ RECORD 1 ]----+------------------------------ |
|  | datid |  |
|  | datname |  |
|  | pid | 28398 |
|  | usesysid |  |
|  | usename |  |
|  | application\_name |  |
|  | client\_addr |  |
|  | client\_hostname |  |
|  | client\_port |  |
|  | backend\_start | 2021-11-13 23:18:00.406618+08 |
|  | xact\_start |  |
|  | query\_start |  |
|  | state\_change |  |
|  | wait\_event\_type | Activity |
|  | wait\_event | AutoVacuumMain |
|  | state |  |
|  | backend\_xid |  |
|  | backend\_xmin |  |
|  | query |  |
|  | backend\_type | autovacuum launcher |


那么 AUTOVACUUM 多久运行一次?

autovacuum launcher 会每隔 autovacuum_naptime ,创建 autovacuum worker,检查是否需要做 autovacuum。



|  | psql -d alvindb -U postgres |
|  | alvindb=# SELECT * FROM pg\_stat\_activity WHERE backend\_type = 'autovacuum worker'; |
|  | -[ RECORD 1 ]----+------------------------------ |
|  | datid | 13220 |
|  | datname | postgres |
|  | pid | 32457 |
|  | usesysid | |
|  | usename | |
|  | application\_name | |
|  | client\_addr | |
|  | client\_hostname | |
|  | client\_port | |
|  | backend\_start | 2021-11-06 23:32:53.880281+08 |
|  | xact\_start | |
|  | query\_start | |
|  | state\_change | |
|  | wait\_event\_type | |
|  | wait\_event | |
|  | state | |
|  | backend\_xid | |
|  | backend\_xmin | |
|  | query | |
|  | backend\_type | autovacuum worker |


autovacuum_naptime 默认为 1min:



|  | #autovacuum\_naptime = 1min # time between autovacuum runs |


autovacuum 又是根据什么标准决定是否进行 VACUUM 和 ANALYZE 呢?

当 autovacuum worker 检查到,

dead tuples 大于 vacuum threshold 时,会自动进行 VACUUM。

vacuum threshold 公式如下:



|  | vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples |


增删改的行数据大于 analyze threshold 时,会自动进行 ANALYZE。

analyze threshold 公式如下:



|  | analyze threshold = analyze base threshold + analyze scale factor * number of tuples |


对应 postgresql.conf 中相关参数如下:



|  | #autovacuum\_vacuum\_threshold = 50 # min number of row updates before vacuum |
|  | #autovacuum\_analyze\_threshold = 50 # min number of row updates before analyze |
|  | #autovacuum\_vacuum\_scale\_factor = 0.2 # fraction of table size before vacuum |
|  | #autovacuum\_analyze\_scale\_factor = 0.1 # fraction of table size before analyze |


dead tuples 为 pg_stat_user_tables.n_dead_tup(Estimated number of dead rows)



|  | alvindb=> SELECT * FROM pg\_stat\_user\_tables WHERE schemaname = 'alvin' AND relname = 'tb\_test\_vacuum'; |
|  | -[ RECORD 1 ]-------+--------------- |
|  | relid | 37409 |
|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | seq\_scan | 2 |
|  | seq\_tup\_read | 0 |
|  | idx\_scan | 0 |
|  | idx\_tup\_fetch | 0 |
|  | n\_tup\_ins | 0 |
|  | n\_tup\_upd | 0 |
|  | n\_tup\_del | 0 |
|  | n\_tup\_hot\_upd | 0 |
|  | n\_live\_tup | 0 |
|  | n\_dead\_tup | 0 |
|  | n\_mod\_since\_analyze | 0 |
|  | last\_vacuum | |
|  | last\_autovacuum | |
|  | last\_analyze | |
|  | last\_autoanalyze | |
|  | vacuum\_count | 0 |
|  | autovacuum\_count | 0 |
|  | analyze\_count | 0 |
|  | autoanalyze\_count | 0 |


那么 number of tuples 是哪个列的值?是 pg_stat_user_tables.n_live_tup(Estimate number of live rows)?还是实际的 count 值?

其实是 pg_class.reltuples (Estimate number of live rows in the table used by the planner)。



|  | alvindb=> SELECT u.schemaname,u.relname,c.reltuples,u.n\_live\_tup,u.n\_mod\_since\_analyze,u.n\_dead\_tup,u.last\_autoanalyze,u.last\_autovacuum |
|  | FROM |
|  |  pg\_stat\_user\_tables u, pg\_class c, pg\_namespace n |
|  | WHERE n.oid = c.relnamespace |
|  | AND c.relname = u.relname |
|  | AND n.nspname = u.schemaname |
|  | AND u.schemaname = 'alvin' |
|  | AND u.relname = 'tb\_test\_vacuum' |
|  | -[ RECORD 1 ]-------+--------------- |
|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | reltuples | 0 |
|  | n\_live\_tup | 0 |
|  | n\_mod\_since\_analyze | 0 |
|  | n\_dead\_tup | 0 |
|  | last\_autoanalyze | |
|  | last\_autovacuum | |


所以 AUTO VACUUM 具体公式如下:



|  | pg\_stat\_user\_tables.n\_dead\_tup > autovacuum\_vacuum\_threshold + autovacuum\_vacuum\_scale\_factor * pg\_class.reltuples |


同理,AUTO ANALYZE 具体公式如下:



|  | pg\_stat\_user\_tables.n\_mod\_since\_analyze > autovacuum\_analyze\_threshold + autovacuum\_analyze\_scale\_factor * pg\_class.reltuples |


精准触发 AUTOVACUUM

下面实测一下 autovacuum。为了测试方便,autovacuum_naptime 临时修改为 5s,这样触发了临界条件,只需要等 5s 就能看到效果,而不是等 1min。

修改参数如下:



|  | autovacuum\_naptime = 5s |
|  | autovacuum\_vacuum\_threshold = 100 # min number of row updates before vacuum |
|  | autovacuum\_analyze\_threshold = 100 # min number of row updates before analyze |
|  | autovacuum\_vacuum\_scale\_factor = 0.2 # fraction of table size before vacuum |
|  | autovacuum\_analyze\_scale\_factor = 0.1 # fraction of table size before analyze |


接下来通过一步一步测试,精准触

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值