AUTOVACUUM
AUTOVACUUM 简介
PostgreSQL 提供了 AUTOVACUUM 的机制。
autovacuum 不仅会自动进行 VACUUM,也会自动进行 ANALYZE,以分析统计信息用于执行计划。
在 postgresql.conf 中,autovacuum
参数已默认打开。
autovacuum = on
autovacuum 打开后,会有一个 autovacuum launcher 进程
$ ps -ef|grep postgres|grep autovacuum|grep -v grep
postgres 28398 28392 0 Nov13 ? 00:00:19 postgres: autovacuum launcher
在 pg_stat_activity
也可以看到 backend_type
为 autovacuum launcher 的连接:
psql -d alvindb -U postgres
alvindb=# \x
Expanded display is on.
alvindb=# SELECT * FROM pg_stat_activity WHERE backend_type = 'autovacuum launcher';
-[ RECORD 1 ]----+------------------------------
datid |
datname |
pid | 28398
usesysid |
usename |
application_name |
client_addr |
client_hostname |
client_port |
backend_start | 2021-11-13 23:18:00.406618+08
xact_start |
query_start |
state_change |
wait_event_type | Activity
wait_event | AutoVacuumMain
state |
backend_xid |
backend_xmin |
query |
backend_type | autovacuum launcher
那么 AUTOVACUUM 多久运行一次?
autovacuum launcher 会每隔 autovacuum_naptime
,创建 autovacuum worker,检查是否需要做 autovacuum。
psql -d alvindb -U postgres
alvindb=# SELECT * FROM pg_stat_activity WHERE backend_type = 'autovacuum worker';
-[ RECORD 1 ]----+------------------------------
datid | 13220
datname | postgres
pid | 32457
usesysid |
usename |
application_name |
client_addr |
client_hostname |
client_port |
backend_start | 2021-11-06 23:32:53.880281+08
xact_start |
query_start |
state_change |
wait_event_type |
wait_event |
state |
backend_xid |
backend_xmin |
query |
backend_type | autovacuum worker
autovacuum_naptime 默认为 1min:
#autovacuum_naptime = 1min # time between autovacuum runs
autovacuum 又是根据什么标准决定是否进行 VACUUM 和 ANALYZE 呢?
当 autovacuum worker 检查到,
dead tuples 大于 vacuum threshold 时,会自动进行 VACUUM。
vacuum threshold 公式如下:
vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
增删改的行数据大于 analyze threshold 时,会自动进行 ANALYZE。
analyze threshold 公式如下:
analyze threshold = analyze base threshold + analyze scale factor * number of tuples
对应 postgresql.conf 中相关参数如下:
#autovacuum_vacuum_threshold = 50 # min number of row updates before vacuum
#autovacuum_analyze_threshold = 50 # min number of row updates before analyze
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
dead tuples 为 pg_stat_user_tables.n_dead_tup
(Estimated number of dead rows)
alvindb=> SELECT * FROM pg_stat_user_tables WHERE schemaname = 'alvin' AND relname = 'tb_test_vacuum';
-[ RECORD 1 ]-------+---------------
relid | 37409
schemaname | alvin
relname | tb_test_vacuum
seq_scan | 2
seq_tup_read | 0
idx_scan | 0
idx_tup_fetch | 0
n_tup_ins | 0
n_tup_upd | 0
n_tup_del | 0
n_tup_hot_upd | 0
n_live_tup | 0
n_dead_tup | 0
n_mod_since_analyze | 0
last_vacuum |
last_autovacuum |
last_analyze |
last_autoanalyze |
vacuum_count | 0
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 0
那么 number of tuples 是哪个列的值?是 pg_stat_user_tables.n_live_tup
(Estimate number of live rows)?还是实际的 count 值?
其实是 pg_class.reltuples
(Estimate number of live rows in the table used by the planner)。
alvindb=> SELECT u.schemaname,u.relname,c.reltuples,u.n_live_tup,u.n_mod_since_analyze,u.n_dead_tup,u.last_autoanalyze,u.last_autovacuum
FROM
pg_stat_user_tables u, pg_class c, pg_namespace n
WHERE n.oid = c.relnamespace
AND c.relname = u.relname
AND n.nspname = u.schemaname
AND u.schemaname = 'alvin'
AND u.relname = 'tb_test_vacuum'
-[ RECORD 1 ]-------+---------------
schemaname | alvin
relname | tb_test_vacuum
reltuples | 0
n_live_tup | 0
n_mod_since_analyze | 0
n_dead_tup | 0
last_autoanalyze |
last_autovacuum |
所以 AUTO VACUUM 具体公式如下:
pg_stat_user_tables.n_dead_tup > autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * pg_class.reltuples
同理,AUTO ANALYZE 具体公式如下:
pg_stat_user_tables.n_mod_since_analyze > autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * pg_class.reltuples
精准触发 AUTOVACUUM
下面实测一下 autovacuum。为了测试方便,autovacuum_naptime
临时修改为 5s,这样触发了临界条件,只需要等 5s 就能看到效果,而不是等 1min。
修改参数如下:
autovacuum_naptime = 5s
autovacuum_vacuum_threshold = 100 # min number of row updates before vacuum
autovacuum_analyze_threshold = 100 # min number of row updates before analyze
autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
接下来通过一步一步测试,精准触发 autovacuum。
为了方便测试,通过如下 AUTOVACUUM 计算 SQL 计算需要删除或修改的数据行数。