Postgresql之autovacuum分析及表的垃圾数据查看

原文:https://blog.csdn.net/Linzhongyilisha/article/details/79259099

转载:http://blog.163.com/digoal@126/blog/static/163877040201343031118890/

PostgreSQL数据库日常维护需要维护哪些东西, 和数据库中的业务类型有莫大的关系.

PostgreSQL的并发控制简单来说是通过多tuple版本, tuple infomask信息, 事务提交状态以及事务snapshot来实现的.

当删除一条记录时, 并不是马上回收被删除的空间, 因为有可能其他事务还会用到它, 当更新一条记录是, 老的记录会保留, 然后插入新的记录.

例如 : 

digoal=# create table tbl(id int, info text);

CREATE TABLE

digoal=# insert into tbl values (1, 'test');

INSERT 0 1

digoal=# delete from tbl;

DELETE 1

digoal=# insert into tbl values (1, 'test');

INSERT 0 1

digoal=# delete from tbl;

DELETE 1

digoal=# insert into tbl values (1, 'test');

INSERT 0 1

digoal=# select ctid,* from tbl;

 ctid  | id | info 

-------+----+------

 (0,3) |  1 | test

(1 row)

 

多次删除插入后, ctid以及变成3了, 因为前面的两条并为删除. 

update也是如此 : 

 

 

digoal=# update tbl set info='new';

UPDATE 1

digoal=# select ctid,* from tbl;

 ctid  | id | info 

-------+----+------

 (0,4) |  1 | new

(1 row)

 

老的tuple在0号block的itemid=3的位置, 新的tuple是后面插入的在0号block的4号槽.

那么这些垃圾数据是怎么回收的呢, PostgreSQL的vacuum进程就是干这个事情的.

1. vacuum 数据清理.

以上测试表在执行vacuum后的输出如下 : 

移除了3个版本. 

 

 

digoal=# vacuum verbose tbl;

INFO:  vacuuming "public.tbl"

INFO:  "tbl": removed 3 row versions in 1 pages

INFO:  "tbl": found 3 removable, 1 nonremovable row versions in 1 out of 1 pages

DETAIL:  0 dead row versions cannot be removed yet.

There were 0 unused item pointers.

0 pages are entirely empty.

CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  vacuuming "pg_toast.pg_toast_32771"

INFO:  index "pg_toast_32771_index" now contains 0 row versions in 1 pages

DETAIL:  0 index row versions were removed.

0 index pages have been deleted, 0 are currently reusable.

CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  "pg_toast_32771": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages

DETAIL:  0 dead row versions cannot be removed yet.

There were 0 unused item pointers.

0 pages are entirely empty.

CPU 0.00s/0.00u sec elapsed 0.00 sec.

VACUUM

 

重新插入数据, 此时那些被垃圾占用的槽位就可以被利用了.

 

 

digoal=# insert into tbl values (1, 'test');

INSERT 0 1

digoal=# select ctid,* from tbl;

 ctid  | id | info 

-------+----+------

 (0,1) |  1 | test

 (0,4) |  1 | new

(2 rows)

 

一个表有多少条垃圾数据, 多少条活跃数据在系统表pg_stat_all_tables中可以查询.

 

 

digoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;

-[ RECORD 1 ]-----+------------------------------

relid             | 32771

schemaname        | public

relname           | tbl

seq_scan          | 6

seq_tup_read      | 7

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 4

n_tup_upd         | 1

n_tup_del         | 2

n_tup_hot_upd     | 1

n_live_tup        | 2

n_dead_tup        | 0

last_vacuum       | 2013-05-27 17:00:17.094391+08

last_autovacuum   | 

last_analyze      | 

last_autoanalyze  | 

vacuum_count      | 1

autovacuum_count  | 0

analyze_count     | 0

autoanalyze_count | 0

 

n_live_tup        | 2表示有2条活跃数据, 

n_dead_tup        | 0表示有0条垃圾数据.

执行以下删除后, 会发生变化 : 

 

 

digoal=# delete from tbl;;

DELETE 2

digoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;

-[ RECORD 1 ]-----+------------------------------

relid             | 32771

schemaname        | public

relname           | tbl

seq_scan          | 7

seq_tup_read      | 9

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 4

n_tup_upd         | 1

n_tup_del         | 4

n_tup_hot_upd     | 1

n_live_tup        | 0

n_dead_tup        | 2

last_vacuum       | 2013-05-27 17:00:17.094391+08

last_autovacuum   | 

last_analyze      | 

last_autoanalyze  | 

vacuum_count      | 1

autovacuum_count  | 0

analyze_count     | 0

autoanalyze_count | 0

 

n_live_tup        | 0表示有0条活跃数据, 

n_dead_tup        | 2表示有2条垃圾数据.

vacuum 后活跃数据和垃圾数据都会变成0

 

 

digoal=# vacuum tbl;

VACUUM

digoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;

-[ RECORD 1 ]-----+------------------------------

relid             | 32771

schemaname        | public

relname           | tbl

seq_scan          | 7

seq_tup_read      | 9

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 4

n_tup_upd         | 1

n_tup_del         | 4

n_tup_hot_upd     | 1

n_live_tup        | 0

n_dead_tup        | 0

last_vacuum       | 2013-05-27 17:05:17.664564+08

last_autovacuum   | 

last_analyze      | 

last_autoanalyze  | 

vacuum_count      | 2

autovacuum_count  | 0

analyze_count     | 0

autoanalyze_count | 0

 

 

2. 自动垃圾回收的配置.

对于一个DML频繁的数据库, 如果靠手动来回收垃圾是不太靠谱的事情, PostgreSQL提供了自动的垃圾回收配置.

相关参数如下 : 

 

 

#------------------------------------------------------------------------------

# AUTOVACUUM PARAMETERS

#------------------------------------------------------------------------------

 

#autovacuum = on                        # Enable autovacuum subprocess?  'on'

                                        # requires track_counts to also be on.

#log_autovacuum_min_duration = -1       # -1 disables, 0 logs all actions and

                                        # their durations, > 0 logs only

                                        # actions running at least this number

                                        # of milliseconds.

#autovacuum_max_workers = 3             # max number of autovacuum subprocesses

                                        # (change requires restart)

#autovacuum_naptime = 1min              # time between autovacuum runs

#autovacuum_vacuum_threshold = 50       # min number of row updates before

                                        # vacuum

#autovacuum_analyze_threshold = 50      # min number of row updates before

                                        # analyze

#autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum

#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze

#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum

                                        # (change requires restart)

#autovacuum_vacuum_cost_delay = 20ms    # default vacuum cost delay for

                                        # autovacuum, in milliseconds;

                                        # -1 means use vacuum_cost_delay

#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for

                                        # autovacuum, -1 means use

                                        # vacuum_cost_limit

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0 # 0-100 milliseconds
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 1-10000 credits

 

简单介绍一下参数的含义 : 

autovacuum, 自动垃圾回收的开关

log_autovacuum_min_duration, 在什么情况下记录autovacuum日志输出. 0表示记录所有的autovacuum, -1表示不记录, 其他为时间阈值, 大于或等于这个时长的autovacuum才记录.

autovacuum_max_workers, 指最大允许多少个autovacuum子进程同时工作. 因为vacuum会带来IO上的开销, 还会消耗内存. 这个就不要配太大了. 

autovacuum_vacuum_threshold表示autovacuum的vacuum操作所需的最小变更数, 如果这个表的update/delete的tuple总数小于这个数字则不会触发autovacuum的vacuum操作.

和autovacuum_analyze_threshold表示autovacuum的analyze操作所需的最小变更数, 如果这个表的insert/update/delete的tuple总数小于这个数字则不会触发autovacuum的analyze操作.

autovacuum_vacuum_scale_factor, 表示autovacuum的vacuum操作所需的变更量阈值,当这个表的update/delete的tuple总数大于(pg_class.reltuples*autovacuum_vacuum_scale_factor+autovacuum_vacuum_threshold)时, 触发vacuum操作.

autovacuum_analyze_scale_factor, 表示autovacuum的analyze操作所需的变更量阈值,当这个表的INSERT/update/delete的tuple总数大于(pg_class.reltuples*autovacuum_analyze_scale_factor+autovacuum_analyze_threshold)时, 触发analyze操作.

autovacuum_freeze_max_age, 即使autovacuum未开启, 为了防止wrapped xid导致数据不可见, 也会自动触发的vacuum操作. 表示一个表中存在的最早的事务信息到现在为止经历的事务数. 超出则强制vacuum. 防止xid wrapped.

autovacuum_vacuum_cost_delay, 因为vacuum会带来一定的IO开销, 所以PostgreSQL允许管理员指定当vacuum达到一定的阈值后进入随眠状态, 然后再唤醒继续vacuum. 具体的计算需要配置项Cost-Based Vacuum Delay决定.

接下来主要举例说明几个threshold参数的作用 : 

查看当前的阈值 : 

 

digoal=# show autovacuum_analyze_scale_factor;

 autovacuum_analyze_scale_factor 

---------------------------------

 0.1

(1 row)

digoal=# show autovacuum_vacuum_scale_factor;

 autovacuum_vacuum_scale_factor 

--------------------------------

 0.2

(1 row)

digoal=# show autovacuum_analyze_threshold;

 autovacuum_analyze_threshold 

------------------------------

 50

(1 row)

digoal=# show autovacuum_vacuum_threshold;

 autovacuum_vacuum_threshold 

-----------------------------

 50

(1 row)

 

修改naptime, 以及log_autovacuum_min_duration 便于从日志中或者统计表中观察结果 : 

 

pg93@db-172-16-3-33-> cd $PGDATA

pg93@db-172-16-3-33-> vi postgresql.conf 

autovacuum_naptime = 1s

log_autovacuum_min_duration = 0

pg93@db-172-16-3-33-> pg_ctl reload

server signaled

 

创建测试表 : 

 

digoal=# create table tbl(id int, info text);

CREATE TABLE

 

计算插入多少条数据后会触发analyze : 

 

digoal=# select reltuples from pg_class where relname='tbl';

 reltuples 

-----------

         0

(1 row)

autovacuum_analyze_scale_factor*0+autovacuum_analyze_threshold=50;

 

因此插入51条数据后会发生analyze.

记录pg_stat_all_tables的tbl信息, 注意last_autovacuum , 和last_autoanalyze 的值.

 

digoal=# select * from pg_stat_all_tables where relname ='tbl';

-[ RECORD 1 ]-----+-------

relid             | 32798

schemaname        | public

relname           | tbl

seq_scan          | 0

seq_tup_read      | 0

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 0

n_tup_upd         | 0

n_tup_del         | 0

n_tup_hot_upd     | 0

n_live_tup        | 0

n_dead_tup        | 0

last_vacuum       | 

last_autovacuum   | 

last_analyze      | 

last_autoanalyze  | 

vacuum_count      | 0

autovacuum_count  | 0

analyze_count     | 0

autoanalyze_count | 0

 

插入50条测试数据 : 

 

digoal=# insert into tbl select generate_series(1,50),'test';

INSERT 0 50

 

stat信息, 未触发analyze.

 

digoal=# select * from pg_stat_all_tables where relname ='tbl';

-[ RECORD 1 ]-----+-------

relid             | 32798

schemaname        | public

relname           | tbl

seq_scan          | 0

seq_tup_read      | 0

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 50

n_tup_upd         | 0

n_tup_del         | 0

n_tup_hot_upd     | 0

n_live_tup        | 50

n_dead_tup        | 0

last_vacuum       | 

last_autovacuum   | 

last_analyze      | 

last_autoanalyze  | 

vacuum_count      | 0

autovacuum_count  | 0

analyze_count     | 0

autoanalyze_count | 0

 

再插入1条记录.

digoal=# insert into tbl select 51,'test';

INSERT 0 1

触发analyze : 

digoal=# select * from pg_stat_all_tables where relname ='tbl';

-[ RECORD 1 ]-----+------------------------------

relid             | 32798

schemaname        | public

relname           | tbl

seq_scan          | 0

seq_tup_read      | 0

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 51

n_tup_upd         | 0

n_tup_del         | 0

n_tup_hot_upd     | 0

n_live_tup        | 51

n_dead_tup        | 0

last_vacuum       | 

last_autovacuum   | 

last_analyze      | 

last_autoanalyze  | 2013-05-27 21:23:49.144829+08

vacuum_count      | 0

autovacuum_count  | 0

analyze_count     | 0

autoanalyze_count | 1

 

 

计算插入多少条数据后会触发analyze : 

 

 

digoal=# select reltuples from pg_class where relname='tbl';

-[ RECORD 1 ]-

reltuples | 51

autovacuum_analyze_scale_factor*51+autovacuum_analyze_threshold=55.1;

 

因此插入56条数据后会触发analyze.

 

digoal=# insert into tbl select generate_series(1,55),'test';

INSERT 0 55

 

digoal=# select * from pg_stat_all_tables where relname ='tbl';

-[ RECORD 1 ]-----+------------------------------

relid             | 32798

schemaname        | public

relname           | tbl

seq_scan          | 0

seq_tup_read      | 0

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 106

n_tup_upd         | 0

n_tup_del         | 0

n_tup_hot_upd     | 0

n_live_tup        | 106

n_dead_tup        | 0

last_vacuum       | 

last_autovacuum   | 

last_analyze      | 

last_autoanalyze  | 2013-05-27 21:23:49.144829+08

vacuum_count      | 0

autovacuum_count  | 0

analyze_count     | 0

autoanalyze_count | 1

 

digoal=# select reltuples from pg_class where relname='tbl';

-[ RECORD 1 ]-

reltuples | 51

 

再插入1条即可触发analyze.

 

digoal=# insert into tbl select 1,'test';

INSERT 0 1

 

digoal=# select * from pg_stat_all_tables where relname ='tbl';

-[ RECORD 1 ]-----+-----------------------------

relid             | 32798

schemaname        | public

relname           | tbl

seq_scan          | 0

seq_tup_read      | 0

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 107

n_tup_upd         | 0

n_tup_del         | 0

n_tup_hot_upd     | 0

n_live_tup        | 107

n_dead_tup        | 0

last_vacuum       | 

last_autovacuum   | 

last_analyze      | 

last_autoanalyze  | 2013-05-27 21:26:25.57402+08

vacuum_count      | 0

autovacuum_count  | 0

analyze_count     | 0

autoanalyze_count | 2

 

digoal=# select reltuples from pg_class where relname='tbl';

-[ RECORD 1 ]--

reltuples | 107

 

计算UPDATE/DELETE多少条数据后会触发vacuum : 

autovacuum_vacuum_scale_factor*107+autovacuum_vacuum_threshold=71.4;

因此更新或删除共计72条数据后会触发vacuum, 如果中间发生了analyze, 导致pg_class.reltuples发生变化, 这个值也会变化.

发生62次insert,update,delete后会触发analyze.

digoal=# update tbl set info='new' where id<18;

UPDATE 35

digoal=# update tbl set info='new' where id<10;

UPDATE 19

digoal=# delete from tbl where id<9;

DELETE 17

 

总共发生了35+19+17=71超出62次dml, 发生analyze, 

analyze后, pg_class.reltuples变成90, 

digoal=# select reltuples from pg_class where relname='tbl';

-[ RECORD 1 ]-

reltuples | 90

所以触发vacuum的值变成了多少呢?

autovacuum_vacuum_scale_factor*90+autovacuum_vacuum_threshold=68;

因此只需要69次update/delete即可触发vacuum, 而上一次vacuum到现在已经发生了71次update/delete, 因此会触发vacuum.

 

digoal=# select * from pg_stat_all_tables where relname ='tbl';

-[ RECORD 1 ]-----+------------------------------

relid             | 32798

schemaname        | public

relname           | tbl

seq_scan          | 8

seq_tup_read      | 856

idx_scan          | 

idx_tup_fetch     | 

n_tup_ins         | 107

n_tup_upd         | 54

n_tup_del         | 17

n_tup_hot_upd     | 54

n_live_tup        | 90

n_dead_tup        | 0

last_vacuum       | 

last_autovacuum   | 2013-05-27 21:31:23.560703+08

last_analyze      | 

last_autoanalyze  | 2013-05-27 21:31:22.474655+08

vacuum_count      | 0

autovacuum_count  | 1

analyze_count     | 0

autoanalyze_count | 3

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值