日常维护
垃圾回收vacuum
统计信息的收集
更新visibility map (vm可以在垃圾回收时减少数据块的扫描,只需要扫描有dead_tup的数据块) ,日常的dml操作也会更新vm
事务号(xid)的处理 (模拟事务消失的实验)
(一)全局自动vacuum
autovacuum = on
autovacuum_max_workers = 3
maintenance_work_mem = 64MB
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_threshold = 50
track_counts = on
1.查看是否设置了autovacuum
[postgres@v-vlxsz-devdb01 ~]$ psql
psql (14.1)
Type "help" for help.
postgres=# show autovacuum;
autovacuum
------------
on
(1 row)
postgres=#
2.autovacuum launcher的进程
垃圾回收的后台进程是autovacuum launcher
[root@v-vlxsz-devdb01 ~]# top -c -u postgres
op - 16:54:30 up 81 days, 53 min, 1 user, load average: 0.05, 0.06, 0.05
Tasks: 298 total, 1 running, 297 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.2 sy, 0.0 ni, 99.5 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 8010576 total, 171260 free, 1333224 used, 6506092 buff/cache
KiB Swap: 3145724 total, 3145724 free, 0 used. 3443880 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31233 postgres 20 0 277392 16448 15972 S 0.0 0.2 0:01.86 /var/postgre/14.1/bin/postgres
31234 postgres 20 0 125364 968 552 S 0.0 0.0 0:00.00 postgres: logger
31236 postgres 20 0 277544 6872 6348 S 0.0 0.1 0:00.22 postgres: checkpointer
31237 postgres 20 0 277392 2412 1916 S 0.0 0.0 0:00.60 postgres: background writer
31238 postgres 20 0 277392 5052 4556 S 0.0 0.1 0:00.64 postgres: walwriter
31239 postgres 20 0 278088 2236 1380 S 0.0 0.0 0:02.14 postgres: autovacuum launcher
31240 postgres 20 0 277392 1120 652 S 0.0 0.0 0:00.14 postgres: archiver last was 000000030000000000000015
31241 postgres 20 0 127752 1244 612 S 0.0 0.0 0:04.00 postgres: stats collector
31242 postgres 20 0 277924 1588 916 S 0.0 0.0 0:00.05 postgres: logical replication launcher
如果数据库并发量很多,业务更新频繁的话,3个autovacuum_max_workers进程数不够用的情况下,需要增加autovacuum_max_workers进程数。此时添加个数要注意,每一个autovacuum_max_workers的进程都要消耗maintenance_work_mem的值对应的内存大小。例如maintenance_work_mem = 64MB时,autovacuum_max_workers = 3 需要消耗64*3=192MB内存。 autovacuum_max_workers = 4 需要消耗64*3=256MB内存。
3.垃圾回收触发的条件
autovacuum_vacuum_scale_factor = 0.2 和autovacuum_analyze_threshold = 50 垃圾数据超过了表的20%且超过了50行的时候会触发autovacuum
n_dead_tup表示垃圾数据的行数(track_counts = on打开才可以看到有多少条垃圾数据)
postgres=# \l --列出所有的数据库
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
test | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
test01 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
test02 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
test1215 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(7 rows)
postgres=# \c test1215 --连接test1215数据库
You are now connected to database "test1215" as user "postgres".
test1215=# \x 显示格式调整
Expanded display is on.
test1215=# select * from pg_stat_user_tables;
-[ RECORD 1 ]-------+------------------------------
relid | 24581
schemaname | public
relname | test0202 --表名字为test0202
seq_scan | 2
seq_tup_read | 100
idx_scan |
idx_tup_fetch |
n_tup_ins | 50
n_tup_upd | 50
n_tup_del | 0
n_tup_hot_upd | 50
n_live_tup | 50
n_dead_tup | 50 --表名字为test0202的表的垃圾数据的条数有50条
n_mod_since_analyze | 0
n_ins_since_vacuum | 50
last_vacuum |
last_autovacuum |
last_analyze |
last_autoanalyze | 2023-02-02 11:48:44.123782+08
vacuum_count | 0
autovacuum_count | 0
analyze_count | 0
autoanalyze_count | 1
test1215=# \d pg_stat_user_tables 查看表结构
View "pg_catalog.pg_stat_user_tables"
Column | Type | Collation | Nullable | Default
---------------------+--------------------------+-----------+----------+---------
relid | oid | | |
schemaname | name | | |
relname | name | | |
seq_scan | bigint | | |
seq_tup_read | bigint | | |
idx_scan | bigint | | |
idx_tup_fetch | bigint | | |
n_tup_ins | bigint | | |
n_tup_upd | bigint | | |
n_tup_del | bigint | | |
n_tup_hot_upd | bigint | | |
n_live_tup | bigint | | |
n_dead_tup | bigint | | |
n_mod_since_analyze | bigint | | |
n_ins_since_vacuum | bigint | | |
last_vacuum | timestamp with time zone | | |
last_autovacuum | timestamp with time zone | | |
last_analyze | timestamp with time zone | | |
last_autoanalyze | timestamp with time zone | | |
vacuum_count | bigint | | |
autovacuum_count | bigint | | |
analyze_count | bigint | | |
autoanalyze_count | bigint | | |
test1215=#
test1215=# select * from pg_stat_user_tables;
-[ RECORD 1 ]-------+------------------------------
relid | 24581
schemaname | public
relname | test0202 --表名字为test0202
seq_scan | 4
seq_tup_read | 210
idx_scan |
idx_tup_fetch |
n_tup_ins | 59
n_tup_upd | 109
n_tup_del | 0
n_tup_hot_upd | 109
n_live_tup | 59
n_dead_tup | 0 --垃圾数据的条数变为0了
n_mod_since_analyze | 0
n_ins_since_vacuum | 0
last_vacuum |
last_autovacuum | 2023-02-02 14:11:46.733079+08 上一次自动autovacuum的时间
last_analyze |
last_autoanalyze | 2023-02-02 14:11:46.733985+08
vacuum_count | 0
autovacuum_count | 1
analyze_count | 0
autoanalyze_count | 2
test1215=#
(二)手动垃圾回收
vacuum table_name; --手动垃圾回收
如果自动垃圾回收没有办法满足产生垃圾速率的情况下需要手动进行垃圾回收(即n_dead_tup的值超过了50行,且也超过了20%,但是系统没有垃圾回收)
test1215=# select relname,n_live_tup,n_dead_tup from pg_stat_all_tables where n_dead_tup<>0 order by n_dead_tup desc; --查看垃圾数据占用情况
relname | n_live_tup | n_dead_tup
--------------+------------+------------
pg_attribute | 14 | 14
pg_class | 2 | 9
test0202 | 70 | 6
pg_depend | 6 | 6
pg_type | 4 | 4
pg_authid | 1 | 2
(6 rows)
test1215=#
垃圾回收不会降低表的高水位,除非垃圾数据是表末尾的数据。不然没法降低高水位,除非执行vacuum full t 重组表(处理膨胀表的方法) ,这个可以降低高水位。举例如下:
test1215=# create table t(id int);
CREATE TABLE
test1215=# insert into t select generate_series(1,100000); --插入10万条记录
INSERT 0 100000
test1215=#
test1215=# select id,ctid from t limit 10; --显示值,数据块及行,查询结果只显示10行。
id | ctid
----+--------
1 | (0,1)
2 | (0,2)
3 | (0,3)
4 | (0,4)
5 | (0,5)
6 | (0,6)
7 | (0,7)
8 | (0,8)
9 | (0,9) --0是数据块,9是数据块中的行
10 | (0,10)
(10 rows)
test1215=#
test1215=# select max(ctid) from t ; --,查看总共占用了多少个数据库,从如下结果知总共占用了443个数据块(数据库是从0开始标记的)
max
-----------
(442,108) --442是数据块号,109是442这个数据块中的108行
(1 row)
test1215=#
test1215=# select id from t where ctid='(442,108)'; --查询442这种块中,108行的数据,从结果来看第108行的值是100000
id
--------
100000
(1 row)
test1215=#
垃圾回收不会降低表的高水位,除非垃圾数据是表末尾的数据。不然没法降低高水位 ,案例如下:
--操作删除表末尾的数据,此时对表做垃圾回收,可以看到高水位下降了
test1215=# delete from t where id > 90000;
DELETE 10000
test1215=# vacuum t;
VACUUM
test1215=# analyze t;
ANALYZE
test1215=# select relpages from pg_class where relname='t';
relpages
----------
399 --操作删除表末尾的数据,此时对表做垃圾回收,可以看到高水位下降了
(1 row)
test1215=#
--操作删除表中间的数据,此时对表做垃圾回收,可以看到高水位不会下降
test1215=# delete from t where id < 90000;
DELETE 89999
test1215=# vacuum t;
VACUUM
test1215=# analyze t;
ANALYZE
test1215=# select relpages from pg_class where relname='t';
relpages
----------
399 --操作删除表中间的数据,此时对表做垃圾回收,可以看到高水位不会下降
(1 row)
test1215=#
test1215=# vacuum full t; (表重组) --除非执行vacuum t 重组表 ,这个可以降低高水位,案例如下
VACUUM
test1215=# select relpages from pg_class where relname='t';
relpages
----------
1 --vacuum full 可以降低高水位
(1 row)
test1215=#
select relname,n_live_tup,n_dead_tup,n_dead_tup/n_live_tup*100 from pg_stat_all_tables where n_dead_tup<>0 order by n_dead_tup desc;
结论:普通的垃圾回收可以释放n_dead_tup,但是无法降低表的高水位,手工vacuum full重组表可以释放高水位,但是会锁着表
二、膨胀表的维护
表的膨胀问题,表的update和delete频繁的话,会导致膨胀表的发生(因为普通的vacuum无法释放表的高水位,除非垃圾数据是表末尾的数据)
1.如何判断表是否膨胀?
postgres=# create extension pgstattuple; --安装pgstattuple插件
CREATE EXTENSION
postgres=# select oid::regclass,(pgstattuple(oid)).* from pg_class where relkind='r' order by free_space desc limit 1 offset 0;
-[ RECORD 1 ]------+--------
oid | pg_proc
table_len | 778240
tuple_count | 3211
tuple_len | 719575
tuple_percent | 92.46 --92.46%被真正的位行占用
dead_tuple_count | 16
dead_tuple_len | 6686
dead_tuple_percent | 0.86
free_space | 24808 --24808个字节是空闲的,如果free_space很大的话说明水位很高,浪费的空间大。
free_percent | 3.19 --3.19%是空闲的,如果free_space很大的话说明水位很高,浪费的空间大。
postgres=#
2.如何处理膨胀的表?
如果浪费的空间很大,并且确实想回收空闲的话,可以适当的使用vacuum full,这个操作会重建表(排他锁。这个时候业务不能dml操作或查询这个表),需要注意
或者也可以使用pg_reorg或pg_repack或pg_sequeeze重组表,可以减少排他锁的时间
https://blog.csdn.net/lianshunke/article/details/38982099
https://www.bbsmax.com/A/kvJ3qZBXdg/
3.膨胀索引不平衡的重建
postgres=# select oid::regclass,(pgstattuple(oid)).* from pg_class where relkind='i' order by free_space desc limit 1 offset 0;
-[ RECORD 1 ]------+--------------------------
oid | pg_depend_reference_index
table_len | 434176
tuple_count | 8841
tuple_len | 212184
tuple_percent | 48.87
dead_tuple_count | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 167240
free_percent | 38.52
postgres=#
如果浪费的空间很大,并且确实想回收空闲的话,可以新建同样的索引,然后删除老的索引,建索引时可以选择concurrently参数(相当于oracle数据库的online参数,不然会锁表)
在比较繁忙的系统或者大表上执行该语句的话,可以使用concurrently 参数:
create index concurrently idx_table_name_x1 on table_name(col_name);
三、prevent xid wrapped的处理,因为事务ID为32位循环使用的,所以如果不做处理的话,会出现数据'disappear'现象
为了防止数据库disappear的现象,数据库的vacuum操作将记录的事务ID改写为FrozenTransactionId,这个ID视为比所有XID更早的ID
test1215=# select cmin,cmax,xmin,xmax,ctid,* from t;
cmin | cmax | xmin | xmax | ctid | id
------+------+------+------+-------+----
0 | 0 | 836 | 0 | (0,1) | 1 --xmin为836,即数据插入时,事务号为836
(1 row)
test1215=# begin;
BEGIN
test1215=*# update t set id=id;
UPDATE 1
test1215=*# select cmin,cmax,xmin,xmax,ctid,* from t;
cmin | cmax | xmin | xmax | ctid | id
------+------+------+------+-------+----
0 | 0 | 837 | 0 | (0,2) | 1 --xmin为837,即事务号为836
(1 row)
如果此时不提交的话,另开一个窗口去查询(另启一个事务),结果如下
test1215=# select cmin,cmax,xmin,xmax,ctid,* from t;
cmin | cmax | xmin | xmax | ctid | id
------+------+------+------+-------+----
0 | 0 | 836 | 837 | (0,1) | 1 --xmin为836,xmax是837
(1 row)
test1215=# commit;
test1215=*# select cmin,cmax,xmin,xmax,ctid,* from t;
cmin | cmax | xmin | xmax | ctid | id
------+------+------+------+-------+----
0 | 0 | 837 | 0 | (0,2) | 1 --提交之后xmin为837,ctid有(0,1)变为(0,2)
(1 row)
test1215=# select age(relfrozenxid) from pg_class where relname='t'; --查看表t的年龄
age
-----
3
(1 row)
test1215=# \df *.*current*
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+--------------------------------+------------------+---------------------+------
pg_catalog | current_database | name | | func
pg_catalog | current_query | text | | func
pg_catalog | current_schema | name | | func
pg_catalog | current_schemas | name[] | boolean | func
pg_catalog | current_setting | text | text | func
pg_catalog | current_setting | text | text, boolean | func
pg_catalog | current_user | name | | func
pg_catalog | get_current_ts_config | regconfig | | func
pg_catalog | pg_current_logfile | text | | func
pg_catalog | pg_current_logfile | text | text | func
pg_catalog | pg_current_snapshot | pg_snapshot | | func
pg_catalog | pg_current_wal_flush_lsn | pg_lsn | | func
pg_catalog | pg_current_wal_insert_lsn | pg_lsn | | func
pg_catalog | pg_current_wal_lsn | pg_lsn | | func
pg_catalog | pg_current_xact_id | xid8 | | func
pg_catalog | pg_current_xact_id_if_assigned | xid8 | | func
pg_catalog | txid_current | bigint | | func
pg_catalog | txid_current_if_assigned | bigint | | func
pg_catalog | txid_current_snapshot | txid_snapshot | | func
test1215=# select cmin,cmax,xmin,xmax,ctid,* from t;
cmin | cmax | xmin | xmax | ctid | id
------+------+------+------+-------+----
0 | 0 | 839 | 0 | (0,4) | 1
(1 row)
test1215=# select txid_current(); --查看当前事务号,会产生新的事务号
txid_current
--------------
841
(1 row)
test1215=# select txid_current();
txid_current
--------------
843
(1 row)
test1215=# select txid_current();
txid_current
--------------
844
(1 row)
test1215=# select txid_current();
txid_current
--------------
845
(1 row)
test1215=#
test1215=# select age(relfrozenxid) from pg_class where relname='t';
age
-----
4
(1 row)
test1215=# select txid_current();
txid_current
--------------
846
(1 row)
test1215=# select age(relfrozenxid) from pg_class where relname='t';
age
-----
5
(1 row)
test1215=#
test1215=# select cmin,cmax,xmin,xmax,ctid,* from t;
cmin | cmax | xmin | xmax | ctid | id
------+------+------+------+-------+----
0 | 0 | 839 | 0 | (0,4) | 1
(1 row)
test1215=# vacuum freeze t;
VACUUM
test1215=# select cmin,cmax,xmin,xmax,ctid,* from t;
cmin | cmax | xmin | xmax | ctid | id
------+------+------+------+-------+----
0 | 0 | 839 | 0 | (0,4) | 1
(1 row)
test1215=# select age(relfrozenxid) from pg_class where relname='t';
age
-----
0
(1 row)
test1215=#
表的年龄不能大于当前事务,如果表的年龄比当前事务大,那么表的数据就会消失
相关的配置参数
autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
vacuum_freeze_min_age = 50000000
vacuum_freeze_table_age = 150000000
1.如果表的年龄(pg_class.relfrozenxid)超过autovacuum_freeze_max_age,即使未设置autovacuum参数(即autovacuum=off),也将自动强制对该表执行vacuum freeze,从而降低表的年龄。(数据库的最后一道防线)
2.vacuum在扫描数据块时,允许保留的事务ID的年龄,年龄大于 vacuum_freeze_min_age 值的事务id将被替换为FrozenXID,年龄小于vacuum_freeze_min_age值的事务id不会被替换。
3.如果表的年龄大于vacuum_freeze_table_age,那么vacuum操作将扫描全部,因此可用于降低表的年龄,表的年龄将降到vacuum_freeze_min_age设置的值3
postgres=# \df *.*file*
List of functions
Schema | Name | Result data type | Argument data types
| Type
------------+---------------------------+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------+------
pg_catalog | pg_current_logfile | text |
| func
pg_catalog | pg_current_logfile | text | text
| func
pg_catalog | pg_filenode_relation | regclass | oid, oid
| func
pg_catalog | pg_hba_file_rules | SETOF record | OUT line_number integer, OUT type text, OUT database text[], OUT user_name text[], OUT address text, OUT netmask text, OUT auth_method text, OUT opti
ons text[], OUT error text | func
pg_catalog | pg_read_binary_file | bytea | text
| func
pg_catalog | pg_read_binary_file | bytea | text, bigint, bigint
| func
pg_catalog | pg_read_binary_file | bytea | text, bigint, bigint, boolean
| func
pg_catalog | pg_read_file | text | text
| func
pg_catalog | pg_read_file | text | text, bigint, bigint
| func
pg_catalog | pg_read_file | text | text, bigint, bigint, boolean
| func
pg_catalog | pg_read_file_old | text | text, bigint, bigint
| func
pg_catalog | pg_relation_filenode | oid | regclass
| func
pg_catalog | pg_relation_filepath | text | regclass
| func
pg_catalog | pg_rotate_logfile | boolean |
| func
pg_catalog | pg_rotate_logfile_old | boolean |
| func
test1215=# select pg_relation_filepath('t'); --查看文件路径
pg_relation_filepath
----------------------
base/24577/24603
(1 row)
test1215=#
[postgres@v-vlxsz-devdb01 24577]$ ls -al | grep 24603
-rw------- 1 postgres postgres 8192 Feb 8 17:39 24603
-rw------- 1 postgres postgres 24576 Feb 3 15:17 24603_fsm
-rw------- 1 postgres postgres 8192 Feb 8 17:39 24603_vm
[postgres@v-vlxsz-devdb01 24577]$ pwd
/var/postgre/data/base/24577
[postgres@v-vlxsz-devdb01 24577]$
[postgres@v-vlxsz-devdb01 24577]$ ls -al | grep 24603
-rw------- 1 postgres postgres 8192 Feb 8 17:39 24603
-rw------- 1 postgres postgres 24576 Feb 3 15:17 24603_fsm
-rw------- 1 postgres postgres 8192 Feb 8 17:39 24603_vm --vm文件中记录了
[postgres@v-vlxsz-devdb01 24577]$ pwd
/var/postgre/data/base/24577
[postgres@v-vlxsz-devdb01 24577]$
举例说明:
postgres=# \c test1215
You are now connected to database "test1215" as user "postgres".
test1215=# alter table t rename to tbl_freeze_test;
ALTER TABLE
test1215=# insert into tbl_freeze_test select generate_series(1,100000); --插入10万条记录
INSERT 0 100000
test1215=# set vacuum_freeze_min_age=10000000;
SET
test1215=# select pg_relation_filepath('tbl_freeze_test'); --查看表所在位置
pg_relation_filepath
----------------------
base/24577/24603
(1 row)
test1215=#
第一次还没有生成vm文件,所以如果现在执行vacuum是会扫描全表
test1215=# vacuum verbose tbl_freeze_test;
INFO: vacuuming "public.tbl_freeze_test"
INFO: table "tbl_freeze_test": found 0 removable, 109 nonremovable row versions in 1 out of 443 pages --0 removable表明没有可以移除的数据块,一共有443个数据块。
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 851
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
test1215=#
未生成VM文件前,vacuum进程必须扫描全表。
--vm文件也就是为了减少vacuum进程扫描开销而设计的
test1215=# select min(xmin::text),max(xmin::text) from tbl_freeze_test limit 10;
min | max
-----+-----
839 | 849
(1 row)
test1215=#
test1215=# set vacuum_freeze_min_age=0;
SET
test1215=# vacuum verbose tbl_freeze_test; --年龄比vacuum_freeze_min_age大的值的事务id将被替换为FrozenXID,年龄小于vacuum_freeze_min_age值的事务id不会被替换,
INFO: vacuuming "public.tbl_freeze_test"
INFO: table "tbl_freeze_test": found 0 removable, 109 nonremovable row versions in 1 out of 443 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 851
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
test1215=#
test1215=# select min(xmin::text),max(xmin::text) from tbl_freeze_test limit 10;
min | max
-----+-----
839 | 849
(1 row)
test1215=#
如果表的年龄大于vacuum_freeze_table_age,那么vacuum操作将扫描全部,因此可用于降低表的年龄,表的年龄将降到vacuum_freeze_min_age设置的值
test1215=# vacuum verbose tbl_freeze_test;
INFO: aggressively vacuuming "public.tbl_freeze_test"
INFO: table "tbl_freeze_test": found 0 removable, 100001 nonremovable row versions in 443 out of 443 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 851
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
VACUUM
test1215=# select min(xmin::text),max(xmin::text) from tbl_freeze_test limit 10;
min | max
-----+-----
839 | 849
(1 row)
空闲时段的人为干预freeze(set vacuum_freeze_table_age=0; vacuum table;或者直接执行vacuum freeze tbl;效果一样),可以减少自动触发force whole table vacuum for prevent wrapped xid的概率。
因为自动触发如果发生在数据库繁忙节点,会带来较大的IO性能影响。
查找年龄较老的表,手动降低年龄
postgres=# select age(relfrozenxid),relname from pg_class where relkind='r' order by age(relfrozenxid) desc;
age | relname
-----+-------------------------
126 | pg_statistic
126 | pg_type
126 | pg_foreign_table
126 | pg_authid
126 | pg_statistic_ext_data
126 | pg_user_mapping
126 | pg_subscription
postgres=# vacuum freeze tablename;
--模拟数据丢失的过程
postgres=# \c test1215
You are now connected to database "test1215" as user "postgres".
test1215=# create table resetxlog_test(id int);
CREATE TABLE
test1215=# insert into resetxlog_test values(1); --每一条数据都是以单个事务插入操作的
INSERT 0 1
test1215=# insert into resetxlog_test values(2);
INSERT 0 1
test1215=# insert into resetxlog_test values(3);
INSERT 0 1
test1215=# insert into resetxlog_test values(4);
INSERT 0 1
test1215=# insert into resetxlog_test values(5);
INSERT 0 1
test1215=# insert into resetxlog_test values(6);
INSERT 0 1
test1215=# insert into resetxlog_test values(7);
INSERT 0 1
test1215=# insert into resetxlog_test values(8);
INSERT 0 1
test1215=# insert into resetxlog_test values(9);
INSERT 0 1
test1215=# insert into resetxlog_test values(10);
INSERT 0 1
test1215=#
test1215=# select xmin,* from resetxlog_test;
xmin | id
------+----
853 | 1
854 | 2
855 | 3
856 | 4 --
857 | 5
858 | 6
859 | 7
860 | 8
861 | 9
862 | 10
(10 rows)
test1215=#
[postgres@v-vlxsz-devdb01 ~]$ pg_ctl stop
waiting for server to shut down.... done
server stopped
[postgres@v-vlxsz-devdb01 ~]$ pg_controldata
pg_control version number: 1300
Catalog version number: 202107181
Database system identifier: 7176899893436068200
Database cluster state: shut down
pg_control last modified: Fri 10 Feb 2023 05:07:10 PM CST
Latest checkpoint location: 0/19000028
Latest checkpoint's REDO location: 0/19000028
Latest checkpoint's REDO WAL file: 000000030000000000000019
Latest checkpoint's TimeLineID: 3
Latest checkpoint's PrevTimeLineID: 3
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:863 --下一个是的xid号码
Latest checkpoint's NextOID: 24608
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 726
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Fri 10 Feb 2023 05:07:10 PM CST
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: on
max_connections setting: 200
max_worker_processes setting: 8
max_wal_senders setting: 5
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce: 154bab7b60d3e9889a24b6addf4c8b6bc4f9e5fce538a6a45918bc6f02847042
[postgres@v-vlxsz-devdb01 ~]$
[postgres@v-vlxsz-devdb01 bin]$ pg_resetwal -x 856 $PGDATA --修改xid为856
Write-ahead log reset
[postgres@v-vlxsz-devdb01 bin]$ pg_ctl start
waiting for server to start....2023-02-10 17:24:21.423 CST [11384] LOG: redirecting log output to logging collector process
2023-02-10 17:24:21.423 CST [11384] HINT: Future log output will appear in directory "log".
done
server started
test1215=# select xmin,* from resetxlog_test; --修改之后856这个sid后面的数据就看不到了
xmin | id
------+----
853 | 1
854 | 2
855 | 3
(3 rows)
test1215=#
test1215=# select * from resetxlog_test;
id
----
1
2
3
(3 rows)
test1215=# select xmin,* from resetxlog_test;
xmin | id
------+----
853 | 1
854 | 2
855 | 3
(3 rows)
test1215=# select txid_current(); --产生新的事务
txid_current
--------------
856
(1 row)
test1215=# select xmin,* from resetxlog_test;
xmin | id
------+----
853 | 1
854 | 2
855 | 3
856 | 4
(4 rows)
test1215=# select txid_current();
txid_current
--------------
857
(1 row)
test1215=# select xmin,* from resetxlog_test;
xmin | id
------+----
853 | 1
854 | 2
855 | 3
856 | 4
857 | 5
(5 rows)
四、日志文件维护
日志文件如下。可以设置后台事务压缩这些日志文件
[postgres@v-vlxsz-devdb01 ~]$ cd $PGDATA/log
[postgres@v-vlxsz-devdb01 log]$ ls -al
-rw------- 1 postgres postgres 997 Dec 22 16:43 postgresql-2022-12-22_164307.log
-rw------- 1 postgres postgres 997 Dec 22 16:44 postgresql-2022-12-22_164340.log
-rw------- 1 postgres postgres 601 Dec 22 16:44 postgresql-2022-12-22_164449.log
-rw------- 1 postgres postgres 0 Dec 23 00:00 postgresql-2022-12-23_000000.log
-rw------- 1 postgres postgres 0 Dec 24 00:00 postgresql-2022-12-24_000000.log
-rw------- 1 postgres postgres 0 Dec 25 00:00 postgresql-2022-12-25_000000.log
-rw------- 1 postgres postgres 0 Dec 26 00:00 postgresql-2022-12-26_000000.log
-rw------- 1 postgres postgres 0 Dec 27 00:00 postgresql-2022-12-27_000000.log
-rw------- 1 postgres postgres 0 Dec 28 00:00 postgresql-2022-12-28_000000.log
-rw------- 1 postgres postgres 0 Dec 29 00:00 postgresql-2022-12-29_000000.log
-rw------- 1 postgres postgres 0 Dec 30 00:00 postgresql-2022-12-30_000000.log
-rw------- 1 postgres postgres 0 Dec 31 00:00 postgresql-2022-12-31_000000.log
-rw------- 1 postgres postgres 0 Jan 1 00:00 postgresql-2023-01-01_000000.log
五、日志内容的检查
日志的错误输出
长sql
锁等待
错误级别的日志
1.日志的错误输出
2.长sql
配置文件参数设置
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this number
# of milliseconds
log_min_duration_statement如果配置了10,则sql语句执行时间超过了10毫秒都会被记录到日志文件中
3.锁等待
首先需要再配置文件中打开锁等待的参数
五、日志文件查询
外部表
六、检查集群是否正常
检查standby的延迟(pg_stat_replication)
postgres=# \d pg_stat_replication
View "pg_catalog.pg_stat_replication"
Column | Type | Collation | Nullable | Default
------------------+--------------------------+-----------+----------+---------
pid | integer | | |
usesysid | oid | | |
usename | name | | |
application_name | text | | |
client_addr | inet | | |
client_hostname | text | | |
client_port | integer | | |
backend_start | timestamp with time zone | | |
backend_xmin | xid | | |
state | text | | |
sent_lsn | pg_lsn | | |
write_lsn | pg_lsn | | |
flush_lsn | pg_lsn | | |
replay_lsn | pg_lsn | | |
write_lag | interval | | |
flush_lag | interval | | |
replay_lag | interval | | |
sync_priority | integer | | |
sync_state | text | | |
reply_time | timestamp with time zone | | |
postgres=#