postgres Page详解、存储结构、空间使用、空间回收

存储结构

       在PG中,磁盘存储和内存中的最小管理单位都是page,也是通常所说的block。一般PG页的大小为8K,在源码编译时可以设置。此后都不可更改,因为许多PG内存结构设计都是以此为基础的

PG块的定义文件为src/include/storage/bufpage.h

PAGE的结构如下:

参考网上整理之后的结构图如下:

Page Header的定义如下:

       lsn:记录最后一次对页修改的xlog记录
       checksum:页面的校验和,用于判断当前页是否完整
       flags:(指示当前页的状态)
       lower:本页空闲位置的起始指针
       uper:本业空闲位置的结束指针
       special:页预留的位置
       pagesize:页面大小
       version:当前版本
       prune_xid:最后一次删除或更新的xid
       页后面存储的是元组(tuple)信息的数据(表的数据行,一个元组信息就是一行),也就是下一节的page item

ItemIdData的定义如下:

itemid.h头文件

参考网友整理图

ItemIdData之后之后是tuple,tuple的组成为tuple header+data

首先tuple header的定义文件为access/htup_details.h

其中HeapTupleFields的定义如下:

参考网上整理图如下:

tuple header之后就是data数据了。

1)、空间使用测试

       首先确认您是否拥有了 pageinspect 这个 extension

如果没有,使用create extension pageinspect命令安装插件。

postgres=# drop table if EXISTS test1;
DROP TABLE
postgres=# create table test1(id int);
CREATE TABLE
postgres=# insert into test1 values(1);
INSERT 0 1
postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB128048 |        0 |     0 |    28 |  8160 |    8192 |     8192 |       4 |         0
(1 row)
postgres=# insert into test1 values(2);
INSERT 0 1
postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB1741C0 |        0 |     0 |    32 |  8128 |    8192 |     8192 |       4 |         0
(1 row)

通过上面可以看到,每插入一行数据,lower的值就会增加4(itemIdData),其中page header的大小为24,插入两行数据之后增加了两个ItemIdData。相应的upper地址会向上移动。

查询tuple header的

 select * from heap_page_items(get_raw_page('test1',0));

lp:这是插件自己定义的列,在源码中其实没有,这个是项指针的顺序
lp_off: tuple在页面中的位置
lp_len:tuple的实际长度(tuple header + data)
t_xmin,t_xmax: 插入,删除和更新时的事物id,插入时xmin写入当前事物id,删除时xmax写入事物id。更新也是先删除再插入
t_field3:
t_ctid:物理id,或者关联更新之后的物理ID

2)、更新第二行数据

postgres=# update test1 set id = 3 where id = 2;
UPDATE 1
postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB174418 |        0 |     0 |    36 |  8096 |    8192 |     8192 |       4 |   2097527
(1 row)

postgres=# select * from heap_page_items(get_raw_page('test1',0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+---------+---------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 2097522 |       0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 | 2097526 | 2097527 |        0 | (0,3)  |       16385 |        256 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 | 2097527 |       0 |        0 | (0,3)  |       32769 |      10240 |     24 |        |       | \x03000000
(3 rows)

postgres=# 

更新之后发现,lower增加了4,说明新增了一个ItemIddata,同时upper的值减少也说明了free space空间在减少。查询heap_page_items发现lp的值从原来的2增加到了3,多了一行数据。原来ID=2的lp=2的行还在,t_xmax=2097527和lp=3的t_xmin的值一样,并且lp=2和lp=3的t_ctid一样,说明是update test1 set id = 3 where id = 2;更新之后的新数据为lp=3的行,被更新之前的行还在,只是数据指向了lp=3的数据(t_ctid=(0,3))。

3)、再插入一行新数据


postgres=# insert into test1 values (4);
INSERT 0 1
postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB174688 |        0 |     0 |    40 |  8064 |    8192 |     8192 |       4 |   2097527
(1 row)

postgres=# select * from heap_page_items(get_raw_page('test1',0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+---------+---------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 2097522 |       0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 | 2097526 | 2097527 |        0 | (0,3)  |       16385 |       1280 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 | 2097527 |       0 |        0 | (0,3)  |       32769 |      10496 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 | 2097528 |       0 |        0 | (0,4)  |           1 |       2048 |     24 |        |       | \x04000000
(4 rows)

postgres=# 

通过heap_page_items新增了一行数据,被更新lp=2的行对应的空间没有被重复使用。其实这里使用到的就是postgres的MVCC机制,如果表频繁的发生更新、删掉等。vacuum没有即使的清理旧版本的信息,就会导致表膨胀。关于vacuum什么时候会触发清理旧数据,可以参考链接:postgres统计信息规则和触发条件分析_qxy0503的博客-CSDN博客_postgres 统计信息(postgres统计信息规则和触发条件分析)。

4)、手动对表test1做vacuum操作,然后再插入数据观察

postgres=# vacuum test1;
VACUUM
postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB176940 |        0 |     4 |    40 |  8096 |    8192 |     8192 |       4 |         0
(1 row)

<=====vacuum 之后lower没有发生变化,但是upper从原来的8064 变成了8096

postgres=# select * from heap_page_items(get_raw_page('test1',0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 2097522 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |      3 |        2 |      0 |         |        |          |        |             |            |        |        |       | 
  3 |   8128 |        1 |     28 | 2097527 |      0 |        0 | (0,3)  |       32769 |      10496 |     24 |        |       | \x03000000
  4 |   8096 |        1 |     28 | 2097528 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
(4 rows)

<=====新插入一行数据
postgres=# insert into test1 values (5);
INSERT 0 1
postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB17E7C8 |        0 |     0 |    44 |  8064 |    8192 |     8192 |       4 |         0
(1 row)

postgres=# select * from heap_page_items(get_raw_page('test1',0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 2097522 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |      3 |        2 |      0 |         |        |          |        |             |            |        |        |       | 
  3 |   8128 |        1 |     28 | 2097527 |      0 |        0 | (0,3)  |       32769 |      10496 |     24 |        |       | \x03000000
  4 |   8096 |        1 |     28 | 2097528 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |   8064 |        1 |     28 | 2097529 |      0 |        0 | (0,5)  |           1 |       2048 |     24 |        |       | \x05000000
(5 rows)

postgres=# 

vacuum test1之后,虽然空间已经回收,upper从原来的8096变成了8064,但是新插入的一行记录并没有使用lp=2的空间(理论上这个空间(t_ctid(0,2))的空间是可以重新使用的)。

5)、再插入一行数据测试

postgres=# insert into test1 values (2);
INSERT 0 1
postgres=# \x
Expanded display is off.
postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB17EA10 |        0 |     0 |    48 |  8032 |    8192 |     8192 |       4 |         0
(1 row)

postgres=# select * from heap_page_items(get_raw_page('test1',0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 2097522 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |      3 |        2 |      0 |         |        |          |        |             |            |        |        |       | 
  3 |   8128 |        1 |     28 | 2097527 |      0 |        0 | (0,3)  |       32769 |      10496 |     24 |        |       | \x03000000
  4 |   8096 |        1 |     28 | 2097528 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |   8064 |        1 |     28 | 2097529 |      0 |        0 | (0,5)  |           1 |       2304 |     24 |        |       | \x05000000
  6 |   8032 |        1 |     28 | 2097530 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x02000000
(6 rows)

新插入的数据还是没有使用lp=3的空间。

6)、先把id<6的数据都删除了,之后再插入新数据

postgres=# delete from test1 where id <6;
DELETE 4
postgres=# select * from heap_page_items(get_raw_page('test1',0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+---------+---------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 | 2097522 | 2097540 |        0 | (0,1)  |        8193 |       1280 |     24 |        |       | \x01000000
  2 |      3 |        2 |      0 |         |         |          |        |             |            |        |        |       | 
  3 |   8128 |        1 |     28 | 2097527 | 2097540 |        0 | (0,3)  |       40961 |       9472 |     24 |        |       | \x03000000
  4 |   8096 |        1 |     28 | 2097528 | 2097540 |        0 | (0,4)  |        8193 |       1280 |     24 |        |       | \x04000000
  5 |   8064 |        1 |     28 | 2097529 | 2097540 |        0 | (0,5)  |        8193 |       1280 |     24 |        |       | \x05000000
  6 |   8032 |        1 |     28 | 2097530 |       0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x02000000
(6 rows)

postgres=# vacuum VERBOSE test1;
INFO:  vacuuming "public.test1"
INFO:  "test1": removed 4 row versions in 1 pages
INFO:  "test1": found 4 removable, 1 nonremovable row versions in 1 out of 1 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2097542
There were 1 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
postgres=# select * from heap_page_items(get_raw_page('test1',0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  2 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  3 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  4 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  5 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  6 |   8160 |        1 |     28 | 2097530 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
(6 rows)

postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB18FA68 |        0 |     5 |    48 |  8160 |    8192 |     8192 |       4 |         0
(1 row)

<======vacuum之后lower的值没有发生变化,但是upper从原来的8032 变成了8160,说明空间释放了

postgres=# insert into test1 values (1);
INSERT 0 1
postgres=# 
postgres=# select * from page_header(get_raw_page('test1', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 2/DB18FC00 |        0 |     1 |    48 |  8128 |    8192 |     8192 |       4 |         0
(1 row)

postgres=# select * from heap_page_items(get_raw_page('test1',0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+---------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8128 |        1 |     28 | 2097542 |      0 |        0 | (0,1)  |           1 |       2048 |     24 |        |       | \x01000000
  2 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  3 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  4 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  5 |      0 |        0 |      0 |         |        |          |        |             |            |        |        |       | 
  6 |   8160 |        1 |     28 | 2097530 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
(6 rows)

postgres=# 
<=====新插入的数据也使用了之前已经分配过的空间(t_ctid(0,1))。

通过上面的测试发现,其实vacuum table是不会降低page的高水位的,只是把旧数据占的空间释放,所以如果vaccum不及时导致了表的膨胀,那么再使用vacuum table是不能把表的大小降下去的。

测试如下:

7)、创建测试表

postgres=# 
postgres=# create table tbl (id int primary key, info text, crt_time timestamp); 
CREATE TABLE
postgres=# insert into tbl select generate_series(1,2000000),md5(random()::text),clock_timestamp(); 
INSERT 0 2000000
postgres=# \dt+ tbl  
                    List of relations
 Schema | Name | Type  |  Owner   |  Size  | Description 
--------+------+-------+----------+--------+-------------
 public | tbl  | table | postgres | 146 MB | 
(1 row)

postgres=# \dt+ tbl_pkey  
Did not find any relation named "tbl_pkey".
postgres=# \di+ tbl_pkey  
                         List of relations
 Schema |   Name   | Type  |  Owner   | Table | Size  | Description 
--------+----------+-------+----------+-------+-------+-------------
 public | tbl_pkey | index | postgres | tbl   | 43 MB | 
(1 row)

postgres=# 

8)、编写测试脚本

[postgres@postgres ~]$ cat test.sql 
\set id random(1,2000000)
update tbl set info=md5(random()::text) where id between :id-250000 and :id+250000;  
[postgres@postgres ~]$ 

9)、测试数据库中存在持有事务号的长事务,这个事务时间段内,数据库产生的垃圾无法被回收。

[postgres@postgres ~]$  pgbench -M prepared -h 127.0.0.1 -p 5442 -n -r -f ./test.sql -c 1 -j 1 -T 30 
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 4
latency average = 10183.262 ms
tps = 0.098200 (including connections establishing)
tps = 0.098204 (excluding connections establishing)
statement latencies in milliseconds:
         0.005  \set id random(1,2000000)
     10182.856  update tbl set info=md5(random()::text) where id between :id-250000 and :id+250000;
[postgres@postgres ~]$ 

查看更新表之后的表大小和索引大小

postgres=# \dt+ tbl  
                    List of relations
 Schema | Name | Type  |  Owner   |  Size  | Description 
--------+------+-------+----------+--------+-------------
 public | tbl  | table | postgres | 203 MB | 
(1 row)

postgres=# \di+ tbl_pkey  
                          List of relations
 Schema |   Name   | Type  |  Owner   | Table |  Size  | Description 
--------+----------+-------+----------+-------+--------+-------------
 public | tbl_pkey | index | postgres | tbl   | 103 MB | 
(1 row)

10)、执行vacuum tbl

postgres=# vacuum VERBOSE tbl;
INFO:  vacuuming "public.tbl"
INFO:  "tbl": found 0 removable, 107 nonremovable row versions in 1 out of 26029 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2097561
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 7335 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "pg_toast.pg_toast_157481"
INFO:  index "pg_toast_157481_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "pg_toast_157481": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2097561
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
postgres=# \dt+ tbl  
                    List of relations
 Schema | Name | Type  |  Owner   |  Size  | Description 
--------+------+-------+----------+--------+-------------
 public | tbl  | table | postgres | 203 MB | 
(1 row)

postgres=# \di+ tbl_pkey  
                          List of relations
 Schema |   Name   | Type  |  Owner   | Table |  Size  | Description 
--------+----------+-------+----------+-------+--------+-------------
 public | tbl_pkey | index | postgres | tbl   | 103 MB | 
(1 row)

postgres=# 

vacuum发现表的空间和tbl的空间并没有释放,所以,如果已经膨胀之后的表,再使用vacuum table是无法降低表的大小的,但是使用vacuum full table是可以降低表的大小的。

如下:

postgres=# vacuum FULL VERBOSE tbl;
INFO:  vacuuming "public.tbl"
INFO:  "tbl": found 0 removable, 2000000 nonremovable row versions in 26029 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 0.25 s, system: 2.93 s, elapsed: 3.54 s.
VACUUM
postgres=# \dt+ tbl  
                    List of relations
 Schema | Name | Type  |  Owner   |  Size  | Description 
--------+------+-------+----------+--------+-------------
 public | tbl  | table | postgres | 146 MB | 
(1 row)

postgres=# \di+ tbl_pkey  
                         List of relations
 Schema |   Name   | Type  |  Owner   | Table | Size  | Description 
--------+----------+-------+----------+-------+-------+-------------
 public | tbl_pkey | index | postgres | tbl   | 43 MB | 
(1 row)

       PostgreSQL的表或索引发生膨胀后,用户可以使用vacuum full或rewrite table(如cluster)的方式重建表。
但是vacuum full或者rewrite都需要持有排它锁,会堵塞读操作。为了减少锁冲突,社区有一个名为pg_reorg或pg_repack的插件,使用了增量的方式重组数据,最后通过切换FILENODE完成数据重组。仅仅在切换FILENODE时需要持有排他锁,非常短暂,影响比VACUUM FULL和rewrite的方式小多了。

参考资料:

PostgreSQL中page页结构_ITPUB博客

postgresql磁盘存储结构 - exact - 博客园

https://github.com/digoal/blog/blob/master/201504/20150429_02.md

  • 2
    点赞
  • 8
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值