PostgreSQL 中的系统字段：tableoid、xmin、xmax、cmin、cmax、ctid

最新推荐文章于 2024-06-05 11:34:42 发布

不剪发的Tony老师

最新推荐文章于 2024-06-05 11:34:42 发布

阅读量4.2k

点赞数 4

分类专栏： PostgreSQL 文章标签： postgresql 系统字段隐藏字段 ctid xmin

本文链接：https://blog.csdn.net/horses/article/details/106400379

版权

PostgreSQL 专栏收录该内容

89 篇文章 77 订阅

订阅专栏

文章目录

- tableoid
- ctid
- xmin
- xmax
- cmin
- cmax
- oid
- 总结

大家好！我是只谈技术不剪发的 Tony 老师。今天我们来谈谈 PostgreSQL 数据表中几个隐藏的系统字段和它们的作用。

在 PostgreSQL 中，当我们创建一个数据表时，数据库会隐式增加几个系统字段。这些字段由系统进行维护，用户一般不会感知它们的存在。例如，以下语句创建了一个简单的表：

create table test(col integer);

insert into test(col)
values (1),(2),(3);

从定义上来看，表 test 中只有一个字段；但是当我们查询数据字典表 pg_attribute 时，结果却不是如此：

hrdb=> select version();
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 12.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
(1 row)

hrdb=> select attname, attnum, atttypid::regtype 
hrdb-> from pg_attribute
hrdb-> where attrelid = 'test'::regclass;
 attname  | attnum | atttypid 
----------+--------+----------
 tableoid |     -6 | oid
 cmax     |     -5 | cid
 xmax     |     -4 | xid
 cmin     |     -3 | cid
 xmin     |     -2 | xid
 ctid     |     -1 | tid
 col      |      1 | integer
(7 rows)

查询结果显示，表 test 中一共包含 7 个字段。PostgreSQL 为我们增加了 6 个额外的系统字段，它们的 attnum 属性都是负数。

下面让我们分别看看这些系统字段的作用。

tableoid

tableoid 字段代表了数据所在表的对象 id（OID），也就是数据字典表 pg_class 中与该表信息相关的数据行。

hrdb=> select oid, relname from pg_class where relname = 'test';
  oid  | relname 
-------+---------
 90277 | test
(1 row)

hrdb=> select t.tableoid, t.col, c.relname
hrdb-> from test t
hrdb-> join pg_class c on (c.oid = t.tableoid);
 tableoid | col | relname 
----------+-----+---------
    90277 |   1 | test
    90277 |   2 | test
    90277 |   3 | test
(3 rows)

tableoid 的另一个用途就是在涉及分区表查询或者 UNION 操作时标识数据行所在的具体表。例如存在以下分区表：

create table part_t
(id integer) partition by hash (id);
create table part_t_p1
partition of part_t for values with (modulus 4, remainder 0);
create table part_t_p2
partition of part_t for values with (modulus 4, remainder 1);
create table part_t_p3
partition of part_t for values with (modulus 4, remainder 2);
create table part_t_p4
partition of part_t for values with (modulus 4, remainder 3);

insert into part_t select generate_series(1,100);

我们可以通过以下查询返回每行数据所在的分区：

hrdb=> select tableoid::regclass, id
hrdb-> from part_t
hrdb-> order by id
hrdb-> limit 10;
 tableoid  | id 
-----------+----
 part_t_p1 |  1
 part_t_p3 |  2
 part_t_p2 |  3
 part_t_p4 |  4
 part_t_p2 |  5
 part_t_p4 |  6
 part_t_p4 |  7
 part_t_p2 |  8
 part_t_p2 |  9
 part_t_p4 | 10
(10 rows)

对于集合操作 UNION、INTERSECT、EXCEPT 也是如此：

hrdb=> select tableoid::regclass, col from test
hrdb-> union all
hrdb-> select tableoid::regclass, id from part_t where id < 4
hrdb-> order by 2;
 tableoid  | col 
-----------+-----
 test      |   1
 part_t_p1 |   1
 test      |   2
 part_t_p3 |   2
 test      |   3
 part_t_p2 |   3
(6 rows)

ctid

ctid 字段代表了数据行在表中的物理位置，也就是行标识（tuple identifier），由一对数值组成（块编号和行索引）。ctid 类似于 Oracle 中的伪列 ROWID。

ctid 可以用于快速查找表中的数据行，也可以用于修复数据损坏。另外，它也可以用于查找并删除表中的重复数据。例如：

insert into test(col)
values (1),(2),(3);

hrdb=> select ctid, * from test;
 ctid  | col 
-------+-----
 (0,1) |   1
 (0,2) |   2
 (0,3) |   3
 (0,4) |   1
 (0,5) |   2
 (0,6) |   3
(6 rows)

我们为 test 表插入了 3 条重复的数据。接下来利用 ctid 删除重复的数据：

hrdb=> delete from test 
hrdb-> where ctid not in
hrdb-> (
hrdb(>   select max(ctid) 
hrdb(>   from test 
hrdb(>   group by col
hrdb(> );
DELETE 3

需要注意的是，ctid 的值有可能会改变（例如 VACUUM FULL）；因此，ctid 不适合作为一个长期的行标识，应该使用主键作为行的逻辑标识。

xmin

xmin 代表了该行版本（row version ）的插入事务 ID（XID）。行版本是数据行的具体状态，每次更新操作都会为相同的逻辑行创建一个新的行版本（多版本并发控制，MVCC）。事务 ID 是一个 32 bit 数字。

我们继续为 test 表插入几条数据，并查看它们的 xmin：

hrdb=> insert into test(col) values(4);
INSERT 0 1
hrdb=> insert into test(col) values(5);
INSERT 0 1

hrdb=> select xmin,col from test;
 xmin | col 
------+-----
 2852 |   1
 2852 |   2
 2852 |   3
 2854 |   4
 2855 |   5
(5 rows)

xmin 字段可以用于查看数据行的插入时间：

hrdb=> select col,
hrdb->        to_char(pg_xact_commit_timestamp(xmin) ,'YYYY-MM-DD HH24:MI:SS') AS insert_time
hrdb-> from test;
 col |     insert_time     
-----+---------------------
   1 | 2020-05-28 16:52:08
   2 | 2020-05-28 16:52:08
   3 | 2020-05-28 16:52:08
   4 | 2020-05-28 17:03:33
   5 | 2020-05-28 17:03:35
(5 rows)

注意，系统函数 pg_xact_commit_timestamp 需要将配置参数 track_commit_timestamp 设置为 on 才能使用。

xmax

xmax 字段代表了删除改行的事务 ID，对于未删除的行版本显示为 0。非零的 xmax 通常意味着删除事务还没有提交，或者删除操作被回滚。

我们查看一下 test 表中的 xmax：

hrdb=> select txid_current();
 txid_current 
--------------
         2858
(1 row)

hrdb=> select xmax, col from test;
 xmax | col 
------+-----
    0 |   1
    0 |   2
    0 |   3
    0 |   4
    0 |   5
(5 rows)

然后打开另一个会话，在事务中修改 test 表中的数据：

-- 会话 2
hrdb=> update test
hrdb-> set col= col*2;
UPDATE 5

回到第一个会话，再次查看 xmax：

hrdb=> select xmax, col from test;
 xmax | col 
------+-----
 2858 |   1
 2858 |   2
 2858 |   3
 2858 |   4
 2858 |   5
(5 rows)

2858 是第二个会话的事务 ID，它是删除这些行版本的事务。PostgreSQL 中的 UPDATE 相当于 DELETE 加 INSERT。

将第二个事务回滚：

-- 会话 2
hrdb=> rollback;
ROLLBACK

如果再次查询 test 表中的 xmax，仍然返回 2858。

xmax 还有可能表示当前正在占用行锁的事务 ID，利用 PostgreSQL 扩展插件 pageinspect 可以获取详细信息：

create extension pageinspect;

select t.col,
       t.xmax
       case
         when (t_infomask & 128)::boolean then 'LOCK'
         when (t_infomask & 1024)::boolean then 'COMMITTED'
         when (t_infomask & 2048)::boolean then 'ROLLBACKED'
         when (t_infomask & 4096)::boolean then 'MULTI XACT'
       end as xmax_info
from test t 
left outer join heap_page_items(get_raw_page('test', 0)) hp on (t.ctid = hp.t_ctid)
where hp.t_xmax = t.xmax;

cmin

cmin 代表了插入事务中的命令标识符（从 0 开始）。命令标识符是一个 32 bit 数字。

cmax

cmax 代表了删除事务中的命令标识符，或者 0。

我们先查看一下 test 表中的

hrdb=> select cmin, cmax, col from test;
 cmin | cmax | col 
------+------+-----
    0 |    0 |   1
    0 |    0 |   2
    0 |    0 |   3
    0 |    0 |   4
    0 |    0 |   5
(5 rows)

然后在事务中修改数据：

hrdb=> begin;
BEGIN
hrdb=> select txid_current();
 txid_current 
--------------
         2859
(1 row)

hrdb=> insert into test(col) values(6);
INSERT 0 1
hrdb=> insert into test(col) values(7);
INSERT 0 1
hrdb=> insert into test(col) values(8);
INSERT 0 1

hrdb=> select cmin, cmax, col from test;
 cmin | cmax | col 
------+------+-----
    0 |    0 |   1
    0 |    0 |   2
    0 |    0 |   3
    0 |    0 |   4
    0 |    0 |   5
    0 |    0 |   6
    1 |    1 |   7
    2 |    2 |   8
(8 rows)

然后删除一条记录：

hrdb=> delete from test where col=1;
DELETE 1

此时，从另一个会话中查看：

-- 会话 2
hrdb=> select cmin, cmax, col from test;
 cmin | cmax | col 
------+------+-----
    3 |    3 |   1
    0 |    0 |   2
    0 |    0 |   3
    0 |    0 |   4
    0 |    0 |   5
(5 rows)

oid

如果使用 PostgreSQL 11 或者更早版本，还有一个隐藏的系统字段：oid。它代表了数据行的对象 ID，只有当创建表时使用了 WITH OIDS 选项或者配置参数 default_with_oids 设置为 true 时才会创建这个字段。

从 PostgreSQL 12 开始，不再支持 WITH OIDS 选项，oid 只用于系统内部。

总结

PostgreSQL 中的每个表都包含了 6 个隐藏的系统字段，可以用于获取关于数据行的一些内部信息。这些字段名称不能用于创建普通的字段，即使使用双引号包含也不可以。

如果你点击了收藏⭐，请不要忘了关注❤️、评论📝、点赞👍！

不剪发的Tony老师

关注

4
点赞
踩
13

收藏

觉得还不错? 一键收藏
打赏
0
评论
PostgreSQL 中的系统字段：tableoid、xmin、xmax、cmin、cmax、ctid

PostgreSQL 中的每个表都包含了 6 个隐藏的系统字段：tableoid、xmin、xmax、cmin、cmax、ctid，这些字段可以用于获取关于数据行的一些内部信息。
复制链接

扫一扫