Postgres 之 TOAST技术

作者:瀚高PG实验室 (Highgo PG Lab)

TOAST简介

首先,Toast是一个名字缩写,全写是The OverSized Attribute Storage Technique(超尺寸字段存储技术)顾名思义,是说超长字段在Postgres的一个存储方式;

Postgres采用的存储默认是每个页面存储固定8Kb大小的数据,并且元组不允许跨页面存储,所以并不能直接存储大字段数据;

Toast就是为此应运而生,它会将大字段值压缩或者分散为多个物理行来存储;

对于用户来说完全不用关注这一技术实现,完全是透明的。 

TOAST的存储方式 

Postgres的部分类型数据支持toast,因为有些字段类型是不会产生大字段数据的,完全没必要用到Toast技术(比如date,time,boolean等);

支持Toast的数据类型应当时变长的(variable-length);

当表中字段任何一个有Toast,那这个表都会有这一个相关联的Toast表,OID被存储在pg_class.reltoastrelid里面;

超出的的数值将会被分割成chunks,并最多toast_max_chunk_size 个byte(缺省是2Kb)

当存储的行数据超过toast_tuple_threshold值(通常是2kB),就会触发toast存储;

toast将会压缩或者移动字段值直到超出部分比toast_tuple_targer值小(这个值通常也是2KB)。 

Toast4种策略

策略

说明

PLAIN   

避免压缩和行外存储。
只有那些不需要 TOAST 策略就能存放的数据类型允许选择(例如 int 类型),而对于 text 这类要求存储长度超过页大小的类型,是不允许采用此策略的。                                         

EXTENDED

允许压缩和行外存储。
一般会先压缩,如果还是太大,就会行外存储                                                                                                                                 

EXTERNA

允许行外存储,但不许压缩。
类似字符串这种会对数据的一部分进行操作的字段,采用此策略可能获得更高的性能,因为不需要读取出整行数据再解压。                                            

MAIN    

允许压缩,但不许行外存储。
不过实际上,为了保证过大数据的存储,行外存储在其它方式(例如压缩)都无法满足需求的情况下,作为最后手段还是会被启动。因此理解为尽量不使用行外存储更贴切。

更改表的存储方式为TOAST

ALTER TABLE table_name ALTER COLUMN column_name SET STORAGE {PLAIN | EXTENDED | MAIN | EXTERNAL};

示例:

testdb=# select relname,relfilenode,reltoastrelid from pg_class where relname='toast_t';

 relname | relfilenode | reltoastrelid

---------+-------------+---------------

 toast_t |       65584 |         65587

(1 row)

testdb=# create table toast_t1(dd character varying);

CREATE TABLE

testdb=#

testdb=# \d+ toast_t1

                            Table "public.toast_t1"

 Column |       Type        | Modifiers | Storage  | Stats target | Description

--------+-------------------+-----------+----------+--------------+-------------

 dd     | character varying |           | extended |              |

testdb=#  alter table toast_t1 alter column dd set storage main;

ALTER TABLE

testdb=# \d+ toast_t1

                            Table "public.toast_t1"

 Column |       Type        | Modifiers | Storage | Stats target | Description

--------+-------------------+-----------+---------+--------------+-------------

 dd     | character varying |           | main    |              |

TOAST额外的三个字段

字段名

属性

chunk_id  

标识TOAST表的OID字段                                

chunk_seq

chunk的序列号,与chunk_id的组合唯一索引可以加速访问 

chunk_data

存储TOAST表的实际数据                               

testdb=# select relname,relfilenode,reltoastrelid from pg_class where relname='toast_t1';

 relname  | relfilenode | reltoastrelid

----------+-------------+---------------

 toast_t1 |       65596 |         65599

(1 row)

通过以上语句,我们查到 blog 表的 oid 为16441,其对应 TOAST 表的 oid 为16444(关于 oid 和 pg_class 的概念,请参考PG官方文档),那么其对应 TOAST 表名则为: pg_toast.pg_toast_65596(注意这里是 toast_t1表的 oid ),我们看下其定义:

testdb=# \d+ pg_toast.pg_toast_65596;

TOAST table "pg_toast.pg_toast_65596"

   Column   |  Type   | Storage

------------+---------+---------

 chunk_id   | oid     | plain

 chunk_seq  | integer | plain

 chunk_data | bytea   | plain

TOAST表的计算 

计算一个表的大小时要注意统计Toast的大小,因为对超长字段存储时,在基础表上可能只存了20%,另外的数据都存到了Toast里面去了,计算大小时要结合起来看,索引也是一样,对于表里有main或者extended类型的会创建Toast表,两者的关联是通过pg_class里的OID去关联的。

1.TOAST表关联查询 

 Example a:

[postgres@localhost ~]$ psql

psql (9.2.3)

Type "help" for help.

postgres=# create table toast_t(id int);

CREATE TABLE

postgres=# select relname,reltoastrelid from pg_class where relname = 'toast_t';

relname  | reltoastrelid

----------+---------------

toast_t |             0

(1 row)

postgres=# \d+ toast_t

                       Table "public.toast_t"

Column |  Type   | Modifiers | Storage | Stats target | Description

-

--------+---------+-----------+---------+--------------+-------------

i

id     | integer |           | plain   |              |

Has OIDs: no上面的字段没有toast表,因为字段int是定长的。 

Example b:
postgres=# create table toast_t(id int,vname varchar(48),remark text);

postgres=# select relname,reltoastrelid from pg_class where relname = 'toast_t';

relname  | reltoastrelid

----------+---------------

toast_t |         65587

(1 row)

postgres=# select relname from pg_class where oid = 65587;

    relname    

----------------

pg_toast_65584

(1 row)

2.TOAST表计算大小

testdb=# \d+ toast_t

                               Table "public.toast_t"

 Column |         Type          | Modifiers | Storage  | Stats target | Description

--------+-----------------------+-----------+----------+--------------+-------------

 id     | integer               |           | plain    |              |

 vname  | character varying(48) |           | extended |              |

 remark | text                  |           | extended |              |

postgres=# select oid,relname,reltoastrelid from pg_class where relname = 'toast_t';

  oid  | relname | reltoastrelid

-------+---------+---------------

 65584 | toast_t |         65587

postgres=# insert into toast_t select generate_series(1,2),repeat('come here'||'^_^',2),repeat('^_^ It is wonderful',500);

INSERT 0 2000

postgres=# insert into toast_t select generate_series(1,2),repeat('come here'||'^_^',2),repeat('^_^ It is wonderful,Remark here!!',2000);

I

INSERT 0 2

postgres=# select pg_column_size(id),pg_column_size(vname),pg_column_size(remark) from toast_t limit 2;

 pg_column_size | pg_column_size | pg_column_size

----------------+----------------+----------------

              4 |             29 |            154

              4 |             29 |            154

(2 rows)

--查看基础表和Toast的大小

testdb=# select pg_relation_size(65584 );  或 select pg_size_pretty(pg_relation_size(65576));

 pg_relation_size  

------------------

           450560

(1 row)

postgres=# select pg_relation_size(65587);

pg_relation_size

------------------

                0

(1 row)

--文本数据量增多,这时可以看到toast表字段大小在2kb左右时有大小了

postgres=# insert into toast_t select generate_series(3,4),repeat('come here'||'^_^',2),repeat('^_^ It is wonderful,Remark here!!',4000);

INSERT 0 2

postgres=# select pg_relation_size(65587);

pg_relation_size

------------------

                0

(1 row)

postgres=# insert into toast_t select generate_series(5,6),repeat('come here'||'^_^',2),repeat('^_^ It is wonderful,Remark here!!',5500);

INSERT 0 2

postgres=#  select pg_relation_size(65587);

 pg_relation_size

------------------

             8192

(1 row)

postgres=# select pg_column_size(id),pg_column_size(vname),pg_column_size(remark) from toast_t;

pg_column_size | pg_column_size | pg_column_size

----------------+----------------+----------------

              4 |             29 |            154

              4 |             29 |            154

              4 |             29 |            851

              4 |             29 |            851

              4 |             29 |            851

              4 |             29 |            851

              4 |             29 |           1651

              4 |             29 |           1651

              4 |             29 |           2247

              4 |             29 |           2247

(10 rows)

postgres=# insert into toast_t select generate_series(1,2),repeat('come here'||'^_^',2),repeat('^_^ It is wonderful,Remark here!!',10000);

INSERT 0 2

postgres=# select pg_relation_size(65584);

pg_relation_size

------------------

             8192

(1 row)

postgres=# select pg_relation_size(65587);

pg_relation_size

------------------

            16384

(1 row)

postgres=# insert into toast_t select generate_series(7,8),repeat('come here'||'^_^',2),repeat('^_^ It is wonderful,Remark here!!',20000);

INSERT 0 2

postgres=# select id,pg_column_size(id),pg_column_size(vname),pg_column_size(remark) from toast_t;

 id | pg_column_size | pg_column_size | pg_column_size

----+----------------+----------------+----------------

  1 |              4 |             29 |            154

  2 |              4 |             29 |            154

  1 |              4 |             29 |            851

  2 |              4 |             29 |            851

  1 |              4 |             29 |            851

  2 |              4 |             29 |            851

  3 |              4 |             29 |           1651

  4 |              4 |             29 |           1651

  5 |              4 |             29 |           2247

  6 |              4 |             29 |           2247

  1 |              4 |             29 |           4050

  2 |              4 |             29 |           4050

  7 |              4 |             29 |           8056

  8 |              4 |             29 |           8056

(14 rows)

postgres=# select pg_relation_size(65584);

pg_relation_size

------------------

             8192

(1 row)

postgres=# select pg_relation_size(65587);

pg_relation_size

------------------

            32768

(1 row)

可以看到后插入的数据随着字段内容的增多,toast段一直在变大。基础表的大小没有变化。

这个和Oracle存储的大字段内容比较像,Oracle存储Blob类的数据时也是指定另外的segment来存储,而不是在原表中存储,当然可以设置enable storage in row来指定表中存储

Toast的优点 

1.可以存储超长超大字段,避免之前不能直接存储的限制 

2.物理上与普通表是分离的,检索查询时不检索到该字段会极大地加快速度 

3.更新普通表时,该表的Toast数据没有被更新时,不用去更新Toast表 

Toast的缺点

1.对大字段的索引创建是一个问题,有可能会失败,其实通常也不建议在大字段上创建,全文检索倒是一个解决方案。 

2.大字段的更新会有点慢,其它DB也存在。

  • 2
    点赞
  • 5
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值