greenplum中所有的delete并没有删除数据,而是将这数据标识为delete,而且update是delete加insert操作,随着业务的发展,表的大小就会越来越大。green提供的vacuum工具回收已经删除的数据。
简单的vacuum会回收空间并可再次利用,vacuum表时没有请求排它锁,其他操作仍可以对表进行读写。
vacuum full回收空间会跨块移动行,这个过程需要对表请求排他锁。
vacuum都需要在业务低峰期执行除了vacuum full会请求排他锁,都会产生IO。
vacuum除了回收空间,还会收集统计信息。
实验一:收集统计信息
1、vacumm
edw=# create table vac(id int) distributed by(id);
edw=# insert into vac select generate_series(1,100000);
INSERT 0 100000
edw=# select pg_relation_size('vac');
pg_relation_size
------------------
3670016
(1 row)
edw=# select ctid,relname,relpages,reltuples from pg_class where relname='vac';
ctid | relname | relpages | reltuples
---------+---------+----------+-----------
(6,180) | vac | 112 | 99784
(1 row)
刚插入数据统计信息是抽样的不准确,只有99784条记录
edw=# vacuum vac;
edw=# select ctid,relname,relpages,reltuples from pg_class where relname='vac';
ctid | relname | relpages | reltuples
---------+---------+----------+-----------
(6,180) | vac | 112 | 100000
(1 row)
结论:简单vacuum操作重新收集的统计信息
2、vacuum full
edw=# drop table vac;
DROP TABLE
edw=# create table vac(id int) distributed by(id);
CREATE TABLE
edw=# insert into vac select generate_series(1,100000);
INSERT 0 100000
edw=# select pg_relation_size('vac');
pg_relation_size
------------------
3670016
(1 row)
edw=# select ctid,relname,relpages,reltuples from pg_class where relname='vac';
ctid | relname | relpages | reltuples
---------+---------+----------+-----------
(6,182) | vac | 112 | 99784
(1 row)
edw=# vacuum full vac;
VACUUM
edw=# select ctid,relname,relpages,reltuples from pg_class where relname='vac';
ctid | relname | relpages | reltuples
---------+---------+----------+-----------
(6,182) | vac | 112 | 100000
(1 row)
结论:vacuum full也会收集统计信息
实验二:回收空间情况
1、vacuum
edw=# delete from vac where id<=50000;
DELETE 50000
edw=# select pg_relation_size('vac');
pg_relation_size
------------------
3670016
(1 row)
删除数据表的大小没变
edw=# insert into vac select generate_series(1,50000);
INSERT 0 50000
edw=# select pg_relation_size('vac');
pg_relation_size
------------------
5505024
(1 row)
说明delete数据没有回收,表大小持续变大
edw=# vacuum vac;
VACUUM
edw=# select pg_relation_size('vac');
pg_relation_size
------------------
5505024
(1 row)
vacuum表大小没变
edw=# insert into vac select generate_series(1,50000);
INSERT 0 50000
edw=# select pg_relation_size('vac');
pg_relation_size
------------------
5505024
(1 row)
结论:vacuum后之前被删除的数据的空间会被重复使用
2、vacuum full
edw=# delete from vac where id<=50000;
DELETE 100000
edw=# vacuum full vac;
edw=# select pg_relation_size('vac');
pg_relation_size
------------------
1835008
(1 row)
结论:vacuum full直接回收所有delete数据的空间
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29989552/viewspace-2128914/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/29989552/viewspace-2128914/