一、Greenplum 分区原理
分区表意思是将一个大表在物理上分割成几块,GPDB中的分区表和PostgreSQL中实现原理一样,都是用过表继承、约束来实现。但是与PostgreSQL也有所不同,在PostgreSQL中,一个父表,多个子表来实现分区表,需要手动向子表插入数据,如果向父表插入数据,则直接会被插入到父表中,在GPDB中,可以直接想父表插入数据,便可以根据约束直接自动向对应的子表插入数据,当分区子表不存在时,插入失败
二、分区表创建
2.1、范围分区(range)
根据分区字段的值范围区间来分区,每一个分区就是一个子表
-
eg:
-
create
table test_partition_range
-
(
-
id
int,
-
name
varchar(
64),
-
fdate
varchar(
64)
-
)
distributed
by (
id)
-
partition
by
range(fdate)
-
(
-
partition p1
start (
'2017-01-01') inclusive
end (
'2017-01-31') exclusive,
-
partition p2
start (
'2017-02-01') inclusive
end (
'2017-02-29') exclusive,
-
default
partition default_p
-
);
-
-
inclusive :指定包含,例如上面的
start (
'2017-01-01') inclusive 则是包含
'2017-01-01'
-
exclusive : 指定不包含, 例如上面的
end (
'2017-01-31') exclusive 则是不包含
'2017-01-31'
2.2、快速分区(every)
根据选定的范围,跨越基数,快速分区每一个子表
-
eg:
-
create
table test_partition_every_1
-
(
-
id
int,
-
name
varchar(
64),
-
fdate
date
-
)
-
distributed
by (
id)
-
partition
by
range (fdate)
-
(
-
partition pn_
start (
'2017-01-01'::
date)
end (
'2017-12-31'::
date) every (
'1 day'::
interval),
-
default
partition default_p
-
);
-
-
every:指定跨越基数
2.3、list分区(list)
根据值的分组,相同的数据归类到一组,也就一个分区中
-
eg:
-
create
table test_partition_list
-
(
-
id
int,
-
name
varchar(
64),
-
fdate
varchar(
10)
-
)
-
distributed
by (
id)
-
partition
by
list (fdate)
-
(
-
partition p1
values (
'2017-01-01',
'2017-01-02'),
-
partition p2
values (
'2017-01-03'),
-
default
partition pd
-
);
三、分区相关操作
3.1、分区split
切割普通分区:
-
将分区p2 在 '2017-02-20' 左右切分成两块
-
alter
table test_partition_range
split
partition p2
at (
'2017-02-20')
into (
partition p2,
partition p3);
-
-
切割默认分区:
-
alter
table test_partition_range
split
default
partition
start (
'2017-03-01')
end (
'2017-03-31')
into (
partition p4,
default
partition);
3.2、分区add
如果存在default partition,则不能add分区,只能split default partition
alter table test_partition_range_1 add partition p2 start ('2017-02-01') end ('2017-02-31');
3.3、分区drop
彻底删除对应的分区表
alter table test_partition_range_1 DROP partition p2;
3.4、分区truncate
清空分区表数据,相当于删除分区,然后再新建一个
alter table test_partition_range_1 truncate partition p1;
四、子分区创建与操作
4.1、子分区创建
在GPDB中,分区是可以嵌套增加的,分区下面可以有子分区
-
create
table test_partition_range_2
-
(
-
id
int,
-
name
varchar(
64),
-
fdate
varchar(
10)
-
)
-
distributed
by (
id)
-
partition
by
range(fdate)
-
subpartition
by
list(
name)
-
subpartition
template
-
(
-
subpartition c1
values (
'xiaoxiao'),
-
subpartition c2
values (
'xiaohua')
-
)
-
(
-
partition p1
start (
'2017-01-01')
end (
'2017-01-31')
-
)
-
上面的分区中,p1会再分两个c1/c2子分区
4.2、truncate 子分区
alter table test_partition_range_2 alter partition p1 truncate partition c2;
4.3、drop 子分区
alter table test_partition_range_2 alter partition p1 drop partition c2;
创建时间戳分区实例
CREATE TABLE fi_middle.order_detail
(
date_id integer,
order_id character varying(22),
product_id character varying(50),
order_quantity numeric,
allot_quantity numeric,
original_price numeric,
sale_price numeric,
vip_price numeric,
bargin_price numeric,
medium numeric,
promotion_id numeric,
is_vip_discount numeric,
product_type numeric,
reduce_price numeric,
etl_change_date timestamp without time zone,
order_items_id numeric,
gift_card_charge numeric(12,2),
gift_unit_price numeric,
item_id numeric,
parent_item_id numeric,
allot_activity_fee numeric(12,2),
allot_point_deduction_amount numeric,
send_date timestamp without time zone,
privilege_code_discount_amount numeric,
relation_type numeric,
parent_id character varying(16),
shop_id numeric,
shop_type numeric
)
WITH (
OIDS=FALSE
)
DISTRIBUTED BY (order_id)
PARTITION BY RANGE(send_date)
(
PARTITION p_order_detail_20170701 START ('2017-06-01 00:00:00'::timestamp without time zone) END ('2017-07-01 00:00:00'::timestamp without time zone),
PARTITION p_order_detail_20170801 START ('2017-07-01 00:00:00'::timestamp without time zone) END ('2017-08-01 00:00:00'::timestamp without time zone)
)
添加分区
alter table fi.order_detail_adt_cp add partition p_order_detail_adt_20170601 START ('2017-05-01 00:00:00'::timestamp without time zone) END ('2017-06-01 00:00:00'::timestamp without time zone) EVERY ('1 mon'::interval)
修改表名称
alter table fi_middle.order_detail rename to order_detail_adt;
(表名默认在旧表模式下,不必指定模式)
内容修改
update fi.fi_promotion_info set supp_no='00'||supp_no where date_id=20170915 and length(supp_no)=5 and supp_no<>'80000';
update fi.fi_promotion_info set supp_no='0'||supp_no where date_id=20170915 and length(supp_no)=4;
update fi.fi_promotion_info set end_date=end_date+interval '1 day' where date_id=20170915 and to_char(end_date,'yyyymmdd hh24:mi:ss') like '%00:00:00';
update fi.fi_promotion_info set cat2_name='全品' where date_id=20170915 and cat2_name='全部品种';
转载于:https://blog.51cto.com/13126942/2053712