greenplum 5.7 + create table + insert into

os: centos 7.4
gp: gpdb-5.7.0

三台机器
node1 为master host
node2、node3为segment host

psql 登录 node1 master

$ psql -d peiybdb
peiybdb=# select current_database();
 current_database 
------------------
 peiybdb
(1 row)

create table tmp_t0(
c1 varchar(100),
c2 varchar(100),
c3 varchar(100)
);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'c1' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE

peiybdb=# \d+ tmp_t0
                        Table "public.tmp_t0"
 Column |          Type          | Modifiers | Storage  | Description 
--------+------------------------+-----------+----------+-------------
 c1     | character varying(100) |           | extended | 
 c2     | character varying(100) |           | extended | 
 c3     | character varying(100) |           | extended | 
Has OIDs: no
Distributed by: (c1)

查看node1、node2、node3 当前的PGHOME目录大小

[gpadmin@node1 gpseg-1]$ pwd
/u01/greenplum-data/gpseg-1
[gpadmin@node1 gpseg-1]$ du -sh
109M	.

[gpadmin@node2 gpseg0]$ pwd
/u01/greenplum-data/gpseg0
[gpadmin@node2 gpseg0]$ du -sh
109M	.

[gpadmin@node3 gpseg1]$ pwd
/u01/greenplum-data/gpseg1
[gpadmin@node3 gpseg1]$ du -sh
109M	.

插入数据

peiybdb=# select gp_opt_version();
                gp_opt_version                 
-----------------------------------------------
 GPOPT version: 2.55.20, Xerces version: 3.1.2
(1 row)


peiybdb=# set optimizer=on;
set optimizer_enumerate_plans=on;
set optimizer_minidump=always;
set optimizer_enable_constant_expression_evaluation=off;
set client_min_messages='debug5';


peiybdb=# insert into tmp_t0
(c1,c2,c3)
select generate_series(1, 1000000) as c1,
       md5(random()::text) as c2 ,
       md5(random()::text) as c3
;

node1 master上查询pg_stat_activity

peiybdb=# \x
Expanded display is on.
peiybdb=# select * from pg_stat_activity;
-[ RECORD 1 ]----+-------------------------------------------
datid            | 16384
datname          | peiybdb
procpid          | 10904
sess_id          | 20
usesysid         | 10
usename          | gpadmin
current_query    | insert into tmp_t0
                 | (c1,c2,c3)
                 | select generate_series(1, 1000000) as c1, 
                 |        md5(random()::text) as c2 , 
                 |        md5(random()::text) as c3 
                 | 
waiting          | f
query_start      | 2018-05-02 07:20:33.325637+00
backend_start    | 2018-05-02 07:13:17.573451+00
client_addr      | 192.168.56.1
client_port      | 63217
application_name | 
xact_start       | 2018-05-02 07:20:33.3212+00
waiting_reason   | 
rsgid            | 0
rsgname          | 
rsgqueueduration | 
-[ RECORD 2 ]----+-------------------------------------------
datid            | 16384
datname          | peiybdb
procpid          | 10906
sess_id          | 21
usesysid         | 10
usename          | gpadmin
current_query    | <IDLE>
waiting          | f
query_start      | 2018-05-02 07:17:17.675179+00
backend_start    | 2018-05-02 07:13:17.580506+00
client_addr      | 192.168.56.1
client_port      | 63218
application_name | 
xact_start       | 
waiting_reason   | 
rsgid            | 0
rsgname          | 
rsgqueueduration | 
-[ RECORD 3 ]----+-------------------------------------------
datid            | 16384
datname          | peiybdb
procpid          | 11095
sess_id          | 22
usesysid         | 10
usename          | gpadmin
current_query    | select * from pg_stat_activity;
waiting          | f
query_start      | 2018-05-02 07:28:07.30111+00
backend_start    | 2018-05-02 07:27:54.501407+00
client_addr      | 
client_port      | -1
application_name | psql
xact_start       | 2018-05-02 07:28:07.30111+00
waiting_reason   | 
rsgid            | 0
rsgname          | 
rsgqueueduration | 

再次查看node1、node2、node3的文件目录大小

[gpadmin@node1 gpseg-1]$ du -sh
109M	.

[gpadmin@node2 gpseg0]$ du -sh
224M	.

[gpadmin@node3 gpseg1]$ du -sh
224M	.

可以观察到 master 节点node1的数据文件大小并没有发生变化,segment的node2、node3的数据文件增长了不少。
主要就是由于 master 节点是用来存储定义,segment是用来存储数据的。

查看tmp_t0的定义

peiybdb=# 
peiybdb=# \d
              List of relations
 Schema |  Name  | Type  |  Owner  | Storage 
--------+--------+-------+---------+---------
 public | tmp_t0 | table | gpadmin | heap
(1 row)

peiybdb=# \d+ tmp_t0;
                        Table "public.tmp_t0"
 Column |          Type          | Modifiers | Storage  | Description 
--------+------------------------+-----------+----------+-------------
 c1     | character varying(100) |           | extended | 
 c2     | character varying(100) |           | extended | 
 c3     | character varying(100) |           | extended | 
Has OIDs: no
Distributed by: (c1)

Distributed by 这个就是tmp_t0表的分布列,表的分布列一定要合理,能够降数据比较均匀的分布到各个segment节点上。
检索数据时能够在多个节点并发处理数据。

使用 limit 查询数据时,可以看到数据是随机到某个节点查询

postgres=# select * from tmp_t0 limit 4;
 c1 |                c2                |                c3                
----+----------------------------------+----------------------------------
 1  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 3  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 5  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 7  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
(4 rows)

postgres=# select * from tmp_t0 limit 4;
 c1 |                c2                |                c3                
----+----------------------------------+----------------------------------
 2  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 4  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 6  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
 8  | 1f50f9a3a8f6aa5015823d43fa41910d | 5067d2103a5cbebe0557391f8b440bee
(4 rows)

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

数据库人生

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值