A Study of Citus Shard Distribution (Part 1: Operating on Tables Directly on Worker Nodes)

Author: 皓月如我

Article link: https://blog.csdn.net/fm0517/article/details/79460980

(Unless stated otherwise, every SQL statement in this article is run on the coordinator node.)

Worker nodes

mydb1=# SELECT * FROM master_get_active_worker_nodes();
   node_name   | node_port 
---------------+-----------
 192.168.7.131 |      5432
 192.168.7.135 |      5432
 192.168.7.136 |      5432
 192.168.7.137 |      5432
 192.168.7.133 |      5432
 192.168.7.132 |      5432
 192.168.7.134 |      5432
 192.168.7.130 |      5432
(8 rows)

Create the table test_table

create table test_table(id int, name varchar(16));

At this point, running \d on the coordinator node lists the table test_table.

Configure the sharding rule

SELECT master_create_distributed_table('test_table', 'id', 'hash');

Create the shards from the shard count and the replication factor

SELECT master_create_worker_shards('test_table', 8, 1);

At this point, running \d on a worker node lists tables named test_table_XXXXXX.

View the shards

mydb1=# SELECT * from pg_dist_shard;
 logicalrelid | shardid | shardstorage | shardminvalue | shardmaxvalue 
--------------+---------+--------------+---------------+---------------
 test_table   |  102024 | t            | -2147483648   | -1610612737
 test_table   |  102025 | t            | -1610612736   | -1073741825
 test_table   |  102026 | t            | -1073741824   | -536870913
 test_table   |  102027 | t            | -536870912    | -1
 test_table   |  102028 | t            | 0             | 536870911
 test_table   |  102029 | t            | 536870912     | 1073741823
 test_table   |  102030 | t            | 1073741824    | 1610612735
 test_table   |  102031 | t            | 1610612736    | 2147483647
(8 rows)

As the output shows, the trailing number in each worker-side table name test_table_XXXXXX is the shardid listed here.
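
The shardminvalue and shardmaxvalue columns above look like an even split of the signed 32-bit hash space into 8 ranges. The following is a minimal sketch of that arithmetic, not Citus source code; the function name shard_ranges and the constant names are made up for illustration, and the starting shardid 102024 is simply taken from the output above.

```python
# Sketch: split the signed 32-bit hash space into equal, contiguous shard ranges.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
HASH_TOKEN_COUNT = 2**32  # number of distinct 32-bit hash values


def shard_ranges(shard_count, first_shard_id):
    """Return (shardid, min, max) tuples that cover the whole int32 range."""
    increment = HASH_TOKEN_COUNT // shard_count
    ranges = []
    for index in range(shard_count):
        low = INT32_MIN + index * increment
        high = low + increment - 1
        if index == shard_count - 1:
            # The last range absorbs any remainder when the split is uneven.
            high = INT32_MAX
        ranges.append((first_shard_id + index, low, high))
    return ranges


RANGES = shard_ranges(8, 102024)
for shardid, low, high in RANGES:
    print(shardid, low, high)
```

With 8 shards the step is 2^32 / 8 = 536870912, so the printed ranges match the pg_dist_shard rows above one for one.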

Operate on the table test_table

mydb1=# INSERT INTO test_table VALUES(1,'a');
INSERT 0 1
mydb1=# INSERT INTO test_table VALUES(2,'b');
INSERT 0 1
mydb1=# INSERT INTO test_table VALUES(3,'c');
INSERT 0 1
mydb1=# INSERT INTO test_table VALUES(4,'d');
INSERT 0 1
mydb1=# INSERT INTO test_table VALUES(5,'e');
INSERT 0 1
mydb1=# INSERT INTO test_table VALUES(6,'f');
INSERT 0 1
mydb1=# INSERT INTO test_table VALUES(7,'g');
INSERT 0 1
mydb1=# INSERT INTO test_table VALUES(8,'h');
INSERT 0 1
mydb1=# select * from test_table;
 id | name 
----+------
  1 | a
  8 | h
  5 | e
  4 | d
  7 | g
  3 | c
  6 | f
  2 | b
(8 rows)

Operate on the table directly on a worker node

On the worker node that holds shard 102024, run:

mydb1=# select * from test_table_102024;
 id | name 
----+------
  1 | a
  8 | h
(2 rows)

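This shows that the rows of test_table are distributed by the hash of the id column across the shards held by the 8 worker nodes; here ids 1 and 8 landed in the same shard.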
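
Since ids 1 and 8 both sit in shard 102024, their hash values must both fall inside that shard's range, [-2147483648, -1610612737]. The sketch below illustrates the lookup from a hash value to a shardid using the ranges from pg_dist_shard. It is an assumption-laden illustration: the helper shard_for_hash is hypothetical, and it takes an already computed 32-bit hash as input rather than reproducing the hash function PostgreSQL applies to the id column.

```python
import bisect

# (shardid, shardminvalue, shardmaxvalue), copied from the pg_dist_shard output.
SHARDS = [
    (102024, -2147483648, -1610612737),
    (102025, -1610612736, -1073741825),
    (102026, -1073741824, -536870913),
    (102027, -536870912, -1),
    (102028, 0, 536870911),
    (102029, 536870912, 1073741823),
    (102030, 1073741824, 1610612735),
    (102031, 1610612736, 2147483647),
]

# Lower bounds in ascending order, used as the binary-search key.
MIN_VALUES = [low for _, low, _ in SHARDS]


def shard_for_hash(hash_value):
    """Return the shardid whose [min, max] range contains hash_value."""
    if not -2**31 <= hash_value <= 2**31 - 1:
        raise ValueError("hash value must fit in a signed 32-bit integer")
    # Find the last range whose lower bound is <= hash_value.
    position = bisect.bisect_right(MIN_VALUES, hash_value) - 1
    shardid, low, high = SHARDS[position]
    assert low <= hash_value <= high
    return shardid


# Hypothetical hash values chosen only to exercise the range boundaries.
print(shard_for_hash(-2147483648))
print(shard_for_hash(-1))
print(shard_for_hash(0))
```
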
Next, try inserting a row directly on the worker node. Run on the worker node:

mydb1=# INSERT INTO test_table_102024 VALUES(999,'999');
INSERT 0 1

Then query the shard table again:

mydb1=# select * from test_table_102024;
 id  | name 
-----+------
   1 | a
   8 | h
 999 | 999
(3 rows)

Query on the coordinator node:

mydb1=# select * from test_table;
 id  | name 
-----+------
   1 | a
   8 | h
 999 | 999
   5 | e
   4 | d
   7 | g
   3 | c
   6 | f
   2 | b
(9 rows)
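
The row written directly on the worker shows up in the coordinator's result because an unfiltered SELECT reads every shard table and merges whatever rows it finds there. The toy model below sketches only that behaviour with plain Python lists; it is not how Citus is implemented. The article does not show which of the other seven shards holds each remaining row, so the model lumps them into one list, and it makes no claim about whether the hash of 999 belongs to the range of shard 102024.

```python
# Rows of shard 102024, as shown on the worker node.
shard_102024 = [(1, "a"), (8, "h")]

# Rows held by the other seven shards, lumped together because the article
# does not show their individual contents (a simplification of this sketch).
other_shards = [(5, "e"), (4, "d"), (7, "g"), (3, "c"), (6, "f"), (2, "b")]


def coordinator_select_all():
    """Unfiltered SELECT: concatenate the rows every shard currently holds."""
    return shard_102024 + other_shards


before = coordinator_select_all()

# Direct write on the worker: the row goes straight into the shard table,
# with no hashing or routing step in between.
shard_102024.append((999, "999"))

after = coordinator_select_all()
print(len(before), len(after))
```

In this model the row count goes from 8 to 9, and the new row appears right after the two original rows of shard 102024, which is the order seen in the coordinator output above.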