原创文章,转载须注明出处。访问我的Github(地址:https://guobo507.github.io)查看最新文章列表。
CitusData:Citus DB分布式数据库系统是一个将SQL的表现力、关系型数据库的性能,以及Hadoop的可扩展性与可用性有效地整合的数据库产品。
Citus Documentation:https://docs.citusdata.com/en/v8.2/
本文中,我们将适用Vagrant来快速搭建一台测试服务器,并在上面安装PostgreSQL 11和Citus来测试。
Vagrantfile内容如下(路径:~/vagrantdata/pgcitus/):
Vagrant.configure("2") do |config|
config.vm.hostname = "pgcitus"
config.vm.box = "centos7/1902_01"
config.vm.box_check_update = false
config.vm.network "private_network", ip: "10.128.0.30"
config.vm.network :forwarded_port, guest: 22, host: 10030, auto_correct: true
# config.vm.network :forwarded_port, guest: 9000, host: 9000, auto_correct: true
config.vm.provider "virtualbox" do |vb|
vb.name = "vagrant_pgcitus"
vb.gui = false
vb.memory = "10240"
vb.cpus = 4
end
config.vm.provision "shell", inline: <<-SHELL
yum install -y openssh-server net-tools wget dstat &> /dev/null
echo 'LoginGraceTime 0' >> /etc/ssh/sshd_config
echo 'PermitRootLogin yes' >> /etc/ssh/sshd_config
echo 'PasswordAuthentication yes' >> /etc/ssh/sshd_config
sed -i "/^PasswordAuthentication no/d" /etc/ssh/sshd_config
echo r00tr00t |passwd --stdin root
systemctl enable sshd
systemctl restart sshd
echo "export TZ='Asia/Shanghai'" >> /etc/profile
echo "export LANG=en_US.UTF-8" >> /etc/profile
echo "alias df='df -hTP'" >> /etc/profile
echo "alias df='df -hP'" >> /etc/profile
echo "alias free='free -h'" >> /etc/profile
sed -i "/pgcitus/d" /etc/hosts
sed -i "/SELINUX=/s/enforcing/disabled/g" /etc/selinux/config
echo >> /etc/hosts
echo "10.128.0.30 pgcitus" >> /etc/hosts
SHELL
end
启动虚拟机:
cd ~/vagrantdata/pgcitus/
vagrant up
1 安装Citus和PostgreSQL软件包
以下内容需连接到虚拟机上操作,如下所示:
ssh root@10.128.0.30
1.1 创建postgres用户
groupadd postgres
useradd -g postgres postgres
echo postgres |passwd --stdin postgres
1.2 适用Citus提供的脚本安装YUM仓库
curl https://install.citusdata.com/community/rpm.sh |bash
实际上,可以直接安装PGDG源,其中也包含了Citus软件包。
1.4 安装PostgreSQL软件包
yum install -y postgresql11-server postgresql11-contrib
1.5 安装Citus软件包
yum install -y citus82_11 citus82_11-debuginfo
1.3 创建各数据库实例的数据文件目录
mkdir -p /u01/pgdata/{master,worker1,worker2,worker3}
chown -R postgres.postgres /u01/*
chmod 700 /u01/pgdata/{master,worker1,worker2,worker3}
2 创建各数据库实例
2.1 初始化master和三个worker数据库实例
su - postgres
echo "export PATH=/usr/pgsql-11/bin:\$PATH" >> ~/.bash_profile
source ~/.bash_profile
echo postgres > ./password.tmp
/usr/pgsql-11/bin/initdb -D /u01/pgdata/master --auth=trust --encoding=utf8 --locale=C --wal-segsize=16 --pwfile=./password.tmp
/usr/pgsql-11/bin/initdb -D /u01/pgdata/worker1 --auth=trust --encoding=utf8 --locale=C --wal-segsize=16 --pwfile=./password.tmp
/usr/pgsql-11/bin/initdb -D /u01/pgdata/worker2 --auth=trust --encoding=utf8 --locale=C --wal-segsize=16 --pwfile=./password.tmp
/usr/pgsql-11/bin/initdb -D /u01/pgdata/worker3 --auth=trust --encoding=utf8 --locale=C --wal-segsize=16 --pwfile=./password.tmp
rm -f ./password.tmp
ls -l /u01/pgdata/*/
2.2 为每个实例创建归档日志目录
mkdir -p /u01/pgdata/{master,worker1,worker2,worker3}/pg_archive
2.3 启用每个实例的wal日志归档
INSTANCE_LIST='master worker1 worker2 worker3'
for d in $INSTANCE_LIST; do sed -i "/^#archive_mode/s/^#//g" /u01/pgdata/$d/postgresql.conf; done
for d in $INSTANCE_LIST; do sed -i "/^archive_mode/s/off/on/g" /u01/pgdata/$d/postgresql.conf; done
for d in $INSTANCE_LIST; do sed -i "223 aarchive_command = 'test ! -f /u01/pgdata/$d/pg_archive/%f && cp %p /u01/pgdata/$d/pg_archive/%f'" /u01/pgdata/$d/postgresql.conf; done
2.4 启用citus和pg_stat_statements扩展
INSTANCE_LIST='master worker1 worker2 worker3'
for d in $INSTANCE_LIST; do echo "shared_preload_libraries = 'citus,pg_stat_statements'" >> /u01/pgdata/$d/postgresql.conf; done
2.5 修改各实例的监听端口
修改各实例的监听端口:
sed -i "/^port/s/5432/9000/g" /u01/pgdata/master/postgresql.conf
for i in `seq 1 3`; do sed -i "/^port/s/5432/900$i/g" /u01/pgdata/worker$i/postgresql.conf; done
INSTANCE_LIST='master worker1 worker2 worker3'
for d in $INSTANCE_LIST; do grep ^port /u01/pgdata/$d/postgresql.conf; done
2.6 根据需要修改各实例的其他参数配置
根据需要,修改所有实例的其他配置,如监听地址、wal参数、checkpoint参数以及日志功能参数等:
INSTANCE_LIST='master worker1 worker2 worker3'
for d in $INSTANCE_LIST; do
echo "shared_preload_libraries = 'citus,pg_stat_statements'" >> /u01/pgdata/$d/postgresql.conf
sed -i "/^#listen_addresses/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^listen_addresses/s/localhost/*/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^shared_buffers/s/128/512/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#listen_address/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#port/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^listen_address/s/localhost/*/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^shared_buffers/s/128/2048/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^max_connections/s/100/256/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#superuser_reserved_connections/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#wal_level/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#fsync/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#checkpoint_completion_target/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^checkpoint_completion_target/s/0.5/0.9/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#max_wal_senders/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#wal_keep_segments/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^wal_keep_segments/s/0/64/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#effective_cache_size/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^effective_cache_size/s/4/2/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_file_mode/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_checkpoints/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^log_checkpoints/s/off/on/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_connections/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^log_connections/s/off/on/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_disconnections/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^log_disconnections/s/off/on/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_duration/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^log_duration/s/off/on/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_error_verbosity/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_lock_waits/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^log_lock_waits/s/off/on/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_statement/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^log_statement/s/none/all/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^#log_replication_commands/s/^#//g" /u01/pgdata/$d/postgresql.conf
sed -i "/^log_replication_commands/s/off/on/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^max_wal_size/s/1/2/g" /u01/pgdata/$d/postgresql.conf
sed -i "/^min_wal_size/s/80/512/g" /u01/pgdata/$d/postgresql.conf
done
2.7 创建统一的集群管理脚本
mkdir ~/scripts
cat << EOF > ~/scripts/start_citus.sh
INSTANCE_LIST='master worker1 worker2 worker3'
for d in \$INSTANCE_LIST; do echo "====> Starting \$d:" && /usr/pgsql-11/bin/pg_ctl -D /u01/pgdata/\$d -l /u01/pgdata/\$d/logfile_\$d start && echo ; done
EOF
cat << EOF > ~/scripts/stop_citus.sh
INSTANCE_LIST='master worker1 worker2 worker3'
for d in \$INSTANCE_LIST; do echo "====> Stopping \$d:" && /usr/pgsql-11/bin/pg_ctl -D /u01/pgdata/\$d stop && echo ; done
EOF
cat << EOF > ~/scripts/reload_citus.sh
INSTANCE_LIST='master worker1 worker2 worker3'
for d in \$INSTANCE_LIST; do echo "====> Reloading \$d:" && /usr/pgsql-11/bin/pg_ctl -D /u01/pgdata/\$d reload && echo ; done
EOF
cat << EOF > ~/scripts/restart_citus.sh
INSTANCE_LIST='master worker1 worker2 worker3'
for d in \$INSTANCE_LIST; do echo "====> Restarting \$d:" && /usr/pgsql-11/bin/pg_ctl -D /u01/pgdata/\$d -l /u01/pgdata/\$d/logfile_\$d restart && echo ; done
EOF
chmod 700 ~/scripts/*_citus.sh
2.8 启动Citus集群的各个实例
~/scripts/start_citus.sh
# 检查结果:
ps -ef |grep pg |grep -v grep
ss -tunlp |grep postgres
2.9 在各实例中创建扩展视图
2.9.1 为master实例启用citus和pg_stat_statements视图
psql -p 9000 -c "CREATE EXTENSION citus;"
psql -p 9000 -d template1 -c "CREATE EXTENSION citus;"
psql -p 9000 -c "CREATE EXTENSION pg_stat_statements;"
psql -p 9000 -d template1 -c "CREATE EXTENSION pg_stat_statements;"
2.9.2 为3个worker实例启用citus和pg_stat_statements视图
for i in `seq 1 3`; do psql -p 900$i -c "CREATE EXTENSION citus;"; done
for i in `seq 1 3`; do psql -p 900$i -d template1 -c "CREATE EXTENSION citus;"; done
for i in `seq 1 3`; do psql -p 900$i -c "CREATE EXTENSION pg_stat_statements;"; done
for i in `seq 1 3`; do psql -p 900$i -d template1 -c "CREATE EXTENSION pg_stat_statements;"; done
3 配置Citus的分表功能
3.1 在master节点添加3各worker节点
在master节点中添加三个worker节点,作为后端实例:
psql -p 9000 -c "SELECT * from master_add_node('localhost', 9001);"
psql -p 9000 -c "SELECT * from master_add_node('localhost', 9002);"
psql -p 9000 -c "SELECT * from master_add_node('localhost', 9003);"
# 检查结果:
psql -p 9000 -c "select * from master_get_active_worker_nodes();"
3.2 在master节点创建测试表
本文适用Citus官方提供的三张测试表创建语句来测试:
psql -U postgres -p 9000
CREATE TABLE companies (
id bigint NOT NULL,
name text NOT NULL,
image_url text,
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL
);
CREATE TABLE campaigns (
id bigint NOT NULL,
company_id bigint NOT NULL,
name text NOT NULL,
cost_model text NOT NULL,
state text NOT NULL,
monthly_budget bigint,
blacklisted_site_urls text[],
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL
);
CREATE TABLE ads (
id bigint NOT NULL,
company_id bigint NOT NULL,
campaign_id bigint NOT NULL,
name text NOT NULL,
image_url text,
target_url text,
impressions_count bigint DEFAULT 0,
clicks_count bigint DEFAULT 0,
created_at timestamp without time zone NOT NULL,
updated_at timestamp without time zone NOT NULL
);
# 检查结果:
\dt
注意:此时,worker节点是不存在这三张表的。
3.3 为所有测试表添加主键
执行如下操作,为上述三张分区表添加主键,以便启用复制:
ALTER TABLE companies ADD PRIMARY KEY (id);
ALTER TABLE campaigns ADD PRIMARY KEY (id, company_id);
ALTER TABLE ads ADD PRIMARY KEY (id, company_id);
3.4 将所有测试表添加到Citus分布式进程
执行如下操作,将上述三张分区表添加到Citus的进程中,并设置分库键:
SELECT create_distributed_table('companies', 'id');
SELECT create_distributed_table('campaigns', 'company_id');
SELECT create_distributed_table('ads', 'company_id');
注意:此时,worker节点已存在这三张表了。如下所示为worker1节点上的表:
[postgres@pgcitus ~]$ psql -U postgres -p 9001 -c "select relname as TABLE_NAME from pg_class where relname like 'ads_1%' and relkind = 'r';" table_name ------------ ads_102096 ads_102102 ads_102072 ads_102084 ads_102090 ads_102078 ads_102081 ads_102087 ads_102075 ads_102093 ads_102099 (11 rows) [postgres@pgcitus ~]$ psql -U postgres -p 9001 -c "select relname as TABLE_NAME from pg_class where relname like 'campaigns_1%' and relkind = 'r';" table_name ------------------ campaigns_102046 campaigns_102052 campaigns_102067 campaigns_102049 campaigns_102064 campaigns_102055 campaigns_102040 campaigns_102058 campaigns_102043 campaigns_102070 campaigns_102061 (11 rows) [postgres@pgcitus ~]$ psql -U postgres -p 9001 -c "select relname as TABLE_NAME from pg_class where relname like 'companies_1%' and relkind = 'r';" table_name ------------------ companies_102011 companies_102026 companies_102014 companies_102035 companies_102008 companies_102032 companies_102038 companies_102029 companies_102017 companies_102020 companies_102023 (11 rows)
可以看到,这三张表的分表均被自动编号。如果你仔细观察,还会发现分表的命名规律,如
ads_102072
之后为ads_102075
,ads_102073
在worker2上,ads_102074
在worker3上。因此分表的命名在三个worker节点上是连续的。
可以在master实例上执行如下命令查看分表id:postgres=# SELECT * from pg_dist_shard_placement order by shardid, placementid; shardid | shardstate | shardlength | nodename | nodeport | placementid ---------+------------+-------------+-----------+----------+------------- 102008 | 1 | 0 | localhost | 9001 | 1 102009 | 1 | 0 | localhost | 9002 | 2 102010 | 1 | 0 | localhost | 9003 | 3 102011 | 1 | 0 | localhost | 9001 | 4 102012 | 1 | 0 | localhost | 9002 | 5 102013 | 1 | 0 | localhost | 9003 | 6 102014 | 1 | 0 | localhost | 9001 | 7 102015 | 1 | 0 | localhost | 9002 | 8 102016 | 1 | 0 | localhost | 9003 | 9 102017 | 1 | 0 | localhost | 9001 | 10 102018 | 1 | 0 | localhost | 9002 | 11 102019 | 1 | 0 | localhost | 9003 | 12 102020 | 1 | 0 | localhost | 9001 | 13 102021 | 1 | 0 | localhost | 9002 | 14 102022 | 1 | 0 | localhost | 9003 | 15 102023 | 1 | 0 | localhost | 9001 | 16 102024 | 1 | 0 | localhost | 9002 | 17 102025 | 1 | 0 | localhost | 9003 | 18 102026 | 1 | 0 | localhost | 9001 | 19 102027 | 1 | 0 | localhost | 9002 | 20 102028 | 1 | 0 | localhost | 9003 | 21 102029 | 1 | 0 | localhost | 9001 | 22 102030 | 1 | 0 | localhost | 9002 | 23 102031 | 1 | 0 | localhost | 9003 | 24 102032 | 1 | 0 | localhost | 9001 | 25 102033 | 1 | 0 | localhost | 9002 | 26 102034 | 1 | 0 | localhost | 9003 | 27 102035 | 1 | 0 | localhost | 9001 | 28 102036 | 1 | 0 | localhost | 9002 | 29 102037 | 1 | 0 | localhost | 9003 | 30 102038 | 1 | 0 | localhost | 9001 | 31 102039 | 1 | 0 | localhost | 9002 | 32 102040 | 1 | 0 | localhost | 9001 | 33 102041 | 1 | 0 | localhost | 9002 | 34 102042 | 1 | 0 | localhost | 9003 | 35 102043 | 1 | 0 | localhost | 9001 | 36 102044 | 1 | 0 | localhost | 9002 | 37 102045 | 1 | 0 | localhost | 9003 | 38 102046 | 1 | 0 | localhost | 9001 | 39 102047 | 1 | 0 | localhost | 9002 | 40 102048 | 1 | 0 | localhost | 9003 | 41 102049 | 1 | 0 | localhost | 9001 | 42 102050 | 1 | 0 | localhost | 9002 | 43 102051 | 1 | 0 | localhost | 9003 | 44 102052 | 1 | 0 | localhost | 9001 | 45 102053 | 1 | 0 | localhost | 9002 | 46 102054 | 1 | 0 | localhost | 9003 | 47 102055 | 1 | 0 | localhost | 9001 | 48 102056 | 1 | 0 | localhost | 9002 | 49 102057 | 1 | 0 | localhost | 9003 | 50 102058 | 1 | 0 | localhost | 9001 | 51 102059 | 1 | 0 | localhost | 9002 | 52 102060 | 1 | 0 | localhost | 9003 | 53 102061 | 1 | 0 | localhost | 9001 | 54 102062 | 1 | 0 | localhost | 9002 | 55 102063 | 1 | 0 | localhost | 9003 | 56 102064 | 1 | 0 | localhost | 9001 | 57 102065 | 1 | 0 | localhost | 9002 | 58 102066 | 1 | 0 | localhost | 9003 | 59 102067 | 1 | 0 | localhost | 9001 | 60 102068 | 1 | 0 | localhost | 9002 | 61 102069 | 1 | 0 | localhost | 9003 | 62 102070 | 1 | 0 | localhost | 9001 | 63 102071 | 1 | 0 | localhost | 9002 | 64 102072 | 1 | 0 | localhost | 9001 | 65 102073 | 1 | 0 | localhost | 9002 | 66 102074 | 1 | 0 | localhost | 9003 | 67 102075 | 1 | 0 | localhost | 9001 | 68 102076 | 1 | 0 | localhost | 9002 | 69 102077 | 1 | 0 | localhost | 9003 | 70 102078 | 1 | 0 | localhost | 9001 | 71 102079 | 1 | 0 | localhost | 9002 | 72 102080 | 1 | 0 | localhost | 9003 | 73 102081 | 1 | 0 | localhost | 9001 | 74 102082 | 1 | 0 | localhost | 9002 | 75 102083 | 1 | 0 | localhost | 9003 | 76 102084 | 1 | 0 | localhost | 9001 | 77 102085 | 1 | 0 | localhost | 9002 | 78 102086 | 1 | 0 | localhost | 9003 | 79 102087 | 1 | 0 | localhost | 9001 | 80 102088 | 1 | 0 | localhost | 9002 | 81 102089 | 1 | 0 | localhost | 9003 | 82 102090 | 1 | 0 | localhost | 9001 | 83 102091 | 1 | 0 | localhost | 9002 | 84 102092 | 1 | 0 | localhost | 9003 | 85 102093 | 1 | 0 | localhost | 9001 | 86 102094 | 1 | 0 | localhost | 9002 | 87 102095 | 1 | 0 | localhost | 9003 | 88 102096 | 1 | 0 | localhost | 9001 | 89 102097 | 1 | 0 | localhost | 9002 | 90 102098 | 1 | 0 | localhost | 9003 | 91 102099 | 1 | 0 | localhost | 9001 | 92 102100 | 1 | 0 | localhost | 9002 | 93 102101 | 1 | 0 | localhost | 9003 | 94 102102 | 1 | 0 | localhost | 9001 | 95 102103 | 1 | 0 | localhost | 9002 | 96 (96 rows)
4 导入验证数据测试
4.1 从Citus官方获取测试表的示例数据
本例Citus官方提供的示例表的测试数据来测试。执行如下操作,获取示例数据:
mkdir ~/citus_data
cd ~/citus_data
curl https://examples.citusdata.com/tutorial/companies.csv > companies.csv
curl https://examples.citusdata.com/tutorial/campaigns.csv > campaigns.csv
curl https://examples.citusdata.com/tutorial/ads.csv > ads.csv
4.2 导入示例数据
cd ~/citus_data
psql -U postgres -p 9000
\copy companies from 'companies.csv' with csv
\copy campaigns from 'campaigns.csv' with csv
\copy ads from 'ads.csv' with csv
# 检查结果:
select count(*) from companies; # 共100条数据
select count(*) from campaigns; # 共978条数据
select count(*) from ads; # 共7364条数据
4.3 检查各worker节点的数据分布情况
分别适用下面的语句查看到三个worker节点的子表名称及其记录数:
[root@pgcitus ~]# su - postgres
[root@pgcitus ~]# for p in 9001 9002 9003; do echo "===> Worker Instance $p: " && psql -U postgres -p $p -c "select relname as TABLE_NAME, reltuples as ROW_COUNT from pg_class where relname like 'ads_10%' and relkind = 'r' order by ROW_COUNT desc;"; done
===> Worker Instance 9001:
table_name | row_count
------------+-----------
ads_102075 | 434
ads_102102 | 407
ads_102072 | 270
ads_102084 | 252
ads_102093 | 219
ads_102096 | 193
ads_102081 | 183
ads_102078 | 134
ads_102087 | 134
ads_102099 | 109
ads_102090 | 0
(11 rows)
===> Worker Instance 9002:
table_name | row_count
------------+-----------
ads_102100 | 471
ads_102103 | 380
ads_102073 | 306
ads_102091 | 274
ads_102094 | 266
ads_102076 | 221
ads_102079 | 195
ads_102088 | 160
ads_102097 | 134
ads_102085 | 96
ads_102082 | 92
(11 rows)
===> Worker Instance 9003:
table_name | row_count
------------+-----------
ads_102092 | 502
ads_102086 | 466
ads_102080 | 457
ads_102101 | 245
ads_102095 | 191
ads_102083 | 157
ads_102074 | 128
ads_102089 | 101
ads_102077 | 98
ads_102098 | 89
(10 rows)
同理,下面的语句可以统计companies表和campaigns分表的数据分布情况:
for p in 9001 9002 9003; do echo "===> Worker Instance $p: " && psql -U postgres -p $p -c "select relname as TABLE_NAME, reltuples as ROW_COUNT from pg_class where relname like 'companies_10%' and relkind = 'r' order by ROW_COUNT desc;"; done for p in 9001 9002 9003; do echo "===> Worker Instance $p: " && psql -U postgres -p $p -c "select relname as TABLE_NAME, reltuples as ROW_COUNT from pg_class where relname like 'campaigns_10%' and relkind = 'r' order by ROW_COUNT desc;"; done
下面的语句分别统计三张测试表在三个worker实例上的所有分表的行数之和:
for p in 9001 9002 9003; do echo "===> Worker Instance $p: " && psql -U postgres -p $p -c "select sum(ROW_COUNT) from (select relname as TABLE_NAME, reltuples as ROW_COUNT from pg_class where relname like 'ads_10%' and relkind = 'r' order by ROW_COUNT desc) as ADS_COUNT;"; done
for p in 9001 9002 9003; do echo "===> Worker Instance $p: " && psql -U postgres -p $p -c "select sum(ROW_COUNT) from (select relname as TABLE_NAME, reltuples as ROW_COUNT from pg_class where relname like 'companies_1%' and relkind = 'r' order by ROW_COUNT desc) as ADS_COUNT;"; done
for p in 9001 9002 9003; do echo "===> Worker Instance $p: " && psql -U postgres -p $p -c "select sum(ROW_COUNT) from (select relname as TABLE_NAME, reltuples as ROW_COUNT from pg_class where relname like 'campaigns_1%' and relkind = 'r' order by ROW_COUNT desc) as ADS_COUNT;"; done
可以看到Citus根据规则将示例数据几乎存储到了三个worker示例上。