利用ShardingSphere-proxy实现PostgreSQL数据分片

1.节点信息

192.168.10.11 PostgreSQL14.7 数据节点
192.168.10.12 PostgreSQL14.7 数据节点
192.168.10.13 ShardingSphere-proxy 5.4.0

2.准备环境

修改/etc/hosts文件

cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.11 du101
192.168.10.12 du102
192.168.10.13 du103

防火墙设置

systemctl status firewalld.service
systemctl stop firewalld.service
systemctl disable firewalld

SELINUX设置

setenforce 0 
vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected. 
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted 

设置三节点时间同步

1、时间服务器配置(以192.168.10.13为例)
1)安装ntpd服务
yum install -y ntp
2)配置ntpd.conf
vi /etc/ntp.conf #以下定义是让NTP Server和其自身保持同步,如果在/etc/ntp.conf中定义的server都不可用时,将使用local时间作为ntp服务提供给ntp客户端。
server 127.127.1.0
fudge 127.127.1.0 stratum 10
3)启动ntpd服务
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd

2、数据库节点配置
yum install -y ntpdate
其余节点安装用ntpdate,并从192.168.10.13进行时间同步,
为避免时间出现误差,对数据库主备节点添加crontab任务保持定时同步时间,每分钟同步一次:

crontab -e
*/10 * * * *  /sbin/ntpdate -u 192.168.10.13 >> /var/log/ntp.log

检查时间同步情况
cat /var/log/ntp.log

2.数据库安装

搭建好PostgreSQL单机2节点,数据库配置好pg_hba.conf

[postgres@du101 ~]$ tail -5 /data/pgdata/pg_hba.conf    
host    replication     all             127.0.0.1/32            trust
host    replication     all             ::1/128                 trust
host    all     all     0.0.0.0/0 md5
[postgres@du101 ~]$

2数据节点分别创建好对应的数据库

192.168.10.11
postgres=# create database ds_0;
CREATE DATABASE
postgres=# 

192.168.10.12
postgres=# create database ds_1;
CREATE DATABASE
postgres=#

3.Proxy部署

ShardingSphere-proxy的运行需要java环境

[root@du103 ~]# yum -y install java
[root@du103 ~]# java -version      
openjdk version "1.8.0_332"
OpenJDK Runtime Environment (build 1.8.0_332-b09)
OpenJDK 64-Bit Server VM (build 25.332-b09, mixed mode)

[root@du103 ~]# tar zxvf apache-shardingsphere-5.4.0-shardingsphere-proxy-bin.tar.gz  -C /opt/
[root@du103 ~]# mv /opt/apache-shardingsphere-5.4.0-shardingsphere-proxy-bin /opt/apache-shardingsphere-5.4.0

[root@du103 ~]# ll /opt/apache-shardingsphere-5.4.0/conf
total 36
-rw-r--r-- 1 root root 3529 Dec 12  2022 config-encrypt.yaml
-rw-r--r-- 1 root root 3772 Dec 12  2022 config-mask.yaml
-rw-r--r-- 1 root root 3762 Dec 12  2022 config-readwrite-splitting.yaml
-rw-r--r-- 1 root root 5210 Dec 12  2022 config-shadow.yaml
-rw-r--r-- 1 root root 5701 Dec 12  2022 config-sharding.yaml
-rw-r--r-- 1 root root 1715 Dec 12  2022 hbase-db.yaml
-rw-r--r-- 1 root root 3096 Dec 12  2022 server.yaml
[root@du103 ~]#

–以 config- 前缀命名的 YAML 配置文件,即为一个逻辑数据源。
–server.yaml 配置注册中心、认证信息以及公用属性
##保留 config-sharding.yaml
server.yaml 文件其他可以创建bak目录放进去

[root@du103 ~]# mkdir /opt/apache-shardingsphere-5.4.0/conf/bak
[root@du103 ~]# mv /opt/apache-shardingsphere-5.4.0/conf/*.yaml /opt/apache-shardingsphere-5.4.0/conf/bak
[root@du103 ~]# cp /opt/apache-shardingsphere-5.4.0/conf/bak/server.yaml /opt/apache-shardingsphere-5.4.0/conf/
[root@du103 ~]# cp /opt/apache-shardingsphere-5.4.0/conf/bak/config-sharding.yaml /opt/apache-shardingsphere-5.4.0/conf/
[root@du103 ~]#

配制server.yaml文件

server.yaml文件的模板。该配置文件的主要作用是配置前端的认证数据库、用户名和密码, 以及连接相关的属性:包括分布式事务类型、sql日志等。
配置虚拟帐号密码,使用默认的root用户及密码root,且允许外网连接,同时权限指定为“所有权限无限制”。

[root@du103 ~]# vi /opt/apache-shardingsphere-5.4.0/conf/server.yaml
authority:
  users:
    - user: root@%
      password: root
  privilege:
    type: ALL_PERMITTED      
props:
  max-connections-size-per-query: 1
  kernel-executor-size: 16  # Infinite by default.
  proxy-frontend-flush-threshold: 128  # The default value is 128.
  sql-show: true
[root@du103 ~]#

参数说明:
max-connections-size-per-query:一次查询请求在每个数据库实例中所能使用的最大连接数,默认为1;
kernel-executor-size:用于设置任务处理线程池的大小。每个 ShardingSphereDataSource 使用一个独立的线程池,同一个 JVM 的不同数据源不共享线程池。默认为infinite;
proxy-frontend-flush-threshold:在 ShardingSphere-Proxy 中设置传输数据条数的 IO 刷新阈值。默认128;
sql-show:是否在日志中打印 SQL。打印 SQL 可以帮助开发者快速定位系统问题。日志内容包含:逻辑 SQL,真实 SQL 和 SQL 解析结果。如果开启配置,日志将使用 Topic ShardingSphere-SQL,日志级别是 INFO。

修改数据分片配置文件

config-sharding.yaml文件,该文件主要作用是配置后端与postgres数据库的连接属性,分库分表规则等。
本次分片示例表order_info和order_item数据分两个库,表分为3片,数据库分片键为user_id,值按2取余,表分片键为order_id,值按3取余。
并且包含广播表user_info

[root@du103 ~]# vi /opt/apache-shardingsphere-5.4.0/conf/config-sharding.yaml                           
databaseName: sharding_db
dataSources:
  ds_0:
    url: jdbc:postgresql://192.168.10.11:1921/ds_0
    username: postgres
    password: postgres
  ds_1:
    url: jdbc:postgresql://192.168.10.12:1921/ds_1
    username: postgres
    password: postgres
rules:
- !SHARDING
  tables:
    order_info:
      actualDataNodes: ds_${0..1}.order_info_${0..2}
      tableStrategy:
        standard:
          shardingColumn: order_id
          shardingAlgorithmName: order_info_inline
    order_item:
      actualDataNodes: ds_${0..1}.order_item_${0..2}
      tableStrategy:
        standard:
          shardingColumn: order_id
          shardingAlgorithmName: order_item_inline
  bindingTables:
    - order_info,order_item
  defaultDatabaseStrategy:
    standard:
      shardingColumn: user_id
      shardingAlgorithmName: ds_inline
  shardingAlgorithms:
    ds_inline:
      type: INLINE
      props:
        algorithm-expression: ds_${user_id % 2}
    order_info_inline:
      type: INLINE
      props:
        algorithm-expression: order_info_${order_id % 3}
    order_item_inline:
      type: INLINE
      props:
        algorithm-expression: order_item_${order_id % 3}
- !BROADCAST
  tables:
    - user_info
[root@du103 ~]#

参数涵义请自行查看shardingsphere配置
https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-jdbc/yaml-config/rules/sharding/
https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-jdbc/yaml-config/data-source/

启动ShardingSphere

启动shardingsphere之前需要启动后端数据库

[root@du103 ~]# ll /opt/apache-shardingsphere-5.4.0/bin/
total 16
-rwxr-xr-x 1 root root 3313 Dec 12  2022 start.bat
-rwxr-xr-x 1 root root 7492 Dec 12  2022 start.sh  **--启动脚本**
-rwxr-xr-x 1 root root 1455 Dec 12  2022 stop.sh   **--关闭脚本**
[root@du103 ~]# 

[root@du103 ~]# sh /opt/apache-shardingsphere-5.4.0/bin/start.sh
/usr/bin/java
we find java version: java8, full_version=1.8.0_332, full_path=/usr/bin/java
The classpath is /opt/apache-shardingsphere-5.4.0/conf:/opt/apache-shardingsphere-5.4.0/conf:.:/opt/apache-shardingsphere-5.4.0/lib/*:/opt/apache-shardingsphere-5.4.0/ext-lib/*
main class org.apache.shardingsphere.proxy.Bootstrap -1 /opt/apache-shardingsphere-5.4.0/conf 0.0.0.0 false
Starting the ShardingSphere-Proxy ...Please check the STDOUT file: /opt/apache-shardingsphere-5.4.0/logs/stdout.log
[root@du103 ~]#

查看shardingsphere日志:

[root@du103 ~]# tail -f  /opt/apache-shardingsphere-5.4.0/logs/stdout.log
[INFO ] 2023-10-16 11:21:18.184 [main] o.a.s.d.p.c.l.PipelineContextManagerLifecycleListener - mode type is not Cluster, mode type='Standalone', ignore
[INFO ] 2023-10-16 11:21:18.232 [main] o.a.s.p.v.ShardingSphereProxyVersion - Database name is `PostgreSQL`, version is `14.7`, database name is `sharding_db`
[INFO ] 2023-10-16 11:21:18.234 [main] o.a.s.p.frontend.ssl.ProxySSLContext - Proxy frontend SSL/TLS is not enabled.
[INFO ] 2023-10-16 11:21:18.385 [main] o.a.s.p.frontend.ShardingSphereProxy - ShardingSphere-Proxy Standalone mode started successfully

4.测试

代理端

shardingsphere默认端口为3307,前端数据库为sharding_db,在## server.yaml文件中有相关配置

[postgres@du102 ~]$ psql -d  sharding_db -h 192.168.10.13 -p 3307 -U root  
Password for user root: 
psql (14.7)
Type "help" for help.

sharding_db=> create table order_info(order_id int primary key, user_id int, order_ts timestamp, total_mount money);
CREATE TABLE
sharding_db=> insert into order_info values(1,1001,now(),88.90);
INSERT 0 1
sharding_db=> insert into order_info values(2,1002,now(),128.50);
INSERT 0 1
sharding_db=> insert into order_info values(3,1002,now(),228.70);
INSERT 0 1
sharding_db=> insert into order_info values(4,1002,now(),23.80);
INSERT 0 1
sharding_db=> insert into order_info values(5,1001,now(),988.00);
INSERT 0 1
sharding_db=> insert into order_info values(6,1001,now(),5000.00);
INSERT 0 1
sharding_db=> create table order_item(order_id int primary key, user_id int, product varchar, vprice money);
CREATE TABLE
sharding_db=> insert into order_item values(1,1001,'鸡肉',23.00);
INSERT 0 1
sharding_db=> insert into order_item values(2,1002,'猪肉',30.50);
INSERT 0 1
sharding_db=> insert into order_item values(3,1002,'牛肉',50.00);
INSERT 0 1
sharding_db=> insert into order_item values(4,1002,'西红柿',5.00);
INSERT 0 1
sharding_db=> insert into order_item values(5,1001,'海鲜',200.00);
INSERT 0 1
sharding_db=> insert into order_item values(6,1001,'茅台酒',2500.00);
INSERT 0 1
sharding_db=> create table user_info(user_id int primary key,username varchar);
CREATE TABLE
sharding_db=> insert into user_info values(1001,'Join');
INSERT 0 1
sharding_db=> insert into user_info values(1002,'Kevin');
INSERT 0 1
sharding_db=>

数据节点1

[postgres@du101 ~]$ psql ds_0
psql (14.7)
Type "help" for help.

ds_0=# \dt
            List of relations
 Schema |     Name     | Type  |  Owner   
--------+--------------+-------+----------
 public | order_info_0 | table | postgres
 public | order_info_1 | table | postgres
 public | order_info_2 | table | postgres
 public | order_item_0 | table | postgres
 public | order_item_1 | table | postgres
 public | order_item_2 | table | postgres
 public | user_info    | table | postgres
(7 rows)

ds_0=# select * from order_info_0;
 order_id | user_id |          order_ts          | total_mount 
----------+---------+----------------------------+-------------
        3 |    1002 | 2023-10-16 11:58:34.319382 |     $228.70
(1 row)

ds_0=# select * from order_info_1;
 order_id | user_id |          order_ts          | total_mount 
----------+---------+----------------------------+-------------
        4 |    1002 | 2023-10-16 11:58:37.807301 |      $23.80
(1 row)

ds_0=# select * from order_info_2;
 order_id | user_id |          order_ts          | total_mount 
----------+---------+----------------------------+-------------
        2 |    1002 | 2023-10-16 11:58:30.510203 |     $128.50
(1 row)

ds_0=# select * from user_info;
 user_id | username 
---------+----------
    1001 | Join
    1002 | Kevin
(2 rows)

ds_0=# select * from order_item_0;
 order_id | user_id | product | vprice 
----------+---------+---------+--------
        3 |    1002 | 牛肉    | $50.00
(1 row)

ds_0=# select * from order_item_1;
 order_id | user_id | product | vprice 
----------+---------+---------+--------
        4 |    1002 | 西红柿  |  $5.00
(1 row)

ds_0=# select * from order_item_2;
 order_id | user_id | product | vprice 
----------+---------+---------+--------
        2 |    1002 | 猪肉    | $30.50
(1 row)

ds_0=#

数据节点2

[postgres@du102 ~]$ psql ds_1
psql (14.7)
Type "help" for help.

ds_1=# \dt
            List of relations
 Schema |     Name     | Type  |  Owner   
--------+--------------+-------+----------
 public | order_info_0 | table | postgres
 public | order_info_1 | table | postgres
 public | order_info_2 | table | postgres
 public | order_item_0 | table | postgres
 public | order_item_1 | table | postgres
 public | order_item_2 | table | postgres
 public | user_info    | table | postgres
(7 rows)

ds_1=# select * from order_info_0;
 order_id | user_id |          order_ts          | total_mount 
----------+---------+----------------------------+-------------
        6 |    1001 | 2023-10-16 11:58:46.403778 |   $5,000.00
(1 row)

ds_1=# select * from order_info_1;
 order_id | user_id |          order_ts          | total_mount 
----------+---------+----------------------------+-------------
        1 |    1001 | 2023-10-16 11:58:25.990573 |      $88.90
(1 row)

ds_1=# select * from order_info_2;
 order_id | user_id |          order_ts          | total_mount 
----------+---------+----------------------------+-------------
        5 |    1001 | 2023-10-16 11:58:41.870171 |     $988.00
(1 row)

ds_1=# select * from order_item_0;
 order_id | user_id | product |  vprice   
----------+---------+---------+-----------
        6 |    1001 | 茅台酒  | $2,500.00
(1 row)

ds_1=# select * from order_item_1;
 order_id | user_id | product | vprice 
----------+---------+---------+--------
        1 |    1001 | 鸡肉    | $23.00
(1 row)

ds_1=# select * from order_item_2;
 order_id | user_id | product | vprice  
----------+---------+---------+---------
        5 |    1001 | 海鲜    | $200.00
(1 row)

ds_1=# 
ds_1=# select * from user_info;
 user_id | username 
---------+----------
    1001 | Join
    1002 | Kevin
(2 rows)

ds_1=#

代理端

sharding_db=> explain analyze select info.*,item.product,item.vprice from order_info info inner join order_item item on info.order_id=item.order_id;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=34.08..63.91 rows=1070 width=64) (actual time=0.040..0.042 rows=1 loops=1)
   Hash Cond: (info.order_id = item.order_id)
   ->  Seq Scan on order_info_0 info  (cost=0.00..25.70 rows=1570 width=24) (actual time=0.007..0.008 rows=1 loops=1)
   ->  Hash  (cost=20.70..20.70 rows=1070 width=44) (actual time=0.015..0.015 rows=1 loops=1)
         Buckets: 2048  Batches: 1  Memory Usage: 17kB
         ->  Seq Scan on order_item_0 item  (cost=0.00..20.70 rows=1070 width=44) (actual time=0.009..0.009 rows=1 loops=1)
 Planning Time: 0.169 ms
 Execution Time: 0.066 ms
(8 rows)

sharding_db=> 
sharding_db=> explain analyze select info.*,item.product,item.vprice from order_info info inner join order_item item on info.user_id=item.user_id;  
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=183.58..314.93 rows=8400 width=64) (actual time=0.129..0.133 rows=1 loops=1)
   Merge Cond: (item.user_id = info.user_id)
   ->  Sort  (cost=74.54..77.21 rows=1070 width=44) (actual time=0.100..0.102 rows=1 loops=1)
         Sort Key: item.user_id
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on order_item_0 item  (cost=0.00..20.70 rows=1070 width=44) (actual time=0.010..0.012 rows=1 loops=1)
   ->  Sort  (cost=109.04..112.96 rows=1570 width=24) (actual time=0.020..0.021 rows=1 loops=1)
         Sort Key: info.user_id
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on order_info_0 info  (cost=0.00..25.70 rows=1570 width=24) (actual time=0.004..0.004 rows=1 loops=1)
 Planning Time: 1.026 ms
 Execution Time: 0.261 ms
(12 rows)

sharding_db=>

可以看出,分库分表的效果已经达到:
1.表order_info和表order_item按照列user_id%2分布于ds_0和ds_1的两个数据库中,在每个数据库内再按照order_id%3分布于3个不同的表中。
2.表user_info定义为广播表,在每个数据库中都有相同的全量数据;
3.当多表关联的时候,以关联键作为表的分布键,可以加速关联查询;

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值