1.节点信息
192.168.10.11 PostgreSQL14.7 数据节点
192.168.10.12 PostgreSQL14.7 数据节点
192.168.10.13 ShardingSphere-proxy 5.4.0
2.准备环境
修改/etc/hosts文件
cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.11 du101
192.168.10.12 du102
192.168.10.13 du103
防火墙设置
systemctl status firewalld.service
systemctl stop firewalld.service
systemctl disable firewalld
SELINUX设置
setenforce 0
vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
设置三节点时间同步
1、时间服务器配置(以192.168.10.13为例)
1)安装ntpd服务
yum install -y ntp
2)配置ntpd.conf
vi /etc/ntp.conf #以下定义是让NTP Server和其自身保持同步,如果在/etc/ntp.conf中定义的server都不可用时,将使用local时间作为ntp服务提供给ntp客户端。
server 127.127.1.0
fudge 127.127.1.0 stratum 10
3)启动ntpd服务
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd
2、数据库节点配置
yum install -y ntpdate
其余节点安装用ntpdate,并从192.168.10.13进行时间同步,
为避免时间出现误差,对数据库主备节点添加crontab任务保持定时同步时间,每分钟同步一次:
crontab -e
*/10 * * * * /sbin/ntpdate -u 192.168.10.13 >> /var/log/ntp.log
检查时间同步情况
cat /var/log/ntp.log
2.数据库安装
搭建好PostgreSQL单机2节点,数据库配置好pg_hba.conf
[postgres@du101 ~]$ tail -5 /data/pgdata/pg_hba.conf
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host all all 0.0.0.0/0 md5
[postgres@du101 ~]$
2数据节点分别创建好对应的数据库
192.168.10.11
postgres=# create database ds_0;
CREATE DATABASE
postgres=#
192.168.10.12
postgres=# create database ds_1;
CREATE DATABASE
postgres=#
3.Proxy部署
ShardingSphere-proxy的运行需要java环境
[root@du103 ~]# yum -y install java
[root@du103 ~]# java -version
openjdk version "1.8.0_332"
OpenJDK Runtime Environment (build 1.8.0_332-b09)
OpenJDK 64-Bit Server VM (build 25.332-b09, mixed mode)
[root@du103 ~]# tar zxvf apache-shardingsphere-5.4.0-shardingsphere-proxy-bin.tar.gz -C /opt/
[root@du103 ~]# mv /opt/apache-shardingsphere-5.4.0-shardingsphere-proxy-bin /opt/apache-shardingsphere-5.4.0
[root@du103 ~]# ll /opt/apache-shardingsphere-5.4.0/conf
total 36
-rw-r--r-- 1 root root 3529 Dec 12 2022 config-encrypt.yaml
-rw-r--r-- 1 root root 3772 Dec 12 2022 config-mask.yaml
-rw-r--r-- 1 root root 3762 Dec 12 2022 config-readwrite-splitting.yaml
-rw-r--r-- 1 root root 5210 Dec 12 2022 config-shadow.yaml
-rw-r--r-- 1 root root 5701 Dec 12 2022 config-sharding.yaml
-rw-r--r-- 1 root root 1715 Dec 12 2022 hbase-db.yaml
-rw-r--r-- 1 root root 3096 Dec 12 2022 server.yaml
[root@du103 ~]#
–以 config- 前缀命名的 YAML 配置文件,即为一个逻辑数据源。
–server.yaml 配置注册中心、认证信息以及公用属性
##保留 config-sharding.yaml
server.yaml 文件其他可以创建bak目录放进去
[root@du103 ~]# mkdir /opt/apache-shardingsphere-5.4.0/conf/bak
[root@du103 ~]# mv /opt/apache-shardingsphere-5.4.0/conf/*.yaml /opt/apache-shardingsphere-5.4.0/conf/bak
[root@du103 ~]# cp /opt/apache-shardingsphere-5.4.0/conf/bak/server.yaml /opt/apache-shardingsphere-5.4.0/conf/
[root@du103 ~]# cp /opt/apache-shardingsphere-5.4.0/conf/bak/config-sharding.yaml /opt/apache-shardingsphere-5.4.0/conf/
[root@du103 ~]#
配制server.yaml文件
server.yaml文件的模板。该配置文件的主要作用是配置前端的认证数据库、用户名和密码, 以及连接相关的属性:包括分布式事务类型、sql日志等。
配置虚拟帐号密码,使用默认的root用户及密码root,且允许外网连接,同时权限指定为“所有权限无限制”。
[root@du103 ~]# vi /opt/apache-shardingsphere-5.4.0/conf/server.yaml
authority:
users:
- user: root@%
password: root
privilege:
type: ALL_PERMITTED
props:
max-connections-size-per-query: 1
kernel-executor-size: 16 # Infinite by default.
proxy-frontend-flush-threshold: 128 # The default value is 128.
sql-show: true
[root@du103 ~]#
参数说明:
max-connections-size-per-query:一次查询请求在每个数据库实例中所能使用的最大连接数,默认为1;
kernel-executor-size:用于设置任务处理线程池的大小。每个 ShardingSphereDataSource 使用一个独立的线程池,同一个 JVM 的不同数据源不共享线程池。默认为infinite;
proxy-frontend-flush-threshold:在 ShardingSphere-Proxy 中设置传输数据条数的 IO 刷新阈值。默认128;
sql-show:是否在日志中打印 SQL。打印 SQL 可以帮助开发者快速定位系统问题。日志内容包含:逻辑 SQL,真实 SQL 和 SQL 解析结果。如果开启配置,日志将使用 Topic ShardingSphere-SQL,日志级别是 INFO。
修改数据分片配置文件
config-sharding.yaml文件,该文件主要作用是配置后端与postgres数据库的连接属性,分库分表规则等。
本次分片示例表order_info和order_item数据分两个库,表分为3片,数据库分片键为user_id,值按2取余,表分片键为order_id,值按3取余。
并且包含广播表user_info
[root@du103 ~]# vi /opt/apache-shardingsphere-5.4.0/conf/config-sharding.yaml
databaseName: sharding_db
dataSources:
ds_0:
url: jdbc:postgresql://192.168.10.11:1921/ds_0
username: postgres
password: postgres
ds_1:
url: jdbc:postgresql://192.168.10.12:1921/ds_1
username: postgres
password: postgres
rules:
- !SHARDING
tables:
order_info:
actualDataNodes: ds_${0..1}.order_info_${0..2}
tableStrategy:
standard:
shardingColumn: order_id
shardingAlgorithmName: order_info_inline
order_item:
actualDataNodes: ds_${0..1}.order_item_${0..2}
tableStrategy:
standard:
shardingColumn: order_id
shardingAlgorithmName: order_item_inline
bindingTables:
- order_info,order_item
defaultDatabaseStrategy:
standard:
shardingColumn: user_id
shardingAlgorithmName: ds_inline
shardingAlgorithms:
ds_inline:
type: INLINE
props:
algorithm-expression: ds_${user_id % 2}
order_info_inline:
type: INLINE
props:
algorithm-expression: order_info_${order_id % 3}
order_item_inline:
type: INLINE
props:
algorithm-expression: order_item_${order_id % 3}
- !BROADCAST
tables:
- user_info
[root@du103 ~]#
参数涵义请自行查看shardingsphere配置
https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-jdbc/yaml-config/rules/sharding/
https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-jdbc/yaml-config/data-source/
启动ShardingSphere
启动shardingsphere之前需要启动后端数据库
[root@du103 ~]# ll /opt/apache-shardingsphere-5.4.0/bin/
total 16
-rwxr-xr-x 1 root root 3313 Dec 12 2022 start.bat
-rwxr-xr-x 1 root root 7492 Dec 12 2022 start.sh **--启动脚本**
-rwxr-xr-x 1 root root 1455 Dec 12 2022 stop.sh **--关闭脚本**
[root@du103 ~]#
[root@du103 ~]# sh /opt/apache-shardingsphere-5.4.0/bin/start.sh
/usr/bin/java
we find java version: java8, full_version=1.8.0_332, full_path=/usr/bin/java
The classpath is /opt/apache-shardingsphere-5.4.0/conf:/opt/apache-shardingsphere-5.4.0/conf:.:/opt/apache-shardingsphere-5.4.0/lib/*:/opt/apache-shardingsphere-5.4.0/ext-lib/*
main class org.apache.shardingsphere.proxy.Bootstrap -1 /opt/apache-shardingsphere-5.4.0/conf 0.0.0.0 false
Starting the ShardingSphere-Proxy ...Please check the STDOUT file: /opt/apache-shardingsphere-5.4.0/logs/stdout.log
[root@du103 ~]#
查看shardingsphere日志:
[root@du103 ~]# tail -f /opt/apache-shardingsphere-5.4.0/logs/stdout.log
[INFO ] 2023-10-16 11:21:18.184 [main] o.a.s.d.p.c.l.PipelineContextManagerLifecycleListener - mode type is not Cluster, mode type='Standalone', ignore
[INFO ] 2023-10-16 11:21:18.232 [main] o.a.s.p.v.ShardingSphereProxyVersion - Database name is `PostgreSQL`, version is `14.7`, database name is `sharding_db`
[INFO ] 2023-10-16 11:21:18.234 [main] o.a.s.p.frontend.ssl.ProxySSLContext - Proxy frontend SSL/TLS is not enabled.
[INFO ] 2023-10-16 11:21:18.385 [main] o.a.s.p.frontend.ShardingSphereProxy - ShardingSphere-Proxy Standalone mode started successfully
4.测试
代理端
shardingsphere默认端口为3307,前端数据库为sharding_db,在## server.yaml文件中有相关配置
[postgres@du102 ~]$ psql -d sharding_db -h 192.168.10.13 -p 3307 -U root
Password for user root:
psql (14.7)
Type "help" for help.
sharding_db=> create table order_info(order_id int primary key, user_id int, order_ts timestamp, total_mount money);
CREATE TABLE
sharding_db=> insert into order_info values(1,1001,now(),88.90);
INSERT 0 1
sharding_db=> insert into order_info values(2,1002,now(),128.50);
INSERT 0 1
sharding_db=> insert into order_info values(3,1002,now(),228.70);
INSERT 0 1
sharding_db=> insert into order_info values(4,1002,now(),23.80);
INSERT 0 1
sharding_db=> insert into order_info values(5,1001,now(),988.00);
INSERT 0 1
sharding_db=> insert into order_info values(6,1001,now(),5000.00);
INSERT 0 1
sharding_db=> create table order_item(order_id int primary key, user_id int, product varchar, vprice money);
CREATE TABLE
sharding_db=> insert into order_item values(1,1001,'鸡肉',23.00);
INSERT 0 1
sharding_db=> insert into order_item values(2,1002,'猪肉',30.50);
INSERT 0 1
sharding_db=> insert into order_item values(3,1002,'牛肉',50.00);
INSERT 0 1
sharding_db=> insert into order_item values(4,1002,'西红柿',5.00);
INSERT 0 1
sharding_db=> insert into order_item values(5,1001,'海鲜',200.00);
INSERT 0 1
sharding_db=> insert into order_item values(6,1001,'茅台酒',2500.00);
INSERT 0 1
sharding_db=> create table user_info(user_id int primary key,username varchar);
CREATE TABLE
sharding_db=> insert into user_info values(1001,'Join');
INSERT 0 1
sharding_db=> insert into user_info values(1002,'Kevin');
INSERT 0 1
sharding_db=>
数据节点1
[postgres@du101 ~]$ psql ds_0
psql (14.7)
Type "help" for help.
ds_0=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------+-------+----------
public | order_info_0 | table | postgres
public | order_info_1 | table | postgres
public | order_info_2 | table | postgres
public | order_item_0 | table | postgres
public | order_item_1 | table | postgres
public | order_item_2 | table | postgres
public | user_info | table | postgres
(7 rows)
ds_0=# select * from order_info_0;
order_id | user_id | order_ts | total_mount
----------+---------+----------------------------+-------------
3 | 1002 | 2023-10-16 11:58:34.319382 | $228.70
(1 row)
ds_0=# select * from order_info_1;
order_id | user_id | order_ts | total_mount
----------+---------+----------------------------+-------------
4 | 1002 | 2023-10-16 11:58:37.807301 | $23.80
(1 row)
ds_0=# select * from order_info_2;
order_id | user_id | order_ts | total_mount
----------+---------+----------------------------+-------------
2 | 1002 | 2023-10-16 11:58:30.510203 | $128.50
(1 row)
ds_0=# select * from user_info;
user_id | username
---------+----------
1001 | Join
1002 | Kevin
(2 rows)
ds_0=# select * from order_item_0;
order_id | user_id | product | vprice
----------+---------+---------+--------
3 | 1002 | 牛肉 | $50.00
(1 row)
ds_0=# select * from order_item_1;
order_id | user_id | product | vprice
----------+---------+---------+--------
4 | 1002 | 西红柿 | $5.00
(1 row)
ds_0=# select * from order_item_2;
order_id | user_id | product | vprice
----------+---------+---------+--------
2 | 1002 | 猪肉 | $30.50
(1 row)
ds_0=#
数据节点2
[postgres@du102 ~]$ psql ds_1
psql (14.7)
Type "help" for help.
ds_1=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------+-------+----------
public | order_info_0 | table | postgres
public | order_info_1 | table | postgres
public | order_info_2 | table | postgres
public | order_item_0 | table | postgres
public | order_item_1 | table | postgres
public | order_item_2 | table | postgres
public | user_info | table | postgres
(7 rows)
ds_1=# select * from order_info_0;
order_id | user_id | order_ts | total_mount
----------+---------+----------------------------+-------------
6 | 1001 | 2023-10-16 11:58:46.403778 | $5,000.00
(1 row)
ds_1=# select * from order_info_1;
order_id | user_id | order_ts | total_mount
----------+---------+----------------------------+-------------
1 | 1001 | 2023-10-16 11:58:25.990573 | $88.90
(1 row)
ds_1=# select * from order_info_2;
order_id | user_id | order_ts | total_mount
----------+---------+----------------------------+-------------
5 | 1001 | 2023-10-16 11:58:41.870171 | $988.00
(1 row)
ds_1=# select * from order_item_0;
order_id | user_id | product | vprice
----------+---------+---------+-----------
6 | 1001 | 茅台酒 | $2,500.00
(1 row)
ds_1=# select * from order_item_1;
order_id | user_id | product | vprice
----------+---------+---------+--------
1 | 1001 | 鸡肉 | $23.00
(1 row)
ds_1=# select * from order_item_2;
order_id | user_id | product | vprice
----------+---------+---------+---------
5 | 1001 | 海鲜 | $200.00
(1 row)
ds_1=#
ds_1=# select * from user_info;
user_id | username
---------+----------
1001 | Join
1002 | Kevin
(2 rows)
ds_1=#
代理端
sharding_db=> explain analyze select info.*,item.product,item.vprice from order_info info inner join order_item item on info.order_id=item.order_id;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=34.08..63.91 rows=1070 width=64) (actual time=0.040..0.042 rows=1 loops=1)
Hash Cond: (info.order_id = item.order_id)
-> Seq Scan on order_info_0 info (cost=0.00..25.70 rows=1570 width=24) (actual time=0.007..0.008 rows=1 loops=1)
-> Hash (cost=20.70..20.70 rows=1070 width=44) (actual time=0.015..0.015 rows=1 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 17kB
-> Seq Scan on order_item_0 item (cost=0.00..20.70 rows=1070 width=44) (actual time=0.009..0.009 rows=1 loops=1)
Planning Time: 0.169 ms
Execution Time: 0.066 ms
(8 rows)
sharding_db=>
sharding_db=> explain analyze select info.*,item.product,item.vprice from order_info info inner join order_item item on info.user_id=item.user_id;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=183.58..314.93 rows=8400 width=64) (actual time=0.129..0.133 rows=1 loops=1)
Merge Cond: (item.user_id = info.user_id)
-> Sort (cost=74.54..77.21 rows=1070 width=44) (actual time=0.100..0.102 rows=1 loops=1)
Sort Key: item.user_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on order_item_0 item (cost=0.00..20.70 rows=1070 width=44) (actual time=0.010..0.012 rows=1 loops=1)
-> Sort (cost=109.04..112.96 rows=1570 width=24) (actual time=0.020..0.021 rows=1 loops=1)
Sort Key: info.user_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on order_info_0 info (cost=0.00..25.70 rows=1570 width=24) (actual time=0.004..0.004 rows=1 loops=1)
Planning Time: 1.026 ms
Execution Time: 0.261 ms
(12 rows)
sharding_db=>
可以看出,分库分表的效果已经达到:
1.表order_info和表order_item按照列user_id%2分布于ds_0和ds_1的两个数据库中,在每个数据库内再按照order_id%3分布于3个不同的表中。
2.表user_info定义为广播表,在每个数据库中都有相同的全量数据;
3.当多表关联的时候,以关联键作为表的分布键,可以加速关联查询;