数据库分库分表（MyCat，ShardingSphere）

最新推荐文章于 2024-08-25 18:45:07 发布

Aizen_Sousuke

最新推荐文章于 2024-08-25 18:45:07 发布

阅读量820

点赞数

分类专栏：一些问题数据库分布式

本文链接：https://blog.csdn.net/Aizen_Sousuke/article/details/103820128

版权

一些问题同时被 3 个专栏收录

17 篇文章 0 订阅

订阅专栏

分布式

17 篇文章 0 订阅

订阅专栏

数据库

8 篇文章 0 订阅

订阅专栏

分库分表原则

数据库切分根据其切分类型，可以分为两种方式：垂直切分和水平切分

数据库三大范式

范式是关系数据库理论的基础，也是我们设计数据库结构过程中索要遵循的规范和指导方法，常用的三大范式

第一范式（1NF）
强调的是列的原子性，即列不能再细化分成其他多列
第二范式（2NF）
除了符合第一范式外，还有两个部分。
一：表中必须有一个主键
二：没有包含在主键中得列，必须完全依赖于主键。
第三范式（3NF）
除了符合第二范式外，还要符合非主键必须直接依赖主键，而不是依赖于其他非主键的列，不存在依赖传递。

垂直（纵向）切分（专库专表）

垂直分切常见有垂直分库和垂直分表两种。

垂直分库就是根据业务耦合性，将关联度低的不同表存储在不同的数据库。

垂直分表是根据数据库中的列进行，某个表字段较多，将不常用或者字段长度较大的字段拆分到扩展表中。

水平（横向）切分（同一数据多个表或库）

水平切分分为库内分表和分库分表

根据表内数据内在的逻辑关系，将同一个表按不同的条件分散到多个数据库或多个表中，每个表中只包含一部分数据，从而使得单个表的数据量变小，达到分布式的效果

库内分表只解决了单一表数据量过大的问题，但没有将表分布到不同机器的库上，因此对于减轻MySQL数据库的压力来说，帮助不是很大，大家还是竞争同一个物理机的CPU、内存、网络IO，最好通过分库分表来解决

水平切分后，同一张表会出现在多个数据库或表中，几个库或表的内容不同。几种典型的数据分片规则：
根据数值范围
按照时间区间或ID区分来切分。
根据数值取模
一般采用hash取模的分切方式。例如：将表根据某个字段切分到四份库中，余数为0放第一个库，余数为1放到第二个，以此类推。如果查询条件带有该字段，则可明确定位到相应库去查询

分库分表带来的问题

事务一致性问题
在提交事务时需要协调多个节点，推后了提交事务的时间点，延长了事务的执行时间。导致事务在访问共享资源时发生冲突或死锁的概率增高
跨界点关联查询join问题
数据可能分布在不同的节点上，此时join带来的问题就比较麻烦了，考虑到性能，尽量避免使用join查询。
跨节点分页、排序、函数问题
跨节点多库进行查询时，会出现limit分页、order by排序等问题
需要先在不同的分片节点中将数据进行排序并返回，然后将不同分片返回的结果集进行汇总和再次排序，最终返回给用户
如果取得页数很大，情况则变得复杂很多，因为各分片节点中的数据可能是随机的，为了排序的准确性，需要将所有节点的前N页数据都排序好做合并，最后再进行整体的排序，这样的操作时很耗费CPU和内存资源的，所以页数越大，系统的性能也会越差
全局主键避重问题
在分库分表环境中，由于表中数据同时存在不同数据库中，主键值平时使用的自增长将无用武之地，某个分区数据库自生成的ID无法保证全局唯一。因此需要单独设计全局主键，以避免跨库主键重复问题。
数据迁移、扩容问题

MyCat（proxy模式）

在这里插入图片描述

下载mycat-server

修改配置文件

server.xml

<!--mycat的账号密码-->
<user name="mycat">
	<property name="password">mycat</property>
	<!--添加两个mycat逻辑库：user,pay-->
	<property name="schemas">user,pay</property>
</user>

连接方式和mysql相同，默认的 Mycat 端口是 8066

schema.xml

<!-- user对应的逻辑库,userNode逻辑库对应的分片 -->
<schema name="user" checkSQLschema="false" sqlMaxLimit="100" dataNode="userNode" />
<!-- userDB对应的数据库，host 实际的物理库配置地址 -->
<dataNode name="userNode" dataHost="host" database="userDB" />



<schema name="pay"  checkSQLschema="false" sqlMaxLimit="100" dataNode="pay" >
   <!-- 指定order表 所在的数据库位置 -->
   <table name="order" dataNode="pay1,pay2" rule="rule1"/>
</schema>
<dataNode name="pay1" dataHost="host" database="pay1" />
<dataNode name="pay2" dataHost="host" database="pay2" />
<!-- 
	balance =0：不开启读写分离，所有读操作都发送到当前可用的writeHost 
	balance =1：双主双从模式，只有一台writeHost写，其他的全部参与读操作的负载均衡
	balance =2：所有读操作随机发生再 读主机和写主机 上
	balance =3：所有读请求随机发送到 readHost 
-->
<dataHost name="host" maxCon="1000" minCon="10" balance="0"
   writeType="0" dbType="mysql" dbDriver="native">
   <!-- 通过向数据库执行select 1这个简单的sql语句判断是否保持链接。 -->
   <heartbeat>select 1</heartbeat>
   <!-- 写主机 -->
   <writeHost host="hostM1" url="192.168.0.2:3306" user="root" password="root" >
   		<!-- 读主机 -->
    	<readHost host="hostS2" url="192.168.0.3:3306" user="root" password="root" />
   </writeHost>
   <writeHost host="hostM2" url="192.168.0.4:3306" user="root" password="root" />
</dataHost>

table ：实际表配置
- name：表名称
- dataNode：dataNode分区名称
- rule：对应规则名称
dataNode：分区信息
- name：dataNode分区名称
- dataHost：dataHost名称
- database：指定连接的物理库名称
dataHost：数据库信息
- name：dataHost名称

rule.xml

<mycat:rule  xmlns:mycat="http://org.opencloudb/">
  <tableRule name="sharding-by-hour">
    <rule>
      <columns>createTime</columns>
      <algorithm>sharding-by-hour</algorithm>
    </rule>
  </tableRule>
  
  <function name="sharding-by-hour" class="org.opencloudb.route.function.LatestMonthPartion">
    <property name="splitOneDay">24</property>
  </function>
   
</mycat:rule >

tableRule

name 为schema.xml 中table 标签中对应的 rule=“sharding-by-hour” ,也就是配置表的分片规则，
columns 是表的切分字段： createTime 创建日期。
algorithm 是规则对应的切分规则：映射到function 的name。

function

function 配置是分片规则的配置。
name 为切分规则的名称，名字任意取，但是需要与tableRule 中匹配。
class 是切分规则对应的切分类，写死，需要哪种规则则配置哪种，例如本例子是按小时分片：org.opencloudb.route.function.LatestMonthPartion
property 标签是切分规则对应的不同属性，不同的切分规则配置不同。

ER表：分表通过创建关联查询

<table name="customer" dataNode="dn1,dn2" rule="sharding-by-intfile">
	<!--
		customer主表，orders子表
		name：子表的名称
		joinKey：子表的字段
		parentKey：主表被关联的字段
		primaryKey：子表的主键
	-->
	<childTable name="orders" joinKey="customer_id" parentKey="id" primaryKey="id"/>
</table>

全局序列：数据库配置方式

创建 MYCAT_SEQUENCE 表存放序列号，当作插入数据的主键

name 序列号名称
current_value 当前 value
increment 增长步长! 可理解为 mycat 在数据库中一次读取多少个 sequence. 当这些用完后, 下次再从数据库中读取。

第一步创建表

创建表

CREATE TABLE MYCAT_SEQUENCE (name VARCHAR(50) NOT NULL,current_value INT NOT NULL,increment INT NOT NULL DEFAULT 100, PRIMARY KEY(name)) ENGINE=InnoDB;

插入数据

<!--设置序列号从100000开始，每次重新获取步长为100-->
INSERT INTO MYCAT_SEQUENCE(name,current_value,increment) VALUES (‘GLOBAL’, 100000, 100);

创建三个相关函数

1. 获取当前 sequence 的值 (返回当前值,增量)
DROP FUNCTION IF EXISTS mycat_seq_currval;
DELIMITER
CREATE FUNCTION mycat_seq_currval(seq_name VARCHAR(50)) RETURNS varchar(64) CHARSET utf-8
DETERMINISTIC
BEGIN
DECLARE retval VARCHAR(64);
SET retval=“-999999999,null”;
SELECT concat(CAST(current_value AS CHAR),“,”,CAST(increment AS CHAR)) INTO retval FROM
MYCAT_SEQUENCE WHERE name = seq_name;
RETURN retval;
END
DELIMITER;

2. 设置 sequence 值
DROP FUNCTION IF EXISTS mycat_seq_setval;
DELIMITER
CREATE FUNCTION mycat_seq_setval(seq_name VARCHAR(50),value INTEGER) RETURNS varchar(64) CHARSET utf-8
DETERMINISTIC
BEGIN
UPDATE MYCAT_SEQUENCE
SET current_value = value
WHERE name = seq_name;
RETURN mycat_seq_currval(seq_name);
END
DELIMITER;

3. 获取下一个 sequence 值
DROP FUNCTION IF EXISTS mycat_seq_nextval;
DELIMITER
CREATE FUNCTION mycat_seq_nextval(seq_name VARCHAR(50)) RETURNS varchar(64) CHARSET utf-8
DETERMINISTIC
BEGIN
UPDATE MYCAT_SEQUENCE
SET current_value = current_value + increment WHERE name = seq_name;
RETURN mycat_seq_currval(seq_name);
END
DELIMITER;

第二步修改配置

sequence_db_conf.properties

<!--
	USER_SEQ：schema.xml中，schema 的name或table的name
	test_dn1：dataNode 名称
-->
USER_SEQ=test_dn1

server.xml

<system><property name="sequnceHandlerType">1</property></system>
<!--
	sequnceHandlerType 需要配置为 1，表示使用数据库方式生成 sequence。
-->

插入数据时

<!-- 通过next value for MYCATSEQ_GLOBAL获取序列号当主键-->
insert into table(id,name) values(next value for MYCATSEQ_GLOBAL,‘test’)

ShardingSphere（推荐）

有三款产品

ShardingSphere-JDBC
- 使用客户端直连数据库，以 jar 包形式提供服务，无需额外部署和依赖，可理解为增强版的 JDBC 驱动
ShardingSphere-Proxy
- 透明化的数据库代理端，可直接当做 MySQL/PostgreSQL 使用。
ShardingSphere-Sidecar（TODO）

ShardingSphere-JDBC

在这里插入图片描述
sharding-JDBC不是做分库分表，目的是用来简化对分库分表之后数据相关操作：数据分片和读写分离

水平分表

springboot + mybaitsPlus + Sharding-JDBC + Druid连接池
根据存入数据id是否为偶数去分表

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
        </dependency>

        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <!--连接池-->
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid-spring-boot-starter</artifactId>
            <version>1.1.20</version>
        </dependency>
        <!--mysql依赖-->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
        </dependency>
        <!--sharding依赖-->
        <dependency>
            <groupId>org.apache.shardingsphere</groupId>
            <artifactId>sharding-jdbc-spring-boot-starter</artifactId>
            <version>4.0.0-RC1</version>
        </dependency>
        <!--mybatisPlus-->
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-boot-starter</artifactId>
            <version>3.2.0</version>
        </dependency>
    </dependencies>

创建数据库course_db
创建表course_1和course_2，分别存入id为偶数和id为奇数的数据

CREATE TABLE `course_1` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT '主键',
  `c_name` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL DEFAULT '' COMMENT '名称',
  `user_id`  int(11) NOT NULL DEFAULT '0'  COMMENT 'userid',
  `status` smallint(1) NOT NULL DEFAULT '0' COMMENT '状态（0正常 1冻结）',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB  DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=DYNAMIC;

application.properties

# 配置真实数据源
spring.shardingsphere.datasource.names=ds0
# 配置数据源
spring.shardingsphere.datasource.ds0.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.ds0.driver-class-name=com.mysql.cj.jdbc.Driver
spring.shardingsphere.datasource.ds0.url=jdbc:mysql://localhost:3306/course_db?serverTimezone=GMT%2B8
spring.shardingsphere.datasource.ds0.username=root
spring.shardingsphere.datasource.ds0.password=123456

# 配置 course表规则,
#其中course为自定义名称
#行表达式标识符 $->{...}，$是占位符，括号里边的是值
#表示ds0数据库中的course_1和course_2
spring.shardingsphere.sharding.tables.course.actual-data-nodes=ds0.course_$->{1..2}

#指定course表里面主键字段，及主键生成策略
spring.shardingsphere.sharding.tables.course.key-generator.column=id
spring.shardingsphere.sharding.tables.course.key-generator.type=SNOWFLAKE

# 配置分表策略
#指定分区字段
spring.shardingsphere.sharding.tables.course.table-strategy.inline.sharding-column=id
#分区字段取模，能被2整除等于0，否则等于1，+1 后等于1和2，
spring.shardingsphere.sharding.tables.course.table-strategy.inline.algorithm-expression=course_$->{id%2 + 1}

#显示sql
spring.shardingsphere.props.sql.show=true

策略配置5X官方写法

# 配置分表策略
spring.shardingsphere.rules.sharding.tables.t_order.table-strategy.standard.sharding-column=order_id
spring.shardingsphere.rules.sharding.tables.t_order.table-strategy.standard.sharding-algorithm-name=table_inline
spring.shardingsphere.rules.sharding.sharding-algorithms.table_inline.type=INLINE
spring.shardingsphere.rules.sharding.sharding-algorithms.table_inline.props.algorithm-expression=t_order_${order_id % 2}

代码中使用一个实体类Course，对应数据库两张表（course_1,course_1）会报错

Consider renaming one of the enabling overriding by setting spring.main.allow-bean-definition-overriding=true

按照提示在配置文件中设置

spring.main.allow-bean-definition-overriding=true

因为配置中已经设置了主键生成策略是雪花算法，所以插入数据时，主键都会根据雪花算法计算出来。
查询也会根据
代码

---------- Course --------------------
@Data
public class Course {
    private Long id;
    private String cName;
    private Integer userId;
    private Integer status;
}

---------- CourseMapper --------------------
@Repository
public interface CourseMapper extends BaseMapper<Course> {
}

----------- ShradingjdbcApplication --------------------
@SpringBootApplication
@MapperScan("com.example.shradingjdbc.mapper")
public class ShradingjdbcApplication {

    public static void main(String[] args) {
        SpringApplication.run(ShradingjdbcApplication.class, args);
    }

}

------------------ShradingjdbcApplicationTests --------------------
@SpringBootTest
class ShradingjdbcApplicationTests {

    @Autowired
    CourseMapper courseMapper;
    @Test
    void contextLoads() {
        for (int i = 0; i < 10; i++) {
            Course course = new Course();
            course.setCName("ceshi+" + i);
            course.setUserId(i);
            courseMapper.insert(course);
        }
    }

    @Test
    void test1(){
        List<Course> courses = courseMapper.selectList(new QueryWrapper<>());
        for (Course cours : courses) {
            System.out.println(cours);
        }
    }
}

水平分库

两个数据库ds0,ds1，每个数据库都有两个表t_order0，t_order1

# 配置真实数据源
spring.shardingsphere.datasource.names=ds0,ds1

# 配置第 1 个数据源
spring.shardingsphere.datasource.ds0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds0.jdbc-url=jdbc:mysql://localhost:3306/ds0
spring.shardingsphere.datasource.ds0.username=root
spring.shardingsphere.datasource.ds0.password=

# 配置第 2 个数据源
spring.shardingsphere.datasource.ds1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds1.jdbc-url=jdbc:mysql://localhost:3306/ds1
spring.shardingsphere.datasource.ds1.username=root
spring.shardingsphere.datasource.ds1.password=

# 配置 t_order 表规则
spring.shardingsphere.rules.sharding.tables.t_order.actual-data-nodes=ds$->{0..1}.t_order$->{0..1}

# 配置分库策略
spring.shardingsphere.rules.sharding.tables.t_order.database-strategy.standard.sharding-column=user_id
spring.shardingsphere.rules.sharding.tables.t_order.database-strategy.standard.sharding-algorithm-name=database_inline

# 配置分表策略
spring.shardingsphere.rules.sharding.tables.t_order.table-strategy.standard.sharding-column=order_id
spring.shardingsphere.rules.sharding.tables.t_order.table-strategy.standard.sharding-algorithm-name=table_inline

# 省略配置 t_order_item 表规则...
# ...

# 配置 分片算法
spring.shardingsphere.rules.sharding.sharding-algorithms.database_inline.type=INLINE
spring.shardingsphere.rules.sharding.sharding-algorithms.database_inline.props.algorithm-expression=ds_${user_id % 2}
spring.shardingsphere.rules.sharding.sharding-algorithms.table_inline.type=INLINE
spring.shardingsphere.rules.sharding.sharding-algorithms.table_inline.props.algorithm-expression=t_order_${order_id % 2}

垂直分库（专库专表）

所有user信息存储在user_db库的t_user表中

# 配置真实数据源
spring.shardingsphere.datasource.names=ds0
# 配置数据源
spring.shardingsphere.datasource.ds0.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.ds0.driver-class-name=com.mysql.cj.jdbc.Driver

spring.shardingsphere.datasource.ds0.url=jdbc:mysql://localhost:3306/user_db

spring.shardingsphere.datasource.ds0.username=root
spring.shardingsphere.datasource.ds0.password=12356

# 配置 t_user表规则,
spring.shardingsphere.sharding.tables.t_user.actual-data-nodes=ds0.t_user
#指定course表里面主键字段，及主键生成策略
spring.shardingsphere.sharding.tables.t_user.key-generator.column=id
spring.shardingsphere.sharding.tables.t_user.key-generator.type=SNOWFLAKE

# 配置分表策略
#指定分区字段
spring.shardingsphere.sharding.tables.t_user.table-strategy.inline.sharding-column=id
#分表策略
spring.shardingsphere.sharding.tables.t_user.table-strategy.inline.algorithm-expression=t_user

操作公共表

每个数据库都创建一个相同的表t_dict
公共表三个字段：主键id，类型key，类型值value

#设置公共表
spring.shardingsphere.sharding.broadcast-tables=t_dict
#主键生成策略
spring.shardingsphere.sharding.tables.t_dict.key-generator.column=id
spring.shardingsphere.sharding.tables.t_dict.key-generator.type=SNOWFLAKE

增删改数据会向每个数据库的t_dict表中增删改数据

读写分离

spring.shardingsphere.datasource.names=m0,s0
# 配置第 1 个数据源
spring.shardingsphere.datasource.m0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.m0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.m0.jdbc-url=jdbc:mysql://localhost:3306/m0
spring.shardingsphere.datasource.m0.username=root
spring.shardingsphere.datasource.m0.password=
# 配置第 2 个数据源
spring.shardingsphere.datasource.s0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.s0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.s0.jdbc-url=jdbc:mysql://localhost:3306/s0
spring.shardingsphere.datasource.s0.username=root
spring.shardingsphere.datasource.s0.password=
#声明主从，并声明数据源为ds0
spring.shardingsphere.sharding.master-slave-rules.ds0.master-data-source-name=m0
spring.shardingsphere.sharding.master-slave-rules.ds0.slvave-data-source-name=s0
#假设只复制t_user，分表固定分配在t_user表
spring.shardingsphere.sharding.tables.t_user.actual-data-nodes=ds0.t_user

ShardingSphere-Proxy

在这里插入图片描述

下载解压，修改配置文件

server.yaml取消注释即可

authentication:
  users:
    root:
      password: root
    sharding:
      password: sharding 
      authorizedSchemas: sharding_db

props:
  max-connections-size-per-query: 1
  acceptor-size: 16  # The default value is available processors count * 2.
  executor-size: 16  # Infinite by default.
  proxy-frontend-flush-threshold: 128  # The default value is 128.
    # LOCAL: Proxy will run with LOCAL transaction.
    # XA: Proxy will run with XA transaction.
    # BASE: Proxy will run with B.A.S.E transaction.
  proxy-transaction-type: LOCAL
  proxy-opentracing-enabled: false
  proxy-hint-enabled: false
  query-with-cipher-column: true
  sql-show: false
  check-table-metadata-enabled: false

config-sharding.yaml
在这里插入图片描述

复制mysql-connector-java-8.0.18.jar到lib目录
取消注释

  schemaName: sharding_db

  dataSourceCommon:
    username: root
    password:
    connectionTimeoutMilliseconds: 30000
    idleTimeoutMilliseconds: 60000
    maxLifetimeMilliseconds: 1800000
    maxPoolSize: 50
    minPoolSize: 1
    maintenanceIntervalMilliseconds: 30000

  dataSources:
    ds_0:
      url: jdbc:mysql://127.0.0.1:3306/demo_ds_0?serverTimezone=UTC&useSSL=false
    ds_1:
      url: jdbc:mysql://127.0.0.1:3306/demo_ds_1?serverTimezone=UTC&useSSL=false

  rules:
  - !SHARDING
    tables:
      t_order:
        actualDataNodes: ds_${0..1}.t_order_${0..1}
        tableStrategy:
          standard:
            shardingColumn: order_id
            shardingAlgorithmName: t_order_inline
        keyGenerateStrategy:
          column: order_id
          keyGeneratorName: snowflake
      t_order_item:
        actualDataNodes: ds_${0..1}.t_order_item_${0..1}
        tableStrategy:
          standard:
            shardingColumn: order_id
            shardingAlgorithmName: t_order_item_inline
        keyGenerateStrategy:
          column: order_item_id
          keyGeneratorName: snowflake
    bindingTables:
      - t_order,t_order_item
    defaultDatabaseStrategy:
      standard:
        shardingColumn: user_id
        shardingAlgorithmName: database_inline
    defaultTableStrategy:
      none: