Sharding a Short-Link Service: Scaling Out Without Data Migration (Beginner Series, Part 3, Ongoing)

Author: 是小王同学啊~

Last modified: 2022-01-18 22:43:36

Category: Short Links. Tags: hash algorithm, migration-free scaling, ShardingJDBC sharding

First published: 2021-12-22 22:09:08

Article link: https://blog.csdn.net/wnn654321/article/details/122073980

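Question 4: How large is the data volume, and do we need to shard databases and tables?

Question 5: If we shard, which column is the PartitionKey, and what strategy do we use?

Question 6: If we shard, how does a short-link lookup know which database and which table to hit?

Question 4: How large is the data volume, and do we need to shard databases and tables?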

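Suppose an evaluation puts the short-link service at close to 10 billion records over 3 years. A rough estimate:

	Daily active users in the first year: 100,000
	New short links per day in the first year: 100,000 * 50 (each user creates 50 short links a day) = 5 million
	New short links per year: 5 million * 365 days ≈ 1.825 billion
	Rounding up generously to leave room for growth, plan for 10 billion records to cover 3 years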

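By this estimate, sharding is needed. In typical sharding scenarios, a single table is kept at roughly 10 million rows.
That suggests 16 databases with 64 tables each, 1024 tables in total (10 billion / 1024 ≈ 9.77 million rows per table).
Sharding key: the short-link code, i.e. the code column.
For example, the short-link code of g1.fit/92AEva is 92AEva.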
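The sizing arithmetic above can be double-checked with a few lines of Java. This is only a sketch: the class name `CapacityEstimate` and its constants are made up for illustration, and the input figures are the ones from the estimate.

```java
// Hypothetical helper, for illustration only: re-derives the sizing figures from the text.
class CapacityEstimate {
    static final long DAILY_ACTIVE_USERS = 100_000L;      // first-year daily active users
    static final long LINKS_PER_USER_PER_DAY = 50L;       // short links created per user per day
    static final long PLANNED_ROWS = 10_000_000_000L;     // 3-year planning ceiling
    static final int DATABASES = 16;
    static final int TABLES_PER_DATABASE = 64;

    static long newLinksPerDay() {
        return DAILY_ACTIVE_USERS * LINKS_PER_USER_PER_DAY;
    }

    static long newLinksPerYear() {
        return newLinksPerDay() * 365;
    }

    static int totalTables() {
        return DATABASES * TABLES_PER_DATABASE;
    }

    // Average rows per table if the planned volume were spread evenly.
    static long rowsPerTable() {
        return PLANNED_ROWS / totalTables();
    }

    public static void main(String[] args) {
        System.out.println("new links per day:  " + newLinksPerDay());
        System.out.println("new links per year: " + newLinksPerYear());
        System.out.println("total tables:       " + totalTables());
        System.out.println("rows per table:     " + rowsPerTable());
    }
}
```

With these inputs each table ends up just under the 10-million-row guideline.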

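Question 5: If we shard, which column is the PartitionKey, and what strategy do we use?

A sharding algorithm commonly used in the industry: hash the short-link code and take the modulus
        database ID = hash(short-link code) % number of databases
        table ID = hash(short-link code) / number of databases % number of tables
Advantages of this approach:
        Data is spread fairly evenly across databases and tables, which effectively avoids hot spots, and the sharding rule is clear and easy to understand

Disadvantages: scaling out is inconvenient, because changing the number of databases or tables changes the modulus results, so existing data has to be migrated
All 16 databases with 64 tables each (1024 tables in total) must be created up front, which wastes resources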
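To make the migration problem concrete, here is a minimal sketch of the hash-modulo rule. The article does not say which hash function is used, so `String.hashCode()` (masked to a non-negative value) is a stand-in chosen for this example, and the class name `HashModRouter` is hypothetical.

```java
// Illustrative sketch of hash-modulo routing. String.hashCode() is a stand-in
// hash chosen for this example; the article does not specify the hash function.
class HashModRouter {
    private final int dbCount;
    private final int tableCount;

    HashModRouter(int dbCount, int tableCount) {
        this.dbCount = dbCount;
        this.tableCount = tableCount;
    }

    // Treat the 32-bit hash as unsigned so the modulus is never negative.
    static long hash(String code) {
        return code.hashCode() & 0xffffffffL;
    }

    // database ID = hash % number of databases
    int dbId(String code) {
        return (int) (hash(code) % dbCount);
    }

    // table ID = hash / number of databases % number of tables
    int tableId(String code) {
        return (int) (hash(code) / dbCount % tableCount);
    }

    public static void main(String[] args) {
        HashModRouter before = new HashModRouter(16, 64);
        HashModRouter after = new HashModRouter(17, 64); // one database added
        int moved = 0;
        for (int i = 0; i < 10_000; i++) {
            String code = "code" + i;
            if (before.dbId(code) != after.dbId(code)) {
                moved++;
            }
        }
        System.out.println(moved + " of 10000 codes would land in a different database");
    }
}
```

Because the database ID depends on the database count, adding even one database reroutes many existing keys, and rows stored under the old rule would have to be moved.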

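So this article introduces a custom approach that adds database and table position characters to the short-link code. It keeps the number of databases and tables small early on so no resources are wasted, and it lets you scale out with little or no data migration.

Take the short-link code 92AEva from g1.fit/92AEva: under this scheme the first character (9) is read as the database position and the last character (a) as the table position.

Why does this work? Because the prefix and suffix of a short-link code are fixed once the code is generated, scaling out does not affect old data. Early on you also only need to create as many databases as the data volume calls for, so nothing is wasted.

In plain terms: the short-link code is the PartitionKey. The front of the code holds the database position and the end holds the table position. The sharding rule only lists the databases and tables that exist at the start, so nothing extra has to be created. Since the code itself states exactly which database and table it belongs to, when data grows you just add new database position letters. Existing data stays where it is, because the positions already in use are unchanged; new ones are only added. For example, at first A-C denote databases and a-c denote tables. Once these 3 databases and their tables fill up, change the rule so that D-F denote databases and d-f denote tables.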
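The idea can be sketched in a few lines. Everything here is hypothetical and only mirrors the description above: `PositionCodec`, `wrap`, `enable`, `dataSourceOf`, and `tableOf` are names invented for this example, and the `ds` / `short_link_` naming follows the configuration used later in the article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch only: class and method names are hypothetical.
class PositionCodec {
    // Positions handed out to NEW codes; existing codes never consult these lists.
    private List<String> dbPositions;
    private List<String> tablePositions;

    PositionCodec(List<String> dbPositions, List<String> tablePositions) {
        this.dbPositions = dbPositions;
        this.tablePositions = tablePositions;
    }

    // Switch the rule used for new codes, e.g. from A-C / a-c to D-F / d-f.
    void enable(List<String> dbPositions, List<String> tablePositions) {
        this.dbPositions = dbPositions;
        this.tablePositions = tablePositions;
    }

    // Prepend a database position and append a table position to a raw code.
    String wrap(String rawCode) {
        return pick(dbPositions) + rawCode + pick(tablePositions);
    }

    // Routing reads only the code itself: first character -> data source.
    static String dataSourceOf(String code) {
        return "ds" + code.charAt(0);
    }

    // Last character -> actual table.
    static String tableOf(String code) {
        return "short_link_" + code.charAt(code.length() - 1);
    }

    private static String pick(List<String> positions) {
        return positions.get(ThreadLocalRandom.current().nextInt(positions.size()));
    }

    public static void main(String[] args) {
        PositionCodec codec = new PositionCodec(
                Arrays.asList("A", "B", "C"), Arrays.asList("a", "b", "c"));
        String oldCode = codec.wrap("92AEv");
        String before = dataSourceOf(oldCode) + "." + tableOf(oldCode);

        // Scale out: new codes go to D-F / d-f; old codes are untouched.
        codec.enable(Arrays.asList("D", "E", "F"), Arrays.asList("d", "e", "f"));
        String after = dataSourceOf(oldCode) + "." + tableOf(oldCode);

        System.out.println(oldCode + " routes to " + before
                + ", unchanged after scaling out: " + before.equals(after));
    }
}
```

Routing never looks at the list of enabled positions, which is why changing that list cannot move an existing code.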

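Question 6: If we shard, how does a short-link lookup know which database and which table to hit?

Answer: parse the database and table positions out of the short-link code; they tell you which table in which database holds the original URL.

Implementation: Spring Boot 2.x + ShardingJDBC 4.1 with a custom database/table position sharding strategy

Example: three databases dcloud_link_0, dcloud_link_1, dcloud_link_a; each database has 2 tables, short_link_0 and short_link_a

Database sharding strategy: a custom position strategy. The code prepends the database position to the short-link code and appends the table position. On lookup, the position characters are extracted and mapped to the corresponding database and table.

Let's start configuring. application.properties:

# 0/1/a are the database positions in use
spring.shardingsphere.datasource.names=ds0,ds1,dsa

The rest of the configuration follows. ds0, ds1, and dsa each need their own data source settings: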

spring.shardingsphere.props.sql.show=true


spring.shardingsphere.datasource.ds0.connectionTimeoutMilliseconds=3000
spring.shardingsphere.datasource.ds0.driver-class-name=com.mysql.cj.jdbc.Driver
spring.shardingsphere.datasource.ds0.idleTimeoutMilliseconds=60000
spring.shardingsphere.datasource.ds0.jdbc-url=jdbc:mysql://xxxx.xxx.xxx.240:3306/dcloud_link_0?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=Asia/Shanghai&allowPublicKeyRetrieval=true
spring.shardingsphere.datasource.ds0.maintenanceIntervalMilliseconds=30000
spring.shardingsphere.datasource.ds0.maxLifetimeMilliseconds=1800000
spring.shardingsphere.datasource.ds0.maxPoolSize=50
spring.shardingsphere.datasource.ds0.minPoolSize=50
spring.shardingsphere.datasource.ds0.password=xxxxx.net168
spring.shardingsphere.datasource.ds0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds0.username=root


spring.shardingsphere.datasource.ds1.connectionTimeoutMilliseconds=3000
spring.shardingsphere.datasource.ds1.driver-class-name=com.mysql.cj.jdbc.Driver
spring.shardingsphere.datasource.ds1.idleTimeoutMilliseconds=60000
spring.shardingsphere.datasource.ds1.jdbc-url=jdbc:mysql://xxxx.xxx.xxx:3306/dcloud_link_1?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=Asia/Shanghai&allowPublicKeyRetrieval=true
spring.shardingsphere.datasource.ds1.maintenanceIntervalMilliseconds=30000
spring.shardingsphere.datasource.ds1.maxLifetimeMilliseconds=1800000
spring.shardingsphere.datasource.ds1.maxPoolSize=50
spring.shardingsphere.datasource.ds1.minPoolSize=50
spring.shardingsphere.datasource.ds1.password=xxxx.net168
spring.shardingsphere.datasource.ds1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds1.username=root


spring.shardingsphere.datasource.dsa.connectionTimeoutMilliseconds=3000
spring.shardingsphere.datasource.dsa.driver-class-name=com.mysql.cj.jdbc.Driver
spring.shardingsphere.datasource.dsa.idleTimeoutMilliseconds=60000
spring.shardingsphere.datasource.dsa.jdbc-url=jdbc:mysql://xxxx.xxx.xxx.240:3306/dcloud_link_a?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=Asia/Shanghai&allowPublicKeyRetrieval=true
spring.shardingsphere.datasource.dsa.maintenanceIntervalMilliseconds=30000
spring.shardingsphere.datasource.dsa.maxLifetimeMilliseconds=1800000
spring.shardingsphere.datasource.dsa.maxPoolSize=50
spring.shardingsphere.datasource.dsa.minPoolSize=50
spring.shardingsphere.datasource.dsa.password=xxxxx.net168
spring.shardingsphere.datasource.dsa.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.dsa.username=root

# Have MyBatis-Plus print SQL logs to stdout
mybatis-plus.configuration.log-impl=org.apache.ibatis.logging.stdout.StdOutImpl


#----------short_link, strategy: database sharding + table sharding--------------
# Shard horizontally by database first, then by table
spring.shardingsphere.sharding.tables.short_link.database-strategy.standard.sharding-column=code
spring.shardingsphere.sharding.tables.short_link.database-strategy.standard.precise-algorithm-class-name=net.wnn.strategy.CustomDBPreciseShardingAlgorithm

The class that routes to the matching database:


import net.wnn.enums.BizCodeEnum;
import net.wnn.exception.BizException;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingValue;

import java.util.Collection;


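public class CustomDBPreciseShardingAlgorithm implements PreciseShardingAlgorithm<String> {

    /**
     * @param availableTargetNames candidate target names:
     *                             for database sharding, all sharded data source names;
     *                             for table sharding, all sharded table names in the matched database
     * @param shardingValue        sharding info, including
     *                             logicTableName, the logical table;
     *                             columnName, the sharding key (column);
     *                             value, the sharding key value parsed from the SQL
     * @return the data source name whose last character matches the first character of the code
     */
    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<String> shardingValue) {

        // First character of the short-link code, i.e. the database position
        String codePrefix = shardingValue.getValue().substring(0, 1);

        for (String targetName : availableTargetNames) {
            // Last character of the configured data source name (ds0, ds1, dsa)
            String targetNameSuffix = targetName.substring(targetName.length() - 1);

            // Return the data source whose suffix matches the database position
            if (codePrefix.equals(targetNameSuffix)) {
                return targetName;
            }
        }

        // No data source matches this database position
        throw new BizException(BizCodeEnum.DB_ROUTE_NOT_FOUND);

    }
}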

Table sharding strategy:

# Horizontal table sharding with a custom strategy. Format: actual data source.logical table
spring.shardingsphere.sharding.tables.short_link.actual-data-nodes=ds0.short_link,ds1.short_link,dsa.short_link
spring.shardingsphere.sharding.tables.short_link.table-strategy.standard.sharding-column=code
spring.shardingsphere.sharding.tables.short_link.table-strategy.standard.precise-algorithm-class-name=net.wnn.strategy.CustomTablePreciseShardingAlgorithm
# ID generation strategy
spring.shardingsphere.sharding.tables.short_link.key-generator.column=id
spring.shardingsphere.sharding.tables.short_link.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.short_link.key-generator.props.worker.id=${workerId}

The class that routes to the matching table:


import org.apache.shardingsphere.api.sharding.standard.PreciseShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingValue;

import java.util.Collection;


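public class CustomTablePreciseShardingAlgorithm implements PreciseShardingAlgorithm<String> {

    /**
     * @param availableTargetNames candidate target names:
     *                             for database sharding, all sharded data source names;
     *                             for table sharding, all sharded table names in the matched database
     * @param shardingValue        sharding info, including
     *                             logicTableName, the logical table;
     *                             columnName, the sharding key (column);
     *                             value, the sharding key value parsed from the SQL
     * @return the actual table name: logical table name + "_" + last character of the code
     */
    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<String> shardingValue) {

        // Get the logical table name
        String targetName = availableTargetNames.iterator().next();

        // Short-link code, e.g. A23Ad1
        String value = shardingValue.getValue();

        // Last character of the short-link code, i.e. the table position
        String codeSuffix = value.substring(value.length() - 1);

        // Build the actual table name
        return targetName + "_" + codeSuffix;
    }
}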

When generating a short link, add the database and table positions to the short-link code:

ShardingDBConfig: database position configuration


import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ShardingDBConfig {

    /**
     * Database position identifiers currently in use
     */
    private static final List<String> dbPrefixList = new ArrayList<>();

    private static final Random random = new Random();

    // Configure which database prefixes are enabled
    static {
        dbPrefixList.add("0");
        dbPrefixList.add("1");
        dbPrefixList.add("a");
    }

    /**
     * Pick a random database prefix
     * @return one of the enabled database position characters
     */
    public static String getRandomDBPrefix() {
        int index = random.nextInt(dbPrefixList.size());
        return dbPrefixList.get(index);
    }

}

ShardingTableConfig: table position configuration


import java.util.ArrayList;
import java.util.List;
import java.util.Random;


public class ShardingTableConfig {

    /**
     * Table position identifiers currently in use
     */
    private static final List<String> tableSuffixList = new ArrayList<>();

    private static final Random random = new Random();

    // Configure which table suffixes are enabled
    static {
        tableSuffixList.add("0");
        tableSuffixList.add("a");
    }

    /**
     * Pick a random table suffix
     * @return one of the enabled table position characters
     */
    public static String getRandomTableSuffix() {
        int index = random.nextInt(tableSuffixList.size());
        return tableSuffixList.get(index);
    }

}

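Now generate the short-link code: first call MurmurHash to get a decimal value, then convert it to base 62. Finally concatenate database position + code + table position and return the result.

    /**
     * Generate a short-link code
     * @param param the string to hash, e.g. the original URL
     * @return database position + base-62 code + table position
     */
    public String createShortLinkCode(String param) {

        long murmurhash = CommonUtil.murmurHash32(param);
        // Convert the decimal hash to base 62
        String code = encodeToBase62(murmurhash);

        String shortLinkCode = ShardingDBConfig.getRandomDBPrefix() + code + ShardingTableConfig.getRandomTableSuffix();

        return shortLinkCode;
    }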

For how the short-link code is generated and some of the related logic, see Part 2 of this series: 短链服务问题解决-跳转问题-短链生成方案初级入门(二)
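For readers who do not have Part 2 at hand, the base-62 step might look roughly like the sketch below. This is not the author's `encodeToBase62`: the class name `Base62` and the alphabet order (digits, then lowercase, then uppercase) are assumptions made for this example.

```java
// Hypothetical stand-in for the encodeToBase62 helper referenced above.
// The alphabet order is an assumption of this sketch, not taken from the article.
class Base62 {
    private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Convert a non-negative decimal value to its base-62 string.
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            // Least significant digit first; reversed at the end.
            sb.append(ALPHABET.charAt((int) (value % 62)));
            value /= 62;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        // Largest unsigned 32-bit value, the upper bound of a 32-bit hash.
        System.out.println(encode(4_294_967_295L));
    }
}
```

Since 62^6 is larger than 2^32, an unsigned 32-bit hash encodes to at most 6 base-62 characters before the position characters are added.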

Unit test: generate short-link codes from URLs to check the database and table positions:

    @Test
    public void testCreateShortLink() {

        Random random = new Random();
        for (int i = 0; i < 10; i++) {
            int num1 = random.nextInt(10);
            int num2 = random.nextInt(10000000);
            int num3 = random.nextInt(10000000);
            String originalUrl = num1 + "wnn" + num2 + ".net" + num3;
            String shortLinkCode = shortLinkComponent.createShortLinkCode(originalUrl);
            log.info("originalUrl:" + originalUrl + ", shortLinkCode=" + shortLinkCode);
        }
    }