手撕分表方案

d90e1e5b68d1129a269a061c791be2d5.png

目标

解决业务数据库中只有少数几个表数据量特大的问题,大部分普通表在百万级别,有几张特别大的接近1亿条数据了。针对这种情况,虽然可以通过增加分库解决,只是库的数量增长会比较快。如果只针对大表做个分表方案,可以有效降低分库的增长速度,也解决了单表数据量过大性能问题。

分表方案

如何实现分表呢?方案有哪些呢?其实不管用什么方案,有一点是肯定要做的,就是对业务层要屏蔽sql的差异,在一个合适的位置对SQL拦截重写,替换成分表的表名。

按目前大部分项目的特点,有如下几种方案可以考虑:

方案1:在ORM框架层(mybaits)进行SQL拦截修改

方案2:在连接池层(druid)进行SQL拦截修改

方案3:  代理中间件(mycat、shardingsphere)中进行SQL的拦截修改,这和方案1,2的原理是一样的,只不过换了个位置

c56148b7b6b7fe078153dd5d31cd05e3.png

方案的实现

今天老吕谈下在方案2连接池层面分表方案的实现

SQL改写的步骤一般是这样的:

e565a7afc1c1b3de684892a97d92ccc5.png

1、将SQL解析为 AST

2、修改AST

3、生成新的SQL(如果2中没有修改则不用生成新SQL)

4、替换原始SQL继续向下走

这里面涉及几个问题:

1、哪些表需要分表?

数据量相差悬殊的那几张表。如果你这个库里的表之间的数据量相差不大,则干脆直接分库来的简单高效,不需要分表。一般分表名单和数量确定后不会随便修改,写到配置文件或者常量即可。

2、分表的后缀如何获取?

一般会选取 用户ID,企业ID等字段,这些字段的值往往登录完成后就确定下来了,也就是说登录完成后 分库分表的后缀就确定下来了,之后用的时候就非常方便了,不需要从SQL里解析获取,直接从登录上下文中就可以得到 分表后缀 值。

如何实现将SQL转换为AST并对AST进行修改?

很庆幸阿里Druid连接池提供的方便的SQL解析器。

1、解析成AST

List<SQLStatement> sqlStatements = 
SQLUtils.parseStatements(sql, JdbcConstants.MYSQL);

2、编写 visitor类,访问并修改AST

class RewriteTableNameVisitor extends MySqlASTVisitorAdapter {


    //分表后缀
    private String suffix;


    //模拟下需要进行分表操作的表名单
    private static Map<String,Boolean> sharding = new HashMap<>();
    static {
        sharding.put("product",true);
    }


    /**
     * sql是否被重写
     */
    private boolean rewrite = false;
    /**
     * 是否insert语句?
     */
    private boolean insert = false;
    /**
     * 是否delete语句?
     */
    private boolean delete = false;
    /**
     * 是否update语句?
     */
    private boolean update = false;


    public RewriteTableNameVisitor(String suffix){
        this.suffix = suffix;
    }
    @Override
    public void endVisit(MySqlInsertStatement x) {
        this.insert = true;
    }
    @Override
    public void endVisit(MySqlDeleteStatement x) {
        this.delete = true;
    }
    @Override
    public void endVisit(MySqlUpdateStatement x) {
        this.update = true;
    }
    @Override
    public boolean visit(SQLExprTableSource tableSource){
        String tableName = tableSource.getTableName();
        //确认否需要分表
        if (sharding.containsKey(tableName.toLowerCase())) {
            //追加别名
            if (tableSource.getAlias()==null&&insert) {
                tableSource.setAlias(tableName);
            }
            //追加分表后缀
            tableSource.setSimpleName(tableName+this.suffix);
            //已经被修改,需要替换SQL
            this.rewrite = true;
        }
        return true;
    }


    public boolean isRewrite() {
        return rewrite;
    }
}

3、获取新SQL

newSQL = SQLUtils.toSQLString(sqlStatements, JdbcConstants.MYSQL);

4、替换

public class MyDruidSqlRewriteFilter extends FilterEventAdapter {
    private Logger logger = LoggerFactory.getLogger(this.getClass());


    @Override
    public PreparedStatementProxy connection_prepareStatement(FilterChain chain, ConnectionProxy connection, String sql) throws SQLException {
        //解决PrepareStatement执行sql问题
        String shardingTableColumnValue = TraceUtil.getShardingTableColumnValue();
        String newSQL = getNewSQL(sql,shardingTableColumnValue);
        return super.connection_prepareStatement(chain, connection, newSQL);
    }


    @Override
    public boolean statement_execute(FilterChain chain, StatementProxy statement, String sql) throws SQLException{
        //解决statement执行sql问题
        String shardingTableColumnValue = TraceUtil.getShardingTableColumnValue();
        String newSQL = getNewSQL(sql,shardingTableColumnValue);
        return super.statement_execute(chain, statement, newSQL);
    }


    private String getNewSQL(String sql,String suffix) {
        boolean rewrite = false;
        String newSQL = sql;
        List<SQLStatement> sqlStatements = SQLUtils.parseStatements(sql, JdbcConstants.MYSQL);
        for (SQLStatement st : sqlStatements) {
            RewriteTableNameVisitor visitor = new RewriteTableNameVisitor(suffix);
            st.accept(visitor);
            if (visitor.isRewrite()) {
                rewrite = true;
            }
        }
        if (rewrite) {
            newSQL = SQLUtils.toSQLString(sqlStatements, JdbcConstants.MYSQL);
        }
        System.out.println(newSQL);
        return newSQL;
    }
}

5、是不是感觉很简单就实现了呢?其实这得益于 Druid 提供的强大的accept方法,这是典型的 访问者设计模式

List<SQLStatement> sqlStatements = SQLUtils.parseStatements(sql, JdbcConstants.MYSQL);
        for (SQLStatement st : sqlStatements) {
            RewriteTableNameVisitor visitor = new RewriteTableNameVisitor(suffix);
            //这一句话相当于我AST树的根节点接受了我的访问者visitor
            //SQLStatement 就相当于这棵树的根节点,root
            st.accept(visitor);
            if (visitor.isRewrite()) {
                rewrite = true;
            }
        }

再看下 SQLStatement 的源码就可以知道,你提供的visitor方法如果返回true,它就会自动遍历下级节点,这个太重要了,它省去了你自己遍历AST的复杂性。

这是基类里核心的accept方法,里面的accept0方法是抽象方法,子类覆盖。preVisit和postVisit也是抽象的,有默认实现,不太重要。这是典型的模板方法设计模式。

public final void accept(SQLASTVisitor visitor) {
    if (visitor == null) {
        throw new IllegalArgumentException();
    }


    visitor.preVisit(this);


    accept0(visitor);


    visitor.postVisit(this);
}

看下子类的accept0方法(其中一个子类,其它子类大体结构相同)

public void accept0(MySqlASTVisitor visitor) {
    if (visitor.visit(this)) {
        acceptChild(visitor, name);
        acceptChild(visitor, from);
    }
    visitor.endVisit(this);
}

visitor方法返回true,表示要遍历下级节点,所有的节点都实现了 SQLObject 接口,有的节点包括下级节点,有的节点是叶子节点,这明显是 组合设计模式 的运用,深度优先遍历树状结构。

这就是Druid SQL Parse模块为我们提供的便利,它自身的不少过滤功能也是基于此模块原理实现的。

10a31d6aefb008b5b544b350cc2ae102.png

测试结果

原始sql:delete from gl_account a where a.id = 1;
改写后sql:DELETE FROM gl_account_1 a
WHERE a.id = 1;
---------------------------------------------
原始sql:REPLACE INTO gl_collected_log( orgId,id,startTime ) VALUES (1,2,3)
改写后sql:REPLACE INTO gl_collected_log_1 (orgId, id, startTime)
VALUES (1, 2, 3)
---------------------------------------------
原始sql:SELECT id,matchId,matchAccFatherCode,accName,matchName,matchMethod,category,errerMsg,isOrderMode,spare
        FROM gl_acc_match_preset
        WHERE category in(1) AND matchId IN (1,2)
        ORDER BY matchId,isOrderMode ;
改写后sql:SELECT id, matchId, matchAccFatherCode, accName, matchName
  , matchMethod, category, errerMsg, isOrderMode, spare
FROM gl_acc_match_preset_1 gl_acc_match_preset
WHERE category IN (1)
  AND matchId IN (1, 2)
ORDER BY matchId, isOrderMode;
---------------------------------------------
原始sql: INSERT INTO gl_excel_accountBegin (`orgId`,`id`) VALUES (1,2),(1,3);
改写后sql:INSERT INTO gl_excel_accountBegin_1 gl_excel_accountBegin (`orgId`, `id`)
VALUES (1, 2),
  (1, 3);
---------------------------------------------
原始sql: SELECT
        acc.isEndNode,acc.isCalc as accIsCalc,ctf.accountId,
        acc.isCalcMulti,ctf.id,ctf.orgId,ctf.year,
        ctf.origAmountDr,ctf.origAmountCr,ctf.amountDr,ctf.amountCr,
        ctf.quantityDr,ctf.quantityCr,ctf.isDetailData,
        ctf.beginOrigAmount,ctf.beginAmount,ctf.beginQuantity,
        ctf.beginOrigAmountDr,ctf.beginAmountDr,ctf.beginQuantityDr,
        ctf.beginOrigAmountCr,ctf.beginAmountCr,ctf.beginQuantityCr
    FROM gl_certificate ctf
    INNER JOIN gl_account acc ON ctf.accountId = acc.id AND ctf.orgId = acc.orgId
         and ctf.isPeriodBegin = 1 and ctf.orgId = #{orgId} and ctf.year = #{year}
  and (acc.parentId = #{accountId} or acc.id=#{accountId})
  where ctf.orgId = #{orgId};
改写后sql:SELECT acc.isEndNode, acc.isCalc AS accIsCalc, ctf.accountId, acc.isCalcMulti, ctf.id
  , ctf.orgId, ctf.year, ctf.origAmountDr, ctf.origAmountCr, ctf.amountDr
  , ctf.amountCr, ctf.quantityDr, ctf.quantityCr, ctf.isDetailData, ctf.beginOrigAmount
  , ctf.beginAmount, ctf.beginQuantity, ctf.beginOrigAmountDr, ctf.beginAmountDr, ctf.beginQuantityDr
  , ctf.beginOrigAmountCr, ctf.beginAmountCr, ctf.beginQuantityCr
FROM gl_certificate_1 ctf
  INNER JOIN gl_account_1 acc
  ON ctf.accountId = acc.id
    AND ctf.orgId = acc.orgId
    AND ctf.isPeriodBegin = 1
    AND ctf.orgId = #{orgId}
    AND ctf.year = #{year}
    AND (acc.parentId = #{accountId}
      OR acc.id = #{accountId})
WHERE ctf.orgId = #{orgId};
---------------------------------------------
原始sql:
   select 1 from gl_certificate ctf
        inner join edf_org org on ctf.orgId = org.id and ctf.year = org.enabledYear
        where ctf.orgId = #{orgId}
        and ctf.isPeriodBegin = 1
      LIMIT 1;
改写后sql:SELECT 1
FROM gl_certificate_1 ctf
  INNER JOIN edf_org_1 org
  ON ctf.orgId = org.id
    AND ctf.year = org.enabledYear
WHERE ctf.orgId = #{orgId}
  AND ctf.isPeriodBegin = 1
LIMIT 1;
---------------------------------------------
原始sql:select ts from gl_certificate 
        inner join edf_org org on orgId = org.id and year = org.enabledYear
        where orgId = #{orgId}
        and isPeriodBegin = 1
        order by ts desc
      LIMIT 1;
改写后sql:SELECT ts
FROM gl_certificate_1 gl_certificate
  INNER JOIN edf_org_1 org
  ON orgId = org.id
    AND year = org.enabledYear
WHERE orgId = #{orgId}
  AND isPeriodBegin = 1
ORDER BY ts DESC
LIMIT 1;
---------------------------------------------
原始sql:SELECT 
    acc.id,
    acc.code,
    acc.name,
    acc.balanceDirection,
    IF(acc.balanceDirection = 0, '借', '贷') AS balanceDirectionName,
    1 AS selected
FROM
    gl_account acc
WHERE
    acc.orgId = 1 AND acc.id = 1


  union all


    SELECT 
    acc.id,
    acc.code,
    acc.name,
    acc.balanceDirection,
    IF(acc.balanceDirection = 0, '借', '贷') AS balanceDirectionName,
    0 AS selected
FROM
    gl_account acc
        LEFT JOIN
    gl_excel_account_relation relation ON acc.orgId = relation.orgId
        AND acc.id = relation.accountId
WHERE
    acc.orgId = 1
        AND relation.accountId IS NULL
ORDER BY code
改写后sql:SELECT acc.id, acc.code, acc.name, acc.balanceDirection
  , IF(acc.balanceDirection = 0, '借', '贷') AS balanceDirectionName
  , 1 AS selected
FROM gl_account_1 acc
WHERE acc.orgId = 1
  AND acc.id = 1
UNION ALL
SELECT acc.id, acc.code, acc.name, acc.balanceDirection
  , IF(acc.balanceDirection = 0, '借', '贷') AS balanceDirectionName
  , 0 AS selected
FROM gl_account_1 acc
  LEFT JOIN gl_excel_account_relation_1 relation
  ON acc.orgId = relation.orgId
    AND acc.id = relation.accountId
WHERE acc.orgId = 1
  AND relation.accountId IS NULL
ORDER BY code
---------------------------------------------
原始sql:select * from product a   
join product b  
join product c 
join orders d 
join (select * from orders where id=1) e


on a.id = b.id 


where a.id = 1 and a.id in (select id from goods)
改写后sql:SELECT *
FROM product_1 a
  JOIN product_1 b
  JOIN product_1 c
  JOIN orders_1 d
  JOIN (
    SELECT *
    FROM orders_1 orders
    WHERE id = 1
  ) e
  ON a.id = b.id
WHERE a.id = 1
  AND a.id IN (
    SELECT id
    FROM goods_1 goods
  )
---------------------------------------------
原始sql:update order a set id = 1,name = 2 where id = 1;
改写后sql:UPDATE order_1 a
SET id = 1, name = 2
WHERE id = 1;
---------------------------------------------
原始sql:insert into xxx values (1), (2), (3)
改写后sql:INSERT INTO xxx_1 xxx
VALUES (1),
  (2),
  (3)
---------------------------------------------
原始sql:update A, B set A.a = a1, B.b = b1 where A.id = B.aid
改写后sql:UPDATE A_1 A, B_1 B
SET A.a = a1, B.b = b1
WHERE A.id = B.aid
---------------------------------------------
原始sql:delete A, B from A, B where A.id = B.aid
改写后sql:DELETE A_1 A, B_1 B
FROM A_1 A, B_1 B
WHERE A.id = B.aid
---------------------------------------------
原始sql:select * from x limit 2
改写后sql:SELECT *
FROM x_1 x
LIMIT 2
---------------------------------------------
原始sql:select A.name, sum(A.num), A.price from A group by A.name
改写后sql:SELECT A.name, sum(A.num), A.price
FROM A_1 A
GROUP BY A.name
---------------------------------------------
原始sql:select * from A order by rand()
改写后sql:SELECT *
FROM A_1 A
ORDER BY rand()
---------------------------------------------
原始sql:replace into x select * from y
改写后sql:REPLACE INTO x_1
  SELECT *
  FROM y_1 y
---------------------------------------------

总结

1、看似简单的实现背后并不简单,通过阅读Druid源码可以了解到整个实现过程非常繁琐,因为AST的每个节点的下级节点都有多种情况需要考虑,我梳理了一下代码中运用了一下设计模式:

    1)访问者设计模式

    2)模板方法设计模式

    3)组合设计模式(处理树状结构)

    4)责任链设计模式(过滤器链)

2、基本上引入任何框架都会导致架构整体复杂度上升一个层级,要考虑你的团队规模、能力水平、阶段主要矛盾是否相适应,要考虑能否有能力解决架构改变带来的负面问题

3、从核心原理看待一个框架,能否用最小的代价实现最核心的功能,价值最大化,风险最小化

a456ef1852dd3c3610dd853d139b4e00.png

  • 0
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

吕哥架构

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值