A Deep Dive into Apache ShardingSphere's SQL Parse Format Feature



Article link: https://blog.csdn.net/ShardingSphere/article/details/123124075


Chen Chuxin (陈出新)

Middleware R&D engineer at SphereEx and Apache ShardingSphere Committer, currently focused on developing the Apache ShardingSphere kernel modules.

Anyone who works with databases regularly has run into dauntingly complex SQL. Take the statement below: can you tell at a glance what it does?

select a.order_id,a.status,sum(b.money) as money from t_order a inner join (select c.order_id as order_id, c.number * d.price as money from t_order_detail c inner join t_order_price d on c.s_id = d.s_id) b on a.order_id = b.order_id where b.money > 100 group by a.order_id

After formatting, it is much easier to follow:

SELECT a . order_id , a . status , SUM(b . money) AS money
FROM t_order a INNER JOIN 
(
        SELECT c . order_id AS order_id, c . number * d . price AS money
        FROM t_order_detail c INNER JOIN t_order_price d ON c . s_id = d . s_id
) b ON a . order_id = b . order_id
WHERE 
        b . money > 100
GROUP BY a . order_id;

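For most of us, the first step in analyzing a complex SQL statement is to format it; only then can we analyze its semantics based on the formatted text. SQL formatting is also a must-have feature of much database-related software. To meet this need, Apache ShardingSphere builds on its own database dialect parser engine to offer its own SQL formatting tool: SQL Parse Format.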

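SQL Parse Format is one of the features of the Apache ShardingSphere parser engine, and it will also be the foundation of the SQL audit feature planned for a future release. This article walks through SQL Parse Format in an accessible way: its underlying principles, how to use it, and how to contribute to its development.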

Parser Engine

As one of the features of the Apache ShardingSphere parser engine, SQL Parse Format is distinctive and relatively self-contained. To understand it, we first need to understand the Apache ShardingSphere parser engine.

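The parser engine was originally created to extract key information from SQL, such as the columns used for database and table sharding or the columns rewritten for encryption. As Apache ShardingSphere has evolved, the parser engine has gone through three generations.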

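The first-generation parser engine used Druid as its SQL parser. It was used in versions before 1.4.x and performed very well. The second generation was developed entirely in-house. Because its purpose was different, it took a "partial understanding" approach to SQL: it extracted only the context that data sharding cares about, without generating a parse tree or traversing it a second time, which further improved performance and compatibility. The third generation uses ANTLR as the parser generator to produce a parse tree, and then traverses that tree a second time to extract context. With ANTLR as the generator, SQL compatibility improved substantially, and many Apache ShardingSphere features could be built quickly on this foundation. The 5.0.x releases also brought extensive performance optimizations to the third-generation engine, including switching the traversal style from Listener to Visitor and adding a parse result cache for prepared SQL statements.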

SQL Parse Format exists thanks to the third-generation parser engine. Let us now turn our attention to the feature itself.

SQL Parse Format

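SQL Parse Format is a tool for formatting SQL statements. In the future it will also be used in the SQL audit feature, where it makes it easy for users to review historical SQL, display formatted SQL in reports, or analyze SQL further.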

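For example, SQL Parse Format turns the following statement into the format shown after it. Line breaks and uppercase keywords make each part of the SQL stand out more clearly.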

select age as b, name as n from table1 join table2 where id = 1 and name = 'lu';
-- After formatting
SELECT age AS b, name AS n
FROM table1 JOIN table2
WHERE 
        id = 1
        and name = 'lu';
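
The two ideas named above, uppercasing keywords and breaking lines at clause boundaries, can be sketched with a toy token-based formatter. This is an assumption-laden illustration only: SQL Parse Format works on a parse tree rather than on raw tokens, the keyword set here is a tiny hand-picked subset, and the layout does not reproduce ShardingSphere's exact output (for instance, the whole WHERE condition stays on one line here).

```python
import re

# Tiny hand-picked keyword subset, just enough for the example statement
KEYWORDS = {"select", "as", "from", "join", "where", "and"}
# Clauses that start a new line in this toy layout
BREAK_BEFORE = {"FROM", "WHERE"}


def toy_format(sql: str) -> str:
    """Uppercase known keywords and start a new line before FROM and WHERE."""
    body = sql.strip().rstrip(";")
    # Quoted literals are matched as single tokens so they are never altered
    tokens = re.findall(r"'[^']*'|[^\s']+", body)
    lines, current = [], []
    for token in tokens:
        word = token.upper() if token.lower() in KEYWORDS else token
        if word in BREAK_BEFORE and current:
            lines.append(" ".join(current))
            current = []
        current.append(word)
    if current:
        lines.append(" ".join(current))
    return "\n".join(lines) + ";"


formatted = toy_format(
    "select age as b, name as n from table1 join table2 where id = 1 and name = 'lu';"
)
print(formatted)
```

A token-level approach like this breaks down quickly on nested subqueries or dialect-specific syntax, which is one reason a formatter built on a real parse tree is more robust.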

Now that we know what SQL Parse Format does, let us look at the principles behind it.

How SQL Parse Format Works

Using the following SQL as an example, let us explore how it is formatted in Apache ShardingSphere.

select order_id from t_order where status = 'OK'

Apache ShardingSphere uses ANTLR4 as its parser generator, so we first define the select syntax in a .g4 file, following ANTLR4 conventions (using MySQL as an example).

simpleSelect
    : SELECT ALL? targetList? intoClause? fromClause? whereClause? groupClause? havingClause? windowClause?
    | SELECT distinctClause targetList intoClause? fromClause? whereClause? groupClause? havingClause? windowClause?
    | valuesClause
    | TABLE relationExpr
    ;
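
To make the idea of formatting via a parse tree concrete, here is a minimal sketch of a visitor walking a hand-built tree for the example statement. Everything in it is an assumption for illustration: the `Node` and `FormatVisitor` classes are hypothetical and are not generated by ANTLR, the rule names only loosely echo the grammar above, and the layout rules are simplified compared with what ShardingSphere produces.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    rule: str                        # grammar rule name, or "TERMINAL" for a token
    text: str = ""                   # token text, only set for terminals
    children: List["Node"] = field(default_factory=list)


def terminal(text: str) -> Node:
    return Node("TERMINAL", text)


class FormatVisitor:
    """Hypothetical visitor: one visit_<rule> method per rule that needs layout."""

    KEYWORDS = {"select", "from", "where"}
    INDENT = " " * 8

    def visit(self, node: Node) -> str:
        # Dispatch on the rule name; rules without a method just join their children
        method = getattr(self, "visit_" + node.rule, self.visit_children)
        return method(node)

    def visit_children(self, node: Node) -> str:
        return " ".join(self.visit(child) for child in node.children)

    def visit_TERMINAL(self, node: Node) -> str:
        return node.text.upper() if node.text.lower() in self.KEYWORDS else node.text

    def visit_simpleSelect(self, node: Node) -> str:
        # SELECT and its target list share a line; each later clause gets its own
        select_kw, target_list, *clauses = node.children
        first_line = self.visit(select_kw) + " " + self.visit(target_list)
        return "\n".join([first_line] + [self.visit(c) for c in clauses])

    def visit_whereClause(self, node: Node) -> str:
        # The condition goes on an indented line below the WHERE keyword
        where_kw, *condition = node.children
        body = " ".join(self.visit(c) for c in condition)
        return self.visit(where_kw) + "\n" + self.INDENT + body


# Hand-built tree for: select order_id from t_order where status = 'OK'
tree = Node("simpleSelect", children=[
    terminal("select"),
    Node("targetList", children=[terminal("order_id")]),
    Node("fromClause", children=[terminal("from"), terminal("t_order")]),
    Node("whereClause", children=[
        terminal("where"), terminal("status"), terminal("="), terminal("'OK'"),
    ]),
])

formatted = FormatVisitor().visit(tree)
print(formatted)
```

Because each rule decides its own layout, supporting a new clause in a design like this means adding one more `visit_<rule>` method rather than touching the traversal logic.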

Through IDEA's ANTLR4 …
