使用Druid的sql parser做一个表数据血缘分析工具

本文介绍如何利用Druid的SQL解析能力,构建一个血缘分析工具,用于大数据场景下的任务依赖管理。通过解析SQL获取抽象语法树,遍历树以确定输入输出表,从而分析任务的输入输出关系。
摘要由CSDN通过智能技术生成

前言

大数据场景下,每天可能都要在离线集群,运行大量的任务来支持业务、运营的分析查询。任务越来越多的时候,就会有越来越多的依赖关系,每一个任务都需要等需要的input表生产出来后,再去生产自己的output表。最开始的时候,依赖关系自然是可以通过管理员来管理,随着任务量的加大,就需要一个分析工具来解析任务的inputs、outs,并且自行依赖上生产inputs表的那些任务。本文就介绍一个使用druid parser,来解析SQL的input、output的血缘分析工具。

建议对druid比较陌生的同学可以先看下druid的官方文档。

做一次sql的血缘分析的流程

  • 解析sql,拿到抽象语法树
  • 遍历抽象语法树,得到from、to

使用druid解析sql到语法树

druid提供了简单、快速的SQL解析工具,可以很简单拿到一段SQL的AST(抽象语法树)。而druid对语法树提供了多种的SQLStatement,使遍历语法树更加容易。

 SQLStatementParser parser = SQLParserUtils.createSQLStatementParser(sql, JdbcConstants.HIVE);
 SQLStatement stmt= parser.parseStatementList().get(0);

从语法树中取出from和to

拿到语法树之后,想办法把from、to从语法树中取出来就大功告成。

最初的写法

最开始,就是简单的遍历一下语法树的节点,取出from表和to表的表名。

    /**
     * 根据create或者insert的sql取出from、to
     * @param sql
     * @return
     * @throws ParserException
     */
    private static Map<String, Set<String>> getFromTo(String sql) throws ParserException {
    
        SQLStatementParser parser = SQLParserUtils.createSQLStatementParser(sql, JdbcConstants.HIVE);
        SQLStatement stmt= parser.parseStatementList().get(0);

        Set<String> from = new HashSet<>();
        Set<String> to = new HashSet<>();
        if (stmt instanceof SQLInsertStatement) {
    
            SQLInsertStatement istmt = (SQLInsertStatement) stmt;
            to.add(istmt.getTableSource().toString().toUpperCase
以下是一个使用Druid解析SQL分析字段血缘关系的Java代码示例: ``` import com.alibaba.druid.sql.SQLUtils; import com.alibaba.druid.sql.ast.SQLExpr; import com.alibaba.druid.sql.ast.SQLStatement; import com.alibaba.druid.sql.ast.expr.*; import com.alibaba.druid.sql.ast.statement.*; import com.alibaba.druid.sql.dialect.mysql.parser.MySqlStatementParser; import com.alibaba.druid.sql.parser.SQLStatementParser; import java.util.*; public class DruidSqlParser { public static void main(String[] args) { String sql = "SELECT t1.id, t2.name FROM table1 t1 JOIN table2 t2 ON t1.id = t2.id WHERE t1.age > 18"; List<ColumnInfo> columnInfoList = analyzeSql(sql); for (ColumnInfo columnInfo : columnInfoList) { System.out.println(columnInfo); } } public static List<ColumnInfo> analyzeSql(String sql) { List<ColumnInfo> columnInfoList = new ArrayList<>(); SQLStatementParser parser = new MySqlStatementParser(sql); List<SQLStatement> statementList = parser.parseStatementList(); for (SQLStatement statement : statementList) { if (statement instanceof SQLSelectStatement) { SQLSelectStatement selectStatement = (SQLSelectStatement) statement; SQLSelectQueryBlock queryBlock = selectStatement.getSelect().getQueryBlock(); analyzeSelectItems(queryBlock.getSelectList(), columnInfoList); analyzeWhere(queryBlock.getWhere(), columnInfoList); analyzeJoin(queryBlock.getFrom(), columnInfoList); } } return columnInfoList; } private static void analyzeSelectItems(List<SQLSelectItem> selectItems, List<ColumnInfo> columnInfoList) { for (SQLSelectItem selectItem : selectItems) { SQLExpr expr = selectItem.getExpr(); String columnName = null; String tableName = null; if (expr instanceof SQLPropertyExpr) { SQLPropertyExpr propertyExpr = (SQLPropertyExpr) expr; columnName = propertyExpr.getName(); tableName = propertyExpr.getOwner().getName(); } else if (expr instanceof SQLIdentifierExpr) { SQLIdentifierExpr identifierExpr = (SQLIdentifierExpr) expr; columnName = identifierExpr.getName(); } if (columnName != null) { ColumnInfo columnInfo = new ColumnInfo(columnName, tableName, true); if (!columnInfoList.contains(columnInfo)) { columnInfoList.add(columnInfo); } } } } private static void analyzeWhere(SQLExpr whereExpr, List<ColumnInfo> columnInfoList) { if (whereExpr instanceof SQLBinaryOpExpr) { SQLBinaryOpExpr binaryOpExpr = (SQLBinaryOpExpr) whereExpr; analyzeWhere(binaryOpExpr.getLeft(), columnInfoList); analyzeWhere(binaryOpExpr.getRight(), columnInfoList); } else if (whereExpr instanceof SQLBinaryOpExprGroup) { SQLBinaryOpExprGroup binaryOpExprGroup = (SQLBinaryOpExprGroup) whereExpr; for (SQLExpr sqlExpr : binaryOpExprGroup.getItems()) { analyzeWhere(sqlExpr, columnInfoList); } } else if (whereExpr instanceof SQLIdentifierExpr) { SQLIdentifierExpr identifierExpr = (SQLIdentifierExpr) whereExpr; String columnName = identifierExpr.getName(); ColumnInfo columnInfo = new ColumnInfo(columnName, null, false); if (!columnInfoList.contains(columnInfo)) { columnInfoList.add(columnInfo); } } } private static void analyzeJoin(SQLTableSource tableSource, List<ColumnInfo> columnInfoList) { if (tableSource instanceof SQLJoinTableSource) { SQLJoinTableSource joinTableSource = (SQLJoinTableSource) tableSource; analyzeJoin(joinTableSource.getLeft(), columnInfoList); analyzeJoin(joinTableSource.getRight(), columnInfoList); SQLExpr onExpr = joinTableSource.getCondition(); if (onExpr instanceof SQLBinaryOpExpr) { SQLBinaryOpExpr binaryOpExpr = (SQLBinaryOpExpr) onExpr; SQLExpr leftExpr = binaryOpExpr.getLeft(); SQLExpr rightExpr = binaryOpExpr.getRight(); if (leftExpr instanceof SQLPropertyExpr && rightExpr instanceof SQLPropertyExpr) { SQLPropertyExpr leftPropertyExpr = (SQLPropertyExpr) leftExpr; SQLPropertyExpr rightPropertyExpr = (SQLPropertyExpr) rightExpr; String leftColumnName = leftPropertyExpr.getName(); String leftTableName = leftPropertyExpr.getOwner().getName(); String rightColumnName = rightPropertyExpr.getName(); String rightTableName = rightPropertyExpr.getOwner().getName(); ColumnInfo leftColumnInfo = new ColumnInfo(leftColumnName, leftTableName, false); ColumnInfo rightColumnInfo = new ColumnInfo(rightColumnName, rightTableName, false); if (columnInfoList.contains(leftColumnInfo)) { int index = columnInfoList.indexOf(leftColumnInfo); columnInfoList.get(index).setTableName(leftTableName); } if (columnInfoList.contains(rightColumnInfo)) { int index = columnInfoList.indexOf(rightColumnInfo); columnInfoList.get(index).setTableName(rightTableName); } } } } else if (tableSource instanceof SQLSubqueryTableSource) { SQLSubqueryTableSource subqueryTableSource = (SQLSubqueryTableSource) tableSource; SQLSelect subSelect = subqueryTableSource.getSelect(); analyzeSelectItems(subSelect.getQueryBlock().getSelectList(), columnInfoList); analyzeWhere(subSelect.getQueryBlock().getWhere(), columnInfoList); analyzeJoin(subSelect.getQueryBlock().getFrom(), columnInfoList); } else if (tableSource instanceof SQLExprTableSource) { SQLExprTableSource exprTableSource = (SQLExprTableSource) tableSource; String tableName = null; if (exprTableSource.getAlias() != null) { tableName = exprTableSource.getAlias(); } else if (exprTableSource.getExpr() instanceof SQLIdentifierExpr) { tableName = ((SQLIdentifierExpr) exprTableSource.getExpr()).getName(); } for (ColumnInfo columnInfo : columnInfoList) { if (columnInfo.getTableName() == null) { columnInfo.setTableName(tableName); } } } } } class ColumnInfo { private String columnName; private String tableName; private boolean isSelectItem; public ColumnInfo(String columnName, String tableName, boolean isSelectItem) { this.columnName = columnName; this.tableName = tableName; this.isSelectItem = isSelectItem; } public String getColumnName() { return columnName; } public void setColumnName(String columnName) { this.columnName = columnName; } public String getTableName() { return tableName; } public void setTableName(String tableName) { this.tableName = tableName; } public boolean isSelectItem() { return isSelectItem; } public void setSelectItem(boolean selectItem) { isSelectItem = selectItem; } @Override public String toString() { return "ColumnInfo{" + "columnName='" + columnName + '\'' + ", tableName='" + tableName + '\'' + ", isSelectItem=" + isSelectItem + '}'; } @Override public boolean equals(Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; ColumnInfo that = (ColumnInfo) o; return Objects.equals(columnName, that.columnName) && Objects.equals(tableName, that.tableName); } @Override public int hashCode() { return Objects.hash(columnName, tableName); } } ``` 这个代码实现了对SELECT、WHERE和JOIN三个部分的分析,并且能够正确地识别出每个字段所属的和是否是SELECT中的选择项。你可以根据自己的需求对这个代码进行扩展和修改。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值