![412de572a13ead737e4c8416417d3dd5.png](https://i-blog.csdnimg.cn/blog_migrate/013b7219076be6d379e9dfbf06183560.jpeg)
DLA SQL分析函数:SQL语句审计与分析的利器
1. 简介
Data Lake Analytics(https://www.aliyun.com/product/datalakeanalytics)最新release一组SQL内置函数,用来进行SQL语句的分析、信息提取,方便用户对SQL语句进行语法层面的审计、分析,可以应用于很多安全、数据库日志分析等场景。参考DLA(Data Lake Analytics)之前的文档,只要您的SQL语句文本保存在DLA支持的数据源系统中,就能使用DLA的SQL分析函数,方便、快捷的进行SQL语句的审计和分析,或者基于此构建相关应用程序和系统。
前提条件,您的SQL语句文本已经作为一个字符串、文本字段存储在DLA支持的数据源中,或者在统一格式的日志文件中,日志文件放到OSS中(应用自己上传、开源日志工具上传、日志服务投递(https://help.aliyun.com/document_detail/107980.html)、Datahub投递等),然后采用DLA进行SQL分析。
2. SQL分析函数详解
所有提供的SQL分析函数都是scalar标量函数。SQL分析函数支持多种SQL方言,默认SQL方言是mysql。
| SQL方言 |
| mysql |
| postgresql |
| oracle |
| db2 |
| sqlserver |
| hive |
| odps |
支持的函数列表:
| Name | Description |
| --- | --- |
| sql_export_columns | 提取SQL语句中所有出现的列 |
| sql_export_functions | 提取SQL语句中所有出现的函数 |
| sql_export_predicate_columns | 提取SQL语句中所有出现的谓词条件表达式涉及的列 |
| sql_export_predicates | 提取SQL语句中所有出现的谓词条件表达式 |
| sql_export_select_list_columns | 提取SQL语句中SELECT子句表达式中出现的列(包括子查询) |
| sql_export_tables | 提取SQL语句中所有出现的表 |
| sql_format | 对SQL语句进行格式化 |
| sql_params | 提取SQL语句中所有的literal值 |
| sql_pattern | 提取SQL语句参数化后的SQL pattern,literal用?代替 |
| sql_pattern_hash | 提取sqlText的SQL语句参数化后的SQL pattern,并生成hash值 |
| sql_syntax_check | 对sqlText进行语法检查 |
2.1 sql_export_columns
函数调用形式:
sql_export_columns(sqlText)
sql_export_columns(sqlText, dbType)
sql_export_columns(sqlText, dbType, throwError)
参数说明:
| 参数名 | 类型 | 说明 |
| :---: | :---: | :---: |
| sqlText | VARCHAR | 必选 |
| dbType | VARCHAR | 可选。SQL方言,默认为mysql |
| throwError | BOOLEAN | 可选。true时,遇到非法SQL抛出异常;false时,遇到非法SQL不抛出异常,返回null |
返回值:
| 类型 | 说明 |
| :---: | :---: |
| VARCHAR | sqlText的SQL语句中所有出现的列,用逗号分隔,列所属的表会进行关联推导,如果没有找到明确的表,则为UNKNOWN |
示例:
SELECT sql_export_columns(a.sql_text)
FROM (
SELECT 'SELECT s_acctbal,
s_name,
n_name,
p_partkey,
p_mfgr,
s_address,
s_phone,
s_comment
FROM part,
supplier,
partsupp,
nation,
region
WHERE p_partkey = ps_partkey
AND s_suppkey = ps_suppkey
AND p_size = 35
AND p_type LIKE ''%NICKEL''
AND s_nationkey = n_nationkey
AND n_regionkey = r_regionkey
AND r_name = ''MIDDLE EAST''
AND ps_supplycost IN
(
SELECT min(ps_supplycost)
FROM partsupp,
supplier,
nation,
region
WHERE s_suppkey = ps_suppkey
AND s_nationkey = n_nationkey
AND n_regionkey = r_regionkey
AND r_name = ''MIDDLE EAST'' )
ORDER BY s_acctbal DESC,
n_name,
s_name,
p_partkey
LIMIT 100;' AS sql_text
) a;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sql_export_columns(a.sql_text) |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| UNKNOWN.s_acctbal,UNKNOWN.s_name,UNKNOWN.n_name,UNKNOWN.p_partkey,UNKNOWN.p_mfgr,UNKNOWN.s_address,UNKNOWN.s_phone,UNKNOWN.s_comment,UNKNOWN.ps_partkey,UNKNOWN.s_suppkey,UNKNOWN.ps_suppkey,UNKNOWN.p_size,UNKNOWN.p_type,UNKNOWN.s_nationkey,UNKNOWN.n_nationkey,UNKNOWN.n_regionkey,UNKNOWN.r_regionkey,UNKNOWN.r_name,UNKNOWN.ps_supplycost |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2.2 sql_export_functions
函数调用形式:
sql_export_functions(sqlText)
sql_export_functions(sqlText, dbType)
sql_export_functions(sqlText, dbType, throwError)
参数说明:
| 参数名 | 类型 | 说明 |
| :---: | :---: | :---: |
| sqlText | VARCHAR | 必选 |
| dbType | VARCHAR | 可选。SQL方言,默认为mysql |
| throwError | BOOLEAN | 可选。true时,遇到非法SQL抛出异常;false时,遇到非法SQL不抛出异常,返回null |
返回值:
| 类型 | 说明 |
| :---: | :---: |
| VARCHAR | sqlText的SQL语句中所有出现的函数名,用逗号分隔,列所属的表会进行关联推导,如果没有找到明确的表,则为UNKNOWN |
示例:
SELECT sql_export_functions(a.sql_text)
FROM (
SELECT 'SELECT s_acctbal,
s_name,
n_name,
p_partkey,
p_mfgr,
s_address,
s_phone,
s_comment
FROM part,
supplier,
partsupp,
nation,
region
WHERE p_partkey = ps_partkey
AND s_suppkey = ps_suppkey
AND p_size = 35
AND p_type LIKE ''%NICKEL''
AND s_nationkey = n_nationkey
AND n_regionkey = r_regionkey
AND r_name = ''MIDDLE EAST''
AND ps_supplycost IN
(
SELECT min(ps_supplycost)
FROM partsupp,
supplier,
nation,
region
WHERE s_suppkey = ps_suppkey
AND s_nationkey = n_nationkey
AND n_regionkey = r_regionkey
AND r_name = ''MIDDLE EAST'' )
ORDER BY s_acctbal DESC,