Source code link
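Overview
In the previous two posts I analyzed transformStmt() and transformTopLevelStmt(). The two are related as callee and caller: transformTopLevelStmt() calls transformStmt(). In this post I continue with parse_analyze() and a few related functions. parse_analyze() is the real default entry point for semantic analysis.
Analysis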
parse_analyze()
// Listing 1
// src/common/backend/parser/analyze.cpp
Query* parse_analyze(Node* parseTree, const char* sourceText, Oid* paramTypes, int numParams, bool isFirstNode, bool isCreateView)
{
    ParseState* pstate = make_parsestate(NULL);
    Query* query = NULL;
    /* required as of 8.4 */
    AssertEreport(sourceText != NULL, MOD_OPT, "para cannot be NULL");
    pstate->p_sourcetext = sourceText;
    if (numParams > 0) {
        parse_fixed_parameters(pstate, paramTypes, numParams);
    }
    PUSH_SKIP_UNIQUE_SQL_HOOK();
    query = transformTopLevelStmt(pstate, parseTree, isFirstNode, isCreateView);
    POP_SKIP_UNIQUE_SQL_HOOK();
    /* it's unsafe to deal with plugins hooks as dynamic lib may be released */
    if (post_parse_analyze_hook && !(g_instance.status > NoShutdown)) {
        (*post_parse_analyze_hook)(pstate, query);
    }
    pfree_ext(pstate->p_ref_hook_state);
    free_parsestate(pstate);
    /* For plpy CTAS query. CTAS is a recursive call. CREATE query is the first rewrited.
     * thd 2nd rewrited query is INSERT SELECT. Without this attribute, DB will have
     * an error that has no idea about $x when INSERT SELECT query is analyzed.
     */
    query->fixed_paramTypes = paramTypes;
    query->fixed_numParams = numParams;
    return query;
}
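The first statement in the body of Listing 1 defines pstate, a variable of the ParseState struct type that records the state of semantic analysis. The make_parsestate() function it calls is shown below: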
// Listing 2
// src/common/backend/parser/parse_node.cpp
/*
 * make_parsestate
 * Allocate and initialize a new ParseState.
 *
 * Caller should eventually release the ParseState via free_parsestate().
 */
ParseState* make_parsestate(ParseState* parentParseState)
{
    ParseState* pstate = NULL;
    pstate = (ParseState*)palloc0(sizeof(ParseState));
    pstate->parentParseState = parentParseState;
    pstate->isAliasReplace = true;
    /* Fill in fields that don't start at null/false/zero */
    pstate->p_next_resno = 1;
    pstate->p_star_start = NIL;
    pstate->p_star_end = NIL;
    pstate->p_star_only = NIL;
    pstate->p_resolve_unknowns = true;
    pstate->ignoreplus = false;
    pstate->p_plusjoin_rte_info = NULL;
    pstate->p_rawdefaultlist = NIL;
    if (parentParseState != NULL) {
        pstate->p_sourcetext = parentParseState->p_sourcetext;
        /* all hooks are copied from parent */
        pstate->p_pre_columnref_hook = parentParseState->p_pre_columnref_hook;
        pstate->p_post_columnref_hook = parentParseState->p_post_columnref_hook;
        pstate->p_paramref_hook = parentParseState->p_paramref_hook;
        pstate->p_coerce_param_hook = parentParseState->p_coerce_param_hook;
        pstate->p_ref_hook_state = parentParseState->p_ref_hook_state;
        pstate->p_create_proc_operator_hook = parentParseState->p_create_proc_operator_hook;
        pstate->p_create_proc_insert_hook = parentParseState->p_create_proc_insert_hook;
        pstate->p_cl_hook_state = parentParseState->p_cl_hook_state;
        pstate->p_bind_variable_columnref_hook = parentParseState->p_bind_variable_columnref_hook;
        pstate->p_bind_hook_state = parentParseState->p_bind_hook_state;
        pstate->p_bind_describe_hook = parentParseState->p_bind_describe_hook;
        pstate->p_describeco_hook_state = parentParseState->p_describeco_hook_state;
    }
    return pstate;
}
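Read from top to bottom, this function simply allocates and initializes pstate, the variable that records the parse state. Since memory was allocated for it, that memory has to be released once analysis is finished. This is the job of the free_parsestate() call near the end of Listing 1: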
// Listing 3
// src/common/backend/parser/parse_node.cpp
/*
 * free_parsestate
 * Release a ParseState and any subsidiary resources.
 */
void free_parsestate(ParseState* pstate)
{
    Assert(pstate != NULL);
    /*
     * Check that we did not produce too many resnos; at the very least we
     * cannot allow more than 2^16, since that would exceed the range of a
     * AttrNumber. It seems safest to use MaxTupleAttributeNumber.
     */
    if (pstate->p_next_resno - 1 > MaxTupleAttributeNumber) {
        ereport(ERROR,
            (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                errmsg("target lists can have at most %d entries", MaxTupleAttributeNumber)));
    }
    if (pstate->p_target_relation != NULL) {
        heap_close(pstate->p_target_relation, NoLock);
    }
    pfree_ext(pstate);
}
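The allocate/release pairing in Listings 2 and 3 can be looked at in isolation. The following is only a reduced model, not openGauss code: MiniParseState, make_mini_parsestate(), free_mini_parsestate(), and kMaxEntries are hypothetical names, and std::calloc()/std::free() stand in for palloc0()/pfree_ext(). It keeps just three ideas from the listings: zero-filled allocation, fields that must not start at zero, and inheritance from the parent state.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical, heavily reduced stand-in for ParseState.
struct MiniParseState {
    MiniParseState* parent;   // mirrors parentParseState
    const char* sourcetext;   // mirrors p_sourcetext
    int next_resno;           // mirrors p_next_resno
    bool resolve_unknowns;    // mirrors p_resolve_unknowns
    void* ref_hook_state;     // mirrors p_ref_hook_state
};

// Illustrative limit for this sketch; the real check uses MaxTupleAttributeNumber.
const int kMaxEntries = 1664;

MiniParseState* make_mini_parsestate(MiniParseState* parent)
{
    // calloc zero-fills the struct, which is the role palloc0 plays in Listing 2.
    MiniParseState* ps = static_cast<MiniParseState*>(std::calloc(1, sizeof(MiniParseState)));
    if (ps == nullptr) {
        throw std::bad_alloc();
    }
    ps->parent = parent;
    // Fields that must not start at null/false/zero.
    ps->next_resno = 1;
    ps->resolve_unknowns = true;
    if (parent != nullptr) {
        // A child state inherits the source text and hook state from its parent.
        ps->sourcetext = parent->sourcetext;
        ps->ref_hook_state = parent->ref_hook_state;
    }
    return ps;
}

void free_mini_parsestate(MiniParseState* ps)
{
    assert(ps != nullptr);
    int produced = ps->next_resno - 1;
    // Unlike Listing 3, this sketch frees before reporting the error,
    // because no memory context exists here to reclaim the block later.
    std::free(ps);
    if (produced > kMaxEntries) {
        throw std::length_error("target lists have too many entries");
    }
}
```

A caller would create a top-level state with make_mini_parsestate(nullptr), optionally create child states from it, and release each one with free_mini_parsestate() in reverse order.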
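Back in Listing 1, the comment "required as of 8.4" refers to a PostgreSQL version, not an openGauss version. The AssertEreport() call checks that the source text is not NULL, and the assignment after it records the source text in the parse state. The if block that follows calls parse_fixed_parameters() only when the number of parameters is greater than zero. The transformTopLevelStmt() call converts the parse tree into a query tree. The next if block guards the plugin hook: the function that post_parse_analyze_hook points to is called only when the hook is set and the instance is not shutting down. Finally, free_parsestate() releases the memory allocated for pstate, the ParseState variable that recorded the parse state.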
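The hook guard in Listing 1 follows a common pattern: an optional function pointer is called only if it is set and the process is in a state where calling into a plugin is still safe. Here is a minimal sketch of that pattern under assumed names. MiniQuery, MiniStatus, mini_post_analyze_hook, mini_status, and mini_parse_analyze() are all hypothetical and only imitate the shape of the real code.

```cpp
#include <cassert>
#include <string>

// Hypothetical status values; only the ordering matters for the guard.
enum MiniStatus { kNoShutdown = 0, kShuttingDown = 1 };

// Hypothetical stand-in for a Query node.
struct MiniQuery {
    std::string text;
    bool hook_ran;
};

typedef void (*mini_hook_type)(MiniQuery* query);

// Global hook pointer, unset by default, plus a global status flag.
static mini_hook_type mini_post_analyze_hook = nullptr;
static MiniStatus mini_status = kNoShutdown;

MiniQuery mini_parse_analyze(const std::string& sourceText)
{
    // Stands in for the call to transformTopLevelStmt().
    MiniQuery query{sourceText, false};
    // Same shape as the guard in Listing 1: the hook must be set,
    // and the status must not be past kNoShutdown.
    if (mini_post_analyze_hook != nullptr && !(mini_status > kNoShutdown)) {
        (*mini_post_analyze_hook)(&query);
    }
    return query;
}

// Example plugin function that just records that it was called.
static void mark_hook_ran(MiniQuery* query)
{
    assert(query != nullptr);
    query->hook_ran = true;
}
```

With no hook installed, mini_parse_analyze() returns a query whose hook_ran is false. After assigning mark_hook_ran to mini_post_analyze_hook it becomes true, and it is false again once mini_status is set to kShuttingDown.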
parse_analyze_varparams()
// Listing 4
// src/common/backend/parser/analyze.cpp
Query* parse_analyze_varparams(Node* parseTree, const char* sourceText, Oid** paramTypes, int* numParams)
{
    ParseState* pstate = make_parsestate(NULL);
    Query* query = NULL;
    /* required as of 8.4 */
    AssertEreport(sourceText != NULL, MOD_OPT, "para cannot be NULL");
    pstate->p_sourcetext = sourceText;
    parse_variable_parameters(pstate, paramTypes, numParams);
    query = transformTopLevelStmt(pstate, parseTree);
    /* make sure all is well with parameter types */
    check_variable_parameters(pstate, query);
    /* it's unsafe to deal with plugins hooks as dynamic lib may be released */
    if (post_parse_analyze_hook && !(g_instance.status > NoShutdown)) {
        (*post_parse_analyze_hook)(pstate, query);
    }
    pfree_ext(pstate->p_ref_hook_state);
    free_parsestate(pstate);
    return query;
}
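This function is a variant of parse_analyze(). It is used when information about the data types of $n symbols can be inferred from context. Compared with Listing 1, it takes paramTypes and numParams as pointers (Oid** and int*), sets up the parameters with parse_variable_parameters() instead of parse_fixed_parameters(), and calls check_variable_parameters() after the transformation to verify the parameter types.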
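The difference between the fixed and variable parameter modes can be illustrated with a small model. This is a sketch of the general idea as I understand it, not the openGauss implementation: MiniOid, the type constants, lookup_fixed(), note_variable(), and all_resolved() are hypothetical names. In fixed mode the caller supplies every type in advance. In variable mode the type array grows as $n references are seen, and a type is filled in when context reveals it.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

typedef unsigned int MiniOid;

// Illustrative type identifiers for this sketch.
const MiniOid kUnknownType = 0;
const MiniOid kInt4Type = 23;
const MiniOid kTextType = 25;

// Fixed mode: every parameter type is known up front, so a reference
// to $n outside the supplied array is an error.
MiniOid lookup_fixed(const std::vector<MiniOid>& types, int paramno)
{
    if (paramno <= 0 || static_cast<std::size_t>(paramno) > types.size()) {
        throw std::out_of_range("there is no such parameter");
    }
    return types[paramno - 1];
}

// Variable mode: the array grows on demand, and a slot is filled in
// once the surrounding context reveals the type of $n.
void note_variable(std::vector<MiniOid>& types, int paramno, MiniOid inferred)
{
    if (paramno <= 0) {
        throw std::out_of_range("parameter numbers start at 1");
    }
    if (static_cast<std::size_t>(paramno) > types.size()) {
        types.resize(paramno, kUnknownType);
    }
    assert(static_cast<std::size_t>(paramno) <= types.size());
    if (inferred == kUnknownType) {
        return;  // this reference taught us nothing new
    }
    MiniOid& slot = types[paramno - 1];
    if (slot == kUnknownType) {
        slot = inferred;
    } else if (slot != inferred) {
        throw std::invalid_argument("inconsistent types deduced for parameter");
    }
}

// Final check, in the spirit of check_variable_parameters():
// every parameter must have ended up with a known type.
bool all_resolved(const std::vector<MiniOid>& types)
{
    for (MiniOid t : types) {
        if (t == kUnknownType) {
            return false;
        }
    }
    return true;
}
```

Starting from an empty vector, noting $2 as kInt4Type grows the array to two entries and leaves $1 unresolved. Noting $1 as kTextType afterwards makes all_resolved() return true.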
parse_sub_analyze()
// Listing 5
// src/common/backend/parser/analyze.cpp
Query* parse_sub_analyze(Node* parseTree, ParseState* parentParseState, CommonTableExpr* parentCTE, bool locked_from_parent, bool resolve_unknowns)
{
    ParseState* pstate = make_parsestate(parentParseState);
    Query* query = NULL;
    pstate->p_parent_cte = parentCTE;
    pstate->p_locked_from_parent = locked_from_parent;
    pstate->p_resolve_unknowns = resolve_unknowns;
    if (u_sess->attr.attr_sql.td_compatible_truncation && u_sess->attr.attr_sql.sql_compatibility == C_FORMAT)
        set_subquery_is_under_insert(pstate); /* Set p_is_in_insert for parse state.*/
    query = transformStmt(pstate, parseTree);
    free_parsestate(pstate);
    return query;
}
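This function is the entry point for recursively analyzing sub-statements. Unlike the two functions above, it passes parentParseState to make_parsestate(), so the new state is linked to its parent and inherits the parent's source text and hooks (see Listing 2). It also calls transformStmt() directly rather than transformTopLevelStmt().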
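The recursion can be pictured as each nested statement getting its own state object that points back to the state of the enclosing statement. Below is a small sketch of that chaining. MiniStmt, MiniState, nesting_level(), and analyze() are hypothetical names that only model the parent links, and the sketch assumes a C++17 compiler because MiniStmt contains a std::vector of itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical statement node: a statement may contain sub-statements.
struct MiniStmt {
    std::vector<MiniStmt> subqueries;
};

// Hypothetical parse state that only remembers its parent.
struct MiniState {
    const MiniState* parent;
};

// Count how many parent links separate this state from the top level.
int nesting_level(const MiniState* state)
{
    assert(state != nullptr);
    int level = 0;
    while (state->parent != nullptr) {
        ++level;
        state = state->parent;
    }
    return level;
}

// Analyze a statement and, recursively, its sub-statements.
// Returns the deepest nesting level reached.
int analyze(const MiniStmt& stmt, const MiniState* parent)
{
    // Each call creates a fresh state linked to the caller's state,
    // in the way parse_sub_analyze() calls make_parsestate(parentParseState).
    MiniState state{parent};
    int deepest = nesting_level(&state);
    for (const MiniStmt& sub : stmt.subqueries) {
        int depth = analyze(sub, &state);
        if (depth > deepest) {
            deepest = depth;
        }
    }
    // The state lives on the stack here, so it is released on return,
    // which plays the role of free_parsestate() in Listing 5.
    return deepest;
}
```

Calling analyze(top, nullptr) on a statement that contains a subquery which itself contains a subquery walks two parent links at the innermost level.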
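Summary
All three functions are closely tied to semantic analysis. parse_analyze() is the entry point in the general case, parse_analyze_varparams() is the entry point when the types of $n parameters are to be inferred from context, and parse_sub_analyze() is the entry point for analyzing sub-statements.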