openGauss源码学习(三)SeqScan扫描算子
文章目录
前言
关于优化器、执行器的主要逻辑有很多文章和书籍进行说明,比如查询解析、查询重写、路径生成、计划生成和执行器及算子。所以系列博客还是准备对源码做较多的分析,计划后面的文章从一个算子的优化器到执行器进行详细的代码走读与分析。
那么这篇文章先对SeqScan的优化器部分进行分析,从行数估算、代价估算以及路径/计划生成做一个整体的介绍。下篇文章对执行器和存储引擎进行介绍,全面地了解一个表内的数据是如何读取并返回给客户端。
一、grouping_planner
生成物理路径的主要函数是grouping_planner,函数中会估算表的行数并根据代价选择更优的路径,以及添加排序、聚集算子等。在此之前还有很多逻辑重写的部分在subquery_planner这个函数内,比如子查询和子链接的提升,表达式预处理以及规则重写等等。逻辑重写的部分更多是基于理论进行重写,难点在于理解背后的原理,因此考虑后面单独从规则的维度对逻辑重写做分析和代码走读。
还是以一个简单的例子来说明:
CREATE TABLE t1(a INT, b INT);
INSERT INTO t1 SELECT generate_series(1,10) x;
ANALYZE t1;
SELECT * FROM t1 WHERE a=1;
grouping_planner中首先是处理了limit(可能会影响路径的选择,limit有时候会偏向于启动代价更小的路径),然后判断是否存在setop(UNION/EXCEPT/INTERSECT等)。当前SQL比较简单,先走读短路径代码,主要的两个函数分别是query_planner和choose_best_path,这两个函数分别负责路径生成和路径选择。
/*
* Generate pathlist by query_planner for final_rel and canonicalize
* all the pathkeys.
*/
final_rel = query_planner(root, sub_tlist, standard_qp_callback, &qp_extra);
...
/*
* Select the best path. If we are doing hashed grouping, we will
* always read all the input tuples, in addition cn gather permit on
* then use the cheapest-total path.
* Otherwise, trust query_planner's decision about which to use.
*/
best_path = choose_best_path((use_hashed_grouping || use_hashed_distinct ||
sorted_path == NULL || permit_gather(root)),
root, cheapest_path, sorted_path);
二、路径生成
query_planner负责路径生成,但里面也有很多优化逻辑,比如等价类的生成与推导,谓词下推和join的一些预处理等等。
make_one_rel函数才是生成基表物理路径的主要入口函数。
/*
* make_one_rel
* Finds all possible access paths for executing a query, returning a
* single rel that represents the join of all base rels in the query.
*/
RelOptInfo* make_one_rel(PlannerInfo* root, List* joinlist)
{
// 对基表进行行数估算,计算表大小和宽度
set_base_rel_sizes(root);
// 生成基表所有的扫描路径(顺序扫描、索引扫描、位图扫描等等)
set_base_rel_pathlists(root);
/*
* Generate access paths for the entire join tree.
*/
rel = make_rel_from_joinlist(root, joinlist);
return rel;
}
2.1 set_base_rel_sizes
set_rel_size函数实现了具体的估算逻辑,内部对多种类型的RTE进行处理,包括递归处理子查询类型RTE(RTE_SUBQUERY),以及其他类型的RTE(RTE_FUNCTION/RTE_RESULT等等)。
set_baserel_size_estimates函数负责设置基表的估算。
/*
* set_baserel_size_estimates
* Set the size estimates for the given base relation.
*
* The rel's targetlist and restrictinfo list must have been constructed
* already, and rel->tuples must be set.
*
* We set the following fields of the rel node:
* rows: the estimated number of output tuples (after applying
* restriction clauses).
* width: the estimated average output tuple width in bytes.
* baserestrictcost: estimated cost of evaluating baserestrictinfo clauses.
*/
void set_baserel_size_estimates(PlannerInfo* root, RelOptInfo* rel)
{
double nrows;
/* Should only be applied to base relations */
AssertEreport(
rel->relid > 0, MOD_OPT, "The relid is invalid when set the size estimates for the given base relation.");
if (root->glob->boundParams != NULL && root->glob->boundParams->uParamInfo != DEFUALT_INFO) {
root->glob->boundParams->params_lazy_bind = false;
}
// 根据条件的选择率估算基表的实际行数,选择率估算部分请见上一篇博客。
// rel->tuples对应pg_class系统表中reltuples字段,analyze时获取的统计信息,说明了表的行数。
// build_simple_rel中设置了表的基本信息。
// 对应例子中的场景,rel->tupels为10,条件选择率使用mcv值计算得到结果为0.1,实际行数为1行。
nrows = rel->tuples * clauselist_selectivity(root, rel->baserestrictinfo, 0, JOIN_INNER, NULL);
if (root->glob->boundParams != NULL && root->glob->boundParams->uParamInfo != DEFUALT_INFO) {
root->glob->boundParams->params_lazy_bind = true;
}
rel->rows = clamp_row_est(nrows);
// 计算表达式代价
cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo, root);
set_rel_width(root, rel);
}
2.2 create_seqscan_path
调用关系:
make_one_rel
set_base_rel_pathlists
set_rel_pathlist
SetRelationPath
create_seqscan_path
/*
* create_seqscan_path
* Creates a path corresponding to a sequential scan, returning the
* pathnode.
*/
Path* create_seqscan_path(PlannerInfo* root, RelOptInfo* rel, Relids required_outer, int dop)
{
Path* pathnode = makeNode(Path);
pathnode->pathtype = T_SeqScan;
pathnode->parent = rel;
pathnode->pathtarget = rel->reltarget;
// 参数信息,NestedLoop的外表数据就是参数的一种
pathnode->param_info = get_baserel_parampathinfo(root, rel, required_outer);
pathnode->pathkeys = NIL; /* seqscan has unordered result */
// SMP并行度,非SMP场景为1
pathnode->dop = dop;
// 分布式遗留代码,可以忽略,SMP场景也不会使用
pathnode->exec_type = SetBasePathExectype(root, rel);
// SMP并行扫描SeqScan,后面smp场景再展开说明
#ifdef STREAMPLAN
...
#endif
RangeTblEntry* rte = planner_rt_fetch(rel->relid, root);
if (NULL == rte->tablesample) {
// cost部分不展开说明代码了,第三部分对SeqScan代价模型做一个简单的描述。
// 代价还包括其他部分,比如targetlist投影的代价(add_tlist_costs_to_plan)等。
// 但这部分是公共的,和实际的物理路径无关,这里就先不做额外的说明。
cost_seqscan(pathnode, root, rel, pathnode->param_info);
} else {
AssertEreport(rte->rtekind == RTE_RELATION, MOD_OPT_JOIN, "Rel should be base relation");
cost_samplescan(pathnode, root, rel, pathnode->param_info);
}
return pathnode;
}
三、路径选择
代价分为启动代价和总代价两部分,启动代价说明获取第一条数据的代价,而总代价则是整个算子全部执行完毕的代价。
启动代价在LIMIT场景更有意义,而大部分场景的 选择还是基于total_cost。
- startup_cost = qpqual_cost.startup;
- total_cost = page_cost * page_number + (cpu_tuple_cost + qpqual_cost.per_tuple) * tuple_number
page_cost * page_number
表达式的含义是I/O读取硬盘中page的总代价,page_cost是通过guc参数seq_page_cost配置的。(cpu_tuple_cost + qpqual_cost.per_tuple) * tuple_number
代表着处理对CPU所有tuple代价的估算。
优化器选择路径时,会根据代价对两条路径做决策,选取代价更低的路径,这也就是为什么数据量大的情况下一般选择SeqScan,而数据集较小的情况下选择IndexScan的原因。
优化器是通过add_path决定路径是否有优势需要添加,choose_best_path选择是排序还是非排序路径。
void add_path(PlannerInfo* root, RelOptInfo* parent_rel, Path* new_path)
{
...
for (p1 = list_head(parent_rel->pathlist); p1 != NULL; p1 = p1_next) {
Path* old_path = (Path*)lfirst(p1);
...
/*
* Do a fuzzy cost comparison with 1% fuzziness limit. (XXX does this
* percentage need to be user-configurable?)
*/
// FUZZY_FACTOR的值为1.01,只有当新路径代价 > 旧路径代价*FUZZY_FACTOR时,才认为新路径更优
// 代价比较规则如下:
// 1. 比较total_cost,total_cost更优的情况下,路径启动代价更小则认为该路径更优。
// 否则认为两条路径各有优势,返回COSTS_DIFFERENT
// 2. total_cost在fuzzily情况下代价相同,启动代价更小的路径认为更优
// 3. total_cost和startup_cost都几乎相等,那么认为两条路径代价相同。
costcmp = compare_path_costs_fuzzily(new_path, old_path, FUZZY_FACTOR);
/*
* If the two paths compare differently for startup and total cost,
* then we want to keep both, and we can skip comparing pathkeys and
* required_outer rels. If they compare the same, proceed with the
* other comparisons. Row count is checked last. (We make the tests
* in this order because the cost comparison is most likely to turn
* out "different", and the pathkeys comparison next most likely. As
* explained above, row count very seldom makes a difference, so even
* though it's cheap to compare there's not much point in checking it
* earlier.)
*/
if (costcmp != COSTS_DIFFERENT) {
// 比较排序的key,如果一个key是另一个key的子集,那么认为母集的key更优。
keyscmp = compare_pathkeys(new_path_pathkeys, old_path_pathkeys);
if (keyscmp != PATHKEYS_DIFFERENT) {
switch (costcmp) {
case COSTS_EQUAL:
// 比较两条路径的参数信息
outercmp = bms_subset_compare(PATH_REQ_OUTER(new_path), PATH_REQ_OUTER(old_path));
if (keyscmp == PATHKEYS_BETTER1) {
if ((outercmp == BMS_EQUAL || outercmp == BMS_SUBSET1) && new_path->rows <= old_path->rows)
remove_old = true; /* new dominates old */
} else if (keyscmp == PATHKEYS_BETTER2) {
if ((outercmp == BMS_EQUAL || outercmp == BMS_SUBSET2) && new_path->rows >= old_path->rows)
accept_new = false; /* old dominates new */
} else {
if (outercmp == BMS_EQUAL) {
/*
* Same pathkeys and outer rels, and fuzzily
* the same cost, so keep just one; to decide
* which, first check rows and then do a fuzzy
* cost comparison with very small fuzz limit.
* (We used to do an exact cost comparison,
* but that results in annoying
* platform-specific plan variations due to
* roundoff in the cost estimates.) If things
* are still tied, arbitrarily keep only the
* old path. Notice that we will keep only
* the old path even if the less-fuzzy
* comparison decides the startup and total
* costs compare differently.
*/
if (new_path->rows < old_path->rows)
// 新路径行数更少,代价相同的情况下选取新路径。
remove_old = true; /* new dominates old */
else if (new_path->rows > old_path->rows)
accept_new = false; /* old dominates new */
else {
// #define SMALL_FUZZY_FACTOR 1.0000000001
// 两条路径的代价和pathkey都相近,无法选出更优的路径,尝试更精准地比较代价。
small_fuzzy_factor_is_used = true;
if (compare_path_costs_fuzzily(new_path, old_path, SMALL_FUZZY_FACTOR) ==
COSTS_BETTER1)
remove_old = true; /* new dominates old */
else
accept_new = false; /* old equals or
* dominates new */
}
} else if (outercmp == BMS_SUBSET1 && new_path->rows <= old_path->rows)
remove_old = true; /* new dominates old */
else if (outercmp == BMS_SUBSET2 && new_path->rows >= old_path->rows)
accept_new = false; /* old dominates new */
/* else different parameterizations, keep both */
}
break;
// COSTS_BETTER说明存在更优的路径,那么比较参数并在参数和行数更优的情况下,选取该路径。
case COSTS_BETTER1:
if (keyscmp != PATHKEYS_BETTER2) {
outercmp = bms_subset_compare(PATH_REQ_OUTER(new_path), PATH_REQ_OUTER(old_path));
if ((outercmp == BMS_EQUAL || outercmp == BMS_SUBSET1) && new_path->rows <= old_path->rows)
remove_old = true; /* new dominates old */
}
break;
case COSTS_BETTER2:
if (keyscmp != PATHKEYS_BETTER1) {
outercmp = bms_subset_compare(PATH_REQ_OUTER(new_path), PATH_REQ_OUTER(old_path));
if ((outercmp == BMS_EQUAL || outercmp == BMS_SUBSET2) && new_path->rows >= old_path->rows)
accept_new = false; /* old dominates new */
}
break;
default:
break;
}
}
}
...
}
...
}
总结
以上就是一条简单SQLSELECT * FROM t1;
优化器部分的路径生成代码逻辑说明,由于简单查询情况下查询重写和计划生成都不涉及太多额外的处理,所以该SQL主要的优化器逻辑就集中在路径生成部分。
总的来说,SeqScan没有太多的额外处理,仅涉及行数估算和代价估算。博文中也只对主要流程做了简单说明,很多具体的细节或者问题分析还是需要更深入地阅读代码和分析。
下一篇博文准备从执行器的角度介绍一下SeqScan是如何执行的,感谢大家的阅读。