1.1.1.1 嵌套循环的代价:final_cost_nestloop
函数功能:
计算两个关系使用嵌套循环连接算法的花费。内外表的元组数决定着CPU的花费。
对于nestloop花费的计算,分为两个函数:
1. 一是initial_cost_ nestloop函数,初步估计nestloop算法的花费,形成的结果会作为final_cost_nestloop的入口参数(JoinCostWorkspace *workspace)传入final_cost_nestloop函数;
2. 二是final_cost_nestloop函数,对nestloop算法的花费进行计算,区分了半连接、反半连接和其他连接类型的差别。
代码分析:
void
final_cost_nestloop(PlannerInfo *root, NestPath *path,
JoinCostWorkspace *workspace,
SpecialJoinInfo *sjinfo,
SemiAntiJoinFactors *semifactors)
{......
//半连接或反半连接,其连接后的元组数与其他连接方式不同,表现在选择率的计算上
if (path->jointype == JOIN_SEMI || path->jointype == JOIN_ANTI)
{......
//半连接或反半连接,在遇到第一个满足条件的,即停止扫描,所以,需要考虑选择率(在initial_cost_nestloop函数中考虑选择率)
ntuples = outer_matched_rows * inner_path_rows * inner_scan_frac;
/*
* For unmatched outer-rel rows, there are two cases. If the inner
* path is an indexscan using all the joinquals as indexquals, then an
* unmatched row results in an indexscan returning no rows, which is
* probably quite cheap. We estimate this case as the same cost to
* return the first tuple of a nonempty scan. Otherwise, the executor
* will have to scan the whole inner rel; not so cheap.
*/
if (has_indexed_join_quals(path))//对于半连接和反半连接考虑索引造成的影响
{
run_cost += (outer_path_rows - outer_matched_rows) *
inner_rescan_run_cost / inner_path_rows;
}
else
{
run_cost += (outer_path_rows - outer_matched_rows) *
inner_rescan_run_cost;
ntuples += (outer_path_rows - outer_matched_rows) * inner_path_rows;
}
}
else //內连接,外表和内表的元组都要进行连接
{ //对于半连接和反半连接考虑了索引造成的影响;对于內连接也应该考虑,但PostgreSQL没有考虑索引对非半连接和非反半连接的其他连接造成的影响
ntuples = outer_path_rows * inner_path_rows;
}
//计算CPU花费
cost_qual_ev
startup_cost += restrict_qual_cost.startup;
cpu_per_tuple = cpu_tuple_cost + restrict_qual_cost.per_tuple;
run_cost += cpu_per_tuple * ntuples; //内外表的元组数决定着CPU的花费
path->path.startup_cost = startup_cost;
path->path.total_cost = startup_cost + run_cost;
}
调用关系图解:
1. final_cost_nestloop函数被create_nestloop_path函数调用,直至被add_paths_to_joinrel函数调用,完成两表连接的nestloop连接算法的花费估算。