【查询优化——生成路径】
查询优化的整个过程可以分为预处理,生成路径和生成计划三个阶段。
关系数据库 特色是支持表的连接。但是表的连接结果会因连接顺序的不同而不同,所以同一组表的连接结果可能有多个。表的连接在逻辑上,我们可以看做是一棵连接树,每一棵连接树在PostgreSQL中都被称作是一条路径。因为同一组表的连接结果会有多条路径,所以查询优化的工作是:从一系列等效路径中选取效率最高的路径并形成执行计划 。
query_planner函数
生成路径的工作是由query_planner函数来完成的。为一个基本的查询(可能涉及连接)生成访问路径。
\src\backend\optimizer\plan\Planmain.c
/*
* query_planner
* Generate a path (that is, a simplified plan) for a basic query,
* which may involve joins but not any fancier features.
*
* Since query_planner does not handle the toplevel processing (grouping,
* sorting, etc) it cannot select the best path by itself. Instead, it
* returns the RelOptInfo for the top level of joining, and the caller
* (grouping_planner) can choose among the surviving paths for the rel.
*
* root describes the query to plan
* tlist is the target list the query should produce
* (this is NOT necessarily root->parse->targetList!)
* qp_callback is a function to compute query_pathkeys once it's safe to do so
* qp_extra is optional extra data to pass to qp_callback
*
* Note: the PlannerInfo node also includes a query_pathkeys field, which
* tells query_planner the sort order that is desired in the final output
* plan. This value is *not* available at call time, but is computed by
* qp_callback once we have completed merging the query's equivalence classes.
* (We cannot construct canonical pathkeys until that's done.)
*/
RelOptInfo *
query_planner(PlannerInfo *root, List *tlist,
query_pathkeys_callback qp_callback, void *qp_extra)
RelOptInfo *
query_planner(PlannerInfo *root, List *tlist,
query_pathkeys_callback qp_callback, void *qp_extra)
{// #lizard forgives
Query *parse = root->parse;
List *joinlist;
RelOptInfo *final_rel;
Index rti;
double total_pages;
/*
* If the query has an empty join tree, then it's something easy like
* "SELECT 2+2;" or "INSERT ... VALUES()" or "INSERT .. ON CONFLICT DO UPDATE ..". Fall through quickly.
*/
if (parse->jointree->fromlist == NIL)
{
/* We need a dummy joinrel to describe the empty set of baserels */
final_rel = build_empty_join_rel(root);
/*
* If query allows parallelism in general, check whether the quals are
* parallel-restricted. (We need not check final_rel->reltarget
* because it's empty at this point. Anything parallel-restricted in
* the query tlist will be dealt with later.)
*/
if (root->glob->parallelModeOK)
final_rel->consider_parallel =
is_parallel_safe(root, parse->jointree->quals);
/* The only path for it is a trivial Result path */
add_path(final_rel, (Path *)
create_result_path(root, final_rel,
final_rel->reltarget,
(List *) parse->jointree->quals));
/* Select cheapest path (pretty easy in this case...) */
set_cheapest(final_rel);
/*
* We still are required to call qp_callback, in case it's something
* like "SELECT 2+2 ORDER BY 1".
*/
root->canon_pathkeys = NIL;
(*qp_callback) (root, qp_extra);
return final_rel;
}
/*
* Init planner lists to empty.
*
* NOTE: append_rel_list was set up by subquery_planner, so do not touch
* here.
*/
root->join_rel_list = NIL;
root->join_rel_hash = NULL;
root->join_rel_level = NULL;
root->join_cur_level = 0;
root->canon_pathkeys = NIL;
root->left_join_clauses = NIL;
root->right_join_clauses = NIL;
root->full_join_clauses = NIL;
root->join_info_list = NIL;
root->placeholder_list = NIL;
root->fkey_list = NIL;
root->initial_rels = NIL;
/*
* Make a flattened version of the rangetable for faster access (this is
* OK because the rangetable won't change any more), and set up an empty
* array for indexing base relations.
*/
setup_simple_rel_arrays(root);
/*
* Construct RelOptInfo nodes for all base relations in query, and
* indirectly for all appendrel member relations ("other rels"). This
* will give us a RelOptInfo for every "simple" (non-join) rel involved in
* the query.
*
* Note: the reason we find the rels by searching the jointree and
* appendrel list, rather than just scanning the rangetable, is that the
* rangetable may contain RTEs for rels not actively part of the query,
* for example views. We don't want to make RelOptInfos for them.
*/
add_base_rels_to_query(root, (Node *) parse->jointree);
/*
* Examine the targetlist and join tree, adding entries to baserel
* targetlists for all referenced Vars, and generating PlaceHolderInfo
* entries for all referenced PlaceHolderVars. Restrict and join clauses
* are added to appropriate lists belonging to the mentioned relations. We
* also build EquivalenceClasses for provably equivalent expressions. The
* SpecialJoinInfo list is also built to hold information about join order
* restrictions. Finally, we form a target joinlist for make_one_rel() to
* work from.
*/
build_base_rel_tlists(root, tlist);
find_placeholders_in_jointree(root);
find_lateral_references(root);
joinlist = deconstruct_jointree(root);
/*
* Reconsider any postponed outer-join quals now that we have built up
* equivalence classes. (This could result in further additions or
* mergings of classes.)
*/
reconsider_outer_join_clauses(root);
/*
* If we formed any equivalence classes, generate additional restriction
* clauses as appropriate. (Implied join clauses are formed on-the-fly
* later.)
*/
generate_base_implied_equalities(root);
/*
* We have completed merging equivalence sets, so it's now possible to
* generate pathkeys in canonical form; so compute query_pathkeys and
* other pathkeys fields in PlannerInfo.
*/
(*qp_callback) (root, qp_extra);
/*
* Examine any "placeholder" expressions generated during subquery pullup.
* Make sure that the Vars they need are marked as needed at the relevant
* join level. This must be done before join removal because it might
* cause Vars or placeholders to be needed above a join when they weren't
* so marked before.
*/
fix_placeholder_input_needed_levels(root);
/*
* Remove any useless outer joins. Ideally this would be done during
* jointree preprocessing, but the necessary information isn't available
* until we've built baserel data structures and classified qual clauses.
*/
joinlist = remove_useless_joins(root, joinlist);
/*
* Also, reduce any semijoins with unique inner rels to plain inner joins.
* Likewise, this can't be done until now for lack of needed info.
*/
reduce_unique_semijoins(root);
/*
* Now distribute "placeholders" to base rels as needed. This has to be
* done after join removal because removal could change whether a
* placeholder is evaluable at a base rel.
*/
add_placeholders_to_base_rels(root);
/*
* Construct the lateral reference sets now that we have finalized
* PlaceHolderVar eval levels.
*/
create_lateral_join_info(root);
/*
* Match foreign keys to equivalence classes and join quals. This must be
* done after finalizing equivalence classes, and it's useful to wait till
* after join removal so that we can skip processing foreign keys
* involving removed relations.
*/
match_foreign_keys_to_quals(root);
/*
* Look for join OR clauses that we can extract single-relation
* restriction OR clauses from.
*/
extract_restriction_or_clauses(root);
/*
* We should now have size estimates for every actual table involved in
* the query, and we also know which if any have been deleted from the
* query by join removal; so we can compute total_table_pages.
*
* Note that appendrels are not double-counted here, even though we don't
* bother to distinguish RelOptInfos for appendrel parents, because the
* parents will still have size zero.
*
* XXX if a table is self-joined, we will count it once per appearance,
* which perhaps is the wrong thing ... but that's not completely clear,
* and detecting self-joins here is difficult, so ignore it for now.
*/
total_pages = 0;
for (rti = 1; rti < root->simple_rel_array_size; rti++)
{
RelOptInfo *brel = root->simple_rel_array[rti];
if (brel == NULL)
continue;
Assert(brel->relid == rti); /* sanity check on array */
if (IS_SIMPLE_REL(brel))
total_pages += (double) brel->pages;
}
root->total_table_pages = total_pages;
/*
* Ready to do the primary planning.
*/
final_rel = make_one_rel(root, joinlist);
/* Check that we got at least one usable path */
if (!final_rel || !final_rel->cheapest_total_path ||
final_rel->cheapest_total_path->param_info != NULL)
elog(ERROR, "failed to construct the join relation");
return final_rel;
}
/*****************************************************************************
*
* JOIN TREE PROCESSING
*
*****************************************************************************/
/*
* deconstruct_jointree
* Recursively scan the query's join tree for WHERE and JOIN/ON qual
* clauses, and add these to the appropriate restrictinfo and joininfo
* lists belonging to base RelOptInfos. Also, add SpecialJoinInfo nodes
* to root->join_info_list for any outer joins appearing in the query tree.
* Return a "joinlist" data structure showing the join order decisions
* that need to be made by make_one_rel().
*
* The "joinlist" result is a list of items that are either RangeTblRef
* jointree nodes or sub-joinlists. All the items at the same level of
* joinlist must be joined in an order to be determined by make_one_rel()
* (note that legal orders may be constrained by SpecialJoinInfo nodes).
* A sub-joinlist represents a subproblem to be planned separately. Currently
* sub-joinlists arise only from FULL OUTER JOIN or when collapsing of
* subproblems is stopped by join_collapse_limit or from_collapse_limit.
*
* NOTE: when dealing with inner joins, it is appropriate to let a qual clause
* be evaluated at the lowest level where all the variables it mentions are
* available. However, we cannot push a qual down into the nullable side(s)
* of an outer join since the qual might eliminate matching rows and cause a
* NULL row to be incorrectly emitted by the join. Therefore, we artificially
* OR the minimum-relids of such an outer join into the required_relids of
* clauses appearing above it. This forces those clauses to be delayed until
* application of the outer join (or maybe even higher in the join tree).
*/
List *
deconstruct_jointree(PlannerInfo *root)
{
List *result;
Relids qualscope;
Relids inner_join_rels;
List *postponed_qual_list = NIL;
/* Start recursion at top of jointree */
Assert(root->parse->jointree != NULL &&
IsA(root->parse->jointree, FromExpr));
/* this is filled as we scan the jointree */
root->nullable_baserels = NULL;
result = deconstruct_recurse(root, (Node *) root->parse->jointree, false,
&qualscope, &inner_join_rels,
&postponed_qual_list);
/* Shouldn't be any leftover quals */
Assert(postponed_qual_list == NIL);
return result;
}
deconstruct_jointree处理完语句,语句中的表从树状结构转换成了数组结构,除了内 连接,其他类型的连接关系被记录到 Specia!Joinlnfo 结构体里 ,同时被记录到 Spec a!Joinlnfo
/*
* reconsider_outer_join_clauses
* Re-examine any outer-join clauses that were set aside by
* distribute_qual_to_rels(), and see if we can derive any
* EquivalenceClasses from them. Then, if they were not made
* redundant, push them out into the regular join-clause lists.
*
* When we have mergejoinable clauses A = B that are outer-join clauses,
* we can't blindly combine them with other clauses A = C to deduce B = C,
* since in fact the "equality" A = B won't necessarily hold above the
* outer join (one of the variables might be NULL instead). Nonetheless
* there are cases where we can add qual clauses using transitivity.
*
* One case that we look for here is an outer-join clause OUTERVAR = INNERVAR
* for which there is also an equivalence clause OUTERVAR = CONSTANT.
* It is safe and useful to push a clause INNERVAR = CONSTANT into the
* evaluation of the inner (nullable) relation, because any inner rows not
* meeting this condition will not contribute to the outer-join result anyway.
* (Any outer rows they could join to will be eliminated by the pushed-down
* equivalence clause.)
*
* Note that the above rule does not work for full outer joins; nor is it
* very interesting to consider cases where the generated equivalence clause
* would involve relations outside the outer join, since such clauses couldn't
* be pushed into the inner side's scan anyway. So the restriction to
* outervar = pseudoconstant is not really giving up anything.
*
* For full-join cases, we can only do something useful if it's a FULL JOIN
* USING and a merged column has an equivalence MERGEDVAR = CONSTANT.
* By the time it gets here, the merged column will look like
* COALESCE(LEFTVAR, RIGHTVAR)
* and we will have a full-join clause LEFTVAR = RIGHTVAR that we can match
* the COALESCE expression to. In this situation we can push LEFTVAR = CONSTANT
* and RIGHTVAR = CONSTANT into the input relations, since any rows not
* meeting these conditions cannot contribute to the join result.
*
* Again, there isn't any traction to be gained by trying to deal with
* clauses comparing a mergedvar to a non-pseudoconstant. So we can make
* use of the EquivalenceClasses to search for matching variables that were
* equivalenced to constants. The interesting outer-join clauses were
* accumulated for us by distribute_qual_to_rels.
*
* When we find one of these cases, we implement the changes we want by
* generating a new equivalence clause INNERVAR = CONSTANT (or LEFTVAR, etc)
* and pushing it into the EquivalenceClass structures. This is because we
* may already know that INNERVAR is equivalenced to some other var(s), and
* we'd like the constant to propagate to them too. Note that it would be
* unsafe to merge any existing EC for INNERVAR with the OUTERVAR's EC ---
* that could result in propagating constant restrictions from
* INNERVAR to OUTERVAR, which would be very wrong.
*
* It's possible that the INNERVAR is also an OUTERVAR for some other
* outer-join clause, in which case the process can be repeated. So we repeat
* looping over the lists of clauses until no further deductions can be made.
* Whenever we do make a deduction, we remove the generating clause from the
* lists, since we don't want to make the same deduction twice.
*
* If we don't find any match for a set-aside outer join clause, we must
* throw it back into the regular joinclause processing by passing it to
* distribute_restrictinfo_to_rels(). If we do generate a derived clause,
* however, the outer-join clause is redundant. We still throw it back,
* because otherwise the join will be seen as a clauseless join and avoided
* during join order searching; but we mark it as redundant to keep from
* messing up the joinrel's size estimate. (This behavior means that the
* API for this routine is uselessly complex: we could have just put all
* the clauses into the regular processing initially. We keep it because
* someday we might want to do something else, such as inserting "dummy"
* joinclauses instead of real ones.)
*
* Outer join clauses that are marked outerjoin_delayed are special: this
* condition means that one or both VARs might go to null due to a lower
* outer join. We can still push a constant through the clause, but only
* if its operator is strict; and we *have to* throw the clause back into
* regular joinclause processing. By keeping the strict join clause,
* we ensure that any null-extended rows that are mistakenly generated due
* to suppressing rows not matching the constant will be rejected at the
* upper outer join. (This doesn't work for full-join clauses.)
*/
void
reconsider_outer_join_clauses(PlannerInfo *root)
make_one_rel函数调用的—— make_rel_from_joinlist函数
make_rel_from_joinlist函数会将传入参数joinlist中的基本关系连接起来生成最终的连接关系并且为最终的连接关系建立RelOptInfo结构,其pathlist字段就是最终路径组成的链表。该函数的实质就是选择不同的连接方式作为中间节点,选择不同的连接顺序将代表基本关系的叶子结点连接成一棵树,使其代价最小。
\src\backend\optimizer\path\Allpaths.c
/*
* make_rel_from_joinlist
* Build access paths using a "joinlist" to guide the join path search.
*
* See comments for deconstruct_jointree() for definition of the joinlist
* data structure.
*/
static RelOptInfo *
make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
{// #lizard forgives
int levels_needed;
List *initial_rels;
ListCell *jl;
/*
* Count the number of child joinlist nodes. This is the depth of the
* dynamic-programming algorithm we must employ to consider all ways of
* joining the child nodes.
*/
levels_needed = list_length(joinlist);
if (levels_needed <= 0)
return NULL; /* nothing to do? */
/*
* Construct a list of rels corresponding to the child joinlist nodes.
* This may contain both base rels and rels constructed according to
* sub-joinlists.
*/
initial_rels = NIL;
foreach(jl, joinlist)
{
Node *jlnode = (Node *) lfirst(jl);
RelOptInfo *thisrel;
if (IsA(jlnode, RangeTblRef))
{
int varno = ((RangeTblRef *) jlnode)->rtindex;
thisrel = find_base_rel(root, varno);
}
else if (IsA(jlnode, List))
{
/* Recurse to handle subproblem */
thisrel = make_rel_from_joinlist(root, (List *) jlnode);
}
else
{
elog(ERROR, "unrecognized joinlist node type: %d",
(int) nodeTag(jlnode));
thisrel = NULL; /* keep compiler quiet */
}
initial_rels = lappend(initial_rels, thisrel);
}
if (levels_needed == 1)
{
/*
* Single joinlist node, so we're done.
*/
return (RelOptInfo *) linitial(initial_rels);
}
else
{
/*
* Consider the different orders in which we could join the rels,
* using a plugin, GEQO, or the regular join search code.
*
* We put the initial_rels list into a PlannerInfo field because
* has_legal_joinclause() needs to look at it (ugly :-().
*/
root->initial_rels = initial_rels;
if (join_search_hook)
return (*join_search_hook) (root, levels_needed, initial_rels);
else if (enable_geqo && levels_needed >= geqo_threshold)
return geqo(root, levels_needed, initial_rels);
else
return standard_join_search(root, levels_needed, initial_rels);
}
}
【路径生成算法】
动态规划算法
沿用 PostgreSQL ,通常使用动态规划来获得最优的路径的
动态规划算法
(1)初始
初始状态下,为每一个待连接的 baserel生成基本关系访问路径,选出最优路径。这些关系称为第1层中间关系。把所有的由n个关系连接生成的中间关系称为第n 层关系。
(2)归纳
已知第1 ~n- 1层的路径,用下列方法生成第n层的关系(n >1):将第n-1层的关系分别与每个第1层的关系连接,并在各自最优路径的基础上,生成使用不同连接方法的路径。若n>3,将第n -2层的关系分别与每个第2层的关系连接,n-3层的关系分别与每个第3层的关系连接,并在各自最优路径的基础上,生成使用不同连接方法的路径,并依此类推。生成第n层关系后,选出每个第n层关系的最优路径作为结果。
(3)生成第n层关系后选出每个第n层关系的最优路径作为结果
/*
* standard_join_search
* Find possible joinpaths for a query by successively finding ways
* to join component relations into join relations.
*
* 'levels_needed' is the number of iterations needed, ie, the number of
* independent jointree items in the query. This is > 1.
*
* 'initial_rels' is a list of RelOptInfo nodes for each independent
* jointree item. These are the components to be joined together.
* Note that levels_needed == list_length(initial_rels).
*
* Returns the final level of join relations, i.e., the relation that is
* the result of joining all the original relations together.
* At least one implementation path must be provided for this relation and
* all required sub-relations.
*
* To support loadable plugins that modify planner behavior by changing the
* join searching algorithm, we provide a hook variable that lets a plugin
* replace or supplement this function. Any such hook must return the same
* final join relation as the standard code would, but it might have a
* different set of implementation paths attached, and only the sub-joinrels
* needed for these paths need have been instantiated.
*
* Note to plugin authors: the functions invoked during standard_join_search()
* modify root->join_rel_list and root->join_rel_hash. If you want to do more
* than one join-order search, you'll probably need to save and restore the
* original states of those data structures. See geqo_eval() for an example.
*/
RelOptInfo *
standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels){....../*
* We employ a simple "dynamic programming" algorithm: we first find all
* ways to build joins of two jointree items, then all ways to build joins
* of three items (from two-item joins and single items), then four-item
* joins, and so on until we have considered all ways to join all the
* items into one rel.
*
* root->join_rel_level[j] is a list of all the j-item rels. Initially we
* set root->join_rel_level[1] to represent all the single-jointree-item
* relations.
*/......}
遗传算法
动态规划算法是生成路径的默认算法,但在有些情况下,检查一个查询所有可能的路径会花去很多的时间和内存空间,特别是所要执行的查询涉及大量的关系的时候。在这种情况下,为了在合理的时间里判断一个合理的执行计划,PostgreSQL将使用遗传算法生成路径。 启用遗传算法由两个参数决定: 系统参数是否启用遗传算法,以及需要连接的基本关系数量 超过使用遗传算法的参数阈值。遗传算法将路径作为个体,将个体以某种方式编码(个体体现了连接顺序),然后通过重组得到后代,考虑连接代价来计算后代的适应值,再选择合适的后代进行下一次迭代。当到达一定的迭代次数之后,遗传算法终止。选择遗传算法可以减少生成路径的时间,但是遗传算法并不一定能找到“最好”的规划,它只能找到相对较优的路径。
Allpaths 模块中有开关参数,控制是 否启用遗传算法 ,如果启用 enable_geqo ,且连接的表个数 等于或超过 参数 geqo_threshold ,就调用 geqo 录中 的遗传算法进行优化。
\src\backend\optimizer\path\Allpaths.c
static RelOptInfo *
make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
......
if (enable_geqo && levels_needed >= geqo_threshold)
return geqo(root, levels_needed, initial_rels);......
遗传算法的 steps:种群初始化: 对基因进行编码,井通过对基因进行随机的排列组合,生成多个染色体,这些染色体构成一个新的种群,另外,在生成染色体的过程中同时计算染色体的适应度。选择染色体:通过随机选择 (实际上通过基于概率的随机数生成算法 ,这样能倾向于选择出优秀的染色体〉,选择出用于交叉和变异的染色体。交叉操作:染色体进行交叉,产生新的染色体并加入种群。变异操作: 对染色体进行变异操作,产生新的染色体并加入种群。适应度计算: 对不良的染色体进行筛选。
/*
* geqo
* solution of the query optimization problem
* similar to a constrained Traveling Salesman Problem (TSP)
*/
RelOptInfo *
geqo(PlannerInfo *root, int number_of_rels, List *initial_rels)
{// #lizard forgives
GeqoPrivateData private;
int generation;
Chromosome *momma;
Chromosome *daddy;
Chromosome *kid;
Pool *pool;
int pool_size,
number_generations;
#ifdef GEQO_DEBUG
int status_interval;
#endif
Gene *best_tour;
RelOptInfo *best_rel;
#if defined(ERX)
Edge *edge_table; /* list of edges */
int edge_failures = 0;
#endif
#if defined(CX) || defined(PX) || defined(OX1) || defined(OX2)
City *city_table; /* list of cities */
#endif
#if defined(CX)
int cycle_diffs = 0;
int mutations = 0;
#endif
/* set up private information */
root->join_search_private = (void *) &private;
private.initial_rels = initial_rels;
/* initialize private number generator */
geqo_set_seed(root, Geqo_seed);
/* set GA parameters */
pool_size = gimme_pool_size(number_of_rels);
number_generations = gimme_number_generations(pool_size);
#ifdef GEQO_DEBUG
status_interval = 10;
#endif
/* allocate genetic pool memory */
pool = alloc_pool(root, pool_size, number_of_rels);
/* random initialization of the pool */
random_init_pool(root, pool);
/* sort the pool according to cheapest path as fitness */
sort_pool(root, pool); /* we have to do it only one time, since all
* kids replace the worst individuals in
* future (-> geqo_pool.c:spread_chromo ) */
#ifdef GEQO_DEBUG
elog(DEBUG1, "GEQO selected %d pool entries, best %.2f, worst %.2f",
pool_size,
pool->data[0].worth,
pool->data[pool_size - 1].worth);
#endif
/* allocate chromosome momma and daddy memory */
momma = alloc_chromo(root, pool->string_length);
daddy = alloc_chromo(root, pool->string_length);
#if defined (ERX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using edge recombination crossover [ERX]");
#endif
/* allocate edge table memory */
edge_table = alloc_edge_table(root, pool->string_length);
#elif defined(PMX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using partially matched crossover [PMX]");
#endif
/* allocate chromosome kid memory */
kid = alloc_chromo(root, pool->string_length);
#elif defined(CX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using cycle crossover [CX]");
#endif
/* allocate city table memory */
kid = alloc_chromo(root, pool->string_length);
city_table = alloc_city_table(root, pool->string_length);
#elif defined(PX)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using position crossover [PX]");
#endif
/* allocate city table memory */
kid = alloc_chromo(root, pool->string_length);
city_table = alloc_city_table(root, pool->string_length);
#elif defined(OX1)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using order crossover [OX1]");
#endif
/* allocate city table memory */
kid = alloc_chromo(root, pool->string_length);
city_table = alloc_city_table(root, pool->string_length);
#elif defined(OX2)
#ifdef GEQO_DEBUG
elog(DEBUG2, "using order crossover [OX2]");
#endif
/* allocate city table memory */
kid = alloc_chromo(root, pool->string_length);
city_table = alloc_city_table(root, pool->string_length);
#endif
/* my pain main part: */
/* iterative optimization */
for (generation = 0; generation < number_generations; generation++)
{
/* SELECTION: using linear bias function */
geqo_selection(root, momma, daddy, pool, Geqo_selection_bias);
#if defined (ERX)
/* EDGE RECOMBINATION CROSSOVER */
gimme_edge_table(root, momma->string, daddy->string, pool->string_length, edge_table);
kid = momma;
/* are there any edge failures ? */
edge_failures += gimme_tour(root, edge_table, kid->string, pool->string_length);
#elif defined(PMX)
/* PARTIALLY MATCHED CROSSOVER */
pmx(root, momma->string, daddy->string, kid->string, pool->string_length);
#elif defined(CX)
/* CYCLE CROSSOVER */
cycle_diffs = cx(root, momma->string, daddy->string, kid->string, pool->string_length, city_table);
/* mutate the child */
if (cycle_diffs == 0)
{
mutations++;
geqo_mutation(root, kid->string, pool->string_length);
}
#elif defined(PX)
/* POSITION CROSSOVER */
px(root, momma->string, daddy->string, kid->string, pool->string_length, city_table);
#elif defined(OX1)
/* ORDER CROSSOVER */
ox1(root, momma->string, daddy->string, kid->string, pool->string_length, city_table);
#elif defined(OX2)
/* ORDER CROSSOVER */
ox2(root, momma->string, daddy->string, kid->string, pool->string_length, city_table);
#endif
/* EVALUATE FITNESS */
kid->worth = geqo_eval(root, kid->string, pool->string_length);
/* push the kid into the wilderness of life according to its worth */
spread_chromo(root, kid, pool);
#ifdef GEQO_DEBUG
if (status_interval && !(generation % status_interval))
print_gen(stdout, pool, generation);
#endif
}
#if defined(ERX) && defined(GEQO_DEBUG)
if (edge_failures != 0)
elog(LOG, "[GEQO] failures: %d, average: %d",
edge_failures, (int) number_generations / edge_failures);
else
elog(LOG, "[GEQO] no edge failures detected");
#endif
#if defined(CX) && defined(GEQO_DEBUG)
if (mutations != 0)
elog(LOG, "[GEQO] mutations: %d, generations: %d",
mutations, number_generations);
else
elog(LOG, "[GEQO] no mutations processed");
#endif
#ifdef GEQO_DEBUG
print_pool(stdout, pool, 0, pool_size - 1);
#endif
#ifdef GEQO_DEBUG
elog(DEBUG1, "GEQO best is %.2f after %d generations",
pool->data[0].worth, number_generations);
#endif
/*
* got the cheapest query tree processed by geqo; first element of the
* population indicates the best query tree
*/
best_tour = (Gene *) pool->data[0].string;
best_rel = gimme_tree(root, best_tour, pool->string_length);
if (best_rel == NULL)
elog(ERROR, "geqo failed to make a valid plan");
/* DBG: show the query plan */
#ifdef NOT_USED
print_plan(best_plan, root);
#endif
/* ... free memory stuff */
free_chromo(root, momma);
free_chromo(root, daddy);
#if defined (ERX)
free_edge_table(root, edge_table);
#elif defined(PMX)
free_chromo(root, kid);
#elif defined(CX)
free_chromo(root, kid);
free_city_table(root, city_table);
#elif defined(PX)
free_chromo(root, kid);
free_city_table(root, city_table);
#elif defined(OX1)
free_chromo(root, kid);
free_city_table(root, city_table);
#elif defined(OX2)
free_chromo(root, kid);
free_city_table(root, city_table);
#endif
free_pool(root, pool);
/* ... clear root pointer to our private storage */
root->join_search_private = NULL;
return best_rel;
}