PostgreSQL 源码解读(27)- 查询语句#12(查询优化-上拉子链接#2)

本文深入探讨了PostgreSQL查询优化中的子链接上拉策略,特别是针对ANY_SUBLINK和EXISTS_SUBLINK的优化。通过实例展示了子链接上拉如何提升性能,将查询转换为半连接或反半连接操作,并提供了源码解读、数据结构分析及gdb跟踪测试的详细步骤。
摘要由CSDN通过智能技术生成

本文简单介绍了PG查询逻辑优化中的子查询链接(subLink),以ANY子链接为例介绍了子查询链接上拉主函数处理逻辑以及使用gdb跟踪分析。

一、子链接上拉简介

PG尝试对ANY_SUBLINK和EXISTS_SUBLINK两种类型的子链接尝试提升,ANY_SUBLINK表示IN/ANY子句,EXISTS_SUBLINK表示EXISTS子句.子链接上拉(pull up)的目的是为了提升性能,把查询一条记录对比一条记录的逻辑实现变换为半连接或反半连接实现.
上拉样例:

-- 原始SQL
select * 
from t_dwxx a
where dwbh > any (select b.dwbh from t_grxx b);
-- 上拉后的SQL(示意,实际不能执行)
select a.* 
from t_dwxx a semi join select b.dwbh from t_grxx b on a.dwbh > pullup.dwbh;

执行了子链接上拉后的执行计划:

testdb=# explain select * 
testdb-# from t_dwxx a
testdb-# where dwbh > any (select b.dwbh from t_grxx b);
                               QUERY PLAN                               
------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.00..815.76 rows=53 width=474)
   Join Filter: ((a.dwbh)::text > (b.dwbh)::text)
   ->  Seq Scan on t_dwxx a  (cost=0.00..11.60 rows=160 width=474)
   ->  Materialize  (cost=0.00..17.35 rows=490 width=38)
         ->  Seq Scan on t_grxx b  (cost=0.00..14.90 rows=490 width=38)
(5 rows)

没有上拉的执行计划:

testdb=# explain select * 
testdb-# from t_dwxx a
testdb-# where dwbh > any (select b.dwbh from t_grxx b where a.dwbh = b.dwbh);
                            QUERY PLAN                            
------------------------------------------------------------------
 Seq Scan on t_dwxx a  (cost=0.00..1302.40 rows=80 width=474)
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Seq Scan on t_grxx b  (cost=0.00..16.12 rows=2 width=38)
           Filter: ((a.dwbh)::text = (dwbh)::text)
(5 rows)

二、源码解读

pull_up_sublinks

 /*
  * pull_up_sublinks
  *      Attempt to pull up ANY and EXISTS SubLinks to be treated as
  *      semijoins or anti-semijoins.
  * 尝试上拉(pull up) ANY/IN 和 EXISTS 子链接,变换为半连接或反半连接
  * 
  * A clause "foo op ANY (sub-SELECT)" can be processed by pulling the
  * sub-SELECT up to become a rangetable entry and treating the implied
  * comparisons as quals of a semijoin.  However, this optimization *only*
  * works at the top level of WHERE or a JOIN/ON clause, because we cannot
  * distinguish whether the ANY ought to return FALSE or NULL in cases
  * involving NULL inputs.  Also, in an outer join's ON clause we can only
  * do this if the sublink is degenerate (ie, references only the nullable
  * side of the join).  In that case it is legal to push the semijoin
  * down into the nullable side of the join.  If the sublink references any
  * nonnullable-side variables then it would have to be evaluated as part
  * of the outer join, which makes things way too complicated.
  *
  * Under similar conditions, EXISTS and NOT EXISTS clauses can be handled
  * by pulling up the sub-SELECT and creating a semijoin or anti-semijoin.
  *
  * express op ANY (sub-SELECT) 语句可以通过上拉子查询为RTE并且把对比操作变换
  * 为半连接操作,这种优化只能在最上层的WHERE语句中实现
  * EXISTS/NOT EXISTS语句类似
  *
  *
  * This routine searches for such clauses and does the necessary parsetree
  * transformations if any are found.
  *
  * This routine has to run before preprocess_expression(), so the quals
  * clauses are not yet reduced to implicit-AND format, and are not guaranteed
  * to be AND/OR-flat either.  That means we need to recursively search through
  * explicit AND clauses.  We stop as soon as we hit a non-AND item.
  */
 void
 pull_up_sublinks(PlannerInfo *root)
 {
     Node       *jtnode;
     Relids      relids;
 
     /* Begin recursion through the jointree */
     jtnode = pull_up_sublinks_jointree_recurse(root,
                                                (Node *) root->parse->jointree,
                                                &relids);//执行上拉操作
 
     /*
      * root->parse->jointree must always be a FromExpr, so insert a dummy one
      * if we got a bare RangeTblRef or JoinExpr out of the recursion.
      */
     if (IsA(jtnode, FromExpr))//jointree要求FromExpr类型
         root->parse->jointree = (FromExpr *) jtnode;
     else
         root->parse->jointree = makeFromExpr(list_make1(jtnode), NULL);
 }
 

pull_up_sublinks_jointree_recurse
处理流程如下图所示:

bb
pull_up_sublinks_jointree_recurse处理流程

查询树结构如下:


bb
查询树
 /*
  * Recurse through jointree nodes for pull_up_sublinks()
  *
  * In addition to returning the possibly-modified jointree node, we return
  * a relids set of the contained rels into *relids.
  */
 static Node *
 pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,
                                   Relids *relids)
 {
     if (jtnode == NULL)
     {
         *relids = NULL;
     }
     else if (IsA(jtnode, RangeTblRef))//如为RangeTblRef类型
     {
         int         varno = ((RangeTblRef *) jtnode)->rtindex;
 
         *relids = bms_make_singleton(varno);
         /* jtnode is returned unmodified */
     }
     else if (IsA(jtnode, FromExpr))//如为FromExpr类型
     {
         FromExpr   *f = (FromExpr *) jtnode;
         List       *newfromlist = NIL;
         Relids      frelids = NULL;
         FromExpr   *newf;
         Node       *jtlink;
         ListCell   *l;
 
         /* First, recurse to process children and collect their relids */
         foreach(l, f->fromlist)//
         {
             Node       *newchild;
             Relids      childrelids;
             //对fromlist中的元素执行上拉操作
             //如能够上拉,则把子查询从WHERE子句中提升到FROM子句,newchild作为连接的一部分
             newchild = pull_up_sublinks_jointree_recurse(root,
                                                          lfirst(l),
                                                          &childrelids);
             newfromlist = lappend(newfromlist, newchild);
             frelids = bms_join(frelids, childrelids);
         }
         /* Build the replacement FromExpr; no quals yet */
         newf = makeFromExpr(newfromlist, NULL);//创建新的FromExpr
         /* Set up a link representing the rebuilt jointree */
         jtlink = (Node *) newf;
         /* Now process qual --- all children are available for use */
         //处理子链接中的表达式
         //newf(指针,相当于jtlink)
         newf->quals = pull_up_sublinks_qual_recurse(root, f->quals,
                                                     &jtlink, frelids,
                                                     NULL, NULL);//
 
         /*
          * Note that the result will be either newf, or a stack of JoinExprs
          * with newf at the base.  We rely on subsequent optimization steps to
          * flatten this and rearrange the joins as needed.
          *
          * Although we could include the pulled-up subqueries in the returned
          * relids, there's no need since upper quals couldn't refer to their
          * outputs anyway.
          */
         *relids = frelids;//设置相关的relids
         jtnode = jtlink;//返回值
     }
     else if (IsA(jtnode, JoinExpr))
     {
         JoinExpr   *j;
         Relids      leftrelids;
         Relids      rightrelids;
         Node       *jtlink;
 
         /*
     
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值