Presto join enumeration笔记

原文:presto-join-enumeration

Query improvements

Incorporating join enumeration into Presto means that your queries can automatically run faster without manual adjustments. Such manual adjustments are often not possible at all because they are not expressible in SQL language (e.g: join distribution type selection).

Prerequisite :data source needs to have statistics collected (currently the Hive connector is the only connector that provides statistics). presto cbo statistics

Case 1: Broadcast vs distributed

Query: SELECT * FROM large_table l, small_table s WHERE l.key = s.key

The example below :

  • shows how our algorithm selects join operator type (distributed vs broadcast) based on table sizes and cost model.

Case 2: Join sides reordering

Query: SELECT * FROM small_table s, large_table l WHERE s.key = l.key

This example displays :

  • how the algorithm flips join sides so that the table with smaller memory footprint is kept in distributed join operator.

Case 3: Join tree reordering

Query: SELECT * FROM table_a a, table_b b, table_c c WHERE a.key=b.key AND a.key=c.key

  •  In this case join enumeration algorithm will reorder the entire join tree to avoid large and costly intermediate result.
  •  It also shows that our algorithm can intelligently derive new predicates (“b.key=c.key“) so that new joins can be explored.

Join enumeration

Join enumeration is the process of enumerating and evaluating different join orders with the goal of finding an optimal execution plan. To do this efficiently, Presto join enumerator utilizes dynamic programming and divide-and-conquer technique. Such algorithms split larger problems into smaller ones. For instance, they assume that the best way to join tables (a,b,c) with table d involves finding the best join order for tables (a,b,c) first. This assumption allows to reuse join enumeration result for (a,b,c) in all join enumerations that have (a,b,c) as a subcomponent, e.g:(a,b,c,d) or (a,b,c,e).

Top-down approach

Dynamic programming is usually implemented using a top-down (recursive) or bottom-up (iterative) approach. Presto join enumerator uses the former technique:

  1. In the top-down approach relation sets (i.e. tables) are partitioned into two partitions.
  2. For each partition the best join ordering is found recursively.
  3. Then the partitioning with the lowest total cost is selected as a global result.
  4. Memoization is used in order to reduce the search time.

Search space

Most optimizers (Presto included) skip cross-joins during join enumeration. It’s a good heuristic that drastically reduces the search space. Cross-joins are usually not part of an optimal join ordering and enumerating them greatly slows down query optimization process.

Some optimizers only consider left-deep join trees. This reduces search space, but could lead to suboptimal plans. We have found that the cost of exploring bushy-tree joins for a relatively small number of relations does not bring a significant overhead. Therefore, in order to to find the best plan Presto join enumerator explores both left-deep and bushy tree joins.

补充(TODO 找时间学习)

Commonly used classes of join trees:

  • left-deep tree
  • right-deep tree
  • zigzag tree
  • bushy tree 

Presto algorithm design

Presto join enumeration works in the following stages:

1)

  • First, join nodes that can be reordered are collected into a special multi-join node. 
  • Multi-join node contains aggregated information about reorderable joins. This includes source relations and predicates from all joins. It’s worth noting that for result correctness, not all joins are freely reorderable so are not collected into the multi-join node. We won’t go into those details today.

2a) Then, the best join ordering is found recursively. The procedure is as follows:

  •  For a given relation set RS find all partitionings into two partitions. We only consider partitionings where there is a join predicate between the relations (i.e. no cross joins)..
  • For each partition the best join ordering is then found recursively.

2b) For each partitioning construct a join J between the two partitions. Consider different physical properties of a join (e.g. distributed vs. broadcast). Choose the properties with the lowest cost.

2c) Choose a partitioning with the lowest total cost. Corresponding join J is then memorized along with a total cost and a set of relations that are joined together, e.g: (a,b,c). Every time join enumeration will try to resolve best join ordering for (a,b,c) it can reuse an already computed result.

An interesting feature of our algorithm is that we derive equality predicates for joins using a so called equality inference algorithm. The input of the algorithm are predicates of all reorderable joins. The algorithm can deduce new predicates for relations that didn’t have predicates between them initially. More on that later.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值