OapStrategy in OAP

OapSortLimitStrategy

Plans special cases of ORDER BY + LIMIT operators.
If the OAP database already has an index on the sort column, the sort and limit conditions can be pushed down to the file scan RDD.

That is, before this strategy applies, the FileScanRDD child (possibly a deep child) returns a full row scan, and the upper levels of the plan tree do most of the sorting and limiting. After it applies, FileScanRDD returns rows that are already sorted (thanks to the OAP index) and limited, so the upper levels have only a small amount of sorting and limiting left to do.
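The effect of the pushdown can be sketched without Spark at all. The toy model below is an illustration only: partitions are plain `Seq[Int]` values, the index-ordered scan is simulated by sorting each partition, and all names (`SortLimitPushdownSketch`, `withPushdown`, `rowsLeavingScan`) are hypothetical and not part of OAP.

```scala
// Toy model: no Spark or OAP classes are used; every name here is hypothetical.
object SortLimitPushdownSketch {
  // Without pushdown: every row leaves the scan, then a global sort and limit run on top.
  def withoutPushdown(partitions: Seq[Seq[Int]], n: Int): Seq[Int] =
    partitions.flatten.sorted.take(n)

  // With pushdown: each scan returns its rows in index order (simulated by sorting)
  // and stops after n rows, so the upper operator merges at most n rows per partition.
  def withPushdown(partitions: Seq[Seq[Int]], n: Int): Seq[Int] = {
    val limitedScans = partitions.map(_.sorted.take(n))
    limitedScans.flatten.sorted.take(n)
  }

  // Number of rows handed to the upper operator when the limit is pushed down.
  def rowsLeavingScan(partitions: Seq[Seq[Int]], n: Int): Int =
    partitions.map(p => math.min(p.size, n)).sum
}
```

Both variants return the same top-N rows; the difference is how many rows cross from the scan into the upper part of the plan.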

Limitations:
Only two usage scenarios are supported so far.

   1. Filter + ORDER BY with LIMIT on the same single column
     SELECT x FROM xx WHERE filter(A) ORDER BY Column-A LIMIT N
   2. ORDER BY a single column with LIMIT only
     SELECT x FROM xx ORDER BY Column-A LIMIT N
Key code:
  def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
    case logical.ReturnAnswer(rootPlan) => rootPlan match {
      case logical.Limit(IntegerLiteral(limit), logical.Sort(order, true, child)) =>
        val childPlan = calcChildPlan(child, limit, order)
        TakeOrderedAndProjectExec(limit, order, child.output, childPlan) :: Nil
      case logical.Limit(
        IntegerLiteral(limit),
        logical.Project(projectList, logical.Sort(order, true, child))) =>
        val childPlan = calcChildPlan(child, limit, order)
        TakeOrderedAndProjectExec(limit, order, projectList, childPlan) :: Nil
      case _ =>
        Nil
    }
    case logical.Limit(IntegerLiteral(limit), logical.Sort(order, true, child)) =>
      val childPlan = calcChildPlan(child, limit, order)
      TakeOrderedAndProjectExec(limit, order, child.output, childPlan) :: Nil
    case logical.Limit(
      IntegerLiteral(limit),
      logical.Project(projectList, logical.Sort(order, true, child))) =>
      val childPlan = calcChildPlan(child, limit, order)
      TakeOrderedAndProjectExec(limit, order, projectList, childPlan) :: Nil
    case _ => Nil
  }
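The two plan shapes matched above (a limit directly over a sort, and a limit over a projection over a sort) can be illustrated on a toy plan tree. This is a simplified sketch: `Plan`, `Scan`, `Sort`, `Project`, `Limit` and `matchSortLimit` are hypothetical stand-ins defined here, not Spark's `LogicalPlan` classes.

```scala
// Toy plan tree; these case classes are hypothetical and only mimic the shapes
// that the strategy's pattern match looks for.
object PlanMatchSketch {
  sealed trait Plan
  case class Scan(table: String) extends Plan
  case class Sort(column: String, child: Plan) extends Plan
  case class Project(columns: Seq[String], child: Plan) extends Plan
  case class Limit(n: Int, child: Plan) extends Plan

  // Returns (limit, sort column) when the plan has one of the two supported shapes.
  def matchSortLimit(plan: Plan): Option[(Int, String)] = plan match {
    case Limit(n, Sort(col, _))             => Some((n, col))
    case Limit(n, Project(_, Sort(col, _))) => Some((n, col))
    case _                                  => None
  }
}
```

Any other shape, such as a sort without a limit, falls through to the default case, which mirrors the `case _ => Nil` branch that leaves the plan to other strategies.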
OapSemiJoinStrategy

OapSemiJoinStrategy optimizes semi joins.
A semi join only needs to know whether a matching key exists in the right
table, so the right table can be treated as if each value in it were
distinct. The OAP index makes this cheap: the index scan is told to return
only one item from each index entry.
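A small sketch shows why returning one row per index entry is safe for a left semi join. It uses plain Scala collections; `SemiJoinSketch`, `leftSemiJoin` and `onePerIndexEntry` are hypothetical names, and `distinct` only simulates what the index scan would return.

```scala
// Plain-collections model of a left semi join; not OAP or Spark code.
object SemiJoinSketch {
  // Left semi join: keep each left row whose key appears at least once on the right.
  def leftSemiJoin[K, V](left: Seq[(K, V)], rightKeys: Seq[K]): Seq[(K, V)] = {
    val keySet = rightKeys.toSet
    left.filter { case (k, _) => keySet.contains(k) }
  }

  // Simulates an index scan that returns a single row per index entry,
  // i.e. one row per distinct key value.
  def onePerIndexEntry[K](rightKeys: Seq[K]): Seq[K] = rightKeys.distinct
}
```

Because the join only tests for existence, duplicates on the right side never change the result, so the smaller deduplicated input gives the same answer.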

Limitation:
   1. The query and filter columns must have an index.
Key code:
  def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
    case ExtractEquiJoinKeys(joinType, leftKeys, rightKeys, condition, left, right)
      if joinType == LeftSemi && canBroadcast(right) =>
      Seq(joins.BroadcastHashJoinExec(
        leftKeys, rightKeys, joinType, BuildRight, condition, planLater(left),
        calcChildPlan(right, rightKeys.map(SortOrder(_, Ascending)))))
    case _ => Nil
  }
OapGroupAggregateStrategy

Index Boosts Group By Aggregation

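Optimizes aggregations with GROUP BY as an OAP strategy (partial aggregation optimization).
The only case that currently works is a GROUP BY on exactly one column, using aggregate functions that support partial aggregation and no DISTINCT: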

   SELECT [agg](columns)
   FROM table
   [WHERE filter on columns]
   GROUP BY one specific column
Key code:
  val aggregateOperator =
    if (aggExpressions.map(_.aggregateFunction).exists(
      !AggregateFunctionAdapter.supportsPartial(_))) {
      if (functionsWithDistinct.nonEmpty) {
        sys.error("Distinct columns cannot exist in Aggregate operator containing " +
          "aggregate functions which don't support partial aggregation.")
      } else {
        // So far do not support non-partial aggregations.
        Nil
      }
    } else if (functionsWithDistinct.isEmpty) {
      // Support single group by only so far
      if (groupingExpressions.size == 1) {
        OapAggUtils.planAggregateWithoutDistinct(
          groupingExpressions,
          aggExpressions,
          resultExpressions,
          calcChildPlan(
            groupingExpressions, aggExpressions, resultExpressions, child))
      } else Nil
    } else {
      // TODO: support distinct in future.
      Nil
    }
  aggregateOperator
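As a rough illustration of why an index on the grouping column can help, consider rows that arrive already ordered by that column. This is an assumption made for the sketch, not a description of OAP internals: with ordered input, each group forms one contiguous run and can be aggregated in a single pass without a hash table. `SortedGroupAggSketch` and `sumBySortedKey` are hypothetical names.

```scala
// Single-pass SUM ... GROUP BY key over input that is assumed to be sorted by key.
// Hypothetical helper for illustration; not part of OAP.
object SortedGroupAggSketch {
  def sumBySortedKey(rows: Seq[(String, Long)]): Vector[(String, Long)] =
    rows.foldLeft(Vector.empty[(String, Long)]) { (acc, row) =>
      val (key, value) = row
      acc.lastOption match {
        // Same key as the current run: fold the value into the running sum.
        case Some((lastKey, sum)) if lastKey == key =>
          acc.updated(acc.size - 1, key -> (sum + value))
        // New key: the previous run is complete, start a new group.
        case _ =>
          acc :+ (key -> value)
      }
    }
}
```

If the input were not sorted by key, the same key could start several separate runs, which is why the ordering guarantee matters for this approach.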