Yesterday my boss sent me an HQL statement:
create table zx_car_weibo_41_tmp
as
select *
from ods_tblog_content
where dt = '20130101'
and ((content like '%4C%') or ( extend like '%4C%'))
and ((content like '%雪铁龙%') or ( extend like '%雪铁龙%'))
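He asked me to find out why such a simple HQL statement launches two MapReduce jobs. In theory it should need at most one MapReduce job plus a move of the data files. I started digging into it as soon as I received it, beginning with the execution plan: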
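The post does not show the command that was used to obtain the plan. In Hive a plan of this form is normally printed by prefixing the statement with EXPLAIN, so the following is a minimal sketch under that assumption, reusing the statement quoted above unchanged:

```sql
-- Assumption: the plan was produced with EXPLAIN on the original statement.
-- EXPLAIN only prints the stage dependencies and stage plans;
-- it does not run the job or create the table.
EXPLAIN
CREATE TABLE zx_car_weibo_41_tmp
AS
SELECT *
FROM ods_tblog_content
WHERE dt = '20130101'
  AND ((content LIKE '%4C%') OR (extend LIKE '%4C%'))
  AND ((content LIKE '%雪铁龙%') OR (extend LIKE '%雪铁龙%'));
```

The plan obtained for this statement was: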
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-4 depends on stages: Stage-1 , consists of Stage-3, Stage-2
  Stage-3
  Stage-0 depends on stages: Stage-3, Stage-2
  Stage-5 depends on stages: Stage-0
  Stage-2

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        ods_tblog_content
          TableScan
            alias: ods_tblog_content
            Filter Operator
              predicate:
                  expr: (((content like '%4C%') or (extend like '%4C%')) and ((content like '%雪铁龙%') or (extend like '%雪铁龙%')))
                  type: boolean
              Filter Operator
                predicate:
                    expr: (((dt = '20130101') and ((content like '%4C%') or (extend like '%4C%'))) and ((content like '%雪铁龙%') or (extend like '%雪铁龙%')))
                    type: boolean
                Select Operator
                  expressions:
                        expr: action
                        type: string