oozie-工作流例子

Fork and Join Example

The following workflow definition example executes 4 Map-Reduce jobs in 3 steps, 1 job, 2 jobs in parallel and 1 job.

The output of the jobs in the previous step are use as input for the next jobs.

Required workflow job parameters:

jobtracker : JobTracker HOST:PORT
namenode : NameNode HOST:PORT
input : input directory
output : output directory
<workflow-app name='example-forkjoinwf' xmlns="uri:oozie:workflow:0.1">
<start to='firstjob' />
<action name="firstjob">
<map-reduce>
<job-tracker>${jobtracker}</job-tracker>
<name-node>${namenode}</name-node>
<configuration>
<property>
<name>mapred.mapper.class</name>
<value>org.apache.hadoop.example.IdMapper</value>
</property>
<property>
<name>mapred.reducer.class</name>
<value>org.apache.hadoop.example.IdReducer</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>${input}</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/usr/foo/${wf:id()}/temp1</value>
</property>
</configuration>
</map-reduce>
<ok to="fork" />
<error to="kill" />
</action>
<fork name='fork'>
<path start='secondjob' />
<path start='thirdjob' />
</fork>
<action name="secondjob">
<map-reduce>
<job-tracker>${jobtracker}</job-tracker>
<name-node>${namenode}</name-node>
<configuration>
<property>
<name>mapred.mapper.class</name>
<value>org.apache.hadoop.example.IdMapper</value>
</property>
<property>
<name>mapred.reducer.class</name>
<value>org.apache.hadoop.example.IdReducer</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>/usr/foo/${wf:id()}/temp1</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/usr/foo/${wf:id()}/temp2</value>
</property>
</configuration>
</map-reduce>
<ok to="join" />
<error to="kill" />
</action>
<action name="thirdjob">
<map-reduce>
<job-tracker>${jobtracker}</job-tracker>
<name-node>${namenode}</name-node>
<configuration>
<property>
<name>mapred.mapper.class</name>
<value>org.apache.hadoop.example.IdMapper</value>
</property>
<property>
<name>mapred.reducer.class</name>
<value>org.apache.hadoop.example.IdReducer</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>/usr/foo/${wf:id()}/temp1</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/usr/foo/${wf:id()}/temp3</value>
</property>
</configuration>
</map-reduce>
<ok to="join" />
<error to="kill" />
</action>
<join name='join' to='finalejob'/>
<action name="finaljob">
<map-reduce>
<job-tracker>${jobtracker}</job-tracker>
<name-node>${namenode}</name-node>
<configuration>
<property>
<name>mapred.mapper.class</name>
<value>org.apache.hadoop.example.IdMapper</value>
</property>
<property>
<name>mapred.reducer.class</name>
<value>org.apache.hadoop.example.IdReducer</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>/usr/foo/${wf:id()}/temp2,/usr/foo/${wf:id()}/temp3
</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>${output}</value>
</property>
</configuration>
</map-reduce>
<ok to="end" />
<ok to="kill" />
</action>
<kill name="kill">
<message>Map/Reduce failed, error message[${wf:errorMessage()}]</message>
</kill>
<end name='end'/>
</workflow-app>
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值