In the previous article we briefly introduced Oozie and how to install and deploy it. In this article we walk through a few examples of how to use it. Follow the column 破茧成蝶——大数据篇 (From Cocoon to Butterfly: Big Data) for more related content~
I. Scheduling a Shell Script with Oozie
1. Create the directory
[root@master oozie-4.0.0-cdh5.3.6]# mkdir -p oozie-apps/shell
2. Create two files in the new shell directory
[root@master shell]# touch workflow.xml job.properties
3. Edit job.properties
# HDFS address
nameNode=hdfs://master:8020
# ResourceManager address
jobTracker=slave01:8032
# queue name
queueName=default
examplesRoot=oozie-apps
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shell
4. Edit workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
<!-- start node -->
<start to="shell-node"/>
<!-- action node -->
<action name="shell-node">
<!-- shell action -->
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<!-- command to execute: create the xzw directory under /root/files -->
<exec>mkdir</exec>
<argument>/root/files/xzw</argument>
<capture-output/>
</shell>
<ok to="end"/>
<error to="fail"/>
</action>
<!-- kill node -->
<kill name="fail">
<message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<!-- end node -->
<end name="end"/>
</workflow-app>
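Because the action declares <capture-output/>, any key=value pairs the command prints to stdout can be read by later nodes through the wf:actionData EL function. A hedged fragment follows (the key name someKey and property name capturedDir are hypothetical; mkdir itself prints nothing, so this only pays off for commands that actually emit output):

```xml
<!-- hypothetical: read a value captured from shell-node's stdout in a later node -->
<property>
    <name>capturedDir</name>
    <value>${wf:actionData('shell-node')['someKey']}</value>
</property>
```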
5. Upload the configuration files to HDFS
[root@master hadoop-2.5.0-cdh5.3.6]# bin/hdfs dfs -put /opt/modules/oozie-4.0.0-cdh5.3.6/oozie-apps/ /user/root
6. Run the job
[root@master oozie-4.0.0-cdh5.3.6]# bin/oozie job -oozie http://master:11000/oozie -config oozie-apps/shell/job.properties -run
The submitted command creates the directory xzw; we can check /root/files to confirm it was created:
You can also check the job in the Oozie web console:
7. Kill a job
bin/oozie job -oozie http://master:11000/oozie -kill 0000001-210412154035993-oozie-root-W
II. Scheduling Multiple Jobs with Oozie
1. Edit job.properties
nameNode=hdfs://master:8020
jobTracker=slave01:8032
queueName=default
examplesRoot=oozie-apps
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shells
2. Edit workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
<start to="p1-shell-node"/>
<action name="p1-shell-node">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>mkdir</exec>
<argument>/root/files/aaa1</argument>
<capture-output/>
</shell>
<ok to="forking"/>
<error to="fail"/>
</action>
<action name="p2-shell-node">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>mkdir</exec>
<argument>/root/files/aaa2</argument>
<capture-output/>
</shell>
<ok to="joining"/>
<error to="fail"/>
</action>
<action name="p3-shell-node">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>mkdir</exec>
<argument>/root/files/aaa3</argument>
<capture-output/>
</shell>
<ok to="joining"/>
<error to="fail"/>
</action>
<action name="p4-shell-node">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>mkdir</exec>
<argument>/root/files/aaa4</argument>
<capture-output/>
</shell>
<ok to="end"/>
<error to="fail"/>
</action>
<fork name="forking">
<path start="p2-shell-node"/>
<path start="p3-shell-node"/>
</fork>
<join name="joining" to="p4-shell-node"/>
<kill name="fail">
<message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
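In this workflow, p1 runs first, the fork launches p2 and p3 in parallel, and the join waits for both before handing off to p4. The resulting execution order and directory layout can be sketched locally (a temp directory stands in for /root/files):

```shell
# Local sketch of the four actions' ordering (temp dir stands in for /root/files)
BASE=$(mktemp -d)
mkdir "$BASE/aaa1"                         # p1-shell-node runs first
mkdir "$BASE/aaa2" & mkdir "$BASE/aaa3" &  # p2/p3 run concurrently after the fork
wait                                       # the join blocks until both branches finish
mkdir "$BASE/aaa4"                         # p4-shell-node runs after the join
ls "$BASE"
```

On the cluster you would instead see aaa1 through aaa4 appear under /root/files on the node that executed each shell action.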
3. Upload the configuration files to HDFS
[root@master hadoop-2.5.0-cdh5.3.6]# bin/hdfs dfs -put /opt/modules/oozie-4.0.0-cdh5.3.6/oozie-apps/shells/ /user/root/oozie-apps
4. Run the job
bin/oozie job -oozie http://master:11000/oozie -config oozie-apps/shells/job.properties -run
5. Check the results
III. Scheduling a MapReduce Job with Oozie
1. We use the examples that ship with Oozie. First, extract them.
tar -zxvf oozie-examples.tar.gz
2. Copy the map-reduce example from the examples directory into our oozie-apps directory
[root@master apps]# pwd
/opt/modules/oozie-4.0.0-cdh5.3.6/examples/apps
[root@master apps]# cp -r ./map-reduce/ ../../oozie-apps/
3. Delete the unrelated files and directories
4. Copy the MapReduce examples jar into the lib directory under map-reduce
[root@master map-reduce]# cp /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar lib/
5. Modify workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.2" name="map-reduce-wf">
<start to="mr-node"/>
<action name="mr-node">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/xzw/output"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
<property>
<name>mapreduce.job.map.class</name>
<value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
</property>
<property>
<name>mapreduce.job.reduce.class</name>
<value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
</property>
<property>
<name>mapreduce.job.output.key.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapreduce.job.output.value.class</name>
<value>org.apache.hadoop.io.IntWritable</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<property>
<name>mapreduce.input.fileinputformat.inputdir</name>
<value>/xzw/input</value>
</property>
<property>
<name>mapreduce.output.fileoutputformat.outputdir</name>
<value>/xzw/output</value>
</property>
</configuration>
</map-reduce>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
6. Modify job.properties
nameNode=hdfs://master:8020
jobTracker=slave01:8032
queueName=default
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie-apps/map-reduce/workflow.xml
7. Upload to HDFS
bin/hdfs dfs -put /opt/modules/oozie-4.0.0-cdh5.3.6/oozie-apps/map-reduce/ /user/root/oozie-apps
8. Run the job
bin/oozie job -oozie http://master:11000/oozie -config oozie-apps/map-reduce/job.properties -run
9. Check the results
IV. Scheduling Recurring Jobs with Oozie
1. Configure oozie-site.xml
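The workflow above wires up the classic WordCount example (TokenizerMapper splits each line into words, IntSumReducer sums the counts per word). Its logic can be sanity-checked locally with a shell pipeline on a small sample input (the two input lines are invented for illustration; on the cluster you would inspect /xzw/output instead):

```shell
# WordCount logic on a tiny sample: split into words, sort, count duplicates
COUNTS=$(printf 'hello oozie\nhello hadoop\n' | tr -s ' ' '\n' | sort | uniq -c | sort -rn)
echo "$COUNTS"
```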
<property>
<name>oozie.processing.timezone</name>
<value>GMT+0800</value>
<description>
Oozie server timezone. Valid values are UTC and GMT(+/-)####, for example 'GMT+0530' would be India
timezone. All dates parsed and generated by Oozie Coordinator/Bundle will be in the specified
timezone. The default value of 'UTC' should not be changed under normal circumstances. If for any reason
it is changed, note that GMT(+/-)#### timezones do not observe DST changes.
</description>
</property>
2. Restart Oozie, and change the timezone in the web console to GMT+0800 as well
3. Copy the cron example from the examples directory into our oozie-apps directory
4. Modify the coordinator.xml configuration file
<coordinator-app name="cron-coord" frequency="${coord:minutes(5)}" start="${start}" end="${end}" timezone="GMT+0800"
xmlns="uri:oozie:coordinator:0.2">
<action>
<workflow>
<app-path>${workflowAppUri}</app-path>
<configuration>
<property>
<name>jobTracker</name>
<value>${jobTracker}</value>
</property>
<property>
<name>nameNode</name>
<value>${nameNode}</value>
</property>
<property>
<name>queueName</name>
<value>${queueName}</value>
</property>
</configuration>
</workflow>
</action>
</coordinator-app>
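Here frequency="${coord:minutes(5)}" triggers the wrapped workflow every 5 minutes between start and end. The coordinator EL also provides coarser helpers; the alternatives below are illustrative fragments, not part of this example:

```xml
frequency="${coord:hours(1)}"   <!-- once an hour -->
frequency="${coord:days(1)}"    <!-- once a day -->
frequency="${coord:months(1)}"  <!-- once a month -->
```

Note that by default Oozie rejects coordinator frequencies shorter than five minutes.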
5. Modify the workflow.xml configuration file
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
<!-- start node -->
<start to="shell-node"/>
<!-- action node -->
<action name="shell-node">
<!-- shell action -->
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<!-- script to execute -->
<exec>append_time.sh</exec>
<file>/user/root/oozie-apps/cron/append_time.sh</file>
<capture-output/>
</shell>
<ok to="end"/>
<error to="fail"/>
</action>
<!-- kill node -->
<kill name="fail">
<message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<!-- end node -->
<end name="end"/>
</workflow-app>
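The workflow ships and runs append_time.sh, which the article does not list. A minimal sketch of what such a script might look like, assuming it simply appends the current timestamp to a file (the log path and date format are assumptions, not the article's actual script):

```shell
#!/bin/bash
# Hypothetical sketch of append_time.sh: append the current time to a log file
# (the path below is an assumption; the real script is not shown in the article)
LOG=/tmp/oozie_cron_demo.log
date '+%Y-%m-%d %H:%M:%S' >> "$LOG"
```

Each coordinator materialization would then add one timestamped line, so the file records every 5-minute firing.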
6. Modify the job.properties configuration file
nameNode=hdfs://master:8020
jobTracker=slave01:8032
queueName=default
oozie.coord.application.path=${nameNode}/user/${user.name}/oozie-apps/cron
start=2021-04-16T15:50+0800
end=2021-04-16T16:25+0800
workflowAppUri=${nameNode}/user/${user.name}/oozie-apps/cron
7. Upload to HDFS
[root@master hadoop-2.5.0-cdh5.3.6]# bin/hdfs dfs -put /opt/modules/oozie-4.0.0-cdh5.3.6/oozie-apps/cron/ /user/root/oozie-apps
8. Run the job
bin/oozie job -oozie http://master:11000/oozie -config oozie-apps/cron/job.properties -run
9. Check the results
That is all for this article; the examples are fairly straightforward. If you ran into any problems along the way, feel free to leave a comment and let me know~