Getting Started with Oozie

Oozie, a workflow scheduling framework
  * Workflow
      import -> hive -> export
      orchestrates different business steps into a single pipeline
  * Scheduling
      run jobs/tasks on a timer
      trigger execution on events
        time
        dataset availability
Scheduling frameworks
  Linux crontab
    rule (a concrete entry is shown after the command list below)
      * * * * * cmd
      the first five fields set the schedule time:
      *        *       *       *        *         cmd
      minute   hour    day     month    weekday
     * mr
        yarn jar xxx.jar input output
     * hive
        hive -f xx.sql
     * sqoop
        sqoop --options-file xx.txt
     * shell script
        sh xxx.sh
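
     For example, a crontab entry (script and log paths are hypothetical) that runs
     an ETL shell script every day at 00:30:
        30 0 * * * sh /opt/scripts/daily_etl.sh >> /var/log/daily_etl.log 2>&1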
        
  Azkaban
     an open-source workflow manager with a visual web UI
     https://azkaban.github.io
     
  Oozie
     http://oozie.apache.org
     
  Zeus(宙斯)
      an open-source framework from Alibaba; a complete Hadoop job platform.
  
Under the hood, an Oozie job is itself a MapReduce program consisting of only a map task (the launcher).

For each type of task, a corresponding template is written.
--------------------------------------------------------Oozie Installation----------------------------------------------------------------------------
1.The following two properties are required in Hadoop core-site.xml:

  <!-- OOZIE -->
  <property>
    <name>hadoop.proxyuser.[OOZIE_SERVER_USER].hosts</name>    #root
    <value>[OOZIE_SERVER_HOSTNAME]</value>                     #*
  </property>
  <property>
    <name>hadoop.proxyuser.[OOZIE_SERVER_USER].groups</name>   #root
    <value>[USER_GROUPS_THAT_ALLOW_IMPERSONATION]</value>      #*
  </property>
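
  With OOZIE_SERVER_USER=root and wildcard hosts/groups (the values marked with #
  above), the rendered properties are:

  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>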

  vi oozie-site.xml
    <property>
        <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
        <value>*=/opt/hadoop-2.5.0-cdh5.3.6/etc/hadoop</value>
        <description>
            Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
            the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
            used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
            the relevant Hadoop *-site.xml files. If the path is relative, it is looked up
            within the Oozie configuration directory; the path can also be absolute (i.e.
            pointing to Hadoop client conf/ directories in the local filesystem).
        </description>
    </property>
    
    A database can also be configured here to store Oozie's metadata.
2. Extract the hadooplibs tarball
  tar -zxf oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz
  
3. Create the libext directory
  mkdir oozie-4.0.0-cdh5.3.6/libext
  
4. Copy the jars into libext
  mv oozie-4.0.0-cdh5.3.6/hadooplibs/hadooplib-2.5.0-cdh5.3.6.oozie-4.0.0-cdh5.3.6/* oozie-4.0.0-cdh5.3.6/libext/

5. If using the ExtJS library, copy the ZIP file to the libext/ directory.
  http://extjs.com/deploy/ext-2.0.2.zip  http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
  ext-2.2.zip 

6.Run the oozie-setup.sh script to configure Oozie with all the components added to the libext/ directory.
  bin/oozie-setup.sh prepare-war
  bin/oozie-setup.sh sharelib create -fs hdfs://hadoop-senior01.zhangbk.com:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz
  bin/ooziedb.sh create -sqlfile oozie.sql -run

7. To start Oozie as a daemon process, run:
  bin/oozied.sh start
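
  To verify the server is up (hostname from this setup; the web console is served
  on the same port), a quick check:

  bin/oozie admin -oozie http://hadoop-senior01.zhangbk.com:11000/oozie -status
    System mode: NORMAL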

--------------------------------------------------------------------------------------------------------------------------------------------------------------
Command Line Examples
  The examples/ directory must be copied to the user HOME directory in HDFS:
    hdfs dfs -put examples examples
    
  vi /opt/oozie-4.0.0-cdh5.3.6/examples/apps/map-reduce/job.properties
    nameNode=hdfs://ns1
    jobTracker=hadoop-senior03.zhangbk.com:8032
  
  How to run an example application:
    bin/oozie job -oozie http://hadoop-senior01.zhangbk.com:11000/oozie -config examples/apps/map-reduce/job.properties -run
      job: 14-20090525161321-oozie-tucu
  Check the workflow job status:
    oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-tucu
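
  Other useful subcommands against the same job id (standard oozie CLI flags):
    oozie job -oozie http://localhost:11000/oozie -log 14-20090525161321-oozie-tucu
    oozie job -oozie http://localhost:11000/oozie -kill 14-20090525161321-oozie-tucu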

------------------------------1. MapReduce Action---------------------------------------------------------------------

vi /opt/oozie-4.0.0-cdh5.3.6/examples/apps/map-reduce/workflow.xml


<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at
  
       http://www.apache.org/licenses/LICENSE-2.0
  
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="mr-wordcount-wf">
    <start to="mr-node-wordcount"/>
    <action name="mr-node-wordcount">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/${oozieDataRoot}/${outputDir}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.mapper.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapred.reducer.new-api</name>
                    <value>true</value>
                </property>
                <property>
                    <name>mapreduce.job.queuename</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mapreduce.job.map.class</name>
                    <value>com.zhangbk.mapreduce.WordCount$WordCountMapper</value>
                </property>
                <property>
                    <name>mapreduce.job.reduce.class</name>
                    <value>com.zhangbk.mapreduce.WordCount$WordCountReducer</value>
                </property>
                <property>
                    <name>mapreduce.map.output.key.class</name>
                    <value>org.apache.hadoop.io.Text</value>
                </property>
                <property>
                    <name>mapreduce.map.output.value.class</name>
                    <value>org.apache.hadoop.io.IntWritable</value>
                </property>
                <property>
                    <name>mapreduce.job.output.key.class</name>
                    <value>org.apache.hadoop.io.Text</value>
                </property>
                <property>
                    <name>mapreduce.job.output.value.class</name>
                    <value>org.apache.hadoop.io.IntWritable</value>
                </property>
                <property>
                    <name>mapreduce.input.fileinputformat.inputdir</name>
                    <value>${nameNode}/${oozieDataRoot}/${inputDir}</value>
                </property>
                <property>
                    <name>mapreduce.output.fileoutputformat.outputdir</name>
                    <value>${nameNode}/${oozieDataRoot}/${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

vi job.properties

nameNode=hdfs://ns1
jobTracker=hadoop-senior03.zhangbk.com:8032
queueName=default

oozieAppsRoot=user/oozie-apps
oozieDataRoot=user/oozie/datas

oozie.wf.application.path=${nameNode}/${oozieAppsRoot}/mr-wordcount-wf/workflow.xml
inputDir=mr-wordcount-wf/input
outputDir=mr-wordcount-wf/output


------------------------------------------------------------------------------------------------------------------
How to define a workflow
  job.properties
    key point: points to the HDFS location of the workflow.xml file
  workflow.xml
    the definition file
    an XML file
  lib directory
    the jars the job depends on
  writing workflow.xml
    control-flow nodes
    actions
  MapReduce action
    how to schedule a MapReduce program with Oozie
    key point
      express the Driver part of the old Java MapReduce program as XML configuration
      (see the app layout sketch after this list)
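
A typical application layout on HDFS before submission, using the names from the
job.properties above (the jar name is an assumption):

  /user/oozie-apps/mr-wordcount-wf/
    workflow.xml
    lib/
      mr-wordcount.jar    # contains com.zhangbk.mapreduce.WordCount

  # upload the local app directory and submit, following the earlier pattern:
  hdfs dfs -put oozie-apps/mr-wordcount-wf /user/oozie-apps/
  bin/oozie job -oozie http://hadoop-senior01.zhangbk.com:11000/oozie -config oozie-apps/mr-wordcount-wf/job.properties -run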

----------------------2. Hive Action--------------------------------------------------------------------------------------------------------------------------------

vi job.properties

nameNode=hdfs://ns1
jobTracker=hadoop-senior03.zhangbk.com:8032
queueName=default
oozieAppsRoot=user/oozie-apps
oozieDataRoot=user/oozie/datas

oozie.use.system.libpath=true

oozie.wf.application.path=${nameNode}/${oozieAppsRoot}/hive-select
outputDir=hive-select/output

vi oozie-apps/hive-select/workflow.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-select">
    <start to="hive-node"/>

    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/${oozieDataRoot}/${outputDir}"/>
            </prepare>
            <job-xml>${nameNode}/${oozieAppsRoot}/hive-select/hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>select-tab.sql</script>
            <param>OUTPUT=${nameNode}/${oozieDataRoot}/${outputDir}</param>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
# Run the Oozie job:
export OOZIE_URL=http://hadoop-senior01.zhangbk.com:11000/oozie
bin/oozie job -config oozie-apps/hive-select/job.properties -run

Note: copy the MySQL JDBC driver jar into the app's lib/ directory, configure <job-xml>${nameNode}/${oozieAppsRoot}/hive-select/hive-site.xml</job-xml> as above, and write the SQL script.
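
A minimal sketch of select-tab.sql consuming the OUTPUT parameter passed via <param>
(the table name is borrowed from the later combined example):

  -- write the query result into the directory Oozie passes in as ${OUTPUT}
  insert overwrite directory '${OUTPUT}'
  select * from default.dm_gy_xzqh_hive limit 100;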

--------------3.Sqoop Action--------------------------------------------------------------------------------------------------------------------------------------

vi job.properties

nameNode=hdfs://ns1
jobTracker=hadoop-senior03.zhangbk.com:8032
queueName=default
oozieAppsRoot=user/oozie-apps
oozieDataRoot=user/oozie/datas

oozie.use.system.libpath=true

oozie.wf.application.path=${nameNode}/${oozieAppsRoot}/sqoop-imp
outputDir=sqoop-imp/output


-----------------------------
vi workflow.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="sqoop-imp-wf">
    <start to="sqoop-node"/>

    <action name="sqoop-node">
        <sqoop xmlns="uri:oozie:sqoop-action:0.3">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/${oozieDataRoot}/${outputDir}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <command>import --connect jdbc:mysql://hadoop-senior01.zhangbk.com:3306/test --username root --password password01 --table dm_gy_xzqh --target-dir ${nameNode}/${oozieDataRoot}/${outputDir} --num-mappers 1</command>
        </sqoop>
        <!-- alternatively, use: import --options-file sqoop_import_hdfs.txt (see the sketch after this workflow) -->
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Sqoop failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
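
For the options-file alternative noted above, sqoop_import_hdfs.txt would hold one
token per line (a sketch; values copied from the <command> element):

  import
  --connect
  jdbc:mysql://hadoop-senior01.zhangbk.com:3306/test
  --username
  root
  --password
  password01
  --table
  dm_gy_xzqh
  --num-mappers
  1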



--------------------
The same Sqoop action using arg elements:

<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    ...
    <action name="myfirsthivejob">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <prepare>
                <delete path="${jobOutput}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.compress.map.output</name>
                    <value>true</value>
                </property>
            </configuration>
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>jdbc:hsqldb:file:db.hsqldb</arg>
            <arg>--table</arg>
            <arg>TT</arg>
            <arg>--target-dir</arg>
            <arg>hdfs://localhost:8020/user/tucu/foo</arg>
            <arg>-m</arg>
            <arg>1</arg>
        </sqoop>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...
</workflow-app>

-----------------4.Shell Action-----------------------------------------------------------------

vi job.properties

nameNode=hdfs://ns1
jobTracker=hadoop-senior03.zhangbk.com:8032
queueName=default
oozieAppsRoot=user/oozie-apps
oozieDataRoot=user/oozie/datas

oozie.wf.application.path=${nameNode}/${oozieAppsRoot}/shell-hive-select
output=shell-hive-select
exec=select-tab.sh
script=select-user.sql

vi workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.5" name="shell-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${exec}</exec>
            <file>${nameNode}/${oozieAppsRoot}/shell-hive-select/${exec}#${exec}</file>
            <file>${nameNode}/${oozieAppsRoot}/shell-hive-select/${script}#${script}</file>
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
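
A sketch of what select-tab.sh might contain; the #${exec} fragments above symlink
the files into the task's working directory, so the script can reference
select-user.sql by name (the Hive client path is an assumption):

  #!/bin/bash
  # runs on whichever NodeManager hosts the launcher task
  /opt/hive-0.13.1-cdh5.3.6/bin/hive -f select-user.sql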

---------------------------------------------------------------------------------
A workflow chains multiple actions into one complete pipeline.
Example:
  start node
  hive action
    query a table
    result -> hdfs
  sqoop action
    hdfs -> mysql
  end
  kill
--------------------------------------------------------------------------------
Coordinator scheduling

Change the server time zone
  remove the existing symlink
    rm -rf /etc/localtime
  create a new symlink
    ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
  check the time zone and current time
    date -R
      Sat, 06 Jul 2019 16:29:26 +0800
  set the date and time
    date -s 2019-7-6
    date -s 16:31:30
Configure the Oozie time zone
  Oozie defaults to UTC, while the server may run on CST; standardizing on GMT+0800 is recommended.
  
  Edit oozie-site.xml
    <property>
        <name>oozie.processing.timezone</name>
        <value>GMT+0800</value>
    </property>
  Clear the Tomcat cache
    rm -rf /opt/oozie-4.0.0-cdh5.3.6/oozie-server/work/Catalina
    rm -rf /opt/oozie-4.0.0-cdh5.3.6/oozie-server/conf/Catalina

--------------------------------------------------------------------------------------------------------------------------------------------------------------

Also change the web console's default display time zone:
vi /opt/oozie-4.0.0-cdh5.3.6/oozie-server/webapps/oozie/oozie-console.js
function getTimeZone() {
    Ext.state.Manager.setProvider(new Ext.state.CookieProvider());
    return Ext.state.Manager.get("TimezoneId","GMT+0800");
}

Coordinator scheduling configuration
  To allow frequencies faster than every 5 minutes (used in the examples below), disable the check in oozie-site.xml:
    <property>
        <name>oozie.service.coord.check.maximum.frequency</name>
        <value>false</value>
        <description>
            When true, Oozie will reject any coordinators with a frequency faster than 5 minutes.  It is not recommended to disable
            this check or submit coordinators with frequencies faster than 5 minutes: doing so can cause unintended behavior and
            additional system stress.
        </description>
    </property>

--------------------------------------------------------------------------------------------------------------------------------------------------------
Coordinator example
vi job.properties

nameNode=hdfs://ns1
jobTracker=hadoop-senior03.zhangbk.com:8032
queueName=default
oozieAppsRoot=user/oozie-apps
oozieDataRoot=user/oozie/datas

oozie.coord.application.path=${nameNode}/${oozieAppsRoot}/cron-schedule
start=2019-07-06T23:15+0800
end=2019-07-06T23:25+0800
workflowAppUri=${nameNode}/${oozieAppsRoot}/cron-schedule


vi workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.5" name="no-op-wf">
    <start to="end"/>
    <end name="end"/>
</workflow-app>


vi coordinator.xml

<coordinator-app name="cron-coord" frequency="${coord:minutes(2)}" 
                   start="${start}" end="${end}" timezone="GMT+0800"
                 xmlns="uri:oozie:coordinator:0.4">
        <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
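
Submit the coordinator the same way as a workflow (OOZIE_URL exported as shown
earlier); -info then lists each materialized action:

  bin/oozie job -config oozie-apps/cron-schedule/job.properties -run
  bin/oozie job -info <coordinator-job-id>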

---------------------------------------------------------------------------------------------------------
Using a Coordinator to schedule MapReduce

vi job.properties

nameNode=hdfs://ns1
jobTracker=hadoop-senior03.zhangbk.com:8032
queueName=default
oozieAppsRoot=user/oozie-apps
oozieDataRoot=user/oozie/datas

oozie.coord.application.path=${nameNode}/${oozieAppsRoot}/cron
start=2019-07-06T23:50+0800
end=2019-07-06T23:59+0800
workflowAppUri=${nameNode}/${oozieAppsRoot}/cron
inputDir=mr-wordcount-wf/input
outputDir=mr-wordcount-wf/output

vi workflow.xml
  (identical to the mr-wordcount-wf workflow.xml shown in section 1 above)

vi coordinator.xml
<coordinator-app name="cron-coord" frequency="0/3 * * * *" 
                   start="${start}" end="${end}" timezone="GMT+0800"
                 xmlns="uri:oozie:coordinator:0.4">
        <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>


Hive tables expose a set of virtual columns:
  INPUT__FILE__NAME shows which HDFS file a row's data is stored in
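
  For example (table name from the examples above):
    select INPUT__FILE__NAME from default.dm_gy_xzqh_hive limit 3;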


-------------------------------------------------------------------------------------------
Scheduled pipeline: Hive Action -> Sqoop Action

job.properties

nameNode=hdfs://ns1
jobTracker=hadoop-senior03.zhangbk.com:8032
queueName=default
oozieAppsRoot=user/oozie-apps
oozieDataRoot=user/oozie/datas

oozie.use.system.libpath=true

oozie.coord.application.path=${nameNode}/${oozieAppsRoot}/wf-hive-select
start=2019-07-07T21:50+0800
end=2019-07-07T21:59+0800
workflowAppUri=${nameNode}/${oozieAppsRoot}/wf-hive-select
outputDir=wf-hive-select/output


workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.5" name="wf-hive-select">
    <start to="hive-node"/>

    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/${oozieDataRoot}/${outputDir}"/>
            </prepare>
            <job-xml>${nameNode}/${oozieAppsRoot}/hive-select/hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>select-tab.sql</script>
            <param>OUTPUT=${nameNode}/${oozieDataRoot}/${outputDir}</param>
        </hive>
        <ok to="sqoop-node"/>
        <error to="fail"/>
    </action>

    <action name="sqoop-node">
        <sqoop xmlns="uri:oozie:sqoop-action:0.3">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <command>export --connect jdbc:mysql://hadoop-senior01.zhangbk.com:3306/test --username root --password password01 --table xzqh_hive --input-fields-terminated-by "\t" --export-dir hdfs://ns1/user/oozie/datas/wf-hive-select/output --num-mappers 1</command>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>


    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

select-tab.sql

drop table if exists xzqh_tmp ;
create table if not exists default.xzqh_tmp like default.dm_gy_xzqh_hive location '${OUTPUT}';
insert overwrite table default.xzqh_tmp
select * 
from default.dm_gy_xzqh_hive 
order by xzqhsz_dm
limit 100 ;

coordinator.xml

<coordinator-app name="cron-coord" frequency="0/10 * * * *" 
                   start="${start}" end="${end}" timezone="GMT+0800"
                 xmlns="uri:oozie:coordinator:0.4">
        <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>

A problem you may hit:

    java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.HiveMain not found

Fix:

The sharelib jars are missing from the job's classpath; add the following to job.properties:
    oozie.use.system.libpath=true
-------------------------------------------------------------------------------------------------------
Oozie
  * workflow
  * coordinator
      time-triggered
        ${coord:days(1)}
        cron expressions
      data-triggered (dataset availability)
  * bundle (a minimal sketch follows)
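      A bundle groups multiple coordinators under one job; a minimal bundle.xml
      sketch reusing the app path from the cron example above (the schema version
      is an assumption):

        <bundle-app name="bundle-app" xmlns="uri:oozie:bundle:0.2">
            <coordinator name="coord-cron">
                <app-path>${nameNode}/${oozieAppsRoot}/cron</app-path>
            </coordinator>
        </bundle-app>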
      
