78.1 Demo Environment
- CM and CDH version: 5.13.1
- Kerberos is not enabled
78.2 Walkthrough
The ooziejob.sh script:
#!/bin/bash
name=$1
echo "hello $name" >> /tmp/oozieshell.log
Upload the script to an HDFS directory:
sudo -u faysontest hadoop fs -mkdir -p /faysontest/jars
sudo -u faysontest hadoop fs -put /opt/ooziejob.sh /faysontest/jars
sudo -u faysontest hadoop fs -ls /faysontest/jars
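- Define the workflow.xml file for the Shell Action:
- The values used in workflow.xml (${jobTracker}, ${nameNode}, ${exec}, ${argument}) are dynamic parameters, supplied when the job is submitted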
<workflow-app name="ShellWorkflow" xmlns="uri:oozie:workflow:0.5">
<start to="shell-d9b6"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="shell-d9b6">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>${exec}</exec>
<argument>${argument}</argument>
<capture-output/>
</shell>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>
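To make the role of the dynamic parameters concrete, the following is a minimal sketch of how ${name} placeholders could be filled in from submitted properties. It is a simplified illustration only, not Oozie's actual EL evaluator; the class name PlaceholderSketch and the method resolve are hypothetical, and expressions such as ${wf:errorMessage(...)} are deliberately left untouched.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified illustration of ${name} substitution; NOT Oozie's real EL evaluator. */
final class PlaceholderSketch {
    // Matches simple placeholders such as ${nameNode}; function calls like ${wf:...} do not match.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_.]*)\\}");

    /** Replaces each ${key} with its property value; unknown keys are left as-is. */
    static String resolve(String text, Properties props) {
        String current = text;
        // Repeat so that values containing placeholders are resolved too;
        // the pass count is bounded to avoid looping on cyclic definitions.
        for (int pass = 0; pass < 10; pass++) {
            Matcher m = PLACEHOLDER.matcher(current);
            StringBuffer out = new StringBuffer();
            boolean changed = false;
            while (m.find()) {
                String value = props.getProperty(m.group(1));
                if (value == null) {
                    m.appendReplacement(out, Matcher.quoteReplacement(m.group()));
                } else {
                    m.appendReplacement(out, Matcher.quoteReplacement(value));
                    changed = true;
                }
            }
            m.appendTail(out);
            current = out.toString();
            if (!changed) {
                break;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("nameNode", "hdfs://ip-186-31-10-118.fayson.com:8020");
        props.setProperty("exec", "${nameNode}/faysontest/jars/ooziejob.sh");
        props.setProperty("argument", "fayson");
        System.out.println(resolve("<exec>${exec}</exec>", props));
        System.out.println(resolve("<argument>${argument}</argument>", props));
    }
}
```

The point of the sketch is only that workflow.xml stays generic while the concrete values travel with each submission.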
Upload workflow.xml to the /user/faysontest/oozie/shellaction directory on HDFS:
[root@ip-186-31-6-148 opt]# sudo -u faysontest hadoop fs -mkdir -p /user/faysontest/oozie/shellaction
[root@ip-186-31-6-148 opt]# sudo -u faysontest hadoop fs -put /opt/workflow.xml /user/faysontest/oozie/shellaction
[root@ip-186-31-6-148 opt]# sudo -u faysontest hadoop fs -ls /user/faysontest/oozie/shellaction
- Create a Java project with Maven
- The pom.xml file is as follows:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>cdh-project</artifactId>
<groupId>com.cloudera</groupId>
<version>1.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>oozie-demo</artifactId>
<packaging>jar</packaging>
<name>oozie-demo</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.5.4</version>
</dependency>
<dependency>
<groupId>net.sourceforge.spnego</groupId>
<artifactId>spnego</artifactId>
<version>7.0</version>
</dependency>
<dependency>
<groupId>org.apache.oozie</groupId>
<artifactId>oozie-client</artifactId>
<version>4.1.0</version>
</dependency>
</dependencies>
</project>
The ShellWorkflowDemo.java code is as follows:
package com.cloudera.nokerberos;
import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowAction;
import org.apache.oozie.client.WorkflowJob;
import java.util.List;
import java.util.Properties;
/**
* package: com.cloudera.nokerberos
* describe: Submit a Shell Action job to a non-Kerberos cluster through the Oozie client API
* create_user: Fayson
* email: htechinfo@163.com
* create_date: 2018/2/13
* create_time: 11:10 PM
* WeChat official account: 碧茂科技
*/
public class ShellWorkflowDemo {
private static String oozieURL = "http://ip-186-31-6-148.fayson.com:11000/oozie";
public static void main(String[] args) {
System.setProperty("user.name", "faysontest");
OozieClient oozieClient = new OozieClient(oozieURL);
try {
System.out.println(oozieClient.getServerBuildVersion());
Properties properties = oozieClient.createConfiguration();
properties.put("oozie.wf.application.path", "${nameNode}/user/faysontest/oozie/shellaction");
properties.put("oozie.use.system.libpath", "True");
properties.put("nameNode", "hdfs://ip-186-31-10-118.fayson.com:8020");
properties.put("jobTracker", "ip-186-31-6-148.fayson.com:8032");
properties.put("exec", "${nameNode}/faysontest/jars/ooziejob.sh");
properties.put("argument", "fayson");
//Run the workflow
String jobid = oozieClient.run(properties);
System.out.println(jobid);
//Wait 10s on the main thread so the status query below runs after the delay
//(sleeping inside a separately started thread would not pause main)
Thread.sleep(10000L);
//Look up the job's run information by workflow id
WorkflowJob workflowJob = oozieClient.getJobInfo(jobid);
//Fetch the job log
System.out.println(oozieClient.getJobLog(jobid));
//Get all actions in the workflow
List<WorkflowAction> list = workflowJob.getActions();
for (WorkflowAction action : list) {
//Print each action's external id, i.e. the YARN Application ID
System.out.println(action.getExternalId());
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
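The demo waits a fixed 10 seconds before querying the job. An alternative is to poll until the workflow reaches a terminal state. The following is a minimal sketch of such a loop, written against a plain Supplier<String> so that it runs without a cluster; the class JobPollingSketch and the method waitForTerminalStatus are hypothetical names. Against a real server, the supplier would presumably wrap oozieClient.getJobInfo(jobid).getStatus().name(), converting the checked OozieClientException into an unchecked one.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Supplier;

/** Sketch: poll a status source until a terminal workflow state or a poll limit is reached. */
final class JobPollingSketch {
    // Terminal states of an Oozie workflow job (names from WorkflowJob.Status).
    private static final Set<String> TERMINAL =
            new HashSet<>(Arrays.asList("SUCCEEDED", "KILLED", "FAILED"));

    /**
     * Calls statusSource at most maxPolls times, sleeping intervalMillis between calls,
     * and returns the last status seen (terminal or not).
     */
    static String waitForTerminalStatus(Supplier<String> statusSource,
                                        long intervalMillis, int maxPolls) {
        String status = statusSource.get();
        for (int i = 1; i < maxPolls && !TERMINAL.contains(status); i++) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop polling early.
                Thread.currentThread().interrupt();
                return status;
            }
            status = statusSource.get();
        }
        return status;
    }

    public static void main(String[] args) {
        // Fake status sequence standing in for repeated getJobInfo calls.
        Iterator<String> fake =
                Arrays.asList("PREP", "RUNNING", "RUNNING", "SUCCEEDED").iterator();
        System.out.println(waitForTerminalStatus(fake::next, 100L, 10)); // prints SUCCEEDED
    }
}
```

Bounding the loop with maxPolls keeps the client from hanging if the job is suspended or the server stops responding with a new state.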
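- Summary
- To submit a job through the Oozie API, the workflow.xml file must be defined first
- Parameters are passed by calling oozieClient.createConfiguration() in the code to create a Properties object, storing the key/value pairs in it, and passing it to oozieClient.run(properties)
- When specifying the path of a script, jar, or workflow that lives on HDFS, include the full HDFS path (with the nameNode prefix); otherwise the path is resolved against the local directory by default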
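As a small illustration of the last point, the sketch below builds a fully qualified HDFS path from a nameNode address and a path, avoiding missing or doubled slashes at the join. The class HdfsPathSketch and the method qualify are hypothetical helpers for this example, not part of the Oozie or Hadoop APIs.

```java
/** Sketch: join a nameNode URI and an HDFS path with exactly one slash between them. */
final class HdfsPathSketch {
    static String qualify(String nameNode, String path) {
        // Already fully qualified: return unchanged.
        if (path.startsWith("hdfs://")) {
            return path;
        }
        String base = nameNode.replaceAll("/+$", ""); // drop trailing slashes
        String rest = path.replaceAll("^/+", "");     // drop leading slashes
        return base + "/" + rest;
    }

    public static void main(String[] args) {
        String nameNode = "hdfs://ip-186-31-10-118.fayson.com:8020";
        System.out.println(qualify(nameNode, "/faysontest/jars/ooziejob.sh"));
        System.out.println(qualify(nameNode, "/user/faysontest/oozie/shellaction"));
    }
}
```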