Scheduling a Spark2 job with Oozie: a worked example

1. Runtime environment

CDH: CDH 5.16.1

Java: 1.8

Scala: 2.11.8

Spark: 2.2.0

Oozie: oozie-4.1.0


2. Create the spark2 sharelib directory

hadoop fs -mkdir /user/oozie/share/lib/lib_20190121152411/spark2

Upload the Spark2 jars:

hadoop fs -put /opt/cloudera/parcels/SPARK2/lib/spark2/jars/* /user/oozie/share/lib/lib_20190121152411/spark2

Copy the Oozie Spark sharelib jars from the existing spark directory into spark2:

hadoop fs -cp /user/oozie/share/lib/lib_20190121152411/spark/oozie-sharelib-spark-4.1.0-cdh5.16.1.jar /user/oozie/share/lib/lib_20190121152411/spark2

hadoop fs -cp /user/oozie/share/lib/lib_20190121152411/spark/oozie-sharelib-spark.jar /user/oozie/share/lib/lib_20190121152411/spark2
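After the jars are in HDFS, the Oozie server still has to pick up the new directory. A minimal sketch of refreshing and verifying the sharelib with the Oozie admin CLI (the Oozie URL matches the one in job.properties; the commands are skipped when the oozie client is not on the PATH):

```shell
#!/bin/sh
# Sharelib path and Oozie URL used throughout this walkthrough.
SHARELIB=/user/oozie/share/lib/lib_20190121152411
OOZIE_URL=http://cdh01:11000/oozie

# Ask the Oozie server to rescan the sharelib, then confirm that
# the new spark2 directory is visible to it.
if command -v oozie >/dev/null 2>&1; then
  oozie admin -oozie "$OOZIE_URL" -sharelibupdate
  oozie admin -oozie "$OOZIE_URL" -shareliblist spark2
fi

echo "sharelib: $SHARELIB/spark2"
```

If `spark2` does not appear in the `-shareliblist` output, a restart of the Oozie service also forces a rescan.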


3. Create job.properties, workflow.xml, and a lib directory holding the application jars

job.properties:

nameNode=hdfs://cdh01:8020
jobTracker=cdh01:8032
master=yarn-cluster
queueName=default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/spark2
workflowpath=${nameNode}/user/${user.name}/${examplesRoot}/apps/spark2
userName=root
groupsName=supergroup
#jars in hdfs
oozie.libpath=/user/oozie/share/lib/lib_20190121152411/spark2
oozie.subworkflow.classpath.inheritance=true
#oozie url
oozieUrl=http://cdh01:11000/oozie/

workflow.xml:

<workflow-app xmlns='uri:oozie:workflow:0.5' name='SparkPi'>
    <start to="SparkPi" />
    <action name="SparkPi">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>oozie.action.sharelib.for.spark</name>
                    <value>spark2</value>
                </property>
            </configuration>
            <master>${master}</master>
            <name>SparkPi</name>
            <class>org.apache.spark.examples.SparkPi</class>
            <jar>${nameNode}/user/${wf:user()}/${examplesRoot}/apps/spark2/lib/spark-examples_2.11-2.3.0.cloudera3.jar</jar>
            <spark-opts> --deploy-mode cluster --driver-memory 2G --executor-memory 4G --num-executors 5 --executor-cores 2</spark-opts>
        </spark>
        <ok to="end" />

        <error to="fail_kill" />
    </action>
    <kill name="fail_kill">
        <message>Job failed, error
            message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>

While testing this I kept hitting: Error: E0701 : E0701: XML schema error, cvc-complex-type.2.4.a: Invalid content was found starting with element 'name'. One of '{"uri:oozie:spark-action:0.1":master}' is expected.

The cause was that I had swapped the order of `<master>` and `<name>`: the spark-action:0.1 schema requires `<master>` to come before `<name>`. With `<master>` first, the workflow validates and runs normally.


4. Upload all files in that directory to ${nameNode}/user/${wf:user()}/${examplesRoot}/apps/spark2
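The upload itself is a couple of HDFS puts. A sketch, assuming the two files and the lib/ jar directory sit in the current local directory and the user is root (so the resolved application path is /user/root/examples/apps/spark2); skipped when no hadoop client is available:

```shell
#!/bin/sh
# Resolved application path from job.properties:
# ${nameNode}/user/${user.name}/${examplesRoot}/apps/spark2, with user root.
APP_PATH=/user/root/examples/apps/spark2

# Push workflow.xml, job.properties and the jars under lib/.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir -p "$APP_PATH/lib"
  hadoop fs -put -f workflow.xml job.properties "$APP_PATH"
  hadoop fs -put -f lib/*.jar "$APP_PATH/lib"
fi

echo "app path: $APP_PATH"
```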

5. Start the Oozie job

With everything in place, the job runs successfully.
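Starting the job can be done from the Oozie CLI. A minimal sketch (the job-id extraction is illustrative; `-run` prints a line of the form `job: <id>`, and the commands are skipped without the oozie client):

```shell
#!/bin/sh
OOZIE_URL=http://cdh01:11000/oozie

if command -v oozie >/dev/null 2>&1; then
  # Submit and start the workflow in one step.
  JOB_ID=$(oozie job -oozie "$OOZIE_URL" -config job.properties -run \
           | awk '{print $2}')
  # Poll the workflow status (RUNNING / SUCCEEDED / KILLED ...).
  oozie job -oozie "$OOZIE_URL" -info "$JOB_ID"
fi

echo "oozie: $OOZIE_URL"
```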

 

Problems encountered:

If the Oozie web UI does not show the detailed logs, open the YARN ResourceManager UI on port 8088 and dig through the application logs there.
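The same logs can also be pulled from the command line with the YARN log-aggregation CLI, which dumps stdout/stderr of every container, including the Spark driver. The application id below is a placeholder; copy the real application_<ts>_<n> id from the 8088 UI:

```shell
#!/bin/sh
# Placeholder id; substitute the real one from the ResourceManager UI.
APP_ID=application_0000000000000_0001

if command -v yarn >/dev/null 2>&1; then
  # Prints the aggregated logs of all containers for this application.
  yarn logs -applicationId "$APP_ID"
fi

echo "$APP_ID"
```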

Problem:

Caused by: org.apache.spark.SparkException: Exception when registering SparkListener
	at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2371)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:554)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:933)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:924)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:924)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:688)
Caused by: java.lang.ClassNotFoundException: com.cloudera.spark.lineage.ClouderaNavigatorListener
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.util.Utils$.classForName(Utils.scala:239)
	at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2738)
	at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2736)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
	at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2736)
	at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2360)
	at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2359)
	at scala.Option.foreach(Option.scala:257)
	at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2359)

Solution:

The driver fails with ClassNotFoundException: com.cloudera.spark.lineage.ClouderaNavigatorListener because the CDH Spark2 client configuration registers the Cloudera Navigator lineage listener, whose jar is not in the spark2 sharelib. Two commonly used fixes (verify against your own cluster): copy the spark-lineage jar from the SPARK2 parcel into the spark2 sharelib directory, or disable the listener for this job by adding `--conf spark.extraListeners= --conf spark.sql.queryExecutionListeners=` to `<spark-opts>`.
