While running a routine Spark job, the job failed at startup with the following error.
Error message:
java.lang.ClassNotFoundException: org.openx.data.jsonserde.JsonSerDe
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.openx.data.jsonserde.JsonSerDe
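Before patching the submit command, it helps to confirm which jar under Hive's lib actually ships the missing class. A minimal sketch, assuming Hive is installed under /home/work/hive (adjust the path for your cluster):

# Scan Hive's lib directory for the jar containing the missing SerDe class
for j in /home/work/hive/lib/*.jar; do
  if unzip -l "$j" 2>/dev/null | grep -q 'org/openx/data/jsonserde/JsonSerDe.class'; then
    echo "found in: $j"
  fi
done

This should turn up the json-serde jar referenced in the fixed command below.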
spark-submit command:
spark-submit -v --master yarn --name ${JOB_NAME} --num-executors 20 --driver-memory 5g --executor-memory 10g --executor-cores 4 --queue low \
--class com.****.****Spark.Runner ****-0.0.1-jar-with-dependencies.jar \
--cluster ${CLUSTER} \
--db ${DB} \
--confPath ${CONFPATH} 1>spark_error.log
Cause of the error
The jar that provides the SerDe for reading JSON-format Hive tables is missing from the job's classpath. Appending the corresponding jar resolves the problem; in my case the jar already exists in Hive's lib directory, so simply adding it via --jars is enough. Modify the submit command as follows (a quick verification sketch follows the fixed command):
spark-submit -v --master yarn --name ${JOB_NAME} --num-executors 20 --driver-memory 5g --executor-memory 10g --executor-cores 4 --queue low \
--jars /home/work/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar \
--class com.****.****Spark.Runner ****-0.0.1-jar-with-dependencies.jar \
--cluster ${CLUSTER} \
--db ${DB} \
--confPath ${CONFPATH} 1>spark_error.log
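To verify the fix on the driver side, one quick check (a sketch: it pipes a one-line class lookup into spark-shell, reusing the same jar path) is:

# Pipe a class lookup into the spark-shell REPL; it runs the line and exits
echo 'Class.forName("org.openx.data.jsonserde.JsonSerDe")' | \
  spark-shell --jars /home/work/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar

If the lookup returns the class instead of throwing ClassNotFoundException, the jar is on the driver classpath; --jars also ships it to the executors.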
Problem solved!
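As a follow-up: if many jobs read these JSON tables, passing --jars on every submit gets tedious. An alternative sketch, assuming the same jar path and write access to Spark's conf directory, is to register the jar once via the spark.jars property:

# Append to $SPARK_HOME/conf/spark-defaults.conf (path assumed; adjust to your install)
spark.jars /home/work/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar

With that in place, the submit command no longer needs the --jars flag.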