The Scala code compiles successfully, but when the job is launched through the shell script (spark-submit), it fails with the following error:
ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:934)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:925)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:925)
at com.uniclick.job.CreateOutdataReportJob$.main(CreateOutdataReportJob.scala:269)
at com.uniclick.job.CreateOutdataReportJob.main(CreateOutdataReportJob.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Cause:
The argument passed on the command line (typically a JSON object) contains spaces. Without quoting, the shell splits it at each space into several separate arguments, which mangles the spark-submit invocation.
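To see what actually goes wrong, imagine printing the arguments the Spark job receives. The sketch below uses a hypothetical ArgDump object (not part of the job above): if test.sh expands $1 without double quotes, the shell performs word splitting on the expanded value, so the JSON arrives in fragments.

object ArgDump {
  def main(args: Array[String]): Unit = {
    // If test.sh runs `spark-submit ... test.jar $1` with $1 unquoted,
    // word splitting breaks the JSON at every space:
    //   args(0) = {"start":   args(1) = "2020-06-18   args(2) = 00:00:00", ...
    args.zipWithIndex.foreach { case (a, i) => println(s"args($i) = $a") }
  }
}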
Solution:
1. In Scala, build the command as a Seq and execute the shell script via scala.sys.process; note that within the Seq, the JSON object must be wrapped in single quotes:
import scala.sys.process._

// Hold the JSON in a string; triple quotes avoid escaping the embedded double quotes.
val json = """{"start": "2020-06-18 00:00:00", "end": "2020-06-18 23:59:59"}"""
// Wrap the JSON in single quotes so it is carried through as a single argument.
val command = Seq("sh", "/data/test.sh", s"'$json'")
val pb: ProcessBuilder = Process(command)
val runningCommand = pb.run()
val status: Int = runningCommand.exitValue() // blocks until the script exits
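Because Process is given a Seq, each element reaches the operating system as a single argv entry with no shell word-splitting in between, so the single quotes travel literally as part of the argument. One consequence (an assumption about this setup, not something stated above) is that $1 inside test.sh will contain those literal quote characters, and the job on the other end may need to strip them before parsing the JSON; the sketch at the end of this post shows one way.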
2. In the shell script, wrap the JSON parameter in double quotes, as follows:
# /data/test.sh
spark-submit --master yarn --deploy-mode client --driver-memory 6G \
--conf spark.executor.memoryOverhead=2048 \
--conf spark.port.maxRetries=128 \
--executor-memory 4G \
--jars ${jars} \
--class com.test.job.CreateDataJob test-data-1.0.jar "$1"
Here $1 is the JSON variable from above; it must be wrapped in double quotes so the shell does not split it again when the script expands it.
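For completeness, a minimal sketch of the receiving side, assuming the job reads the JSON from args(0) (the object name and the quote-stripping step are illustrative, not the actual CreateOutdataReportJob code):

object CreateDataJob {
  def main(args: Array[String]): Unit = {
    // With "$1" quoted in test.sh, the whole JSON arrives as a single argument.
    // The single quotes added in the Scala Seq come through literally, so strip them.
    val json = args(0).stripPrefix("'").stripSuffix("'")
    println(s"received json: $json")
    // ... parse json and build the SparkSession as usual
  }
}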