1. Spark on YARN installed through CDH
Submitting a job directly in cluster mode:
spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--num-executors 3 \
--driver-memory 4g \
--executor-memory 2g \
--executor-cores 1 \
--queue thequeue \
./spark-examples*.jar
fails with the following error:
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:197)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:299)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Next, pull the application's logs to trace the error:
yarn logs -applicationId application_1429759514549_0001
The logs show the root cause:
Exception in thread "Driver" java.io.IOException: Error in creating log directory: file:/user/spark/applicationHistory/application_1429759514549_0001
at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:133)
at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)
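The output of yarn logs can be long, so it may help to filter it down to the first exception. A minimal sketch, assuming a grep that supports -m and -A (GNU and BSD grep both do); first_exception is a hypothetical helper name, not part of YARN or Spark:

```shell
# Hypothetical helper: read log text on stdin and print the first
# "Exception in thread" line plus up to 5 lines that follow it.
first_exception() {
  grep -m 1 -A 5 'Exception in thread'
}

# Usage, with the application ID from the run above:
#   yarn logs -applicationId application_1429759514549_0001 | first_exception
```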
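The key message is
Error in creating log directory:
Note the file: scheme in the path reported above: the event log directory was resolved as a local filesystem path rather than an HDFS path.
Solution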
Open Spark's configuration file spark-defaults.conf, which contains:
spark.eventLog.dir=/user/spark/applicationHistory
spark.eventLog.enabled=true
spark.yarn.historyServer.address=http://slave3.hadoop.gitv.we:18088
spark.driver.extraLibraryPath=/opt/soft/BI/cloudera/cm/cm5.3.1/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hadoop/lib/native
spark.executor.extraLibraryPath=/opt/soft/BI/cloudera/cm/cm5.3.1/cloudera/parcels/CDH-5.3.1-1.cdh5.3.1.p0.5/lib/hadoop/lib/native
Change spark.eventLog.dir to an HDFS directory (hdfs://nameservice1 is my HDFS nameservice, since HA is configured):
spark.eventLog.dir=hdfs://nameservice1/user/spark/applicationHistory
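To catch this kind of misconfiguration before submitting a job, a small check along the following lines could be run against spark-defaults.conf. This is only a sketch: check_event_log_dir is a hypothetical helper, and it assumes the property is written either as key=value or as key followed by whitespace, and that an hdfs:// URI is what you want.

```shell
# Hypothetical helper: report whether spark.eventLog.dir in the given
# config file is an explicit hdfs:// URI.
# Return codes: 0 = hdfs:// URI, 1 = some other value, 2 = not set.
check_event_log_dir() {
  conf_file="$1"
  # Strip the key and its separator ('=' or whitespace); keep the last entry.
  dir=$(sed -n 's/^[[:space:]]*spark\.eventLog\.dir[[:space:]=][[:space:]=]*//p' "$conf_file" | tail -n 1)
  case "$dir" in
    "")       echo "spark.eventLog.dir is not set"; return 2 ;;
    hdfs://*) echo "ok: $dir"; return 0 ;;
    *)        echo "warning: $dir is not an hdfs:// URI"; return 1 ;;
  esac
}

# Usage (the path to the config file is an assumption; adjust for your install):
#   check_event_log_dir /etc/spark/conf/spark-defaults.conf
```

After the change it is also worth confirming that the directory exists on HDFS and is writable by the submitting user, for example with hdfs dfs -ls hdfs://nameservice1/user/spark/applicationHistory.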