conf/spark-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
export YARN_CONF_DIR=/opt/module/hadoop/etc/hadoop
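With JAVA_HOME and YARN_CONF_DIR set, Spark can locate the YARN cluster configuration. A quick sanity check is to submit the bundled SparkPi example to YARN; the examples jar name below assumes a Spark 3.0.0 / Scala 2.12 build, so adjust it to your installation:
[root@linux1 spark]# bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode client \
./examples/jars/spark-examples_2.12-3.0.0.jar \
10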
spark-defaults.conf
Configure the event log storage path:
spark.eventLog.enabled true
spark.eventLog.dir hdfs://linux1:8020/directory
The Hadoop cluster must be running, and the directory must already exist on HDFS.
[root@linux1 hadoop]# sbin/start-dfs.sh
[root@linux1 hadoop]# hadoop fs -mkdir /directory
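To confirm the directory was created before enabling event logging, list the HDFS root:
[root@linux1 hadoop]# hadoop fs -ls /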
Modify spark-env.sh and add the history server configuration:
export SPARK_HISTORY_OPTS="
-Dspark.history.ui.port=18080
-Dspark.history.fs.logDirectory=hdfs://linux1:8020/directory
-Dspark.history.retainedApplications=30"
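After saving spark-env.sh, the history server can be started with the standard Spark script and its UI reached at the port configured above:
[root@linux1 spark]# sbin/start-history-server.sh
Then open http://linux1:18080 in a browser to see the list of completed applications.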
4) Modify spark-defaults.conf
spark.yarn.historyServer.address=linux1:18080
spark.history.ui.port=18080
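These settings tell the ApplicationMaster where the Spark history server lives, so the History link in the YARN ResourceManager UI (default http://linux1:8088) redirects to linux1:18080 for finished applications. Restart the history server after editing the file:
[root@linux1 spark]# sbin/stop-history-server.sh
[root@linux1 spark]# sbin/start-history-server.sh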
---------------- Possible problems ----------------
If the detailed logs cannot be viewed from the YARN log link, add the following configuration to yarn-site.xml and start the YARN history server:
yarn-site.xml
<property>
    <name>yarn.log.server.url</name>
    <value>http://hadoop204:19888/jobhistory/logs</value>
</property>
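Port 19888 is the web UI of the MapReduce JobHistory server, which must be running for the link above to resolve. The start command depends on the Hadoop version (paths are relative to the Hadoop home):
For Hadoop 2.x:
[root@linux1 hadoop]# sbin/mr-jobhistory-daemon.sh start historyserver
For Hadoop 3.x:
[root@linux1 hadoop]# bin/mapred --daemon start historyserver
Viewing aggregated container logs also requires yarn.log-aggregation-enable to be set to true in yarn-site.xml.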