Spark standalone configuration
(1) Create the configuration files from their templates:
cd /opt/module/spark/conf
mv slaves.template slaves
mv spark-env.sh.template spark-env.sh
(2) Edit the configuration. List the worker hosts in slaves, and set the master host and port in spark-env.sh:
vim /opt/module/spark/conf/slaves
192.168.1.101
192.168.1.102
192.168.1.103
vim /opt/module/spark/conf/spark-env.sh
SPARK_MASTER_HOST=192.168.1.101
SPARK_MASTER_PORT=7077
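The two edits above can also be made without an editor. The following is only a sketch: CONF_DIR is a hypothetical variable that defaults to a scratch directory so the commands can be tried safely. Point it at /opt/module/spark/conf to apply them for real.

```shell
# CONF_DIR is a hypothetical variable; it defaults to a scratch directory.
# Set CONF_DIR=/opt/module/spark/conf to write the real files.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# One worker host per line, as in the slaves file above.
printf '%s\n' 192.168.1.101 192.168.1.102 192.168.1.103 > "$CONF_DIR/slaves"

# Append the master settings to spark-env.sh.
cat >> "$CONF_DIR/spark-env.sh" <<'EOF'
SPARK_MASTER_HOST=192.168.1.101
SPARK_MASTER_PORT=7077
EOF

echo "wrote $CONF_DIR/slaves and $CONF_DIR/spark-env.sh"
```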
(3) Distribute the Spark directory to all three machines:
xsync /opt/module/spark/
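xsync is not shipped with Spark; it is presumably a site-specific wrapper around rsync. If it is not available, a rough stand-in might look like the sketch below. The function name sync_spark, the DRY_RUN switch, and the assumption that the command runs on 192.168.1.101 with passwordless SSH to the other two hosts are all illustrative, not taken from these notes.

```shell
# Hypothetical stand-in for xsync: copy the Spark directory to the other hosts.
SPARK_DIR=/opt/module/spark
HOSTS="192.168.1.102 192.168.1.103"   # assumed: run from 192.168.1.101
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the commands

sync_spark() {
  for host in $HOSTS; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "rsync -a $SPARK_DIR/ $host:$SPARK_DIR/"
    else
      rsync -a "$SPARK_DIR/" "$host:$SPARK_DIR/"
    fi
  done
}

sync_spark
```

With the default DRY_RUN=1 the function only prints the rsync commands; run it with DRY_RUN=0 to perform the copy.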
(4) Start the cluster:
/opt/module/spark/sbin/start-all.sh
or, alternatively, /opt/module/spark/sbin/start-master.sh followed by /opt/module/spark/sbin/start-slaves.sh
If you hit a "JAVA_HOME not set" error, add the following line to spark-config.sh in /opt/module/spark/sbin:
export JAVA_HOME=/opt/module/jdk1.8.0_144
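One way to confirm that the start step worked is to look for the Master and Worker JVMs in the output of jps. The helper below is a minimal sketch; check_daemons is a hypothetical name, and it assumes the usual jps output of one "PID Name" pair per line.

```shell
# check_daemons reads jps-style output on stdin and reports every
# daemon name given as an argument that does not appear in it.
check_daemons() {
  jps_output="$(cat)"
  missing=0
  for name in "$@"; do
    # Column 2 of each jps line is the main class name.
    if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$name"; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

On 192.168.1.101, which is both the master and a listed worker, this would be used as `jps | check_daemons Master Worker`; on the other two hosts, `jps | check_daemons Worker`.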
Spark on YARN configuration
1) Rename spark-defaults.conf.template:
mv spark-defaults.conf.template spark-defaults.conf
2) Edit spark-defaults.conf to enable event logging:
vim spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://192.168.1.102:8020/directory/
(use hdfs://192.168.1.102:9000/directory/ instead if the NameNode listens on port 9000; the port must match fs.defaultFS in the Hadoop configuration)
spark.yarn.historyServer.address 192.168.1.102:18080
spark.history.ui.port 18080
Note: the directory on HDFS must already exist.
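A minimal sketch of creating that directory, assuming the hdfs client is on the PATH and the current user may write to the HDFS root. EVENT_LOG_DIR is a hypothetical variable holding the path part of spark.eventLog.dir.

```shell
# Path component of spark.eventLog.dir (hypothetical variable name).
EVENT_LOG_DIR="/directory"

if command -v hdfs >/dev/null 2>&1; then
  # Create the directory only if it is not there yet.
  hdfs dfs -test -d "$EVENT_LOG_DIR" || hdfs dfs -mkdir -p "$EVENT_LOG_DIR"
else
  echo "hdfs command not found; run this on a node with the Hadoop client installed" >&2
fi
```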
3) Edit spark-env.sh and add the following:
vim spark-env.sh
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=hdfs://192.168.1.102:8020/directory"
YARN_CONF_DIR=/etc/hadoop/conf
export HADOOP_USER_NAME=hdfs
4) Restart the history server: sbin/start-history-server.sh
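The history server only shows applications if spark.eventLog.dir in spark-defaults.conf and spark.history.fs.logDirectory in spark-env.sh point at the same location, including the NameNode port. The sketch below compares the two values; the function names are hypothetical, and it assumes spark-defaults.conf separates keys and values with whitespace, as in the lines above.

```shell
# Value of spark.eventLog.dir from a spark-defaults.conf style file,
# with trailing slashes removed.
event_log_dir() {
  awk '$1 == "spark.eventLog.dir" {print $2}' "$1" | sed 's:/*$::'
}

# Value of -Dspark.history.fs.logDirectory=... from a spark-env.sh style file,
# with trailing slashes removed.
history_log_dir() {
  grep -o 'spark\.history\.fs\.logDirectory=[^ "]*' "$1" | cut -d= -f2- | sed 's:/*$::'
}

# Succeeds only if both values are present and identical.
dirs_match() {
  [ -n "$(event_log_dir "$1")" ] && [ "$(event_log_dir "$1")" = "$(history_log_dir "$2")" ]
}
```

From the conf directory this could be run as `dirs_match spark-defaults.conf spark-env.sh && echo consistent`.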