Contents
Configuration files
Go to Spark's conf directory and make a copy of spark-defaults.conf.template named spark-defaults.conf:
[fengling@hadoop129 conf]$ pwd
/opt/module/spark-2.4.4-bin-hadoop2.7/conf
[fengling@hadoop129 conf]$ cp spark-defaults.conf.template spark-defaults.conf
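In spark-defaults.conf, uncomment the following three Spark settings and adjust the host names, ports, and paths to match your own cluster: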
spark.master spark://hadoop129:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://hadoop129:9000/spark/logs
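It is easy to leave one of these lines commented out, so a quick check can help. The following is only a minimal sketch: the function name check_spark_defaults is made up for illustration, and it assumes the whitespace-separated "key value" layout used by the template.

```shell
# Sketch: confirm the three settings are active in spark-defaults.conf.
# Assumption: each setting is written as "key value" separated by whitespace,
# as in spark-defaults.conf.template. check_spark_defaults is a hypothetical helper.
check_spark_defaults() {
  conf_file="$1"
  missing=0
  for key in spark.master spark.eventLog.enabled spark.eventLog.dir; do
    # A commented line has "#" in (or as) its first field, so it never matches.
    if ! awk -v k="$key" '$1 == k && NF >= 2 { found = 1 } END { exit !found }' "$conf_file"; then
      echo "missing or still commented out: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (path taken from the steps above):
#   check_spark_defaults /opt/module/spark-2.4.4-bin-hadoop2.7/conf/spark-defaults.conf
```

The function returns 0 when all three keys are present and 1 otherwise, printing one line per missing key.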
Edit spark-env.sh and add the history server options:
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=4000
-Dspark.history.retainedApplications=3
-Dspark.history.fs.logDirectory=hdfs://hadoop129:9000/spark/logs"
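Applications write their event logs to spark.eventLog.dir and the history server reads them from spark.history.fs.logDirectory, so the two values need to point at the same location. A rough way to compare them is sketched below; the helper names are hypothetical, and the parsing assumes the simple layouts shown above.

```shell
# Sketch: compare the event log directory in spark-defaults.conf with the
# one passed to the history server. All function names here are hypothetical.

# Print the value of spark.eventLog.dir from a "key value" style conf file.
event_log_dir() {
  awk '$1 == "spark.eventLog.dir" { print $2; exit }' "$1"
}

# Print the value of -Dspark.history.fs.logDirectory from an options string
# (the string may span several lines, as in spark-env.sh above).
history_log_dir() {
  printf '%s\n' "$1" | tr -s '[:space:]' '\n' \
    | sed -n 's/^-Dspark\.history\.fs\.logDirectory=//p' | head -n 1
}

# Succeed only if both values are set and identical.
same_log_dir() {
  a=$(event_log_dir "$1")
  b=$(history_log_dir "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage, after sourcing spark-env.sh so SPARK_HISTORY_OPTS is set:
#   same_log_dir conf/spark-defaults.conf "$SPARK_HISTORY_OPTS" && echo "log dirs match"
```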
Once the changes are done, sync the conf directory to the other machines:
[fengling@hadoop129 spark-2.4.4-bin-hadoop2.7]$ xsync conf/
For the xsync distribution script used here, see this blog post: My Big Data Journey - xsync cluster distribution script
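The steps above do not show creating the log directory or starting the history server. With event logging enabled, Spark expects the directory in spark.eventLog.dir to exist already, so something along the following lines is usually needed before submitting a job. This is a sketch, not the author's exact procedure: the run and prepare_history_server helpers and the DRY_RUN switch are made up for illustration, while hadoop fs -mkdir -p and sbin/start-history-server.sh are the standard Hadoop and Spark commands.

```shell
# Sketch: create the HDFS event log directory and start the history server.
# DRY_RUN, run and prepare_history_server are hypothetical helpers; with
# DRY_RUN=1 (the default) the commands are only printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

prepare_history_server() {
  # Defaults follow the paths used in this post; adjust for your cluster.
  log_dir="${1:-/spark/logs}"
  spark_home="${2:-/opt/module/spark-2.4.4-bin-hadoop2.7}"
  # The directory must match spark.eventLog.dir and spark.history.fs.logDirectory.
  run hadoop fs -mkdir -p "$log_dir"
  run "$spark_home/sbin/start-history-server.sh"
}

# Usage:
#   prepare_history_server              # print the commands only
#   DRY_RUN=0 prepare_history_server    # actually run them
```

With the port configured above, the history server UI should then be reachable at http://hadoop129:4000.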
Submit a job to check that the setup works:
[fengling@hadoop129 spark-2.4.4-bin-hadoop2.7]$ bin/spark-submit \
--master spark://hadoop129:7077 \
--class com.fengling.spark.WordCount mySparks/wordcount-jar-with-dependencies.jar \
hdfs://hadoop129:9000/user/user/fengling/spark/RELEASE \
hdfs://hadoop129:9000/user/user/fengling/spark/WordCount_output_20190927_110700