Configuring the Spark History Server (Spark, Part 2)

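1. Edit the spark-defaults.conf configuration file

Add the spark.eventLog.enabled and spark.eventLog.dir settings.
Point spark.eventLog.dir at the HDFS address (host and port) we configured earlier.
For the HDFS configuration, see "hadoop (7): cluster configuration sync (Hadoop fully distributed, part 4) |9".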

[shaozhiqi@hadoop102 conf]$ pwd
/opt/module/spark-2.4.3-bin-hadoop2.7/conf
[shaozhiqi@hadoop102 conf]$ vim spark-defaults.conf
# spark.master spark://master:7077
# spark.eventLog.enabled true
# spark.eventLog.dir hdfs://namenode:8021/directory
# spark.serializer org.apache.spark.serializer.KryoSerializer
# spark.driver.memory 5g
# spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.eventLog.enabled true
spark.eventLog.dir hdfs://hadoop102:9000/directory
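
If you would rather script this change than edit the file in vim, here is a minimal sketch. The helper name set_conf is hypothetical (it is not part of Spark); it assumes a whitespace-separated "key value" file like spark-defaults.conf and leaves the commented-out template lines alone.

```shell
#!/bin/sh
# set_conf FILE KEY VALUE
# Hypothetical helper: ensure FILE ends up with exactly one active
# "KEY VALUE" line. Lines starting with "#" are left untouched.
set_conf() {
    file=$1 key=$2 value=$3
    tmp="$file.tmp.$$"
    # Keep every line whose first field is not KEY (comments have "#" as
    # their first field, so they survive), then append the new setting.
    awk -v k="$key" '$1 != k' "$file" > "$tmp" || return 1
    printf '%s %s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Example, using the values from this article:
# set_conf conf/spark-defaults.conf spark.eventLog.enabled true
# set_conf conf/spark-defaults.conf spark.eventLog.dir hdfs://hadoop102:9000/directory
```

Running it twice with the same arguments leaves a single active line, so it is safe to repeat.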

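2. Distribute the modified conf files

For how distribution works, see "hadoop (6): rsync remote sync | xsync cluster distribution (fully distributed preparation, part 3) |8".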

[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$ testxsync conf/

Check on another machine that the sync succeeded

[shaozhiqi@hadoop103 spark-2.4.3-bin-hadoop2.7]$ cd conf
[shaozhiqi@hadoop103 conf]$ cat spark-defaults.conf
# spark.master spark://master:7077
# spark.eventLog.enabled true
# spark.eventLog.dir hdfs://namenode:8021/directory
# spark.serializer org.apache.spark.serializer.KryoSerializer
# spark.driver.memory 5g
# spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.eventLog.enabled true
spark.eventLog.dir hdfs://hadoop102:9000/directory
[shaozhiqi@hadoop103 conf]$
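
Reading the whole file by eye works, but the commented template lines make it easy to misread. As a rough sketch, a small helper can print only the active value of one key. The name get_conf is hypothetical, and it assumes the same whitespace-separated "key value" layout shown above.

```shell
#!/bin/sh
# get_conf FILE KEY
# Hypothetical helper: print the value of the last active (uncommented)
# KEY line in FILE; print nothing if the key is not set.
get_conf() {
    awk -v k="$2" '
        $1 == k { $1 = ""; sub(/^[ \t]+/, ""); v = $0 }
        END     { if (v != "") print v }
    ' "$1"
}

# Example, run on the node being checked (hadoop103 in this article):
# get_conf conf/spark-defaults.conf spark.eventLog.dir
```

Comparing this output between hadoop102 and the other nodes is a quick way to confirm that the distributed copy matches.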

3. Start HDFS

To avoid startup errors, first delete the data and logs directories, then format the NameNode:
bin/hdfs namenode -format

[shaozhiqi@hadoop102 hadoop-3.1.2]$ start-dfs.sh

Startup succeeded; check the processes

[shaozhiqi@hadoop102 hadoop-3.1.2]$ start-dfs.sh
Starting namenodes on [hadoop102]
Starting datanodes
hadoop103: WARNING: /opt/module/hadoop-3.1.2/logs does not exist. Creating.
hadoop104: WARNING: /opt/module/hadoop-3.1.2/logs does not exist. Creating.
Starting secondary namenodes [hadoop104]
[shaozhiqi@hadoop102 hadoop-3.1.2]$ jps
3088 Master
3168 Worker
4452 Jps
3366 CoarseGrainedExecutorBackend
4200 DataNode
4076 NameNode
3773 GetConf
[shaozhiqi@hadoop102 hadoop-3.1.2]$

We will start YARN later, when we submit a job to YARN.

4. Check the HDFS NameNode UI

image.png

5. Create the HDFS directory, using the same path as configured in spark-defaults.conf above

[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$ hadoop fs -mkdir /directory
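
The directory has to match the path part of spark.eventLog.dir. To avoid typing it twice, the path can be derived from the URI. This is only a sketch: hdfs_path is a hypothetical helper name, and it assumes a URI of the form hdfs://host:port/path as used in this article.

```shell
#!/bin/sh
# hdfs_path URI
# Hypothetical helper: strip the "hdfs://host:port" prefix and print the
# path part, e.g. hdfs://hadoop102:9000/directory gives /directory.
hdfs_path() {
    rest=${1#hdfs://}                 # drop the scheme
    case $rest in
        */*) printf '/%s\n' "${rest#*/}" ;;   # keep everything after host:port
        *)   printf '/\n' ;;                  # no path given: filesystem root
    esac
}

# Example, with the value from spark-defaults.conf:
# hadoop fs -mkdir -p "$(hdfs_path hdfs://hadoop102:9000/directory)"
```

Note that the example uses -mkdir -p, which does not fail when the directory already exists; the command above uses plain -mkdir.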

Check again:

image.png

6. Edit spark-env.sh again to add the history server parameters

[shaozhiqi@hadoop102 conf]$ vi spark-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_211
export SPARK_MASTER_HOST=hadoop102
export SPARK_MASTER_PORT=7077
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=4000 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://hadoop102:9000/directory"
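
The value of spark.history.fs.logDirectory must be the same location that spark.eventLog.dir writes to, otherwise the history server has nothing to show. One way to keep the two from drifting apart is to build the option string from a single variable. The sketch below assumes that approach; history_opts is a hypothetical helper, and its defaults (port 4000, 3 retained applications) are simply the values used in this article.

```shell
#!/bin/sh
# history_opts LOG_DIR [UI_PORT] [RETAINED_APPS]
# Hypothetical helper: print a SPARK_HISTORY_OPTS value for LOG_DIR.
history_opts() {
    log_dir=$1
    ui_port=${2:-4000}     # default taken from this article
    retained=${3:-3}       # default taken from this article
    printf '%s %s %s\n' \
        "-Dspark.history.ui.port=$ui_port" \
        "-Dspark.history.retainedApplications=$retained" \
        "-Dspark.history.fs.logDirectory=$log_dir"
}

# Example for spark-env.sh:
# export SPARK_HISTORY_OPTS="$(history_opts hdfs://hadoop102:9000/directory)"
```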

7. Sync spark-env.sh to the other nodes

[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$ testxsync conf/spark-env.sh

8. Run a Spark job

[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$ bin/spark-submit \
> --class org.apache.spark.examples.SparkPi \
> --master spark://hadoop102:7077 \
> --executor-memory 1G \
> --total-executor-cores 2 \
> ./examples/jars/spark-examples_2.11-2.4.3.jar \
> 100

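9. Check the Spark UI: our job now appears

image.png

Click the Spark Pi application. Because the job is still running, the link opens its page directly.

image.png

10. The job has not finished after a long time, so check the log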

19/07/01 07:15:53 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Could the cluster be out of resources?
In the UI, kill both the spark shell and our Spark Pi job, then submit the Spark Pi job on its own and try again.

[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$ bin/spark-submit \
> --class org.apache.spark.examples.SparkPi \
> --master spark://hadoop102:7077 \
> --executor-memory 1G \
> --total-executor-cores 2 \
> ./examples/jars/spark-examples_2.11-2.4.3.jar \
> 100
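image.png

This time it finished in a little over 50 seconds.
Now that the job has finished, try to open the Spark UI on port 4000: it cannot be reached.

11. Start the history server so that finished jobs can be viewed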

[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$ sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /opt/module/spark-2.4.3-bin-hadoop2.7/logs/spark-shaozhiqi-org.apache.spark.deploy.history.HistoryServer-1-hadoop102.out
[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$ jps

We can see that a HistoryServer process has appeared

[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$ jps
3505 Worker
4708 HistoryServer
4775 Jps
4027 DataNode
3437 Master
3901 NameNode
[shaozhiqi@hadoop102 spark-2.4.3-bin-hadoop2.7]$
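
Scanning jps output by eye is fine for one node, but the same check can be scripted. The following is a minimal sketch; has_process is a hypothetical helper that reads jps-style output ("PID Name" per line) from standard input.

```shell
#!/bin/sh
# has_process NAME
# Hypothetical helper: read `jps` output on stdin and succeed (exit 0)
# only if a JVM whose name is exactly NAME is listed.
has_process() {
    awk -v n="$1" '$2 == n { found = 1 } END { exit !found }'
}

# Example:
# jps | has_process HistoryServer && echo "history server is running"
```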

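12. Open the history UI: it works

image.png

13. Check whether the job's log file was generated in HDFS

The file has been generated, so the history server is configured successfully.

image.png

Reposted from: https://www.cnblogs.com/shaozhiqi/p/11534895.html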
