Setting Up a Fully Distributed Spark 2.4.3 Cluster on CentOS 7

Articles in this series:

Setting Up a Fully Distributed Hadoop 3.2.0 Cluster on CentOS 7

Setting Up Hive 3.1.1 on CentOS 7

Setting Up a Fully Distributed Spark 2.4.3 Cluster on CentOS 7 (this article)

Setting Up a Fully Distributed HBase 2.1.5 Cluster on CentOS 7

Setting Up a Fully Distributed Storm 2.0.0 Cluster on CentOS 7

Note: this Spark cluster is built on top of an existing Hadoop cluster, so complete the Hadoop cluster setup first. For the details, see

Setting Up a Fully Distributed Hadoop 3.2.0 Cluster on CentOS 7

I. Cluster Plan

host                  master  worker
centos48 (10.0.0.48)  Y       Y
centos49 (10.0.0.49)          Y
centos50 (10.0.0.50)          Y
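These hostnames have to resolve on every node. If that was not already taken care of during the Hadoop setup, a minimal /etc/hosts sketch (using the IPs from the plan above) looks like:

10.0.0.48 centos48
10.0.0.49 centos49
10.0.0.50 centos50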

II. Download and Deploy Spark, and Configure It

1. Configure spark-env.sh

cd /usr/local
wget http://mirror.bit.edu.cn/apache/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
tar -zxvf  spark-2.4.3-bin-hadoop2.7.tgz
cd /usr/local/spark-2.4.3-bin-hadoop2.7/conf
cp spark-env.sh.template  spark-env.sh 
vi  spark-env.sh 

# append the following at the end of the file
export JAVA_HOME=/usr/java/jdk1.8.0_131
export HADOOP_CONF_DIR=/usr/local/hadoop-3.2.0/etc/hadoop
export SPARK_MASTER_HOST=centos48
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=1g
export SPARK_MASTER_WEBUI_PORT=8088 
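The JAVA_HOME and HADOOP_CONF_DIR values above are this cluster's paths; adjust them to match your own installation. A quick sanity check before saving:

# both paths should exist on every node
ls -d /usr/java/jdk1.8.0_131 /usr/local/hadoop-3.2.0/etc/hadoop

Note that 8088 is also the default port of the YARN ResourceManager web UI; if the ResourceManager runs on centos48, consider keeping Spark's default of 8080 instead to avoid a clash.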

2. Configure slaves

cp slaves.template  slaves 
vi slaves 

# append the following at the end of the file (one worker hostname per line; centos48 also runs a Worker, matching the plan in Part I)
centos48
centos49
centos50

3. Rename start-all.sh and stop-all.sh

Hadoop also ships scripts named start-all.sh and stop-all.sh, so to avoid a clash once both are on the PATH, rename Spark's copies:

cd  /usr/local/spark-2.4.3-bin-hadoop2.7/sbin
mv start-all.sh start-spark-all.sh
mv stop-all.sh stop-spark-all.sh

4. Copy Spark to the Other Two Machines

cd /usr/local
scp -r ./spark-2.4.3-bin-hadoop2.7 root@centos49:/usr/local
scp -r ./spark-2.4.3-bin-hadoop2.7 root@centos50:/usr/local
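Running scp as root assumes passwordless SSH between the nodes, which the Hadoop setup normally configures already. If it is missing, a sketch (run on centos48, after generating a key with ssh-keygen if needed):

ssh-copy-id root@centos49
ssh-copy-id root@centos50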

III. Start Spark and Verify

1. Update the Environment Variables (on all three machines)

vi /etc/profile

# append the following at the end of the file
export SPARK_HOME=/usr/local/spark-2.4.3-bin-hadoop2.7
export PATH=$SPARK_HOME/sbin:$PATH

Then reload the profile so the change takes effect in the current shell:

source /etc/profile

2. Start Spark

Run the renamed script on the master node, centos48:

start-spark-all.sh
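To shut the cluster down later, use the matching renamed script:

stop-spark-all.sh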

3. Check the Processes

On centos48 you should see both a Master and a Worker process; on centos49 and centos50, a Worker each, alongside whatever Hadoop daemons each node runs.
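A rough sketch of what jps should show on each node (PIDs will differ, and each node also lists its Hadoop daemons):

# on centos48
jps
# 12345 Master
# 12456 Worker

# on centos49 and centos50
jps
# 23456 Worker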

4. View via the Web UI

Open http://10.0.0.48:8088 in a browser (the SPARK_MASTER_WEBUI_PORT set earlier).
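If the page does not load, the CentOS 7 firewall may be blocking the port; one way to open it (assuming firewalld is in use):

firewall-cmd --zone=public --add-port=8088/tcp --permanent
firewall-cmd --reload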

5. Run an Example Job and Check It on Hadoop YARN

1. Run the example program

cd  /usr/local/spark-2.4.3-bin-hadoop2.7/bin
./spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client --executor-memory 1G --num-executors 10 /usr/local/spark-2.4.3-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.4.3.jar 100

The tail of the driver output:
2019-08-21 16:24:41,455 INFO scheduler.TaskSetManager: Finished task 96.0 in stage 0.0 (TID 96) in 113 ms on centos48 (executor 1) (99/100)
2019-08-21 16:24:41,594 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on centos49:17162 (size: 1256.0 B, free: 366.3 MB)
2019-08-21 16:24:41,842 INFO scheduler.TaskSetManager: Finished task 13.0 in stage 0.0 (TID 13) in 2204 ms on centos49 (executor 10) (100/100)
2019-08-21 16:24:41,846 INFO scheduler.DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 4.676 s
2019-08-21 16:24:41,844 INFO cluster.YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
2019-08-21 16:24:41,860 INFO scheduler.DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 5.008590 s
Pi is roughly 3.1414335141433516
2019-08-21 16:24:41,895 INFO server.AbstractConnector: Stopped Spark@22ee2d0{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-08-21 16:24:41,917 INFO ui.SparkUI: Stopped Spark web UI at http://centos48:4040
2019-08-21 16:24:42,113 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
2019-08-21 16:24:42,148 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
2019-08-21 16:24:42,149 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
2019-08-21 16:24:42,178 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
 services=List(),
 started=false)
2019-08-21 16:24:42,178 INFO cluster.YarnClientSchedulerBackend: Stopped
2019-08-21 16:24:42,189 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
2019-08-21 16:24:42,299 INFO memory.MemoryStore: MemoryStore cleared
2019-08-21 16:24:42,300 INFO storage.BlockManager: BlockManager stopped
2019-08-21 16:24:42,321 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
2019-08-21 16:24:42,324 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
2019-08-21 16:24:42,360 INFO spark.SparkContext: Successfully stopped SparkContext
2019-08-21 16:24:42,511 INFO util.ShutdownHookManager: Shutdown hook called
2019-08-21 16:24:42,511 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0a93cd55-de2a-466b-a398-ec8da81f6ebc
2019-08-21 16:24:42,517 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-cfe65620-2d79-4bdd-9e6b-fb34e069ad45
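In client mode the driver runs on the submitting machine, which is why the result line ("Pi is roughly ...") prints on the console; the trailing 100 is the number of tasks SparkPi splits the estimation into, matching the (100/100) progress above. To run the driver inside YARN instead, switch to cluster mode (a sketch; the result then lands in the YARN application logs rather than on the console):

./spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --executor-memory 1G --num-executors 10 /usr/local/spark-2.4.3-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.4.3.jar 100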

2. The job shows up in the Hadoop YARN web UI
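Besides the ResourceManager web UI, the YARN CLI can confirm the run (assuming yarn is on the PATH from the Hadoop setup):

yarn application -list -appStates FINISHED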

 
