Spark 1.2 Cluster Installation

I set up cluster mode by adjusting the configuration of an existing single-machine installation.

For the single-machine installation, see: http://blog.csdn.net/wind520/article/details/43458925

Official configuration reference: http://spark.apache.org/docs/latest/spark-standalone.html

1: Modify the slaves configuration

[jifeng@jifeng01 conf]$ cp slaves.template slaves
[jifeng@jifeng01 conf]$ vi slaves


# A Spark Worker will be started on each of the machines listed below.
jifeng02.sohudo.com
jifeng03.sohudo.com
jifeng04.sohudo.com
                                                                                                                  
"slaves" 4L, 131C 已写入                                                                           
[jifeng@jifeng01 conf]$ cat slaves
# A Spark Worker will be started on each of the machines listed below.
jifeng02.sohudo.com
jifeng03.sohudo.com
jifeng04.sohudo.com
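
Besides slaves, per-worker resource settings can optionally go into conf/spark-env.sh. The following is only a minimal sketch; the memory and core values are assumptions, not taken from this setup:

[jifeng@jifeng01 conf]$ cp spark-env.sh.template spark-env.sh
[jifeng@jifeng01 conf]$ vi spark-env.sh

# assumed example values; adjust to the actual machines
export SPARK_MASTER_IP=jifeng01.sohudo.com
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=2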

2: Copy the files to the other nodes

[jifeng@jifeng01 hadoop]$ scp -r ./spark-1.2.0-bin-hadoop1 jifeng@jifeng02.sohudo.com:/home/jifeng/hadoop/
[jifeng@jifeng01 hadoop]$ scp -r ./spark-1.2.0-bin-hadoop1 jifeng@jifeng03.sohudo.com:/home/jifeng/hadoop/
[jifeng@jifeng01 hadoop]$ scp -r ./spark-1.2.0-bin-hadoop1 jifeng@jifeng04.sohudo.com:/home/jifeng/hadoop/
[jifeng@jifeng01 hadoop]$ scp -r ./scala-2.11.4 jifeng@jifeng02.sohudo.com:/home/jifeng/hadoop/
[jifeng@jifeng01 hadoop]$ scp -r ./scala-2.11.4 jifeng@jifeng03.sohudo.com:/home/jifeng/hadoop/
[jifeng@jifeng01 hadoop]$ scp -r ./scala-2.11.4 jifeng@jifeng04.sohudo.com:/home/jifeng/hadoop/
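
The same distribution can also be written as a loop over the worker hosts (equivalent to the commands above):

for host in jifeng02.sohudo.com jifeng03.sohudo.com jifeng04.sohudo.com; do
  scp -r ./spark-1.2.0-bin-hadoop1 ./scala-2.11.4 jifeng@$host:/home/jifeng/hadoop/
done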

3: Start the cluster

  • sbin/start-all.sh - Starts both a master and one worker on each of the machines listed in conf/slaves.

[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /home/jifeng/hadoop/spark-1.2.0-bin-hadoop1/sbin/../logs/spark-jifeng-org.apache.spark.deploy.master.Master-1-jifeng01.sohudo.com.out
jifeng04.sohudo.com: starting org.apache.spark.deploy.worker.Worker, logging to /home/jifeng/hadoop/spark-1.2.0-bin-hadoop1/sbin/../logs/spark-jifeng-org.apache.spark.deploy.worker.Worker-1-jifeng04.sohudo.com.out
jifeng02.sohudo.com: starting org.apache.spark.deploy.worker.Worker, logging to /home/jifeng/hadoop/spark-1.2.0-bin-hadoop1/sbin/../logs/spark-jifeng-org.apache.spark.deploy.worker.Worker-1-jifeng02.sohudo.com.out
jifeng03.sohudo.com: starting org.apache.spark.deploy.worker.Worker, logging to /home/jifeng/hadoop/spark-1.2.0-bin-hadoop1/sbin/../logs/spark-jifeng-org.apache.spark.deploy.worker.Worker-1-jifeng03.sohudo.com.out
[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ 
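
The master and the workers can also be started separately instead of with start-all.sh; the standalone package ships the corresponding scripts, and sbin/start-slaves.sh starts a worker on every machine listed in conf/slaves:

sbin/start-master.sh
sbin/start-slaves.sh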


4: Web management UI

The Spark cluster's web management UI is available at: http://jifeng01.sohudo.com:8080/

5: Start the spark-shell console

./spark-shell

[jifeng@jifeng01 spark-1.2.0-bin-hadoop1]$ cd bin
[jifeng@jifeng01 bin]$ ./spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/02/04 22:11:10 INFO spark.SecurityManager: Changing view acls to: jifeng
15/02/04 22:11:10 INFO spark.SecurityManager: Changing modify acls to: jifeng
15/02/04 22:11:10 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jifeng); users with modify permissions: Set(jifeng)
15/02/04 22:11:10 INFO spark.HttpServer: Starting HTTP Server
15/02/04 22:11:10 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/04 22:11:10 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:57780
15/02/04 22:11:10 INFO util.Utils: Successfully started service 'HTTP class server' on port 57780.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.2.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
15/02/04 22:11:15 INFO spark.SecurityManager: Changing view acls to: jifeng
15/02/04 22:11:15 INFO spark.SecurityManager: Changing modify acls to: jifeng
15/02/04 22:11:15 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(jifeng); users with modify permissions: Set(jifeng)
15/02/04 22:11:15 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/02/04 22:11:15 INFO Remoting: Starting remoting
15/02/04 22:11:15 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@jifeng01.sohudo.com:48769]
15/02/04 22:11:15 INFO util.Utils: Successfully started service 'sparkDriver' on port 48769.
15/02/04 22:11:15 INFO spark.SparkEnv: Registering MapOutputTracker
15/02/04 22:11:15 INFO spark.SparkEnv: Registering BlockManagerMaster
15/02/04 22:11:15 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150204221115-20b0
15/02/04 22:11:15 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB
15/02/04 22:11:15 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-6d24f541-4ce1-4e12-9be2-4296791bd73f
15/02/04 22:11:15 INFO spark.HttpServer: Starting HTTP Server
15/02/04 22:11:15 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/04 22:11:15 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:42647
15/02/04 22:11:15 INFO util.Utils: Successfully started service 'HTTP file server' on port 42647.
15/02/04 22:11:16 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/02/04 22:11:16 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/02/04 22:11:16 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/02/04 22:11:16 INFO ui.SparkUI: Started SparkUI at http://jifeng01.sohudo.com:4040
15/02/04 22:11:16 INFO executor.Executor: Using REPL class URI: http://10.5.4.54:57780
15/02/04 22:11:16 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@jifeng01.sohudo.com:48769/user/HeartbeatReceiver
15/02/04 22:11:16 INFO netty.NettyBlockTransferService: Server created on 58669
15/02/04 22:11:16 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/02/04 22:11:16 INFO storage.BlockManagerMasterActor: Registering block manager localhost:58669 with 265.4 MB RAM, BlockManagerId(<driver>, localhost, 58669)
15/02/04 22:11:16 INFO storage.BlockManagerMaster: Registered BlockManager
15/02/04 22:11:16 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.

scala>
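
Note that started this way, spark-shell runs with a local master (the BlockManager above registers on localhost). To run jobs on the standalone cluster instead, pass the master URL shown on the 8080 page (the default standalone master port is 7077):

[jifeng@jifeng01 bin]$ ./spark-shell --master spark://jifeng01.sohudo.com:7077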


6: Spark shell web UI

Once the shell is running, you can also open its web UI at http://jifeng01.sohudo.com:4040/


7: Test

Enter the following commands:

Read the file:

val file=sc.textFile("hdfs://jifeng01.sohudo.com:9000/user/jifeng/in/test1.txt")

Count the words:

val count=file.flatMap(line=>line.split(" ")).map(word=>(word,1)).reduceByKey(_+_)

Submit and run the job with collect():

count.collect()
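
The example assumes test1.txt already exists in HDFS at the path above. If it does not, a local file can be uploaded first (the local file name here is just for illustration):

[jifeng@jifeng01 ~]$ hadoop fs -mkdir /user/jifeng/in
[jifeng@jifeng01 ~]$ hadoop fs -put test1.txt /user/jifeng/in/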

8: View the execution results

The output of count.collect() is printed in the shell as an Array of (word, count) tuples.


9: Exit the shell

:quit

scala> :quit
Stopping spark context.
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/json,null}
15/02/04 22:28:51 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null}
15/02/04 22:28:51 INFO ui.SparkUI: Stopped Spark web UI at http://jifeng01.sohudo.com:4040
15/02/04 22:28:51 INFO scheduler.DAGScheduler: Stopping DAGScheduler
15/02/04 22:28:52 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
15/02/04 22:28:52 INFO storage.MemoryStore: MemoryStore cleared
15/02/04 22:28:52 INFO storage.BlockManager: BlockManager stopped
15/02/04 22:28:52 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/02/04 22:28:52 INFO spark.SparkContext: Successfully stopped SparkContext
15/02/04 22:28:52 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/02/04 22:28:52 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/02/04 22:28:52 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
[jifeng@jifeng01 bin]$ 

10: Stop the cluster

sbin/stop-all.sh 
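
The master and the workers can also be stopped individually with the corresponding scripts:

sbin/stop-master.sh
sbin/stop-slaves.sh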

