Fully Distributed Spark Installation

Hadoop must be installed before Spark; three virtual machines have already been set up in VMware for this.

I. Prerequisites

1. A Hadoop cluster. The author has already set up three virtual machines in VMware, with a Hadoop cluster installed on them.

II. Required Software

1. Scala: 2.11.0

2. Spark: 2.2.0

III. Install Scala (needed on all three machines in the cluster; one machine is shown below as an example)

1. Download scala-2.11.0 and extract it into /usr/local/. The author's downloaded archive is in /home/hadoop/tools.

hadoop@Worker1:~$ sudo mv /home/hadoop/tools/scala-2.11.0.tgz /usr/local/
hadoop@Worker1:/usr/local$ sudo tar -zxvf scala-2.11.0.tgz

2. Edit ~/.bashrc and add SCALA_HOME

hadoop@Worker1:/usr/local$ vim ~/.bashrc
export SCALA_HOME=/usr/local/scala-2.11.0
export JAVA_HOME=/usr/local/bin/jdk1.8.0_131
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
export PATH=${SCALA_HOME}/bin:$PATH
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:${HADOOP_HOME}/bin

3. Make the changes take effect

hadoop@Worker1:/usr/local$ source ~/.bashrc 

4. Verify that Scala is installed correctly

hadoop@Worker1:/usr/local$ scala -version

If the output is: Scala code runner version 2.11.0 -- Copyright 2002-2013, LAMP/EPFL, the installation succeeded.

5. Repeat the steps above on the other machines to install Scala (a copy-based shortcut is sketched below).
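
Since the installation is identical on every node, one shortcut is to copy the extracted directory instead of downloading again. This is only a sketch: it assumes passwordless SSH is already set up for the hadoop user (typical for a Hadoop cluster) and that /home/hadoop/tools also exists on the target machine; Master is used as the example target.

hadoop@Worker1:~$ scp -r /usr/local/scala-2.11.0 hadoop@Master:/home/hadoop/tools/

Then, on Master, move the directory into place with sudo mv /home/hadoop/tools/scala-2.11.0 /usr/local/, add the same SCALA_HOME and PATH lines to ~/.bashrc, and run source ~/.bashrc.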

IV. Install the Spark cluster: Spark supports three cluster modes, namely Mesos, YARN, and Standalone; the author configures Standalone mode here.

1. Download spark-2.2.0-bin-hadoop2.7.tgz and extract it

hadoop@Master:/usr/local$ sudo mv /home/hadoop/tools/spark-2.2.0-bin-hadoop2.7.tgz /usr/local/
hadoop@Master:/usr/local$ sudo tar -zxvf spark-2.2.0-bin-hadoop2.7.tgz

2. Change the ownership

hadoop@Master:/usr/local$ sudo chown -R hadoop:root ./spark-2.2.0-bin-hadoop2.7

3. Configure the Spark environment in ~/.bashrc

hadoop@Master:/usr/local$ vim ~/.bashrc 
export SCALA_HOME=/usr/local/scala-2.11.0
export JAVA_HOME=/usr/local/bin/jdk1.8.0_131
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
export PATH=${SCALA_HOME}/bin:$PATH
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:${HADOOP_HOME}/bin
export SPARK_HOME=/usr/local/spark-2.2.0-bin-hadoop2.7
export PATH=${SPARK_HOME}/bin:${SPARK_HOME}/sbin:$PATH

4. Make the configuration take effect

hadoop@Master:/usr/local$ source ~/.bashrc 

5. Go to Spark's conf directory and configure spark-env.sh

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ cp spark-env.sh.template spark-env.sh
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ vim spark-env.sh
export JAVA_HOME=/usr/local/bin/jdk1.8.0_131
export SCALA_HOME=/usr/local/scala-2.11.0
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export HADOOP_CONF_DIR=/usr/local/hadoop-2.7.3/etc/hadoop
export SPARK_MASTER_IP=Master
export SPARK_WORKER_CORES=2
export SPARK_DRIVER_MEMORY=1G
export SPARK_WORKER_MEMORY=1g
export SPARK_EXECUTOR_MEMORY=1g
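
A note on the listing above: in Spark 2.x the bundled spark-env.sh.template documents this setting as SPARK_MASTER_HOST, and the startup scripts may warn that SPARK_MASTER_IP is deprecated (it is still honored). The equivalent line with the newer name would be:

export SPARK_MASTER_HOST=Master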

6. Configure slaves

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ cp slaves.template slaves
hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ vim slaves
Worker1
Worker2
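
start-all.sh reads this slaves file and launches a Worker on each listed host over SSH, so Worker1 and Worker2 also need the same Spark directory at the same path. Once the configuration in the remaining steps is finished, one way to distribute it (again a sketch, assuming passwordless SSH for the hadoop user and the /home/hadoop/tools staging directory) is:

hadoop@Master:/usr/local$ scp -r spark-2.2.0-bin-hadoop2.7 hadoop@Worker1:/home/hadoop/tools/
hadoop@Master:/usr/local$ scp -r spark-2.2.0-bin-hadoop2.7 hadoop@Worker2:/home/hadoop/tools/

Then, on each Worker, move it into place with sudo mv /home/hadoop/tools/spark-2.2.0-bin-hadoop2.7 /usr/local/.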

7. Configure spark-defaults.conf

spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://Master:9000/historyserverforSpark
spark.yarn.historyServer.address  Master:18080
spark.history.fs.logDirectory  hdfs://Master:9000/historyserverforSpark

The historyserverforSpark directory must be created in HDFS manually (as shown below); otherwise spark-shell will report an error when it starts.
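
Assuming HDFS is already running and reachable at hdfs://Master:9000 as configured above, the directory can be created with:

hadoop@Master:~$ hdfs dfs -mkdir -p /historyserverforSpark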

8. Start the cluster

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/sbin$ ./start-all.sh 
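
To check that the cluster came up: jps on Master should show a Master process (alongside the HDFS daemons), jps on Worker1/Worker2 should show a Worker process, and the standalone web UI is served at http://Master:8080 by default. As a quick end-to-end test, the SparkPi example shipped with the distribution can be submitted to the standalone master (spark://Master:7077 is the default master URL for the host configured above; the example jar name under examples/jars may differ slightly in other builds):

hadoop@Master:~$ jps
hadoop@Master:~$ spark-submit --master spark://Master:7077 --class org.apache.spark.examples.SparkPi ${SPARK_HOME}/examples/jars/spark-examples_2.11-2.2.0.jar 10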

 

Reposted from: https://my.oschina.net/u/729917/blog/1556871
