1. Upload the installation package
Upload the package to the server (here via an SFTP `put` from a Windows client):
put -r "D:\spark-2.3.0-bin-hadoop2.7.tgz"
2. Extract the package
tar -zxvf spark-2.3.0-bin-hadoop2.7.tgz -C ~/apps/
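The tar flags used above are: -z (decompress with gzip), -x (extract), -v (list files as they are unpacked), -f (archive file name), -C (extract into the given directory). A tiny self-contained demo of the same invocation on a throwaway archive (the /tmp paths are illustrative only):

```shell
# Build a small .tgz, then unpack it with the exact flags used in step 2.
mkdir -p /tmp/demo_src /tmp/demo_apps
echo "hello" > /tmp/demo_src/README
tar -czf /tmp/demo.tgz -C /tmp demo_src      # create demo archive
tar -zxvf /tmp/demo.tgz -C /tmp/demo_apps    # same flags as step 2
cat /tmp/demo_apps/demo_src/README           # -> hello
```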
3. Edit the configuration files
(1) For the Hadoop-related configuration files, see the earlier article "Hadoop High-Availability Cluster Setup (Part 3)".
(2) spark-env.sh
Change into the Spark configuration directory:
cd /root/apps/spark-2.3.0-bin-hadoop2.7/conf
Rename the template file:
mv spark-env.sh.template spark-env.sh
Set the following parameters:
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=1000m
export JAVA_HOME=/usr/local/java/jdk1.8.0_73
export SPARK_MASTER_PORT=7077
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181 -Dspark.deploy.zookeeper.dir=/spark"
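For reference, the finished spark-env.sh might read as follows (the inline comments are explanatory additions, not part of the original file):

```shell
# spark-env.sh (assembled from the settings above)
export JAVA_HOME=/usr/local/java/jdk1.8.0_73   # JDK used by all Spark daemons
export SPARK_WORKER_CORES=1                    # CPU cores each worker offers
export SPARK_WORKER_MEMORY=1000m               # memory each worker offers
export SPARK_MASTER_PORT=7077                  # master RPC port
# HA settings: the masters register with ZooKeeper and elect a leader.
# Note that no fixed master host is configured, because the active master
# is chosen dynamically through ZooKeeper.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop01:2181,hadoop02:2181,hadoop03:2181,hadoop04:2181 -Dspark.deploy.zookeeper.dir=/spark"
```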
(3) slaves
Configure the list of worker (slave) nodes:
vim slaves
hadoop01
hadoop02
hadoop03
hadoop04
4. Distribute the Spark installation from hadoop01 to the other three machines (hadoop02, hadoop03, hadoop04)
scp -r spark-2.3.0-bin-hadoop2.7/ root@hadoop02:$PWD
scp -r spark-2.3.0-bin-hadoop2.7/ root@hadoop03:$PWD
scp -r spark-2.3.0-bin-hadoop2.7/ root@hadoop04:$PWD
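The three scp commands can also be written as one loop. A sketch (DRY_RUN=1 only prints the commands; when actually run with DRY_RUN=0, it assumes passwordless SSH as root to each host is already set up):

```shell
# Distribute the install directory to the remaining nodes in one loop.
DRY_RUN=1
for host in hadoop02 hadoop03 hadoop04; do
  cmd="scp -r spark-2.3.0-bin-hadoop2.7/ root@${host}:\$PWD"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"      # preview only
  else
    eval "$cmd"      # actually copy
  fi
done
```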
5. Configure the Spark environment variables
vim /etc/profile
export SPARK_HOME=/root/apps/spark-2.3.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
Reload the profile to apply the changes:
source /etc/profile
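A quick sanity check after sourcing the profile, sketched below. Note that only $SPARK_HOME/bin is added to PATH, so the sbin start/stop scripts in step 6 are still invoked by their full path:

```shell
# The two exported lines, plus a check that bin/ really landed on PATH.
export SPARK_HOME=/root/apps/spark-2.3.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
case ":$PATH:" in
  *":$SPARK_HOME/bin:"*) echo "spark bin dir is on PATH" ;;
  *)                     echo "spark bin dir is NOT on PATH" ;;
esac
```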
6. Start the cluster
(1) Start ZooKeeper
Start: zkServer.sh start
Stop: zkServer.sh stop
Check status: zkServer.sh status
(2) Start HDFS
Start: start-dfs.sh
Stop: stop-dfs.sh
Check status:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
(3) Start Spark
Start the whole cluster (run on hadoop01, the active master): /root/apps/spark-2.3.0-bin-hadoop2.7/sbin/start-all.sh
Start the standby master (run on hadoop02): /root/apps/spark-2.3.0-bin-hadoop2.7/sbin/start-master.sh
Stop a master: /root/apps/spark-2.3.0-bin-hadoop2.7/sbin/stop-master.sh
Stop the workers: /root/apps/spark-2.3.0-bin-hadoop2.7/sbin/stop-slaves.sh
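Each standalone master also serves a JSON summary of its state from its web UI, whose "status" field distinguishes the ALIVE master from the STANDBY one. A sketch (the sample payload below is illustrative; against the live cluster you would fetch it with `curl -s http://hadoop01:8080/json/`):

```shell
# Extract the master's status field from a (sample) web-UI JSON payload.
sample='{ "url" : "spark://hadoop01:7077", "status" : "ALIVE" }'
echo "$sample" | grep -o '"status" : "[A-Z]*"'   # -> "status" : "ALIVE"
```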
7. Verify the startup
(1) Run the jps command on each node and check that all the expected daemons are running
(2) Log in to the Spark web UI
active: http://hadoop01:8080/
standby: http://hadoop02:8080/
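The jps check from step 7 can be sketched as a comparison of the daemons a node reports against what this layout expects. The "running" list below is a stand-in for the test; on a real node you would fill it with `running=$(jps | awk '{print $2}')` (jps prints "pid ClassName" per daemon):

```shell
# Compare running daemons against the set this cluster layout expects
# on a node that hosts ZooKeeper, HDFS, and both Spark roles.
expected="QuorumPeerMain NameNode DataNode Master Worker"
running="QuorumPeerMain NameNode DataNode Master Worker"  # stand-in for jps output
for d in $expected; do
  case " $running " in
    *" $d "*) echo "$d: OK" ;;
    *)        echo "$d: MISSING" ;;
  esac
done
```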