Overview:
- Hadoop 2.6.5; a working Hadoop distributed cluster is required beforehand. If you have not set one up yet, see the Hadoop distributed cluster installation and configuration tutorial.
- Spark version: spark-2.0.2-bin-hadoop2.6
- All steps are performed as the hadoop user, which has been granted sudo privileges.
- This article uses only one worker node.
1. Install Spark
Download page: http://spark.apache.org/downloads.html
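For example, the release tarball can be fetched directly on the master; the URL below follows the standard Apache archive layout and is given as an illustration, so use whichever mirror the download page offers:
[hadoop@master ~]$ wget https://archive.apache.org/dist/spark/spark-2.0.2/spark-2.0.2-bin-hadoop2.6.tgz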
Master node:
[hadoop@master ~]$ sudo tar zxf spark-2.0.2-bin-hadoop2.6.tgz -C /opt/source/
[hadoop@master ~]$ cd /opt/source
[hadoop@master source]$ sudo mv spark-2.0.2-bin-hadoop2.6 spark
[hadoop@master source]$ sudo chown -R hadoop:hadoop spark
[hadoop@master source]$ cd spark/conf
[hadoop@master conf]$ cp spark-env.sh.template spark-env.sh
Edit the Spark configuration file spark-env.sh:
[hadoop@master conf]$ vim spark-env.sh
export SPARK_DIST_CLASSPATH=$(/opt/source/hadoop/bin/hadoop classpath)
export HADOOP_CONF_DIR=/opt/source/hadoop/etc/hadoop
export SPARK_MASTER_IP=192.168.11.130 # IP address of the Spark cluster's master node
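To confirm that SPARK_DIST_CLASSPATH will expand correctly, you can run the same command by hand and check that it prints the Hadoop jar paths (a quick sanity check, assuming the Hadoop installation lives at /opt/source/hadoop as above):
[hadoop@master conf]$ /opt/source/hadoop/bin/hadoop classpath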
[hadoop@master conf]$ cp slaves.template slaves
[hadoop@master conf]$ cat slaves
slave
[hadoop@master conf]$ scp -r /opt/source/spark slave:/home/hadoop
Slave node:
[hadoop@slave ~]$ sudo mv ~/spark /opt/source/ # mind directory ownership; in this article everything is owned by the hadoop user
[hadoop@slave source]$ ll
drwxrwxr-x 10 hadoop hadoop 150 Dec 22 13:05 hadoop
drwxr-xr-x 13 hadoop hadoop 4096 Dec 26 18:39 spark
2. Start the Spark cluster
Start the Hadoop cluster first [not covered again here].
2.1 Master node:
- Start the master:
[hadoop@master ~]$ cd /opt/source/spark/
[hadoop@master spark]$ ./sbin/start-master.sh
- Start the slave (worker) nodes:
[hadoop@master spark]$ ./sbin/start-slaves.sh
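Once both scripts return, the daemons can be verified with jps and through the master's web UI; the spark-shell line below is an illustrative test, with 8080 and 7077 being the default standalone web UI and master ports:
[hadoop@master spark]$ jps    # should list a Master process
[hadoop@slave ~]$ jps         # should list a Worker process
# Web UI: http://192.168.11.130:8080
[hadoop@master spark]$ ./bin/spark-shell --master spark://192.168.11.130:7077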
3. Stop the Spark cluster
Stop the Master node:
cd /opt/source/spark/
sbin/stop-master.sh
Stop the Worker nodes:
sbin/stop-slaves.sh
Stop the Hadoop cluster:
cd /opt/source/hadoop/
sbin/stop-all.sh
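Optionally, running jps on each node afterwards confirms that the Master, Worker, and Hadoop daemons are gone; in this single-worker setup only the Jps process itself should remain:
[hadoop@master ~]$ jps
[hadoop@slave ~]$ jps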