Prerequisite: the JDK is already installed.
1. Download the Spark tarball, upload it to the server, and extract it
http://www.apache.org/dyn/closer.lua/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
tar -zxvf spark-1.5.2-bin-hadoop2.6.tgz -C /usr/local/apps/platform/
# Create the symlink (run inside the extraction directory)
cd /usr/local/apps/platform/
ln -s spark-1.5.2-bin-hadoop2.6 spark
2. Configure Spark
Enter the conf directory of the Spark installation, rename the shipped template, and edit spark-env.sh:
cd /usr/local/apps/platform/spark/conf
mv spark-env.sh.template spark-env.sh
vim spark-env.sh
Add the following settings to the file. Note that SPARK_MASTER_IP is deliberately left unset: with ZooKeeper-based recovery every Master registers itself in ZooKeeper, and the active one is chosen by leader election.
export JAVA_HOME=/usr/java/jdk1.7.0_45
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop1,hadoop2,hadoop3 -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_MASTER_PORT=7077
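Optionally, spark-env.sh can also cap each Worker's resources. This is not required for HA itself; the values below are placeholders to adjust to your machines:
# limit each Worker to 2 cores and 2 GB of memory (example values)
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=2g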
Edit the slaves file to specify the Workers (again renaming the shipped template):
mv slaves.template slaves
vim slaves
Add the hostnames of the worker nodes to this file:
hadoop2
hadoop3
Copy the configured Spark to the other nodes, and create the symlink on each of them as well (see the ssh sketch after the scp commands):
scp -r spark-1.5.2-bin-hadoop2.6/ hadoop2:/usr/local/apps/platform/
scp -r spark-1.5.2-bin-hadoop2.6/ hadoop3:/usr/local/apps/platform/
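The spark symlink from step 1 must also exist on the other nodes. One way to create it remotely (assuming passwordless ssh between the nodes, which the scp commands above need anyway):
ssh hadoop2 "cd /usr/local/apps/platform && ln -s spark-1.5.2-bin-hadoop2.6 spark"
ssh hadoop3 "cd /usr/local/apps/platform && ln -s spark-1.5.2-bin-hadoop2.6 spark"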
3. Start the Spark cluster
With the configuration done, first make sure the ZooKeeper ensemble (hadoop1, hadoop2, hadoop3) is running, since Master election depends on it. Then run the sbin/start-all.sh script on hadoop1, and start the second (standby) Master on hadoop2 with sbin/start-master.sh, as sketched below.
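For reference, the concrete commands, run from the Spark installation directory on each node:
# on hadoop1: start the first Master plus all Workers listed in conf/slaves
cd /usr/local/apps/platform/spark
sbin/start-all.sh
# on hadoop2: start the standby Master
cd /usr/local/apps/platform/spark
sbin/start-master.sh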
After startup, run jps: hadoop1 and hadoop2 should each show a Master process, and the worker nodes a Worker process. Check the cluster state in the web UI of the active Master at http://hadoop1:8080/; the standby Master's UI at http://hadoop2:8080/ should report Status: STANDBY.
Connect to the HA Spark cluster:
./spark-shell --master spark://hadoop1:7077,hadoop2:7077
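Once the shell is up, sc.master prints the URL of the Master it is actually connected to. To verify failover, a simple test (assuming hadoop1 currently holds the active Master):
# on hadoop1: stop the active Master
cd /usr/local/apps/platform/spark
sbin/stop-master.sh
# after ZooKeeper detects the failure (typically within a minute or two),
# http://hadoop2:8080/ should show Status: ALIVE, and the running
# spark-shell reconnects to the new Master automatically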