References
http://dblab.xmu.edu.cn/blog/1187-2/ Setting up a Spark 2.0 distributed cluster environment
http://blog.csdn.net/andy572633/article/details/7211546 N ways to kill a process in Linux
Rename the existing Spark directory
You can stop the running cluster beforehand (sbin/stop-all.sh).
Rename the existing spark directory to spark1.6.2:
sudo mv /usr/local/spark /usr/local/spark1.6.2
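Before the move, it is worth double-checking that no Spark daemons are left running; jps on each node should list no Master or Worker entries:
jps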
Install Spark 2.0.2 on the Master
Download the package spark-2.0.2-bin-without-hadoop.tgz
sudo tar -zxf ~/Downloads/spark-2.0.2-bin-without-hadoop.tgz -C /usr/local/
cd /usr/local
sudo mv ./spark-2.0.2-bin-without-hadoop/ ./spark
sudo chown -R hadoop ./spark
vim ~/.bashrc
Add the following to .bashrc:
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
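To make the new variables take effect in the current shell, reload the file:
source ~/.bashrc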
Spark configuration on the Master
slaves
Copy slaves.template to slaves:
cd /usr/local/spark/
cp ./conf/slaves.template ./conf/slaves
Add the worker hostnames to conf/slaves.
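A minimal conf/slaves for this cluster, assuming the ten workers are named n01 through n10 as in the scp commands later in this section (one hostname per line):
n01
n02
n03
n04
n05
n06
n07
n08
n09
n10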
spark-env.sh
Copy spark-env.sh.template to spark-env.sh:
cp ./conf/spark-env.sh.template ./conf/spark-env.sh
Add the following:
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export SPARK_MASTER_IP=xxx (the master's actual IP)
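Because this is the without-hadoop build, Spark finds the Hadoop jars through SPARK_DIST_CLASSPATH. A quick sanity check is to run the embedded command by hand; it should print a long colon-separated classpath rather than an error:
/usr/local/hadoop/bin/hadoop classpath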
Distribute to the slaves
Pack the spark folder and send it to every node (a loop alternative is sketched after the scp list):
cd /usr/local/
tar -zcf ~/spark.master.tar.gz ./spark
cd ~
scp ./spark.master.tar.gz n01:/home/hadoop
scp ./spark.master.tar.gz n02:/home/hadoop
scp ./spark.master.tar.gz n03:/home/hadoop
scp ./spark.master.tar.gz n04:/home/hadoop
scp ./spark.master.tar.gz n05:/home/hadoop
scp ./spark.master.tar.gz n06:/home/hadoop
scp ./spark.master.tar.gz n07:/home/hadoop
scp ./spark.master.tar.gz n08:/home/hadoop
scp ./spark.master.tar.gz n09:/home/hadoop
scp ./spark.master.tar.gz n10:/home/hadoop
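Since the hostnames differ only in a zero-padded index, the ten transfers can equally be written as one loop (equivalent to the commands above):
for i in $(seq -w 1 10); do
  scp ./spark.master.tar.gz n$i:/home/hadoop
done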
On n01…n10, run (or push the same steps from the master with the loop sketched after these commands):
sudo rm -rf /usr/local/spark/
sudo tar -zxf ~/spark.master.tar.gz -C /usr/local
sudo chown -R hadoop /usr/local/spark
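If the hadoop user may run sudo on the workers, these three commands can also be pushed from the master in a single loop (a sketch, untested here; ssh -t allocates a terminal so sudo can prompt for a password):
for i in $(seq -w 1 10); do
  ssh -t n$i 'sudo rm -rf /usr/local/spark/ && sudo tar -zxf ~/spark.master.tar.gz -C /usr/local && sudo chown -R hadoop /usr/local/spark'
done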
Start the Spark cluster
cd /usr/local/spark/
sbin/start-master.sh
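start-master.sh only brings up the master daemon. To also start the Worker processes on the hosts listed in conf/slaves, run the companion script (sbin/start-all.sh does both in one step):
sbin/start-slaves.sh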
Open http://m01:8080. The top-left corner of the web UI shows that the cluster has been upgraded to Spark 2.0.2.
Possible issues
If jps shows more than one Master process (this also shows up as extra web UIs on ports 8080, 8081, and so on):
1. First stop the current cluster (sbin/stop-all.sh).
2. Use jps to find the PID of the extra Master.
3. Then kill that Master process:
ps -ef | grep master
kill -s 9 master_pid
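A slightly gentler sequence is to try the default SIGTERM first and escalate to SIGKILL only if the process ignores it (master_pid stands for the PID found above):
kill master_pid        # default SIGTERM: gives the JVM a chance to exit cleanly
kill -s 9 master_pid   # only if the process is still running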
Here -s 9 specifies that signal 9 (SIGKILL) is sent to the process, forcing it to terminate immediately.
And all is quiet again.