Prerequisites: four virtual machines with JDK and Hadoop environments already configured; see my other posts for how to set those up.
hadoop01:192.168.157.101
hadoop02:192.168.157.102
hadoop03:192.168.157.103
hadoop04:192.168.157.104
JDK version: 1.8.0_161
Hadoop version: 2.7.5
Download Spark: spark-2.3.0-bin-hadoop2.7.tar.gz
Extract it into /home/hadoop.
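A minimal sketch of this step (the mirror URL is an assumption; note the official archive names the tarball with a .tgz extension). The binary release only ships config templates, so editable copies are created here as well:

cd /home/hadoop
wget https://archive.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
tar -zxvf spark-2.3.0-bin-hadoop2.7.tgz
cd spark-2.3.0-bin-hadoop2.7
# create editable config files from the bundled templates
cp conf/spark-env.sh.template conf/spark-env.sh
cp conf/slaves.template conf/slaves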
Edit conf/spark-env.sh under the Spark installation directory, appending the following at the bottom:
export JAVA_HOME=/usr/local/jdk1.8.0_161
export HADOOP_HOME=/home/hadoop/hadoop-2.7.5
export HADOOP_CONF_DIR=/home/hadoop/hadoop-2.7.5/etc/hadoop
export SPARK_MASTER_IP=hadoop01
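Note: in Spark 2.x, SPARK_MASTER_IP is deprecated; 2.3.0 still honors it but logs a warning at master startup. The preferred equivalent is:

export SPARK_MASTER_HOST=hadoop01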
Edit conf/slaves, adding the worker hostnames:
hadoop02
hadoop03
hadoop04
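start-slaves.sh expects the same Spark directory at the same path on every worker, so distribute it from hadoop01 first. A sketch, assuming passwordless SSH is already set up for the hadoop user (Hadoop itself requires it too):

cd /home/hadoop
for host in hadoop02 hadoop03 hadoop04; do
  scp -r spark-2.3.0-bin-hadoop2.7 hadoop@$host:/home/hadoop/
done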
On hadoop01, enter the sbin directory and start the master:
hadoop@hadoop-virtual-machine:~/spark-2.3.0-bin-hadoop2.7/sbin$ ./start-master.sh
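To confirm the master is up, check its log (file name pattern assumed from the default Spark layout) or open the web UI, which listens on http://hadoop01:8080 by default:

tail -n 20 ../logs/spark-*-org.apache.spark.deploy.master.Master-*.out
# should contain a line like: Starting Spark master at spark://hadoop01:7077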
Still in sbin on hadoop01, start the workers (start-slaves.sh reads conf/slaves and launches a Worker on each listed host over SSH, so it runs from the master rather than on the workers):
hadoop@hadoop-virtual-machine:~/spark-2.3.0-bin-hadoop2.7/sbin$ ./start-slaves.sh
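If a single worker ever needs to be (re)started by hand, e.g. on hadoop02, Spark also ships start-slave.sh, which takes the master URL (7077 is the default port):

# run on the worker machine itself
/home/hadoop/spark-2.3.0-bin-hadoop2.7/sbin/start-slave.sh spark://hadoop01:7077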
Run jps on each node:
hadoop01 should show a Master process;
hadoop02, hadoop03, and hadoop04 should each show a Worker process.
If so, the cluster is set up successfully.
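As an optional end-to-end check, submit the bundled SparkPi example to the standalone master (the examples jar name matches the 2.3.0 / Scala 2.11 binary distribution; adjust if yours differs):

cd /home/hadoop/spark-2.3.0-bin-hadoop2.7
bin/spark-submit --master spark://hadoop01:7077 \
  --class org.apache.spark.examples.SparkPi \
  examples/jars/spark-examples_2.11-2.3.0.jar 100
# success prints a line like: Pi is roughly 3.14...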