I. Passwordless SSH login across all nodes
1. First, on each of the three machines, configure passwordless SSH to the machine itself.
Generate the machine's key pair; just press Enter at every prompt. By default, ssh-keygen places the keys under /root/.ssh.
# cd /root/.ssh
# rm -rf ./*
# ssh-keygen -t rsa
# ls
id_rsa id_rsa.pub
Copy the public key into a file named authorized_keys; after that, an ssh connection to the local machine no longer asks for a password.
# cd /root/.ssh
# cp id_rsa.pub authorized_keys
2. Next, configure passwordless SSH between the three machines.
Use ssh-copy-id to append this machine's public key to the authorized_keys file on a target machine (quick and convenient):
# ssh-copy-id -i spark1
# ssh-copy-id -i spark2
# ssh-copy-id -i spark3
Repeat the steps above on each node.
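The three ssh-copy-id calls above can be wrapped in a small loop to run on each node. This is a sketch, not part of the original guide: DRY_RUN is a hypothetical safety switch that only prints the commands; clear it to actually push the keys (each ssh-copy-id prompts once for the target's password).

```shell
# Push this node's public key to every node, including itself.
# DRY_RUN=echo only prints the commands; set DRY_RUN= to run them.
DRY_RUN=echo
for host in spark1 spark2 spark3; do
  $DRY_RUN ssh-copy-id -i /root/.ssh/id_rsa.pub "$host"
done
```

Running this same script on all three machines gives every node passwordless access to every other node.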
3. Verify by logging in between the machines. On spark1:
# ssh spark2
Last login: Fri Mar 16 16:55:26 2018 from 192.168.202.1
# exit    (log out)
II. Installing and configuring Scala
1. Unpack the archive
# tar -zxvf scala-2.11.4.tgz -C /opt/modules
2. Configure environment variables
# vi /etc/profile
# SCALA_HOME
export SCALA_HOME=/opt/modules/scala-2.11.4
export PATH=$PATH:$SCALA_HOME/bin
# source /etc/profile
3. Verify the installation
# scala -version
scala code runner version 2.11.4 -- Copyright 2002-2013, LAMP/EPFL
4. Scala must be configured on every machine; use scp to distribute it to the other nodes:
# scp -r /opt/modules/scala-2.11.4/ 192.168.202.152:/opt/modules
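The scp above covers only one node; a loop like the following copies both the Scala directory and the updated /etc/profile to the remaining nodes. This is a sketch using the same hypothetical DRY_RUN convention (echo prints the commands instead of running them); the IPs are the two other nodes from this guide.

```shell
# Copy Scala and the profile changes to the remaining nodes.
# DRY_RUN=echo only prints the commands; set DRY_RUN= to execute.
DRY_RUN=echo
for ip in 192.168.202.152 192.168.202.153; do
  $DRY_RUN scp -r /opt/modules/scala-2.11.4 "$ip:/opt/modules"
  $DRY_RUN scp /etc/profile "$ip:/etc/profile"
done
```

Remember to run `source /etc/profile` on each node afterwards so the new variables take effect.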
III. Configuring Spark
Unpack the Spark distribution
# tar -zxvf spark-1.5.1-bin-hadoop2.4.tgz -C /opt/modules/
# mv spark-1.5.1-bin-hadoop2.4 spark
Configure environment variables
# vi /etc/profile
# SPARK_HOME
export SPARK_HOME=/opt/modules/spark
export PATH=$PATH:$SPARK_HOME/bin
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
# source /etc/profile
Edit the spark-env.sh file
# cd /opt/modules/spark/conf
# cp spark-env.sh.template spark-env.sh
# vi spark-env.sh
export JAVA_HOME=/opt/modules/jdk1.8.0_151
export SCALA_HOME=/opt/modules/scala-2.11.4
export SPARK_MASTER_IP=192.168.202.151
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/opt/modules/hadoop-2.5.0/etc/hadoop
Edit the slaves file
# mv slaves.template slaves
# vi slaves
192.168.202.151
192.168.202.152
192.168.202.153
Distribute to the other nodes (repeat the scp for each remaining node)
# scp -r /opt/modules/spark/ 192.168.202.152:/opt/modules/
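The distribution step can likewise be looped over all remaining nodes. Another sketch with the hypothetical DRY_RUN switch; clear it to actually copy.

```shell
# Copy the configured Spark directory to the remaining nodes.
# DRY_RUN=echo only prints the commands; set DRY_RUN= to execute.
DRY_RUN=echo
for ip in 192.168.202.152 192.168.202.153; do
  $DRY_RUN scp -r /opt/modules/spark/ "$ip:/opt/modules/"
done
```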
Start the Spark cluster
# cd /opt/modules/spark/sbin/
# ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /opt/modules/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-spark1.out
192.168.202.153: starting org.apache.spark.deploy.worker.Worker, logging to /opt/modules/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-spark3.out
192.168.202.152: starting org.apache.spark.deploy.worker.Worker, logging to /opt/modules/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-spark2.out
[root@spark1 sbin]# jps
6048 Jps
2592 NodeManager
5921 Master
2482 NameNode
2677 JobHistoryServer
2538 DataNode
2735 QuorumPeerMain
[root@spark2 conf]# jps
4225 Jps
2629 QuorumPeerMain
4153 Worker
2474 NodeManager
2427 DataNode
2558 ResourceManager
Use jps and the web UI on port 8080 to check that the cluster started successfully.
In a browser, open: http://spark1:8080
Launch spark-shell to confirm everything works:
# spark-shell
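For a non-interactive check, a tiny job can be piped into spark-shell. This is a sketch and requires the running cluster; `spark://spark1:7077` is an assumption based on the standalone master's default port.

```shell
# Run one small job through spark-shell and exit (needs the cluster up).
# spark://spark1:7077 is an assumption: the default standalone master port.
echo 'sc.parallelize(1 to 100).count()' | spark-shell --master spark://spark1:7077
```

If the job completes, the Master and Workers are wired up correctly; the count also appears as a finished application on the spark1:8080 UI.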