I. Preparation
1. OS: Ubuntu
2. Packages: Spark 2.3.3 + Java 8 + Scala 2.11
II. Steps
1. Download the Spark package spark-2.3.3-bin-hadoop2.7.tgz from the official site
2. Download the Scala package scala-2.11.12.tgz from the official site
3. Download the JDK package jdk-8u201-linux-x64.tar.gz from the official site
4. Spark configuration
4.1 Install SSH (commands and screenshots attached)
sudo apt-get update
sudo apt-get install openssh-server
4.2 Set up passwordless SSH login (commands and screenshots attached)
ssh-keygen -t rsa    # generate an RSA key pair; press Enter at each prompt for defaults and an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    # authorize the key for local login
chmod 700 ~/.ssh                    # sshd rejects the key if these
chmod 600 ~/.ssh/authorized_keys    # permissions are too open
sudo service ssh start
ps -e | grep ssh    # confirm the sshd process is running
ssh localhost
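On the first connection ssh asks you to confirm the host key; after that, ssh localhost should log in without prompting for a password. Leave the test session before continuing:
exit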
4.3 Extract the packages, copy them to /opt, and set access permissions (commands and screenshots attached)
tar -zxvf jdk-8u201-linux-x64.tar.gz
tar -zxvf spark-2.3.3-bin-hadoop2.7.tgz
tar -zxvf scala-2.11.12.tgz
ls
sudo cp -R jdk1.8.0_201 /opt
sudo cp -R spark-2.3.3-bin-hadoop2.7 /opt
sudo cp -R scala-2.11.12 /opt
sudo chmod -R 777 /opt
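At this point /opt should contain the three freshly copied directories:
ls /opt    # expect jdk1.8.0_201, scala-2.11.12, spark-2.3.3-bin-hadoop2.7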
4.4 Edit the profile file (commands and screenshots attached)
sudo vim /etc/profile
export JAVA_HOME=/opt/jdk1.8.0_201
export CLASSPATH=$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
export SCALA_HOME=/opt/scala-2.11.12
export PATH=$PATH:$SCALA_HOME/bin
export SPARK_HOME=/opt/spark-2.3.3-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
Save and quit vim (:wq), then reload the profile:
source /etc/profile
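A quick sanity check that the new variables took effect:
java -version             # should report 1.8.0_201
scala -version            # should report 2.11.12
spark-submit --version    # should report Spark 2.3.3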
4.5 Edit the Spark configuration files (commands and screenshots attached)
cd /opt/spark-2.3.3-bin-hadoop2.7/conf
cp spark-env.sh.template spark-env.sh
cp log4j.properties.template log4j.properties
cp slaves.template slaves
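The slaves file lists the machines that run a Worker process, one hostname per line. For a single-machine cluster the template's default entry is enough; for real worker nodes, replace it with their hostnames:
localhost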
Edit spark-env.sh to configure the master and the workers:
export JAVA_HOME=/opt/jdk1.8.0_201
export SCALA_HOME=/opt/scala-2.11.12
export SPARK_MASTER_IP=SparkMaster       # hostname of the master node
export SPARK_WORKER_MEMORY=4g            # memory each worker may use
export SPARK_WORKER_CORES=2              # cores each worker may use
export SPARK_WORKER_INSTANCES=1          # worker processes per machine
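SPARK_MASTER_IP=SparkMaster only takes effect if the name SparkMaster resolves. Assuming a single-machine setup, one entry in /etc/hosts covers it (in a multi-node cluster, use the master's real address instead):
127.0.0.1    SparkMaster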
5. Start and stop the Spark cluster (attach the shell output and the Web UI page showing a successful start)
cd /opt/spark-2.3.3-bin-hadoop2.7/sbin
./start-all.sh
If start-all.sh fails with a permission error (e.g. on the logs directory), loosen the permissions and run it again:
sudo chmod -R 777 /opt/spark-2.3.3-bin-hadoop2.7
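On a successful start, jps lists a Master and a Worker process, and the master's Web UI is served on port 8080 by default:
jps    # expect Master and Worker entries
# Web UI: http://SparkMaster:8080 (or http://localhost:8080)
As an optional smoke test, the SparkPi example bundled with the distribution can be submitted to the new master (jar name as shipped with Spark 2.3.3):
cd /opt/spark-2.3.3-bin-hadoop2.7
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master spark://SparkMaster:7077 \
    examples/jars/spark-examples_2.11-2.3.3.jar 10
To shut the cluster down:
./sbin/stop-all.sh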