First set up a Hadoop 2.7.7 pseudo-distributed environment; see:
https://blog.csdn.net/ruth13156402807/article/details/115631343
1. Install Scala (optional)
Install as the root user.
1.1 Download
Scala download page: http://www.scala-lang.org/download/all.html
Pick the version that matches your platform; here we install on Linux and use scala-2.11.8.tgz.
Download link:
https://www.scala-lang.org/download/2.11.8.html
1.2 Upload and extract the tarball to /usr/local
[root@master ~]# cd /usr/local/
[root@master local]# tar -zxvf scala-2.11.8.tgz
1.3 Configure environment variables
[root@master ~]# vim /etc/profile
export SCALA_HOME=/usr/local/scala-2.11.8
export PATH=$PATH:$SCALA_HOME/bin
Save, then apply the changes immediately:
[root@master ~]# source /etc/profile
1.4 Verify the installation
[root@master ~]# scala -version
Scala code runner version 2.11.8 -- Copyright 2002-2016, LAMP/EPFL
2. Install Spark
Install as the hadoop user.
2.1 Download the package
Download link:
https://archive.apache.org/dist/spark/spark-2.3.0/
2.2 Upload and extract; upload the tarball to /home/hadoop/apps
[hadoop@master apps]$ tar -zxvf spark-2.3.0-bin-hadoop2.7.tgz
2.3 Edit the configuration files under spark/conf
[hadoop@master apps]$ cd spark-2.3.0-bin-hadoop2.7/conf/
Copy spark-env.sh.template to spark-env.sh and append the following at the end of the file:
[hadoop@master conf]$ cp spark-env.sh.template spark-env.sh
[hadoop@master conf]$ vim spark-env.sh
# Add the following:
export JAVA_HOME=/usr/local/jdk1.8.0_211
export SCALA_HOME=/usr/local/scala-2.11.8
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.7.7
export HADOOP_CONF_DIR=/home/hadoop/apps/hadoop-2.7.7/etc/hadoop
export SPARK_MASTER_IP=master
export SPARK_MASTER_PORT=7077
# The hostname master is defined in /etc/hosts:
[hadoop@master hadoop-2.7.7]$ cat /etc/hosts
192.168.1.100 master
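The SPARK_MASTER_IP=master setting only works if the hostname resolves. As a self-contained sketch (the here-doc stands in for the /etc/hosts entries shown above), this is how the name maps to an address:

```shell
# Self-contained sketch: the here-doc plays the role of /etc/hosts.
# On a live node you would run: awk '$2 == "master" { print $1 }' /etc/hosts
awk '$2 == "master" { print $1 }' <<'EOF'
127.0.0.1 localhost
192.168.1.100 master
EOF
```

If this prints nothing on your machine, add the `192.168.1.100 master` line (with your own IP) before starting Spark.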
2.4 Configure environment variables
[hadoop@master ~]$ vim .bash_profile
#SPARK_HOME
export SPARK_HOME=/home/hadoop/apps/spark-2.3.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
Save, then apply the changes immediately:
[hadoop@master ~]$ source .bash_profile
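A common pitfall in step 2.4 is writing `export PATH=$SPARK_HOME/bin`, which replaces the entire PATH (breaking every other command) instead of extending it. A minimal, self-contained sketch of the correct composition, using the SPARK_HOME value from this guide:

```shell
# Append Spark's bin directory to the existing PATH rather than overwriting it.
SPARK_HOME=/home/hadoop/apps/spark-2.3.0-bin-hadoop2.7
export PATH="$PATH:$SPARK_HOME/bin"
# The last PATH entry is now Spark's bin directory:
echo "$PATH" | tr ':' '\n' | tail -n 1
```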
2.5 Start Spark
[hadoop@master ~]$ cd apps/spark-2.3.0-bin-hadoop2.7/sbin/
[hadoop@master sbin]$ ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /home/hadoop/apps/spark-2.3.0-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-master.out
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /home/hadoop/apps/spark-2.3.0-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
2.6 Check the running processes
[hadoop@master sbin]$ jps
37840 RunJar
32513 NodeManager
31892 NameNode
53588 Worker
32149 SecondaryNameNode
53528 Master
31993 DataNode
53705 Jps
32411 ResourceManager
# Worker and Master above are the Spark daemons
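On a node that also runs Hadoop the jps listing gets long; a quick way to pick out just the Spark daemons (the here-doc below reproduces the sample listing above so the snippet is self-contained — on a live node, pipe `jps` into `grep` instead):

```shell
# Filter a jps-style listing for the Spark standalone daemons.
# Live usage: jps | grep -E 'Master|Worker'
grep -E 'Master|Worker' <<'EOF'
37840 RunJar
53588 Worker
32149 SecondaryNameNode
53528 Master
53705 Jps
EOF
```

If either process is missing, check the log files under $SPARK_HOME/logs shown in the start-all.sh output.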
2.7 Open the web UI
Web UI address: http://192.168.1.100:8080
Reference: https://www.cnblogs.com/qingyunzong/p/8903714.html