- First, upload the installation package to the /home/hadoop/Downloads directory on the Linux host via XFTP, then extract and install it:
[hadoop@master Downloads]$ sudo tar -zxf spark-2.1.0-bin-without-hadoop.tgz -C /usr/local
[hadoop@master Downloads]$ cd /usr/local
[hadoop@master local]$ sudo mv ./spark-2.1.0-bin-without-hadoop ./spark
Grant the hadoop user ownership of the spark directory:
[hadoop@master local]$ sudo chown -R hadoop:hadoop ./spark
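As an optional sanity check (the exact listing will differ by system), you can confirm that the ownership change took effect; the owner and group columns should now both read hadoop:
[hadoop@master local]$ ls -ld ./spark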
- Configuration
Copy the configuration file template that ships with the Spark installation:
[hadoop@master local]$ cd /usr/local/spark
[hadoop@master spark]$ cp ./conf/spark-env.sh.template ./conf/spark-env.sh
[hadoop@master spark]$ vim ./conf/spark-env.sh
Open spark-env.sh with the vim editor and add the following configuration on the first line:
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
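This setting points the "without-hadoop" Spark build at the Hadoop jars installed on this machine. If you want to preview the classpath value Spark will pick up, you can run the embedded command by itself (the output is a long, machine-specific list of jar paths):
[hadoop@master spark]$ /usr/local/hadoop/bin/hadoop classpath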
Verify that the Spark installation works. Running the bundled SparkPi example produces verbose log output, so stderr is merged into stdout and filtered with grep:
[hadoop@master spark]$ cd /usr/local/spark
[hadoop@master spark]$ bin/run-example SparkPi 2>&1 | grep "Pi is roughly"
The returned result (the value varies slightly from run to run, since SparkPi estimates Pi by Monte Carlo sampling):
Pi is roughly 3.1386556932784666
3. Once HDFS is started, Spark can read and write data stored in HDFS, as sketched below.
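A minimal sketch of such a read, assuming Hadoop lives under /usr/local/hadoop and the NameNode listens at hdfs://master:9000 (adjust the URI to match fs.defaultFS in your core-site.xml):
[hadoop@master spark]$ /usr/local/hadoop/sbin/start-dfs.sh
[hadoop@master spark]$ /usr/local/hadoop/bin/hdfs dfs -mkdir -p /user/hadoop
[hadoop@master spark]$ /usr/local/hadoop/bin/hdfs dfs -put /usr/local/spark/README.md /user/hadoop/
[hadoop@master spark]$ bin/spark-shell
scala> sc.textFile("hdfs://master:9000/user/hadoop/README.md").count()
scala> :quit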