- Step 1: Download Spark: http://mirrors.shu.edu.cn/apache/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
- Step 2: Transfer the downloaded Spark archive to the Linux machine (for example with the `rz` command; it is best to create a directory to hold it first: `mkdir /opt`).
- Step 3: `cd /opt` and extract the archive:
tar -zxvf spark-2.3.0-bin-hadoop2.7.tgz
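If you want to see what each `tar` flag does before touching the real tarball, here is a self-contained sketch that builds a throwaway archive under /tmp and extracts it with the same flags (all /tmp paths are illustrative, not the actual Spark download):

```shell
# Build a tiny stand-in archive (placeholder content, illustrative paths)
mkdir -p /tmp/demo/spark-2.3.0-bin-hadoop2.7
echo "placeholder" > /tmp/demo/spark-2.3.0-bin-hadoop2.7/README.md
tar -czf /tmp/demo.tgz -C /tmp/demo spark-2.3.0-bin-hadoop2.7

# Extract it: -z gunzip, -x extract, -v list files, -f archive name
mkdir -p /tmp/opt
tar -zxvf /tmp/demo.tgz -C /tmp/opt
ls /tmp/opt   # → spark-2.3.0-bin-hadoop2.7
```

The `-C` option is only needed here to keep the demo inside /tmp; in the tutorial you are already in /opt, so plain `tar -zxvf spark-2.3.0-bin-hadoop2.7.tgz` is enough.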
- Step 4: `cd /etc` and open the `profile` file with `vi`, then add the lines below (note: adjust the paths if your install directory differs):
#Spark environment
export SPARK_HOME=/opt/spark/spark-2.3.0-bin-hadoop2.7/
export PATH="$SPARK_HOME/bin:$PATH"
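Changes to /etc/profile only take effect in a shell that has re-read it (`source /etc/profile` or a fresh login). A quick sanity check, assuming the install path above:

```shell
# Re-apply the two exports (normally done by `source /etc/profile`)
export SPARK_HOME=/opt/spark/spark-2.3.0-bin-hadoop2.7/
export PATH="$SPARK_HOME/bin:$PATH"

# Verify the Spark bin directory is now on PATH (-F: match as a fixed string)
echo "$PATH" | grep -qF "$SPARK_HOME/bin" && echo "spark on PATH"
```

Once this prints "spark on PATH", `spark-shell` can be started from any directory, not just from the bin folder.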
- Step 5: Create a new spark_file_test directory under /opt/spark/:
mkdir spark_file_test
- Step 6: Create a file in /opt/spark/spark_file_test:
touch hello_spark
- Step 7: Edit the hello_spark file and enter some test data:
vi hello_spark

hello spark!
hello spark!
hello spark!
hello spark!
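Steps 5-7 can also be scripted in one go instead of using `vi` interactively. Here is a sketch that writes the same four test lines, using /tmp as an illustrative stand-in for /opt/spark (adjust the path to match your setup):

```shell
# Stand-in for /opt/spark/spark_file_test (illustrative path)
mkdir -p /tmp/spark_file_test

# Write the test data with a here-document instead of vi
cat > /tmp/spark_file_test/hello_spark <<'EOF'
hello spark!
hello spark!
hello spark!
hello spark!
EOF

wc -l < /tmp/spark_file_test/hello_spark   # → 4
```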
- Step 8: Return to the bin directory: cd /opt/spark/spark-2.3.0-bin-hadoop2.7/bin
- Step 9: Run spark-shell; output like the following indicates success:
2018-04-30 09:35:53 WARN Utils:66 - Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.159.128 instead (on interface eth0)
2018-04-30 09:35:53 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-04-30 09:35:57 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
Spark context Web UI available at http://192.168.159.128:4040
Spark context available as 'sc' (master = local[*], app id = local-1524847005612).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) Client VM, Java 1.8.0_171)
Type in expressions to have them evaluated.
Type :help for more information.
- Step 10: Read the file; this returns an RDD:
scala> var lines = sc.textFile("../../spark_file_test/hello_spark")
2018-04-27 09:40:53 WARN SizeEstimator:66 - Failed to check whether UseCompressedOops is set; assuming yes
lines: org.apache.spark.rdd.RDD[String] = ../../spark_file_test/hello_spark MapPartitionsRDD[1] at textFile at <console>:24
- Step 11: Test it: count the lines in the file and read the first line:
scala> lines.count()
res0: Long = 5

scala> lines.first
res1: String = Hello Spark!
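As a cross-check outside Spark, `lines.count()` and `lines.first` correspond to `wc -l` and `head -n 1` on the raw text file. A sketch against an illustrative copy of the test file (the real one lives at /opt/spark/spark_file_test/hello_spark):

```shell
# Illustrative two-line copy of the test file
FILE=/tmp/hello_spark
printf 'hello spark!\nhello spark!\n' > "$FILE"

wc -l < "$FILE"     # like lines.count()  → 2
head -n 1 "$FILE"   # like lines.first    → hello spark!
```

If the Spark result and the Unix result disagree, the RDD was most likely created from a different path than you think; check the relative path passed to sc.textFile.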
!!!!!!!!!!!!!!!!!!!!!!!!SUCCESSFUL!!!!!!!!!!!!!!!!!!!!!!!!
Spark Getting Started -- For Beginners