Note: if you spot any errors, please point them out in the comments and I will correct them promptly.
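1. Create a Maven project
1) In IntelliJ IDEA, click File -> New -> Project…. In the dialog shown below, check "Create from archetype", select maven-archetype-quickstart, and click Next.
2) After you click Next, the dialog shown below appears. Enter the project name, click Next, keep clicking Next through the remaining pages, and finally click Finish.
3) When the wizard finishes, the project structure is generated.
4) Edit the pom file so that the Maven project supports Scala code.
Add the following dependencies inside the <dependencies> element: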
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-compiler</artifactId>
  <version>2.11.8</version>
</dependency>
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.11.8</version>
  <scope>compile</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.2.0</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.11</artifactId>
  <version>2.2.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.2.0</version>
</dependency>
Add the build configuration:
<build>
  <plugins>
    <plugin>
      <groupId>org.scala-tools</groupId>
      <artifactId>maven-scala-plugin</artifactId>
      <version>2.15.2</version>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
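The Spark artifacts above carry the suffix _2.11, which has to match the Scala binary version of scala-library (2.11.8 here). As a quick sanity check, the following minimal sketch prints the Scala version the project actually runs with. It can be run once the scala source directory described in the next section exists. The names ScalaVersionCheck and binaryVersion are hypothetical helpers introduced only for this illustration.

```scala
object ScalaVersionCheck {
  // Hypothetical helper: reduce a full version such as "2.11.8" to its binary version "2.11".
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the scala-library that is on the classpath at runtime.
    val full = scala.util.Properties.versionNumberString
    println(s"Scala version: $full, binary version: ${binaryVersion(full)}")
  }
}
```

If the printed binary version differs from the suffix of the Spark artifacts, the versions in the pom file most likely need to be aligned.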
2. Write the code
1) Under src -> main, create a directory named scala, then right-click the scala directory and choose Mark Directory as -> Sources Root.
2) Next, create the file to run: WordCount
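import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    // Run locally with a single worker thread
    val spark = SparkSession
      .builder()
      .master("local[1]")
      .appName("wordcount")
      .config(conf)
      .getOrCreate()
    // Local path of the file whose words are counted
    val lines = spark.sparkContext.textFile("D:\\status\\file\\wordcount.txt")
    // Split each line on spaces, pair every word with 1, then sum the counts per word
    val counts = lines.flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
    // Bring the results back to the driver and print one (word, count) pair per line
    counts.collect().foreach(println)
    spark.stop()
  }
}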
Now you can run it directly.
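To see what the flatMap, map and reduceByKey steps compute without starting Spark, here is a minimal sketch of the same pipeline on an ordinary Scala collection. It is only an illustration under a few assumptions: WordCountLocal, countWords and the sample input are hypothetical names and data that do not appear in the project above, groupBy plus a sum stands in for reduceByKey, and the filter step is an addition that drops empty tokens.

```scala
object WordCountLocal {
  // Same split -> pair -> sum-per-word steps as the Spark job, on a plain Seq.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // split each line into words
      .filter(_.nonEmpty)      // addition: drop empty tokens from repeated spaces or empty lines
      .map((_, 1))             // pair each word with a count of 1
      .groupBy(_._1)           // group the pairs by word (stand-in for reduceByKey)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum the counts per word

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input instead of wordcount.txt
    val sample = Seq("hello spark", "hello scala")
    countWords(sample).foreach(println)
  }
}
```

Each printed line is a (word, count) pair, and the order of the lines is not guaranteed. This makes it easy to compare against the output of the Spark job on a small file.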