Setting up IntelliJ IDEA for Scala and Spark development (complete guide)

Latest recommended article published on 2024-07-02 00:25:59

Author: 小跳蚤的绿茵传奇

Category: environment setup. Tags: scala, spark, intellij-idea

Article link: https://blog.csdn.net/qq_49824182/article/details/127283471

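1. Install the Scala plugin in IDEA

Follow the steps marked by the arrows in the original screenshot (the plugin is normally installed from File -> Settings -> Plugins by searching for "Scala").
Restart IDEA once the plugin is installed.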

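2. Add Scala framework support

Create a project: File -> New -> Project, set the name and location, choose Java and Maven, then click Create.

Add Scala framework support: right-click the project -> Add Framework Support, scroll down to Scala, select it, and click OK.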

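3. Create and run a Scala test example

Create a scala folder under both the main and test folders.
Mark the scala directory under main as a Sources Root.

Create a new Scala class and write a small example to test the setup.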
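As a minimal sketch of such a test example (the object name `HelloScala` and the `greeting` helper are made up here, not taken from the original post), something like the following should be enough to confirm that the Scala SDK and source root are configured:

```scala
object HelloScala {
  // Builds the greeting separately so it can be checked without running main.
  def greeting(name: String): String = s"Hello, $name!"

  def main(args: Array[String]): Unit =
    println(greeting("Scala"))
}
```

If IDEA shows a run arrow next to `main` and the greeting is printed to the console, the Scala side of the setup is working.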

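4. Add the Spark dependency jars and run a Spark example

Add the dependencies: File -> Project Structure -> (the remaining steps were shown in the original screenshot)

Locate the jars directory under your local Spark installation, click OK, and add it.

Afterwards a jars entry appears in the list; these are the libraries the program needs at run time.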
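One possible way to check that the jars were picked up, before writing any Spark code, is to try loading a Spark class by name. This is only a sketch: the object `ClasspathCheck` and its helper are hypothetical names, and the check merely reports whether the class can be loaded.

```scala
import scala.util.Try

object ClasspathCheck {
  // Returns true if the named class can be loaded from the current classpath.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  def main(args: Array[String]): Unit = {
    // SparkContext lives in the spark-core jar, so it is a reasonable probe.
    val found = isOnClasspath("org.apache.spark.SparkContext")
    println(s"Spark core on classpath: $found")
  }
}
```

If this prints `false`, the jars directory was probably not added to the module that contains the class.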

Create a test2 object containing a Spark example and run it:

Code:

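import org.apache.spark.{SparkConf, SparkContext}

object test2 {
  def main(args: Array[String]): Unit = {
    // Run locally with two worker threads
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[2]")
    val sc: SparkContext = new SparkContext(conf)
    // Read the input file; change the path to wherever your test.txt is stored
    val line = sc.textFile("F:\\test.txt")

    // Split each line into words on single spaces
    val word = line.flatMap(_.split(" "))

    // Pair each word with 1, then sum the counts per word
    val tup = word.map((_, 1))
    val reduced = tup.reduceByKey(_ + _)
    // Sort by count in descending order
    val res = reduced.sortBy(_._2, ascending = false)
    println(res.collect.toBuffer)
    // Note: saveAsTextFile fails if the output directory already exists
    res.saveAsTextFile("./TestWord")
    sc.stop()
  }
}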

Contents of test.txt:
hello hello world scala java Python
java hello c++ c kafka flume hadoop sqoop
supervisor redis hive hive hbase hbase zookeeper hive hdfs hdfs hdfs
大数据大数据大数据程序员
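To get a feel for what the Spark job should report on this file, the same steps can be sketched with plain Scala collections, with no Spark involved. The object `LocalWordCount` is a hypothetical helper for cross-checking, not part of the original post; the order of words with equal counts may differ from what Spark prints.

```scala
object LocalWordCount {
  // Sample lines copied from test.txt above.
  val lines: Seq[String] = Seq(
    "hello hello world scala java Python",
    "java hello c++ c kafka flume hadoop sqoop",
    "supervisor redis hive hive hbase hbase zookeeper hive hdfs hdfs hdfs",
    "大数据大数据大数据程序员"
  )

  // Mirrors the Spark job: split on spaces, count per word, sort by count descending.
  def wordCounts(input: Seq[String]): Seq[(String, Int)] =
    input
      .flatMap(_.split(" "))
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.size) }
      .toSeq
      .sortBy { case (_, count) => -count }

  def main(args: Array[String]): Unit =
    wordCounts(lines).foreach(println)
}
```

Because the last line of the file contains no spaces, it is counted as a single word, which is also how the Spark version treats it.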

Result:
(screenshot of the console output in the original post)
At this point the environment is ready.

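In addition, if you need to configure pom.xml, the following file can serve as a reference; adjust the version numbers to match your own installation: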

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <!-- Set your own groupId -->
    <groupId>org.example</groupId>
    <artifactId>sparkDemo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <!-- Dependency version numbers -->
    <properties>
        <scala.version>2.11.12</scala.version>
        <hadoop.version>2.7.3</hadoop.version>
        <spark.version>2.4.0</spark.version>
    </properties>
    <dependencies>
        <!--Scala-->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <!--Spark-->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.47</version>
        </dependency>
        <!--Hadoop-->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <!--  https://mvnrepository.com/artifact/com.google.code.gson/gson
        <dependency>
            <groupId>com.google.code.gson</groupId>
            <artifactId>gson</artifactId>
            <version>2.8.0</version>
        </dependency>
        &lt;!&ndash; https://mvnrepository.com/artifact/org.apache.kafka/kafka &ndash;&gt;
        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.11</artifactId>
            <version>1.0.0</version>
        </dependency>-->
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
    </dependencies>
    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>

        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                        <configuration>
                            <args>
                                <arg>-dependencyfile</arg>
                                <arg>${project.build.directory}/.scala_dependencies</arg>
                            </args>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
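
When adjusting the versions in the pom, keep in mind that the `_2.11` suffix on the Spark artifactIds is the Scala binary version and has to agree with `scala.version`. The sketch below, using a hypothetical helper object `ArtifactNames` and assuming a Scala 2.x version string of the form major.minor.patch, shows how the suffix is derived and prints the Scala library version actually on the classpath:

```scala
object ArtifactNames {
  // Derives the Scala binary version ("2.11") from a full 2.x version ("2.11.12").
  def binaryVersion(scalaVersion: String): String =
    scalaVersion.split('.').take(2).mkString(".")

  // Builds a Scala-suffixed artifactId such as "spark-core_2.11".
  def artifactId(base: String, scalaVersion: String): String =
    s"${base}_${binaryVersion(scalaVersion)}"

  def main(args: Array[String]): Unit = {
    // Version of the Scala library the program is running against.
    val running = scala.util.Properties.versionNumberString
    println(s"Running Scala $running")
    println(artifactId("spark-core", "2.11.12"))
  }
}
```

If the printed Scala version does not start with the suffix used in the pom, mixing the two is a likely source of binary-compatibility errors at run time.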