Spark & Scala on Windows 10

This article assumes the Java JDK is already installed and configured on Windows 10; that setup is not repeated here.

1. Download and install Spark

Downloads | Apache Spark

    1.1 Extract spark-3.3.0-bin-hadoop3.tgz to spark-3.3.0-bin-hadoop3.tar first, then extract the tar file to the target path, e.g. C:\Apps\spark-3.3.0-bin-hadoop3. This package already bundles Scala 2.12.

    1.2 Add C:\Apps\spark-3.3.0-bin-hadoop3\bin to the system Path environment variable. Create two new variables, SPARK_HOME and HADOOP_HOME, both set to C:\Apps\spark-3.3.0-bin-hadoop3.
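
After reopening the terminal, you can verify the variables from any Scala REPL (a quick sanity check; sys.env is Scala's read-only view of the process environment):

println(sys.env.getOrElse("SPARK_HOME", "SPARK_HOME not set"))   // path to the Spark install
println(sys.env.getOrElse("HADOOP_HOME", "HADOOP_HOME not set")) // same path in this setup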

2. Download and install winutils.exe

winutils/winutils.exe at master · steveloughran/winutils · GitHub

Running Spark on Windows requires winutils.exe, because Spark accesses files in a POSIX-like way through the Windows API. winutils.exe gives Spark the Windows-specific services it needs, including running shell commands in the Windows environment. Download winutils.exe and place it in %HADOOP_HOME%\bin (here, C:\Apps\spark-3.3.0-bin-hadoop3\bin).
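
If you prefer not to set HADOOP_HOME globally, Hadoop's shell layer also honors the hadoop.home.dir JVM property (a sketch of this alternative; it must run before any Spark/Hadoop class initializes, and it assumes winutils.exe sits under bin of the given path):

object WinutilsSetup {
  def main(args: Array[String]): Unit = {
    // Equivalent to setting HADOOP_HOME: Hadoop resolves winutils as
    // <hadoop.home.dir>\bin\winutils.exe.
    System.setProperty("hadoop.home.dir", "C:\\Apps\\spark-3.3.0-bin-hadoop3")
    // ... create the SparkSession afterwards ...
  }
}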

3. The Spark shell

Run spark-shell from a command prompt.

spark.version shows the Spark version; you can also run any other Spark code there, for example creating an RDD (see the session below).
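
For example, the following spark-shell session (a minimal sketch; the res numbers and the exact RDD description vary by session) checks the version and runs a small RDD computation:

scala> spark.version
res0: String = 3.3.0

scala> val rdd = spark.sparkContext.parallelize(1 to 5)
rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:23

scala> rdd.map(_ * 2).collect()
res1: Array[Int] = Array(2, 4, 6, 8, 10)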

4. Configure IntelliJ

  4.1 Install the Scala plugin

  File > Settings... > Plugins: find the Scala plugin and click Install; IntelliJ must be restarted once the installation finishes. A button reading Installed means the plugin is already present.

  4.2 Set the Scala SDK

  File > Project Structure... > Platform Settings / Global Libraries -> + / Add -> Scala SDK -> Download. Download and install the matching version, 2.12.15.

   4.3 Download the sample project: https://github.com/spark-examples/spark-hello-world-example

   4.3.1 Edit pom.xml and set the Scala and Spark versions:

  <properties>
    <scala.version>2.12.15</scala.version>
    <spark.version>3.3.0</spark.version>
  </properties>

   4.3.2 Maven plugin error

Failure to find org.apache.maven.plugins:maven-eclipse-plugin:pom

Delete the maven-eclipse-plugin section from pom.xml (the plugin is retired and is not needed for this project).

   4.3.3 Scala signature error

java.lang.RuntimeException: error reading Scala signature of package.class:
Scala signature package has wrong version
expected: 5.0
found: 5.2 in package.class

This is caused by a mismatch between the Scala SDK version configured in IntelliJ and the Scala version the dependencies in the pom were compiled against. Update the Spark dependencies in pom.xml so the artifact suffix matches Scala 2.12:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.12</artifactId>
      <version>${spark.version}</version>
      <scope>compile</scope>
    </dependency>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>${spark.version}</version>
      <scope>compile</scope>
    </dependency>

The final pom.xml is as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>
  <artifactId>spark-hello-world-example</artifactId>
  <version>1.0-SNAPSHOT</version>
  <inceptionYear>2008</inceptionYear>
  <packaging>jar</packaging>
  <properties>
    <scala.version>2.12.15</scala.version>
    <spark.version>3.3.0</spark.version>
  </properties>

  <repositories>
    <repository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </repository>
  </repositories>

  <pluginRepositories>
    <pluginRepository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </pluginRepository>
  </pluginRepositories>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>

    <dependency>
      <groupId>org.specs</groupId>
      <artifactId>specs</artifactId>
      <version>1.2.5</version>
      <scope>test</scope>
    </dependency>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.12</artifactId>
      <version>${spark.version}</version>
      <scope>compile</scope>
    </dependency>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>${spark.version}</version>
      <scope>compile</scope>
    </dependency>

  </dependencies>

  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <resources>
      <resource>
        <directory>src/main/resources</directory>
      </resource>
    </resources>
    <plugins>

    </plugins>
  </build>
  <reporting>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <configuration>
          <scalaVersion>${scala.version}</scalaVersion>
        </configuration>
      </plugin>
    </plugins>
  </reporting>
</project>

  4.3.4 Create the SparkSessionTest application (a minimal sketch follows), or select SparkSessionTest in the project tree, right-click, and choose Run 'SparkSessionTest'.
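
A SparkSessionTest along these lines works (an illustrative sketch; the class shipped with the sample project may differ in details):

import org.apache.spark.sql.SparkSession

object SparkSessionTest {
  def main(args: Array[String]): Unit = {
    // local[1]: run Spark in-process with one worker thread,
    // so no cluster is needed for this hello-world test.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SparkSessionTest")
      .getOrCreate()

    println("Spark Version : " + spark.version)
    println("App Name      : " + spark.sparkContext.appName)
    println("Master        : " + spark.sparkContext.master)

    spark.stop()
  }
}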

  Running it prints the Spark version, application name, and master URL to the console.

5. References

How to Run Spark Hello World Example in IntelliJ - Spark by {Examples}
