文章目录
spark Scala开发环境搭建
下载安装jdk、idea、scala,记录好路径
用到的软件可从本人网盘下载
链接:https://pan.baidu.com/s/1qmCeVgQyS68dTSFe9HLhBQ
提取码:teac
打开idea 新建Maven project,选择jdk路径
设置工程参数
设置scala插件
File-Settings-Plugins,选择下载好的scala插件,直接选择压缩包即可。(在线下载特别慢)先从官网下载再导入比较快。
官网scala插件地址http://plugins.jetbrains.com/plugin/1347-scala
设置完成重启idea
设置安装scala
在电脑上安装完成scala后(非插件),再项目右键选择下图,点开后选择scala的安装路径即可配置
更新软件源
右键项目依次选择
将文件替换为下面的内容
<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
<mirrors>
<!-- mirror
| Specifies a repository mirror site to use instead of a given repository. The repository that
| this mirror serves has an ID that matches the mirrorOf element of this mirror. IDs are used
| for inheritance and direct lookup purposes, and must be unique across the set of mirrors.
|
<mirror>
<id>mirrorId</id>
<mirrorOf>repositoryId</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://my.repository.com/repo/path</url>
</mirror>
-->
<mirror>
<id>alimaven</id>
<name>aliyun maven</name>
<url>http://maven.aliyun.com/nexus/content/groups/public/</url>
<mirrorOf>central</mirrorOf>
</mirror>
<mirror>
<id>uk</id>
<mirrorOf>central</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://uk.maven.org/maven2/</url>
</mirror>
<mirror>
<id>CN</id>
<name>OSChina Central</name>
<url>http://maven.oschina.net/content/groups/public/</url>
<mirrorOf>central</mirrorOf>
</mirror>
<mirror>
<id>nexus</id>
<name>internal nexus repository</name>
<!-- <url>http://192.168.1.100:8081/nexus/content/groups/public/</url>-->
<url>http://repo.maven.apache.org/maven2</url>
<mirrorOf>central</mirrorOf>
</mirror>
</mirrors>
</settings>
修改pom文件
设置spark版本、java版本、scala版本,修改
scala版本为2.11.12
spark为2.4.0
jdk为1.8
具体内容为
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.11.12</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>2.11.12</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-reflect</artifactId>
<version>2.11.12</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.12</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.4.0</version>
</dependency>
</dependencies>
<repositories>
<repository>
<id>central</id>
<name>Maven Repository Switchboard</name>
<layout>default</layout>
<url>http://repo2.maven.org/maven2</url>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<build>
<sourceDirectory>src/main/java</sourceDirectory>
<testSourceDirectory>src/test/java</testSourceDirectory>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<version>2.15.2</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<!-- MAVEN 编译使用的JDK版本 -->
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.3</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
<encoding>UTF-8</encoding>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.0.0</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.10</version>
<configuration>
<skip>true</skip>
</configuration>
</plugin>
</plugins>
</build>
等二十分钟左右自动下载依赖包
新建scala类
如下图的文件夹中单击右键选择新建scala class
编写scala程序
,简单示例
import org.apache.spark.{SparkConf, SparkContext}
object wordcount {
def main(args: Array[String]): Unit = {
val conf = new SparkConf().setMaster("local").setAppName("wordcount")
val sc = new SparkContext(conf)
val rdd= sc.textFile(path="D:/y.txt")
println(rdd.count())
}
}
点击运行
windows本地模式saveAsTextFile失败的解决方法
参考 https://www.cnblogs.com/029zz010buct/p/4680403.html
将winutils.exe
复制到一个文件夹中,例如D:\wcg\bina\bin
链接:https://pan.baidu.com/s/1qmCeVgQyS68dTSFe9HLhBQ
提取码:teac
在程序的main函数下设置下面一行,改路径,即可
System.setProperty("hadoop.home.dir", "D:/wcg/bina")
idea打包jar包发送到集群运行
即可生成jar包。
上传到Linux执行。