Creating a Spark Project with Maven in IDEA

1. File -> New Project opens the new-project wizard.

Click Next.

2. On the next page, fill in the project details:

Name: the project name

Location: the directory for the project files

GroupId, ArtifactId: any values you like; they end up in the pom file (see the sketch below)

Version: keep the default; it also appears in the pom file
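For reference, these fields land near the top of the generated pom; the values here are placeholders:

```xml
<groupId>com.example</groupId>
<artifactId>spark-demo</artifactId>
<version>1.0-SNAPSHOT</version>
```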

Click Finish.

3. The project opens. Next, configure the pom file and add the dependencies.

4. Add the following to the pom file:

```xml
    <inceptionYear>2008</inceptionYear>
    <properties>
        <scala.version>2.11.8</scala.version>
        <spark.version>2.3.2</spark.version>
    </properties>

    <dependencies>
        <!-- XGBoost -->
        <!-- Alternative: use local jars from lib/ with system scope (see note 5 below)
        <dependency>
            <groupId>ml.dmlc</groupId>
            <artifactId>xgboost4j</artifactId>
            <version>0.81</version>
            <scope>system</scope>
            <systemPath>${project.basedir}/lib/xgboost4j-0.81.jar</systemPath>
        </dependency>
        <dependency>
            <groupId>ml.dmlc</groupId>
            <artifactId>xgboost4j-spark</artifactId>
            <version>0.81</version>
            <scope>system</scope>
            <systemPath>${project.basedir}/lib/xgboost4j-spark-0.81.jar</systemPath>
        </dependency>
        -->

        <dependency>
            <groupId>ml.dmlc</groupId>
            <artifactId>xgboost4j</artifactId>
            <version>0.81</version>
        </dependency>
        <dependency>
            <groupId>ml.dmlc</groupId>
            <artifactId>xgboost4j-spark</artifactId>
            <version>0.81</version>
        </dependency>


        <!-- LightGBM -->
        <dependency>
            <groupId>com.microsoft.ml.spark</groupId>
            <artifactId>mmlspark_2.11</artifactId>
            <version>0.18.0</version>
        </dependency>
        <dependency>
            <groupId>com.microsoft.ml.lightgbm</groupId>
            <artifactId>lightgbmlib</artifactId>
            <version>2.2.350</version>
        </dependency>


        <!-- Spark dependencies -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.4</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.specs</groupId>
            <artifactId>specs</artifactId>
            <version>1.2.5</version>
            <scope>test</scope>
        </dependency>

        <!-- Flink dependencies
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-scala_2.11</artifactId>
            <version>1.7.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-scala_2.11</artifactId>
            <version>1.7.0</version>
        </dependency>
        <dependency>
            <groupId>net.minidev</groupId>
            <artifactId>json-smart</artifactId>
            <version>2.3</version>
            <scope>test</scope>
        </dependency>
        -->

    </dependencies>

    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
        <plugins>
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <scalaVersion>${scala.version}</scalaVersion>
                    <args>
                        <arg>-target:jvm-1.8</arg>
                    </args>
                </configuration>
            </plugin>
            <!-- Test-runner plugin; without explicit configuration Maven uses the defaults, bound to the test phase. Tests are skipped here. -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.19</version>
                <configuration>
                    <skipTests>true</skipTests>
                </configuration>
            </plugin>
            <!-- Packaging plugin: https://www.jianshu.com/p/d44f713b1ec9 -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>com.cntaiping.fintech.business.CarRenewalMain</mainClass>
                            <addClasspath>true</addClasspath>
                            <classpathPrefix>lib/</classpathPrefix>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-dependency-plugin</artifactId>
                <executions>
                    <execution>
                        <id>copy-dependencies</id>
                        <phase>package</phase>
                        <goals>
                            <goal>copy-dependencies</goal>
                        </goals>
                        <configuration>
                            <outputDirectory>${project.build.directory}/lib</outputDirectory>
                            <overWriteReleases>false</overWriteReleases>
                            <overWriteSnapshots>false</overWriteSnapshots>
                            <overWriteIfNewer>true</overWriteIfNewer>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

        </plugins>
    </build>
```
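Given the jar plugin's manifest (a main class plus a lib/-prefixed Class-Path) and the dependency plugin copying every dependency into target/lib, the build-and-submit flow looks roughly like this; the jar name follows the <artifactId>-<version>.jar pattern, and com.cntaiping.fintech.business.CarRenewalMain is the main class configured above:

```
mvn package
# dependencies land in target/lib; pass extra jars to spark-submit as a comma-separated --jars list if needed
spark-submit --class com.cntaiping.fintech.business.CarRenewalMain --master local[*] target/<artifactId>-<version>.jar
```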

5. Note: the dependencies include algorithm libraries such as XGBoost. If you use the commented-out system-scope variant instead of the Maven Central artifacts, place the open-source jars in the project's lib folder beforehand.

6. Key step: point IDEA at the correct Maven installation.

File -> Settings -> Maven

Maven home directory: the Maven installation to use (for example, the one bundled in the IDEA plugins directory)

User settings file: the location of the settings.xml file in the Maven installation

Local repository: the location of the local repository folder for that installation (a minimal settings.xml sketch follows)
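For reference, a minimal settings.xml sketch that pins the local repository; the path is an example, so point it at your own repo folder:

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
    <!-- should match the Local repository field configured in IDEA -->
    <localRepository>D:/maven/repo</localRepository>
</settings>
```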

7. In the Maven tool window, click the refresh button so the pom changes take effect.

8. Before trying any code, create a source folder:

File -> Project Structure

Under main, create a new folder named scala and mark it as Sources.

9. Run a sample program to confirm the setup works; a minimal sketch follows.
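This is a minimal sketch, assuming only the dependencies in the pom above; the object name and numbers are invented for illustration. Running it in IDEA should print a count without errors:

```scala
import org.apache.spark.sql.SparkSession

object SparkDemo {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside the IDE with all cores; drop it when submitting to a cluster
    val spark = SparkSession.builder()
      .appName("SparkDemo")
      .master("local[*]")
      .getOrCreate()

    // a tiny RDD job - enough to confirm the Scala/Spark dependencies resolve and run
    val nums = spark.sparkContext.parallelize(1 to 100)
    println(s"count = ${nums.count()}, sum = ${nums.reduce(_ + _)}")

    spark.stop()
  }
}
```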

To suppress the INFO log output, add a log4j.properties file under resources (described in detail in other articles); a sketch is shown below.
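A typical version, adapted from Spark's own conf/log4j.properties.template; raise or lower the levels to taste:

```properties
# send everything at WARN and above to the console; silences Spark's INFO chatter
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```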
