idea环境开发spark程序

Mac安装idea后可以方便的进行spark相关代码开发。假设idea上已安装了scala插件(安装过程可百度)。

第一步:新建一个maven工程及文件,在src目录下添加

工程结构如图:

 
NaiveBayesExample.scala文件

文件内容:

import org.apache.spark.{SparkConf, SparkContext}
// $example on$
import org.apache.spark.mllib.classification.{NaiveBayes, NaiveBayesModel}
import org.apache.spark.mllib.util.MLUtils
// $example off$

object NaiveBayesExample {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("NaiveBayesExample")
    val sc = new SparkContext(conf)
    // $example on$
    // Load and parse the data file.
    val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")

    // Split data into training (60%) and test (40%).
    val Array(training, test) = data.randomSplit(Array(0.6, 0.4))

    val model = NaiveBayes.train(training, lambda = 1.0, modelType = "multinomial")

    val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))
    val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()

    // Save and load model
    model.save(sc, "target/tmp/myNaiveBayesModel1")
    val sameModel = NaiveBayesModel.load(sc, "target/tmp/myNaiveBayesModel1")
    // $example off$

    sc.stop()
  }
}

第二步,添加pom文件内容,如下:

<repositories>
    <repository>
        <id>scala-tools.org</id>
        <name>Scala-Tools Maven2 Repository</name>
        <url>http://scala-tools.org/repo-releases</url>
    </repository>
</repositories>

<properties>
    <scala.version>2.11</scala.version>
    <hadoop.version>2.6.5</hadoop.version>
    <scala.minor.version>2.11</scala.minor.version>
    <scala.complete.version>${scala.minor.version}.8</scala.complete.version>
    <spark.version>2.2.0</spark.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency> <!-- Spark dependency -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_2.10</artifactId>
        <version>1.6.1</version>
        <scope>provided</scope>
    </dependency>

</dependencies>

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-enforcer-plugin</artifactId>
            <version>1.4.1</version>
            <executions>
                <execution>
                    <id>enforce</id>
                    <configuration>
                        <rules>
                            <requireModuleConvergence/>
                            <!--        <requireMavenVersion>
                                        <version>${maven.version.min}</version>
                                    </requireMavenVersion>-->
                            <requireJavaVersion>
                                <version>${java.version}</version>
                            </requireJavaVersion>
                        </rules>
                    </configuration>
                    <goals>
                        <goal>enforce</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.2.2</version>
            <configuration>
                <scalaVersion>${scala.complete.version}</scalaVersion>
                <args>
                    <arg>-unchecked</arg>
                    <arg>-deprecation</arg>
                    <arg>-feature</arg>
                </args>
                <javacArgs>
                    <javacArg>-source</javacArg>
                    <javacArg>${java.version}</javacArg>
                    <javacArg>-target</javacArg>
                    <javacArg>${java.version}</javacArg>
                </javacArgs>
            </configuration>
            <executions>
                <execution>
                    <phase>compile</phase>
                    <goals>
                        <goal>compile</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.6</version>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
                <recompressZippedFiles>false</recompressZippedFiles>
                <skipAssembly>false</skipAssembly>
            </configuration>

            <!--<executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>-->
        </plugin>
    </plugins>
</build>

第三步,编译打包:
在工程目录下输入命令:mvn clean scala:compile compile package

第四部,运行程序

为了方便运行,将生成的jar包:/Applications/code/spark_test/target/spark_test-1.0-SNAPSHOT.jar

拷贝到:/applications/spark_test.jar

输入:./bin/spark-submit --class NaiveBayesExample --master spark://SH-snow.local:7077 /applications/spark_test.jar

运行即可。

 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值