Spark教程——(2)编写spark-submit测试Demo

创建Maven项目:

填写Maven的pom文件如下:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.world.chenfei</groupId>
    <artifactId>JavaSparkPi</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <encoding>UTF-8</encoding>
        <spark.version>2.1.0</spark.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>org.world.chenfei.JavaSparkPi</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>


</project>

编写一个蒙特卡罗求PI的代码:

package org.world.chenfei;

import java.util.ArrayList;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;

public class JavaSparkPi {
    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf().setAppName("JavaSparkPi")/*.setMaster("local[2]")*/;
        JavaSparkContext jsc = new JavaSparkContext(sparkConf);

        int slices = (args.length == 1) ? Integer.parseInt(args[0]) : 2;
        int n = 100000 * slices;
        List<Integer> l = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) {
            l.add(i);
        }

        JavaRDD<Integer> dataSet = jsc.parallelize(l, slices);

        int count = dataSet.map(new Function<Integer, Integer>() {
            @Override
            public Integer call(Integer integer) {
                double x = Math.random() * 2 - 1;
                double y = Math.random() * 2 - 1;
                return (x * x + y * y < 1) ? 1 : 0;
            }
        }).reduce(new Function2<Integer, Integer, Integer>() {
            @Override
            public Integer call(Integer integer, Integer integer2) {
                return integer + integer2;
            }
        });

        System.out.println("Pi is roughly " + 4.0 * count / n);

        jsc.stop();
    }
}

将本项目打包为Jar文件:

 此时在target目录下,就会生成这个项目的Jar包

尝试将该jar包在本地执行:

C:\Users\Administrator\Desktop\swap>java -jar JavaSparkPi-1.0-SNAPSHOT.jar

执行失败,并返回如下信息:

C:\Users\Administrator\Desktop\swap>java -jar JavaSparkPi-1.0-SNAPSHOT.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/04/28 16:24:30 INFO SparkContext: Running Spark version 2.1.0
19/04/28 16:24:30 WARN SparkContext: Support for Scala 2.10 is deprecated as of
Spark 2.1.0
19/04/28 16:24:30 WARN NativeCodeLoader: Unable to load native-hadoop library fo
r your platform... using builtin-java classes where applicable
19/04/28 16:24:30 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration at org.apache.spark.SparkContext.<init>(SparkContext.scala:379) at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.sc ala:58) at org.world.chenfei.JavaSparkPi.main(JavaSparkPi.java:15) 19/04/28 16:24:30 INFO SparkContext: Successfully stopped SparkContext Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration at org.apache.spark.SparkContext.<init>(SparkContext.scala:379) at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.sc ala:58) at org.world.chenfei.JavaSparkPi.main(JavaSparkPi.java:15)

 

将Jar包上传到服务器上,并执行以下命令:

spark-submit --class org.world.chenfei.JavaSparkPi --executor-memory 500m --total-executor-cores 1 /home/cf/JavaSparkPi-1.0-SNAPSHOT.jar

执行成功,并返回如下信息:

[root@node1 ~]# spark-submit --class org.world.chenfei.JavaSparkPi --executor-memory 500m --total-executor-cores 1 /home/cf/JavaSparkPi-1.0-SNAPSHOT.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/avro-tools-1.7.6-cdh5.14.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
……
19/04/28 14:56:23 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 45426.
19/04/28 14:56:23 INFO spark.SparkEnv: Registering MapOutputTracker
19/04/28 14:56:23 INFO spark.SparkEnv: Registering BlockManagerMaster
19/04/28 14:56:23 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-97788ddb-d5eb-48ce-aa9b-e030102dd06c
……
19/04/28 14:56:25 INFO util.Utils: Fetching spark://10.200.101.131:41504/jars/JavaSparkPi-1.0-SNAPSHOT.jar to /tmp/spark-e96463c3-1979-4247-957c-b381f65ddc88/userFiles-666197fa-738d-41e1-a670-a758af1ef9e1/fetchFileTemp2787870198743975902.tmp
19/04/28 14:56:25 INFO executor.Executor: Adding file:/tmp/spark-e96463c3-1979-4247-957c-b381f65ddc88/userFiles-666197fa-738d-41e1-a670-a758af1ef9e1/JavaSparkPi-1.0-SNAPSHOT.jar to class loader
19/04/28 14:56:25 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 1031 bytes result sent to driver
19/04/28 14:56:25 INFO executor.Executor: Finished task 1.0 in stage 0.0 (TID 1). 1031 bytes result sent to driver
19/04/28 14:56:25 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 549 ms on localhost (executor driver) (1/2)
19/04/28 14:56:25 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 670 ms on localhost (executor driver) (2/2)
19/04/28 14:56:25 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
19/04/28 14:56:25 INFO scheduler.DAGScheduler: ResultStage 0 (reduce at JavaSparkPi.java:33) finished in 0.682 s
19/04/28 14:56:25 INFO scheduler.DAGScheduler: Job 0 finished: reduce at JavaSparkPi.java:33, took 1.102582 s
……
Pi is roughly 3.14016
……
19/04/28 14:56:26 INFO ui.SparkUI: Stopped Spark web UI at http://10.200.101.131:4040
19/04/28 14:56:26 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
19/04/28 14:56:26 INFO storage.MemoryStore: MemoryStore cleared
19/04/28 14:56:26 INFO storage.BlockManager: BlockManager stopped
19/04/28 14:56:26 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
19/04/28 14:56:26 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/04/28 14:56:26 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
19/04/28 14:56:26 INFO spark.SparkContext: Successfully stopped SparkContext
19/04/28 14:56:26 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
19/04/28 14:56:26 INFO util.ShutdownHookManager: Shutdown hook called
19/04/28 14:56:26 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-e96463c3-1979-4247-957c-b381f65ddc88

计算结果为:

Pi is roughly 3.14016

 

转载于:https://www.cnblogs.com/ratels/p/10784723.html

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值