Packaging a Spark Application and Running It on a Cluster

There are two cases to handle: the project has external dependencies, or it does not.

1. No external dependencies

Open File ---> Project Structure in IntelliJ IDEA.

In Artifacts, click + and add an Empty JAR.

Give the artifact a name and add the class (the compiled module output) to it.

Build the artifact via Build ---> Build Artifacts.

As the last step, run the build.

Jar location: once the build finishes, the jar is created under the project directory at out/artifacts/{configured artifact name}/{configured artifact name}.jar.
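Before shipping the jar anywhere, it can be smoke-tested locally with spark-submit. This is only a sketch: the artifact name spark-wc is a placeholder, and the main class and input path are borrowed from the submit scripts later in this section; adjust them to your own project.

# Local smoke test of the IDEA-built jar; assumes SPARK_HOME points at a local Spark installation
# and that this machine can resolve the ns1 HDFS nameservice (otherwise pass a local file path).
${SPARK_HOME}/bin/spark-submit \
--class com.desheng.bigdata.spark.scala.core.p1._01SparkScalaWordCount \
--master local[2] \
out/artifacts/spark-wc/spark-wc.jar \
hdfs://ns1/data/hello.txt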

2. With external dependencies
    1. Add the Maven plugins to pom.xml (they belong inside the <build><plugins> section):

<plugin>
  <groupId>org.scala-tools</groupId>
  <artifactId>maven-scala-plugin</artifactId>
  <version>2.15.0</version>
  <executions>
    <execution>
      <goals>
        <goal>compile</goal>
        <goal>testCompile</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <scalaVersion>${scala.version}</scalaVersion>
    <args>
      <arg>-target:jvm-1.5</arg>
    </args>
  </configuration>
</plugin>

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-eclipse-plugin</artifactId>
  <version>2.10</version>
  <configuration>
    <downloadSources>true</downloadSources>
    <buildcommands>
      <buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
    </buildcommands>
    <additionalProjectnatures>
      <projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
    </additionalProjectnatures>
    <classpathContainers>
      <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
      <classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
    </classpathContainers>
  </configuration>
</plugin>

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
    <archive>
      <!--<manifest>
        <mainClass></mainClass>
      </manifest>-->
    </archive>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <version>1.10</version>
  <executions>
    <execution>
      <id>add-source</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>add-source</goal>
      </goals>
      <configuration>
        <!-- Any number of additional source folders can be registered by adding more <source> elements here -->
        <sources>
          <source>src/main/java</source>
          <source>src/main/scala</source>
        </sources>
      </configuration>
    </execution>
  </executions>
</plugin>

    2. Run the packaging
        1. Visually: run the package phase from the IDE's Maven tool window.

        2. From the command line:

mvn clean package -DskipTests
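With the maven-assembly-plugin configuration above, the package phase produces two jars under target/: the normal jar and a fat jar whose name ends in -jar-with-dependencies.jar; the fat jar is the one to ship to the cluster. A quick check (the names below are only an illustration, the real ones follow the project's artifactId and version):

# Inspect the build output; the *-jar-with-dependencies.jar contains the application plus its dependencies.
ls target/*.jar
# e.g. spark-wc-1.0-SNAPSHOT.jar  spark-wc-1.0-SNAPSHOT-jar-with-dependencies.jar   (example names only)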

 

 

3. Spark-Submit
    1. Configure the spark-submit script
        1. Standalone

export HADOOP_CONF_DIR=/home/bigdata/app/hadoop/etc/hadoop
export SPARK_HOME=/home/bigdata/app/spark

${SPARK_HOME}/bin/spark-submit \
--class com.desheng.bigdata.spark.scala.core.p1._01SparkScalaWordCount \
--master spark://bigdata01:7077 \
--deploy-mode cluster \
--executor-memory 600m \
--total-executor-cores 2 \
hdfs://ns1/jars/spark/1810-bd/spark-wc.jar \
hdfs://ns1/data/hello.txt
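With --deploy-mode cluster on the standalone master, the driver is launched on one of the workers, so the application jar is given as an HDFS path that every node can read. The jar and the input file therefore need to be uploaded first; a minimal sketch, assuming the paths used in the script above and the fat jar built earlier (the local jar name is an example, use whatever mvn package actually produced):

# Upload the fat jar and the test input to the HDFS locations referenced by the submit script.
hdfs dfs -mkdir -p hdfs://ns1/jars/spark/1810-bd hdfs://ns1/data
hdfs dfs -put -f target/spark-wc-1.0-SNAPSHOT-jar-with-dependencies.jar hdfs://ns1/jars/spark/1810-bd/spark-wc.jar
hdfs dfs -put -f hello.txt hdfs://ns1/data/hello.txt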

 

 

        2. YARN
            1. Configuration

 

export HADOOP_CONF_DIR=/home/bigdata/app/hadoop/etc/hadoop
export SPARK_HOME=/home/bigdata/app/spark

${SPARK_HOME}/bin/spark-submit \
--class com.desheng.bigdata.spark.scala.core.p1._01SparkScalaWordCount \
--master yarn \
--deploy-mode cluster \
--executor-memory 600m \
--num-executors 1 \
hdfs://ns1/jars/spark/1810-bd/spark-wc.jar \
hdfs://ns1/data/hello.txt
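In YARN cluster mode the driver runs inside an ApplicationMaster container, so the program's console output ends up in the YARN container logs rather than in the submitting shell. A sketch of how to check a finished run (the application id is whatever YARN assigned, shown here as a hypothetical value; yarn logs requires log aggregation to be enabled):

# List recent applications, then fetch the aggregated logs of the finished run.
yarn application -list -appStates FINISHED
yarn logs -applicationId application_1234567890123_0001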

            2. Modify the yarn-site.xml configuration file

Add the following to yarn-site.xml on every node (these settings disable the NodeManager's physical- and virtual-memory checks, which would otherwise kill containers that exceed their memory limits):

    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>

Then restart the YARN cluster.
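A minimal sketch of the restart, assuming HADOOP_HOME is /home/bigdata/app/hadoop as implied by the HADOOP_CONF_DIR exported above:

# Restart YARN so the new NodeManager settings take effect on every node.
/home/bigdata/app/hadoop/sbin/stop-yarn.sh
/home/bigdata/app/hadoop/sbin/start-yarn.sh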
