Packaging a Spark Application and Running It on a Cluster

There are two cases to handle: the project has external dependencies, or it does not.

1. No external dependencies

Open File ---> Project Structure in IntelliJ IDEA.

In Artifacts, click + and add an Empty JAR.

Give the artifact a name and add the class (the compiled module output) to it.

Build the artifact via Build ---> Build Artifacts.

As the last step, run the build.

Jar location: once the build finishes, the jar is created under the project directory at out/artifacts/{configured artifact name}/{configured artifact name}.jar.
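Before shipping the jar anywhere, it can be smoke-tested locally with spark-submit. This is only a sketch: the artifact name spark-wc is a placeholder, and the main class and input path are borrowed from the submit scripts later in this section; adjust them to your own project.

# Local smoke test of the IDEA-built jar; assumes SPARK_HOME points at a local Spark installation
# and that this machine can resolve the ns1 HDFS nameservice (otherwise pass a local file path).
${SPARK_HOME}/bin/spark-submit \
--class com.desheng.bigdata.spark.scala.core.p1._01SparkScalaWordCount \
--master local[2] \
out/artifacts/spark-wc/spark-wc.jar \
hdfs://ns1/data/hello.txt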

2. With external dependencies
    1. Add the Maven plugins to pom.xml (they belong inside the <build><plugins> section):

<plugin>
  <groupId>org.scala-tools</groupId>
  <artifactId>maven-scala-plugin</artifactId>
  <version>2.15.0</version>
  <executions>
    <execution>
      <goals>
        <goal>compile</goal>
        <goal>testCompile</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <scalaVersion>${scala.version}</scalaVersion>
    <args>
      <arg>-target:jvm-1.5</arg>
    </args>
  </configuration>
</plugin>

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-eclipse-plugin</artifactId>
  <version>2.10</version>
  <configuration>
    <downloadSources>true</downloadSources>
    <buildcommands>
      <buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
    </buildcommands>
    <additionalProjectnatures>
      <projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
    </additionalProjectnatures>
    <classpathContainers>
      <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
      <classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
    </classpathContainers>
  </configuration>
</plugin>

<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
    <archive>
      <!--<manifest>
        <mainClass></mainClass>
      </manifest>-->
    </archive>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <version>1.10</version>
  <executions>
    <execution>
      <id>add-source</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>add-source</goal>
      </goals>
      <configuration>
        <!-- Any number of additional source folders can be registered by adding more <source> elements here -->
        <sources>
          <source>src/main/java</source>
          <source>src/main/scala</source>
        </sources>
      </configuration>
    </execution>
  </executions>
</plugin>

    2. Run the packaging
        1. Visually: run the package phase from the IDE's Maven tool window.

        2. From the command line:

mvn clean package -DskipTests
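With the maven-assembly-plugin configuration above, the package phase produces two jars under target/: the normal jar and a fat jar whose name ends in -jar-with-dependencies.jar; the fat jar is the one to ship to the cluster. A quick check (the names below are only an illustration, the real ones follow the project's artifactId and version):

# Inspect the build output; the *-jar-with-dependencies.jar contains the application plus its dependencies.
ls target/*.jar
# e.g. spark-wc-1.0-SNAPSHOT.jar  spark-wc-1.0-SNAPSHOT-jar-with-dependencies.jar   (example names only)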

 

 

3. Spark-Submit
    1. Configure the spark-submit script
        1. Standalone

export HADOOP_CONF_DIR=/home/bigdata/app/hadoop/etc/hadoop
export SPARK_HOME=/home/bigdata/app/spark

${SPARK_HOME}/bin/spark-submit \
--class com.desheng.bigdata.spark.scala.core.p1._01SparkScalaWordCount \
--master spark://bigdata01:7077 \
--deploy-mode cluster \
--executor-memory 600m \
--total-executor-cores 2 \
hdfs://ns1/jars/spark/1810-bd/spark-wc.jar \
hdfs://ns1/data/hello.txt
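With --deploy-mode cluster on the standalone master, the driver is launched on one of the workers, so the application jar is given as an HDFS path that every node can read. The jar and the input file therefore need to be uploaded first; a minimal sketch, assuming the paths used in the script above and the fat jar built earlier (the local jar name is an example, use whatever mvn package actually produced):

# Upload the fat jar and the test input to the HDFS locations referenced by the submit script.
hdfs dfs -mkdir -p hdfs://ns1/jars/spark/1810-bd hdfs://ns1/data
hdfs dfs -put -f target/spark-wc-1.0-SNAPSHOT-jar-with-dependencies.jar hdfs://ns1/jars/spark/1810-bd/spark-wc.jar
hdfs dfs -put -f hello.txt hdfs://ns1/data/hello.txt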

 

 

        2. YARN
            1. Configuration

 

export HADOOP_CONF_DIR=/home/bigdata/app/hadoop/etc/hadoop
export SPARK_HOME=/home/bigdata/app/spark

${SPARK_HOME}/bin/spark-submit \
--class com.desheng.bigdata.spark.scala.core.p1._01SparkScalaWordCount \
--master yarn \
--deploy-mode cluster \
--executor-memory 600m \
--num-executors 1 \
hdfs://ns1/jars/spark/1810-bd/spark-wc.jar \
hdfs://ns1/data/hello.txt
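In YARN cluster mode the driver runs inside an ApplicationMaster container, so the program's console output ends up in the YARN container logs rather than in the submitting shell. A sketch of how to check a finished run (the application id is whatever YARN assigned, shown here as a hypothetical value; yarn logs requires log aggregation to be enabled):

# List recent applications, then fetch the aggregated logs of the finished run.
yarn application -list -appStates FINISHED
yarn logs -applicationId application_1234567890123_0001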

            2. Modify the yarn-site.xml configuration file

Add the following to yarn-site.xml on every node (these settings disable the NodeManager's physical- and virtual-memory checks, which would otherwise kill containers that exceed their memory limits):

    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>

Then restart the YARN cluster.
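A minimal sketch of the restart, assuming HADOOP_HOME is /home/bigdata/app/hadoop as implied by the HADOOP_CONF_DIR exported above:

# Restart YARN so the new NodeManager settings take effect on every node.
/home/bigdata/app/hadoop/sbin/stop-yarn.sh
/home/bigdata/app/hadoop/sbin/start-yarn.sh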
