While running Spark on YARN, an error occurred. The log is attached first:
18/08/11 20:29:29 INFO yarn.Client:
client token: N/A
diagnostics: Application application_1533988876407_0004 failed 2 times due to AM Container for appattempt_1533988876407_0004_000002 exited with exitCode: -103
For more detailed output, check application tracking page:http://hadoop102:8088/cluster/app/application_1533988876407_0004Then, click on links to logs of each attempt.
Diagnostics: Container [pid=16377,containerID=container_1533988876407_0004_02_000001] is running beyond virtual memory limits. Current usage: 59.9 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1533988876407_0004_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 16377 16375 16377 16377 (bash) 0 0 115851264 305 /bin/bash -c /opt/module/jdk1.8.0_144/bin/java -server -Xmx512m -Djava.io.tmpdir=/opt/module/hadoop-2.7.2/data/tmp/nm-local-dir/usercache/atguigu/appcache/application_1533988876407_0004/container_1533988876407_0004_02_000001/tmp -Dspark.yarn.app.container.log.dir=/opt/module/hadoop-2.7.2/logs/userlogs/application_1533988876407_0004/container_1533988876407_0004_02_000001 org.apache.spark.deploy.yarn.ExecutorLauncher --arg '192.168.180.101:33367' --properties-file /opt/module/hadoop-2.7.2/data/tmp/nm-local-dir/usercache/atguigu/appcache/application_1533988876407_0004/container_1533988876407_0004_02_000001/__spark_conf__/__spark_conf__.properties 1> /opt/module/hadoop-2.7.2/logs/userlogs/application_1533988876407_0004/container_1533988876407_0004_02_000001/stdout 2> /opt/module/hadoop-2.7.2/logs/userlogs/application_1533988876407_0004/container_1533988876407_0004_02_000001/stderr
|- 16381 16377 16377 16377 (java) 98 80 2257596416 15038 /opt/module/jdk1.8.0_144/bin/java -server -Xmx512m -Djava.io.tmpdir=/opt/module/hadoop-2.7.2/data/tmp/nm-local-dir/usercache/atguigu/appcache/application_1533988876407_0004/container_1533988876407_0004_02_000001/tmp -Dspark.yarn.app.container.log.dir=/opt/module/hadoop-2.7.2/logs/userlogs/application_1533988876407_0004/container_1533988876407_0004_02_000001 org.apache.spark.deploy.yarn.ExecutorLauncher --arg 192.168.180.101:33367 --properties-file /opt/module/hadoop-2.7.2/data/tmp/nm-local-dir/usercache/atguigu/appcache/application_1533988876407_0004/container_1533988876407_0004_02_000001/__spark_conf__/__spark_conf__.properties
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1533990552464
final status: FAILED
tracking URL: http://hadoop102:8088/cluster/app/application_1533988876407_0004
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at com.agg.wordcount.Application$.delayedEndpoint$com$agg$wordcount$Application$1(Application.scala:11)
at com.agg.wordcount.Application$delayedInit$body.apply(Application.scala:5)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at com.agg.wordcount.Application$.main(Application.scala:5)
at com.agg.wordcount.Application.main(Application.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at com.agg.wordcount.Application$.delayedEndpoint$com$agg$wordcount$Application$1(Application.scala:11)
at com.agg.wordcount.Application$delayedInit$body.apply(Application.scala:5)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at com.agg.wordcount.Application$.main(Application.scala:5)
at com.agg.wordcount.Application.main(Application.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
A careless reading suggests the problem is the two Exception stack traces, but it really isn't. The actual cause is in these lines further up:
Diagnostics: Container [pid=16377,containerID=container_1533988876407_0004_02_000001] is running beyond virtual memory limits. Current usage: 59.9 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
Roughly, it means the memory was insufficient: the container exceeded its virtual memory limit, so YARN killed it and refused to let the application run.
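The 2.1 GB ceiling in the message comes from yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1 times the container's physical memory allocation:

virtual memory limit = container physical memory × yarn.nodemanager.vmem-pmem-ratio
                     = 1 GB × 2.1
                     = 2.1 GB

The ApplicationMaster's JVM happened to map 2.2 GB of virtual address space, which is over that limit, so the NodeManager killed the container even though only 59.9 MB of physical memory was actually in use.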
Solution: add the following two properties to Hadoop's yarn-site.xml configuration file.
<!-- Whether to start a thread that checks the amount of virtual memory each task is using; if a task exceeds its allocation, it is killed outright. -->
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
<!-- Maximum amount of virtual memory a task may use per 1 MB of physical memory; the default is 2.1. -->
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
</property>
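For the change to take effect, the updated yarn-site.xml has to be present on every NodeManager and YARN has to be restarted. A minimal sketch, assuming the other nodes are named hadoop103 and hadoop104 (only hadoop102 appears in the log above, so adjust the host names, or use your own sync script, to match your cluster):

# Copy the updated config to the other NodeManager hosts (host names are assumptions)
scp /opt/module/hadoop-2.7.2/etc/hadoop/yarn-site.xml hadoop103:/opt/module/hadoop-2.7.2/etc/hadoop/
scp /opt/module/hadoop-2.7.2/etc/hadoop/yarn-site.xml hadoop104:/opt/module/hadoop-2.7.2/etc/hadoop/

# Restart YARN on the ResourceManager node (hadoop102 here, per the tracking URL in the log)
stop-yarn.sh
start-yarn.sh

After the restart, resubmit the Spark job and the virtual memory check should no longer kill the ApplicationMaster container.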