Yarn application has already ended! It might have been killed or unable to launch application master

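Environment: Ambari + HDP 2.7.3

Background: the NameNode server ran into a fault and rebooted.

Problem: a PySpark script that used to run fine now fails at run time with the error "Yarn application has already ended! It might have been killed or unable to launch application master".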

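Troubleshooting steps:

1. Restarted YARN in Ambari. The problem persisted.

2. Restarted HDFS in Ambari. The problem persisted.

3. Restarted Spark in Ambari. The problem persisted.

4. Wrote a test script that runs Spark in local mode. It ran normally, which suggested the problem was on the YARN side.

5. Ran Ambari's "Run Service Check" on YARN, which failed with: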

  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'yarn org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls -num_containers 1 -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -timeout 300000 --queue default' returned 2. 19/01/25 06:03:17 INFO distributedshell.Client: Initializing Client
19/01/25 06:03:17 INFO distributedshell.Client: Running Client
19/01/25 06:03:17 INFO client.RMProxy: Connecting to ResourceManager at lntbdnn1.lnt/10.250.10.67:8050
19/01/25 06:03:17 INFO client.AHSProxy: Connecting to Application History server at lntbddn2.lnt/10.250.10.69:10200
19/01/25 06:03:17 INFO distributedshell.Client: Got Cluster metric info from ASM, numNodeManagers=4
19/01/25 06:03:17 INFO distributedshell.Client: Got Cluster node info from ASM
19/01/25 06:03:17 INFO distributedshell.Client: Got node report from ASM for, nodeId=lntbddn1:45454, nodeAddresslntbddn1:8042, nodeRackName/default-rack, nodeNumContainers0
19/01/25 06:03:17 INFO distributedshell.Client: Got node report from ASM for, nodeId=lntbdnn1:45454, nodeAddresslntbdnn1:8042, nodeRackName/default-rack, nodeNumContainers0
19/01/25 06:03:17 INFO distributedshell.Client: Got node report from ASM for, nodeId=lntbddn3:45454, nodeAddresslntbddn3:8042, nodeRackName/default-rack, nodeNumContainers0
19/01/25 06:03:17 INFO distributedshell.Client: Got node report from ASM for, nodeId=lntbddn2:45454, nodeAddresslntbddn2:8042, nodeRackName/default-rack, nodeNumContainers0
19/01/25 06:03:17 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.0, queueMaxCapacity=1.0, queueApplicationCount=0, queueChildQueueCount=0
19/01/25 06:03:17 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS
19/01/25 06:03:17 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=ADMINISTER_QUEUE
19/01/25 06:03:17 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
19/01/25 06:03:17 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE
19/01/25 06:03:17 INFO distributedshell.Client: User ACL Info for Queue, queueName=llap, userAcl=SUBMIT_APPLICATIONS
19/01/25 06:03:17 INFO distributedshell.Client: User ACL Info for Queue, queueName=llap, userAcl=ADMINISTER_QUEUE
19/01/25 06:03:17 INFO distributedshell.Client: Max mem capability of resources in this cluster 98304
19/01/25 06:03:17 INFO distributedshell.Client: Max virtual cores capabililty of resources in this cluster 25
19/01/25 06:03:17 INFO distributedshell.Client: Copy App Master jar from local filesystem and add to local environment
19/01/25 06:03:18 INFO distributedshell.Client: Set the environment for the application master
19/01/25 06:03:18 INFO distributedshell.Client: Setting up app master command
19/01/25 06:03:18 INFO distributedshell.Client: Completed setting up app master command {{JAVA_HOME}}/bin/java -Xmx100m 

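Then I looked at my local time: it was 14:20, while the log lines above are stamped 06:03. I checked the time on each server and found that the master server's clock was 8 hours behind. After I corrected the master server's time, the script ran normally again and the problem was solved.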
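
The comparison that exposed the problem can be sketched in a few lines of Python: parse the timestamp at the start of a log line and subtract a trusted reference time. This is only an illustrative sketch. The function names, the 5-minute tolerance, and the assumption that the 14:20 local time falls on the same date as the log (2019-01-25) are mine, not from the cluster or from any YARN setting.

```python
from datetime import datetime, timedelta

# The log lines in this post start with a "yy/MM/dd HH:mm:ss" timestamp,
# e.g. "19/01/25 06:03:17 INFO distributedshell.Client: ...".
LOG_TS_FORMAT = "%y/%m/%d %H:%M:%S"
LOG_TS_LENGTH = 17  # len("19/01/25 06:03:17")


def parse_log_timestamp(line):
    """Return the timestamp found at the start of a log line."""
    return datetime.strptime(line[:LOG_TS_LENGTH], LOG_TS_FORMAT)


def clock_offset(log_line, reference_time):
    """Offset of the host that wrote log_line relative to reference_time.

    A negative result means the logging host's clock is behind.
    """
    return parse_log_timestamp(log_line) - reference_time


def is_skewed(offset, tolerance=timedelta(minutes=5)):
    """True if the offset exceeds the tolerance (5 minutes is an arbitrary choice)."""
    return abs(offset) > tolerance


line = "19/01/25 06:03:17 INFO distributedshell.Client: Initializing Client"
# Assumed reference: the post only gives the local time as 14:20;
# the date is taken from the log line.
local_now = datetime(2019, 1, 25, 14, 20, 0)

offset = clock_offset(line, local_now)
hours_off = offset.total_seconds() / 3600
print(f"offset: {hours_off:.2f} hours, skewed: {is_skewed(offset)}")
```

The same check can be repeated with one log line from each host. A large offset on a single machine points at that machine's clock rather than at YARN, HDFS, or Spark.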
