Yarn error: Error in handling event type NODE_UPDATE to the Event Dispatcher

The full error output is as follows:

2020-10-14 15:31:00,068 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerNode: Assigned container container_1602660632708_0001_01_000055 of capacity <memory:1024, vCores:1> on host hddatanode02:8041, which has 25 containers, <memory:25600, vCores:25> used and <memory:4529, vCores:-17> available after allocation
2020-10-14 15:31:00,068 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_UPDATE to the Event Dispatcher
java.lang.NullPointerException
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.decResourceRequest(LocalityAppPlacementAllocator.java:302)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocateNodeLocal(LocalityAppPlacementAllocator.java:288)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.LocalityAppPlacementAllocator.allocate(LocalityAppPlacementAllocator.java:400)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:430)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoAppAttempt.allocate(FifoAppAttempt.java:83)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainer(FifoScheduler.java:702)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignNodeLocalContainers(FifoScheduler.java:627)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainersOnNode(FifoScheduler.java:589)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:518)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:971)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:761)
	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:103)
	at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
	at java.lang.Thread.run(Thread.java:748)
2020-10-14 15:31:00,075 INFO org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..
2020-10-14 15:31:00,078 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2020-10-14 15:31:00,079 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.w.WebAppContext@7fb48179{/,null,UNAVAILABLE}{/cluster}
2020-10-14 15:31:00,082 INFO org.eclipse.jetty.server.AbstractConnector: Stopped ServerConnector@650ae78c{HTTP/1.1,[http/1.1]}{hdnamenode01:8088}
2020-10-14 15:31:00,082 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@c1fca1e{/static,jar:file:/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hadoop-yarn-common-3.0.0-cdh6.0.1.jar!/webapps/static,UNAVAILABLE}
2020-10-14 15:31:00,083 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@5bd1ceca{/logs,file:///var/log/hadoop-yarn/,UNAVAILABLE}
2020-10-14 15:31:00,084 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2020-10-14 15:31:00,085 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8032
2020-10-14 15:31:00,085 INFO org.apache.hadoop.ipc.Server: Stopping server on 8033
2020-10-14 15:31:00,085 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2020-10-14 15:31:00,085 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
2020-10-14 15:31:00,085 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8033
2020-10-14 15:31:00,085 WARN org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher: org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread interrupted. Returning.
2020-10-14 15:31:00,085 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2020-10-14 15:31:00,085 INFO org.apache.hadoop.ipc.Server: Stopping server on 8030
2020-10-14 15:31:00,090 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8030
2020-10-14 15:31:00,090 INFO org.apache.hadoop.ipc.Server: Stopping server on 8031
2020-10-14 15:31:00,091 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2020-10-14 15:31:00,091 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2020-10-14 15:31:00,091 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8031

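When a large number of queries are run back to back, the RM hangs and stops allocating resources. At the point where the RM hangs, a NullPointerException is thrown at RegularContainerAllocator.getLocalityWaitFactor. In the trace above, where the cluster uses FifoScheduler, the NullPointerException surfaces in LocalityAppPlacementAllocator.decResourceRequest.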

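The log shows that a large number of containers were allocated just before the RM hung: the last allocation left hddatanode02 with 25 containers and <memory:4529, vCores:-17> available. This is what leads to Error in handling event type NODE_UPDATE to the Event Dispatcher, after which the EventDispatcher exits and the RM transitions to standby.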
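To spot this pattern in a ResourceManager log before the RM goes down, one option is to scan the "Assigned container" lines for nodes whose available resources have gone negative. The following is a minimal sketch, assuming the log line format shown above; the function name oversubscribed_nodes and the regular expression are illustrative and not part of Hadoop.

```python
import re

# Pattern for the "Assigned container" lines written by FiCaSchedulerNode,
# modeled on the single log line quoted in this post (an assumption: other
# Hadoop versions may format the line differently).
ASSIGNED = re.compile(
    r"Assigned container (?P<container>\S+) "
    r"of capacity <memory:(?P<req_mem>\d+), vCores:(?P<req_vcores>\d+)> "
    r"on host (?P<host>[^\s,]+), which has (?P<count>\d+) containers, "
    r"<memory:(?P<used_mem>-?\d+), vCores:(?P<used_vcores>-?\d+)> used and "
    r"<memory:(?P<free_mem>-?\d+), vCores:(?P<free_vcores>-?\d+)> available"
)


def oversubscribed_nodes(lines):
    """Yield (host, container_count, free_memory, free_vcores) for every
    allocation that left a node with negative available memory or vCores."""
    for line in lines:
        m = ASSIGNED.search(line)
        if m is None:
            continue  # not an allocation line
        free_mem = int(m["free_mem"])
        free_vcores = int(m["free_vcores"])
        if free_mem < 0 or free_vcores < 0:
            yield m["host"], int(m["count"]), free_mem, free_vcores


# The allocation line from the log above.
sample = (
    "2020-10-14 15:31:00,068 INFO org.apache.hadoop.yarn.server.resourcemanager."
    "scheduler.common.fica.FiCaSchedulerNode: Assigned container "
    "container_1602660632708_0001_01_000055 of capacity <memory:1024, vCores:1> "
    "on host hddatanode02:8041, which has 25 containers, "
    "<memory:25600, vCores:25> used and <memory:4529, vCores:-17> available "
    "after allocation"
)

hits = list(oversubscribed_nodes([sample]))
for host, count, free_mem, free_vcores in hits:
    print(f"{host}: {count} containers, memory free {free_mem}, vCores free {free_vcores}")
```

In practice the same function could be fed an open log file, since it accepts any iterable of lines.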

Related issues:

  • https://issues.apache.org/jira/browse/YARN-8462
  • https://issues.apache.org/jira/browse/YARN-8193

The related fixes were made in Hadoop 3.1.1, 3.2.0, and 2.10.1.
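As a rough way to decide whether a given cluster is affected, the version string can be compared against the releases listed above. This is a sketch under stated assumptions: it looks only at the upstream major.minor.patch prefix, so it cannot tell whether a vendor build (for example the hadoop-yarn-common-3.0.0-cdh6.0.1 jar seen in the log) has had the patch backported, and it assumes every release line after 3.2 includes the fix. The function names are hypothetical.

```python
import re


def parse_version(text):
    """Return the leading major.minor.patch of a version string as ints.
    Vendor suffixes such as '-cdh6.0.1' are ignored."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in m.groups())


def has_upstream_fix(text):
    """True if the upstream version is at or past a release the post lists
    as fixed: 2.10.1 on the 2.x line, 3.1.1 on the 3.1 line, or 3.2.0+."""
    version = parse_version(text)
    major, minor, _patch = version
    if major == 2:
        return version >= (2, 10, 1)
    if major == 3:
        if minor == 0:
            return False  # no 3.0.x release is listed as fixed
        if minor == 1:
            return version >= (3, 1, 1)
        return True  # 3.2.0 and later
    # Assumption: release lines after 3.x carry the fix; 1.x and older do not.
    return major > 3


for v in ["3.0.0-cdh6.0.1", "3.1.0", "3.1.1", "3.2.0", "2.10.0", "2.10.1"]:
    print(v, has_upstream_fix(v))
```

On a live cluster the version string would typically come from the output of `hadoop version`.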

Solutions:

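  • Upgrade Hadoop to a version that includes the fix
  • Apply the patches from the related issues to the current version
  • Avoid running a large number of queries back to back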