hadoop问题集:3.yarn nodemanager是unhealthy的

从RM上看到有unhealy的NM,查了下NM的日志发现:

2018-04-12 20:05:56,277 WARN org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection: Directory /webser/hadoop-hadoop/tmp/nm-local-dir error, used space above threshold of 90.0%, removing from list of valid directories
2018-04-12 20:05:56,277 WARN org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection: Directory /home/hadoop/hadoop2/logs/userlogs error, used space above threshold of 90.0%, removing from list of valid directories
2018-04-12 20:05:56,277 INFO org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService: Disk(s) failed: 1/1 local-dirs are bad: /webser/hadoop-hadoop/tmp/nm-local-dir; 1/1 log-dirs are bad: /home/hadoop/hadoop2/logs/userlogs
2018-04-12 20:05:56,277 ERROR org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService: Most of the disks failed. 1/1 local-dirs are bad: /webser/hadoop-hadoop/tmp/nm-local-dir; 1/1 log-dirs are bad: /home/hadoop/hadoop2/logs/userlogs

说明yarn的本地目录已经到90%了,不再向里写数据了

# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        99G   84G  9.5G  90% /
tmpfs           3.9G     0  3.9G   0% /dev/shm

清理服务器上的垃圾即可。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
[2023-07-04 17:11:29.952]Exception when trying to cleanup container container_e10_1661450914423_18596_01_000003: java.io.IOException: Problem signalling container 97181 with SIGTERM; output: null and exitCode: -1 at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:750) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.sendSignal(ContainerLaunch.java:908) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.signalProcess(ContainerLaunch.java:922) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:774) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:173) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:62) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:221) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:143) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Signal container failed at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:163) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:159) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:739) ... 8 more | org.apache.flink.yarn.YarnResourceManager (ResourceManager.java:822)
最新发布
07-12

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值