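Troubleshooting steps:
The YARN web UI showed every available resource as 0. (The screenshot below was taken after the fix.)
I searched a lot of material online. Some posts suggest setting yarn.nodemanager.vmem-check-enabled to false, which turns off the virtual-memory check on containers. I tried it and it made no difference.
Then I watched the Hadoop logs: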
tail -f hadoop-root-nodemanager-craw-node212.log
tail -f hadoop-root-resourcemanager-craw-node212.log
hadoop-root-nodemanager-craw-node212.log contained the following:
2022-04-08 14:00:50,775 WARN org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection: Directory /home/data/software/hadoop-3.2.2/data/tmp/nm-local-dir error, used space above threshold of 90.0%, removing from list of valid directories
2022-04-08 14:00:50,775 WARN org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection: Directory /home/data/software/hadoop-3.2.2/logs/userlogs error, used space above threshold of 90.0%, removing from list of valid directories
2022-04-08 14:00:50,776 INFO org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService: Disk(s) failed: 1/1 local-dirs usable space is below configured utilization percentage/no more usable space [ /home/data/software/hadoop-3.2.2/data/tmp/nm-local-dir : used space above threshold of 90.0% ] ; 1/1 log-dirs usable space is below configured utilization percentage/no more usable space [ /home/data/software/hadoop-3.2.2/logs/userlogs : used space above threshold of 90.0% ]
2022-04-08 14:00:50,776 ERROR org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService: Most of the disks failed. 1/1 local-dirs usable space is below configured utilization percentage/no more usable space [ /home/data/software/hadoop-3.2.2/data/tmp/nm-local-dir : used space above threshold of 90.0% ] ; 1/1 log-dirs usable space is below configured utilization percentage/no more usable space [ /home/data/software/hadoop-3.2.2/logs/userlogs : used space above threshold of 90.0% ]
2022-04-08 14:00:50,797 INFO org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl: Using ResourceCalculatorPlugin : org.apache.hadoop.yarn.util.ResourceCalculatorPlugin@54e041a4
2022-04-08 14:00:50,798 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler
2022-04-08 14:00:50,800 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploadEventType for class org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.SharedCacheUploadService
2022-04-08 14:00:50,800 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: AMRMProxyService is disabled
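The cause: the NodeManager detected that local disk usage was above 90%. It therefore removed both nm-local-dir and logs/userlogs from its list of valid directories (1/1 local-dirs and 1/1 log-dirs failed), which left the node with no usable disk.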
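To confirm the diagnosis, the usage of the two directories named in the log can be checked by hand. The following is a minimal sketch, assuming a POSIX `df` and `awk` are available; the helper name `disk_usage_pct` and the `THRESHOLD` variable are made up for illustration. The figure `df` reports may differ slightly from the one the NodeManager computes, so treat it as an approximation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the used-space percentage (integer, no % sign)
# of the filesystem that holds the path given as $1.
disk_usage_pct() {
    # -P forces the POSIX one-line-per-filesystem format; column 5 is "Capacity".
    df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Threshold taken from the log messages above (90.0%).
THRESHOLD=90

# Directories named in the NodeManager warnings.
for dir in /home/data/software/hadoop-3.2.2/data/tmp/nm-local-dir \
           /home/data/software/hadoop-3.2.2/logs/userlogs; do
    if [ ! -d "$dir" ]; then
        echo "$dir: not found, skipping"
        continue
    fi
    pct=$(disk_usage_pct "$dir")
    if [ "$pct" -gt "$THRESHOLD" ]; then
        echo "$dir: ${pct}% used, above ${THRESHOLD}%"
    else
        echo "$dir: ${pct}% used, OK"
    fi
done
```

If both directories sit on the same filesystem, they will report the same percentage, which matches the log above where both were rejected at the same moment.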
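Fixes:
1 Delete whatever is no longer needed on the node until disk usage drops below 90%.
2 Add the following to yarn-site.xml to change the thresholds, then restart the NodeManager so the change takes effect. The first property is the minimum fraction of disks that must be healthy; the second is the per-disk utilization limit (default 90.0). With 100.0, a disk is only marked bad once it is completely full.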
<property>
  <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
  <value>0.0</value>
</property>
<property>
  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
  <value>100.0</value>
</property>
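For option 1, it helps to see what is actually taking up the space before deleting anything. The following is a minimal sketch, assuming `du`, `sort`, and `head` are available; the function name `largest_entries` is made up for illustration, and which entries are safe to delete still has to be decided case by case.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the largest entries directly under a directory,
# biggest first. $1 is the directory, $2 is how many entries to show (default 10).
largest_entries() {
    # -s: one total per entry, -k: sizes in 1 KB units.
    # Errors (unreadable or missing paths) are suppressed.
    du -sk "$1"/* 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Example: the ten largest entries under the container log directory from the log above.
largest_entries /home/data/software/hadoop-3.2.2/logs/userlogs 10
```

Each output line is a size in KB followed by the path, so the first line is the largest entry.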
In addition, the error above also triggers the error below. Fixing the problem above makes it go away as well.
2022-04-08 14:06:51,033 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Could not carry out resource dir checks for /home/data/software/hadoop-3.2.2/data/tmp/nm-local-dir, which was marked as good
java.io.FileNotFoundException: File /home/data/software/hadoop-3.2.2/data/tmp/nm-local-dir/filecache does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:668)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:989)