在NameNode运行日志(/var/log/Bigdata/hdfs/nn/hadoop-omm-namendoe-XXX.log)中搜索“WARN”,可以看到有大量时间在垃圾回收,如下例中耗时较长63s。2017-01-22 14:52:32,641 | WARN | org.apache.hadoop.util.JvmPauseMonitor$Monitor@1b39fd82 | Detected pause in JVM or host machine (eg GC): pause of approximately 63750ms
GC pool 'ParNew' had collection(s): count=1 time=0ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=63924ms | JvmPauseMonitor.java:189
分析NameNode日志“/var/log/Bigdata/hdfs/nn/hadoop-omm-namendoe-XXX.log”,可以看到NameNode在等待块上报,且总的Block个数过多,如下例中是3629万。2017-01-22 14:52:32,641 | INFO | IPC Server handler 8 on 25000 | STATE* Safe mode ON.
The reported blocks 29715437 needs additional 6542184 blocks to reach the threshold 0.9990 of total blocks 36293915.
打开Manager页面,查看NameNode的GC_OPTS参数配置如下:
图1查看NameNode的GC_OPTS参数配置