The first question to settle is: what counts as a healthy HDFS?
HDFS is considered healthy if, and only if, all files have the minimum required number of replicas available.
So how do you check the health of HDFS?
Hadoop provides the fsck tool to check the health of the entire file system, or of individual files and directories.
On older versions the command is: sudo -u hdfs hadoop fsck /
On newer versions the command is: sudo -u hdfs hdfs fsck /
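For example (a sketch; /user/alice/data is a hypothetical path), fsck can be pointed at a single directory rather than the whole namespace:

    sudo -u hdfs hdfs fsck /user/alice/data

fsck walks the namespace under the given path and finishes with a summary line stating whether that subtree is HEALTHY or CORRUPT.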
The hdfs fsck usage and its options, explained (a combined example follows the list):
Usage: DFSck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
    <path>             start checking from this path
    -move              move corrupted files to /lost+found
    -delete            delete corrupted files
    -files             print out files being checked
    -openforwrite      print out files opened for write
    -includeSnapshots  include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
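As a worked sketch (the path below is hypothetical), the options compose: -files -blocks -locations prints each file under the path along with its blocks and the DataNodes holding every replica, which is the usual way to trace a problem block back to a machine:

    # show files, their blocks, and the DataNode location of each replica
    sudo -u hdfs hdfs fsck /user/alice/data -files -blocks -locations

    # list only the corrupt files under the root
    sudo -u hdfs hdfs fsck / -list-corruptfileblocks

    # last resort: delete files whose blocks cannot be recovered
    sudo -u hdfs hdfs fsck / -delete

Since -move and -delete modify the namespace, it is worth running -list-corruptfileblocks first to see what they would touch.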