HDFS块检查命令Fsck机理的分析

最新推荐文章于 2024-06-06 14:04:33 发布

Android路上的人

最新推荐文章于 2024-06-06 14:04:33 发布

阅读量1.3w

点赞数 3

分类专栏： Hadoop HDFS 文章标签： hdfs fsck

本文链接：https://blog.csdn.net/Androidlushangderen/article/details/50996821

版权

本文深入剖析HDFS的fsck命令，用于检查和处理损坏文件块。通过分析Fsck命令、参数使用、调用过程及原理，揭示其在HDFS中的功能，包括移动或删除损坏文件、输出损坏块信息等。了解这些可以帮助更好地管理和维护HDFS集群。

摘要由CSDN通过智能技术生成

前言

在HDFS中,所有的文件都是以block块的概念而存在的,那么在这样海量的文件数据的情况下,难免会发生一些文件块损坏的现象,那么有什么好的办法去发现呢.答案是使用HDFS的fsck相关的命令.这个命令独立于dfsadmin的命令,可能会让部分人不知道HDFS中还存在这样的命令,本文就来深度挖掘一下这个命令的特殊的用处和内在机理的实现.

Fsck命令

其实说到fsck命令本身,熟悉Linux操作系统的人,可能或多或少听到过或使用过这个命令.Fsck命令的全称为file system check,更加类似的是一种修复命令.当然,本文不会讲大量的关于操作系统的fsck怎么用,而是HDFS下的fsck的使用,在bin/hdfs fsck下还是有很多可选参数的.

Fsck参数使用

本人在测试集群中输入hdfs fsck命令,获取了帮助信息,在此信息中展示了最全的参数使用说明:

$ hdfs fsck
Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
    <path>  start checking from this path
    -move   move corrupted files to /lost+found
    -delete delete corrupted files
    -files  print out files being checked
    -openforwrite   print out files opened for write
    -includeSnapshots   include snapshot data if the given path indicates a snapshottable directory or there are snapshottable directories under it
    -list-corruptfileblocks print out list of missing blocks and files they belong to
    -blocks print out block report
    -locations  print out locations for every block
    -racks  print out network topology for data-node locations
    -storagepolicies    print out storage policy summary for the blocks

    -blockId    print out which file this blockId belongs to, locations (nodes, racks) of this block, and other diagnostics info (under replicated, corrupted or not, etc)

简单的总结一下,首先是必填参数和命令名:

bin/hdfs fsck <path>

然后是一堆的可选参数:

-move: 移动损坏的文件到/lost+found目录下
-delete: 删除损坏的文件
-files: 输出正在被检测的文件
-openforwrite: 输出检测中的正在被写的文件
-includeSnapshots: 检测的文件包括系统snapShot快照目录下的
-list-corruptfileblocks: 输出损坏的块及其所属的文件
-blocks: 输出block的详细报告
-locations: 输出block的位置信息
-racks: 输出block的网络拓扑结构信息
-storagepolicies: 输出block的存储策略信息
-blockId: 输出指定blockId所属块的状况,位置等信息

具体参数功能对应到相应的程序会在下文的分析中进行详细的阐述.

Fsck过程调用

Fsck过程的调用指的是从终端机器输入到最终fsck在HDFS内部被执行的整个过程.中间穿过的类的其实不多,本人做了一张简图:
这里写图片描述
上图的调用形式,可以说是三层调用的结构.DFSck就是暴露在最外层的类.我们再来规整规整中间的过程.

输入fsck 直接调用到的就是此类.DFSck内部会发送http请求的方式,根据参数构造URL请求地址,发送到下一个处理对象中.
下一个处理对象就是FsckServlet.FsckServlet在这里相当于一个过渡者,马上调用真正操作类NamenodeFsck.
NamenodeFsck在这里会取出请求参数,然后在HDFS内部做真正的fsck检测操作.

Fsck原理分析

Fsck原理分析将会展示更加细致的fsck过程调用.按照上小节的提到的3层调用,同样我们也分为3个层次的渐近性的分析.

DFSck请求构造

你可以把此类想象成DFSAdmin.首先进入命令输入处理入口方法:

public int run(final String[] args) throws IOException {
    if (args.length == 0) {
      printUsage(System.err);
      return -1;
    }

    try {
      return UserGroupInformation.getCurrentUser().doAs(
          new PrivilegedExceptionAction<Integer>() {
            @Override
            public Integer run() throws Exception {
              return doWork(args);
            }
          });
    } catch (InterruptedException e) {
      throw new IOException(e);
    }
  }

在doWork方法中,马上就看到了对于参数的判别分类,同时开始构造不同的参数请求.

private int doWork(final String[] args) throws IOException {
    final StringBuilder url = new StringBuilder();

    url.append("/fsck?ugi=").append(ugi.getShortUserName());
    String dir = null;
    boolean doListCorruptFileBlocks = false;
    for (int idx = 0; idx < args.length; idx++) {
      if (args[idx].equals("-move")) { url.append("&move=1"); }
      else if (args[idx].equals("