[HDFS] How does the NameNode find the inode for a given file (path) name?

Latest recommended article published on 2024-03-23 16:14:31

年轻的海员

Article link: https://blog.csdn.net/tracymkgld/article/details/17553173

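Everyone has used commands such as hadoop dfs -ls, -rmr, -rm, -get, -put and -cat, each followed by an absolute file path given as a string, like /a/b/c/d. So how does the NameNode get from a string like /a/b/c/d to the corresponding file?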

As we know, a file is represented by an INodeFile and a directory by an INodeDirectory; both are INodes:

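abstract class INode implements Comparable<byte[]> {
  protected byte[] name;            // local file or directory name, stored as bytes
  protected INodeDirectory parent;  // enclosing directory
  // ... remaining fields and methods omitted
}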

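The INode snippet shows that the name is stored as a byte array, so the problems to solve are converting the path string into byte arrays and then comparing and matching them.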
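Since INode implements Comparable<byte[]>, names are ordered by their byte content. Below is a minimal sketch of such a comparison, assuming plain lexicographic order on signed bytes with null treated as an empty name. The class ByteNameCompare and its methods are illustrative names, not copied from Hadoop.

```java
import java.nio.charset.StandardCharsets;

class ByteNameCompare {
    // Helper for the demo: encode a name as UTF-8 bytes (assumed encoding).
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Lexicographic comparison of two names; null is treated as an empty name.
    static int compareNames(byte[] a, byte[] b) {
        int lenA = (a == null) ? 0 : a.length;
        int lenB = (b == null) ? 0 : b.length;
        int n = Math.min(lenA, lenB);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return Byte.compare(a[i], b[i]);
            }
        }
        // All shared bytes are equal: the shorter name sorts first.
        return lenA - lenB;
    }

    public static void main(String[] args) {
        System.out.println(compareNames(bytes("aa"), bytes("ab")) < 0);  // true
        System.out.println(compareNames(bytes("aa"), bytes("aa")) == 0); // true
        System.out.println(compareNames(bytes("a"), bytes("aa")) < 0);   // true
    }
}
```

A result of 0 means the two names match, which is the condition a path lookup needs at every level.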

  INode[] getExistingPathINodes(String path) {
    byte[][] components = getPathComponents(path);
    INode[] inodes = new INode[components.length];

    this.getExistingPathINodes(components, inodes);
    
    return inodes;
  }

INodeDirectory provides the method above. Its first call is to getPathComponents, which works in two steps. The first step is getPathNames:

  static String[] getPathNames(String path) {
    if (path == null || !path.startsWith(Path.SEPARATOR)) {
      return null;
    }
    return path.split(Path.SEPARATOR);
  }

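1. As shown above, the path string is first split on "/". Because the path starts with "/", split returns a leading empty string: /aa/bb/cc/dd yields {"", "aa", "bb", "cc", "dd"}, five elements rather than four. That leading empty component stands for the root directory. For the bare root path "/", split returns an empty array.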
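The behaviour of String.split on a path with a leading "/" is easy to check in isolation. The sketch below repeats the getPathNames logic in a standalone class, with the literal "/" standing in for Path.SEPARATOR; PathSplitDemo is a hypothetical name used only for this example.

```java
import java.util.Arrays;

class PathSplitDemo {
    // Same logic as the quoted getPathNames, with "/" in place of Path.SEPARATOR.
    static String[] getPathNames(String path) {
        if (path == null || !path.startsWith("/")) {
            return null; // relative paths are rejected
        }
        return path.split("/");
    }

    public static void main(String[] args) {
        // The leading "/" produces an empty first element.
        System.out.println(Arrays.toString(getPathNames("/aa/bb/cc/dd"))); // [, aa, bb, cc, dd]
        // The root path alone splits into an empty array.
        System.out.println(getPathNames("/").length); // 0
        // A path without a leading "/" yields null.
        System.out.println(getPathNames("aa/bb")); // null
    }
}
```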

  static byte[][] getPathComponents(String[] strings) {
    if (strings.length == 0) {
      return new byte[][]{null};
    }
    byte[][] bytes = new byte[strings.length][];
    for (int i = 0; i < strings.length; i++)
      bytes[i] = DFSUtil.string2Bytes(strings[i]);
    return bytes;
  }

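2. As shown above, the string array from step 1 is handed to getPathComponents, which turns it into a two-dimensional byte array: one byte array per component, each the byte encoding of that string. The rows have different lengths, so this is a jagged array. For /aa/bb/cc/dd that means five byte arrays, the first of them empty. An empty input array (the path "/") is mapped to a single null component.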
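The conversion can be tried out without a Hadoop classpath. In this sketch, string2Bytes is a local stand-in for DFSUtil.string2Bytes and is assumed to encode as UTF-8; PathComponentsDemo is a hypothetical class name.

```java
import java.nio.charset.StandardCharsets;

class PathComponentsDemo {
    // Local stand-in for DFSUtil.string2Bytes; UTF-8 is an assumption of this sketch.
    static byte[] string2Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Same shape as the quoted getPathComponents(String[]).
    static byte[][] getPathComponents(String[] strings) {
        if (strings.length == 0) {
            return new byte[][]{null}; // root path: one null component
        }
        byte[][] bytes = new byte[strings.length][];
        for (int i = 0; i < strings.length; i++) {
            bytes[i] = string2Bytes(strings[i]);
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[][] components = getPathComponents("/aa/bb/cc/dd".split("/"));
        System.out.println(components.length);    // 5
        System.out.println(components[0].length); // 0, the root component
        System.out.println(components[1].length); // 2, the bytes of "aa"
    }
}
```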

Now back to the getExistingPathINodes method from before:

  INode[] getExistingPathINodes(String path) {

    byte[][] components = getPathComponents(path);
    INode[] inodes = new INode[components.length];
    // one INode slot per path component
    this.getExistingPathINodes(components, inodes);
    
    return inodes;
  }

Next, the overload that does the actual walk:

  int getExistingPathINodes(byte[][] components, INode[] existing) {
    assert compareBytes(this.name, components[0]) == 0 :
      "Incorrect name " + getLocalName() + " expected " + components[0];
    // The walk has to start from the INode whose name matches components[0].
    // The caller is rootDir, a final field of FSDirectory, so every lookup
    // starts at the root and descends from there.
    INode curNode = this;
    int count = 0;
    int index = existing.length - components.length;

    if (index > 0)
      index = 0; // can be ignored here: both arrays have the same length, since existing was sized from components
    while ((count < components.length) && (curNode != null)) {
      if (index >= 0)
        existing[index] = curNode;
      if (!curNode.isDirectory() || (count == components.length - 1))
        break; // no more child, stop here
      INodeDirectory parentDir = (INodeDirectory)curNode;
      curNode = parentDir.getChildINode(components[count + 1]);
      count += 1;
      index += 1;
    }
    return count;
  }

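This method takes the top-level INode and walks downward until it reaches the depth of the target file, stopping early at a missing component or a non-directory. Along the way it records the INode found at each level in the existing array for the caller to use, and it returns how many levels it descended.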
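The walk can be reproduced on a toy tree. The sketch below is not Hadoop code: Node is a hypothetical stand-in for INode and INodeDirectory, children are found by a simple linear scan, and only the case where both arrays have the same length is handled, so the index variable is dropped.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PathWalkDemo {
    /** Hypothetical, stripped-down stand-in for INode / INodeDirectory. */
    static class Node {
        final byte[] name;
        final boolean directory;
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean directory) {
            this.name = name.getBytes(StandardCharsets.UTF_8);
            this.directory = directory;
        }

        Node add(Node child) {
            children.add(child);
            return child;
        }

        // Linear scan by byte-wise name equality; a simplification for this sketch.
        Node getChild(byte[] childName) {
            for (Node c : children) {
                if (Arrays.equals(c.name, childName)) {
                    return c;
                }
            }
            return null;
        }
    }

    // Same loop shape as the quoted method, for existing.length == components.length.
    static int resolve(Node root, byte[][] components, Node[] existing) {
        Node cur = root;
        int count = 0;
        while (count < components.length && cur != null) {
            existing[count] = cur;
            if (!cur.directory || count == components.length - 1) {
                break; // reached a file or the last component
            }
            cur = cur.getChild(components[count + 1]);
            count++;
        }
        return count;
    }

    // Split an absolute path and encode each component as UTF-8 bytes.
    static byte[][] toComponents(String path) {
        String[] parts = path.split("/");
        byte[][] out = new byte[parts.length][];
        for (int i = 0; i < parts.length; i++) {
            out[i] = parts[i].getBytes(StandardCharsets.UTF_8);
        }
        return out;
    }

    // Builds the tree /aa/bb/cc, where cc is a file and the root has an empty name.
    static Node sampleTree() {
        Node root = new Node("", true);
        Node aa = root.add(new Node("aa", true));
        Node bb = aa.add(new Node("bb", true));
        bb.add(new Node("cc", false));
        return root;
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        Node[] found = new Node[4];
        System.out.println(resolve(root, toComponents("/aa/bb/cc"), found)); // 3
        Node[] partial = new Node[4];
        System.out.println(resolve(root, toComponents("/aa/xx/cc"), partial)); // 2
        System.out.println(partial[2] == null); // true: "xx" does not exist
    }
}
```

For a path with a missing component, the slots after the last existing INode stay null, which lets the caller see where the existing part of the path ends.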

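To sum up: operating on a file requires first finding its INode together with all of its ancestor INodes back to the root. The lookup itself runs in the opposite direction, from the root downward to the depth of the target file, and each step is a comparison of INode names, that is, of one-dimensional byte arrays.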