dfs.ls.limit详解

dfs.ls.limit主要用于查询目录下的子文件,子文件单次返回最大个数。默认值为1000。

NameNodeRpcServer.java

@Override // ClientProtocol
  public DirectoryListing getListing(String src, byte[] startAfter,
      boolean needLocation) throws IOException {
    checkNNStartup();
    DirectoryListing files = namesystem.getListing(
        src, startAfter, needLocation);
    if (files != null) {
      metrics.incrGetListingOps();
      metrics.incrFilesInGetListingOps(files.getPartialListing().length);
    }
    return files;
  }

FSDirStatAndListingOp.java

private static DirectoryListing getListing(FSDirectory fsd, INodesInPath iip,
      byte[] startAfter, boolean needLocation, boolean includeStoragePolicy)
      throws IOException {
    if (FSDirectory.isExactReservedName(iip.getPathComponents())) {
      return getReservedListing(fsd);
    }

    fsd.readLock();
    try {
      if (iip.isDotSnapshotDir()) {
        return getSnapshotsListing(fsd, iip, startAfter);
      }
      final int snapshot = iip.getPathSnapshotId();
      final INode targetNode = iip.getLastINode();
      if (targetNode == null) {
        return null;
      }

      byte parentStoragePolicy = includeStoragePolicy
          ? targetNode.getStoragePolicyID()
          : HdfsConstants.BLOCK_STORAGE_POLICY_ID_UNSPECIFIED;

      if (!targetNode.isDirectory()) {
        // return the file's status. note that the iip already includes the
        // target INode
        return new DirectoryListing(
            new HdfsFileStatus[]{ createFileStatus(
                fsd, iip, null, parentStoragePolicy, needLocation, false)
            }, 0);
      }

      final INodeDirectory dirInode = targetNode.asDirectory();
      final ReadOnlyList<INode> contents = dirInode.getChildrenList(snapshot);
      int startChild = INodeDirectory.nextChild(contents, startAfter);
      int totalNumChildren = contents.size();
      int numOfListing = Math.min(totalNumChildren - startChild,
          fsd.getLsLimit());
      int locationBudget = fsd.getLsLimit();
      int listingCnt = 0;
      HdfsFileStatus listing[] = new HdfsFileStatus[numOfListing];
      for (int i = 0; i < numOfListing && locationBudget > 0; i++) {
        INode child = contents.get(startChild+i);
        byte childStoragePolicy = (includeStoragePolicy && !child.isSymlink())
            ? getStoragePolicyID(child.getLocalStoragePolicyID(),
                                 parentStoragePolicy)
            : parentStoragePolicy;
        listing[i] = createFileStatus(fsd, iip, child, childStoragePolicy,
            needLocation, false);
        listingCnt++;
        if (listing[i] instanceof HdfsLocatedFileStatus) {
            // Once we  hit lsLimit locations, stop.
            // This helps to prevent excessively large response payloads.
            // Approximate #locations with locatedBlockCount() * repl_factor
            LocatedBlocks blks =
                ((HdfsLocatedFileStatus)listing[i]).getLocatedBlocks();
            locationBudget -= (blks == null) ? 0 :
               blks.locatedBlockCount() * listing[i].getReplication();
        }
      }
      // truncate return array if necessary
      if (listingCnt < numOfListing) {
          listing = Arrays.copyOf(listing, listingCnt);
      }
      return new DirectoryListing(
          listing, totalNumChildren-startChild-listingCnt);
    } finally {
      fsd.readUnlock();
    }
  }

可以看到 int locationBudget = fsd.getLsLimit();这个返回子文件的个数。

FSDirectory.java

int configuredLimit = conf.getInt(
        DFSConfigKeys.DFS_LIST_LIMIT, DFSConfigKeys.DFS_LIST_LIMIT_DEFAULT);
    this.lsLimit = configuredLimit>0 ?
        configuredLimit : DFSConfigKeys.DFS_LIST_LIMIT_DEFAULT;

可以看到就是dfs.ls.limit,默认值1000,如果配置值不大于0,自动也会配置成默认值。

那下次查询从哪里开始呢?我们可以看到getListing接口还有一个入参startAfter,代表从这个位置开始查询。startAfter其实本质上就是上次查询最后一个子文件的名字。这里面有个问题,是如何查到子文件startAfter的位置的呢?hdfs使用的二分查询。

INodeDirectory.java

static int nextChild(ReadOnlyList<INode> children, byte[] name) {
    if (name.length == 0) { // empty name
      return 0;
    }
    int nextPos = ReadOnlyList.Util.binarySearch(children, name) + 1;
    if (nextPos >= 0) {
      return nextPos;
    }
    return -nextPos;
  }

ReadOnlyList.Util.java

public static <K, E extends Comparable<K>> int binarySearch(
        final ReadOnlyList<E> list, final K key) {
      int lower = 0;
      for(int upper = list.size() - 1; lower <= upper; ) {
        final int mid = (upper + lower) >>> 1;

        final int d = list.get(mid).compareTo(key);
        if (d == 0) {
          return mid;
        } else if (d > 0) {
          upper = mid - 1;
        } else {
          lower = mid + 1;
        }
      }
      return -(lower + 1);
    }

注意一个细节,如果查询到,返回为child的位置,如果查询不到,从0开始重新查询。

从整个剖析来看,当目录子文件过多时,ls的性能其实不是很好。而且还有如果正好最后一个子文件删除了,返回会出现重复。那为何不使用inode来作为查询条件呢?笔者认为可能是历史问题。建议dfs.ls.limit可以设置大一点。

  • 2
    点赞
  • 6
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

zfpigpig

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值