HBase Source Code Study ---- Flush (5)

weixin_46149099

Category: Understanding HBase Source Code. Tags: hbase

Article link: https://blog.csdn.net/weixin_46149099/article/details/113825520

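The previous post covered how MemStore implements Snapshot in HBase 1.4. This post walks through the flush process that follows the snapshot.
Flush uses an InternalScanner to scan out each cell and writes the cells one by one into a temporary file. Because this internal scan only reads memory and touches no external files, it is relatively fast.
The createScanner method of the StoreFlusher class builds a StoreScanner from the snapshot's KeyValueScanner: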

Scan scan = new Scan();
scan.setMaxVersions(store.getScanInfo().getMaxVersions());
scanner = new StoreScanner(store, store.getScanInfo(), scan,
    Collections.singletonList(snapshotScanner), ScanType.COMPACT_RETAIN_DELETES,
    smallestReadPoint, HConstants.OLDEST_TIMESTAMP);
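
The loop described above (scan every cell out of the in-memory snapshot and append it to a temporary file) can be sketched roughly as follows. This is a simplified model, not HBase's code: the `TreeMap` standing in for the snapshot, the plain-text line format, and the names `FlushLoopSketch` and `flushToTempFile` are all assumptions made for illustration. The real flush writes an HFile rather than text lines.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only: a sorted map stands in for the MemStore snapshot.
final class FlushLoopSketch {

  // Scans the in-memory snapshot in key order and appends each entry to a temp file.
  // Returns the temp file's path; only memory is read, no existing store file is touched.
  static Path flushToTempFile(NavigableMap<String, String> snapshot) throws IOException {
    Path tmp = Files.createTempFile("flush-sketch-", ".tmp");
    try (BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
      for (Map.Entry<String, String> cell : snapshot.entrySet()) {
        out.write(cell.getKey() + "=" + cell.getValue());
        out.newLine();
      }
    }
    return tmp;
  }

  public static void main(String[] args) throws IOException {
    NavigableMap<String, String> snapshot = new TreeMap<>();
    snapshot.put("row2/cf:q", "v2");
    snapshot.put("row1/cf:q", "v1");
    Path tmp = flushToTempFile(snapshot);
    // Lines come out in sorted key order, regardless of insertion order.
    System.out.println(Files.readAllLines(tmp, StandardCharsets.UTF_8));
    Files.delete(tmp);
  }
}
```

The point of the sketch is the shape of the work: a single ordered pass over memory with sequential appends, which is why this step is comparatively cheap.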

StoreScanner

Starting from the constructor:

public StoreScanner(final Scan scan, ScanInfo scanInfo, ScanType scanType,
    final NavigableSet<byte[]> columns, final List<KeyValueScanner> scanners, long earliestPutTs,
    long readPt) throws IOException {
  this(null, scan, scanInfo, columns, readPt, scan.getCacheBlocks());
  if (scanType == ScanType.USER_SCAN) {
    this.matcher = UserScanQueryMatcher.create(scan, scanInfo, columns, oldestUnexpiredTS, now,
      null);
  } else {
    if (scan.hasFilter() || (scan.getStartRow() != null && scan.getStartRow().length > 0)
        || (scan.getStopRow() != null && scan.getStopRow().length > 0)
        || !scan.getTimeRange().isAllTime() || columns != null) {
      matcher = LegacyScanQueryMatcher.create(scan, scanInfo, columns, scanType, Long.MAX_VALUE,
        earliestPutTs, oldestUnexpiredTS, now, null, null, store.getCoprocessorHost());
    } else {
      this.matcher = CompactionScanQueryMatcher.create(scanInfo, scanType, Long.MAX_VALUE,
        earliestPutTs, oldestUnexpiredTS, now, null, null, null);
    }
  }

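Because the Scan passed in only sets maxVersions (it has no filter, no start or stop row, and the default all-time range) and the scan type is not USER_SCAN, the matcher created here is a CompactionScanQueryMatcher.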
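
The branching in the constructor excerpt above can be condensed into a small decision function. The sketch below is only a restatement of that excerpt under assumptions: `MatcherChoiceSketch`, `MatcherKind`, and the `restrictsScan` flag are hypothetical names, and the enum here is a local stand-in for HBase's `ScanType`.

```java
// Hypothetical condensation of the branch in the constructor excerpt; not HBase code.
final class MatcherChoiceSketch {

  // Local stand-in for HBase's ScanType values.
  enum ScanType { USER_SCAN, COMPACT_RETAIN_DELETES, COMPACT_DROP_DELETES }

  // Which family of query matcher the constructor would create.
  enum MatcherKind { USER_SCAN, LEGACY, COMPACTION }

  // restrictsScan stands for the combined condition in the excerpt: the scan has a
  // filter, a non-empty start or stop row, a time range other than "all time",
  // or an explicit column set.
  static MatcherKind choose(ScanType scanType, boolean restrictsScan) {
    if (scanType == ScanType.USER_SCAN) {
      return MatcherKind.USER_SCAN;
    }
    return restrictsScan ? MatcherKind.LEGACY : MatcherKind.COMPACTION;
  }

  public static void main(String[] args) {
    // Flush path: a fresh Scan with only max versions set, COMPACT_RETAIN_DELETES.
    System.out.println(choose(ScanType.COMPACT_RETAIN_DELETES, false));
  }
}
```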

CompactionScanQueryMatcher

Look at the create() method of CompactionScanQueryMatcher:

public static CompactionScanQueryMatcher create(ScanInfo scanInfo, ScanType scanType,
    long readPointToUse, long earliestPutTs, long oldestUnexpiredTS, long now,
    byte[] dropDeletesFromRow, byte[] dropDeletesToRow,
    RegionCoprocessorHost regionCoprocessorHost) throws IOException {
  DeleteTracker deleteTracker = instantiateDeleteTracker(regionCoprocessorHost);
  if (dropDeletesFromRow == null) {
    if (scanType == ScanType.COMPACT_RETAIN_DELETES) {
      return new MinorCompactionScanQueryMatcher(scanInfo, deleteTracker, readPointToUse,
          oldestUnexpiredTS, now);
    } else {
      return new MajorCompactionScanQueryMatcher(scanInfo, deleteTracker, readPointToUse,
          earliestPutTs, oldestUnexpiredTS, now);
    }
  } else {
    return new StripeCompactionScanQueryMatcher(scanInfo, deleteTracker, readPointToUse,
        earliestPutTs, oldestUnexpiredTS, now, dropDeletesFromRow, dropDeletesToRow);
  }
}

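Here dropDeletesFromRow is null and the scan type is COMPACT_RETAIN_DELETES, so what is finally returned is a MinorCompactionScanQueryMatcher object. Look at its match() method: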

public MatchCode match(Cell cell) throws IOException {
  MatchCode returnCode = preCheck(cell);
  if (returnCode != null) {
    return returnCode;
  }
  long mvccVersion = cell.getSequenceId();
  byte typeByte = cell.getTypeByte();
  if (CellUtil.isDelete(typeByte)) {
    if (mvccVersion > maxReadPointToTrackVersions) {
      // we should not use this delete marker to mask any cell yet.
      return MatchCode.INCLUDE;
    }
    trackDelete(cell);
    return MatchCode.INCLUDE;
  }
  returnCode = checkDeleted(deletes, cell);
  if (returnCode != null) {
    return returnCode;
  }
  // Skip checking column since we do not remove column during compaction.
  return columns.checkVersions(cell.getQualifierArray(), cell.getQualifierOffset(),
    cell.getQualifierLength(), cell.getTimestamp(), typeByte,
    mvccVersion > maxReadPointToTrackVersions);
}

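match() tags the current cell with a MatchCode, and StoreScanner handles the cell according to that code. For a flush this comes down to:

preCheck() skips cells whose TTL has expired.
Delete markers are tracked (unless their sequence id is newer than maxReadPointToTrackVersions) and are themselves kept with INCLUDE; cells already masked by a tracked delete are filtered out by checkDeleted().
Cells beyond maxVersions are skipped by checkVersions().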
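
To make the three rules concrete, here is a toy model that applies them to the cells of a single column, visited newest first. It is a sketch under assumptions, not the HBase implementation: the class `FlushMatchSketch`, its `Cell` and `Type` types, and the reduction to just `INCLUDE`/`SKIP` are hypothetical simplifications, only column-wide delete markers are modeled, and the read-point (MVCC) check is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the three rules for one column's cells; hypothetical names, not HBase classes.
final class FlushMatchSketch {

  enum Type { PUT, DELETE_COLUMN }

  enum MatchCode { INCLUDE, SKIP }

  static final class Cell {
    final long timestamp;
    final Type type;

    Cell(long timestamp, Type type) {
      this.timestamp = timestamp;
      this.type = type;
    }
  }

  private final long oldestUnexpiredTs;          // cells older than this are TTL-expired
  private final int maxVersions;                 // versions to keep per column
  private long deleteColumnTs = Long.MIN_VALUE;  // newest delete marker tracked so far
  private int versionsSeen = 0;

  FlushMatchSketch(long oldestUnexpiredTs, int maxVersions) {
    this.oldestUnexpiredTs = oldestUnexpiredTs;
    this.maxVersions = maxVersions;
  }

  MatchCode match(Cell cell) {
    // Rule 1 (preCheck): drop cells whose TTL has expired.
    if (cell.timestamp < oldestUnexpiredTs) {
      return MatchCode.SKIP;
    }
    // Rule 2a: track the delete marker, but keep the marker itself in the output.
    if (cell.type == Type.DELETE_COLUMN) {
      deleteColumnTs = Math.max(deleteColumnTs, cell.timestamp);
      return MatchCode.INCLUDE;
    }
    // Rule 2b: drop puts masked by a tracked delete marker.
    if (cell.timestamp <= deleteColumnTs) {
      return MatchCode.SKIP;
    }
    // Rule 3: keep at most maxVersions puts.
    versionsSeen++;
    return versionsSeen <= maxVersions ? MatchCode.INCLUDE : MatchCode.SKIP;
  }

  // Runs match() over cells that are already ordered newest first.
  static List<MatchCode> matchAll(List<Cell> cells, long oldestUnexpiredTs, int maxVersions) {
    FlushMatchSketch matcher = new FlushMatchSketch(oldestUnexpiredTs, maxVersions);
    List<MatchCode> codes = new ArrayList<>();
    for (Cell cell : cells) {
      codes.add(matcher.match(cell));
    }
    return codes;
  }

  public static void main(String[] args) {
    List<Cell> cells = Arrays.asList(
        new Cell(50, Type.PUT), new Cell(40, Type.PUT), new Cell(30, Type.PUT),
        new Cell(25, Type.DELETE_COLUMN), new Cell(20, Type.PUT), new Cell(5, Type.PUT));
    // prints [INCLUDE, INCLUDE, SKIP, INCLUDE, SKIP, SKIP]
    System.out.println(matchAll(cells, 10, 2));
  }
}
```

In this run the third put exceeds maxVersions, the delete marker is retained, the put at timestamp 20 is masked by that marker, and the put at timestamp 5 is older than the TTL cutoff.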
