HBase 0.1.0 Flush Flow: A Source Code Analysis

MrTitan

Article link: https://blog.csdn.net/MrTitan/article/details/8257573

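This article analyzes and summarizes how the flush flow is implemented.

Flush is an important step in the implementation of an LSM-Tree, and understanding it is key to understanding HBase.

Put simply, a flush writes the data held in memory out to disk. So how exactly is that implemented?

First, the regionserver calls region.flushcache at an appropriate time. The steps are as follows: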

lock.readLock().lock();                      // Prevent splits and closes
try {
  long startTime = -1;
  synchronized (updateLock) { // Stop updates while we snapshot the memcaches
    startTime = snapshotMemcaches();
  }
  return internalFlushcache(startTime);
} finally {
  lock.readLock().unlock();
}
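The locking pattern above can be illustrated with a minimal, self-contained sketch. Everything in it (the `ToyRegion` class, its string maps, the `flushed` map standing in for disk) is hypothetical and is not the real HRegion; it only shows the idea that updates are blocked for the short snapshot step, while the slow write-out runs outside `updateLock`. It also assumes a single flusher thread.

```java
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical toy class for illustration; not the real HRegion.
final class ToyRegion {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Object updateLock = new Object();
    private final TreeMap<String, String> memcache = new TreeMap<>();
    private final TreeMap<String, String> snapshot = new TreeMap<>();
    private final TreeMap<String, String> flushed = new TreeMap<>(); // stands in for disk

    void put(String key, String value) {
        synchronized (updateLock) { // updates wait only while a snapshot is being taken
            memcache.put(key, value);
        }
    }

    // Returns the number of entries written out by this flush.
    int flushcache() {
        lock.readLock().lock(); // a split or close would need the write lock
        try {
            synchronized (updateLock) { // short critical section: take the snapshot
                snapshot.putAll(memcache);
                memcache.clear();
            }
            // Slow part, done without holding updateLock (single flusher assumed).
            int written = snapshot.size();
            flushed.putAll(snapshot);
            snapshot.clear();
            return written;
        } finally {
            lock.readLock().unlock();
        }
    }

    int memcacheSize() {
        synchronized (updateLock) {
            return memcache.size();
        }
    }

    int flushedSize() {
        return flushed.size();
    }
}
```

For example, after two `put` calls with different keys, `flushcache()` returns 2 and leaves the toy memcache empty.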

As you can see, there are two main steps:

1. Snapshot the memcaches.

// Snapshot at the HRegion level
this.memcacheSize.set(0L);

// Every HStore has to take a snapshot
for (HStore hstore: stores.values()) {
  hstore.snapshotMemcache();
}

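1. Set this region's memcacheSize to 0, so that subsequent operations can update it correctly.
2. Have each HStore take a snapshot.

// The snapshot algorithm in each HStore: put the memcache data into snapshot, then clear memcache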

this.lock.writeLock().lock();
try {
  synchronized (memcache) {
    if (memcache.size() != 0) {
      snapshot.putAll(memcache);
      memcache.clear();
    }
  }
} finally {
  this.lock.writeLock().unlock();
}
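One detail of this snippet is that it uses `putAll` rather than replacing the snapshot map, so entries that are still sitting in the snapshot are merged with the new ones, and a newer value for the same key wins. The following is a minimal sketch of that behavior. The class `ToyStoreMemcache` and its string keys are hypothetical; the real HStore uses `HStoreKey` and `byte[]`.

```java
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical toy class for illustration; not the real HStore.
final class ToyStoreMemcache {
    final SortedMap<String, String> memcache = new TreeMap<>();
    final SortedMap<String, String> snapshot = new TreeMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void snapshotMemcache() {
        lock.writeLock().lock();
        try {
            synchronized (memcache) {
                if (!memcache.isEmpty()) {
                    // putAll merges into whatever the snapshot already holds;
                    // for a duplicate key the memcache value overwrites the old one.
                    snapshot.putAll(memcache);
                    memcache.clear();
                }
            }
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

If the snapshot is taken twice without being drained in between, the second call adds to the first rather than discarding it.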

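2. internalFlushcache

At the region level, internalFlushcache is fairly clear. It consists of three main steps:

1. Obtain the operation id (sequence id) for the flush.
2. Have each HStore run flushCache.
3. Record the completion of the flush in the HLog, so that subsequent operations are not affected by the flush.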

long sequenceId = log.startCacheFlush();

for (HStore hstore: stores.values()) {
  hstore.flushCache(sequenceId);
}
this.log.completeCacheFlush(this.regionInfo.getRegionName(),
  regionInfo.getTableDesc().getName(), sequenceId);
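The three steps can be sketched with a toy log as follows. This is only an illustration under assumptions: `ToyLog`, `ToyStore` and `ToyRegionFlush` are hypothetical names, and the toy `startCacheFlush` simply hands out the next sequence number, which is a simplification of whatever the real HLog does internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an HStore: it only receives the flush sequence id.
interface ToyStore {
    void flushCache(long sequenceId);
}

// Hypothetical stand-in for the HLog.
final class ToyLog {
    private final AtomicLong sequence = new AtomicLong();
    final List<String> entries = new ArrayList<>();

    long append(String edit) {
        long id = sequence.incrementAndGet();
        entries.add(id + ":" + edit);
        return id;
    }

    // Assumption: starting a flush just reserves the next sequence number.
    long startCacheFlush() {
        return sequence.incrementAndGet();
    }

    void completeCacheFlush(String regionName, long sequenceId) {
        entries.add(sequenceId + ":FLUSH-COMPLETE:" + regionName);
    }
}

final class ToyRegionFlush {
    static long internalFlushcache(ToyLog log, String regionName, List<ToyStore> stores) {
        long sequenceId = log.startCacheFlush();       // step 1: get the flush id
        for (ToyStore store : stores) {
            store.flushCache(sequenceId);              // step 2: every store flushes
        }
        log.completeCacheFlush(regionName, sequenceId); // step 3: record completion
        return sequenceId;
    }
}
```

Every store receives the same sequence id, and the completion marker is appended to the log only after all stores have flushed.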

Now let's look at how each HStore flushes:

1. Create a new HStoreFile.

// A. Write the Maps out to the disk
HStoreFile flushedFile = new HStoreFile(conf, fs, basedir,
  info.getEncodedName(), family.getFamilyName(), -1L, null);
String name = flushedFile.toString();
MapFile.Writer out = flushedFile.getWriter(this.fs, this.compression,
  this.bloomFilter);

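2. Take the in-memory records (the `cache` map in the code below) one by one and append them to the newly created HStoreFile. The records are filtered by column family along the way, so that each in-memory record is flushed to the store of the column family it belongs to.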

for (Map.Entry<HStoreKey, byte []> es: cache.entrySet()) {
  HStoreKey curkey = es.getKey();
  TextSequence f = HStoreKey.extractFamily(curkey.getColumn());
  if (f.equals(this.family.getFamilyName())) {
    entries++;
    out.append(curkey, new ImmutableBytesWritable(es.getValue()));
  }
}
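The filter-and-append loop can be sketched on plain strings. The sketch assumes column names of the form `family:qualifier` and keys of the form `row/family:qualifier`; the class `ToyFamilyFlush` and the helper `familyOf` are hypothetical and only mimic what `HStoreKey.extractFamily` is used for in the excerpt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical toy class for illustration; keys are "row/family:qualifier" strings.
final class ToyFamilyFlush {
    // Assumption: the family is the part of the column name before the first ':'.
    static String familyOf(String column) {
        int colon = column.indexOf(':');
        return colon < 0 ? column : column.substring(0, colon);
    }

    // Walks the sorted cache and "appends" only the entries of the given family.
    static List<String> flushFamily(String family, SortedMap<String, String> cache) {
        List<String> appended = new ArrayList<>();
        for (Map.Entry<String, String> entry : cache.entrySet()) {
            String key = entry.getKey();
            String column = key.substring(key.indexOf('/') + 1);
            if (familyOf(column).equals(family)) {
                appended.add(key + "=" + entry.getValue());
            }
        }
        return appended;
    }
}
```

Because the cache is a sorted map, the entries of one family are appended in key order, which is what an append-only sorted file needs.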

3. Write the log sequence number into the store file, and write the bloom filter to disk.

// B. Write out the log sequence number that corresponds to this output
// MapFile.  The MapFile is current up to and including the log seq num.
flushedFile.writeInfo(fs, logCacheFlushId);

// C. Flush the bloom filter if any
if (bloomFilter != null) {
  flushBloomFilter();
}

4. Add the reader of the new store file to this HStore's readers, and add the store file to storefiles.

Long flushid = Long.valueOf(logCacheFlushId);
// Open the map file reader.
this.readers.put(flushid,
  flushedFile.getReader(this.fs, this.bloomFilter));
this.storefiles.put(flushid, flushedFile);

The flushedFile.getWriter method deserves a closer look here.

HBase wraps operations on an HStoreFile in three layers of classes. The class hierarchy is as follows:

HalfMapFile（for split）
| |
| |
HalfMapFileReader

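OK, so let's see how HStoreFile.getWriter is written.

If the HStoreFile is a reference, it is an intermediate file produced by a split. Split files must not be written to, so an exception is thrown.

If the HStoreFile is not a reference, a BloomFilterMapFile.Writer is returned.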

if (isReference()) {
  throw new IOException("Illegal Access: Cannot get a writer on a" +
    "HStoreFile reference");
}
return new BloomFilterMapFile.Writer(conf, fs,
  getMapFilePath().toString(), HStoreKey.class,
  ImmutableBytesWritable.class, compression, bloomFilter);

So the flush actually goes through BloomFilterMapFile.Writer, which writes to the bloom filter first and then writes to the file.
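The idea of a writer that updates a bloom filter before appending can be sketched as follows. This is a toy under stated assumptions: `ToyBloomWriter`, its two hash functions and the in-memory list standing in for the file are all hypothetical, and the real BloomFilterMapFile.Writer relies on HBase's own filter classes rather than a `BitSet`.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical toy writer for illustration; not the real BloomFilterMapFile.Writer.
final class ToyBloomWriter {
    private static final int BITS = 1024;
    private final BitSet bloom = new BitSet(BITS);
    private final List<String> file = new ArrayList<>(); // stands in for the on-disk file

    // Two simple hash functions, chosen only for the example.
    private static int h1(String key) {
        return Math.floorMod(key.hashCode(), BITS);
    }

    private static int h2(String key) {
        return Math.floorMod(key.hashCode() * 31 + 17, BITS);
    }

    void append(String key, String value) {
        // First record the key in the bloom filter...
        bloom.set(h1(key));
        bloom.set(h2(key));
        // ...then append the entry to the file.
        file.add(key + "=" + value);
    }

    // false means the key was definitely never appended; true means it may have been.
    boolean mightContain(String key) {
        return bloom.get(h1(key)) && bloom.get(h2(key));
    }

    int size() {
        return file.size();
    }
}
```

A reader can then consult `mightContain` first and skip the file entirely when it returns false, which is the usual reason for keeping a bloom filter next to a store file.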

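To summarize:

1. A flush does not produce just one store file per run; each HStore produces its own store file.

2. A flush is a batch of appends, so it is very efficient, and if a bloom filter is configured it is written as well.

3. The notion of a snapshot here seems to differ from the one in later versions, so it is not explored further.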