HBase Source Code Analysis: The compact Flow in HRegion (Part 2)

        Following Part 1 of this series (《HBase源码分析之HRegion上compact流程分析(一)》), we continue the analysis of the compact flow in HRegion. This installment looks at the compaction of the files under a single column family of a table, namely HStore's compact() method. The code is as follows:

/**
   * Compact the StoreFiles.  This method may take some time, so the calling
   * thread must be able to block for long periods.
   *
   * <p>During this time, the Store can work as usual, getting values from
   * StoreFiles and writing new StoreFiles from the memstore.
   *
   * Existing StoreFiles are not destroyed until the new compacted StoreFile is
   * completely written-out to disk.
   *
   * <p>The compactLock prevents multiple simultaneous compactions.
   * The structureLock prevents us from interfering with other write operations.
   *
   * <p>We don't want to hold the structureLock for the whole time, as a compact()
   * can be lengthy and we want to allow cache-flushes during this period.
   *
   * <p> Compaction event should be idempotent, since there is no IO Fencing for
   * the region directory in hdfs. A region server might still try to complete the
   * compaction after it lost the region. That is why the following events are carefully
   * ordered for a compaction:
   *  1. Compaction writes new files under region/.tmp directory (compaction output)
   *  2. Compaction atomically moves the temporary file under region directory
   *  3. Compaction appends a WAL edit containing the compaction input and output files.
   *  Forces sync on WAL.
   *  4. Compaction deletes the input files from the region directory.
   *
   * Failure conditions are handled like this:
   *  - If RS fails before 2, compaction won't complete. Even if RS lives on and finishes
   *  the compaction later, it will only write the new data file to the region directory.
   *  Since we already have this data, this will be idempotent but we will have a redundant
   *  copy of the data.
   *  - If RS fails between 2 and 3, the region will have a redundant copy of the data. The
   *  RS that failed won't be able to finish sync() for WAL because of lease recovery in WAL.
   *  - If RS fails after 3, the region server that opens the region will pick up the
   *  compaction marker from the WAL and replay it by removing the compaction input files.
   *  Failed RS can also attempt to delete those files, but the operation will be idempotent.
   *
   * See HBASE-2231 for details.
   *
   * @param compaction compaction details obtained from requestCompaction()
   * @throws IOException
   * @return Storefile we compacted into or null if we failed or opted out early.
   */
  @Override
  public List<StoreFile> compact(CompactionContext compaction) throws IOException {
    assert compaction != null;
    List<StoreFile> sfs = null;
    
    // Obtain the CompactionRequest (cr) from the CompactionContext
    CompactionRequest cr = compaction.getRequest();
    
    try {
      // Do all sanity checking in here if we have a valid CompactionRequest
      // because we need to clean up after it on the way out in a finally
      // block below
    
      // Record the compaction start time, compactionStartTime
      long compactionStartTime = EnvironmentEdgeManager.currentTime();
      
      // Make sure the request is not null. In fact getRequest() has already checked and guaranteed this,
      // so why check again here? Let's keep that as a small open question for now!
      assert compaction.hasSelection();
      
      // Get the collection of files to compact, filesToCompact, from the request cr; the collection holds StoreFile instances.
      // This collection is determined when the CompactionRequest is constructed, or when other requests are merged in,
      // from the passed-in parameters or from the file collections attached to those requests;
      // in other words, once a request exists, filesToCompact exists too.
      Collection<StoreFile> filesToCompact = cr.getFiles();
      
      // Make sure the collection of files to compact, filesToCompact, is not empty
      assert !filesToCompact.isEmpty();
      
      // Make sure filesCompacting contains all files in filesToCompact
      synchronized (filesCompacting) {
        // sanity check: we're compacting files that this store knows about
        // TODO: change this to LOG.error() after more debugging
        Preconditions.checkArgument(filesCompacting.containsAll(filesToCompact));
      }

      // Ready to go. Have list of files to compact.
      LOG.info("Starting compaction of " + filesToCompact.size() + " file(s) in "
          + this + " of " + this.getRegionInfo().getRegionNameAsString()
          + " into tmpdir=" + fs.getTempDir() + ", totalSize="
          + StringUtils.humanReadableInt(cr.getSize()));

      // Commence the compaction.
      // Start the compaction: call CompactionContext's compact() method to obtain the new compacted files, newFiles
      List<Path> newFiles = compaction.compact();

      // TODO: get rid of this!
      // The parameter hbase.hstore.compaction.complete determines whether the compaction should be fully completed.
      // Interesting: handled this way, the old and new files coexist, the new files are not moved into place and
      // their Readers are closed, so it is still the old files that serve requests. What is the goal here?
      // Making the output quickly available for reads?
      if (!this.conf.getBoolean("hbase.hstore.compaction.complete", true)) {
        LOG.warn("hbase.hstore.compaction.complete is set to false");
        
        // Create a StoreFile list sfs with the size of newFiles
        sfs = new ArrayList<StoreFile>(newFiles.size());
        
        // Iterate over the newly produced compacted files, newFiles; for each file create a StoreFile and a Reader,
        // close the Reader on the StoreFile, and add the StoreFile to the list sfs
        for (Path newFile : newFiles) {
          // Create storefile around what we wrote with a reader on it.
          StoreFile sf = createStoreFileAndReader(newFile);
          
          // Close the Reader on it
          sf.closeReader(true);
          sfs.add(sf);
        }
        
        // Return the compacted files
        return sfs;
      }
      
      // Do the steps necessary to complete the compaction.
      // Move the completed files into their proper place, create StoreFiles and Readers, and return the StoreFile list sfs
      sfs = moveCompatedFilesIntoPlace(cr, newFiles);
      
      // Write a compaction record into the WAL
      writeCompactionWalRecord(filesToCompact, sfs);
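The four carefully ordered steps described in the Javadoc above (write to region/.tmp, atomic move, WAL marker, delete inputs) can be sketched with plain JDK file operations. This is a simplified illustration of the ordering, not HBase code; the class name, the walLog list standing in for the WAL, and all paths are made up for this example:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class CompactionOrderingSketch {
    // Stand-in for the WAL: appended entries survive the "region server"
    static final List<String> walLog = new ArrayList<>();

    // Models the four-step compaction protocol; each step is safe to
    // replay, which is what makes the compaction event idempotent.
    static void compact(Path regionDir, List<Path> inputs, String output) throws IOException {
        // 1. Write the compaction output under region/.tmp
        Path tmpDir = Files.createDirectories(regionDir.resolve(".tmp"));
        Path tmpFile = tmpDir.resolve(output);
        Files.write(tmpFile, "merged-cells".getBytes());

        // 2. Atomically move the temporary file under the region directory
        Path finalFile = regionDir.resolve(output);
        Files.move(tmpFile, finalFile, StandardCopyOption.ATOMIC_MOVE);

        // 3. Append a WAL edit naming the input and output files, then "sync"
        walLog.add("COMPACTION inputs=" + inputs + " output=" + finalFile);

        // 4. Delete the input files from the region directory
        for (Path input : inputs) {
            Files.deleteIfExists(input); // deleteIfExists keeps replay idempotent
        }
    }

    public static void main(String[] args) throws IOException {
        Path regionDir = Files.createTempDirectory("region");
        Path in1 = Files.write(regionDir.resolve("sf1"), "a".getBytes());
        Path in2 = Files.write(regionDir.resolve("sf2"), "b".getBytes());
        compact(regionDir, List.of(in1, in2), "sf-compacted");
        System.out.println(Files.exists(regionDir.resolve("sf-compacted"))); // true
        System.out.println(Files.exists(in1));                               // false
    }
}
```

If the process dies after step 2 but before step 4, rerunning compact() simply rewrites the same output and deletes whatever inputs remain, matching the failure analysis in the Javadoc.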