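Following up on "HBase Source Code Analysis: The Compact Flow on HRegion (Part 1)", we continue our analysis of the compact flow on HRegion. Next we look at how the files under a single column family of a table are merged, that is, the compact() method of HStore. The code is as follows: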
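Before reading the source, note that the Javadoc of this method lists four carefully ordered steps: write the output under region/.tmp, atomically move it into the region directory, append a marker to the WAL, and only then delete the inputs. As a rough illustration only, the toy sketch below replays that ordering on a local filesystem with java.nio.file. Everything in it is a hypothetical stand-in (the class CompactionOrderingSketch, the file names sf1 and sf2, the plain-text wal.log marker); it is not HBase code and does not reflect HBase's on-disk format.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Toy model of the four ordered compaction steps; hypothetical, not HBase code. */
public class CompactionOrderingSketch {

    /** Merges the input files into one file inside regionDir, following steps 1-4. */
    static Path compact(Path regionDir, List<Path> inputs, Path wal) throws IOException {
        // Step 1: write the merged output under region/.tmp.
        Path tmpDir = Files.createDirectories(regionDir.resolve(".tmp"));
        Path tmpOut = tmpDir.resolve("compacted");
        List<String> merged = new ArrayList<>();
        for (Path in : inputs) {
            merged.addAll(Files.readAllLines(in, StandardCharsets.UTF_8));
        }
        Collections.sort(merged);
        Files.write(tmpOut, merged, StandardCharsets.UTF_8);

        // Step 2: atomically move the temporary file into the region directory.
        Path finalOut = Files.move(tmpOut, regionDir.resolve("compacted"),
                StandardCopyOption.ATOMIC_MOVE);

        // Step 3: append a marker naming the inputs and the output, synced to disk.
        List<String> inputNames = inputs.stream()
                .map(p -> p.getFileName().toString())
                .collect(Collectors.toList());
        String marker = "COMPACTION inputs=" + inputNames
                + " output=" + finalOut.getFileName() + System.lineSeparator();
        Files.write(wal, marker.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND,
                StandardOpenOption.SYNC);

        // Step 4: delete the inputs; deleteIfExists keeps a repeated attempt harmless.
        for (Path in : inputs) {
            Files.deleteIfExists(in);
        }
        return finalOut;
    }

    /** Runs the sketch in a temporary directory and returns the merged rows. */
    static List<String> runDemo() {
        try {
            Path root = Files.createTempDirectory("compaction-sketch");
            Path regionDir = Files.createDirectories(root.resolve("region"));
            Path sf1 = regionDir.resolve("sf1");
            Path sf2 = regionDir.resolve("sf2");
            Files.write(sf1, Arrays.asList("row1", "row3"), StandardCharsets.UTF_8);
            Files.write(sf2, Arrays.asList("row2"), StandardCharsets.UTF_8);
            Path wal = root.resolve("wal.log"); // stand-in for the real WAL
            Path out = compact(regionDir, Arrays.asList(sf1, sf2), wal);
            return Files.readAllLines(out, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Because the inputs are removed only after the marker is written, a crash at any earlier point leaves at worst a redundant copy of the data, which is the property the Javadoc argues for. The actual HBase source of compact(), with its Javadoc, follows.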
/**
* Compact the StoreFiles. This method may take some time, so the calling
* thread must be able to block for long periods.
*
* <p>During this time, the Store can work as usual, getting values from
* StoreFiles and writing new StoreFiles from the memstore.
*
* Existing StoreFiles are not destroyed until the new compacted StoreFile is
* completely written-out to disk.
*
* <p>The compactLock prevents multiple simultaneous compactions.
* The structureLock prevents us from interfering with other write operations.
*
* <p>We don't want to hold the structureLock for the whole time, as a compact()
* can be lengthy and we want to allow cache-flushes during this period.
*
* <p> Compaction event should be idempotent, since there is no IO Fencing for
* the region directory in hdfs. A region server might still try to complete the
* compaction after it lost the region. That is why the following events are carefully
* ordered for a compaction:
* 1. Compaction writes new files under region/.tmp directory (compaction output)
* 2. Compaction atomically moves the temporary file under region directory
* 3. Compaction appends a WAL edit containing the compaction input and output files.
* Forces sync on WAL.
* 4. Compaction deletes the input files from the region directory.
*
* Failure conditions are handled like this:
* - If RS fails before 2, compaction won't complete. Even if RS lives on and finishes
* the compaction later, it will only write the new data file to the region directory.
* Since we already have this data, this will be idempotent but we will have a redundant
* copy of the data.
* - If RS fails between 2 and 3, the region will have a redundant copy of the data. The
* RS that failed won't be able to finish sync() for WAL because of lease recovery in WAL.
* - If RS fails after 3, the region server that opens the region will pick up
* the compaction marker from the WAL and replay it by removing the compaction input files.
* Failed RS can also attempt to delete those files, but the operation will be idempotent.
*
* See HBASE-2231 for details.
*
* @param compaction compaction details obtained from requestCompaction()
* @throws IOException
* @return Storefile we compacted into or null if we failed or opted out early.
*/
@Override
public List<StoreFile> compact(CompactionContext compaction) throws IOException {
assert compaction != null;
List<StoreFile> sfs = null;
// Get the compaction request (CompactionRequest), cr, from the compaction context (CompactionContext)
CompactionRequest cr = compaction.getRequest();
try {
// Do all sanity checking in here if we have a valid CompactionRequest
// because we need to clean up after it on the way out in a finally
// block below
//
// Record the compaction start time, compactionStartTime
long compactionStartTime = EnvironmentEdgeManager.currentTime();
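// Make sure the compaction request is not null. getRequest() has in fact already checked and guaranteed this,
// so why check and guarantee it again here? Let's leave that as a small open question for now.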
assert compaction.hasSelection();
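// Get the collection of files to compact, filesToCompact, from the request cr; its elements are all StoreFile instances.
// This collection is determined when the CompactionRequest is constructed, or when it is merged with another request,
// from the arguments passed in or from the file collection carried by the other request.
// In other words, once the request exists, filesToCompact exists too.
Collection<StoreFile> filesToCompact = cr.getFiles();
// Make sure the collection of files to compact, filesToCompact, is not empty
assert !filesToCompact.isEmpty();
// Make sure filesCompacting contains every file in filesToCompact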
synchronized (filesCompacting) {
// sanity check: we're compacting files that this store knows about
// TODO: change this to LOG.error() after more debugging
Preconditions.checkArgument(filesCompacting.containsAll(filesToCompact));
}
// Ready to go. Have list of files to compact.
LOG.info("Starting compaction of " + filesToCompact.size() + " file(s) in "
+ this + " of " + this.getRegionInfo().getRegionNameAsString()
+ " into tmpdir=" + fs.getTempDir() + ", totalSize="
+ StringUtils.humanReadableInt(cr.getSize()));
// Commence the compaction.
// Start the compaction: call compact() on the CompactionContext to obtain the new, compacted files, newFiles
List<Path> newFiles = compaction.compact();
// TODO: get rid of this!
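// Decide, based on the parameter hbase.hstore.compaction.complete, whether to fully complete the compaction.
// This is interesting: handled this way, the old and new files exist side by side. The new files are not moved
// to their final location and their Readers are closed, so the old files keep serving requests.
// What is the purpose? Making the files quickly available for reads?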
if (!this.conf.getBoolean("hbase.hstore.compaction.complete", true)) {
LOG.warn("hbase.hstore.compaction.complete is set to false");
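// Create the StoreFile list sfs, sized to match newFiles
sfs = new ArrayList<StoreFile>(newFiles.size());
// Iterate over the newly produced compacted files, newFiles. For each one, create a StoreFile with a Reader,
// close the Reader on that StoreFile, and add the StoreFile to the list sfs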
for (Path newFile : newFiles) {
// Create storefile around what we wrote with a reader on it.
StoreFile sf = createStoreFileAndReader(newFile);
// Close the Reader on it
sf.closeReader(true);
sfs.add(sf);
}
// Return the compacted files
return sfs;
}
// Do the steps necessary to complete the compaction.
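// Perform the steps needed to complete this compaction:
// move the finished files into the correct place, create the StoreFiles and Readers, and return the StoreFile list sfs
sfs = moveCompatedFilesIntoPlace(cr, newFiles);
// Write the compaction record to the WAL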
writeCompactionWalRec