In 《HBase源码分析之HRegion上compact流程分析(二)》, we did not cover the compact() method of CompactionContext, which is what actually carries out the compaction. Let's now analyze its concrete implementation.
First, CompactionContext represents the context information for a compaction. It is just an abstract class, and its compact() method has no implementation. The code is as follows:
/**
* Runs the compaction based on current selection. select/forceSelect must have been called.
* @return The new file paths resulting from compaction.
*/
public abstract List<Path> compact() throws IOException;
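As the Javadoc says, select() or forceSelect() must have run before compact() is called. Here is a minimal usage sketch of that contract (not verbatim HBase code; filesCompacting, isUserCompaction, mayUseOffPeak and forceMajor are illustrative variables, though the method signatures match the 1.x source):
// The context is obtained from the store's StoreEngine.
CompactionContext compaction = storeEngine.createCompaction();
// select() picks the files to compact and builds the internal CompactionRequest.
if (compaction.select(filesCompacting, isUserCompaction, mayUseOffPeak, forceMajor)) {
    // Only after a successful selection may compact() run; it returns the new file paths.
    List<Path> newFiles = compaction.compact();
}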
Now, let's find its implementing classes. There are two: DefaultCompactionContext and StripeCompaction. Today we'll use DefaultCompactionContext as the example.
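Which of the two you actually get is decided by the configured store engine: the hbase.hstore.engine.class parameter selects the StoreEngine implementation, and each engine's createCompaction() returns its own CompactionContext subclass. A quick illustrative snippet (DefaultStoreEngine, which yields DefaultCompactionContext, is used when nothing is configured):
// Illustrative only: with the stripe engine configured, createCompaction()
// returns StripeCompaction instead of DefaultCompactionContext.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.hstore.engine.class",
    "org.apache.hadoop.hbase.regionserver.StripeStoreEngine");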
First, look at the implementation of the compact() method in DefaultCompactionContext:
@Override
public List<Path> compact() throws IOException {
return compactor.compact(request);
}
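The compactor field is wired up by DefaultStoreEngine when the store's components are created. Below is a minimal sketch of that wiring, modeled on the 1.x DefaultStoreEngine (the exact reflection helper and constructor arguments may differ between versions):
// Read the compactor class from configuration, falling back to DefaultCompactor.
String className = conf.get("hbase.hstore.defaultengine.compactor.class",
    DefaultCompactor.class.getName());
try {
    // Instantiate it reflectively with a (Configuration, Store) constructor.
    compactor = ReflectionUtils.instantiateWithCustomCtor(className,
        new Class[] { Configuration.class, Store.class }, new Object[] { conf, store });
} catch (Exception e) {
    throw new IOException("Unable to load configured compactor '" + className + "'", e);
}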
As this sketch shows, the compactor class can be overridden with the hbase.hstore.defaultengine.compactor.class parameter, but the default implementation is DefaultCompactor. Next, let's look at its compact() implementation:
/**
 * Do a minor/major compaction on an explicit set of storefiles from a Store.
 */
public List<Path> compact(final CompactionRequest request) throws IOException {
// Get the file details fd from the request; fd is of type FileDetails
FileDetails fd = getFileDetails(request.getFiles(), request.isAllFiles());
// Construct the compaction progress tracker, CompactionProgress
this.progress = new CompactionProgress(fd.maxKeyCount);
// Find the smallest read point across all the Scanners.
// i.e. the lowest point at which data can still be read by an ongoing scanner
long smallestReadPoint = getSmallestReadPoint();
List<StoreFileScanner> scanners;
Collection<StoreFile> readersToClose;
// The hbase.regionserver.compaction.private.readers parameter decides whether to use private readers
if (this.conf.getBoolean("hbase.regionserver.compaction.private.readers", false)) {
// clone all StoreFiles, so we'll do the compaction on an independent copy of StoreFiles,
// HFileFiles, and their readers
// Create a StoreFile list, readersToClose, sized to the number of files in the request
readersToClose = new ArrayList<StoreFile>(request.getFiles().size());
// Add a copy of each file to be compacted to the readersToClose list
for (StoreFile f : request.getFiles()) {
readersToClose.add(new StoreFile(f));
}
// Create the file scanners from readersToClose, i.e. from the copies of the files to be compacted
scanners = createFileScanners(readersToClose, smallestReadPoint);
} else {
// Without private readers, readersToClose is just an empty list
readersToClose = Collections.emptyList();
// Create the file scanners directly from the files to be compacted in the request
scanners = createFileScanners(request.getFiles(), smallestReadPoint);
}
StoreFile.Writer writer = null;
List<Path> newFiles = new ArrayList<Path>();
boolean cleanSeqId = false;
IOException e = null;
try {
InternalScanner scanner = null;
try {
/* Include deletes, unless we are doing a compaction of all files */
// Determine the scan type scanType:
// for a MAJOR or ALL_FILES compaction request, scanType is COMPACT_DROP_DELETES;
// for a MINOR compaction request, scanType is COMPACT_RETAIN_DELETES.
ScanType scanType =
request.isAllFiles() ? ScanType.COMPACT_DROP_DELETES : ScanType.COMPACT_RETAIN_DELETES;
// Let coprocessors create the scanner first: preCreateCoprocScanner() invokes the preCompactScannerOpen() hook, if any coprocessors are registered
scanner = preCreateCoprocScanner(request, scanType, fd.earliestPutTs, scanners);
if (scanner == null) {
// If no coprocessor created a scanner, create one with createScanner()
scanner = createScanner(store, scanners, scanType, smallestReadPoint, fd.earliestPutTs);
}
// Let coprocessors wrap or replace the scanner: postCreateCoprocScanner() invokes the preCompact() hook
scanner = postCreateCoprocScanner(request, scanType, scanner);
if (scanner == null) {
// NULL scanner returned from coprocessor hooks means skip normal processing.
return newFiles;
}
// Create the writer even if no kv(Empty store file is also ok),
// because we need record the max seq id for the store file, see HBASE-6059
// If there is a minimum seqId to keep, lower smallestReadPoint to it and enable seqId cleaning
if(fd.minSeqIdToKeep > 0) {
smallestReadPoint = Math.min(fd.minSeqIdToKeep, smallestReadPoint);
cleanSeqId = true;
}
// When all MVCC readpoints are 0, don't write them.
// See HBASE-8166, HBASE-12600, and HBASE-13389.
// Call HStore's createWriterInTmp() method to get a writer for the new store file
writer = store.createWriterInTmp(fd.maxKeyCount, this.compactionCompression, true,
fd.maxMVCCReadpoint > 0, fd.maxTagsLength > 0);
// Call performCompaction() to carry out the actual merge
boolean finished = performCompaction(scanner, writer, smallestReadPoint, cleanSeqId);
// If the compaction did not finish
if (!finished) {
// close the writer
writer.close();
// and delete the writer's temporary file