When does a split happen?
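A region splits when the total size of all files in one of its stores exceeds a threshold. Note that the check is on the store's total size, not on the size of any single storefile.
When does a compaction happen?
The check is whether any store in the region has too many storefiles (more than hbase.hstore.blockingStoreFiles, 7 here); if so, a compaction is requested first.
A flush iterates over all stores of the region and flushes them one by one.
A compaction iterates over the region and compacts the stores that meet the criteria.
[b]1. After a flush, check whether a split or a compaction is needed[/b]
The split check first computes tableRegionsCount, the number of online regions of this table on the current regionserver.
It then loops over the region's stores to see whether any of them is too large. The threshold comes from getSizeToCheck; if a store's total size is greater than that value, the region needs to split.
getSizeToCheck first tests whether tableRegionsCount is 0. If it is, it returns hbase.hregion.max.filesize; otherwise it returns
Math.min(getDesiredMaxFileSize(), this.flushSize * (tableRegionsCount * tableRegionsCount)).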
boolean shouldCompact = region.flushcache();
// We just want to check the size
boolean shouldSplit = region.checkSplit() != null;
if (shouldSplit) {
  this.server.compactSplitThread.requestSplit(region);
} else if (shouldCompact) {
  server.compactSplitThread.requestCompaction(region, getName());
}

private long flushSize;

@Override
protected void configureForRegion(HRegion region) {
  super.configureForRegion(region);
  this.flushSize = region.getTableDesc() != null ?
      region.getTableDesc().getMemStoreFlushSize() :
      getConf().getLong(HConstants.HREGION_MEMSTORE_FLUSH_SIZE,
          HTableDescriptor.DEFAULT_MEMSTORE_FLUSH_SIZE);
}

@Override
protected boolean shouldSplit() {
  if (region.shouldForceSplit()) return true;
  boolean foundABigStore = false;
  // Get count of regions that have the same common table as this.region
  int tableRegionsCount = getCountOfCommonTableRegions();
  // Get size to check
  long sizeToCheck = getSizeToCheck(tableRegionsCount);
  for (Store store : region.getStores().values()) {
    // If any of the stores is unable to split (eg they contain reference files)
    // then don't split
    if ((!store.canSplit())) {
      return false;
    }
    // Mark if any store is big enough
    long size = store.getSize();
    if (size > sizeToCheck) {
      LOG.debug("ShouldSplit because " + store.getColumnFamilyName() +
          " size=" + size + ", sizeToCheck=" + sizeToCheck +
          ", regionsWithCommonTable=" + tableRegionsCount);
      foundABigStore = true;
      break;
    }
  }
  return foundABigStore;
}

/**
 * @return Region max size or <code>count of regions squared * flushsize</code>, which ever is
 * smaller; guard against there being zero regions on this server.
 */
long getSizeToCheck(final int tableRegionsCount) {
  return tableRegionsCount == 0 ? getDesiredMaxFileSize() :
      Math.min(getDesiredMaxFileSize(),
          this.flushSize * (tableRegionsCount * tableRegionsCount));
}

/**
 * @return Count of regions on this server that share the table this.region
 * belongs to
 */
private int getCountOfCommonTableRegions() {
  RegionServerServices rss = this.region.getRegionServerServices();
  // Can be null in tests
  if (rss == null) return 0;
  byte[] tablename = this.region.getTableDesc().getName();
  int tableRegionsCount = 0;
  try {
    List<HRegion> hri = rss.getOnlineRegions(tablename);
    tableRegionsCount = hri == null || hri.isEmpty() ? 0 : hri.size();
  } catch (IOException e) {
    LOG.debug("Failed getOnlineRegions " + Bytes.toString(tablename), e);
  }
  return tableRegionsCount;
}

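To get a feel for how the getSizeToCheck threshold grows, here is a small standalone sketch of the same formula. It is not HBase code: the class SplitSizeSketch and its method names are made up for illustration, and the 128 MB flush size and 10 GB max file size are assumed values that should be replaced with whatever your cluster is configured with.

```java
// Hypothetical sketch, not HBase source: it only mirrors the getSizeToCheck formula.
class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // maxFileSize stands in for getDesiredMaxFileSize(), flushSize for the memstore flush size.
    static long sizeToCheck(long maxFileSize, long flushSize, int tableRegionsCount) {
        if (tableRegionsCount == 0) {
            // Guard for "no regions of this table on this server": fall back to the max size.
            return maxFileSize;
        }
        long squared = (long) tableRegionsCount * tableRegionsCount;
        return Math.min(maxFileSize, flushSize * squared);
    }

    public static void main(String[] args) {
        long flushSize = 128 * MB;          // assumed flush size
        long maxFileSize = 10 * 1024 * MB;  // assumed max file size
        for (int count = 0; count <= 10; count++) {
            long threshold = sizeToCheck(maxFileSize, flushSize, count);
            System.out.println("regions=" + count + " sizeToCheck=" + (threshold / MB) + " MB");
        }
    }
}
```

With these assumed values the threshold is 128 MB for one region, 512 MB for two, 1152 MB for three, and it is capped at the 10 GB maximum from nine regions onward. So a table with few regions on a server splits early, and the split threshold rises quickly as the table spreads out.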
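[b]2. Split after compact[/b]
In CompactionRequest.run, once a compaction has completed, the code checks whether the store needs to be compacted again. The test is if (s.getCompactPriority() <= 0), where the priority is 7 minus the store's current storefile count; a value <= 0 means the store still has too many files, so another compaction is queued.
Otherwise a split is requested. CompactSplitThread.requestSplit tests if (shouldSplitRegion() && r.getCompactPriority() >= PRIORITY_USER). shouldSplitRegion() compares hbase.regionserver.regionSplitLimit (which caps the number of regions on the regionserver) with the number of regions currently online on that server: the split may go ahead only while the limit is greater than the online count. r.getCompactPriority() is the minimum of (7 - storefile count) over all stores of the region, so the second condition requires that value to be >= 1 for every store. Only when both conditions hold does the split proceed.
One open question: shouldn't the file size be checked before splitting? (In the requestSplit code below, r.checkSplit() still has to return a non-null split key, so the size check from part 1 appears to happen there.)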
boolean completed = r.compact(this);
long now = EnvironmentEdgeManager.currentTimeMillis();
LOG.info(((completed) ? "completed" : "aborted") + " compaction: " +
    this + "; duration=" + StringUtils.formatTimeDiff(now, start));
if (completed) {
  server.getMetrics().addCompaction(now - start, this.totalSize);
  // degenerate case: blocked regions require recursive enqueues
  if (s.getCompactPriority() <= 0) {
    server.compactSplitThread
        .requestCompaction(r, s, "Recursive enqueue");
  } else {
    // see if the compaction has caused us to exceed max region size
    server.compactSplitThread.requestSplit(r);
  }
}

public synchronized boolean requestSplit(final HRegion r) {
  // don't split regions that are blocking
  if (shouldSplitRegion() && r.getCompactPriority() >= PRIORITY_USER) {
    byte[] midKey = r.checkSplit();
    if (midKey != null) {
      requestSplit(r, midKey);
      return true;
    }
  }
  return false;
}

private boolean shouldSplitRegion() {
  return (regionSplitLimit > server.getNumberOfOnlineRegions());
}

this.regionSplitLimit = conf.getInt("hbase.regionserver.regionSplitLimit",
    Integer.MAX_VALUE);

public int getCompactPriority() {
  int count = Integer.MAX_VALUE;
  for (Store store : stores.values()) {
    count = Math.min(count, store.getCompactPriority());
  }
  return count;
}
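
The decision made after a compaction can be condensed into a small standalone sketch. This is only my reading of the code above, not HBase code: the class PostCompactionSketch, the Action enum and all method names are hypothetical, PRIORITY_USER is taken to be 1 as in the ">= 1" reading above, and the store priority is modelled as blockingStoreFiles minus the storefile count.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not HBase source: models the checks made after a compaction completes.
class PostCompactionSketch {
    static final int PRIORITY_USER = 1; // assumed value, matching the ">= 1" reading in the text

    enum Action { RECOMPACT, CHECK_SPLIT, NONE }

    // Store priority as described in the text: blocking limit minus current storefile count.
    static int storePriority(int blockingStoreFiles, int storeFileCount) {
        return blockingStoreFiles - storeFileCount;
    }

    // Region priority is the minimum store priority over all stores of the region.
    static int regionPriority(int blockingStoreFiles, List<Integer> storeFileCounts) {
        int min = Integer.MAX_VALUE;
        for (int count : storeFileCounts) {
            min = Math.min(min, storePriority(blockingStoreFiles, count));
        }
        return min;
    }

    static Action afterCompaction(int blockingStoreFiles, int compactedStoreFileCount,
                                  List<Integer> allStoreFileCounts,
                                  int regionSplitLimit, int onlineRegions) {
        // The compacted store is still blocked: queue another compaction.
        if (storePriority(blockingStoreFiles, compactedStoreFileCount) <= 0) {
            return Action.RECOMPACT;
        }
        // Split is considered only below the region limit and when no store is blocked.
        boolean underLimit = regionSplitLimit > onlineRegions;
        boolean notBlocked = regionPriority(blockingStoreFiles, allStoreFileCounts) >= PRIORITY_USER;
        // CHECK_SPLIT means "go on to the split check"; the size test still decides.
        return (underLimit && notBlocked) ? Action.CHECK_SPLIT : Action.NONE;
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(3, 2);
        System.out.println(afterCompaction(7, 3, counts, Integer.MAX_VALUE, 10));
    }
}
```

In this model CHECK_SPLIT only means that the split path is entered. Whether the region actually splits still depends on the size-based check from part 1.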