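Region Splitting
Purpose: as a table's data grows, the table is split into multiple regions so that those regions can be distributed across multiple RegionServers for load balancing.
Split policy in HBase 2.0.5: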
<property>
<name>hbase.regionserver.region.split.policy</name>
<value>org.apache.hadoop.hbase.regionserver.SteppingSplitPolicy</value>
<description>
A split policy determines when a region should be split. The various
other split policies that are available currently are BusyRegionSplitPolicy,
ConstantSizeRegionSplitPolicy, DisabledRegionSplitPolicy,
DelimitedKeyPrefixRegionSplitPolicy, KeyPrefixRegionSplitPolicy, and
SteppingSplitPolicy. DisabledRegionSplitPolicy blocks manual region splitting.
</description>
</property>
Split threshold logic (SteppingSplitPolicy):
/**
* @return flushSize * 2 if there's exactly one region of the table in question
* found on this regionserver. Otherwise max file size.
* This allows a table to spread quickly across servers, while avoiding creating
* too many regions.
* tableRegionsCount: the number of regions of the current table hosted on the current RegionServer.
* initialSize = flushSize (128 MB by default) * 2
* getDesiredMaxFileSize(): determined by hbase.hregion.max.filesize, 10 GB by default.
*/
@Override
protected long getSizeToCheck(final int tableRegionsCount) {
return tableRegionsCount == 1 ? this.initialSize : getDesiredMaxFileSize();
}
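The threshold choice above can be tried out in isolation. The following is a minimal standalone sketch, not HBase code: the class name SplitSizeSketch and its constants are hypothetical, and it assumes the defaults quoted in these notes (128 MB flush size, 10 GB max file size).

```java
public class SplitSizeSketch {
    // Assumed defaults, taken from the notes above (not read from a real HBase configuration).
    static final long FLUSH_SIZE = 128L * 1024 * 1024;            // 128 MB
    static final long MAX_FILE_SIZE = 10L * 1024 * 1024 * 1024;   // 10 GB

    // Mirrors the ternary in getSizeToCheck: a single region of the table on this
    // server uses the small initial size, any other count uses the max file size.
    static long sizeToCheck(int tableRegionsCount) {
        long initialSize = 2 * FLUSH_SIZE;
        return tableRegionsCount == 1 ? initialSize : MAX_FILE_SIZE;
    }

    public static void main(String[] args) {
        for (int regions = 1; regions <= 3; regions++) {
            long mb = sizeToCheck(regions) / (1024 * 1024);
            System.out.println(regions + " region(s) -> split threshold " + mb + " MB");
        }
    }
}
```

With these assumed defaults, one region gives a 256 MB threshold and any other count gives 10240 MB.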
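initialSize is read from hbase.increasing.policy.initial.size, which is not set by default. When it is unset, initialSize is derived from the current table's MemStoreFlushSize (the MEMSTORE_FLUSHSIZE attribute that can be set when the table is created, which is also unset by default).
If neither is set, the final fallback is: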
if (initialSize <= 0) {
initialSize = 2 * conf.getLong(HConstants.HREGION_MEMSTORE_FLUSH_SIZE,
TableDescriptorBuilder.DEFAULT_MEMSTORE_FLUSH_SIZE);
}
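The lookup order for initialSize can be sketched as a plain function. This is an illustration under assumptions, not the HBase implementation: the class InitialSizeSketch and the method resolveInitialSize are hypothetical, a non-positive argument stands for "not configured", and doubling the table-level flush size is assumed by analogy with the fallback shown above.

```java
public class InitialSizeSketch {
    // Assumed global default flush size (128 MB), as quoted in the notes.
    static final long DEFAULT_FLUSH_SIZE = 128L * 1024 * 1024;

    // configuredInitialSize: value of hbase.increasing.policy.initial.size, <= 0 if unset.
    // tableFlushSize: the table's MEMSTORE_FLUSHSIZE, <= 0 if unset (assumption).
    // globalFlushSize: the cluster-wide memstore flush size.
    static long resolveInitialSize(long configuredInitialSize, long tableFlushSize, long globalFlushSize) {
        if (configuredInitialSize > 0) {
            return configuredInitialSize;          // explicit setting wins
        }
        long initialSize = 2 * tableFlushSize;     // table-level value, doubled (assumption)
        if (initialSize <= 0) {
            initialSize = 2 * globalFlushSize;     // final fallback from the snippet above
        }
        return initialSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        System.out.println(resolveInitialSize(-1, -1, DEFAULT_FLUSH_SIZE) / mb + " MB");
        System.out.println(resolveInitialSize(-1, 64 * mb, DEFAULT_FLUSH_SIZE) / mb + " MB");
        System.out.println(resolveInitialSize(512 * mb, 64 * mb, DEFAULT_FLUSH_SIZE) / mb + " MB");
    }
}
```

With nothing configured, the sketch lands on 2 * 128 MB = 256 MB, which is the figure used in the rest of these notes.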
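Summary:
If the current RegionServer hosts exactly one region of the table, that region is split in two once it reaches 256 MB.
After that, once the RegionServer hosts more than one region of the table, the next split happens only when the total HFile size of one of those regions exceeds 10 GB.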