HBase splits: region split policy

part I:split policy

  If you have used an indexing tool such as Lucene, you know it has settings that control how many docs trigger merging several segments into a larger one, and whether segment files that are already large enough are frozen at a fixed size. These cases are all similar to HBase's region merge capability (called online merge?), described in part III below.

  In contrast, HBase has a 'region split policy', which Lucene does not. A split happens when a region has grown so large that performance drops, or in similar cases, for example when re-merging a big region would take a lot of time. So it is a sensible feature!

 

  In 0.94.2 there are two split policies, as described below:

policy: ConstantSizeRegionSplitPolicy

  trigger:
  -all stores that belong to the region are splittable;
  -the size of one of the stores is bigger than the max file size (hbase.hregion.max.filesize)

  split point:
  -use the largest store's split point

  feature/use case:
  -a constant size is used as the threshold to check
  -suitable for predictable data growth with a pre-split table

policy: IncreasingToUpperBoundRegionSplitPolicy (default)

  trigger:
  -all stores that belong to the region are splittable (there is a bug in this version, see [1])
  -the size of one of the stores is bigger than A, where
   A = min(max file size, C^2 * flush size),
   C = number of regions of the same table on this RS (region server).
   So on one RS, all regions of a table share the same value A when the split size to check is computed.

  split point:
  -same as above

  feature/use case:
  -the class extends the policy above, but overrides how the threshold is computed
  -handles the dynamic case; fits data whose size is unpredictable in the first period

  From the trigger above we know that once the region count grows to 9, A reaches its upper bound and this policy falls BACK to the behavior of the policy above!
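
To make the threshold A above concrete, here is a minimal sketch of the formula in plain Java. It does not use the HBase API. The class and method names are made up for illustration, and the 128 MB flush size and 10 GB max file size are assumed values (I believe they match the 0.94 defaults, but check your own configuration).

```java
// Hypothetical helper, not part of HBase: it only replays the formula
// A = min(maxFileSize, C^2 * flushSize) from the text above.
final class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    /**
     * @param regionCount C, the number of regions of the same table on this RS
     * @param flushSize   memstore flush size in bytes
     * @param maxFileSize upper bound (max file size) in bytes
     */
    static long sizeToCheck(int regionCount, long flushSize, long maxFileSize) {
        if (regionCount <= 0) {
            // Assumption of this sketch: with no regions counted, use the upper bound.
            return maxFileSize;
        }
        long grown = flushSize * ((long) regionCount * regionCount);
        return Math.min(maxFileSize, grown);
    }

    public static void main(String[] args) {
        long flushSize = 128 * MB;     // assumed flush size
        long maxFileSize = 10240 * MB; // assumed max file size (10 GB)
        for (int c = 1; c <= 10; c++) {
            long a = sizeToCheck(c, flushSize, maxFileSize);
            System.out.println("C=" + c + " -> A=" + (a / MB) + " MB");
        }
    }
}
```

With these assumed values A grows as 128 MB, 512 MB, 1152 MB and so on up to 8192 MB at C=8. At C=9 the product would be 10368 MB, so A is capped at 10240 MB, which is where the policy starts to behave like the constant-size one.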

 

 

  split point computation

-exclude the meta table (I have blogged about this in previous topics)

-retrieve the largest store file

-get the middle block of the file

-create a 'rowkey' from the middle key (this is the target split point) TODO
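
The steps above can be sketched with a toy model in plain Java. This is not HBase code: FileModel and splitPoint are hypothetical names, and a store file is reduced to its size plus the first row key of each block. As far as I know the real implementation reads the middle key from the file's block index, so treat this only as an illustration of the idea.

```java
import java.util.List;

// Toy model of the split point computation; all names here are hypothetical.
final class SplitPointSketch {

    /** Stand-in for a store file: total size and the first row key of each block, sorted. */
    static final class FileModel {
        final long sizeBytes;
        final List<String> blockFirstRows;

        FileModel(long sizeBytes, List<String> blockFirstRows) {
            this.sizeBytes = sizeBytes;
            this.blockFirstRows = blockFirstRows;
        }
    }

    /** Returns the row key to split at, or null when no split point is available. */
    static String splitPoint(boolean isMetaTable, List<FileModel> files) {
        // Step 1: the meta table is excluded.
        if (isMetaTable || files.isEmpty()) {
            return null;
        }
        // Step 2: retrieve the largest store file.
        FileModel largest = files.get(0);
        for (FileModel f : files) {
            if (f.sizeBytes > largest.sizeBytes) {
                largest = f;
            }
        }
        // Step 3: take the middle block of that file.
        List<String> rows = largest.blockFirstRows;
        if (rows.isEmpty()) {
            return null;
        }
        // Step 4: the row key of the middle block is the target split point.
        return rows.get(rows.size() / 2);
    }
}
```

For two files of 100 and 300 bytes whose blocks start at rows (a, c, e) and (b, f, k, p, t), the sketch picks the second file and returns k as the split point.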

 

part II:split principle

 See [2] or look into HBaseAdmin#split() for a bird's-eye view.




 

 

part III: merge regions

 As of this version there is only an offline merge capability, namely util.Merge. If you want to use online merge, see the 'online merge' issue, which will be fixed in 0.95 or 0.98.

 

part IV:conclusions

 In general I prefer the constant-size policy: it is a simple, controllable solution, provided you pre-split the table when creating it.

But you must set the following property to the value 'ConstantSizeRegionSplitPolicy':

hbase.regionserver.region.split.policy
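
As a sketch, the setting could look like this in hbase-site.xml. It is a config fragment rather than code, and it assumes the value has to be the fully qualified class name, so the package org.apache.hadoop.hbase.regionserver is written out.

```xml
<!-- hbase-site.xml: select the constant-size split policy cluster-wide -->
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
</property>
```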

 

ref:

[1] IncreasingToUpperBoundRegionSplitPolicy.shouldSplit() should check all the stores before returning.

hbase - how many regions are fit for a table when pre-splitting or keeping running

[2] Apache HBase Region Splitting and Merging (detailed split principle) 

 

 
 
 
 
 
 
 