The algorithm works roughly as follows:
1. Iterate over the set of all store files, from oldest to youngest.
2. If more than 3 (hbase.hstore.compactionThreshold) store files remain, and the current store file is more than 20% larger than the sum of all younger store files, and it is also larger than the memstore flush size, skip it, move on to the next (younger) store file, and repeat step 2.
3. Once any of the conditions in step 2 no longer holds, the store files from the current one through the youngest are the ones that will be merged together. If fewer than compactionThreshold files remain, no merge is performed. There is also a limit that prevents more than 10 (hbase.hstore.compaction.max) store files from being merged in a single compaction.
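The selection rule above can be sketched in a few lines of Java. This is only an illustration of the steps as described here, not the actual HBase Store code: the class name CompactionSelectionSketch, the method select, and its parameters are all hypothetical, sizes are plain numbers, and the choice to keep the oldest files of the selection when the max limit applies is an assumption, since the description does not say which files are dropped.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the selection steps described above; not HBase code.
public class CompactionSelectionSketch {

    /**
     * @param sizes     store file sizes, ordered oldest to youngest
     * @param minFiles  hbase.hstore.compactionThreshold (3 in the text)
     * @param maxFiles  hbase.hstore.compaction.max (10 in the text)
     * @param ratio     1.2 models the "20% larger" rule
     * @param flushSize memstore flush size, in the same unit as sizes
     * @return sizes of the files selected for merging (empty if none)
     */
    public static List<Long> select(List<Long> sizes, int minFiles, int maxFiles,
                                    double ratio, long flushSize) {
        int n = sizes.size();
        // suffixSum[i] = total size of files i..n-1, so suffixSum[i + 1]
        // is the sum of all files younger than file i.
        long[] suffixSum = new long[n + 1];
        for (int i = n - 1; i >= 0; i--) {
            suffixSum[i] = suffixSum[i + 1] + sizes.get(i);
        }

        // Step 2: skip older files while all three conditions hold.
        int start = 0;
        while (n - start > minFiles
                && sizes.get(start) > ratio * suffixSum[start + 1]
                && sizes.get(start) > flushSize) {
            start++;
        }

        // Step 3: too few files left means no merge at all.
        if (n - start < minFiles) {
            return Collections.emptyList();
        }
        // Assumption: when the limit applies, keep the oldest maxFiles
        // files of the selection.
        int end = Math.min(n, start + maxFiles);
        return new ArrayList<>(sizes.subList(start, end));
    }
}
```

For example, with sizes 1000, 300, 70, 60, 50, 40 (oldest first), a threshold of 3, and a flush size of 64, the first two files are skipped because each is more than 1.2 times the sum of the younger files, and the remaining four are selected.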
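The compaction-related configuration parameters can be viewed or set in hbase-default.xml or hbase-site.xml.
Update 2011/7/11: the code comment from the method that selects which store files go into a minor compaction: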
//////////////////////////////////////////////////////////////////////////////
// Compaction
//////////////////////////////////////////////////////////////////////////////
/**
* Compact the StoreFiles. This method may take some time, so the calling
* thread must be able to block for long periods.
*
* <p>During this time, the Store can work as usual, getting values from
* StoreFiles and writing new StoreFiles from the memstore.
*
* Existing StoreFiles are not destroyed un

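In summary, this algorithm describes the rules HBase uses to choose store files for compaction, based mainly on the number of files, their sizes, and the memstore flush size. While more than 3 store files remain, a file that is more than 20% larger than the sum of all younger files and also larger than the memstore flush size is skipped; the remaining younger files are merged. At most 10 files are merged in one compaction. The configurable parameters include hbase.hstore.compactionThreshold and hbase.hstore.compaction.max.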