HBase:On the number of column families

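1.Statement

This post is mainly for my own study and review. It is a translation of, and my notes on, the "On the number of column families" section of the official documentation.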

2.On the number of column families

HBase currently does not do well with anything above two or three column families so keep the number of column families in your schema low. Currently, flushing is done on a per Region basis so if one column family is carrying the bulk of the data bringing on flushes, the adjacent families will also be flushed even though the amount of data they carry is small. When many column families exist the flushing interaction can make for a bunch of needless i/o (To be addressed by changing flushing to work on a per column family basis). In addition, compactions triggered at table/region level will happen per store too.

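In other words: HBase currently handles more than two or three column families poorly, so keep the number of column families in your table schema small. Flushes work per region, so when one column family holds most of the data and triggers a flush, the neighboring column families are flushed too, even if they hold very little data. With many column families, this interaction causes a lot of needless I/O (the documentation notes that this is to be addressed by making flushes work per column family). In addition, a compaction triggered at the table or region level runs on every store, that is, once per column family.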

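1.Keep the number of column families in an HBase table to no more than two or three; the fewer the better

2.A flush triggered by one column family also flushes the other column families in the same region, which wastes I/O

3.A compaction triggered at the table/region level runs on every store (one store per column family)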
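The flush interaction in points 1 and 2 can be illustrated with a small simulation. This is only a toy model, not HBase code: the threshold value, the family names `data` and `meta`, and the `simulate` function are all assumptions made for illustration. It compares a region-level flush (everything is written out once the region's total buffered size crosses the threshold) with a hypothetical per-family flush.

```python
from collections import defaultdict

FLUSH_THRESHOLD = 128  # hypothetical size units (think MB) for the whole region


def simulate(writes, per_region=True, threshold=FLUSH_THRESHOLD):
    """Return {family: [flushed file sizes]} for a stream of (family, size) writes."""
    memstore = defaultdict(int)  # buffered size per column family
    files = defaultdict(list)    # sizes of the files written by each flush

    for family, size in writes:
        memstore[family] += size
        if per_region:
            # Region-level flush: once the region's total crosses the
            # threshold, every non-empty family is written out.
            if sum(memstore.values()) >= threshold:
                for fam, buffered in memstore.items():
                    if buffered:
                        files[fam].append(buffered)
                memstore.clear()
        else:
            # Hypothetical per-family flush: only the family that crossed
            # the threshold is written out.
            if memstore[family] >= threshold:
                files[family].append(memstore[family])
                memstore[family] = 0
    return dict(files)


# One heavy family ("data") and one light family ("meta").
writes = [("data", 16), ("meta", 1)] * 80

region_level = simulate(writes, per_region=True)
family_level = simulate(writes, per_region=False)

print("meta files, region-level flush:", region_level["meta"])
print("meta files, per-family flush:  ", family_level.get("meta", []))
```

In this model the light `meta` family ends up with as many flushed files as the heavy `data` family under region-level flushing, each only a few units in size, while under the hypothetical per-family policy it is never flushed at all. Those tiny files are the needless I/O the documentation describes.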

Try to make do with one column family if you can in your schemas. Only introduce a second and third column family in the case where data access is usually column scoped; i.e. you query one column family or the other but usually not both at the one time.

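Use a single column family in your schema if you can. Introduce a second or third column family only when data access is usually scoped to a column family, that is, when a query reads one column family or the other but usually not both at the same time.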
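A rough way to see when a second column family pays off is to count how much data a scan has to read. The sketch below is a deliberately simplified model: the row count, the part sizes, the family names, and the `bytes_scanned` function are hypothetical, and it assumes a store is read in full whenever it holds something the query wants (ignoring block indexes, filters, and caching).

```python
# Hypothetical table: each row has a small "profile" part and a large "body" part.
ROWS = 1000
PROFILE_BYTES = 100
BODY_BYTES = 10_000


def bytes_scanned(layout, wanted):
    """Bytes a full scan must read to return the `wanted` parts.

    `layout` maps each column family name to the parts it stores. A family's
    store is read in full if it holds at least one wanted part.
    """
    sizes = {"profile": PROFILE_BYTES, "body": BODY_BYTES}
    total = 0
    for family, parts in layout.items():
        if wanted & set(parts):
            total += ROWS * sum(sizes[p] for p in parts)
    return total


one_family = {"d": ["profile", "body"]}
two_families = {"p": ["profile"], "b": ["body"]}

profile_only_one = bytes_scanned(one_family, {"profile"})
profile_only_two = bytes_scanned(two_families, {"profile"})
both_parts_two = bytes_scanned(two_families, {"profile", "body"})

print("profile only, one family:  ", profile_only_one)
print("profile only, two families:", profile_only_two)
print("both parts, two families:  ", both_parts_two)
```

Under these assumptions, splitting helps only the query that reads `profile` alone. A query that needs both parts reads the same amount of data either way, which matches the advice to add a family only when access is usually scoped to one family at a time.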

3.Cardinality of ColumnFamilies

Where multiple ColumnFamilies exist in a single table, be aware of the cardinality (i.e., number of rows). If ColumnFamilyA has 1 million rows and ColumnFamilyB has 1 billion rows, ColumnFamilyA’s data will likely be spread across many, many regions (and RegionServers). This makes mass scans for ColumnFamilyA less efficient.

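When a single table has multiple column families, pay attention to their cardinality (that is, the number of rows). If column family A has 1 million rows and column family B has 1 billion rows, column family A's data will likely be spread across many regions (and RegionServers), because a region covers a row-key range of the whole table and therefore contains every column family. This makes mass scans of column family A less efficient.

In short, the more column families a table has, the slower and less efficient scans tend to be, so use as few column families as possible, for example just one.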
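The effect of mismatched cardinality can be estimated with simple arithmetic. The numbers below are assumptions for illustration only: 1000 bytes per row in each family, a region split size of 10 GB, and row keys of column family A spread evenly over the key space. The `region_count` helper is hypothetical and is not part of any HBase API.

```python
REGION_BYTES = 10_000_000_000  # assumed region split size (10 GB)
BYTES_PER_ROW = 1_000          # assumed average row size in each family


def region_count(rows_by_family):
    """Estimate how many regions a table needs for the given row counts."""
    total_bytes = sum(rows * BYTES_PER_ROW for rows in rows_by_family.values())
    # Integer ceiling division, with at least one region.
    return max(1, -(-total_bytes // REGION_BYTES))


rows_a = 1_000_000      # ColumnFamilyA: 1 million rows
rows_b = 1_000_000_000  # ColumnFamilyB: 1 billion rows

regions_alone = region_count({"A": rows_a})
regions_shared = region_count({"A": rows_a, "B": rows_b})

print("regions if A had its own table:", regions_alone)
print("regions when A shares a table with B:", regions_shared)
print("A rows per region when shared: about", rows_a // regions_shared)
```

With these assumed sizes, column family A would fit in a single region on its own, but sharing a table with B spreads its rows over roughly a hundred regions, so a full scan of A has to visit all of them to collect a few thousand rows from each.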

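4.Summary

1.The number of column families in a table directly affects the efficiency of queries, updates, and inserts

2.A table should have no more than two or three column families, ideally one

3.When one column family triggers a flush, the other column families in the same region are flushed as well, and the extra store files this produces lead to additional compaction work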

The above is purely my personal understanding. If you find any problems, please contact me!
