Allow regions of specific table to be load-balanced

Description:

In our experience, a cluster can be well balanced overall and yet one table's regions may be badly concentrated on a few region servers.
For example, one table has 839 regions (380 regions at the time of table creation), of which 202 are on one server.

It would be desirable for the load balancer to distribute the regions of each specified table evenly across the cluster. Each such table has a number of regions many times the cluster size.
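
To make the skew concrete, here is a minimal, self-contained Java sketch (not HBase code; the server names and counts are invented to mirror the 839/202 example above) that counts how many regions of one table each server holds and compares that to the per-table average:

    import java.util.*;

    // Hypothetical skew check: given one table's region -> server assignment,
    // report how each server's share compares to the per-table average.
    public class TableSkewCheck {
        public static void main(String[] args) {
            // Toy assignment standing in for real cluster state:
            // the first 202 regions pile up on rs1, the rest spread over rs2..rs10
            Map<String, String> regionToServer = new HashMap<>();
            for (int i = 0; i < 839; i++) {
                regionToServer.put("region-" + i, i < 202 ? "rs1" : "rs" + (2 + i % 9));
            }

            // Count this table's regions per server
            Map<String, Integer> perServer = new TreeMap<>();
            for (String server : regionToServer.values()) {
                perServer.merge(server, 1, Integer::sum);
            }

            double average = (double) regionToServer.size() / perServer.size();
            System.out.printf("average regions/server for this table: %.1f%n", average);
            perServer.forEach((server, count) ->
                System.out.printf("%s: %d regions (%.1fx the average)%n",
                    server, count, count / average));
        }
    }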

 

Jonathan Gray's first comment was:

 

On cluster startup in 0.90, regions are assigned in one of two ways. By default, it will attempt to retain the previous assignment of the cluster. The other option, which I've also used, is round-robin, which will evenly distribute each table.

That plus the change to do round-robin on table create should probably cover per-table distribution fairly well.

I think the next step in the load balancer is a major effort to switch to something with more of a cost-based approach. I think ideally you don't need even distribution of each table; you want even distribution of load. If there is one hot table, it will get evenly balanced anyway.

One thing we could do is get rid of all random assignments and always try to do some kind of quick load balance or round-robin. It does seem like randomness always leads to one guy who gets an unfair share.
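
As a rough illustration of the round-robin idea Jonathan describes, here is a small Java sketch (hypothetical, not HBase's actual assignment code; server and region names are invented) that deals one table's regions out across the servers, so each table ends up evenly spread on its own:

    import java.util.*;

    // A minimal sketch of per-table round-robin assignment: each table's regions
    // are dealt across the servers like cards, one table at a time.
    public class RoundRobinAssign {

        static Map<String, String> assign(List<String> regions, List<String> servers) {
            Map<String, String> plan = new LinkedHashMap<>();
            for (int i = 0; i < regions.size(); i++) {
                plan.put(regions.get(i), servers.get(i % servers.size()));
            }
            return plan;
        }

        public static void main(String[] args) {
            List<String> servers = Arrays.asList("rs1", "rs2", "rs3");
            List<String> tableARegions = Arrays.asList("A,,1", "A,k1,2", "A,k2,3", "A,k3,4");
            // Assign each table independently so no single server hoards one table
            System.out.println(assign(tableARegions, servers));
            // {A,,1=rs1, A,k1,2=rs2, A,k2,3=rs3, A,k3,4=rs1}
        }
    }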

Matt Corgan suggested that consistent hashing could address this:

Have you guys considered using a consistent hashing method to choose which server a region belongs to? You would create ~50 buckets for each server by hashing serverName_port_bucketNum, and then hash the start key of each region into the buckets.

There are a few benefits:

  • when you add a server it takes an equal load from all existing servers
  • if you remove a server it distributes its regions equally to the remaining servers
  • adding a server does not cause all regions to shuffle like round robin assignment would
  • assignment is nearly random, but repeatable, so no hot spots
  • when a region splits the front half will stay on the same server, but the back half will usually be sent to another server

And a few drawbacks:

  • each server wouldn't end up with exactly the same number of regions, but they would be close
  • if a hot spot does end up developing, you can't do anything about it, at least not unless it supported a list of manual overrides
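
For reference, here is a minimal Java sketch of the bucketed consistent-hashing scheme Matt describes: roughly 50 virtual buckets per server placed on a hash ring by hashing serverName_port_bucketNum, and each region mapped to a bucket by the hash of its start key. This only illustrates the proposal; it is not anything implemented in HBase.

    import java.util.*;

    // Consistent hashing with virtual buckets, as described in the proposal above.
    public class ConsistentHashSketch {

        private final TreeMap<Integer, String> ring = new TreeMap<>();

        void addServer(String serverName, int port, int buckets) {
            for (int b = 0; b < buckets; b++) {
                // Hash serverName_port_bucketNum onto the ring
                ring.put((serverName + "_" + port + "_" + b).hashCode(), serverName);
            }
        }

        void removeServer(String serverName) {
            // Remaining buckets absorb this server's regions roughly evenly
            ring.values().removeIf(s -> s.equals(serverName));
        }

        String serverFor(String regionStartKey) {
            int h = regionStartKey.hashCode();
            Map.Entry<Integer, String> e = ring.ceilingEntry(h);
            return (e != null ? e : ring.firstEntry()).getValue();  // wrap around the ring
        }

        public static void main(String[] args) {
            ConsistentHashSketch ch = new ConsistentHashSketch();
            ch.addServer("rs1", 60020, 50);
            ch.addServer("rs2", 60020, 50);
            System.out.println(ch.serverFor("row-000042"));
            ch.addServer("rs3", 60020, 50);   // only ~1/3 of region mappings change
            System.out.println(ch.serverFor("row-000042"));
        }
    }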

Jonathan Gray explained why consistent hashing was not adopted:

I think consistent hashing would be a major step backwards for us and unnecessary because there is no cost of moving bits around in HBase. The primary benefit of consistent hashing is that it reduces the amount of data you have to physically move around. Because of our use of HDFS, we never have to move physical data around.

In your benefit list, we are already implementing almost all of these features, or if not, it is possible in the current architecture. In addition, our architecture is extremely flexible and we can do all kinds of interesting load balancing techniques related to actual load profiles not just #s of shards/buckets as we do today or as would be done with consistent hashing.

 

The fact that split regions open back up on the same server is actually an optimization in many cases: it reduces the amount of time the regions are offline, and when they come back online and do a compaction to drop references, all the files are more likely to be on the local DataNode rather than a remote one. In some cases, like time-series data, you may want the splits to move to different servers. I could imagine some configurable logic in there to ensure the bottom half goes to a different server (or maybe the top half would actually be more efficient to move away, since most of the time you'll write more to the bottom half and thus want the data locality / quick turnaround). There's likely going to be a bit of split rework in 0.92 to make it more like the ZK-based regions-in-transition.
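
The "configurable logic" imagined above might look roughly like the following Java sketch. It is purely hypothetical (placeDaughters and the server list are invented names; real split handling lives in the master and region server): keep one daughter region on the parent's server for locality, and optionally push the other daughter to a different server.

    import java.util.*;

    // Hypothetical post-split placement: one daughter stays local for a fast
    // reopen and local HDFS blocks; the other can be handed to another server,
    // which may help for time-series tables where splits otherwise pile up.
    public class SplitPlacementSketch {

        static String[] placeDaughters(String parentServer, List<String> servers,
                                       boolean moveSecondDaughterAway) {
            String first = parentServer;   // stays on the parent's server
            String second = parentServer;
            if (moveSecondDaughterAway && servers.size() > 1) {
                Random rnd = new Random();
                do {
                    second = servers.get(rnd.nextInt(servers.size()));
                } while (second.equals(parentServer));
            }
            return new String[]{first, second};
        }

        public static void main(String[] args) {
            String[] plan = placeDaughters("rs1", Arrays.asList("rs1", "rs2", "rs3"), true);
            System.out.println("daughter1 -> " + plan[0] + ", daughter2 -> " + plan[1]);
        }
    }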

As far as binding regions to servers between cluster restarts, this is already implemented and on by default in 0.90.

Consistent hashing also requires a fixed keyspace (right?) and that's a mismatch for HBase's flexibility in this regard.

 

 

For more discussion, see: https://issues.apache.org/jira/browse/HBASE-3373

 

A discussion of HBase region assignment (in Chinese): http://www.spnguru.com/?p=246
