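Goals
When the address list of the ZooKeeper (ZK) ensemble changes, three goals should be met:
- clients do not need to be restarted;
- after the change, the load on every server in the ensemble (its number of client connections) is balanced;
- unnecessary client migration is avoided as far as possible.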
When the set of servers changes, we would like to update the server list stored by clients without restarting the clients. Moreover, assuming that the number of clients per server is the same (in expectation) in the old configuration (as guaranteed by the current list shuffling for example), we would like to re-balance client connections across the new set of servers in a way that a) the number of clients per server is the same for all servers (in expectation) and b) there is no excessive/unnecessary client migration.
It is simple to achieve (a) without (b) - just re-shuffle the new list of servers at every client. But this would create unnecessary migration, which we’d like to avoid.
We propose a simple probabilistic migration scheme that achieves (a) and (b) - each client locally decides whether and where to migrate when the list of servers changes. The attached document describes the scheme and shows an evaluation of it in ZooKeeper. We also implemented re-balancing through a consistent-hashing scheme and show a comparison. We derived the probabilistic migration rules from a simple formula that we can also provide, if someone’s interested in the proof.
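Load-balancing strategy
Basic description: As shown in the figure below, S denotes the original ensemble, whose load is balanced, and S' denotes the new ensemble. M is the intersection of S and S', O is the set of servers that appear only in S, and N is the set of servers that appear only in S'.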
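Case 1: When |S'| > |S|, the average load per server drops. Clients connected to a server in O must migrate to N, while clients connected to a server in M migrate to N only with probability 1 - |S|/|S'| and otherwise stay where they are.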
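Case 2: When |S'| <= |S|, the average load per server rises or stays the same. Clients connected to a server in M stay where they are. Clients connected to a server in O migrate to M with probability P = |M|(|S| - |S'|) / (|S'| |O|) and to N with probability 1 - P. The denominator is |S'| |O|: this is the value that leaves every server in S' with the same expected number of clients.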
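Both probabilities can be checked with a short expected-value calculation. The sketch below assumes C clients in total (C is a symbol introduced here for illustration), spread evenly so that every server in S holds C/|S| clients, and requires every server in S' to end up with C/|S'| clients. It uses |S| = |M| + |O| and |S'| = |M| + |N|.

```latex
% Case 1 (|S'| > |S|): expected number of clients arriving at N
\[
|O|\frac{C}{|S|} + |M|\frac{C}{|S|}\left(1 - \frac{|S|}{|S'|}\right)
  = \frac{C\,(|O| + |M|)}{|S|} - \frac{|M|\,C}{|S'|}
  = C - \frac{|M|\,C}{|S'|}
  = |N|\,\frac{C}{|S'|}
\]

% Case 2 (|S'| <= |S|): clients leaving O must supply exactly what M is missing
\[
P \cdot |O|\frac{C}{|S|} = |M|\left(\frac{C}{|S'|} - \frac{C}{|S|}\right)
\quad\Longrightarrow\quad
P = \frac{|M|\,(|S| - |S'|)}{|S'|\,|O|}
\]
```

In case 1 each server in M keeps (C/|S|)(|S|/|S'|) = C/|S'| clients, and the last expression shows that each server in N also receives C/|S'| on average, so the new ensemble is balanced.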
Implementation: the updateServerList method of StaticHostProvider.
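The two cases can be illustrated with a minimal Python sketch. It is not the StaticHostProvider code: the names choose_server and simulate are hypothetical, each client here simply picks one target server uniformly at random within the chosen set, and the case 2 probability is taken as |M|(|S| - |S'|) / (|S'| |O|), the value that gives every server in S' the same expected share.

```python
import random
from collections import Counter


def choose_server(old, new, current, rng=random):
    """Return the server a client on `current` should use after old -> new."""
    old_set, new_set = set(old), set(new)
    m = sorted(old_set & new_set)  # M: servers present in both lists
    o = old_set - new_set          # O: servers only in the old list
    n = sorted(new_set - old_set)  # N: servers only in the new list

    if len(new_set) > len(old_set):
        # Case 1: the ensemble grew. Clients on M stay with probability
        # |S|/|S'|; everyone else (including all clients on O) moves to N.
        if current in new_set and rng.random() < len(old_set) / len(new_set):
            return current
        return rng.choice(n)

    # Case 2: the ensemble shrank or kept its size. Clients on M stay put.
    if current in new_set:
        return current
    # Clients on O go to M with probability P, otherwise to N.
    p_to_m = len(m) * (len(old_set) - len(new_set)) / (len(new_set) * len(o))
    return rng.choice(m if rng.random() < p_to_m else n)


def simulate(old, new, clients_per_server=10_000, seed=0):
    """Start with an even load on `old` and count clients per server after the change."""
    rng = random.Random(seed)
    counts = Counter()
    for server in old:
        for _ in range(clients_per_server):
            counts[choose_server(old, new, server, rng)] += 1
    return counts
```

For instance, simulate(["a", "b", "c"], ["b", "c", "d", "e", "f"]) starts with 30,000 clients on three servers and should leave roughly 6,000 on each of the five new servers, with only the clients that had to move being migrated.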