LintCode-519: Consistent Hashing

Description
A common way to shard a database horizontally is to take the id modulo the total number of database servers n and use the result to pick the machine. The downside of this approach is that as the data keeps growing, we need to add database servers. When n changes to n+1, almost all of the data has to be moved, which is not consistent. To reduce the problems caused by this naive hash method (% n), a new hashing algorithm emerged: Consistent Hashing. There are many ways to implement this algorithm. Here we implement a simple version of it.
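To make the cost of the naive % n scheme concrete, here is a minimal sketch that counts how many ids change servers when one server is added. The function name countMovedIds and the choice of a contiguous id range starting at 0 are assumptions made for illustration; they are not part of the problem.

```cpp
// Count how many ids in [0, total) land on a different server when the
// server count grows from n to n + 1 under naive "id % n" sharding.
// Hypothetical helper, for illustration only.
int countMovedIds(int total, int n) {
    int moved = 0;
    for (int id = 0; id < total; ++id) {
        if (id % n != id % (n + 1)) {
            ++moved;  // this id would have to be migrated
        }
    }
    return moved;
}
```

For 360 ids, going from 3 to 4 servers gives countMovedIds(360, 3) == 270, so three quarters of the ids move.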
Take the id modulo 360. If there are 3 machines at the beginning, let them be responsible for the three ranges 0~119, 120~239, and 240~359. To locate an id, compute the id modulo 360, check which range the result falls in, and go to the machine that owns that range.
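The lookup step can be sketched as follows. The helper findMachine, its row layout {start, end, machine id}, and the -1 return value for "not found" are assumptions for illustration; the problem itself only asks for the interval table.

```cpp
#include <vector>

// Hypothetical helper: return the machine responsible for `id`, given rows
// of {start, end, machineId} that together cover 0..359.
// Assumes id >= 0. Returns -1 if no row contains the slot.
int findMachine(const std::vector<std::vector<int>>& intervals, long long id) {
    int slot = static_cast<int>(id % 360);  // map the id onto 0..359
    for (const auto& row : intervals) {
        if (row[0] <= slot && slot <= row[1]) {
            return row[2];
        }
    }
    return -1;
}
```

With the three ranges above, an id of 500 maps to slot 140, which falls in 120~239 and therefore goes to machine 2.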
When the number of machines changes from n to n+1, we find the largest of the n intervals, split it in two, and give one half to the (n+1)th machine.
For example, when changing from 3 to 4, we find that the first interval, 0~119, is the current largest interval, so we divide 0~119 into 0~59 and 60~119. 0~59 still belongs to the first machine, and 60~119 goes to the fourth machine.
Then, when changing from 4 to 5, we find that the largest interval is the third interval, 120~239. After splitting it in two, it becomes 120~179 and 180~239.
Suppose all the data is on one machine at the beginning. After the nth machine has been added, what are the intervals, and which machine does each one belong to?

You can assume n <= 360. We also agree that when several intervals tie for the maximum size, we split the one that belongs to the machine with the smaller number.
For example, 0~119 and 120~239 both have size 120, but the first belongs to machine 1 and the second to machine 2, so we split 0~119.
Clarification
If the maximal interval is [x, y], and it belongs to machine id z, when you add a new machine with id n, you should divide [x, y, z] into two intervals:
[x, (x + y) / 2, z] and [(x + y) / 2 + 1, y, n]
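The split rule can be written as a small function. This is a sketch only: the Interval alias and the name splitInterval are assumptions introduced here, and the midpoint uses integer division exactly as in the formula above.

```cpp
#include <array>
#include <utility>

using Interval = std::array<int, 3>;  // {start, end, machineId}

// Hypothetical helper: split [x, y, z] into [x, (x + y) / 2, z] and
// [(x + y) / 2 + 1, y, newMachine].
std::pair<Interval, Interval> splitInterval(const Interval& iv, int newMachine) {
    int mid = (iv[0] + iv[1]) / 2;  // integer division rounds down
    Interval left = {iv[0], mid, iv[2]};
    Interval right = {mid + 1, iv[1], newMachine};
    return {left, right};
}
```

Splitting [0, 359, 1] for new machine 2 gives [0, 179, 1] and [180, 359, 2], which matches Example 2.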
Example
Example 1:
Input:
n = 1,
Output:
[
[0,359,1]
]
Explanation:
[0,359,1] means 0~359 belongs to machine 1.
Example 2:
Input:
n = 2,
Output:
[
[0,179,1],
[180,359,2]
]
Explanation:
[0,179,1] means 0~179 belongs to machine 1.
[180,359,2] means 180~359 belongs to machine 2.
Example 3:
Input:
n = 3,
Output:
[
[0,89,1],
[90,179,3],
[180,359,2]
]

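This is listed as a system design problem, but to me it feels more like an algorithm problem.
Solution 1: based on an answer found online.
Notes:
1) The shift operators << and >> have very low precedence, lower than + and -!
There may be a better, math-based solution that could generate the result in O(1). I'll think about it next time.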
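The precedence note can be checked with a short sketch. The two function names are made up for this illustration. The second function spells out, with explicit parentheses, how the compiler parses x + (y - x) >> 1.

```cpp
// Intended midpoint: the inner parentheses force the shift to happen first.
int midpointShift(int x, int y) {
    return x + ((y - x) >> 1);
}

// How "x + (y - x) >> 1" is actually parsed: + binds tighter than >>,
// so the whole sum is shifted and the result is simply y >> 1.
int midpointShiftAsParsed(int x, int y) {
    return (x + (y - x)) >> 1;
}
```

For x = 120 and y = 239, the first function returns 179 and the second returns 119, which is not even inside the interval.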

#include <vector>
using namespace std;

class Solution {
public:
    /*
     * @param n: a positive integer
     * @return: n x 3 matrix
     */
    vector<vector<int>> consistentHashing(int n) {
        vector<vector<int>> results;
        vector<int> machines = {0, 359, 1};  // {start, end, machine id}
        results.push_back(machines);

        for (int i = 1; i < n; ++i) {
            // Find the widest interval. The strict > keeps the earliest row,
            // i.e. the smaller machine id, when sizes tie.
            int index = 0;
            for (int j = 1; j < i; ++j) {  // index is already 0, so starting j at 0 would be wasted work
                if (results[j][1] - results[j][0] > results[index][1] - results[index][0])
                    index = j;
            }

            int x = results[index][0];
            int y = results[index][1];
            // Cannot be written as x + (y - x) >> 1, because >> has lower precedence than +.
            results[index][1] = x + (y - x) / 2;

            machines[0] = results[index][1] + 1;
            machines[1] = y;
            machines[2] = i + 1;  // i + 1 because machine ids start from 1

            results.push_back(machines);
        }

        return results;
    }
};
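As a possible variation, the linear scan for the widest interval could be swapped for a std::priority_queue, which brings each step down to O(log n). This is a sketch of a different technique, not the solution above and not the O(1) math-based idea mentioned in the notes. The function name consistentHashingHeap is an assumption, and the rows are returned in machine-id order, the same order the solution above produces.

```cpp
#include <queue>
#include <utility>
#include <vector>

std::vector<std::vector<int>> consistentHashingHeap(int n) {
    std::vector<std::vector<int>> results;
    if (n <= 0) return results;
    results.push_back({0, 359, 1});  // {start, end, machine id}

    // Max-heap of (end - start, -row index). The widest interval is on top;
    // among equal widths the smaller row index (smaller machine id) wins,
    // because its negated index is larger.
    std::priority_queue<std::pair<int, int>> heap;
    heap.push({359, 0});

    for (int id = 2; id <= n; ++id) {
        int index = -heap.top().second;
        heap.pop();

        int x = results[index][0];
        int y = results[index][1];
        int mid = x + (y - x) / 2;

        results[index][1] = mid;              // old machine keeps the left half
        results.push_back({mid + 1, y, id});  // new machine takes the right half

        heap.push({mid - x, -index});
        heap.push({y - (mid + 1), -(id - 1)});  // the new row sits at index id - 1
    }
    return results;
}
```

Calling consistentHashingHeap(3) yields the rows [0,89,1], [180,359,2], [90,179,3], the same intervals as Example 3 but listed by machine id.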