Problems That Virtual Nodes in the Ketama Algorithm Cause for Distributed Storage

mrliu20082009

Article link: https://blog.csdn.net/mrliu20082009/article/details/6998721

Test code:

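import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Node, KetamaNodeLocator and HashAlgorithm belong to the Ketama implementation under test (not shown here).
public class UserTest {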
	
	private static final Integer NODE_COUNT = 6;
	
	private static final Integer VIRTUAL_NODE_COUNT = 200;
	
	private static final Integer EXE_TIMES = 100;
	
	public static void main(String[] args) throws IOException {
		UserTest test = new UserTest();
		// Initialize the nodes
		List<Node> allNodes = test.getNodes(NODE_COUNT);
		KetamaNodeLocator locator = new KetamaNodeLocator(allNodes, HashAlgorithm.KETAMA_HASH, VIRTUAL_NODE_COUNT);
		// Initialize all user IDs
		List<String> allKeys = test.getAllStrings();
		// Write each user ID to its node's file to see how the writes are distributed
//		for (String key : allKeys) {
//			Node node = locator.getPrimary(key);
//			
//			File file = new File(node.getUrl());
//			if (!file.getParentFile().exists()) {
//				file.getParentFile().mkdirs();
//			}
//			if (!file.exists()) {
//				file.createNewFile();
//			}
//			FileWriter writer = new FileWriter(file.getAbsolutePath(), true);
//			writer.write(key + System.getProperty("line.separator"));
//			writer.close();
//		}
		
		// Add a sixth node (node6), copy node1's data into node6, then verify that every key can still be found
//		for (String key : allKeys) {
//			Node node = locator.getPrimary(key);
//			
//			File file = new File(node.getUrl());
//			BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
//			String line = null;
//			boolean find = false;
//			while ((line = reader.readLine()) != null) {
//				if (key.equals(line)) {
//					find = true;
//				}
//			}
//			reader.close();
//			if (!find) {
//				// The key is missing from the file of the node it now maps to
//				System.out.println("name:" + node.getName() + ",value=" + key);
//			}
//			else if (node.getName().equals("node6")) {
//				// The key was found and now lives on node6
//				System.out.println("value=" + key);
//			}
//		}
		
		System.out.println(HashAlgorithm.KETAMA_HASH.hash(HashAlgorithm.KETAMA_HASH.md5("userid-3"), 0));		
	}
	
	
	/**
	 * Builds the mock nodes used by the test.
	 * 
	 * @param nodeCount 
	 * 		the number of nodes to create
	 * @return
	 * 		the list of nodes
	 */
	private List<Node> getNodes(int nodeCount) {
		List<Node> nodes = new ArrayList<Node>();
		
		for (int k = 1; k <= nodeCount; k++) {
			String name = "node" + k;
			String url = "D:\\" + "node" + k + ".txt";
			Node node = new Node(name, url);
			nodes.add(node);
		}
		
		return nodes;
	}
	
	/**
	 * Builds all the test keys: userid-0 through userid-(EXE_TIMES - 1).
	 */
	private List<String> getAllStrings() {
		List<String> allStrings = new ArrayList<String>(EXE_TIMES);
		
		for (int i = 0; i < EXE_TIMES; i++) {
			allStrings.add("userid-" + i);
		}
		
		return allStrings;
	}
}

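The test shows that node6 does not take over the data of one particular node. Instead, keys from several different nodes are remapped to it. For a data migration, that is no small headache.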
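The effect can be reproduced without the original classes. The sketch below is only an illustration under stated assumptions: `hash` is a simplified stand-in for `HashAlgorithm.KETAMA_HASH` (first four MD5 bytes, one ring point per virtual node), and the virtual-node naming scheme `nodeN-v` is made up for this example. It builds the ring with five nodes and again with six, then collects the old owners of every key whose owner changed.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class VirtualNodeRemapDemo {

    // Assumption: simplified hash, first 4 MD5 bytes read as an unsigned 32-bit value.
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[3] & 0xFF) << 24) | ((long) (d[2] & 0xFF) << 16)
                    | ((long) (d[1] & 0xFF) << 8) | (d[0] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Places virtualNodes points on the ring for each of node1..nodeN (hypothetical naming scheme).
    static TreeMap<Long, String> buildRing(int nodeCount, int virtualNodes) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (int n = 1; n <= nodeCount; n++) {
            for (int v = 0; v < virtualNodes; v++) {
                ring.put(hash("node" + n + "-" + v), "node" + n);
            }
        }
        return ring;
    }

    // Returns the owner of the first ring point at or after the key's hash, wrapping around at the end.
    static String locate(TreeMap<Long, String> ring, String key) {
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        if (e == null) {
            e = ring.firstEntry();
        }
        return e.getValue();
    }

    // Old owners of all keys whose owner changes when one more node joins the ring.
    static Set<String> previousOwnersOfMovedKeys(int oldCount, int virtualNodes, int keyCount) {
        TreeMap<Long, String> before = buildRing(oldCount, virtualNodes);
        TreeMap<Long, String> after = buildRing(oldCount + 1, virtualNodes);
        Set<String> sources = new TreeSet<>();
        for (int i = 0; i < keyCount; i++) {
            String key = "userid-" + i;
            String oldOwner = locate(before, key);
            if (!oldOwner.equals(locate(after, key))) {
                sources.add(oldOwner);
            }
        }
        return sources;
    }

    public static void main(String[] args) {
        // Same parameters as the test above: 5 nodes growing to 6, 200 virtual nodes, 100 keys.
        System.out.println("Nodes that lose keys to node6: " + previousOwnersOfMovedKeys(5, 200, 100));
    }
}
```

Every key that moves ends up on node6, which is the usual consistent-hashing guarantee, but the set printed by `main` normally names several source nodes rather than one.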

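For a cache, virtual nodes really are a benefit, because a cache only cares about hit rate and has loose requirements on data consistency. A storage system, however, has to deal with data migration. With virtual nodes, each physical node owns many virtual nodes, and a newly added machine gets many virtual nodes of its own. When the ring is rebalanced, you have to find the next virtual node clockwise from every new virtual node, collect the physical nodes that those successors belong to, and migrate the affected keys (those hashing between each new virtual node and its predecessor) from all of those physical nodes to the new one. That is also a lot of work.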
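The bookkeeping described above can be sketched as follows. This is a rough illustration, not the author's `KetamaNodeLocator`: the `hash` helper, the `nodeN-v` naming of virtual nodes, and the method name `migrationSources` are all assumptions made for the example. For each virtual node of the new machine, it looks up the clockwise successor on the existing ring and counts how many new arcs are carved out of each existing physical node.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

class MigrationPlanDemo {

    // Assumption: simplified hash, first 4 MD5 bytes read as an unsigned 32-bit value.
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[3] & 0xFF) << 24) | ((long) (d[2] & 0xFF) << 16)
                    | ((long) (d[1] & 0xFF) << 8) | (d[0] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Places virtualNodes points on the ring for each of node1..nodeN (hypothetical naming scheme).
    static TreeMap<Long, String> buildRing(int nodeCount, int virtualNodes) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (int n = 1; n <= nodeCount; n++) {
            for (int v = 0; v < virtualNodes; v++) {
                ring.put(hash("node" + n + "-" + v), "node" + n);
            }
        }
        return ring;
    }

    // For each virtual node of newNode, finds the clockwise successor on the existing ring.
    // Result: existing physical node -> number of new arcs that take keys away from it.
    static Map<String, Integer> migrationSources(TreeMap<Long, String> ring, String newNode, int virtualNodes) {
        Map<String, Integer> sources = new TreeMap<>();
        for (int v = 0; v < virtualNodes; v++) {
            long point = hash(newNode + "-" + v);
            Map.Entry<Long, String> successor = ring.ceilingEntry(point);
            if (successor == null) {
                successor = ring.firstEntry(); // wrap around the end of the ring
            }
            sources.merge(successor.getValue(), 1, Integer::sum);
        }
        return sources;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = buildRing(5, 200);
        System.out.println(migrationSources(ring, "node6", 200));
    }
}
```

The successor is looked up on the ring as it was before the new node joined, because that is the node that currently stores the keys of each arc. A real migration would also need the hash range of every arc, not just the counts.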

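Or have I misunderstood? Perhaps a newly added physical node should get no virtual nodes at all: look up the smallest virtual node whose hash is greater than the new node's, find the physical node behind it, and migrate that physical node's data to the new one. But wouldn't the new physical node then be underused? This needs further study.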
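To get a feel for that trade-off, here is a small experiment, again only a sketch: the `hash` helper and the `nodeN-v` naming are assumptions, and the new node is placed on the ring at the single point `hash("node6")`. It reports which existing nodes lose keys and how many keys the new node ends up with.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class SinglePointJoinDemo {

    // Assumption: simplified hash, first 4 MD5 bytes read as an unsigned 32-bit value.
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[3] & 0xFF) << 24) | ((long) (d[2] & 0xFF) << 16)
                    | ((long) (d[1] & 0xFF) << 8) | (d[0] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Places virtualNodes points on the ring for each of node1..nodeN (hypothetical naming scheme).
    static TreeMap<Long, String> buildRing(int nodeCount, int virtualNodes) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (int n = 1; n <= nodeCount; n++) {
            for (int v = 0; v < virtualNodes; v++) {
                ring.put(hash("node" + n + "-" + v), "node" + n);
            }
        }
        return ring;
    }

    // Returns the owner of the first ring point at or after the key's hash, wrapping around at the end.
    static String locate(TreeMap<Long, String> ring, String key) {
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        if (e == null) {
            e = ring.firstEntry();
        }
        return e.getValue();
    }

    // Copies the ring and adds newNode as one single point, with no virtual nodes.
    static TreeMap<Long, String> joinWithSinglePoint(TreeMap<Long, String> before, String newNode) {
        TreeMap<Long, String> after = new TreeMap<>(before);
        after.put(hash(newNode), newNode);
        return after;
    }

    // Old owners of the keys that now belong to newNode.
    static Set<String> sourcesOf(TreeMap<Long, String> before, TreeMap<Long, String> after,
                                 String newNode, int keyCount) {
        Set<String> sources = new TreeSet<>();
        for (int i = 0; i < keyCount; i++) {
            String key = "userid-" + i;
            if (newNode.equals(locate(after, key))) {
                sources.add(locate(before, key));
            }
        }
        return sources;
    }

    // Number of keys userid-0..userid-(keyCount-1) that the given node owns on the ring.
    static int keysOwnedBy(TreeMap<Long, String> ring, String node, int keyCount) {
        int count = 0;
        for (int i = 0; i < keyCount; i++) {
            if (node.equals(locate(ring, "userid-" + i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int keyCount = 100000;
        TreeMap<Long, String> before = buildRing(5, 200);
        TreeMap<Long, String> after = joinWithSinglePoint(before, "node6");
        System.out.println("Source nodes: " + sourcesOf(before, after, "node6", keyCount));
        System.out.println("Keys on node6: " + keysOwnedBy(after, "node6", keyCount) + " of " + keyCount);
    }
}
```

With a single point, all moved keys come from at most one existing node, so the migration is simple. The price is the one suspected above: one point among roughly a thousand existing ones covers only a tiny arc, so the new machine should receive far less than an even one-sixth share of the keys.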
