1. Concept
A distributed hash table (DHT) is a class of decentralized distributed system that provides a lookup service similar to a hash table; (key, value) pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key.
Two key points:
1) a distributed environment
2) a hash table
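For a key/value pair, a DHT offers the same kind of service across a distributed cluster that a HashTable offers on a single machine, such as simple and fast storage, retrieval, and lookup.
See the figure below (source: Wikipedia):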
2. DHT characteristics
A DHT is not an algorithm or an implementation; it is a concept. Its characteristics can be summarized as:
Decentralization: the nodes collectively form the system without any central coordination.
Fault tolerance: the system should be reliable (in some sense) even with nodes continuously joining, leaving, and failing.
Scalability: the system should function efficiently even with thousands or millions of nodes.
3. Consistent hashing
Consistent hashing can be regarded as a special form of DHT. It is defined as follows:
"Consistent hashing is a scheme that provides hash table functionality in a way that the addition or removal of one slot does not significantly change the mapping of keys to slots"
In other words, with consistent hashing a node joining or leaving does not cause a major change in the mapping of keys to nodes.
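For contrast, it may help to see what happens without this property. The sketch below assumes a naive placement rule, `hash(key) % num_slots`, which is not part of the original text; the helper names (`stable_hash`, `modulo_slot`, `moved_fraction`) and the key names are made up for illustration. It measures how many keys change slot when the slot count grows from 4 to 5.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic integer hash of a string (MD5 is used only to spread keys)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def modulo_slot(key: str, num_slots: int) -> int:
    """Naive placement: the slot index is the hash modulo the number of slots."""
    return stable_hash(key) % num_slots


def moved_fraction(keys, old_slots: int, new_slots: int) -> float:
    """Fraction of keys whose slot changes when the slot count changes."""
    moved = sum(
        1 for k in keys if modulo_slot(k, old_slots) != modulo_slot(k, new_slots)
    )
    return moved / len(keys)


# Hypothetical workload: 10,000 synthetic keys.
keys = [f"key-{i}" for i in range(10000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys moved going from 4 to 5 slots")
```

With modulo placement, a key keeps its slot only when its hash gives the same remainder for both slot counts, so the large majority of keys are expected to move. That is exactly the "significant change" that consistent hashing is meant to avoid.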
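The idea behind implementing consistent hashing is as follows:
- Assume the hashed keys are evenly distributed around a ring
- All nodes are placed on the same ring
- Each node is responsible for only part of the keys; when a node joins or leaves, only that node and its neighbors are affected, and the other nodes see at most a small number of keys change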
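The three points above can be sketched as a small ring. This is a minimal illustration, not a production design: the `HashRing` class, the 32-bit ring size, the use of MD5, and the node and key names are all assumptions made for the example. A key is owned by the first node found clockwise from the key's position.

```python
import bisect
import hashlib

RING_BITS = 32  # assumed ring size: 2**32 positions


def ring_position(name: str) -> int:
    """Map a node name or a key onto the ring."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


class HashRing:
    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions of all nodes
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = ring_position(node)
        if pos in self._owners:
            raise ValueError(f"ring position collision for {node!r}")
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove_node(self, node: str) -> None:
        pos = ring_position(node)
        if self._owners.get(pos) != node:
            raise KeyError(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def node_for(self, key: str) -> str:
        """Return the first node clockwise from the key, wrapping at the end."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_left(self._positions, ring_position(key))
        if idx == len(self._positions):
            idx = 0  # wrap around to the start of the ring
        return self._owners[self._positions[idx]]


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"key-{i}" for i in range(10000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-e")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys moved after adding node-e")
```

Every key that moves ends up on the new node, and every other key keeps its owner; removing the node again restores the earlier mapping. With one position per node the share taken over by the new node can be uneven, which is why real systems often place each node at several ring positions.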
There are many DHT implementations, including:
- Chord
- Kademlia
- Pastry
- CAN
- P-Grid
- BitTorrent DHT
- Apache Cassandra
- Tapestry
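Chord is a classic implementation of a DHT (or, put differently, of consistent hashing); the DHT in Cassandra can be seen as a simplified version of Chord. Many articles already describe the Chord algorithm (see the references).
The structure of Chord is outlined briefly below:
Chord implements consistent hashing by mapping nodes (the peers that store data and answer lookups) and keys into the same identifier space. Each node in the network is assigned a unique ID (for example, by hashing the machine's MAC address). Suppose the whole network has N nodes; their IDs can be viewed as integers joined end to end to form a ring, called the Chord ring. The distance between two nodes is defined as the difference between their positions on the ring. Each node stores a routing table (the finger table) that records the IP information of about log2(N) other nodes, chosen clockwise at power-of-two distances from itself: 2, 4, 8, 16, 32, ..., 2^i.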
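The finger table can be sketched in a few lines. This is an illustration under stated assumptions rather than a full Chord node: it uses a 6-bit identifier space (64 positions), hypothetical node IDs, and the helper names `successor` and `finger_table`. Finger i points to the first node at or after `node_id + 2^i` on the ring, for i = 0..m-1; the sketch starts at offset 1 so that the first finger is the node's immediate successor, as in the original Chord design. Only IDs are stored here, where a real table would also hold each node's address.

```python
import bisect

M = 6  # assumed number of identifier bits, giving a ring of 2**6 = 64 positions


def successor(sorted_ids, ident):
    """First node ID at or clockwise after `ident`, wrapping past the largest ID."""
    idx = bisect.bisect_left(sorted_ids, ident)
    return sorted_ids[idx % len(sorted_ids)]


def finger_table(node_id, sorted_ids, m=M):
    """Finger i is the successor of (node_id + 2**i) mod 2**m, for i = 0..m-1."""
    size = 1 << m
    return [successor(sorted_ids, (node_id + (1 << i)) % size) for i in range(m)]


# Hypothetical node IDs already placed on the ring, in sorted order.
node_ids = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

print(finger_table(8, node_ids))   # [14, 14, 14, 21, 32, 42]
print(finger_table(56, node_ids))  # [1, 1, 1, 1, 8, 32]
```

Because the offsets double, each node keeps only m entries, yet the last finger reaches about halfway around the ring. A lookup can therefore roughly halve the remaining distance at each hop.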
6. Summary
DHTs form an infrastructure that can be used to build more complex services, such as anycast, cooperative Web caching, distributed file systems, domain name services, instant messaging, multicast, and also peer-to-peer file sharing and content distribution systems.
References:
http://en.wikipedia.org/wiki/Distributed_hash_table
http://blog.csdn.net/sparkliang/article/details/5279393
http://blog.csdn.net/chen77716/article/details/6059575