August 30th Thursday (八月 三十日 木曜日)

  Kademlia is a distributed hash table for decentralized peer-to-peer computer networks designed by Petar Maymounkov and David Mazières.
It specifies the structure of the network and how the exchange of information takes place through node lookups. Kademlia nodes communicate
among themselves using the User Datagram Protocol (UDP). Over an existing network (such as the Internet), the participant nodes form a new virtual
or overlay network. Each node is identified by a number, its node ID. The node ID serves not only as identification: the Kademlia algorithm also
uses it to locate values (usually file hashes or keywords). In fact, the node ID provides a direct map to stored values such as file hashes.
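
  As a concrete illustration, the sketch below generates a 160-bit node ID (the ID width used in the original paper) and derives a key from a value
with SHA-1; the particular hash function and the helper names are my own assumptions, not requirements of the protocol.

    import hashlib
    import secrets

    ID_BITS = 160  # ID width used in the original Kademlia paper

    def random_node_id():
        # A node picks a (hopefully unique) random ID when it first joins.
        return secrets.randbits(ID_BITS)

    def key_for_value(value: bytes):
        # Keys live in the same 160-bit space as node IDs; a hash of the
        # value (e.g. file contents) is a common choice.
        return int.from_bytes(hashlib.sha1(value).digest(), "big")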

  The knowledge that a participant node has of the network varies with "distance". A notion of distance is defined and implemented in the network,
so that a node "A" can always tell which of two other nodes "B" and "C" is closer to itself (to "A"). A node has very detailed knowledge of the
nodes that are "very close" to itself (i.e. it will know the IDs of all of the close nodes), very sparse knowledge of very distant nodes
(i.e. it will know the IDs of very few of them), and varying degrees of knowledge of the parts of the network at intermediate distances:
the shorter the distance, the more detailed the knowledge.

  Kademlia's purpose is to store and retrieve "values", which can be any data, usually pointers to files. Values are pointed to by "keys".
The distance between a node ID and a key is computed just like the distance between two node IDs. Therefore any node "A" can calculate the distances
to two keys "K1" and "K2" and decide which of the two is closer to itself (to "A"). In the same way, node "A" can decide which of two keys "K1"
and "K2" is closer to another known node "B", and "A" and "B" will agree on that calculation.

  To store a value, the storing node will calculate the corresponding key (usually a hash computed from the value itself), then search the network in
several steps, each step finding nodes that are closer to the desired key than in previous steps, until the closest nodes to that key have been
determined. It will then store the value on all of these nodes.
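
  A minimal sketch of that store path, assuming a hypothetical node object with a lookup helper (find_closest_nodes) and a STORE RPC stub
(send_store), neither of which is defined in the text:

    import hashlib

    def store_value(my_node, value: bytes, k=20):
        # The key is usually a hash of the value itself.
        key = int.from_bytes(hashlib.sha1(value).digest(), "big")
        # Iterative lookup: each round returns nodes closer to the key,
        # until the k closest nodes in the network are known.
        closest = my_node.find_closest_nodes(key, k)    # hypothetical helper
        for node in closest:
            my_node.send_store(node, key, value)        # hypothetical STORE RPC
        return key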

Routing Tables

  A Kademlia routing table consists of one list for each bit of the node ID (e.g. if a node ID consists of 128 bits, the node will keep 128 such lists).
A list has many entries, and every entry holds the data necessary to locate another node: typically the IP address, UDP port, and node ID of that node.
Every list corresponds to a specific distance from the node. Nodes that can go in the nth list must differ from the node's own ID in the nth bit
(and agree on the bits before it). This means that it is very easy to fill the first list, as 1/2 of the nodes in the network are far-away candidates.
The next list can use only 1/4 of the nodes in the network (one bit closer than the first), and so on.

  With an ID of 128 bits, every node in the network will classify the other nodes into one of 128 distance ranges, one specific range per bit.
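
  One common way to pick the list (bucket) for a contact is from the position of the highest differing bit, i.e. the bit length of the XOR distance;
the sketch below assumes the 128-bit IDs of the example above:

    ID_BITS = 128

    def bucket_index(my_id: int, other_id: int) -> int:
        # Index 0 holds the "far half" of the ID space (differing top bit),
        # index 1 the next quarter, and so on.
        distance = my_id ^ other_id
        assert distance != 0, "a node does not keep itself in its own lists"
        return ID_BITS - distance.bit_length()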

  As nodes are encountered on the network, they are added to the lists. This happens during store and retrieval operations, and even when helping
other nodes to find a key: every node encountered will be considered for inclusion in the lists. The knowledge that a node has of the network is
therefore very dynamic. This keeps the network constantly updated and adds resilience to failures or attacks.

  In the Kademlia literature, the lists are referred to as k-buckets. k is a system-wide number, such as 20. Every k-bucket is a list holding up to
k entries; i.e. for each particular bit (each particular distance from itself), every node on the network keeps a list of up to 20 nodes.

  Since the number of possible candidates decreases quickly for the closer k-buckets (there will be very few nodes that are that close), those
k-buckets will fully map all nodes in that section of the network. Since the quantity of possible IDs is much larger than any node population
can ever be, some of the k-buckets corresponding to very short distances will remain empty.

  It is known that nodes that have been connected to a network for a long time will probably remain connected for a long time in the future. Because of
this statistical distribution, Kademlia prefers to keep long-connected nodes stored in the k-buckets. This increases the number of known nodes that
will still be valid at some time in the future and provides for a more stable network.

  When a k-bucket is full and a new node is discovered for that k-bucket, the least recently seen node in the k-bucket is PINGed. If that node is found
to still be alive, the new node is placed in a secondary list, a replacement cache. The replacement cache is used only if a node in the k-bucket stops
responding. In other words: new nodes are used only when older nodes disappear.
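
  A sketch of that update rule, assuming a simple bucket object with an entries list ordered from least to most recently seen, a replacement_cache
list, and a ping callable; all of these names are illustrative:

    K = 20  # system-wide bucket size

    def update_bucket(bucket, contact, ping):
        # Called for every node encountered in any request or reply.
        if contact in bucket.entries:
            bucket.entries.remove(contact)
            bucket.entries.append(contact)        # move to most-recently-seen end
        elif len(bucket.entries) < K:
            bucket.entries.append(contact)        # room left: just add it
        else:
            oldest = bucket.entries[0]            # least recently seen
            if ping(oldest):                      # still alive: keep the old node,
                bucket.entries.remove(oldest)     # refresh its position,
                bucket.entries.append(oldest)
                bucket.replacement_cache.append(contact)   # and park the newcomer
            else:                                 # dead: evict it, insert the new node
                bucket.entries.pop(0)
                bucket.entries.append(contact)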

Protocol Messages

Kademlia has four messages.

1. PING - used to verify that a node is still alive.
2. STORE - Stores a (key, value) pair in one node.
3. FIND_NODE - The recipient of the request will return the k nodes from its own buckets that are the closest ones to the requested key.
4. FIND_VALUE - As FIND_NODE, but if the recipient of the request has the requested key in its store, it will return the corresponding value.

Each RPC message includes a random value from the initiator. This ensures that when the response is received it corresponds to the request previously sent.
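
  A sketch of how that random value (in effect an RPC identifier) can be used to pair responses with requests; the message layout below is an
assumption for illustration, not a wire format from the paper:

    import secrets

    pending = {}   # rpc_id -> (target, message type) of the outstanding request

    def send_rpc(send, target, message_type, payload):
        rpc_id = secrets.token_bytes(20)          # random value chosen by the initiator
        pending[rpc_id] = (target, message_type)
        send(target, {"rpc_id": rpc_id, "type": message_type, "payload": payload})
        return rpc_id

    def accept_response(message):
        # A response is only accepted if it echoes the rpc_id of a request we sent.
        return pending.pop(message["rpc_id"], None) is not None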

Locating Nodes

  Node lookups can proceed asynchronously. The quantity of simultaneous lookups is denoted by α and is typically three. A node initiates a FIND_NODE
request by querying the α nodes in its own k-buckets that are the closest ones to the desired key. When these recipient nodes receive the request,
they will look in their k-buckets and return the k closest nodes to the desired key that they know. The requestor updates a results list with the
results (node IDs) it receives, keeping the k best ones (the k nodes that are closer to the searched key) that respond to queries. Then the requestor
selects these k best results, issues the request to them, and iterates this process again and again. Because every node has better knowledge of its
own surroundings than any other node has, the received results will be nodes that are closer and closer to the searched key. The iterations continue
until no nodes are returned that are closer than the best previous results. When the iterations stop, the best k nodes in the results list are the ones
in the whole network that are the closest to the desired key.
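
  A condensed, single-threaded sketch of that iteration (the α-way parallelism and per-node time-outs are omitted for brevity); closest_contacts and
find_node_rpc are hypothetical helpers on a node object, and contacts are assumed to be hashable objects carrying a node_id:

    def node_lookup(my_node, key, k=20, alpha=3):
        best = sorted(my_node.closest_contacts(key, alpha),      # start from own k-buckets
                      key=lambda n: n.node_id ^ key)[:k]
        queried = set()
        while True:
            # Pick up to alpha of the best known nodes that have not been asked yet.
            candidates = [n for n in best if n not in queried][:alpha]
            if not candidates:
                return best            # nobody closer left to ask: these are the k closest
            for node in candidates:
                queried.add(node)
                for contact in my_node.find_node_rpc(node, key): # hypothetical FIND_NODE
                    if contact not in best:
                        best.append(contact)
            # Keep only the k contacts closest to the key for the next round.
            best = sorted(best, key=lambda n: n.node_id ^ key)[:k]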

  The node information can be augmented with round-trip times (RTT). This information is used to choose a time-out specific to every consulted node.
When a query times out, another query can be initiated, never surpassing α queries at the same time.

Locating Resources

  Information is located by mapping it to a key; a hash is typically used for the mapping. The storing nodes will have the information because of a
previous STORE message. Locating a value follows the same procedure as locating the closest nodes to a key, except that the search terminates as soon
as a node has the requested value in its store and returns it.
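
  The value lookup is therefore the same iteration as the node lookup above, with an early exit; a sketch assuming a hypothetical find_value_rpc stub
that returns either the value or a list of closer contacts:

    def value_lookup(my_node, key, k=20, alpha=3):
        best = sorted(my_node.closest_contacts(key, k), key=lambda n: n.node_id ^ key)
        queried = set()
        while True:
            candidates = [n for n in best if n not in queried][:alpha]
            if not candidates:
                return None                      # nobody in the network has the value
            for node in candidates:
                queried.add(node)
                value, contacts = my_node.find_value_rpc(node, key)  # hypothetical RPC
                if value is not None:
                    return value                 # stop as soon as one node has it
                best = sorted(set(best) | set(contacts),
                              key=lambda n: n.node_id ^ key)[:k]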

  The values are stored at several nodes (k of them) to allow nodes to come and go and still have the value available in some node, i.e. to provide
redundancy. Periodically, a node that stores a value will explore the network to find the k nodes that are closest to the key and replicate the value
onto them. This compensates for disappeared nodes. Also, for popular values that might have many requests, the load on the storing nodes is diminished
by having a retriever store the value in some node near, but outside of, the k closest ones. This new storing is called a cache. In this way the value
is stored farther and farther away from the key, depending on the quantity of requests. This allows popular searches to find a storer more quickly.
Because the value is returned from nodes farther away from the key, possible "hot spots" are alleviated. Caching nodes will drop the value after a
certain time that depends on their distance from the key. Some implementations (e.g. Kad) have neither replication nor caching, because Kad wants old
information to go away quickly. In Kad, the node that is providing the file will periodically refresh the information on the network (perform
NODE_LOOKUP and STORE messages). When all of the nodes having the file go offline, nobody will be refreshing its values (sources and keywords) and the
information will eventually disappear from the network.
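
  A sketch of the periodic republish loop (the interval and helper names are illustrative assumptions; caching and distance-dependent expiration are
left out):

    import time

    REPUBLISH_INTERVAL = 60 * 60       # illustrative: republish once an hour

    def republish_loop(my_node, stored):
        # 'stored' maps key -> value for everything this node keeps.
        while True:
            for key, value in stored.items():
                # Re-locate the k nodes currently closest to the key and push the
                # value to them, replacing copies lost to departed nodes.
                for node in my_node.find_closest_nodes(key, k=20):   # hypothetical
                    my_node.send_store(node, key, value)             # hypothetical
            time.sleep(REPUBLISH_INTERVAL)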

Joining the Network

  A node that would like to join the network must first go through a bootstrap process. In this phase, the node needs to know the IP address and port
of another node (obtained from the user, or from a stored list) that is already participating in the Kademlia network. If the joining node has not yet
participated in the network, it computes a random ID number that is assumed not to be assigned to any other node. It uses this ID until it leaves the
network.

  The joining node inserts the bootstrap node into one of its k-buckets. The new node then does a NODE_LOOKUP of its own ID against the only other
node it knows. This "self-lookup" populates other nodes' k-buckets with the new node's ID, and populates the new node's k-buckets with the nodes on the
path between it and the bootstrap node. After this, the new node refreshes all k-buckets farther away than the k-bucket in which the bootstrap node
falls. This refresh is just a lookup of a random key that is within the range of that k-bucket.
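
  A sketch of that bootstrap sequence, reusing node_lookup and bucket_index from the earlier sketches; add_contact and random_id_in_bucket are
hypothetical helpers on the node object:

    def join(my_node, bootstrap_contact):
        # 1. Insert the bootstrap node into the appropriate k-bucket.
        my_node.add_contact(bootstrap_contact)                    # hypothetical helper
        # 2. Self-lookup: nodes on the path learn about us, and our k-buckets
        #    are filled with the nodes they return.
        node_lookup(my_node, my_node.node_id)
        # 3. Refresh every k-bucket farther away than the bootstrap node's bucket
        #    by looking up a random ID that falls inside that bucket's range.
        boot_bucket = bucket_index(my_node.node_id, bootstrap_contact.node_id)
        for i in range(boot_bucket):                              # lower index = farther away
            node_lookup(my_node, my_node.random_id_in_bucket(i))  # hypothetical helper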

  Initially, a node has a single k-bucket. When a k-bucket becomes full, it can be split. The split occurs only if the range of IDs covered by the
k-bucket spans the node's own ID (values to the left and right of it in a binary tree). Kademlia relaxes even this rule for the "closest nodes"
k-bucket, because typically one single bucket would correspond to the distance where all of the nodes closest to this node are, there may be more than
k of them, and we want the node to know them all. It may turn out that a highly unbalanced binary sub-tree exists near the node. If k is 20, and there
are 21+ nodes with a prefix "xxx0011....." and the new node is "xxx000011001", the new node can keep multiple k-buckets for the other 21+ nodes. This
guarantees that the network knows about all nodes in the closest region.
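
  A sketch of the split rule; the representation of a bucket as a half-open ID range [low, high) and the split_at helper are illustrative assumptions:

    K = 20

    def maybe_split(bucket, my_id: int):
        # Each bucket covers a contiguous range [low, high) of the ID space.
        if len(bucket.entries) < K:
            return [bucket]                   # not full: nothing to do
        if not (bucket.low <= my_id < bucket.high):
            return [bucket]                   # full, but does not span our own ID
        mid = (bucket.low + bucket.high) // 2
        left, right = bucket.split_at(mid)    # hypothetical: redistribute the entries
        return [left, right]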

Accelerated lookups

  Kademlia uses an XOR metric to define distance: two node IDs, or a node ID and a key, are XORed, and the result is the distance between them. For
each bit, the XOR function returns zero if the two bits are equal and one if the two bits are different. Distances under the XOR metric satisfy the
triangle inequality: the distance from "A" to "B" is shorter than (or equal to) the distance from "A" to "C" plus the distance from "C" to "B".
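
  A quick check of these properties with concrete (arbitrarily chosen) 8-bit values:

    def distance(a: int, b: int) -> int:
        return a ^ b

    a, b, c = 0b1100_1010, 0b0101_0011, 0b1111_0000

    assert distance(a, a) == 0                                   # identity
    assert distance(a, b) == distance(b, a)                      # symmetry
    assert distance(a, b) <= distance(a, c) + distance(c, b)     # triangle inequality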

  The XOR metric allows Kademlia to extend routing tables beyond single bits. Groups of bits can be placed in k-buckets; such a group of bits is
termed a prefix. For an m-bit prefix, there will be 2^m - 1 k-buckets. The missing k-bucket is a further extension of the routing tree that contains
the node's own ID. An m-bit prefix reduces the maximum number of lookups from log_2 n to log_{2^m} n. These are maximum values, and the average value
will be far less, increasing the chance of finding a node in one of the node's own k-buckets that shares more bits than just the prefix with the
target key.
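
  To get a feel for the numbers (n and m below are purely illustrative): since log_{2^m} n = (log_2 n) / m, with a million nodes a 4-bit prefix cuts
the worst case from about 20 hops to about 5.

    import math

    n, m = 1_000_000, 4
    print(math.log2(n))        # ≈ 19.93 hops at most with single-bit buckets
    print(math.log2(n) / m)    # ≈ 4.98 hops at most with 4-bit prefixes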

  Nodes can use mixtures of prefixes in their routing tables, as in the Kad network used by eMule. The Kademlia network could even be heterogeneous in
its routing table implementations; this would just complicate the analysis of lookups.
 