Distributed document store
routing a document to a shard
shard = hash(routing) % number_of_primary_shards
The routing value defaults to the document's _id.
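The routing formula above can be sketched in a few lines of Python. This is only an illustration: `shard_for` is a hypothetical helper name, and `zlib.crc32` is a stand-in for whatever hash function Elasticsearch uses internally. It is chosen here only because it returns the same value on every run, unlike Python's built-in `hash()` for strings.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    """Map a routing value (by default the document's _id) to a shard number."""
    # Stand-in hash: deterministic across processes, NOT Elasticsearch's real hash.
    hashed = zlib.crc32(routing.encode("utf-8"))
    # The remainder is always in the range 0 .. number_of_primary_shards - 1.
    return hashed % number_of_primary_shards


if __name__ == "__main__":
    for doc_id in ["1", "2", "3"]:
        print(doc_id, "-> shard", shard_for(doc_id, 3))
```

Because the result depends on number_of_primary_shards, the same routing value could map to a different shard if that number were changed.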
how primary and replica shards interact
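We can send requests to any node in the cluster. Every node is fully capable of serving any request. Every node knows the location of every document in the cluster, so it can forward a request directly to the node that holds the data. In the following examples, we send all of our requests to Node 1, which we will refer to as the requesting node.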
When sending requests, it is good practice to round-robin through all the nodes in the cluster, in order to spread the load.
creating, indexing, and deleting a document
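All of the operations above are writes, so they must complete successfully on the primary shard before they are copied to the replica shards.
1. The request is sent to Node 1, the requesting node. The document belongs to shard 0, whose primary is P0.
2. Node 1 knows that P0 lives on Node 3, so it forwards the request to Node 3.
3. Node 3 applies the change on the primary shard, then copies the updated document to the replica shards on Node 1 and Node 2. Once replication succeeds, the request is reported as successful.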
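request parameters
replication: sync | async. Synchronous or asynchronous replication; the default is sync.
consistency: to keep the data consistent, the primary shard requires a minimum number of shard copies to be available before it attempts a write. Not every replica is required, because some replicas may live on nodes that are unreachable over the network. Allowed values are one (just the primary shard), all (the primary and all replicas), or quorum (a majority of shard copies).
The default quorum for a primary shard is int( (primary + number_of_replicas) / 2 ) + 1
timeout: if too few shard copies are available, the request waits, by default for up to 1 minute, and then times out. A value of 100 is 100 milliseconds, and 30s is 30 seconds.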
A new index has 1 replica by default, which means that two active shard copies should be required in order to satisfy the need for a quorum. However, these default settings would prevent us from doing anything useful with a single-node cluster. To avoid this problem, the requirement for a quorum is enforced only when number_of_replicas is greater than 1.
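As a worked example of the quorum rule, here is a minimal Python sketch. `required_active_copies` is a hypothetical helper, not part of any Elasticsearch client. It assumes one primary per shard, and it assumes that when the quorum is not enforced (number_of_replicas of 0 or 1) only the primary itself is needed.

```python
def required_active_copies(number_of_replicas: int, consistency: str = "quorum") -> int:
    """Return how many shard copies must be active before a write is attempted."""
    primary = 1  # each shard has exactly one primary
    total_copies = primary + number_of_replicas

    if consistency == "one":
        return 1  # just the primary
    if consistency == "all":
        return total_copies  # the primary and every replica
    if consistency == "quorum":
        if number_of_replicas <= 1:
            # Assumption: quorum is enforced only when number_of_replicas > 1,
            # so here the primary alone is treated as sufficient.
            return 1
        # int( (primary + number_of_replicas) / 2 ) + 1
        return total_copies // 2 + 1
    raise ValueError(f"unknown consistency value: {consistency!r}")


if __name__ == "__main__":
    for replicas in range(0, 5):
        print(replicas, "replicas ->", required_active_copies(replicas), "active copies")
```

With 2 replicas the formula gives int(3 / 2) + 1 = 2, and with 3 replicas it gives int(4 / 2) + 1 = 3.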
retrieving a document
For read requests, the requesting node will choose a different shard copy on every request in order to balance the load; it round-robins through all shard copies.
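The round-robin selection described above can be sketched with `itertools.cycle`. The `ReadBalancer` class and the copy labels are illustrative assumptions based on the layout used in these notes (P0 on Node 3, replicas on Node 1 and Node 2); this is not how Elasticsearch itself is implemented.

```python
from itertools import cycle


class ReadBalancer:
    """Hand out shard copies in round-robin order, one per read request."""

    def __init__(self, shard_copies):
        self._copies = cycle(list(shard_copies))

    def next_copy(self):
        # Each call advances to the next copy and wraps around at the end.
        return next(self._copies)


if __name__ == "__main__":
    balancer = ReadBalancer(["P0 on Node 3", "R0 on Node 1", "R0 on Node 2"])
    for request_number in range(1, 5):
        print("read", request_number, "->", balancer.next_copy())
```

After every copy has served one read, the fourth request goes back to the first copy.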
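partial updates to a document
1. The request is sent to Node 1, the requesting node. By computing the hash, it determines that the document belongs to shard 0.
2. It forwards the request to Node 3, which holds the primary shard P0.
3. Node 3 retrieves the document from the primary shard P0 and updates it.
4. Node 3 copies the updated document to the replica shards.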
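multi-document patterns
Because Elasticsearch knows which node and shard each document lives on, a multi-document request is split into groups by shard (every request in a group targets the same primary shard), and each group is then forwarded as a single batch to the node that holds that shard.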
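The per-shard grouping described above might look like the following Python sketch. Both `shard_for` and `group_by_shard` are hypothetical helpers, and `zlib.crc32` is again only a stand-in for the real hash function; the point is the grouping step, not the exact shard numbers.

```python
import zlib
from collections import defaultdict


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    """Stand-in routing function: shard = hash(routing) % number_of_primary_shards."""
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


def group_by_shard(doc_ids, number_of_primary_shards: int) -> dict:
    """Split a multi-document request into one batch of ids per primary shard."""
    groups = defaultdict(list)
    for doc_id in doc_ids:
        groups[shard_for(doc_id, number_of_primary_shards)].append(doc_id)
    return dict(groups)


if __name__ == "__main__":
    batches = group_by_shard(["1", "2", "3", "4", "5"], 2)
    for shard, ids in sorted(batches.items()):
        # Each batch would be forwarded once to the node holding that shard.
        print("shard", shard, "->", ids)
```

Each batch can then be sent in one hop to the node that holds the corresponding shard, instead of forwarding every document separately.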