Replication

Intro

Replication means keeping a copy of the same data on multiple machines that are connected via a network.
There are several reasons why you might want to replicate data:

  1. To keep data geographically close to your users (and thus reduce latency)
  2. To allow the system to continue working even if some of its parts have failed (and thus increase availability)
  3. To scale out the number of machines that can serve read queries (and thus increase read throughput)

We will discuss three popular approaches to replicating changes between nodes: single-leader, multi-leader, and leaderless replication. Along the way we will cover synchronous versus asynchronous replication, how to handle failed replicas, replication lag, and related problems.

Leaders and Followers

Every write to the database needs to be processed by every replica; otherwise, the replicas would no longer contain the same data. The most common solution for this is called leader-based replication. It works as follows:

  1. One of the replicas is designated the leader. When clients want to write to the database, they must send their requests to the leader, which first writes the new data to its local storage.
  2. The other replicas are known as followers. Whenever the leader writes new data to its local storage, it also sends the data change to all of its followers as part of a replication log or change stream. Each follower takes the log from the leader and updates its local copy of the database accordingly, by applying all writes in the same order as they were processed on the leader.
  3. When a client wants to read from the database, it can query either the leader or any of the followers. However, writes are only accepted on the leader.
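A minimal in-memory sketch of this flow (purely illustrative, not how any real database implements it): the leader writes locally, appends to a replication log, and followers apply the log entries in the same order.

```python
# Minimal sketch of leader-based replication (illustrative only).

class Follower:
    def __init__(self):
        self.data = {}           # local copy of the database
        self.applied_offset = 0  # position in the replication log

    def apply(self, offset, key, value):
        # Apply writes in exactly the order the leader processed them.
        assert offset == self.applied_offset + 1
        self.data[key] = value
        self.applied_offset = offset

class Leader:
    def __init__(self, followers):
        self.data = {}
        self.log = []            # replication log / change stream
        self.followers = followers

    def write(self, key, value):
        # 1. Write to the leader's local storage first.
        self.data[key] = value
        self.log.append((key, value))
        offset = len(self.log)
        # 2. Send the change to every follower (asynchronously in practice).
        for follower in self.followers:
            follower.apply(offset, key, value)

followers = [Follower(), Follower()]
leader = Leader(followers)
leader.write("user:42:name", "Alice")
# Reads may go to the leader or any follower; writes only go to the leader.
print(followers[0].data)  # {'user:42:name': 'Alice'}
```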

Synchronous Versus Asynchronous Replication


The advantage of synchronous replication is that the follower is guaranteed to have an up-to-date copy of the data that is consistent with the leader. If the leader suddenly fails, we can be sure that the data is still available on the follower. The disadvantage is that if the synchronous follower doesn’t respond, the write cannot be processed. The leader must block all writes and wait until the synchronous replica is available again.

In practice, enabling synchronous replication usually means that one of the followers is synchronous and the others are asynchronous (sometimes called a semi-synchronous configuration). If the synchronous follower becomes unavailable or slow, one of the asynchronous followers is made synchronous. This guarantees that you have an up-to-date copy of the data on at least two nodes.

Often, leader-based replication is configured to be completely asynchronous. In this case, if the leader fails and is not recoverable, any writes that have not yet been replicated to followers are lost, even if they have already been confirmed to the client.
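As a rough sketch of the semi-synchronous setup (assumptions: one synchronous follower and fire-and-forget asynchronous followers; all names are made up for illustration), the leader confirms the write to the client only after the synchronous follower has acknowledged it:

```python
# Sketch of semi-synchronous replication: the write is confirmed to the client
# only after the one synchronous follower has acknowledged it.
import queue
import threading

def replicate(leader_data, sync_follower, async_followers, key, value):
    leader_data[key] = value                  # the leader writes locally first

    ack = queue.Queue()
    def send_sync():
        sync_follower[key] = value            # synchronous follower applies the write
        ack.put("ok")                         # ... and acknowledges it
    threading.Thread(target=send_sync).start()

    for follower in async_followers:          # asynchronous followers: fire and forget
        threading.Thread(target=follower.__setitem__, args=(key, value)).start()

    # Block until the synchronous follower responds; if it never does,
    # the write cannot be reported as successful.
    ack.get(timeout=1.0)
    return "write confirmed"

leader, sync_follower, async_followers = {}, {}, [{}]
print(replicate(leader, sync_follower, async_followers, "x", 1))
```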

Setting Up New Followers

  1. Take a consistent snapshot of the leader’s database and copy it to the new follower node.
  2. The follower connects to the leader and requests all the data changes that have happened since the snapshot was taken.
  3. When the follower has processed the backlog of data changes since the snapshot, it can now continue to process data changes from the leader as they happen.
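A tiny sketch of these three steps, assuming the replication log is a simple list and the snapshot records the log position at which it was taken:

```python
# Sketch of setting up a new follower from a snapshot plus the replication log.

def setup_new_follower(leader_snapshot, snapshot_offset, leader_log):
    follower = dict(leader_snapshot)           # 1. copy the snapshot
    backlog = leader_log[snapshot_offset:]     # 2. request all changes since the snapshot
    for key, value in backlog:                 # 3. process the backlog ("catch up")
        follower[key] = value
    return follower                            # it can now follow the live change stream

log = [("a", 1), ("b", 2), ("c", 3)]
snapshot = {"a": 1}                            # snapshot taken after the first log entry
print(setup_new_follower(snapshot, 1, log))    # {'a': 1, 'b': 2, 'c': 3}
```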

Handling Node Outages

Follower failure

If a follower crashes and is restarted, it knows from the log on its local disk which transaction it processed last before the fault occurred. Thus, the follower can connect to the leader and request all the data changes that occurred while it was disconnected. When it has applied these changes, it has caught up to the leader and can continue receiving a stream of data changes as before.

Leader failure

  1. Determining that the leader has failed (e.g., via a heartbeat timeout).
  2. Choosing a new leader. The best candidate for leadership is usually the replica with the most up-to-date data changes from the old leader (this is a consensus problem); see the sketch after this list.
  3. Reconfiguring the system to use the new leader. If the old leader comes back, it might still believe that it is the leader, not realizing that the other replicas have forced it to step down. This situation is called split brain.
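For step 2, the selection rule itself can be pictured very simply (in reality, getting all nodes to agree on this choice is the hard, consensus part):

```python
# Sketch: promote the replica with the most up-to-date replication log.
# Agreeing on this choice across nodes requires a consensus protocol.

def choose_new_leader(replicas):
    # replicas: mapping of replica name -> last applied log offset
    return max(replicas, key=replicas.get)

print(choose_new_leader({"follower-1": 4711, "follower-2": 4719}))  # follower-2
```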

Implementation of Replication Logs

Statement-based replication

In the simplest case, the leader logs every write request (statement) that it executes and sends that statement log to its followers.
There are various ways in which this approach to replication can break down:

  1. Any statement that calls a nondeterministic function, such as NOW() or RAND(), is likely to generate a different value on each replica; see the sketch after this list.
  2. If statements depend on the existing data, they must be executed in exactly the same order on each replica.
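A small simulation of the first problem (the statement text and helper function are invented for illustration): replaying the statement means re-evaluating the nondeterministic function on each replica, so the replicas diverge.

```python
# Sketch of why statement-based replication breaks with nondeterministic
# functions: each replica re-evaluates RANDOM() when it replays the statement.
import random

def execute_statement(db, statement):
    if statement == "INSERT token = RANDOM()":   # hypothetical nondeterministic statement
        db["token"] = random.random()            # evaluated independently per replica

leader, follower = {}, {}
execute_statement(leader, "INSERT token = RANDOM()")
execute_statement(follower, "INSERT token = RANDOM()")   # replays the same statement
print(leader["token"] == follower["token"])              # almost certainly False
```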

Write-ahead log (WAL) shipping

The append-only log that the storage engine already writes (for crash recovery) can also be shipped over the network to followers, which process it to build an exact copy of the leader’s data. The downside is that such a log describes the data at a very low level, so replication is closely coupled to the storage engine.

Logical (row-based) log replication

An alternative is to use different log formats for replication and for the storage engine, which allows the replication log to be decoupled from the storage engine internals.
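As an illustration only (the field names are invented), a logical log entry typically describes a change at the granularity of a row:

```python
# Hypothetical logical (row-based) log: each record identifies the affected row
# and carries the new column values, independent of the storage engine's
# on-disk format.
logical_log = [
    {"op": "insert", "table": "users", "row_id": 42,
     "columns": {"name": "Alice", "email": "alice@example.com"}},
    {"op": "update", "table": "users", "row_id": 42,
     "columns": {"email": "alice@new.example.com"}},
    {"op": "delete", "table": "users", "row_id": 42},
]
```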

Trigger-based replication

A trigger lets you register custom application code that is automatically executed when a data change occurs in a database system. The trigger has the opportunity to log this change into a separate table, from which it can be read by an external process. That external process can then apply any necessary application logic and replicate the data change to another system.
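A loose sketch of that shape (real systems register the trigger in SQL; `outbox`, `on_change_trigger`, and `external_replicator` are invented names):

```python
# Sketch of trigger-based replication: a trigger copies every change into a
# separate "outbox" table, and an external process ships those entries onward.

outbox = []                                     # the separate table the trigger writes to

def on_change_trigger(table, row):              # custom code run on each data change
    outbox.append({"table": table, "row": dict(row)})

def external_replicator(target_system):
    while outbox:                               # the external process reads the outbox
        change = outbox.pop(0)
        target_system.setdefault(change["table"], []).append(change["row"])

users, other_system = {}, {}
users[1] = {"id": 1, "name": "Alice"}
on_change_trigger("users", users[1])            # would be fired by the database itself
external_replicator(other_system)
print(other_system)                             # {'users': [{'id': 1, 'name': 'Alice'}]}
```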

Problems with Replication Lag

Reading Your Own Writes

If a user writes some data and then immediately reads it from a follower that has not yet received that write, it looks to the user as if their data was lost. In this situation, we need read-after-write consistency: a guarantee that if the user reloads the page, they will always see any updates they submitted themselves. Other users’ updates may not be visible until some later time.
There are various possible techniques to implement read-after-write consistency. To mention a few:

  1. When reading something that the user may have modified, read it from the leader, otherwise, read it from a follower. This requires that you have some way of knowing whether something might have been modified, without actually querying it.
  2. The client can remember the timestamp of its most recent write; the system then ensures that the replica serving any reads for that user reflects updates at least up to that timestamp (sketched after this list).
  3. If your replicas are distributed across multiple datacenters, any request that needs to be served by the leader must be routed to the datacenter that contains the leader.
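A compact sketch of the second technique, assuming each replica reports a logical timestamp up to which it has applied the replication log (all names are illustrative):

```python
# Sketch of read-after-write consistency: read only from a replica that has
# replicated at least up to the client's last-write timestamp, otherwise
# fall back to the leader.

def read_your_writes(key, last_write_ts, leader, followers):
    for follower in followers:
        if follower["replicated_up_to"] >= last_write_ts:
            return follower["data"].get(key)     # this follower is recent enough
    return leader["data"].get(key)               # otherwise read from the leader

leader = {"data": {"bio": "new"}, "replicated_up_to": 10}
followers = [
    {"data": {"bio": "old"}, "replicated_up_to": 7},   # still lagging
    {"data": {"bio": "new"}, "replicated_up_to": 10},
]
print(read_your_writes("bio", last_write_ts=9, leader=leader, followers=followers))  # 'new'
```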

Monotonic Reads


Monotonic reads is a guarantee that a user does not see time go backward: having read a newer value, they should not later read an older value from a more lagging replica. One way of achieving monotonic reads is to make sure that each user always makes their reads from the same replica. However, if that replica fails, the user’s queries will need to be rerouted to another replica.
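One way to pin each user to a replica is to choose it deterministically by hashing the user ID rather than picking a replica at random; a minimal sketch:

```python
# Sketch of monotonic reads: route each user's reads to the same replica,
# chosen deterministically by hashing the user ID.
import hashlib

def replica_for(user_id, replicas):
    digest = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
    return replicas[digest % len(replicas)]

replicas = ["follower-1", "follower-2", "follower-3"]
print(replica_for("user-1234", replicas))   # always the same replica for this user
```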

Consistent Prefix Reads


Consistent prefix reads is a guarantee that if a sequence of writes happens in a certain order, then anyone reading those writes will see them appear in the same order (for example, nobody should see an answer before the question it replies to). If the database always applies writes in the same order, reads always see a consistent prefix, so this anomaly cannot happen; it is a particular problem in partitioned (sharded) databases, where partitions operate independently and there is no global ordering of writes. One solution is to make sure that any writes that are causally related to each other are written to the same partition. There are also algorithms that explicitly keep track of causal dependencies.
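A tiny sketch of the same-partition idea, assuming writes carry a key (here a conversation ID) that captures the causal relationship:

```python
# Sketch: causally related writes (e.g. a question and its answer in the same
# conversation) are routed to the same partition by hashing a shared key,
# so they are stored and replicated in order relative to each other.
import hashlib

def partition_for(conversation_id, num_partitions):
    digest = int(hashlib.md5(conversation_id.encode()).hexdigest(), 16)
    return digest % num_partitions

question_partition = partition_for("conversation-77", 6)
answer_partition = partition_for("conversation-77", 6)
print(question_partition == answer_partition)   # True: same partition, same order
```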

Multi-Leader Replication

Leader-based replication has one major downside: if you can’t connect to the leader for any reason, you can’t write to the database.

Use Cases for Multi-Leader Replication

Multi-datacenter operation


  1. Performance: Every write can be processed in the local datacenter and is replicated asynchronously to the other datacenters, so the inter-datacenter network delay is hidden from users.
  2. Tolerance of datacenter outages: In a single-leader configuration, failure of the datacenter containing the leader forces a failover; in a multi-leader configuration, each datacenter can continue operating independently of the others, and replication catches up when the failed datacenter comes back online.

Although multi-leader replication has advantages, it also has a big downside: the same data may be concurrently modified in two different datacenters, and those write conflicts must be resolved.

Clients with offline operation

In this case, every device has a local database that acts as a leader, and there is an asynchronous multi-leader replication process. The replication lag may be hours or even days, depending on when you have internet access available.

Handling Write Conflicts


Conflict avoidance

The simplest strategy for dealing with conflicts is to avoid them: if the application can ensure that all writes for a particular record go through the same leader, then conflicts cannot occur.

Converging toward a consistent state

A single-leader database applies writes in a sequential order: if there are several updates to the same field, the last write determines the final value of the field.
In a multi-leader configuration, there is no defined ordering of writes, so it’s not clear what the final value should be.
There are various ways of achieving convergent conflict resolution:

  1. Give each write a unique ID (e.g., a timestamp), pick the write with the highest ID as the winner, and throw away the other writes. If a timestamp is used, this is known as last write wins (LWW). This approach is dangerously prone to data loss; see the sketch after this list.
  2. Give each replica a unique ID, and let writes that originated at a higher-numbered replica always take precedence over writes that originated at a lower-numbered replica. This approach also implies data loss.
  3. Somehow merge the values together.
  4. Record the conflict in an explicit data structure that preserves all information, and write application code that resolves the conflict at some later time.
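A minimal sketch of the first approach: because every replica applies the same deterministic rule to the same set of conflicting writes, they all converge on the same value, but the losing writes are silently discarded.

```python
# Sketch of convergent conflict resolution via last write wins (LWW):
# every replica keeps only the write with the highest unique ID.

def resolve(conflicting_writes):
    # conflicting_writes: list of (unique_id, value), e.g. id = (timestamp, replica_id)
    return max(conflicting_writes)[1]

writes_seen_by_replica_1 = [((1700000000, "A"), "title = B"),
                            ((1700000001, "B"), "title = C")]
writes_seen_by_replica_2 = list(reversed(writes_seen_by_replica_1))  # different arrival order
# Both replicas converge on the same winner, regardless of arrival order:
print(resolve(writes_seen_by_replica_1) == resolve(writes_seen_by_replica_2))  # True
# ... but the write "title = B" has been thrown away on both replicas.
```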

Leaderless Replication

In some leaderless implementations, the client directly sends its writes to several replicas, while in others, a coordinator node does this on behalf of the client.

Writing to the Database When a Node Is Down


When the unavailable node comes back online, it needs to catch up on the writes it missed. Two mechanisms are often used:
  1. Read repair: When a client reads from several nodes in parallel, it can detect any stale responses and write the newer value back to the stale replica (sketched below). This approach works well for values that are frequently read.
  2. Anti-entropy process: A background process constantly looks for differences in the data between replicas and copies any missing data from one replica to another.
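A small sketch of read repair, assuming each replica stores a (version, value) pair for every key:

```python
# Sketch of read repair: read from several replicas in parallel, take the value
# with the highest version number, and write it back to any stale replica.

def read_with_repair(key, replicas):
    responses = [(replica.get(key, (0, None)), replica) for replica in replicas]
    (latest_version, latest_value), _ = max(responses, key=lambda r: r[0][0])
    for (version, _), replica in responses:
        if version < latest_version:
            replica[key] = (latest_version, latest_value)   # repair the stale replica
    return latest_value

r1 = {"x": (7, "new")}
r2 = {"x": (6, "old")}       # this replica missed the latest write
r3 = {"x": (7, "new")}
print(read_with_repair("x", [r1, r2, r3]))   # 'new'
print(r2)                                    # {'x': (7, 'new')} -- repaired
```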