How Does Semisynchronous MySQL Replication Work?


With the recent release of Percona XtraDB Cluster, I am increasingly being asked about MySQL’s semi-synchronous replication. I find that there are often a number of misconceptions about how semi-synchronous replication really works. I think it is very important to understand what guarantees you actually get with semi-synchronous replication, and what you don’t get.

The first thing to understand is that despite the name, semi-synchronous replication is still asynchronous. Semi-synchronous is actually a pretty bad name, because there is no strong coupling between a commit on the master and a commit on the replicas. To understand why, let’s look at what truly synchronous replication means. In truly synchronous replication, when you commit a transaction, the commit does not complete until all replicas have also committed successfully. In MySQL’s semi-synchronous replication, the commit completes before the transaction is even sent to any of the replicas. Therefore, by definition the transaction cannot have committed on any of the replicas. If there’s any problem after the commit happens on the master, it’s possible that the replicas won’t get the transaction, and even after they do, there’s no guarantee they can apply and commit it successfully themselves (duplicate key error, anyone?). If any of these problems happens, it’s too late–the commit is already permanent on the master, and can’t be rolled back.

What should semi-synchronous replication be called instead? I believe that it should be called delayed-acknowledgment commits, because this is what actually happens. When a transaction commits on the master, the commit proceeds as normal, and the transaction is sent to the replicas as normal, but the client connection to the master is not told that the commit has completed until after at least one replica has acknowledged receiving the transaction.

Another way to look at the same thing is that semi-synchronous replication actually forces the client to be synchronized, not the replicas. The client is forced to wait until the transaction has been sent to one of the replicas, but the commit on the master is not forced to wait at all, nor are replicas forced to do anything. The commit has already happened on the master, so the cat’s out of the bag and there’s no way to force replicas to do anything. As a result, the effect is that the client’s activity is throttled so that it cannot outpace the replica’s ability to fetch updates from the master. Have you seen the bumper sticker that says “don’t drive faster than your Guardian Angel can fly?” That is the effect of this throttling.

Semi-synchronous replication also does not guarantee that your replicas will not become delayed. The client connection is forced to wait until at least one of the replicas has retrieved the transaction, but not until the transaction has actually been applied to the replica. As you probably know, it is perfectly possible to send a very long transaction to the replica in a matter of milliseconds. The replica will take a long time to apply this transaction to its own data, and during that time, it will be delayed relative to the master. However, other transactions can continue committing and sending their changes to the replica, because the process of retrieving changes from the master and applying them run in separate threads on the replica.

Finally, semi-synchronous replication does not provide strong guarantees against data loss. What do I mean by a strong guarantee against data loss? I consider the safety of my data strongly guaranteed when at least one other server must have a copy of the data before it can be committed on the master. However, that is not what happens in semi-synchronous replication. And if there is an error in semi-synchronous replication, such as a crash at the wrong moment, or a timeout, then even the throttling is abandoned, and everything defaults back to the traditional mode of replication.

What does semi-synchronous replication guarantee me then? If there are no errors or timeouts, then the guarantee is essentially that only one transaction per client is likely to be lost if the master crashes.

I do not mean to sound negative, or to send the message that semi-synchronous replication is not useful. It is useful, but if you misunderstand it, you could be relying on a strong guarantee that is not actually provided.

If you want to learn more about this, then I encourage you to read the relevant section of the MySQL manual. But read carefully, for example, the following sentences:

When a commit returns successfully, it is known that the data exists in at least two places (on the master and at least one slave). If the master commits but a crash occurs while the master is waiting for acknowledgment from a slave, it is possible that the transaction may not have reached any slave.

Finally, I would be interested to hear how many people are actually running semi-synchronous replication in production. I have a feeling that very few people are, even though a lot of people seem to have heard about it. What are your experiences?

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值