MySQL Limitations Part 2: The Binary Log [MySQL 短板 2]

This is the second in a series on what’s seriously limiting MySQL in certain circumstances (links: part 1). In the first part, I wrote about single-threaded replication. Upstream from the replicas is the primary, which enables replication by writing a so-called “binary log” of events that modify data in the server. The binary log is a real limitation in MySQL.

The binary log is necessary not only for replication, but for point-in-time recovery, too. Given a backup and the corresponding binary log position, you can replay the binary log and roll forward the state of your server to a desired point in time.

But enabling the binary log reduces MySQL’s performance dramatically. It is not the logging itself that’s the problem — writing the log is usually not much additional work. It’s ensuring consistency and durability that is expensive. Flushing it to disk adds an fsync call for every transaction. And the server performs an XA transaction between InnoDB and the binary log. This adds more fsync calls, and causes mutex contention, and prevents group commit, and probably other things that aren’t coming to mind now.

The performance reduction can be an order of magnitude or more.

What’s the solution? I’m not sure I can summarize it concisely. There is a lot of complexity, and honestly I don’t understand some of the server internals fully enough to have a 50-thousand-foot view of it all. The binary logging and replication code, and its interaction with InnoDB, is difficult to understand. Kristian Nielsen has an extensive series of posts on group commit alone.

I think that a full fix might require significant architectural changes to MySQL. This will be hard. Maybe Drizzle is going in a good direction — time will tell. All of the solutions that I can think of are too simplistic. For example, doing replication through the InnoDB transaction log would work fine if a) all the data were in InnoDB, and b) InnoDB’s data didn’t have to be synchronized with the .frm files (and Drizzle has gotten rid of the .frm files, hooray), and c) privileges and other changes to the non-InnoDB data in MySQL were handled manually.

It could work if you just made sure that you didn’t change privileges or schema, but that’s a description of a pretty limited, clunky replication system from the user’s point of view. Still, I have considered it. There would need to be a mechanism of transporting the log files, and InnoDB would have to be put into a state of constant “recovery,” and it would have to be modified to be available read-only in this state so that it could be used for read queries. This can be done, of course. It’s just a matter of how hard it is.

It’s worth noting that PBXT does replication through its transaction logs, so there’s even precedent for this among MySQL storage engines. And there is Galera’s multi-master synchronization technology to look at, too.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值