MongoDB point-in-time recoveries

(…or how we saved 600 dollars a month and got a better backup solution)

by Gitter

At Gitter, a small startup, we work hard every day to provide the best chat for communities (have you checked Ping Pong Wars?), while keeping costs low. So when I found that we were paying $600 every month for a basic backup service for our databases instead of Rubik’s cubes and craft beer, I thought there was room for an easy win.


Point-in-time recoveries are the state of the art when it comes to backing up your databases. That is, being able to pinpoint a particular transaction, often a catastrophic one, and fully recover the state of your dataset up to that point. Our solution only provided hourly backups, which isn’t quite enough for our needs. And yet we were paying a lot of money for that. Bah.


At Gitter, we use MongoDB on EC2 instances with EBS volumes to store our datasets. This solution is very convenient when it comes to architecting an in-house backup system that supports point-in-time recoveries, and it’s easier than it may seem. I’ll show you how we do it.


The snapshot

First, the snapshot part. We take snapshots regularly using a script I wrote. It’s based on the official tutorial from MongoDB, so nothing too surprising here. Snapshots are also very handy when you want to spin up a new replica node: just create a new instance using a data volume based on the latest snapshot, add it to the replica set, and MongoDB will only have to replay less than an hour of oplog, which is a lot faster than a full resync.

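For reference, here’s a minimal sketch of what such a snapshot script boils down to, assuming the aws CLI is configured on the node; the volume ID is a placeholder:

#!/usr/bin/env bash
# Minimal sketch of the freeze/snapshot/unfreeze cycle.
# VOLUME_ID is a placeholder for the EBS volume holding both data files and journal.
set -euo pipefail
VOLUME_ID="vol-0123456789abcdef0"

# Flush pending writes to disk and block new ones.
mongo admin --quiet --eval 'db.fsyncLock()'
# Make sure the node gets unlocked again, whatever happens below.
trap 'mongo admin --quiet --eval "db.fsyncUnlock()"' EXIT

# The call returns as soon as the snapshot is registered;
# the actual copy to S3 happens asynchronously.
aws ec2 create-snapshot \
  --volume-id "$VOLUME_ID" \
  --description "mongodb-$(date -u +%Y%m%dT%H%M%SZ)"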

You want to store both the data files and the journal on the same EBS volume: most of the time it doesn’t impact I/O performance, and achieving consistency can be tricky otherwise.


Then you need to take a snapshot of the EBS volume. You can use your favourite AWS interface to do so. Remember that taking a snapshot is an instantaneous operation: once AWS receives the API call, the volume will be “photographed” at its current state, so you can safely resume your write operations. Nevertheless, it’s recommended to perform this operation on a secondary node.



The whole “freeze mongo; take snapshot; unfreeze mongo” cycle takes about 1.4 seconds for us, so it’s an affordable tradeoff given the great convenience it gives us. Also, the advantage of the EBS snapshot solution is that AWS compresses the blocks and only stores differentials in S3, which represents a further saving in terms of cost.


Job done, you’re a cost-saving hero! Close all those expensive accounts and chip in for a pay raise. But is it enough?


The recovery

EBS snapshots of your MongoDB dataset are only as granular as the frequency at which you take them, say every 30 minutes or even one hour. This may not be enough, and taking a snapshot every minute would be overkill (and you’d still have one-minute granularity). No matter how you put it, some data will be lost, even if it’s just a little. To avoid this, you can use the MongoDB oplog to replay the transactions from the snapshot time up to the rogue one and fill in the time gap. Note that this only works if your oplog window is wide enough, so be very careful sizing your oplog. You can keep an eye on it by using this statsd emitter.

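If you just want a quick manual check instead of a metric, the mongo shell helper db.printReplicationInfo() reports the current window directly (the figures below are illustrative):

$ mongo --port 27017
> db.printReplicationInfo()
configured oplog size:   10240MB
log length start to end: 94437secs (26.23hrs)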

Also, the oplog must be available on a replica node, even if the whole dataset is gone. In the worst-case scenario, the transaction that destroyed your dataset was such a nasty one that you’ll end up recovering only up to the snapshot time, which, considering the magnitude of the disaster, isn’t such a bad outcome.


So where can you get the oplog from? Again, a secondary node is generally a good choice. You can dump the oplog with mongodump, but there’s a caveat: you only want to dump transactions that happened after the last one in the snapshot you’re recovering. The reason is that, for instance, replaying insertions when a unique index constraint is present will fail your restore. So you want to trim your oplog on both sides: after the snapshot and before the catastrophic event.


To do this you need to find the timestamp of the last transaction in the snapshot. Create an EBS volume using the snapshot taken prior to the catastrophic event, mount it on an instance, and start mongod bound to localhost on a temporary port, say 27272 (a sketch of that plumbing is below).

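The plumbing for that step might look like the following; the snapshot and volume IDs, instance ID, device name and mount point are all placeholders for your environment:

$ aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 --availability-zone us-east-1a
$ aws ec2 attach-volume --volume-id vol-0abcdef1234567890 --instance-id i-0123456789abcdef0 --device /dev/xvdf
$ sudo mount /dev/xvdf /mnt/recovery
$ mongod --dbpath /mnt/recovery/mongodb --port 27272 --bind_ip 127.0.0.1 --fork --logpath /tmp/recovery-mongod.log

With the throwaway mongod up, query it for the timestamp of the most recent oplog entry: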

$ mongo --port 27272 local
> db.oplog.rs.find({}, {ts: 1}).sort({ts: -1}).limit(1)
{ "ts" : Timestamp(1459850401, 11) }

Dump the oplog from a secondary replica node using the timestamp you just obtained in the query. This creates a directory called oplog containing the oplog collection BSON file and the collection metadata, which we will ignore. Don’t be afraid of dumping the oplog: it isn’t a very heavy operation, and it will only take a few seconds if you have reasonable bandwidth.


$ mongodump -h secondary-node \
    --db local \
    --collection oplog.rs \
    --out oplog \
    --query '{"ts": { "$gt": { "$timestamp": {"t": 1459850401, "i": 11}}}}'

Convert the BSON data into JSON so that it becomes human-readable:


$ bsondump oplog/local/oplog.rs.bson > oplog.json

Find the timestamp of the bogus transaction, which represents the point up to which you want to replay the oplog:


$ grep "Hello. My name is Inigo Montoya. You killed my father. Prepare to die." oplog.json{"ts":{"$timestamp":{"t":1459852199,"i":1}},"h":{"$numberLong":"-6882763316726998947"},"v":2,"op":"i","ns":"quotes.movies","o":{"_id":{"$oid":"570393abf5d634897f2360a3"},"quote":"Hello. My name is Inigo Montoya. You killed my father. Prepare to die.","character":"Inigo Montoya","title":"The Princess Bride"}

In this case your timestamp is 1459852199:1.


Next, move the oplog to where mongorestore will look for it:


$ mv oplog/local/oplog.rs.bson oplog/oplog.bson

Now you’re ready to replay the oplog, using --oplogLimit to set the delimiter:


$ mongorestore -h localhost:27272 --oplogReplay --oplogLimit 1459852199:1 oplog

Time to verify your database, but there shouldn’t be any problems if you carefully followed the instructions.

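A quick sanity check, sticking with the quotes.movies namespace from the example above, is to confirm that the rogue write wasn’t replayed while the documents before it are intact:

$ mongo --port 27272 quotes
> db.movies.find({quote: /Inigo Montoya/}).count()  // the rogue insert should not be there
0
> db.movies.find().sort({_id: -1}).limit(1)         // newest surviving document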

You’re now ready to restart the instance in production. Well done!


This post was written by Daniele Valeriani.


Translated from: https://www.freecodecamp.org/news/mongodb-point-in-time-recoveries-or-how-we-saved-600-dollars-a-month-and-got-a-better-backup-55466b7d714/
