Redis如何持久化?

AOF

AOF有哪些优点?

  • Using AOF Redis is much more durable: you can have different fsync policies: no fsync at all, fsync every second, fsync at every query. With the default policy of fsync every second, write performance is still great. fsync is performed using a background thread and the main thread will try hard to perform writes when no fsync is in progress, so you can only lose one second worth of writes.
  • The AOF log is an append-only log, so there are no seeks, nor corruption problems if there is a power outage. Even if the log ends with a half-written command for some reason (disk full or other reasons) the redis-check-aof tool is able to fix it easily.
  • Redis is able to automatically rewrite the AOF in background when it gets too big. The rewrite is completely safe as while Redis continues appending to the old file, a completely new one is produced with the minimal set of operations needed to create the current data set, and once this second file is ready Redis switches the two and starts appending to the new one.
  • AOF contains a log of all the operations one after the other in an easy to understand and parse format. You can even easily export an AOF file. For instance even if you’ve accidentally flushed everything using the FLUSHALL command, as long as no rewrite of the log was performed in the meantime, you can still save your data set just by stopping the server, removing the latest command, and restarting Redis again.
  • AOF是先执行操作再写操作,这样可以保证写入AOF操作的正确性

AOF有哪些缺点?

  • AOF files are usually bigger than the equivalent RDB files for the same dataset.
  • AOF can be slower than RDB depending on the exact fsync policy. In general with fsync set to every second performance is still very high, and with fsync disabled it should be exactly as fast as RDB even under high load. Still RDB is able to provide more guarantees about the maximum latency even in the case of a huge write load.

如何解决AOF的缺点?

  • 通过重写AOF可以减小AOF文件,这样在恢复时候会更快。
  • 通过不同的落盘策略来权衡数据安全和效率。

AOF的重写机制是怎么样的?

1、Redis forks, so now we have a child and a parent process.

2、The child starts writing the new AOF in a temporary file.

3、The parent accumulates all the new changes in an in-memory buffer (but at the same time it writes the new changes in the old append-only file, so if the rewriting fails, we are safe). (写入aof的缓冲区和重写缓冲区中)

4、When the child is done rewriting the file, the parent gets a signal, and appends the in-memory buffer at the end of the file generated by the child.

Now Redis atomically renames the new file into the old one, and starts appending new data into the new file.

redis 7.0之后父线程会创建多个增量文件,在里面添加新的操作。

AOF重写会产生什么样的问题?

问题1,Redis采用fork子进程重写AOF文件时,潜在的阻塞风险包括:fork子进程 和 AOF重写过程中父进程产生写入的场景,下面依次介绍。

a、fork子进程,fork这个瞬间一定是会阻塞主线程的(注意,fork时并不会一次性拷贝所有内存数据给子进程,老师文章写的是拷贝所有内存数据给子进程,我个人认为是有歧义的),fork采用操作系统提供的写实复制(Copy On Write)机制,就是为了避免一次性拷贝大量内存数据给子进程造成的长时间阻塞问题,但fork子进程需要拷贝进程必要的数据结构,其中有一项就是拷贝内存页表(虚拟内存和物理内存的映射索引表),这个拷贝过程会消耗大量CPU资源,拷贝完成之前整个进程是会阻塞的,阻塞时间取决于整个实例的内存大小,实例越大,内存页表越大,fork阻塞时间越久。拷贝内存页表完成后,子进程与父进程指向相同的内存地址空间,也就是说此时虽然产生了子进程,但是并没有申请与父进程相同的内存大小。那什么时候父子进程才会真正内存分离呢?“写实复制”顾名思义,就是在写发生时,才真正拷贝内存真正的数据,这个过程中,父进程也可能会产生阻塞的风险,就是下面介绍的场景。

b、fork出的子进程指向与父进程相同的内存地址空间,此时子进程就可以执行AOF重写,把内存中的所有数据写入到AOF文件中。但是此时父进程依旧是会有流量写入的,如果父进程操作的是一个已经存在的key,那么这个时候父进程就会真正拷贝这个key对应的内存数据,申请新的内存空间,这样逐渐地,父子进程内存数据开始分离,父子进程逐渐拥有各自独立的内存空间。因为内存分配是以页为单位进行分配的,默认4k,如果父进程此时操作的是一个bigkey,重新申请大块内存耗时会变长,可能会产阻塞风险。另外,如果操作系统开启了内存大页机制(Huge Page,页面大小2M),那么父进程申请内存时阻塞的概率将会大大提高,所以在Redis机器上需要关闭Huge Page机制。Redis每次fork生成RDB或AOF重写完成后,都可以在Redis log中看到父进程重新申请了多大的内存空间。

问题2,AOF重写不复用AOF本身的日志,一个原因是父子进程写同一个文件必然会产生竞争问题,控制竞争就意味着会影响父进程的性能。二是如果AOF重写过程中失败了,那么原本的AOF文件相当于被污染了,无法做恢复使用。所以Redis AOF重写一个新文件,重写失败的话,直接删除这个文件就好了,不会对原先的AOF文件产生影响。等重写完成之后,直接替换旧文件即可。

问题一:在AOF重写期间,Redis运行的命令会被积累在缓冲区,待AOF重写结束后会进行回放,在高并发情况下缓冲区积累可能会很大,这样就会导致阻塞,Redis后来通过Linux管道技术让aof期间就能同时进行回放,这样aof重写结束后只需要回放少量剩余的数据即可

问题二:对于任何文件系统都是不推荐并发修改文件的,例如hadoop的租约机制,Redis也是这样,避免重写发生故障,导致文件格式错乱最后aof文件损坏无法使用,所以Redis的做法是同时写两份文件,最后通过修改文件名的方式,保证文件切换的原子性

RDB

RBD的优势

  • RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups. For instance you may want to archive your RDB files every hour for the latest 24 hours, and to save an RDB snapshot every day for 30 days. This allows you to easily restore different versions of the data set in case of disasters.
  • RDB is very good for disaster recovery, being a single compact file that can be transferred to far data centers, or onto Amazon S3 (possibly encrypted).
  • RDB maximizes Redis performances since the only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest. The parent process will never perform disk I/O or alike.
  • RDB allows faster restarts with big datasets compared to AOF.

RDB的缺点

  • RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example after a power outage). You can configure different save points where an RDB is produced (for instance after at least five minutes and 100 writes against the data set, you can have multiple save points). However you’ll usually create an RDB snapshot every five minutes or more, so in case of Redis stopping working without a correct shutdown for any reason you should be prepared to lose the latest minutes of data.
  • RDB needs to fork() often in order to persist on disk using a child process. fork() can be time consuming if the dataset is big, and may result in Redis stopping serving clients for some milliseconds or even for one second if the dataset is very big and the CPU performance is not great. AOF also needs to fork() but less frequently and you can tune how often you want to rewrite your logs without any trade-off on durability.
    总结:1、数据丢失 2、fork()影响线程性能

RDB的过程

  • Redis forks. We now have a child and a parent process.

  • The child starts to write the dataset to a temporary RDB file.

  • When the child is done writing the new RDB file, it replaces the old one.
    利用写时复制,不影响主进程修改数据,但是最坏情况下内存将变成原来两倍。

RDB和AOF配合使用

Redis 4.0中提出了一个混合使用AOF日志和内存快照的方法。简单来说,内存快照以一定的频率执行,在两次快照之间,使用AOF日志记录这期间的所有命令操作。

这样一来,快照不用很频繁地执行,这就避免了频繁fork对主线程的影响。而且,AOF日志也只用记录两次快照间的操作,也就是说,不需要记录所有操作了,因此,就不会出现文件过大的情况了,也可以避免重写开销。

课后问题

我曾碰到过这么一个场景:我们使用一个2核CPU、4GB内存、500GB磁盘的云主机运行Redis,Redis数据库的数据量大小差不多是2GB,我们使用了RDB做持久化保证。当时Redis的运行负载以修改操作为主,写读比例差不多在8:2左右,也就是说,如果有100个请求,80个请求执行的是修改操作。你觉得,在这个场景下,用RDB做持久化有什么风险吗?你能帮着一起分析分析吗?

到这里,关于持久化我们就讲完了,这块儿内容是熟练掌握Redis的基础,建议你一定好好学习下这两节课。如果你觉得有收获,希望你能帮我分享给更多的人,帮助更多人解决持久化的问题。

首先只用RDB做持久化可能会承担数据丢失的风险
其次由于写操作占比很大,那么写时复制会导致内存占用变大。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值