Redis trouble21 -- AOF persistence causes Redis commands to block

Article link: https://blog.csdn.net/weixin_39523456/article/details/122431530

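This article analyzes a performance problem that appeared when Redis was running with AOF persistence: how to read the error log, what caused the problem, and how to fix it, with an emphasis on configuring Redis sensibly.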

1. Error log

 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 * Starting automatic rewriting of AOF on 107914% growth
 * Background append only file rewriting started by pid 4143
 * AOF rewrite child asks to stop sending diffs.
 * Parent agreed to stop sending diffs. Finalizing AOF...
 * Concatenating 0.00 MB of AOF diff received from parent.
 * SYNC append only file rewrite performed
 * AOF rewrite: 2 MB of memory used by copy-on-write
 * Background AOF rewrite terminated with success
 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
 * Background AOF rewrite finished successfully
 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.

2. Problem analysis

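'Configuration file settings'
appendonly yes # enable AOF
appendfsync everysec # AOF fsync policy: fsync to disk once per second
aof-use-rdb-preamble yes # enable mixed AOF + RDB persistence (RDB preamble in the AOF file)
aof-load-truncated yes # if the AOF file turns out to be truncated when Redis starts, load as much usable data as possible instead of refusing to start
aof-rewrite-incremental-fsync yes # fsync the rewritten AOF file in batches, which makes good use of sequential I/O
no-appendfsync-on-rewrite no # set to no to lose as little data as possible: at most 2s of data; set to yes, up to 30s of data can be lost
auto-aof-rewrite-min-size 67108864 # minimum AOF file size for an automatic rewrite: 64 MB
auto-aof-rewrite-percentage 100 # (aof_current_size - aof_base_size) / aof_base_size is compared against 100%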

'An automatic rewrite is triggered when both of the following hold'
1. The current AOF file size (aof_current_size) > 64 MB
2. (aof_current_size - aof_base_size) / aof_base_size >= 100%
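The two trigger conditions can be written out as a small check. This is only a sketch based on my reading of the Redis source, not the actual implementation: the function name `should_rewrite_aof` is made up for illustration, the defaults are taken from the configuration above, and treating the growth comparison as `>=` is an assumption about the exact boundary.

```python
def should_rewrite_aof(current_size, base_size,
                       min_size=67108864, percentage=100):
    """Return True when both automatic-rewrite conditions hold.

    current_size / base_size correspond to aof_current_size / aof_base_size.
    min_size and percentage mirror auto-aof-rewrite-min-size and
    auto-aof-rewrite-percentage from the configuration above.
    """
    if percentage == 0:
        # A percentage of 0 disables automatic rewrites.
        return False
    if current_size <= min_size:
        # Condition 1: the file must be larger than the minimum size.
        return False
    # Avoid dividing by zero when there is no base size yet.
    base = base_size if base_size > 0 else 1
    # Condition 2: growth relative to the size after the last rewrite.
    growth = current_size * 100 // base - 100
    return growth >= percentage


MB = 1024 * 1024
print(should_rewrite_aof(128 * MB, 64 * MB))  # doubled and above 64 MB
print(should_rewrite_aof(60 * MB, 1 * MB))    # huge growth, but below 64 MB
```

Note that when the base size is tiny (for example a queue that was almost empty at the last rewrite), the growth percentage becomes enormous, which matches the "107914% growth" line in the log. In that case the 64 MB minimum size is effectively the only threshold.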

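Analysis combined with monitoring
In the chart on the right, aof_delayed_fsync keeps increasing, which means AOF fsync is being delayed (blocked) again and again.
In the chart on the left, the AOF rewrite conditions described above are already met, and AOF rewrites are running very frequently.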
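If no monitoring dashboard is available, the same counter can be read from the output of `INFO persistence`. The sketch below parses two snapshots of that output and reports how much `aof_delayed_fsync` grew between them. The helper names and the sample values are invented for illustration; only the field names come from Redis.

```python
def parse_info(text):
    """Parse 'key:value' lines from INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Persistence".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def delayed_fsync_increase(before, after):
    """How many delayed fsyncs happened between two INFO snapshots."""
    old = int(parse_info(before)["aof_delayed_fsync"])
    new = int(parse_info(after)["aof_delayed_fsync"])
    return new - old


# Hypothetical snapshots, e.g. captured one minute apart with redis-cli.
SAMPLE_BEFORE = """# Persistence
aof_enabled:1
aof_rewrite_in_progress:0
aof_delayed_fsync:120
"""
SAMPLE_AFTER = """# Persistence
aof_enabled:1
aof_rewrite_in_progress:1
aof_delayed_fsync:135
"""

print(delayed_fsync_increase(SAMPLE_BEFORE, SAMPLE_AFTER))
```

A value that keeps growing from one snapshot to the next is the same signal as the rising curve in the monitoring chart.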

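3. Root cause

After reviewing the monitored commands and the commands recorded in the AOF file, the causes can be summarized as follows:
1. The client uses Redis as a queue and, for fear of losing data, chose AOF for persistence. The values stored in the queue are also large, basically around 30k each, even though the memory usage shown in monitoring is not very high.
2. A large number of big commands pile up in the AOF file, so the file reaches the rewrite trigger condition very quickly and Redis ends up rewriting the AOF over and over.
3. Because no-appendfsync-on-rewrite is set to no, the main process keeps calling fsync on the AOF while a rewrite is in progress. Those fsync calls compete with the rewrite for disk I/O, and combined with the frequent rewrites this is what causes the AOF blocking.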
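A rough calculation shows how quickly values of this size reach the 64 MB threshold. This is a back-of-the-envelope sketch: it ignores protocol overhead in the AOF file, and the write rate of 1000 commands per second is a purely hypothetical figure, not something measured in this incident.

```python
import math


def writes_until_min_size(value_bytes, min_size=67108864):
    """Number of writes of value_bytes needed to fill min_size bytes."""
    return math.ceil(min_size / value_bytes)


value_bytes = 30 * 1024          # roughly 30k per value, as observed
writes = writes_until_min_size(value_bytes)
print(writes)                    # writes needed to pass 64 MB

hypothetical_rate = 1000         # assumed writes per second, for illustration
print(writes / hypothetical_rate)  # seconds of traffic between rewrites
```

With values this large, only a couple of thousand queue pushes are needed before the file crosses the minimum size again, so a busy queue can trigger a rewrite every few seconds.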

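4. Solution

Redis is still best used as a cache. Using it as a queue and persisting that queue with AOF is not recommended, and the case above is a good example of why. The suggestion is to move the queue functionality to a dedicated message queue such as kafka/rabbitmq/rocketmq. If you want to keep using Redis for this, turn off AOF persistence and reduce the size of the values to avoid blocking Redis. As for data loss, add a separate data compensation mechanism, so that if Redis crashes or something else unexpected happens, the data can be pushed again.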

5. appendfsync everysec is not really 1s

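no-appendfsync-on-rewrite no / appendfsync everysec
Data is nominally flushed to disk once per second, but in practice the limit is not 1s. In the flush logic, the threshold the main thread compares the elapsed time against is 2s, so in this case at most 2s of data can be lost.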
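The 2s comparison can be illustrated with a small simulation. This is a simplified sketch of my understanding of the decision Redis makes before writing the AOF buffer under everysec, not a copy of the real code: the class and function names are invented, and time is modeled as whole seconds.

```python
class AofFlushState:
    def __init__(self):
        self.postponed_start = 0  # 0 means no flush is currently postponed
        self.delayed_fsync = 0    # models the aof_delayed_fsync counter


def flush_decision(state, now, fsync_in_progress):
    """Decide what the main thread does with the AOF buffer at time now.

    Returns "write" (normal write), "postpone" (wait for the background
    fsync) or "write_delayed" (waited 2s, write anyway and maybe block).
    """
    if fsync_in_progress:
        if state.postponed_start == 0:
            # First time we see the background fsync still running.
            state.postponed_start = now
            return "postpone"
        if now - state.postponed_start < 2:
            # Less than 2s since we started postponing: keep waiting.
            return "postpone"
        # Waited 2s already: count it and write without waiting for fsync.
        # This is the point where the "Asynchronous AOF fsync is taking
        # too long" message from the log above is emitted.
        state.delayed_fsync += 1
        state.postponed_start = 0
        return "write_delayed"
    state.postponed_start = 0
    return "write"


state = AofFlushState()
for second in (100, 101, 102):
    print(second, flush_decision(state, second, fsync_in_progress=True))
print(state.delayed_fsync)
```

In this model the buffer can sit unwritten for up to 2s while a slow fsync is still running, which is where the "at most 2s" figure comes from, and each forced write increments the counter seen in monitoring.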

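no-appendfsync-on-rewrite yes / appendfsync everysec: while a rewrite is running, this is equivalent to appendfsync no
In that case the data in the buffer only reaches disk when Linux itself flushes it (sync), which by default happens at an interval of 30s, so at most 30s of data can be lost.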