Redis keeps crashing and restarting automatically

white_li5

Last modified 2025-01-03 17:47:15


Tags: redis, database, linux, debian

First published 2023-06-27 11:57:00

Article link: https://blog.csdn.net/white_li5/article/details/131414247



Problem: after a crash and reboot, the application deployment logs report that Redis cannot be reached

1. Problem description

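I first checked the Redis service status, which looked normal. I then tried connecting locally and monitored the status with watch (which refreshes every 2 seconds by default), and found that Redis was restarting automatically roughly every 20 seconds.

(1) First check of the Redis status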
root@white:~# systemctl status redis
● redis.service - Redis In-Memory Data Store
   Loaded: loaded (/etc/systemd/system/redis.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2023-03-09 19:05:04 CST; 22s
 Main PID: 30028 (redis-server)
    Tasks: 6
   Memory: 980.6M
   CGroup: /system.slice/redis.service
           └─30028 /usr/local/bin/redis-server 0.0.0.0:6379

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

(2) Connecting locally: the connection dropped roughly every 20 seconds
root@white:~# redis-cli -h 127.0.0.1

(3) Watching the status with watch: Redis restarted roughly every 20 seconds
root@white:~# watch systemctl status redis
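Watching the status by eye works, but the restarts can also be counted. The following is a minimal sketch, not part of the original troubleshooting: it samples the service's main PID through `systemctl show` and counts how often the PID changes between samples. The function names `sample_main_pid` and `count_pid_changes` are hypothetical helpers introduced for illustration; the unit name `redis` is taken from the status output above.

```shell
# Print the main PID of a systemd unit repeatedly.
# $1: unit name, $2: number of samples, $3: seconds between samples
sample_main_pid() {
  i=0
  while [ "$i" -lt "$2" ]; do
    systemctl show -p MainPID --value "$1"
    sleep "$3"
    i=$((i + 1))
  done
}

# Read one PID per line on stdin and print how many times the value changed.
# A PID of 0 (service momentarily down) also counts as a change, so treat
# the result as a rough indicator rather than an exact restart count.
count_pid_changes() {
  awk 'NR > 1 && $0 != prev { changes++ } { prev = $0 } END { print changes + 0 }'
}

# Example (about one minute of observation):
#   sample_main_pid redis 30 2 | count_pid_changes
```

A result of 0 over a minute or more suggests the service is stable; anything higher matches the restart loop described here.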

The Redis log showed two abnormal entries:

root@white:~# tail -f /home/redis/var/log/redis.log
......
19 Jun 2023 15:41:23.222 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
......
19 Jun 2023 15:49:39.577 # Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix <filename>
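Because `tail -f` only shows the end of the log, it can help to pull out just the warning-level lines. The sketch below relies on the Redis log convention that a single character after the timestamp marks the level (`.` debug, `-` verbose, `*` notice, `#` warning). The function name `redis_log_warnings` is a hypothetical helper, and the log path in the example is the one used above.

```shell
# Print only the warning-level lines ('#' marker) from a Redis log file.
# $1: path to the log file
redis_log_warnings() {
  grep -F ' # ' "$1"
}

# Example: show the 20 most recent warnings
#   redis_log_warnings /home/redis/var/log/redis.log | tail -n 20
```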

2. Problem analysis

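As for why Redis keeps restarting: I checked the systemd unit file that manages the Redis service. It configures a restart policy, so every time Redis exits abnormally, systemd starts it again. The repeated restarts therefore mean that Redis itself keeps exiting abnormally.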

root@white:~# cat /etc/systemd/system/redis.service
[Unit]
Description=Redis In-Memory Data Store
After=network.target

[Service]
User=redis
Group=redis
ExecStart=/usr/local/bin/redis-server /etc/redis/redis.conf
ExecStop=/usr/local/bin/redis-cli shutdown
# Restart policy (annotation): with Restart=always, systemd restarts the service whenever the process exits, whether the exit was clean or abnormal.
Restart=always

[Install]
WantedBy=multi-user.target
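To confirm the restart policy without reading the whole unit file, the value can be extracted directly. This is a small sketch under the assumption that the unit file is a plain text file with one `Restart=` assignment; `unit_restart_policy` is a hypothetical helper name.

```shell
# Print the Restart= value from a systemd unit file.
# If the key appears more than once, the last assignment is printed.
# $1: path to the unit file
unit_restart_policy() {
  sed -n 's/^[[:space:]]*Restart=[[:space:]]*\([A-Za-z-]*\).*/\1/p' "$1" | tail -n 1
}

# Example:
#   unit_restart_policy /etc/systemd/system/redis.service
# On reasonably recent systemd versions, the number of automatic restarts
# so far can also be queried with:
#   systemctl show -p NRestarts --value redis
```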

Analysis of the abnormal log entries

Abnormal log entry 1:
19 Jun 2023 15:41:23.222 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

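Analysis:
   This entry warns that the system's memory configuration may cause Redis background saves to fail. The vm.overcommit_memory parameter controls how the Linux kernel handles memory overcommit, and the log reports that it is currently set to 0. With the value 0 the kernel applies a heuristic and may refuse an allocation it considers excessive. Redis forks a child process for background saves, and when memory is tight the kernel can refuse that fork, so the save fails. This is a warning rather than a fatal error, so on its own it is unlikely to be the direct cause of the restart loop, but it should still be fixed.
   To resolve this, the log recommends setting vm.overcommit_memory to 1. With the value 1 the kernel always grants allocation requests without applying the heuristic, so the fork needed for a background save is not refused for this reason.
   The log also gives two ways to change the parameter: add vm.overcommit_memory = 1 to /etc/sysctl.conf and reboot, or run sysctl vm.overcommit_memory=1 to apply the change immediately.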
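The three values the kernel accepts for this parameter are easy to mix up, so here is a small lookup sketch. `describe_overcommit` is a hypothetical helper name; the short labels it prints are my own shorthand for the kernel's documented modes.

```shell
# Map a vm.overcommit_memory value to a short label.
#   0 -> heuristic: the kernel may refuse allocations it judges excessive
#   1 -> always:    the kernel never refuses an allocation on overcommit grounds
#   2 -> strict:    allocations are limited to swap plus a ratio of physical RAM
describe_overcommit() {
  case "$1" in
    0) echo "heuristic" ;;
    1) echo "always" ;;
    2) echo "strict" ;;
    *) echo "unknown" ;;
  esac
}

# Example: describe the current setting of this machine
#   describe_overcommit "$(cat /proc/sys/vm/overcommit_memory)"
```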

Abnormal log entry 2:
19 Jun 2023 15:49:39.577 # Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix <filename>

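Analysis:
   This entry means that Redis hit a format error while reading the AOF (Append-Only File) and could not parse its contents. The most likely cause is that the crash and reboot left the AOF file corrupted (for example, with a partially written final command); the file could also have been modified by hand.
   The log advises backing up the AOF file first and then checking and repairing it with the redis-check-aof tool:
   redis-check-aof --fix <filename>
   Here <filename> is the AOF file to repair. The command checks the file's format and tries to fix the errors it finds; if the repair succeeds, Redis can load the AOF again and restore the database state.
   Note that repairing the AOF can lose data or leave it inconsistent (the tool typically truncates the file at the first invalid entry), so always back up the AOF before repairing it, and verify the data afterwards.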

3. Fixing the problem

Set vm.overcommit_memory to 1

root@white:~# sysctl vm.overcommit_memory=1
root@white:~# sysctl -p

Following the hint in the log, back up the AOF file and then repair it

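(1) Locate the repair tool
root@white:~# find / -name "redis-check-aof"
(2) Locate the AOF file
root@white:~# find / -name "*.aof"
(3) Back up the original AOF file
root@white:~# cp -a appendonly.aof appendonly.aof.bak.$(date +"%Y%m%d")
(4) Repair the original file (without --fix the tool only checks the file and changes nothing)
root@white:~# redis-check-aof --fix appendonly.aof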
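The backup step is the one that protects against a bad repair, so it is worth making it hard to skip or overwrite. Below is a minimal sketch, assuming a single-file AOF as in this article (Redis 7 and later keep the AOF as several files in a directory, which this sketch does not handle). `backup_aof` is a hypothetical helper name, and `/path/to/appendonly.aof` is a placeholder for the path found in step (2).

```shell
# Copy an AOF file to a timestamped backup next to it and print the backup path.
# Fails if the source is missing or a backup with the same name already exists.
# $1: path to the AOF file
backup_aof() {
  src=$1
  [ -f "$src" ] || { echo "not a regular file: $src" >&2; return 1; }
  dest="$src.bak.$(date +%Y%m%d%H%M%S)"
  [ ! -e "$dest" ] || { echo "backup already exists: $dest" >&2; return 1; }
  cp -a "$src" "$dest" && echo "$dest"
}

# Example: repair only if the backup succeeded
#   backup_aof /path/to/appendonly.aof && redis-check-aof --fix /path/to/appendonly.aof
```

Chaining with `&&` means the repair never runs against a file that has not been backed up.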

Restart Redis
```
root@white:~# systemctl restart redis
```
Confirm that Redis is no longer restarting repeatedly; the application team also reported that they could connect normally.

4. Follow-up hardening

Make vm.overcommit_memory=1 persistent

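(1) Append the setting to /etc/sysctl.conf
root@white:~# echo "vm.overcommit_memory=1"  >> /etc/sysctl.conf
(2) Reload sysctl.conf so the setting takes effect now; because it is stored in the file, it is also applied at every boot
root@white:~# sysctl -p
(3) Check that the setting was applied; an output of 1 means it was.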
root@white:~# cat /proc/sys/vm/overcommit_memory
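One caveat with `>>` is that rerunning the command appends a duplicate line each time. As a hedged alternative, the sketch below sets the key idempotently: it rewrites an existing assignment or appends one if none exists. `ensure_sysctl_setting` is a hypothetical helper name; it assumes simple keys and values (no `|` or `&` characters) and needs write permission on the target file.

```shell
# Set "key = value" in a sysctl-style config file without creating duplicates.
# $1: config file, $2: key, $3: value
ensure_sysctl_setting() {
  conf=$1
  key=$2
  value=$3
  # Escape dots so the key is matched literally in the regular expressions.
  pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
  if grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$conf"; then
    # Key already present: rewrite the existing line(s) via a temporary file,
    # then copy the content back so the original file's ownership and mode are kept.
    tmp=$(mktemp) || return 1
    sed -E "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key} = ${value}|" "$conf" > "$tmp" \
      && cat "$tmp" > "$conf"
    rm -f "$tmp"
  else
    # Key absent: append a new assignment.
    printf '%s = %s\n' "$key" "$value" >> "$conf"
  fi
}

# Example (as root), followed by a reload:
#   ensure_sysctl_setting /etc/sysctl.conf vm.overcommit_memory 1 && sysctl -p
```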