A Beginner's Take on Redis Persistence
1. RDB (Redis Database)
1.1 What it is
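RDB writes a point-in-time snapshot of the in-memory dataset to disk at configured intervals. On recovery, the snapshot file is read straight back into memory.
To persist the data, Redis forks a child process. The child first writes the data to a temporary file and, once the dump is complete, replaces the previously persisted file with it.
The main process performs no disk I/O for the snapshot during this time, which keeps the performance cost very low.
If you need to restore a large dataset and can tolerate losing the most recent changes, RDB is more efficient than AOF.
The drawback of RDB is that data written after the last snapshot may be lost.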
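The "write a temporary file, then replace the old one" step can be sketched in Python as follows. This only illustrates the pattern and is not how Redis itself is implemented; the function name `write_snapshot` and the JSON encoding are assumptions made for the example.

```python
import json
import os
import tempfile


def write_snapshot(data: dict, path: str) -> None:
    """Write `data` to `path` through a temp file, then swap it in atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temp file in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before the swap
        os.replace(tmp_path, path)  # readers see either the old file or the new one
    except BaseException:
        os.unlink(tmp_path)  # do not leave a half-written temp file behind
        raise
```

Because the old file is replaced only after the new one is complete, a crash in the middle of a dump leaves the previous snapshot intact.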
1.2 fork
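fork creates a copy of the current process. The new process starts with the same data as the original (variables, environment variables, program counter, and so on), but it is a separate process that runs as a child of the original.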
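A small Python sketch can make this concrete on a Unix-like system (os.fork is not available on Windows). The function name `child_view_after_fork` and the two pipes used for coordination are assumptions of the example. The parent changes its data after the fork, and the child still reports the value it had at fork time, which is the property a snapshot relies on.

```python
import os
from typing import Tuple


def child_view_after_fork() -> Tuple[str, str]:
    """Return (value seen by the child, value seen by the parent)."""
    data = {"key": "before"}
    go_r, go_w = os.pipe()    # parent -> child: "I have changed my copy"
    out_r, out_w = os.pipe()  # child -> parent: the value the child sees
    pid = os.fork()
    if pid == 0:
        # Child process: works on its own copy of `data` as of the fork.
        try:
            os.close(go_w)
            os.close(out_r)
            os.read(go_r, 1)  # wait until the parent has modified its copy
            os.write(out_w, data["key"].encode())
        finally:
            os._exit(0)  # exit immediately, skipping the parent's cleanup code
    # Parent process: modify the data after the fork, then let the child continue.
    os.close(go_r)
    os.close(out_w)
    data["key"] = "after"
    os.write(go_w, b"x")
    child_value = os.read(out_r, 64).decode()
    os.waitpid(pid, 0)
    os.close(go_w)
    os.close(out_r)
    return child_value, data["key"]
```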
1.3 RDB saves to dump.rdb
1.4 Where it is configured
################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
# An RDB save is triggered when the given number of write operations occurs within the given time
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# at least one key changed within 15 minutes
# after 300 sec (5 min) if at least 10 keys changed
# at least 10 keys changed within 5 minutes
# after 60 sec if at least 10000 keys changed
# at least 10000 keys changed within 60 seconds
# Note: you can disable saving completely by commenting out all "save" lines.
# Saving can also be disabled by uncommenting the save "" line
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""
save 900 1
save 300 10
save 60 10000
# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
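# By default, Redis stops accepting writes if the last background save failed.
# This is a deliberately blunt way of telling the user that data is not being persisted to disk correctly;
# otherwise the failure could go unnoticed until disaster strikes.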
stop-writes-on-bgsave-error yes
# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
# Snapshots stored on disk can optionally be compressed. If enabled, Redis compresses them with LZF. To avoid spending CPU on compression, turn this option off.
rdbcompression yes
# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
# Whether to verify the checksum of the RDB file
rdbchecksum yes
# The filename where to dump the DB
# The RDB file name
dbfilename dump.rdb
# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
# The directory where the RDB file is stored; defaults to the current directory and can be read with config get dir
dir ./
1.5 How RDB snapshots are triggered
1.5.1 The default snapshot settings in the configuration file
save 900 1
save 300 10
save 60 10000
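With these settings, a snapshot is triggered when any of the following holds:
- at least 1 key changed within 15 minutes (900 seconds)
- at least 10 keys changed within 5 minutes (300 seconds)
- at least 10000 keys changed within 60 seconds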
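The save-point rule can be modeled with a few lines of Python. This is a simplified sketch of the rule as described here, not Redis source code; `should_snapshot` and its parameter names are made up for the illustration.

```python
from typing import Iterable, Tuple

# (seconds, changes) pairs matching the default "save" lines above.
DEFAULT_SAVE_POINTS = ((900, 1), (300, 10), (60, 10000))


def should_snapshot(seconds_since_last_save: float,
                    changes_since_last_save: int,
                    save_points: Iterable[Tuple[int, int]] = DEFAULT_SAVE_POINTS) -> bool:
    """Return True if any save point has both conditions satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

Note that 9999 changes after 61 seconds do not trigger a snapshot under these defaults: the 60-second rule needs 10000 changes, and the 300-second rule has not yet had enough time.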
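1.5.2 The save and bgsave commands
save does nothing but save: it runs in the main process and blocks everything else until it finishes.
bgsave takes the snapshot asynchronously in the background, so Redis can keep serving client requests while it runs. The lastsave command returns the time of the last successful snapshot.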
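1.5.3 Running flushall also produces a dump.rdb file, but it is empty and therefore useless
1.6 How to restore
Copy the backup file into the Redis working directory (the value returned by config get dir) and start the server.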
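1.7 Advantages
Well suited to restoring large datasets
A good fit when requirements on data completeness and consistency are not strict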
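1.8 Disadvantages
Backups are taken at intervals, so if Redis goes down unexpectedly, every change made since the last snapshot is lost.
fork logically duplicates the in-memory data for the child. Memory pages are shared copy-on-write, but under heavy writes during a snapshot memory usage can grow toward roughly 2x, so plan for that headroom.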
1.9 How to stop
To remove all RDB save rules at runtime: redis-cli config set save ""
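1.10 Summary
Pros
•An RDB file is very compact
•When saving an RDB file, the only thing the parent process has to do is fork a child; the child does all the remaining work and the parent performs no further disk I/O, so RDB persistence maximizes Redis performance
•Compared with AOF, RDB restores large datasets faster
Cons
•Higher risk of data loss
•RDB has to fork a child regularly to save the dataset to disk; when the dataset is large, fork can be time-consuming and Redis may stop serving clients for some milliseconds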
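2. AOF (Append Only File)
2.1 What it is
AOF records every write command Redis executes in a log (read commands are not recorded). The file is only appended to, never modified in place. On startup Redis reads the file to rebuild the dataset; in other words, after a restart it replays the logged write commands from first to last to restore the data.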
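The replay idea can be sketched in Python. The one-command-per-line log format and the helper names `apply_command` and `replay` are assumptions made to keep the example short; the real AOF stores commands in the Redis protocol format and supports far more than SET and DEL.

```python
from typing import Dict, List


def apply_command(store: Dict[str, str], command: List[str]) -> None:
    """Apply a single write command to the in-memory store."""
    name = command[0].upper()
    if name == "SET":
        store[command[1]] = command[2]
    elif name == "DEL":
        for key in command[1:]:
            store.pop(key, None)
    else:
        raise ValueError(f"unsupported command: {name}")


def replay(log_lines: List[str]) -> Dict[str, str]:
    """Rebuild the dataset by executing the logged writes from first to last."""
    store: Dict[str, str] = {}
    for line in log_lines:
        apply_command(store, line.split())
    return store
```

Replaying "SET a 1", "SET b 2", "SET a 3", "DEL b" in order leaves only the key a with the value 3, which is the state the server had when the last command was logged.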
2.2 AOF saves to the appendonly.aof file
2.3 Where it is configured
############################## APPEND ONLY MODE ###############################
# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.
# Whether to enable AOF; it is off by default
appendonly no
# The name of the append only file (default: "appendonly.aof")
# The AOF file name
appendfilename "appendonly.aof"
# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".
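# always: synchronous persistence; every data change is written to disk immediately. Slower, but better data integrity.
# appendfsync always
# everysec: the factory default; data is synced asynchronously once per second, so a crash within that second loses data.
appendfsync everysec
# appendfsync no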
# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
# Whether to skip fsync while a rewrite is in progress; keep the default no for data safety
no-appendfsync-on-rewrite no
# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes
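2.4 Enabling, repairing, and restoring AOF
2.4.1 Normal recovery
- Enable: change the default appendonly no to appendonly yes
- Copy an AOF file that contains data into the working directory (config get dir)
- Restore: restart Redis so that it loads the file
2.4.2 Recovery from a corrupted file
- Enable: change the default appendonly no to appendonly yes
- Back up the corrupted AOF file
- Repair: run redis-check-aof --fix <AOF file>
- Restore: restart Redis so that it loads the repaired file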
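2.5 Rewrite (important)
2.5.1 What it is
AOF works by appending, so the file keeps growing. To keep this in check, Redis has a rewrite mechanism: when the AOF file exceeds the configured threshold, Redis compacts its contents, keeping only the minimal set of commands needed to restore the data. A rewrite can also be started manually with the bgrewriteaof command.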
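2.5.2 How the rewrite works
When the AOF file has grown too large, Redis forks a new process to rewrite it (as with RDB, it writes a temporary file first and renames it at the end). The new process walks the data in its own memory and emits a command such as set for each record. The rewrite never reads the old AOF file; it writes the entire in-memory database out as commands into a new AOF file, which is somewhat similar to how RDB works.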
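As a rough sketch of this idea, the Python function below builds a new log purely from the current in-memory state, one command per key, without looking at any old log. The name `rewrite_from_memory` and the plain-text "SET key value" line format are assumptions of the example, and only string values are handled.

```python
from typing import Dict, List


def rewrite_from_memory(store: Dict[str, str]) -> List[str]:
    """Emit one SET command per key, reflecting only the current state."""
    return [f"SET {key} {value}" for key, value in store.items()]
```

For example, a store holding a = 3 and c = 9 yields exactly two commands, no matter how many writes and deletes produced that state.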
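2.5.3 When it is triggered
Redis remembers the size of the AOF file after the last rewrite. With the default configuration, a rewrite is triggered when the file has doubled in size since the last rewrite (grown by 100%) and is larger than 64 MB.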
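The trigger condition can be written down as a small Python check. It is a sketch based on the description of auto-aof-rewrite-percentage and auto-aof-rewrite-min-size in the configuration comments; `should_rewrite` and its parameter names are invented for the example, and sizes are in bytes.

```python
MB = 1024 * 1024


def should_rewrite(current_size: int, base_size: int,
                   percentage: int = 100, min_size: int = 64 * MB) -> bool:
    """Decide whether an automatic rewrite should start."""
    if percentage == 0:
        return False  # a percentage of zero disables automatic rewrites
    if current_size <= min_size:
        return False  # the file is still too small to be worth rewriting
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth since the last rewrite, in percent
    return growth >= percentage
```

With the defaults, a file that was 64 MB after the last rewrite triggers a new rewrite once it reaches 128 MB, while a file that grew from 10 MB to 60 MB does not, because it is still below the minimum size.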
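2.6 Advantages
Sync on every change: appendfsync always. Synchronous persistence; every data change is written to disk immediately. Slower, but better data integrity.
Sync every second: appendfsync everysec. Asynchronous; data is synced once per second, so a crash within that second loses data.
No sync: appendfsync no. Redis never calls fsync itself and leaves flushing to the operating system.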
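The difference between the three policies comes down to when fsync is called after a write. The Python class below is a minimal sketch of that decision; `AofWriter` is a hypothetical name, and it calls fsync inline for simplicity, which is an assumption of the example rather than a description of how Redis schedules its own syncs.

```python
import os
import time


class AofWriter:
    """Append commands to a file and fsync according to a policy."""

    def __init__(self, path: str, policy: str = "everysec") -> None:
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0

    def append(self, command: bytes) -> None:
        self.file.write(command + b"\n")
        self.file.flush()  # hand the bytes to the operating system
        now = time.monotonic()
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        )
        if due:
            os.fsync(self.file.fileno())  # ask the OS to put the data on disk
            self.last_fsync = now
            self.fsync_calls += 1

    def close(self) -> None:
        self.file.close()
```

With the always policy every append costs one fsync; with no, the writer never calls fsync and the data reaches the disk whenever the operating system decides to flush it.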
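2.7 Disadvantages
For the same dataset, the AOF file is much larger than the RDB file, and recovery is slower than with RDB.
AOF runs slower than RDB. The every-second sync policy performs well, and with sync disabled performance is the same as RDB.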
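2.8 Summary
Pros:
The AOF file is an append-only log
Redis can automatically rewrite the AOF in the background when it grows too large
The AOF file keeps, in order, every write operation executed against the database, stored in the Redis protocol format, so its contents are easy to read and to analyze.
Cons:
For the same dataset, the AOF file is usually larger than the RDB file
Depending on the fsync policy in use, AOF may be slower than RDB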
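To show why the Redis protocol format is readable, here is a small Python sketch that encodes a command as an array of bulk strings, the form in which commands are sent to the server. The helper name `encode_command` is an assumption of the example.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a Redis protocol array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]  # "*<number of arguments>"
    for arg in args:
        data = arg.encode("utf-8")
        # "$<byte length>" followed by the argument itself
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)
```

Encoding SET key value gives `*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n`: three arguments, each preceded by its length, all in plain text.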
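3. Which one to use
RDB persistence takes snapshots of your data at specified intervals.
AOF persistence logs every write operation the server receives and re-executes those commands on restart to restore the original data. Commands are appended to the end of the file in the Redis protocol format, and Redis can rewrite the AOF in the background so that it does not grow too large.
Cache only: if you only need the data to exist while the server is running, you can run without any persistence.
Enabling both persistence methods:
In this case, Redis loads the AOF file first on restart to restore the data, because the AOF usually holds a more complete dataset than the RDB file.
RDB data is not real-time, and with both enabled the server looks for the AOF file on restart anyway. So should you use AOF alone?
That is not recommended, because RDB is better suited to backing up the database (the AOF changes constantly, which makes it awkward to back up).
RDB also restarts faster and avoids potential bugs in AOF, so keep it as a fallback.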