0. Reading
"Redis RDB File Format" by Sripathi Krishnan
https://redis.io/topics/persistence
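1. A first look at Redis persistence
1.1 Persistence
As I currently understand it, persistence means copying data from a volatile medium to a less volatile one.
Redis is an in-memory database: without persistence, everything held in memory is lost when the machine reboots or suddenly loses power.
That makes persistence especially important for Redis. For Redis, persistence means writing the in-memory data to disk.
1.2 Persistence methods in Redis
RDB (Redis Database)
AOF (Append Only File)
1.3 Observing persistence from the client with the INFO command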
127.0.0.1:6379> info persistence
# Persistence
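loading:0 /* whether an RDB file is currently being loaded */
rdb_changes_since_last_save:0 /* number of changes to the dataset since the last save */
rdb_bgsave_in_progress:0 /* whether a background RDB save is in progress */
rdb_last_save_time:1619187020 /* Unix timestamp of the last successful RDB save */
rdb_last_bgsave_status:ok /* status of the last RDB save */
rdb_last_bgsave_time_sec:-1 /* duration of the last RDB save, in seconds */
rdb_current_bgsave_time_sec:-1 /* if an RDB save is in progress, the time it has taken so far;
otherwise -1 */
rdb_last_cow_size:0 /* copy-on-write memory, in bytes, used during the last RDB save */
aof_enabled:0 /* whether AOF is enabled */
aof_rewrite_in_progress:0 /* whether a background AOF rewrite is in progress */
aof_rewrite_scheduled:0 /* whether an AOF rewrite is waiting to be scheduled: set to 1 when a
rewrite is requested while a background RDB save is running */
aof_last_rewrite_time_sec:-1 /* duration of the last AOF rewrite, in seconds */
aof_current_rewrite_time_sec:-1 /* if an AOF rewrite is in progress, the time it has taken so far;
otherwise -1 */
aof_last_bgrewrite_status:ok /* status of the last AOF rewrite */
aof_last_write_status:ok /* status of the last write of the AOF buffer (executed commands are
first appended to an in-memory buffer, which is then written to the
AOF file; this field records the status of the most recent write) */
aof_last_cow_size:0 /* copy-on-write memory, in bytes, used during the last AOF rewrite */
module_fork_in_progress:0 /* to be studied. Flag indicating a module fork is on-going */
module_fork_last_cow_size:0 /* to be studied. The size in bytes of copy-on-write memory
during the last module fork operation */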
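A monitoring tool could read individual fields out of this reply. The following is a minimal sketch, assuming the INFO text is already in memory as "name:value" lines separated by \r\n; info_get_long is a hypothetical helper written for these notes, not a Redis or client-library function.

```c
#include <stdlib.h>
#include <string.h>

/* Look up `field` in INFO-style text (one "name:value" per line) and parse
 * its value as a long. Returns 0 on success, -1 if the field is missing or
 * its value does not start with an integer. Hypothetical helper. */
int info_get_long(const char *info, const char *field, long *out) {
    size_t flen = strlen(field);
    const char *line = info;
    while (line && *line) {
        /* Match the whole field name, followed immediately by ':'. */
        if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
            const char *start = line + flen + 1;
            char *end;
            long v = strtol(start, &end, 10);
            if (end == start) return -1; /* value is not numeric, e.g. "ok" */
            *out = v;
            return 0;
        }
        const char *nl = strchr(line, '\n');
        line = nl ? nl + 1 : NULL;
    }
    return -1;
}
```

Status fields such as rdb_last_bgsave_status hold text ("ok"), so this helper reports -1 for them; they would need a string comparison instead.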
2. RDB
2.1 When a save is triggered
2.1.1 Manual trigger
SAVE
BGSAVE
2.1.2 Save frequency set in the configuration file
################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""
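save 900 1 // at least 1 change to the dataset within 900 seconds
save 300 10 // at least 10 changes to the dataset within 300 seconds
save 60 10000 // at least 10000 changes to the dataset within 60 seconds
When any one of these three conditions is met, Redis automatically triggers a BGSAVE, not a blocking SAVE: the serverCron code in 2.1.3 calls rdbSaveBackground.
The save points can also be set at runtime with CONFIG SET (https://redis.io/commands/config-set):
CONFIG SET SAVE "900 1 300 10 60 10000"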
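Other related settings:
save m n
# Snapshot (RDB) trigger rule, format: save <seconds> <changes>
# save 900 1 snapshot if at least 1 key changed within 900 seconds
# save 300 10 snapshot if at least 10 keys changed within 300 seconds
# save 60 10000 snapshot if at least 10000 keys changed within 60 seconds
# To disable these rules, use save ""
dbfilename dump.rdb
# File name of the RDB file; the default is dump.rdb
stop-writes-on-bgsave-error yes
# yes: if the latest background save failed, Redis stops accepting writes; no: writes are still accepted despite the error.
rdbchecksum yes
# Whether a checksum is written to the RDB file and verified on load to detect corruption. If corruption is found at startup, the server refuses to start.
dir "/etc/redis"
# Directory for data files; both the RDB snapshot and the AOF file are stored here, so make sure it is writable
rdbcompression yes
# Whether to compress the RDB file; this saves disk space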
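The save rules boil down to a simple test on two counters. Here is a minimal sketch of that test, modeled on the condition in the serverCron code of 2.1.3 but leaving out the retry delay after a failed BGSAVE; struct save_point and first_due_save_point are hypothetical names, not Redis identifiers.

```c
#include <stddef.h>
#include <time.h>

/* One "save <seconds> <changes>" rule. Hypothetical type for illustration. */
struct save_point {
    time_t seconds;
    int changes;
};

/* Returns the index of the first rule that is satisfied, or -1 if none is.
 * `dirty` is the number of changes since the last save; `now` and
 * `lastsave` are Unix timestamps. */
int first_due_save_point(const struct save_point *sp, size_t n,
                         long long dirty, time_t now, time_t lastsave) {
    for (size_t i = 0; i < n; i++) {
        /* Both parts must hold: enough changes AND enough time elapsed. */
        if (dirty >= sp[i].changes && now - lastsave > sp[i].seconds)
            return (int)i;
    }
    return -1;
}
```

Note that the elapsed-time comparison is strict (>), as in the Redis condition, so a rule with 60 seconds fires no earlier than second 61.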
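2.1.3 Setting the trigger configuration manually
CONFIG SET SAVE "10 1 5 2" // takes effect immediately
The save points are checked periodically in serverCron: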
/* Check if a background saving or AOF rewrite in progress terminated. */
if (hasActiveChildProcess() || ldbPendingChildren())
{
checkChildrenDone();
} else {
/* If there is not a background saving/rewrite in progress check if
* we have to save/rewrite now. */
for (j = 0; j < server.saveparamslen; j++) {
struct saveparam *sp = server.saveparams+j;
/* Save if we reached the given amount of changes,
* the given amount of seconds, and if the latest bgsave was
* successful or if, in case of an error, at least
* CONFIG_BGSAVE_RETRY_DELAY seconds already elapsed. */
if (server.dirty >= sp->changes &&
server.unixtime-server.lastsave > sp->seconds &&
(server.unixtime-server.lastbgsave_try >
CONFIG_BGSAVE_RETRY_DELAY ||
server.lastbgsave_status == C_OK))
{
serverLog(LL_NOTICE,"%d changes in %d seconds. Saving...",
sp->changes, (int)sp->seconds);
rdbSaveInfo rsi, *rsiptr;
rsiptr = rdbPopulateSaveInfo(&rsi);
rdbSaveBackground(server.rdb_filename,rsiptr);
break;
}
}
/* Trigger an AOF rewrite if needed. */
if (server.aof_state == AOF_ON &&
!hasActiveChildProcess() &&
server.aof_rewrite_perc &&
server.aof_current_size > server.aof_rewrite_min_size)
{
long long base = server.aof_rewrite_base_size ?
server.aof_rewrite_base_size : 1;
long long growth = (server.aof_current_size*100/base) - 100;
if (growth >= server.aof_rewrite_perc) {
serverLog(LL_NOTICE,"Starting automatic rewriting of AOF on %lld%% growth",growth);
rewriteAppendOnlyFileBackground();
}
}
}
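The second half of this snippet decides whether to rewrite the AOF based on percentage growth. The arithmetic can be isolated as in the sketch below; the function names are hypothetical, and the checks on aof_state and on an active child process are deliberately left out.

```c
/* Percentage growth of the AOF relative to its size after the last rewrite.
 * A base size of 0 is treated as 1 to avoid dividing by zero. */
long long aof_growth_percent(long long current_size, long long base_size) {
    long long base = base_size ? base_size : 1;
    return (current_size * 100 / base) - 100;
}

/* Nonzero if the size-based conditions for an automatic rewrite hold.
 * `rewrite_perc` and `min_size` play the role of server.aof_rewrite_perc
 * and server.aof_rewrite_min_size. */
int aof_rewrite_due(long long current_size, long long base_size,
                    int rewrite_perc, long long min_size) {
    if (!rewrite_perc || current_size <= min_size) return 0;
    return aof_growth_percent(current_size, base_size) >= rewrite_perc;
}
```

For example, with a base size of 100 bytes and a current size of 200 bytes the growth is 100%, which meets a threshold of 100 as long as the file is also larger than the minimum size.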
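2.1.4 Triggered by master-replica replication
When a replica performs a full resynchronization, the master runs a background save (BGSAVE) to produce a snapshot and sends the resulting RDB file to the replica.
2.1.5 Triggered by FLUSHALL
FLUSHALL empties every database, so use it with care. Once the data has been cleared, the snapshot file should be cleared too: in Redis 6.0, if save points are configured, flushallCommand calls rdbSave synchronously (not BGSAVE), which rewrites the snapshot from the now-empty dataset.
2.1.6 Triggered by SHUTDOWN
When a client runs SHUTDOWN, Redis saves a snapshot before exiting, for safety, so that the data can be restored on the next start.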
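2.2 Notes
1. Related commands
config get dbfilename // show the snapshot file name; the default is described in redis.conf
config get dir // show the snapshot directory; it can be changed with config set dir 'path'
info Persistence // show persistence information
2. Related paths and files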
/home/muten/module/redis-6.0.8/redis-6.0.8/src/redis-server
/home/muten/module/redis-6.0.8/redis-6.0.8/redis.conf
/home/muten/module/redis-6.0.8/redis-6.0.8/src/dump.rdb
Code change:
Comment out line 1394 of /home/muten/module/redis-6.0.8/redis-6.0.8/src/rdb.c
in order to observe the child process [redis-rdb-bgsave] that is created when BGSAVE saves.
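2.3 Analyzing the RDB file
2.3.1 Related knowledge
[Learning xxd]
[Learning od]
[ASCII table]
['\0' has ASCII code 0 and no visible glyph; when printed as an integer, the character '\0' is converted to the integer 0]
2.3.2 Analysis method and background
Background:
An RDB file is a binary file.
Analysis commands: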
xxd /home/muten/module/redis-6.0.8/redis-6.0.8/src/dump.rdb
xxd -b /home/muten/module/redis-6.0.8/redis-6.0.8/src/dump.rdb
od -c /home/muten/module/redis-6.0.8/redis-6.0.8/src/dump.rdb
od -cx /home/muten/module/redis-6.0.8/redis-6.0.8/src/dump.rdb
hexdump -c /home/muten/module/redis-6.0.8/redis-6.0.8/src/dump.rdb
hexdump -cx /home/muten/module/redis-6.0.8/redis-6.0.8/src/dump.rdb
Binary files can be inspected with xxd, od, or hexdump. The output of hexdump -c is much like that of od -c and reads reasonably well.
Octal - Decimal
300 - 192
372 - 250
302 - 194
376 - 254
321 - 209
373 - 251
360 - 240
377 - 255
#define RDB_OPCODE_MODULE_AUX 247 /* Module auxiliary data. */
#define RDB_OPCODE_IDLE 248 /* LRU idle time. */
#define RDB_OPCODE_FREQ 249 /* LFU frequency. */
#define RDB_OPCODE_AUX 250 /* RDB aux field. */
#define RDB_OPCODE_RESIZEDB 251 /* Hash table resize hint. */
#define RDB_OPCODE_EXPIRETIME_MS 252 /* Expire time in milliseconds. */
#define RDB_OPCODE_EXPIRETIME 253 /* Old expire time in seconds. */
#define RDB_OPCODE_SELECTDB 254 /* DB number of the following keys. */
#define RDB_OPCODE_EOF 255 /* End of the RDB file. */
Octal - Decimal
367 - 247
370 - 248
371 - 249
372 - 250
373 - 251
374 - 252
375 - 253
376 - 254
377 - 255
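Because od -c prints non-printable bytes as three octal digits, it helps to convert between the two notations when reading a dump. A small sketch using only the C standard library (the function names are hypothetical):

```c
#include <stdio.h>
#include <stdlib.h>

/* Format a byte as three octal digits, the way od -c shows a
 * non-printable byte. `out` must have room for 4 chars. */
void byte_to_octal(unsigned char b, char out[4]) {
    snprintf(out, 4, "%03o", (unsigned)b);
}

/* Parse a string of octal digits (such as "372") into its numeric value. */
int octal_to_byte(const char *s) {
    return (int)strtol(s, NULL, 8);
}
```

With these, byte_to_octal(250, buf) fills buf with the digits seen in the dump for RDB_OPCODE_AUX, and octal_to_byte("376") gives the decimal value of RDB_OPCODE_SELECTDB.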
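2.3.3 RDB file structure diagram
Escape character ASCII
\t 9
\n 10
\b 8
\0 0
\f 12
In the od -c output these characters appear in front of a string and give its length.
Character ASCII
@ 64
Octal Binary
300 11000000 marks the data type as INT8, with range [-128, 127]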
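The byte in front of a string is a length prefix whose top two bits select the encoding, which is why 300 (binary 11000000) means "INT8 follows". A minimal sketch of that classification, based on the encoding constants in rdb.h (6-bit, 14-bit, 32/64-bit, and specially encoded values); the enum and function names here are hypothetical:

```c
#include <stdint.h>

/* Kind of length prefix, selected by the top two bits of the first byte. */
enum rdb_len_kind {
    LEN_6BIT = 0,     /* 00xxxxxx: the length is in the low 6 bits */
    LEN_14BIT = 1,    /* 01xxxxxx: low 6 bits plus the next byte */
    LEN_32_OR_64 = 2, /* 10......: the length follows in 4 or 8 bytes */
    LEN_ENCODED = 3   /* 11xxxxxx: a specially encoded value follows */
};

enum rdb_len_kind rdb_len_kind_of(uint8_t first) {
    return (enum rdb_len_kind)(first >> 6);
}

/* For LEN_6BIT: the string length itself. */
unsigned rdb_6bit_len(uint8_t first) {
    return first & 0x3F;
}

/* For LEN_ENCODED: 0 = INT8, 1 = INT16, 2 = INT32, 3 = LZF-compressed. */
int rdb_enc_type_of(uint8_t first) {
    return first & 0x3F;
}
```

Under this scheme '\t' (9) is a 6-bit length of 9, which matches the nine characters of "redis-ver", while 300 octal is an encoded value of type 0 (INT8).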
2.4 Reading the source
2.4.1 RDB-related fields in struct redisServer
struct redisServer {
    ...
    redisDb *db;
    list *slaves, *monitors; /* List of slaves and MONITORs */
    /* RDB / AOF loading information */
    int loading; /* We are loading data from disk if true */
    off_t loading_total_bytes;
    off_t loading_loaded_bytes;
    time_t loading_start_time;
    /* While loading, used to set max_processing_chunk, the maximum number
     * of bytes per read or write */
    off_t loading_process_events_interval_bytes;
    size_t stat_peak_memory; /* Max used memory record */
    long long stat_fork_time; /* Time needed to perform latest fork() */
    double stat_fork_rate; /* Fork rate in GB/sec. */
    long long dirty; /* Changes to DB from the last save */
    /* dirty is backed up here before a BGSAVE and restored if the BGSAVE fails */
    long long dirty_before_bgsave; /* Used to restore dirty on failed BGSAVE */
    pid_t rdb_child_pid; /* PID of RDB saving child */
    struct saveparam *saveparams; /* Save points array for RDB */
    int saveparamslen; /* Number of saving points */
    char *rdb_filename; /* Name of RDB file */
    /* Whether the RDB file is compressed with LZF; default yes */
    int rdb_compression; /* Use compression in RDB? */
    /* Whether the RDB file carries a checksum; default yes */
    int rdb_checksum; /* Use RDB checksum? */
    /* To be studied */
    int rdb_del_sync_files; /* Remove RDB files used only for SYNC if
                               the instance does not use persistence. */
    time_t lastsave; /* Unix time of last successful save */
    time_t lastbgsave_try; /* Unix time of last attempted bgsave */
    time_t rdb_save_time_last; /* Time used by last RDB save run. */
    time_t rdb_save_time_start; /* Current RDB save start time. */
    /* If true, a BGSAVE is pending and starts as soon as possible */
    int rdb_bgsave_scheduled; /* BGSAVE when possible if true. */
    /* Where the RDB goes: to disk, or to the replicas' sockets */
    int rdb_child_type; /* Type of save by active child. */
    /* Status of the last completed BGSAVE */
    int lastbgsave_status; /* C_OK or C_ERR */
    int stop_writes_on_bgsave_err; /* Don't allow writes if can't BGSAVE */
    /* Diskless replication: write end of the pipe */
    int rdb_pipe_write; /* RDB pipes used to transfer the rdb */
    /* Diskless replication: read end of the pipe */
    int rdb_pipe_read; /* data to the parent process in diskless repl. */
    connection **rdb_pipe_conns; /* Connections which are currently the */
    int rdb_pipe_numconns; /* target of diskless rdb fork child. */
    int rdb_pipe_numconns_writing; /* Number of rdb conns with pending writes. */
    char *rdb_pipe_buff; /* In diskless replication, this buffer holds data */
    int rdb_pipe_bufflen; /* that was read from the the rdb pipe. */
    int rdb_key_save_delay; /* Delay in microseconds between keys while
                             * writing the RDB. (for testings) */
    int key_load_delay; /* Delay in microseconds between keys while
                         * loading aof or rdb. (for testings) */
    _Atomic time_t unixtime; /* Unix time sampled every cron cycle. */
    /* Latency monitor */
    /* Latency threshold */
    long long latency_monitor_threshold;
    /* Dictionary mapping event names to their latency samples */
    dict *latency_events;
    ...
};
2.4.2 Relevant parts of rdb.h
#define RDB_VERSION 9 /* RDB format version */
/* Special RDB opcodes (saved/loaded with rdbSaveType/rdbLoadType). */
#define RDB_OPCODE_MODULE_AUX 247 /* Module auxiliary data. */
#define RDB_OPCODE_IDLE 248 /* LRU idle time. */
#define RDB_OPCODE_FREQ 249 /* LFU frequency. */
#define RDB_OPCODE_AUX 250 /* RDB aux field. */
#define RDB_OPCODE_RESIZEDB 251 /* Hash table resize hint. */
#define RDB_OPCODE_EXPIRETIME_MS 252 /* Expire time in milliseconds. */
#define RDB_OPCODE_EXPIRETIME 253 /* Old expire time in seconds. */
#define RDB_OPCODE_SELECTDB 254 /* DB number of the following keys. */
#define RDB_OPCODE_EOF 255 /* End of the RDB file. */
2.4.3 Relevant parts of rdb.c
1. Studying rdbSaveInfoAuxFields
1.1 rdbSaveInfoAuxFields
/* Save a few default AUX fields with information about the RDB generated. */
int rdbSaveInfoAuxFields(rio *rdb, int rdbflags, rdbSaveInfo *rsi) {
int redis_bits = (sizeof(void*) == 8) ? 64 : 32;
int aof_preamble = (rdbflags & RDBFLAGS_AOF_PREAMBLE) != 0;
/* Add a few fields about the state when the RDB was created. */
if (rdbSaveAuxFieldStrStr(rdb,"redis-ver",REDIS_VERSION) == -1) return -1;
if (rdbSaveAuxFieldStrInt(rdb,"redis-bits",redis_bits) == -1) return -1;
if (rdbSaveAuxFieldStrInt(rdb,"ctime",time(NULL)) == -1) return -1;
if (rdbSaveAuxFieldStrInt(rdb,"used-mem",zmalloc_used_memory()) == -1) return -1;
/* Handle saving options that generate aux fields. */
if (rsi) {
if (rdbSaveAuxFieldStrInt(rdb,"repl-stream-db",rsi->repl_stream_db)
== -1) return -1;
if (rdbSaveAuxFieldStrStr(rdb,"repl-id",server.replid)
== -1) return -1;
if (rdbSaveAuxFieldStrInt(rdb,"repl-offset",server.master_repl_offset)
== -1) return -1;
}
if (rdbSaveAuxFieldStrInt(rdb,"aof-preamble",aof_preamble) == -1) return -1;
return 1;
}
1.2 rdbSaveAuxFieldStrStr
/* Wrapper for rdbSaveAuxField() used when key/val length can be obtained
* with strlen(). */
ssize_t rdbSaveAuxFieldStrStr(rio *rdb, char *key, char *val) {
return rdbSaveAuxField(rdb,key,strlen(key),val,strlen(val));
}
1.3 rdbSaveAuxField
/* Save an AUX field. */
ssize_t rdbSaveAuxField(rio *rdb, void *key, size_t keylen, void *val, size_t vallen) {
    ssize_t ret, len = 0;
    /* RDB_OPCODE_AUX is 250 (octal 372); it marks the start of an auxiliary field */
    if ((ret = rdbSaveType(rdb,RDB_OPCODE_AUX)) == -1) return -1;
    len += ret;
    if ((ret = rdbSaveRawString(rdb,key,keylen)) == -1) return -1;
    len += ret;
    if ((ret = rdbSaveRawString(rdb,val,vallen)) == -1) return -1;
    len += ret;
    return len;
}
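To connect this function with the bytes seen in the dump, here is a simplified sketch of the layout it produces: the opcode byte, then key and value each as a length-prefixed string. It assumes strings shorter than 64 bytes (so the length fits in one byte) and ignores the integer encoding and LZF compression that rdbSaveRawString may apply; encode_aux_field and put_short_string are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPCODE_AUX 250 /* same value as RDB_OPCODE_AUX */

/* Append `s` as <1-byte length><bytes>. Only handles strings shorter than
 * 64 bytes. Returns the bytes written, or 0 if `s` is too long or does
 * not fit in `cap`. */
static size_t put_short_string(uint8_t *dst, size_t cap, const char *s) {
    size_t n = strlen(s);
    if (n >= 64 || n + 1 > cap) return 0;
    dst[0] = (uint8_t)n; /* top two bits 00: 6-bit length */
    memcpy(dst + 1, s, n);
    return n + 1;
}

/* Encode one aux field into `dst`. Returns the total bytes written,
 * or 0 on failure. */
size_t encode_aux_field(uint8_t *dst, size_t cap,
                        const char *key, const char *val) {
    size_t off = 0, w;
    if (cap < 1) return 0;
    dst[off++] = OPCODE_AUX;
    if ((w = put_short_string(dst + off, cap - off, key)) == 0) return 0;
    off += w;
    if ((w = put_short_string(dst + off, cap - off, val)) == 0) return 0;
    off += w;
    return off;
}
```

Encoding ("redis-ver", "6.0.8") this way yields the byte 250, then 9 and the nine key characters, then 5 and the five value characters, which is the pattern od -c shows as 372 \t r e d i s - v e r 005 6 . 0 . 8.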
1.4 rdbSaveType
int rdbSaveType(rio *rdb, unsigned char type) {
return rdbWriteRaw(rdb,&type,1);
}
1.5 rdbWriteRaw
static int rdbWriteRaw(rio *rdb, void *p, size_t len) {
if (rdb && rioWrite(rdb,p,len) == 0)
return -1;
return len;
}
1.6 rioWrite
static inline size_t rioWrite(rio *r, const void *buf, size_t len) {
if (r->flags & RIO_FLAG_WRITE_ERROR) return 0;
while (len) {
size_t bytes_to_write = (r->max_processing_chunk && r->max_processing_chunk < len) ? r->max_processing_chunk : len;
if (r->update_cksum) r->update_cksum(r,buf,bytes_to_write);
if (r->write(r,buf,bytes_to_write) == 0) {
r->flags |= RIO_FLAG_WRITE_ERROR;
return 0;
}
buf = (char*)buf + bytes_to_write;
len -= bytes_to_write;
r->processed_bytes += bytes_to_write;
}
return 1;
}
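The core of rioWrite is a loop that splits a write into chunks, hands each chunk to a backend callback, and latches an error flag on failure. The sketch below reproduces that pattern on a hypothetical miniature type (mini_io) with an in-memory backend; it omits the checksum hook and is not the rio API itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of rio: a write callback plus bookkeeping. */
typedef struct mini_io {
    /* Backend write: returns 0 on error, nonzero on complete success. */
    size_t (*write)(struct mini_io *io, const void *buf, size_t len);
    size_t max_chunk;       /* largest single write; 0 means no limit */
    size_t processed_bytes; /* bytes successfully written so far */
    int write_error;        /* latched after the first failed write */
    /* State of the in-memory backend used in this sketch. */
    char *dst;
    size_t cap, used;
    unsigned calls;         /* number of backend writes performed */
} mini_io;

static size_t mem_write(mini_io *io, const void *buf, size_t len) {
    if (io->used + len > io->cap) return 0; /* out of space: error */
    memcpy(io->dst + io->used, buf, len);
    io->used += len;
    io->calls++;
    return 1;
}

void mini_io_init(mini_io *io, char *dst, size_t cap, size_t max_chunk) {
    memset(io, 0, sizeof(*io));
    io->write = mem_write;
    io->max_chunk = max_chunk;
    io->dst = dst;
    io->cap = cap;
}

/* Same shape as rioWrite: returns 1 on success, 0 on error. */
size_t mini_io_write(mini_io *io, const void *buf, size_t len) {
    if (io->write_error) return 0;
    while (len) {
        size_t n = (io->max_chunk && io->max_chunk < len) ? io->max_chunk : len;
        if (io->write(io, buf, n) == 0) {
            io->write_error = 1;
            return 0;
        }
        buf = (const char *)buf + n;
        len -= n;
        io->processed_bytes += n;
    }
    return 1;
}
```

Once the error flag is set, every later write fails immediately, so callers such as rdbSaveRio can check a single return value and bail out.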
2.4.4 The client issues the SAVE command
void saveCommand(client *c) {
    if (server.rdb_child_pid != -1) {
        addReplyError(c,"Background save already in progress");
        return;
    }
    rdbSaveInfo rsi, *rsiptr;
    /* Whether rdbPopulateSaveInfo() returns NULL decides whether the
     * replication-related aux fields (repl-stream-db, repl-id,
     * repl-offset) are written later; with NULL they are omitted. */
    rsiptr = rdbPopulateSaveInfo(&rsi);
    if (rdbSave(server.rdb_filename,rsiptr) == C_OK) {
        addReply(c,shared.ok);
    } else {
        addReply(c,shared.err);
    }
}
/* Save the DB on disk. Return C_ERR on error, C_OK on success. */
int rdbSave(char *filename, rdbSaveInfo *rsi) {
    char tmpfile[256];
    char cwd[MAXPATHLEN]; /* Current working dir path for error messages. */
    FILE *fp;
    rio rdb;
    int error = 0;
    /* Build the temporary file name */
    snprintf(tmpfile,256,"temp-%d.rdb", (int) getpid());
    /* Open the file for writing */
    fp = fopen(tmpfile,"w");
    /* If opening fails, log the working directory and the reason */
    if (!fp) {
        char *cwdp = getcwd(cwd,MAXPATHLEN);
        serverLog(LL_WARNING,
            "Failed opening the RDB file %s (in server root dir %s) "
            "for saving: %s",
            filename,
            cwdp ? cwdp : "unknown",
            strerror(errno));
        return C_ERR;
    }
    /* Initialize a rio object backed by the file */
    rioInitWithFile(&rdb,fp);
    startSaving(RDBFLAGS_NONE);
    /* With incremental fsync enabled, fsync once for every
     * REDIS_AUTOSYNC_BYTES bytes written */
    if (server.rdb_save_incremental_fsync)
        rioSetAutoSync(&rdb,REDIS_AUTOSYNC_BYTES);
    /* Write the database contents to the rio */
    if (rdbSaveRio(&rdb,&error,RDBFLAGS_NONE,rsi) == C_ERR) {
        errno = error;
        goto werr;
    }
    /* Make sure data will not remain on the OS's output buffers */
    /* Flush the user-space (stdio) buffer to the kernel buffer */
    if (fflush(fp) == EOF) goto werr;
    /* Sync the kernel buffer to disk */
    if (fsync(fileno(fp)) == -1) goto werr;
    /* Close the file */
    if (fclose(fp) == EOF) goto werr;
    /* Use RENAME to make sure the DB file is changed atomically only
     * if the generate DB file is ok. */
    if (rename(tmpfile,filename) == -1) {
        char *cwdp = getcwd(cwd,MAXPATHLEN);
        serverLog(LL_WARNING,
            "Error moving temp DB file %s on the final "
            "destination %s (in server root dir %s): %s",
            tmpfile,
            filename,
            cwdp ? cwdp : "unknown",
            strerror(errno));
        unlink(tmpfile);
        stopSaving(0);
        return C_ERR;
    }
    /* Write to the log */
    serverLog(LL_NOTICE,"DB saved on disk");
    /* Reset the server's dirty counter */
    server.dirty = 0;
    /* Record the time of this SAVE in server.lastsave */
    server.lastsave = time(NULL);
    /* Record the status of this SAVE */
    server.lastbgsave_status = C_OK;
    /* To be studied: probably publishes the end-of-save state so that
     * subscribers and modules are notified */
    stopSaving(1);
    return C_OK;
/* Write-error path: log, close the file, delete the temp file, return C_ERR */
werr:
    serverLog(LL_WARNING,"Write error saving DB on disk: %s", strerror(errno));
    fclose(fp);
    unlink(tmpfile);
    stopSaving(0);
    return C_ERR;
}
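Stripped of the Redis specifics, rdbSave follows a common durable-write pattern: write to a temporary file, fflush, fsync, close, then rename over the target. The following is a minimal sketch of that pattern for an arbitrary byte buffer; save_atomically and the temporary file name are made up for illustration, and the temp file is created in the current directory so that rename stays on one filesystem.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write `len` bytes to `path` via a temporary file, so that readers never
 * see a half-written file. Returns 0 on success, -1 on error. */
int save_atomically(const char *path, const void *data, size_t len) {
    char tmp[256];
    int n = snprintf(tmp, sizeof(tmp), "temp-%d.tmp", (int)getpid());
    if (n < 0 || (size_t)n >= sizeof(tmp)) return -1;

    FILE *fp = fopen(tmp, "w");
    if (!fp) return -1;
    if (len && fwrite(data, 1, len, fp) != len) goto werr;
    if (fflush(fp) == EOF) goto werr;       /* stdio buffer -> kernel */
    if (fsync(fileno(fp)) == -1) goto werr; /* kernel buffer -> disk */
    if (fclose(fp) == EOF) {
        fp = NULL;                          /* do not close twice */
        goto werr;
    }
    fp = NULL;
    /* Replace the target in one step only if everything above succeeded. */
    if (rename(tmp, path) == -1) goto werr;
    return 0;

werr:
    if (fp) fclose(fp);
    unlink(tmp);
    return -1;
}
```

Because rename replaces the destination in a single step, an existing dump file stays intact until the new one has been fully written and synced.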
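Next we analyze rdbSaveRio, arguably the most important call inside rdbSave. For reasons of length, it is covered in the next section.
2.4.5 Implementation of rdbSaveRio
1. First, the definition of rio: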
struct _rio {
/* Backend functions.
* Since this functions do not tolerate short writes or reads the return
* value is simplified to: zero on error, non zero on complete success. */
size_t (*read)(struct _rio *, void *buf, size_t len);
size_t (*write)(struct _rio *, const void *buf, size_t len);
off_t (*tell)(struct _rio *);
int (*flush)(struct _rio *);
/* The update_cksum method if not NULL is used to compute the checksum of
* all the data that was read or written so far. The method should be
* designed so that can be called with the current checksum, and the buf
* and len fields pointing to the new block of data to add to the checksum
* computation. */
void (*update_cksum)(struct _rio *, const void *buf, size_t len);
/* The current checksum and flags (see RIO_FLAG_*) */
uint64_t cksum, flags;
/* number of bytes read or written */
size_t processed_bytes;
/* maximum single read or write chunk size */
size_t max_processing_chunk;
/* Backend-specific vars. */
union {
/* In-memory buffer target. */
struct {
sds ptr;
off_t pos;
} buffer;
/* Stdio file pointer target. */
struct {
FILE *fp;
off_t buffered; /* Bytes written since last fsync. */
off_t autosync; /* fsync after 'autosync' bytes written. */
} file;
/* Connection object (used to read from socket) */
struct {
connection *conn; /* Connection */
off_t pos; /* pos in buf that was returned */
sds buf; /* buffered data */
size_t read_limit; /* don't allow to buffer/read more than that */
size_t read_so_far; /* amount of data read from the rio (not buffered) */
} conn;
/* FD target (used to write to pipe). */
struct {
int fd; /* File descriptor. */
off_t pos;
sds buf;
} fd;
} io;
};
typedef struct _rio rio;
2. Next, the definition of rdbSaveInfo:
typedef struct rdbSaveInfo {
/* Used saving and loading. */
int repl_stream_db; /* DB to select in server.master client. */
/* Used only loading. */
int repl_id_is_set; /* True if repl_id field is set. */
char repl_id[CONFIG_RUN_ID_SIZE+1]; /* Replication ID. */
long long repl_offset; /* Replication offset. */
} rdbSaveInfo;
3. The implementation of rdbSaveRio:
/* Produces a dump of the database in RDB format sending it to the specified
 * Redis I/O channel. On success C_OK is returned, otherwise C_ERR
 * is returned and part of the output, or all the output, can be
 * missing because of I/O errors.
 *
 * When the function returns C_ERR and if 'error' is not NULL, the
 * integer pointed by 'error' is set to the value of errno just after the I/O
 * error. */
int rdbSaveRio(rio *rdb, int *error, int rdbflags, rdbSaveInfo *rsi) {
    dictIterator *di = NULL;
    dictEntry *de;
    char magic[10];
    int j;
    uint64_t cksum;
    size_t processed = 0;
    /* If the checksum option is enabled, install the checksum function */
    if (server.rdb_checksum)
        rdb->update_cksum = rioGenericUpdateChecksum;
    /* Build the magic string: "REDIS" followed by RDB_VERSION as four
     * zero-padded decimal digits, e.g. "REDIS0009" */
    snprintf(magic,sizeof(magic),"REDIS%04d",RDB_VERSION);
    /* Write the magic to the rio, 9 bytes in total */
    if (rdbWriteRaw(rdb,magic,9) == -1) goto werr;
    /* Write the aux fields. These are always written:
     *   "redis-ver" (REDIS_VERSION), "redis-bits" (32- or 64-bit),
     *   "ctime" (current time), "used-mem" (value of zmalloc_used_memory()),
     *   "aof-preamble" (whether this RDB is the preamble of a mixed
     *   RDB+AOF file).
     * When rsi is not NULL, the replication fields are written as well:
     *   "repl-stream-db", "repl-id", "repl-offset" */
    if (rdbSaveInfoAuxFields(rdb,rdbflags,rsi) == -1) goto werr;
    /* Save module data; to be studied further.
     * #define REDISMODULE_AUX_BEFORE_RDB (1<<0) */
    if (rdbSaveModulesAux(rdb, REDISMODULE_AUX_BEFORE_RDB) == -1) goto werr;
    /* Iterate over every database; for each non-empty one, write its
     * header with rdbSaveType and rdbSaveLen, then its entries */
    for (j = 0; j < server.dbnum; j++) {
        /* Get the database pointer and its key dictionary */
        redisDb *db = server.db+j;
        dict *d = db->dict;
        if (dictSize(d) == 0) continue;
        di = dictGetSafeIterator(d); /* iterator over the dictionary */
        /* Write the SELECT DB opcode */
        /* #define RDB_OPCODE_SELECTDB 254 (octal 376) */
        if (rdbSaveType(rdb,RDB_OPCODE_SELECTDB) == -1) goto werr;
        /* Write the database number */
        if (rdbSaveLen(rdb,j) == -1) goto werr;
        /* Write the RESIZE DB opcode. */
        /* Get the sizes of the key dictionary and the expires dictionary */
        uint64_t db_size, expires_size;
        db_size = dictSize(db->dict);
        expires_size = dictSize(db->expires);
        /* #define RDB_OPCODE_RESIZEDB 251 (octal 373) */
        if (rdbSaveType(rdb,RDB_OPCODE_RESIZEDB) == -1) goto werr;
        /* Write both sizes */
        if (rdbSaveLen(rdb,db_size) == -1) goto werr;
        if (rdbSaveLen(rdb,expires_size) == -1) goto werr;
        /* Iterate this DB writing every entry */
        while((de = dictNext(di)) != NULL) {
            sds keystr = dictGetKey(de);
            robj key, *o = dictGetVal(de);
            long long expire;
            /* initStaticStringObject sets:
             *   key.refcount = OBJ_STATIC_REFCOUNT
             *   key.type = OBJ_STRING
             *   key.encoding = OBJ_ENCODING_RAW
             *   key.ptr = keystr */
            initStaticStringObject(key,keystr);
            /* Get the key's expire time */
            expire = getExpire(db,&key);
            /* Save the key-value pair */
            if (rdbSaveKeyValuePair(rdb,&key,o,expire) == -1) goto werr;
            /* When this RDB is produced as part of an AOF rewrite, move
             * accumulated diff from parent to child while rewriting in
             * order to have a smaller final write. */
            if (rdbflags & RDBFLAGS_AOF_PREAMBLE &&
                rdb->processed_bytes > processed+AOF_READ_DIFF_INTERVAL_BYTES)
            {
                processed = rdb->processed_bytes;
                aofReadDiffFromParent(); /* read the diff accumulated by the parent */
            }
        }
        dictReleaseIterator(di);
        di = NULL; /* So that we don't release it again on error. */
    }
    /* If we are storing the replication information on disk, persist
     * the script cache as well: on successful PSYNC after a restart, we need
     * to be able to process any EVALSHA inside the replication backlog the
     * master will send us. */
    if (rsi && dictSize(server.lua_scripts)) {
        di = dictGetIterator(server.lua_scripts);
        while((de = dictNext(di)) != NULL) {
            robj *body = dictGetVal(de);
            if (rdbSaveAuxField(rdb,"lua",3,body->ptr,sdslen(body->ptr)) == -1)
                goto werr;
        }
        dictReleaseIterator(di);
        di = NULL; /* So that we don't release it again on error. */
    }
    /* Save module data; #define REDISMODULE_AUX_AFTER_RDB (1<<1) */
    if (rdbSaveModulesAux(rdb, REDISMODULE_AUX_AFTER_RDB) == -1) goto werr;
    /* EOF opcode */
    if (rdbSaveType(rdb,RDB_OPCODE_EOF) == -1) goto werr;
    /* CRC64 checksum. It will be zero if checksum computation is disabled, the
     * loading code skips the check in this case. */
    cksum = rdb->cksum;
    memrev64ifbe(&cksum);
    if (rioWrite(rdb,&cksum,8) == 0) goto werr;
    return C_OK;
werr:
    if (error) *error = errno;
    if (di) dictReleaseIterator(di);
    return C_ERR;
}
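The first nine bytes written by rdbSaveRio are the magic string, so a reader can validate a dump before parsing anything else. A small sketch of formatting and parsing that header follows; both function names are hypothetical and do not correspond to the loader in rdb.c.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build the 9-character header plus terminating NUL, e.g. "REDIS0009".
 * Returns 0 on success, -1 if the version does not fit in four digits. */
int rdb_format_header(char out[10], int version) {
    if (version < 0 || version > 9999) return -1;
    snprintf(out, 10, "REDIS%04d", version);
    return 0;
}

/* Parse the 9-byte header "REDIS" + four decimal digits.
 * Returns the version number, or -1 if the header is malformed. */
int rdb_parse_header(const unsigned char *buf, size_t len) {
    if (len < 9 || memcmp(buf, "REDIS", 5) != 0) return -1;
    int version = 0;
    for (int i = 5; i < 9; i++) {
        if (!isdigit(buf[i])) return -1;
        version = version * 10 + (buf[i] - '0');
    }
    return version;
}
```

For RDB_VERSION 9 the header is "REDIS0009", which is exactly what xxd or od -c shows at offset 0 of dump.rdb.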
2.4.6 The client issues the BGSAVE command
int rdbSaveBackground(char *filename, rdbSaveInfo *rsi) {
pid_t childpid;
if (hasActiveChildProcess()) return C_ERR;
server.dirty_before_bgsave = server.dirty;
server.lastbgsave_try = time(NULL);
openChildInfoPipe();
if ((childpid = redisFork()) == 0) {
int retval;
/* Child */
redisSetProcTitle("redis-rdb-bgsave");
redisSetCpuAffinity(server.bgsave_cpulist);
retval = rdbSave(filename,rsi);
if (retval == C_OK) {
sendChildCOWInfo(CHILD_INFO_TYPE_RDB, "RDB");
}
exitFromChild((retval == C_OK) ? 0 : 1);
} else {
/* Parent */
if (childpid == -1) {
closeChildInfoPipe();
server.lastbgsave_status = C_ERR;
serverLog(LL_WARNING,"Can't save in background: fork: %s",
strerror(errno));
return C_ERR;
}
serverLog(LL_NOTICE,"Background saving started by pid %d",childpid);
server.rdb_save_time_start = time(NULL);
server.rdb_child_pid = childpid;
server.rdb_child_type = RDB_CHILD_TYPE_DISK;
return C_OK;
}
return C_OK; /* unreached */
}
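The fork pattern in rdbSaveBackground can be tried in isolation: the child does the work and reports the result through its exit status, while the parent records the pid and checks on it later without blocking. The sketch below uses plain fork and waitpid with WNOHANG; all names are hypothetical, and the Redis details (child info pipe, CPU affinity, process title) are left out.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `job` in a forked child. Returns the child's pid to the parent,
 * or -1 if fork failed. The child never returns from this function. */
pid_t run_in_background(int (*job)(void *), void *arg) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: exit status 0 means success, 1 means failure. */
        _exit(job(arg) == 0 ? 0 : 1);
    }
    return pid;
}

/* Non-blocking check on the child. Returns 1 if it has terminated (and
 * sets *ok to whether it exited with status 0), 0 if it is still
 * running, -1 on error. */
int poll_child(pid_t pid, int *ok) {
    int status;
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0) return 0;
    if (r == -1) return -1;
    *ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return 1;
}

/* Two trivial jobs for trying the helpers out. */
int job_succeeds(void *arg) { (void)arg; return 0; }
int job_fails(void *arg) { (void)arg; return -1; }
```

A periodic task would call poll_child on each tick, much as serverCron calls checkChildrenDone while a child is active.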
2.4.7 Saving on shutdown
1. Server-side shutdown via a signal
1.1 Signal handler
static void sigShutdownHandler(int sig) {
char *msg;
switch (sig) {
case SIGINT:
msg = "Received SIGINT scheduling shutdown...";
break;
case SIGTERM:
msg = "Received SIGTERM scheduling shutdown...";
break;
default:
msg = "Received shutdown signal, scheduling shutdown...";
};
/* SIGINT is often delivered via Ctrl+C in an interactive session.
* If we receive the signal the second time, we interpret this as
* the user really wanting to quit ASAP without waiting to persist
* on disk. */
if (server.shutdown_asap && sig == SIGINT) {
serverLogFromHandler(LL_WARNING, "You insist... exiting now.");
rdbRemoveTempFile(getpid());
exit(1); /* Exit with an error since this was not a clean shutdown. */
} else if (server.loading) {
serverLogFromHandler(LL_WARNING, "Received shutdown signal during loading, exiting now.");
exit(0);
}
serverLogFromHandler(LL_WARNING, msg);
server.shutdown_asap = 1;
}
1.2 Once server.shutdown_asap has been set to 1, serverCron calls prepareForShutdown
int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
...
/* We received a SIGTERM, shutting down here in a safe way, as it is
* not ok doing so inside the signal handler. */
if (server.shutdown_asap) {
if (prepareForShutdown(SHUTDOWN_NOFLAGS) == C_OK) exit(0);
serverLog(LL_WARNING,"SIGTERM received but errors trying to shut down the server, check the logs for more information");
server.shutdown_asap = 0;
}
...
}
1.3 prepareForShutdown calls rdbSave
2. Client-initiated SHUTDOWN command
2.1 Source of shutdownCommand
void shutdownCommand(client *c) {
int flags = 0;
if (c->argc > 2) {
addReply(c,shared.syntaxerr);
return;
} else if (c->argc == 2) {
if (!strcasecmp(c->argv[1]->ptr,"nosave")) {
flags |= SHUTDOWN_NOSAVE;
} else if (!strcasecmp(c->argv[1]->ptr,"save")) {
flags |= SHUTDOWN_SAVE;
} else {
addReply(c,shared.syntaxerr);
return;
}
}
if (prepareForShutdown(flags) == C_OK) exit(0);
addReplyError(c,"Errors trying to SHUTDOWN. Check logs.");
}
2.2 shutdownCommand calls prepareForShutdown, which in turn calls rdbSave.
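The argument handling of SHUTDOWN can be exercised on its own. The sketch below mirrors the parsing in shutdownCommand and adds a decision helper; the rule in should_save_on_shutdown (save when SAVE is given, or when save points exist and NOSAVE is not given) is my reading of prepareForShutdown, whose source is not shown above, so treat it as an assumption. All names and flag values are hypothetical.

```c
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical flag values, standing in for SHUTDOWN_SAVE / SHUTDOWN_NOSAVE. */
enum { FLAG_NONE = 0, FLAG_SAVE = 1, FLAG_NOSAVE = 2 };

/* argv[0] is the command name. Returns 0 and stores the flags,
 * or -1 on a syntax error. */
int parse_shutdown_args(int argc, const char **argv, int *flags) {
    *flags = FLAG_NONE;
    if (argc > 2) return -1;
    if (argc == 2) {
        if (!strcasecmp(argv[1], "nosave")) {
            *flags |= FLAG_NOSAVE;
        } else if (!strcasecmp(argv[1], "save")) {
            *flags |= FLAG_SAVE;
        } else {
            return -1;
        }
    }
    return 0;
}

/* Assumed rule: NOSAVE suppresses the save; SAVE forces it; otherwise
 * save only if at least one save point is configured. */
int should_save_on_shutdown(int flags, int saveparamslen) {
    if (flags & FLAG_NOSAVE) return 0;
    return (flags & FLAG_SAVE) || saveparamslen > 0;
}
```

The comparison is case-insensitive, so SHUTDOWN NOSAVE and shutdown nosave are treated the same.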