Redis Source Code Analysis: 12 AOF Persistence

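         In addition to RDB persistence, Redis also provides AOF (Append Only File) persistence. Whereas RDB persistence records the database state by saving the key-value pairs in the database, AOF persistence records it by saving the write commands that the Redis server executes. Compared with RDB persistence, AOF persistence typically loses less data after a crash, but it can reduce Redis's performance.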

         Every command written to the AOF file is stored in the Redis unified request protocol format.
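To make the on-disk format concrete, here is a minimal sketch of how a command could be encoded in the unified request protocol: a `*<argc>` header followed by one `$<length>` bulk string per argument. The helper name `resp_encode_command` is hypothetical and is not a Redis function. Unlike the real encoder, this sketch assumes NUL-terminated arguments, so it is not binary safe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Encode argc arguments as a multi-bulk request into out.
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if the output buffer is too small.
 * Hypothetical helper for illustration; assumes NUL-terminated args. */
static int resp_encode_command(char *out, size_t outlen,
                               int argc, const char **argv) {
    size_t used;
    int n;

    assert(out != NULL && argv != NULL && argc > 0);

    /* Header: "*<number of arguments>\r\n" */
    n = snprintf(out, outlen, "*%d\r\n", argc);
    if (n < 0 || (size_t)n >= outlen) return -1;
    used = (size_t)n;

    for (int i = 0; i < argc; i++) {
        /* Each argument: "$<byte length>\r\n<bytes>\r\n" */
        n = snprintf(out + used, outlen - used, "$%zu\r\n%s\r\n",
                     strlen(argv[i]), argv[i]);
        if (n < 0 || (size_t)n >= outlen - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

With the arguments `SET`, `key`, `value`, the buffer ends up holding `*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n`, which is 33 bytes.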

 

         In the redisServer struct, which represents the Redis server, the AOF-related members are as follows:

struct redisServer {
    ...
    /* AOF persistence */
    int aof_state;                  /* REDIS_AOF_(ON|OFF|WAIT_REWRITE) */
    int aof_fsync;                  /* Kind of fsync() policy */
    char *aof_filename;             /* Name of the AOF file */
    ...
    pid_t aof_child_pid;            /* PID if rewriting process */
    list *aof_rewrite_buf_blocks;   /* Hold changes during an AOF rewrite. */
    sds aof_buf;      /* AOF buffer, written before entering the event loop */
    int aof_fd;       /* File descriptor of currently selected AOF file */
    ...
    /* AOF pipes used to communicate between parent and child during rewrite. */
    int aof_pipe_write_data_to_child;
    int aof_pipe_read_data_from_parent;
    int aof_pipe_write_ack_to_parent;
    int aof_pipe_read_ack_from_child;
    int aof_pipe_write_ack_to_child;
    int aof_pipe_read_ack_from_parent;
    int aof_stop_sending_diff;     /* If true stop sending accumulated diffs
                                      to child process. */
    sds aof_child_diff;             /* AOF diff accumulator child side. */
    ...
};

 

I: AOF Persistence

         The implementation of AOF persistence can be divided into three steps: command append, file write, and file sync.

1: Command Append

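         When AOF persistence is enabled, the Redis server calls feedAppendOnlyFile for each write command it executes on behalf of a client. The function encodes the command in the unified request protocol and appends the encoded bytes to the AOF buffer server.aof_buf. The code of feedAppendOnlyFile is as follows: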

void feedAppendOnlyFile(struct redisCommand *cmd, int dictid, robj **argv, int argc) {
    sds buf = sdsempty();
    robj *tmpargv[3];

    /* The DB this command was targeting is not the same as the last command
     * we appended. To issue a SELECT command is needed. */
    if (dictid != server.aof_selected_db) {
        char seldb[64];

        snprintf(seldb,sizeof(seldb),"%d",dictid);
        buf = sdscatprintf(buf,"*2\r\n$6\r\nSELECT\r\n$%lu\r\n%s\r\n",
            (unsigned long)strlen(seldb),seldb);
        server.aof_selected_db = dictid;
    }

    if (cmd->proc == expireCommand || cmd->proc == pexpireCommand ||
        cmd->proc == expireatCommand) {
        /* Translate EXPIRE/PEXPIRE/EXPIREAT into PEXPIREAT */
        buf = catAppendOnlyExpireAtCommand(buf,cmd,argv[1],argv[2]);
    } else if (cmd->proc == setexCommand || cmd->proc == psetexCommand) {
        /* Translate SETEX/PSETEX to SET and PEXPIREAT */
        tmpargv[0] = createStringObject("SET",3);
        tmpargv[1] = argv[1];
        tmpargv[2] = argv[3];
        buf = catAppendOnlyGenericCommand(buf,3,tmpargv);
        decrRefCount(tmpargv[0]);
        buf = catAppendOnlyExpireAtCommand(buf,cmd,argv[1],argv[2]);
    } else {
        /* All the other commands don't need translation or need the
         * same translation already operated in the command vector
         * for the replication itself. */
        buf = catAppendOnlyGenericCommand(buf,argc,argv);
    }

    /* Append to the AOF buffer. This will be flushed on disk just before
     * of re-entering the event loop, so before the client will get a
     * positive reply about the operation performed. */
    if (server.aof_state == REDIS_AOF_ON)
        server.aof_buf = sdscatlen(server.aof_buf,buf,sdslen(buf));

    /* If a background append only file rewriting is in progress we want to
     * accumulate the differences between the child DB and the current one
     * in a buffer, so that when the child process will do its work we
     * can append the differences to the new append only file. */
    if (server.aof_child_pid != -1)
        aofRewriteBufferAppend((unsigned char*)buf,sdslen(buf));

    sdsfree(buf);
}

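         The function first checks whether dictid, the database index of the current command, matches server.aof_selected_db, the database index of the previously appended command. If they differ, it encodes a SELECT command first.

         If the command is EXPIRE, PEXPIRE or EXPIREAT, it calls catAppendOnlyExpireAtCommand to encode the command as a PEXPIREAT command.

         If the command is SETEX or PSETEX, it first calls catAppendOnlyGenericCommand to encode a SET command, and then calls catAppendOnlyExpireAtCommand to encode a PEXPIREAT command.

         All other commands are encoded directly with catAppendOnlyGenericCommand.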
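The reason for rewriting expiry commands as PEXPIREAT is that a relative timeout replayed later from the AOF file would be counted from the replay time rather than from the original execution time. The sketch below shows the arithmetic under that assumption: second-resolution values are scaled to milliseconds, and relative values are turned into absolute ones by adding the current time. The enum and the function `to_pexpireat_ms` are hypothetical names for illustration, and the current time is passed in as a parameter so that the result is deterministic.

```c
#include <assert.h>

/* Hypothetical command kinds, for illustration only. */
typedef enum { CMD_EXPIRE, CMD_PEXPIRE, CMD_EXPIREAT } expire_kind;

/* Convert the time argument of an expiry command into the absolute
 * millisecond timestamp that a PEXPIREAT command would carry.
 * now_ms is the current Unix time in milliseconds. */
static long long to_pexpireat_ms(expire_kind kind, long long when,
                                 long long now_ms) {
    assert(now_ms >= 0);

    /* EXPIRE and EXPIREAT take seconds: scale to milliseconds. */
    if (kind == CMD_EXPIRE || kind == CMD_EXPIREAT) when *= 1000;

    /* EXPIRE and PEXPIRE are relative: make them absolute. */
    if (kind == CMD_EXPIRE || kind == CMD_PEXPIRE) when += now_ms;

    return when;
}
```

For example, `EXPIRE key 10` executed at `now_ms = 1000000` would be logged with the timestamp 1010000, so replaying the file later yields the same expiry moment.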

        

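         If server.aof_state is REDIS_AOF_ON, AOF is enabled, and the encoded buf is appended to the AOF buffer server.aof_buf.

         In addition, if server.aof_child_pid is not -1, a child process is performing an AOF rewrite, so aofRewriteBufferAppend is called to append the encoded buf to the AOF rewrite buffer server.aof_rewrite_buf_blocks.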
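The struct above declares aof_rewrite_buf_blocks as a list, which suggests that the rewrite buffer grows block by block instead of being reallocated as one large string. The following is a rough sketch of that idea, not the actual Redis implementation: the types `rw_block` and `rw_buffer`, the function names, and the tiny `BLOCK_SIZE` are all assumptions chosen to keep the example small.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 16  /* deliberately tiny; illustrative only */

/* One fixed-size block of buffered bytes (hypothetical layout). */
typedef struct rw_block {
    size_t used;
    char buf[BLOCK_SIZE];
    struct rw_block *next;
} rw_block;

/* A singly linked list of blocks with a tail pointer for appends. */
typedef struct rw_buffer {
    rw_block *head, *tail;
    size_t total;    /* bytes stored across all blocks */
    size_t nblocks;  /* number of allocated blocks */
} rw_buffer;

/* Append len bytes, filling the tail block first and allocating new
 * blocks as needed. Returns 0 on success, -1 on allocation failure. */
static int rw_buffer_append(rw_buffer *b, const char *data, size_t len) {
    assert(b != NULL && data != NULL);
    while (len > 0) {
        rw_block *t = b->tail;
        if (t == NULL || t->used == BLOCK_SIZE) {
            rw_block *n = calloc(1, sizeof(*n));
            if (n == NULL) return -1;
            if (t != NULL) t->next = n; else b->head = n;
            b->tail = t = n;
            b->nblocks++;
        }
        size_t room = BLOCK_SIZE - t->used;
        size_t chunk = len < room ? len : room;
        memcpy(t->buf + t->used, data, chunk);
        t->used += chunk;
        b->total += chunk;
        data += chunk;
        len -= chunk;
    }
    return 0;
}

/* Release every block and reset the buffer to its empty state. */
static void rw_buffer_free(rw_buffer *b) {
    rw_block *cur = b->head;
    while (cur != NULL) {
        rw_block *next = cur->next;
        free(cur);
        cur = next;
    }
    b->head = b->tail = NULL;
    b->total = b->nblocks = 0;
}
```

Appending never moves bytes that are already stored, so the cost of each append depends only on the size of the new data, which matters when a long rewrite accumulates many commands.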

 

2: File Write and File Sync

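         To make file writes more efficient, when a program calls write on a file descriptor, a modern operating system usually keeps the data in an in-memory buffer for a while, and only writes it to disk once the buffer fills up or a time limit expires.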

         This improves efficiency, but it also puts the written data at risk: if the machine crashes, whatever is still sitting in the in-memory buffer is lost.

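         For this reason, the operating system provides the fsync function, which forces the buffered data to be written to disk immediately and so ensures the written data is safe.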
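The difference between write and fsync can be shown with a short POSIX sketch. The function `append_durably` is a hypothetical name, not something Redis defines. It appends a buffer to a file, retries partial writes, and only reports success after fsync has returned.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Append len bytes of buf to the file at path and force them to disk.
 * Returns 0 on success, -1 on error. Hypothetical helper. */
static int append_durably(const char *path, const char *buf, size_t len) {
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1) return -1;

    /* write() may accept fewer bytes than requested, so loop. */
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n == -1) {
            if (errno == EINTR) continue;  /* interrupted: retry */
            close(fd);
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }

    /* At this point the data may still be only in the OS cache.
     * fsync() blocks until the kernel has pushed it to the device. */
    if (fsync(fd) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling fsync after every write is what makes the operation slow, which is why the choice of when to call it is left to a configurable policy.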

 

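         In the main loop of the Redis server, the contents of the AOF buffer server.aof_buf are written to the AOF file just before the event loop is re-entered, and fsync is performed at a moment that depends on the sync policy. The sync policy is set with the appendfsync option in the configuration file, and there are three policies in total: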

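         a: appendfsync no

         Redis never calls fsync and leaves syncing entirely to the operating system. This is the fastest policy, but also the least safe.

         b: appendfsync always

         Each time write is called to write the contents of the AOF buffer server.aof_buf to the AOF file, fsync is called immediately afterwards. This is the safest policy, but also the slowest.

         c: appendfsync everysec

         fsync is performed once per second, which is a compromise between speed and safety. If the user does not set the appendfsync option, everysec is used as the default.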
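The three policies boil down to one decision made after each write: should fsync be issued now? Below is a minimal sketch of that decision, assuming second-resolution timestamps. The enum `fsync_policy` and the function `should_fsync` are hypothetical names, not identifiers from the Redis source.

```c
#include <assert.h>

/* Hypothetical policy values mirroring the appendfsync options. */
typedef enum { FSYNC_NO, FSYNC_ALWAYS, FSYNC_EVERYSEC } fsync_policy;

/* Decide whether to call fsync after a write.
 * now_sec is the current Unix time in seconds and last_fsync_sec is
 * the time of the previous fsync. Returns 1 to sync, 0 to skip. */
static int should_fsync(fsync_policy policy, long now_sec,
                        long last_fsync_sec) {
    assert(now_sec >= last_fsync_sec);

    switch (policy) {
    case FSYNC_ALWAYS:
        return 1;                        /* sync after every write */
    case FSYNC_EVERYSEC:
        return now_sec > last_fsync_sec; /* at most once per second */
    case FSYNC_NO:
    default:
        return 0;                        /* leave it to the OS */
    }
}
```

Under everysec, several writes that land within the same second share a single fsync, which bounds the data at risk to roughly the last second of writes while keeping the number of syncs low.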

 

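         Writing the contents of the AOF buffer server.aof_buf to the AOF file, and choosing when to fsync according to the sync policy, are both implemented in the function flushAppendOnlyFile. Its code is as follows: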

void flushAppendOnlyFile(int force) {
    ssize_t nwritten;
    int sync_in_progress = 0;
    mstime_t latency;

    if (sdslen(server.aof_buf) == 0) return;

    if (server.aof_fsync == AOF_FSYNC_EVERYSEC)
        sync_in_progress = bioPendingJobsOfType(REDIS_BIO_AOF_FSYNC) != 0;

    if (server.aof_fsync == AOF_FSYNC_EVERYSEC && !force) {
        /* With this append fsync policy we do background fsyncing.
         * If the fsync is still in progress we can try to delay
         * the write for a couple of seconds. */
        if (sync_in_progress) {
            if (server.aof_flush_postponed_start == 0) {
                /* No previous write postponing, remember that we are
                 * postponing the flush and return. */
                server.aof_flush_postponed_start = server.unixtime;
                return;
            } else if (server.unixtime - server.aof_flush_postponed_start < 2) {
                /* We were already waiting for fsync to finish, but for less
                 * than two seconds this is still ok. Postpone again. */
                return;
            }
            /* Otherwise fall trough, and go write since we can't wait
             * over two seconds. */
            server.aof_delayed_fsync++;
            redisLog(REDIS_NOTICE,"Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.");
        }
    }
    /* We want to perform a single write. This should be guaranteed atomic
     * at least if the filesystem we are writing is a real physical one.
     * While this will save us against the server being killed I don't think
     * there is much to do about the whole server stopping for power problems
     * or alike */

    latencyStartMonitor(latency);
    nwritten = write(server.aof_fd,server.aof_buf,sdslen(server.aof_buf));
    latencyEndMonitor(latency);
    /* We want to capture different events for delayed writes:
     * when the delay happens with a pending fsync, or with a saving child
     * active, and when the above two conditions are missing.
     * We also use an additional event name to save all sampl