Redis (5): Master-Slave Replication

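  • Full synchronization
  • Incremental synchronization
  • Master-slave configuration
  • Replication flow and source code analysis
  • Replication problems and optimization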

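  Redis master-slave replication comes in two forms: full synchronization and incremental synchronization. Before Redis 2.8, a slave sent the SYNC command every time it connected to the master, which always triggered a full synchronization. Since 2.8, full synchronization is mainly used for the first connection, while a reconnect after a dropped link can use incremental synchronization. Redis 2.8 also replaced SYNC with PSYNC: psync ? -1 requests a full synchronization, and psync <runid> <offset> requests an incremental one.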

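Full synchronization
  Full synchronization has two phases: data propagation and command propagation. In the data propagation phase, the master generates a dump file with bgsave and transfers it to the slave. In the command propagation phase, the master sends the slave, as commands, every write that occurred between the start of bgsave and the end of the dump file transfer. Taking a first connection as the example, the two phases work as follows.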
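1) After receiving psync ? -1 from the slave, the master checks whether a bgsave is currently running, or has just finished without its dump file having been sent to any slave yet. If neither is the case, it starts a new bgsave to generate a fresh dump file, and finally transfers the latest dump file to the slave. This is the data propagation phase.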
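2) From the start of bgsave until the dump file transfer ends, the master also buffers every write command (this article describes the buffer as the replication backlog; in the source shown later, a slave waiting on a bgsave accumulates these commands in its client output buffer, see copyClientOutputBuffer). Once the file transfer finishes, the master sends the buffered writes to the slave as commands (command propagation), which brings master and slave to the same state. The original post illustrates this with a diagram.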


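Incremental synchronization implementation
  Incremental synchronization mainly handles reconnects after a dropped link, and it also works through command propagation. Its implementation consists of the following three parts.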
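1) The master's replication backlog: a fixed-length first-in, first-out queue maintained by the master, 1 MB by default. The master writes every write command into the backlog as well. When a slave reconnects, the master compares the offset the slave sends against the backlog to choose between full and incremental synchronization: if the data at offset + 1 is still in the backlog, it performs an incremental synchronization, otherwise a full one. The data can be missing when, for example, many writes happened while the link was down and the FIFO queue evicted the older data. The backlog's size and lifetime are covered in the configuration section below.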
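2) The replication offset on master and slave: each side maintains its own replication offset. After the master sends N bytes to a slave it adds N to its offset, and after the slave receives N bytes it adds N to its own. Comparing the two offsets shows whether master and slave are in the same state. If they differ, the slave has to catch up: incrementally when the missing data is still in the backlog, otherwise through a full synchronization.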
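3) The master's runid: a 40-character string that identifies a running Redis instance, created when Redis starts. On the first connection the master sends its runid to the slave, which keeps it in memory. On later connections the slave sends it back, and the master compares it with its own: if the runid differs, the master treats the slave as connecting to it for the first time and performs a full synchronization. In practice, a restart of the master or slave, or a slave reconnecting to a new master, leaves the runid mismatched.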
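
To make the FIFO behaviour of the backlog concrete, here is a minimal sketch of a fixed-size circular buffer that tracks the master's replication offset. The struct, its field names, and the tiny 8-byte capacity are assumptions chosen for illustration; they only loosely mirror the real Redis fields and are not the actual implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified model of a master's replication backlog.
 * Names are illustrative, not the ones used in replication.c. */
#define BACKLOG_SIZE 8  /* tiny on purpose; the real default is 1 MB */

typedef struct {
    char buf[BACKLOG_SIZE];   /* fixed-size circular buffer */
    long long idx;            /* next write position inside buf */
    long long histlen;        /* number of valid bytes currently stored */
    long long master_offset;  /* total bytes ever fed (replication offset) */
} Backlog;

void backlog_init(Backlog *b) {
    assert(b != NULL);
    memset(b, 0, sizeof(*b));
}

/* Append len bytes, overwriting the oldest data once the buffer is full. */
void backlog_feed(Backlog *b, const char *data, long long len) {
    assert(b != NULL && data != NULL && len >= 0);
    for (long long i = 0; i < len; i++) {
        b->buf[b->idx] = data[i];
        b->idx = (b->idx + 1) % BACKLOG_SIZE;
        if (b->histlen < BACKLOG_SIZE) b->histlen++;
    }
    b->master_offset += len;
}

/* Replication offset of the oldest byte still held in the backlog. */
long long backlog_first_offset(const Backlog *b) {
    assert(b != NULL);
    return b->master_offset - b->histlen + 1;
}
```

Feeding 6 bytes and then 6 more into this 8-byte buffer moves the oldest available offset from 1 to 5: a slave that asks for offset 4 can no longer be served from the backlog, which is exactly the situation that forces a full synchronization.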

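Full or incremental synchronization:
  In short, the master uses incremental synchronization only when the runid sent by the slave matches the current master's runid and the offset + 1 sent by the slave is still inside the master's replication backlog. Otherwise it uses full synchronization. The following cases therefore all lead to a full synchronization: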

 

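  • Slave start or restart: runid and offset are kept in memory and are lost when the slave restarts, so a full synchronization is required.
  • Slave disconnected for too long: once the timeout expires the master releases the replication backlog (see repl-backlog-ttl below), so a full synchronization is required as well.
  • Writes during the disconnect exceed the configured backlog size: the data at the original offset + 1 has been evicted from the master's backlog, so a full synchronization is required.
  • Reconnecting to a new master: the runid does not match, so a full synchronization is required.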
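
The rule and the cases above can be summarized as one decision function. The following is a minimal sketch under stated assumptions: MasterState, SyncKind, and choose_sync are hypothetical names introduced here, and the offset check mirrors the bounds used by masterTryPartialResynchronization later in this article rather than reproducing it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SYNC_FULL, SYNC_INCREMENTAL } SyncKind;

/* Hypothetical snapshot of the master-side state the decision needs. */
typedef struct {
    const char *runid;          /* the master's run ID */
    int has_backlog;            /* 0 if the backlog was never created or was released */
    long long backlog_off;      /* offset of the oldest byte in the backlog */
    long long backlog_histlen;  /* number of bytes currently in the backlog */
} MasterState;

/* psync_offset is the value the slave sends, i.e. its own offset + 1. */
SyncKind choose_sync(const MasterState *m, const char *slave_runid,
                     long long psync_offset) {
    assert(m != NULL && m->runid != NULL && slave_runid != NULL);
    /* "?" or the runid of another master: first connection or new master */
    if (strcmp(slave_runid, m->runid) != 0) return SYNC_FULL;
    /* backlog released after a long disconnect */
    if (!m->has_backlog) return SYNC_FULL;
    /* requested data already evicted from the FIFO backlog */
    if (psync_offset < m->backlog_off) return SYNC_FULL;
    /* the slave claims data the master never produced */
    if (psync_offset > m->backlog_off + m->backlog_histlen) return SYNC_FULL;
    return SYNC_INCREMENTAL;
}
```

With a backlog that starts at offset 101 and holds 50 bytes, a request for offset 120 is served incrementally, while a request for offset 100, or any request carrying a different runid, falls back to a full synchronization.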

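Note: Redis provides debug reload as a way to restart. After such a restart the master's runid and offset are unaffected, which avoids a full synchronization. The original post illustrates this with a figure.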

Master-slave replication configuration

################################# REPLICATION #################################
# slaveof <masterip> <masterport>

# masterauth <master-password>

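# How a slave handles requests while replication is in progress
# yes: keep answering client requests with the existing data
# no: reply "SYNC with master in progress" to every request except INFO and SLAVEOF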
slave-serve-stale-data yes

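# Read-only mode for slaves. Default is yes. With no, the slave accepts temporary writes, which are discarded the next time it resyncs with the master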
slave-read-only yes

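# Whether to enable diskless replication
# There are two replication strategies:
# Disk-backed: the master first generates the dump file, then transfers it to the slaves
# Diskless: the master streams the data to the slaves directly over the socket
# Diskless replication suits slow disks combined with a fast, reliable network, but it was still experimental at the time of writing and is not recommended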
repl-diskless-sync no

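# With diskless replication, once a transfer has started, slaves that arrive later must wait for it to finish before theirs can begin,
# so a delay (in seconds) before the transfer starts lets several slaves arrive and be served in parallel by one transfer.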
repl-diskless-sync-delay 5

# Interval, in seconds, at which the master sends PING to its slaves
# repl-ping-slave-period 10

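# Replication timeout, in seconds
# On a slave, it covers the timeout for the master's PING reply, the timeout of the transfer during synchronization, and so on
# On a master, it is the timeout for responses from its slaves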
# repl-timeout 60

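# Whether to disable TCP_NODELAY
# yes: Redis sends fewer TCP packets to slaves and uses less bandwidth, but data may be delayed, by up to 40 milliseconds with the default Linux kernel configuration.
# no: more bandwidth is used, and the replication delay is lower.
# Default is no. Under very high traffic it can be set to yes to save replication bandwidth.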
repl-disable-tcp-nodelay no

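# Replication backlog size. The larger it is, the less likely the data at a slave's offset is evicted, and the longer a slave can stay disconnected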
# repl-backlog-size 1mb

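# Backlog time-to-live, in seconds, counted from the moment the master has no connected slaves. Once it expires the master frees the backlog, so a later reconnect needs a full synchronization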
# repl-backlog-ttl 3600

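# Slave priority, used when a new master is elected (for example by Redis Sentinel): a lower value is preferred for promotion, and 0 means the slave is never promoted.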
slave-priority 100

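# The master refuses writes when fewer than N slaves are connected with a lag of at most M seconds
# (N = min-slaves-to-write, M = min-slaves-max-lag).
# Lag = current time - time of the slave's last REPLCONF ACK <offset> command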
# min-slaves-to-write 3
# min-slaves-max-lag 10

# A Redis master is able to list the address and port of the attached
# slaves in different ways. For example the "INFO replication" section
# offers this information, which is used, among other tools, by
# Redis Sentinel in order to discover slave instances.
# Another place where this info is available is in the output of the
# "ROLE" command of a master.
#
# The listed IP and address normally reported by a slave is obtained
# in the following way:
#
#   IP: The address is auto detected by checking the peer address
#   of the socket used by the slave to connect with the master.
#
#   Port: The port is communicated by the slave during the replication
#   handshake, and is normally the port that the slave is using to
#   listen for connections.
#
# However when port forwarding or Network Address Translation (NAT) is
# used, the slave may be actually reachable via different IP and port
# pairs. The following two options can be used by a slave in order to
# report to its master a specific set of IP and port, so that both INFO
# and ROLE will report those values.
#
# There is no need to use both the options if you need to override just
# the port or the IP address.
#
# slave-announce-ip 5.5.5.5
# slave-announce-port 1234
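
The min-slaves-to-write / min-slaves-max-lag rule in the configuration above can be sketched as follows. This is a minimal illustration, not the Redis implementation: count_good_slaves and master_accepts_writes are hypothetical names, timestamps are plain seconds, and treating a minimum of 0 as "check disabled" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Count slaves whose last REPLCONF ACK arrived at most max_lag seconds ago.
 * last_ack[i] is the time (in seconds) of slave i's most recent ACK. */
int count_good_slaves(const long long *last_ack, size_t n,
                      long long now, long long max_lag) {
    assert(last_ack != NULL || n == 0);
    int good = 0;
    for (size_t i = 0; i < n; i++) {
        long long lag = now - last_ack[i];  /* lag = now - last ACK time */
        if (lag <= max_lag) good++;
    }
    return good;
}

/* Returns 1 if the master may accept writes, 0 if it should refuse them.
 * Assumption of this sketch: min_slaves == 0 disables the check. */
int master_accepts_writes(const long long *last_ack, size_t n, long long now,
                          int min_slaves, long long max_lag) {
    if (min_slaves == 0) return 1;
    return count_good_slaves(last_ack, n, now, max_lag) >= min_slaves;
}
```

With three slaves whose last ACKs are 5, 20, and 1 seconds old and a maximum lag of 10, only two slaves count as healthy, so a master configured with min-slaves-to-write 3 would refuse writes.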

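A note on sizing the backlog: it is usually derived from the slave's average disconnect time and the volume of writes the master produces per second. Setting it to 2 * second * write_size lets most reconnects use incremental synchronization. For example, if a slave is disconnected for 3 seconds on average and the master produces about 1 MB of writes per second, the backlog can be set to 2 * 3 * 1 = 6 MB.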
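
The sizing rule can be written as a one-line helper. This is only a sketch of the arithmetic in the paragraph above; suggested_backlog_bytes is a hypothetical name, and the factor of 2 is the article's rule of thumb rather than a Redis constant.

```c
#include <assert.h>

/* Suggested repl-backlog-size in bytes:
 * 2 * average disconnect time (seconds) * bytes written per second. */
long long suggested_backlog_bytes(long long avg_disconnect_seconds,
                                  long long write_bytes_per_second) {
    assert(avg_disconnect_seconds >= 0 && write_bytes_per_second >= 0);
    return 2 * avg_disconnect_seconds * write_bytes_per_second;
}
```

For the example in the text, suggested_backlog_bytes(3, 1024 * 1024) yields 6291456 bytes, i.e. 6 MB, which would be written as repl-backlog-size 6mb.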

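Master-slave replication flow and source code analysis
  The earlier figures gave a brief overview of the replication flow. This section looks at it from the source code, which lives in replication.c. The slave-side flow comes first.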
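1) The slave gets the master's address from the slaveof configuration directive or the SLAVEOF command, and initializes its replication state to REPL_STATE_CONNECT.
2) The slave runs replicationCron() once per second to check its own replication state. When server.repl_state == REPL_STATE_CONNECT it opens a connection to the master and changes the state to REPL_STATE_CONNECTING.
3) The slave sends PING, AUTH, and REPLCONF in turn. PING checks the connection, AUTH verifies the password, and REPLCONF tells the master the slave's port, IP, and capabilities (capa, such as eof) so that the master can report them. The state then becomes REPL_STATE_SEND_PSYNC, meaning synchronization is about to start.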
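4) The slave attempts to synchronize. For a first connection, or a reconnect to a new master, it has no matching runid or offset, so it sends PSYNC ? -1 to request a full synchronization. For a reconnect to its previous master it sends psync <runid> <offset> to attempt an incremental synchronization. It then sets the state to REPL_STATE_RECEIVE_PSYNC and waits for the master's response, which can be one of three replies. Once a full synchronization has been agreed, the slave sets its state to REPL_STATE_TRANSFER, meaning a transfer is in progress.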

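For a full synchronization, the slave creates a temporary file and receives the data the master sends. For an incremental synchronization (+CONTINUE), the slave executes each command in order.
5) Full synchronization: the slave stores the dump.rdb file sent by the master, flushes all of its current data, and loads that dump.rdb into memory. If AOF is enabled, it then triggers an AOF rewrite.
6) Incremental synchronization: to the slave, the master looks just like an ordinary client. The slave executes each command the master sends in order and appends to the AOF where applicable.
7) Synchronization complete: the slave sets server.repl_state = REPL_STATE_CONNECTED, meaning the replication link is established.
Note: if the master returns an error at any of the steps above, the current attempt ends and the slave waits for the next one.
The flow is roughly as shown in the original post's figure.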
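
Step 4 depends on the slave telling the master's replies to PSYNC apart. The following is a minimal sketch of such a classifier, not the code of slaveTryPartialResynchronization: parse_psync_reply and the PsyncReply enum are hypothetical names, and anything that is neither a well-formed +FULLRESYNC nor +CONTINUE (including -ERR) is simply reported as an error.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RUNID_LEN 40

typedef enum { REPLY_FULLRESYNC, REPLY_CONTINUE, REPLY_ERROR } PsyncReply;

/* Classify one reply line (without the trailing \r\n).
 * On REPLY_FULLRESYNC, runid and *offset receive the master's values. */
PsyncReply parse_psync_reply(const char *line, char runid[RUNID_LEN + 1],
                             long long *offset) {
    assert(line != NULL && runid != NULL && offset != NULL);
    if (strncmp(line, "+FULLRESYNC ", 12) == 0) {
        /* Expect "<40-char runid> <offset>" after the keyword. */
        if (sscanf(line + 12, "%40s %lld", runid, offset) == 2 &&
            strlen(runid) == RUNID_LEN)
            return REPLY_FULLRESYNC;
        return REPLY_ERROR;  /* malformed +FULLRESYNC */
    }
    if (strncmp(line, "+CONTINUE", 9) == 0) return REPLY_CONTINUE;
    return REPLY_ERROR;      /* -ERR or anything unexpected */
}
```

On +FULLRESYNC the slave would store the returned runid and offset before downloading the dump file, while on +CONTINUE it keeps its existing offset and simply applies the commands that follow.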

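The three replies the master can give to PSYNC (step 4 above) are:

  1. +FULLRESYNC <runid> <offset>: a full synchronization will follow. runid is the master's run ID and offset is the master's current replication offset. The slave saves the runid and initializes its own offset to offset.
  2. +CONTINUE: this synchronization is incremental.
  3. -ERR: the master is older than 2.8 and does not support PSYNC.

The source code follows: 1) parsing the slaveof command, 2) replicationCron(), which runs once per second.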
void slaveofCommand(client *c) {
    /* SLAVEOF is not allowed in cluster mode as replication is automatically
     * configured using the current address of the master node. */
    if (server.cluster_enabled) {
        ..
    }

    /* The special host/port combination "NO" "ONE" turns the instance
     * into a master. Otherwise the new master address is set. */
    if (!strcasecmp(c->argv[1]->ptr,"no") &&
        ...
    } else {
        long port;

        if ((getLongFromObjectOrReply(c, c->argv[2], &port, NULL) != C_OK))
            return;

        /* Same master we are already connected to: nothing to do */
        if (server.masterhost && !strcasecmp(server.masterhost,c->argv[1]->ptr)
            && server.masterport == port) {
            serverLog(LL_NOTICE,"SLAVE OF would result into synchronization with the master we are already connected with. No operation performed.");
            addReplySds(c,sdsnew("+OK Already connected to specified master\r\n"));
            return;
        }
        /* First connection, or switching to a new master: record the master's address */
        replicationSetMaster(c->argv[1]->ptr, port);
        sds client = catClientInfoString(sdsempty(),c);
        serverLog(LL_NOTICE,"SLAVE OF %s:%d enabled (user request from '%s')",
            server.masterhost, server.masterport, client);
        sdsfree(client);
    }
    addReply(c,shared.ok);
}
/* Set replication to the specified master address and port. */
void replicationSetMaster(char *ip, int port) {    
    sdsfree(server.masterhost);
    server.masterhost = sdsnew(ip);
    server.masterport = port;
    if (server.master) freeClient(server.master);
    disconnectAllBlockedClients(); /* Clients blocked in master, now slave. */
    disconnectSlaves(); /* Force our slaves to resync with us as well. */
    replicationDiscardCachedMaster(); /* Don't try a PSYNC. */
    freeReplicationBacklog(); /* Don't allow our chained slaves to PSYNC. */
    cancelReplicationHandshake();
    server.repl_state = REPL_STATE_CONNECT;    /* replication state */
    server.master_repl_offset = 0;             /* replication offset */
    server.repl_down_since = 0;
}
/* Replication cron function, called 1 time per second. */
void replicationCron(void) {
    static long long replication_cron_loops = 0;

    ......
    /* Check the state: do we still need to connect to the master? */
    if (server.repl_state == REPL_STATE_CONNECT) {
        serverLog(LL_NOTICE,"Connecting to MASTER %s:%d",
            server.masterhost, server.masterport);
        /* Connect to the master */
        if (connectWithMaster() == C_OK) {        
            serverLog(LL_NOTICE,"MASTER <-> SLAVE sync started");
        }
    }
    ......
}
int connectWithMaster(void) {
    int fd;

    fd = anetTcpNonBlockBestEffortBindConnect(NULL,
        server.masterhost,server.masterport,NET_FIRST_BIND_ADDR);
    ...
    /* Register the event handler: syncWithMaster */
    if (aeCreateFileEvent(server.el,fd,AE_READABLE|AE_WRITABLE,syncWithMaster,NULL) ==
            AE_ERR)
    {
        ...
    }
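    /* Record the state of the link to the master */
    server.repl_transfer_lastio = server.unixtime;
    server.repl_transfer_s = fd;                /* socket to the master, used for all master-slave TCP traffic during replication (command propagation, receiving data, ...) */
    server.repl_state = REPL_STATE_CONNECTING;  /* update the replication state */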
    return C_OK;
}
void syncWithMaster(aeEventLoop *el, int fd, void *privdata, int mask) {   
   ......
 
    /* Send PING to check that the master is alive */
    if (server.repl_state == REPL_STATE_CONNECTING) {
        serverLog(LL_NOTICE,"Non blocking connect for SYNC fired the event.");
       
        aeDeleteFileEvent(server.el,fd,AE_WRITABLE);
        server.repl_state = REPL_STATE_RECEIVE_PONG;  /* next state: wait for the PONG reply */

        err = sendSynchronousCommand(SYNC_CMD_WRITE,fd,"PING",NULL); /* send the PING command */
        if (err) goto write_error;
        return;
    }
    
    /* Send AUTH to verify the password, then REPLCONF to tell the master this
     * slave's IP, port, and capabilities, so that the master can report them in INFO
     */
    ......
    
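    /* Try a partial resync. On a first connection this becomes a full resync.
     * The runid and offset sent here determine what the master transfers next. */
    if (server.repl_state == REPL_STATE_SEND_PSYNC) {
        /* Send the PSYNC command */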
        if (slaveTryPartialResynchronization(fd,0) == PSYNC_WRITE_ERROR) {
            err = sdsnew("Write error sending the PSYNC command.");
            goto write_error;
        }
        server.repl_state = REPL_STATE_RECEIVE_PSYNC;  /* next state: wait for the PSYNC reply */
        return;
    }

    psync_result = slaveTryPartialResynchronization(fd,1);  /* read the PSYNC reply */

    /* ......handling of error results omitted  */

    /* Register readSyncBulkPayload so the file is downloaded asynchronously (non-blocking) */
    if (aeCreateFileEvent(server.el,fd, AE_READABLE,readSyncBulkPayload,NULL)
            == AE_ERR)
    {
        serverLog(LL_WARNING,
            "Can't create readable event for SYNC: %s (fd=%d)",
            strerror(errno),fd);
        goto error;
    }

    server.repl_state = REPL_STATE_TRANSFER;  /* state: transfer in progress */
    server.repl_transfer_size = -1;
    server.repl_transfer_read = 0;
    server.repl_transfer_last_fsync_off = 0;
    server.repl_transfer_fd = dfd;
    server.repl_transfer_lastio = server.unixtime;
    server.repl_transfer_tmpfile = zstrdup(tmpfile);
    return;
......
}
/* Download the RDB payload */
void readSyncBulkPayload(aeEventLoop *el, int fd, void *privdata, int mask) {
    ......reading the payload and writing it to the temporary file is omitted

    if (eof_reached) {
        /* Rename the temporary file over the existing RDB file */
        if (rename(server.repl_transfer_tmpfile,server.rdb_filename) == -1) {
            serverLog(LL_WARNING,"Failed trying to rename the temp DB into dump.rdb in MASTER <-> SLAVE synchronization: %s", strerror(errno));
            cancelReplicationHandshake();
            return;
        }

        /* Flush the existing data */
        serverLog(LL_NOTICE, "MASTER <-> SLAVE sync: Flushing old data");
        signalFlushedDb(-1);
        emptyDb(replicationEmptyDbCallback);
        /* Before loading the DB into memory we need to delete the readable
         * handler, otherwise it will get called recursively since
         * rdbLoad() will call the event loop to process events from time to
         * time for non blocking loading. */
        aeDeleteFileEvent(server.el,server.repl_transfer_s,AE_READABLE);

        /* Load the new RDB file */
        serverLog(LL_NOTICE, "MASTER <-> SLAVE sync: Loading DB in memory");
        if (rdbLoad(server.rdb_filename) != C_OK) {
            serverLog(LL_WARNING,"Failed trying to load the MASTER synchronization DB from disk");
            cancelReplicationHandshake();
            return;
        }

        /* Replication succeeded: final bookkeeping */
        zfree(server.repl_transfer_tmpfile);
        close(server.repl_transfer_fd);
        replicationCreateMasterClient(server.repl_transfer_s);
        serverLog(LL_NOTICE, "MASTER <-> SLAVE sync: Finished with success");

        /* AOF rewrite */
        /* Restart the AOF subsystem now that we finished the sync. This
         * will trigger an AOF rewrite, and when done will start appending
         * to the new file. */
        if (server.aof_state != AOF_OFF) {                
            ......
        }
    }

    return;
......
}

void replicationCreateMasterClient(int fd) {
    server.master = createClient(fd);
    server.master->flags |= CLIENT_MASTER;
    server.master->authenticated = 1;
    server.repl_state = REPL_STATE_CONNECTED;  /* state: connected */
    server.master->reploff = server.repl_master_initial_offset;  /* replication offset */
    memcpy(server.master->replrunid, server.repl_master_runid,
        sizeof(server.repl_master_runid));
    /* If master offset is set to -1, this master is old and is not
     * PSYNC capable, so we flag it accordingly. */
    if (server.master->reploff == -1)
        server.master->flags |= CLIENT_PRE_PSYNC;
}

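Master flow
1) The master parses the SYNC or PSYNC command and compares the runid and offset to choose between full and incremental synchronization. If an incremental synchronization is possible, it replies +CONTINUE and then sends the missing data to the slave as commands.
2) Full synchronization. If the master is already running a bgsave, this replication waits for that bgsave to finish. Otherwise the master starts a new bgsave, generates the dump file, and transfers it. Only the disk-backed strategy is discussed here, not the socket-streaming one.
3) When the first slave requests synchronization and no backlog exists yet, the master creates the replication backlog. Note that however many slaves a master has, it needs only one replication backlog.
The source code follows, covering the sync/psync command implementation, the incremental synchronization implementation, and the full synchronization implementation.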

/* Implementation of the sync and psync commands */
void syncCommand(client *c) {
    ......
    
    /* Try a partial resync first, i.e. check the runid and offset */
    if (!strcasecmp(c->argv[0]->ptr,"psync")) {	
        if (masterTryPartialResynchronization(c) == C_OK) {
            server.stat_sync_partial_ok++;
            return; /* partial resync succeeded, no full resync needed */
        } else {
            char *master_runid = c->argv[1]->ptr;
           
            if (master_runid[0] != '?') server.stat_sync_partial_err++;
        }
    } else {       
        c->flags |= CLIENT_PRE_PSYNC;
    }

    /* Full resync */
    server.stat_sync_full++;

    c->replstate = SLAVE_STATE_WAIT_BGSAVE_START;
    if (server.repl_disable_tcp_nodelay)
        anetDisableTcpNoDelay(NULL, c->fd); /* Non critical if it fails. */
    c->repldbfd = -1;
    c->flags |= CLIENT_SLAVE;
    listAddNodeTail(server.slaves,c);

    /* CASE 1: a BGSAVE is in progress with the disk-backed strategy (repl-diskless-sync no) */
    if (server.rdb_child_pid != -1 &&
        server.rdb_child_type == RDB_CHILD_TYPE_DISK)
    {
        /* Ok a background save is in progress. Let's check if it is a good
         * one for replication, i.e. if there is another slave that is
         * registering differences since the server forked to save. */
        client *slave;
        listNode *ln;
        listIter li;

        listRewind(server.slaves,&li);
        while((ln = listNext(&li))) {
            slave = ln->value;
            if (slave->replstate == SLAVE_STATE_WAIT_BGSAVE_END) break;
        }
        
        if (ln && ((c->slave_capa & slave->slave_capa) == slave->slave_capa)) {
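            /* Copy the output buffer of the slave that is already waiting for this BGSAVE into the current client's buffer */
            copyClientOutputBuffer(c,slave);
            /* Put the current slave into the waiting state and give it the same initial offset as that slave */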
            replicationSetupSlaveForFullResync(c,slave->psync_initial_offset); 
            serverLog(LL_NOTICE,"Waiting for end of BGSAVE for SYNC");
        } else {            
            serverLog(LL_NOTICE,"Can't attach the slave to the current BGSAVE. Waiting for next BGSAVE for SYNC");
        }

    /* CASE 2: a BGSAVE is in progress with the socket-streaming strategy (repl-diskless-sync yes) */
    } else if (server.rdb_child_pid != -1 &&
               server.rdb_child_type == RDB_CHILD_TYPE_SOCKET)
    {
        /* The current transfer must finish before the next one can start */
        serverLog(LL_NOTICE,"Current BGSAVE has socket target. Waiting for next BGSAVE for SYNC");

    /* CASE 3: no BGSAVE is in progress */
    } else {
        if (server.repl_diskless_sync && (c->slave_capa & SLAVE_CAPA_EOF)) {
            /* For the socket strategy, delay the start of the transfer. */
            if (server.repl_diskless_sync_delay)
                serverLog(LL_NOTICE,"Delay next BGSAVE for diskless SYNC");
        } else {
            /* Start a new BGSAVE. */
            if (server.aof_child_pid == -1) {
                startBgsaveForReplication(c->slave_capa);
            } else {
                serverLog(LL_NOTICE,
                    "No BGSAVE in progress, but an AOF rewrite is active. "
                    "BGSAVE for replication delayed");
            }
        }
    }
    
    /* Create the replication backlog when the first slave asks to sync and no backlog exists yet */
    if (listLength(server.slaves) == 1 && server.repl_backlog == NULL)
        createReplicationBacklog();
    return;
}
/* Try an incremental (partial) resync */
int masterTryPartialResynchronization(client *c) {
    long long psync_offset, psync_len;
    char *master_runid = c->argv[1]->ptr;
    char buf[128];
    int buflen;

    /* Validate the runid */
    if (strcasecmp(master_runid, server.runid)) {
        /* Run id "?" is used by slaves that want to force a full resync. */
        if (master_runid[0] != '?') {
            serverLog(LL_NOTICE,"Partial resynchronization not accepted: "
                "Runid mismatch (Client asked for runid '%s', my runid is '%s')",
                master_runid, server.runid);
        } else {
            serverLog(LL_NOTICE,"Full resync requested by slave %s",
                replicationGetSlaveName(c));
        }
        goto need_full_resync;
    }

    /* Validate the offset */
    if (getLongLongFromObjectOrReply(c,c->argv[2],&psync_offset,NULL) !=
       C_OK) goto need_full_resync;
    if (!server.repl_backlog ||
        psync_offset < server.repl_backlog_off ||
        psync_offset > (server.repl_backlog_off + server.repl_backlog_histlen))
    {
        serverLog(LL_NOTICE,
            "Unable to partial resync with slave %s for lack of backlog (Slave request was: %lld).", replicationGetSlaveName(c), psync_offset);
        if (psync_offset > server.master_repl_offset) {
            serverLog(LL_WARNING,
                "Warning: slave %s tried to PSYNC with an offset that is greater than the master replication offset.", replicationGetSlaveName(c));
        }
        goto need_full_resync;
    }

    /* Incremental synchronization starts here:
     * 1) Set client state to make it a slave.
     * 2) Inform the client we can continue with +CONTINUE
     * 3) Send the backlog data (from the offset to the end) to the slave. */
    c->flags |= CLIENT_SLAVE;
    c->replstate = SLAVE_STATE_ONLINE; /* mark the slave as online, which is what INFO reports */
    c->repl_ack_time = server.unixtime;
    c->repl_put_online_on_ack = 0;
    listAddNodeTail(server.slaves,c);  /* append this slave to the tail of the slaves list */
   
    buflen = snprintf(buf,sizeof(buf),"+CONTINUE\r\n");
    if (write(c->fd,buf,buflen) != buflen) {
        freeClientAsync(c);
        return C_OK;
    }
    /* Send the commands held in the backlog to the slave */
    psync_len = addReplyReplicationBacklog(c,psync_offset);
    serverLog(LL_NOTICE,
        "Partial resynchronization request from %s accepted. Sending %lld bytes of backlog starting from offset %lld.",
            replicationGetSlaveName(c),
            psync_len, psync_offset);
   

    refreshGoodSlavesCount();
    return C_OK; /* The caller can return, no full resync needed. */

need_full_resync:
    /* A full resync is needed: return C_ERR */
    return C_ERR;
}
/* Full synchronization implemented with bgsave */
int startBgsaveForReplication(int mincapa) {
    int retval;
    int socket_target = server.repl_diskless_sync && (mincapa & SLAVE_CAPA_EOF);
    listIter li;
    listNode *ln;

    serverLog(LL_NOTICE,"Starting BGSAVE for SYNC with target: %s",
        socket_target ? "slaves sockets" : "disk");
    /* Replication strategy: disk-backed or socket */
    if (socket_target)
        retval = rdbSaveToSlavesSockets();
    else
        retval = rdbSaveBackground(server.rdb_filename); /* generate a new RDB file */

    /* If we failed to BGSAVE, remove the slaves waiting for a full
     * resynchronization from the list of slaves, inform them with
     * an error about what happened, close the connection ASAP. */
    if (retval == C_ERR) {
        serverLog(LL_WARNING,"BGSAVE for replication failed");
        listRewind(server.slaves,&li);
        while((ln = listNext(&li))) {
            client *slave = ln->value;

            if (slave->replstate == SLAVE_STATE_WAIT_BGSAVE_START) {
                slave->flags &= ~CLIENT_SLAVE;
                listDelNode(server.slaves,ln);
                addReplyError(slave,
                    "BGSAVE failed, replication can't continue");
                slave->flags |= CLIENT_CLOSE_AFTER_REPLY;
            }
        }
        return retval;
    }

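    /* Disk-backed strategy: walk the slave list and set up each waiting slave for the full resync. For the socket strategy, rdbSaveToSlavesSockets() above has already done this */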
    if (!socket_target) {
        listRewind(server.slaves,&li);
        while((ln = listNext(&li))) {
            client *slave = ln->value;
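            /* Prepare each waiting slave in turn (state and initial offset); the dump file itself is sent once the BGSAVE completes */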
            if (slave->replstate == SLAVE_STATE_WAIT_BGSAVE_START) {
                    replicationSetupSlaveForFullResync(slave,
                            getPsyncInitialOffset());
            }
        }
    }

    /* Flush the script cache, since we need that slave differences are
     * accumulated without requiring slaves to match our cached scripts. */
    if (retval == C_OK) replicationScriptCacheFlush();
    return retval;
}
