Redis Design and Implementation: Dissecting the Networking Library (client creation/release, receiving commands, sending replies, etc.)

1 Preface

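      networking.c looked daunting at first: the book barely covers it, so I mostly had to read the code, and the flow is fairly long. Thanks to articles by others online, I finally understood it. This post covers: handling client connections, creating and releasing clients, receiving client commands, replying to clients, and how client commands are executed.

I will add the original source once I track it down. To keep the length manageable, this post covers only the server-side flow; the client side is left for the next post.

Compare with my own rough sketch of the server-side flow. [Figure: hand-drawn sketch of the server-side flow]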

2 Establishing the connection

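 To send commands to a Redis server, a client must first establish a TCP connection to it. As seen when analyzing Redis startup, the initialization step registers the event handlers; the source is in server.c, in the function initServer():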

// Initialize the server
void initServer(void) {
    int j;
    
   ...

    /* Open the TCP listening socket for the user commands. */
    // Listen on the TCP port
    if (server.port != 0 &&
        listenToPort(server.port,server.ipfd,&server.ipfd_count) == C_ERR)
        exit(1);

    /* Open the listening Unix domain socket. */
    // Open the Unix domain socket
    if (server.unixsocket != NULL) {
        unlink(server.unixsocket); /* don't care if this fails */
        server.sofd = anetUnixServer(server.neterr,server.unixsocket,
            server.unixsocketperm, server.tcp_backlog);
        if (server.sofd == ANET_ERR) {
            serverLog(LL_WARNING, "Opening Unix socket: %s", server.neterr);
            exit(1);
        }
        anetNonBlock(NULL,server.sofd);
    }

...

  
    /* Create the serverCron() time event, that's our main way to process
     * background operations. */
    // Create a time event and install serverCron() as its handler
    if(aeCreateTimeEvent(server.el, 1, serverCron, NULL, NULL) == AE_ERR) {
        serverPanic("Can't create the serverCron time event.");
        exit(1);
    }

    /* Create an event handler for accepting new connections in TCP and Unix
     * domain sockets. */
    // Create a file event for each listening TCP socket and install acceptTcpHandler() to accept connections
    for (j = 0; j < server.ipfd_count; j++) {
        if (aeCreateFileEvent(server.el, server.ipfd[j], AE_READABLE,
            acceptTcpHandler,NULL) == AE_ERR)
            {
                serverPanic(
                    "Unrecoverable error creating server.ipfd file event.");
            }
    }
    // Create a file event for the Unix domain socket and install acceptUnixHandler() to accept local connections
    if (server.sofd > 0 && aeCreateFileEvent(server.el,server.sofd,AE_READABLE,
        acceptUnixHandler,NULL) == AE_ERR) serverPanic("Unrecoverable error creating server.sofd file event.");

 
  ...
}

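     The Redis configuration file has a bind directive that selects which network interfaces to accept requests from. At startup Redis listens on those interfaces and stores the listening fds in the server.ipfd array. It then loops over them and registers an AE_READABLE (readable) event on each, with acceptTcpHandler as the handler.

  aeCreateFileEvent (in ae.c) associates an aeFileEvent with the given fd; the aeFileEvent holds the functions that handle read and write events.

/*
 * Watch fd for the events given in mask;
 * when fd becomes ready, call proc.
 */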
int aeCreateFileEvent(aeEventLoop *eventLoop, int fd, int mask,
        aeFileProc *proc, void *clientData)
{
    if (fd >= eventLoop->setsize) {
        errno = ERANGE;
        return AE_ERR;
    }
    // Look up the aeFileEvent slot for this fd; the aeFileProc is bound to it below
    aeFileEvent *fe = &eventLoop->events[fd];
    // Register the event with the backend (epoll, select, etc.)
    if (aeApiAddEvent(eventLoop, fd, mask) == -1)
        return AE_ERR;
    fe->mask |= mask;
    // Here a read event is being registered, and the aeFileProc is acceptTcpHandler
    if (mask & AE_READABLE) fe->rfileProc = proc;
    if (mask & AE_WRITABLE) fe->wfileProc = proc;
    // Private data passed to the handler
    fe->clientData = clientData;
    // Update the highest fd tracked by the event loop if needed
    if (fd > eventLoop->maxfd)
        eventLoop->maxfd = fd;
    return AE_OK;
}

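When a client tries to connect, aeApiPoll returns the number of ready events (1 in this case). The fd that fired is read from eventLoop->fired[j], which leads to the matching aeFileEvent. Because the listening socket was registered for AE_READABLE, fe->rfileProc (acceptTcpHandler) is called to handle the connection. This logic lives in aeProcessEvents in ae.c; see: https://blog.csdn.net/bohu83/article/details/84964656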
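The register-then-dispatch idea can be sketched in a few lines. This is only an illustration under my own assumptions: mini_loop, mini_create_file_event, mini_dispatch and count_handler are hypothetical names, the table is tiny, and there is no real polling backend. It just shows a table indexed by fd that remembers a read and a write callback, as aeCreateFileEvent does.

```c
#include <stddef.h>

#define EV_READABLE 1
#define EV_WRITABLE 2
#define TABLE_SIZE 16   /* assumption: tiny fixed table, enough for the demo */

typedef void file_proc(int fd, void *client_data, int mask);

typedef struct {
    int mask;            /* events registered for this fd */
    file_proc *rproc;    /* handler for readable events */
    file_proc *wproc;    /* handler for writable events */
    void *client_data;   /* opaque pointer handed back to the handler */
} file_event;

typedef struct {
    file_event events[TABLE_SIZE];   /* indexed directly by fd */
} mini_loop;

/* Remember proc for the events in mask. Returns 0 on success, -1 if fd is out of range. */
int mini_create_file_event(mini_loop *l, int fd, int mask,
                           file_proc *proc, void *data) {
    if (fd < 0 || fd >= TABLE_SIZE) return -1;
    file_event *fe = &l->events[fd];
    fe->mask |= mask;
    if (mask & EV_READABLE) fe->rproc = proc;
    if (mask & EV_WRITABLE) fe->wproc = proc;
    fe->client_data = data;
    return 0;
}

/* Call the handlers matching the fired events. Returns how many handlers ran. */
int mini_dispatch(mini_loop *l, int fd, int fired) {
    if (fd < 0 || fd >= TABLE_SIZE) return 0;
    file_event *fe = &l->events[fd];
    int called = 0;
    if ((fe->mask & fired & EV_READABLE) && fe->rproc) {
        fe->rproc(fd, fe->client_data, fired);
        called++;
    }
    if ((fe->mask & fired & EV_WRITABLE) && fe->wproc) {
        fe->wproc(fd, fe->client_data, fired);
        called++;
    }
    return called;
}

/* Demo handler: counts how often it was invoked. */
void count_handler(int fd, void *client_data, int mask) {
    (void)fd;
    (void)mask;
    (*(int *)client_data)++;
}
```

Registering count_handler for EV_READABLE on some fd and then dispatching EV_READABLE for it runs the handler once; dispatching EV_WRITABLE does nothing because no write handler was registered.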

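acceptTcpHandler accepts the socket connection from the client and, through acceptCommonHandler, calls createClient to create the client. The source is in networking.c:

// Handler that accepts TCP connections
void acceptTcpHandler(aeEventLoop *el, int fd, void *privdata, int mask) {
    int cport, cfd, max = MAX_ACCEPTS_PER_CALL; // max connections accepted per call (1000)
    char cip[NET_IP_STR_LEN]; // peer IP address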
    UNUSED(el);
    UNUSED(mask);
    UNUSED(privdata);

    while(max--) {
        // accept() the client connection
        cfd = anetTcpAccept(server.neterr, fd, cip, sizeof(cip), &cport);
        if (cfd == ANET_ERR) {
            if (errno != EWOULDBLOCK)
                serverLog(LL_WARNING,
                    "Accepting client connection: %s", server.neterr);
            return;
        } // log the accepted connection
        serverLog(LL_VERBOSE,"Accepted %s:%d", cip, cport);
        // Create the client state for this connection
        acceptCommonHandler(cfd,0,cip);
    }
}

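networking.c has two connection handlers, one for TCP connections and one for local (Unix domain socket) connections: acceptTcpHandler and acceptUnixHandler. Here is acceptUnixHandler.

// Handler that accepts local (Unix domain socket) connections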
void acceptUnixHandler(aeEventLoop *el, int fd, void *privdata, int mask) {
    int cfd, max = MAX_ACCEPTS_PER_CALL;
    UNUSED(el);
    UNUSED(mask);
    UNUSED(privdata);

    while(max--) {
        // accept() the local client connection
        cfd = anetUnixAccept(server.neterr, fd); // returns the fd of the accepted connection
        if (cfd == ANET_ERR) {
            if (errno != EWOULDBLOCK)
                serverLog(LL_WARNING,
                    "Accepting client connection: %s", server.neterr);
            return;
        }
        serverLog(LL_VERBOSE,"Accepted connection to %s", server.unixsocket);
        acceptCommonHandler(cfd,CLIENT_UNIX_SOCKET,NULL);
    }
}

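Both call the common handler acceptCommonHandler (in networking.c), which creates the client data structure on the server to hold communication state such as the peer address and the data buffers.

// Set up the client state for a new connection
static void acceptCommonHandler(int fd, int flags, char *ip) {
    client *c;
    // Create a new client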
    if ((c = createClient(fd)) == NULL) {
        serverLog(LL_WARNING,
            "Error registering fd event for the new client: %s (fd=%d)",
            strerror(errno),fd);
        close(fd); /* May be already closed, just ignore errors */
        return;
    }
    /* If maxclient directive is set and this is one client more... close the
     * connection. Note that we create the client instead to check before
     * for this condition, since now the socket is already set in non-blocking
     * mode and we can send an error for free using the Kernel I/O */
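    // If adding this client pushes the count past the configured maximum,
    // write an error to the new client and close it.
    // The client is created before the check so the error is easy to write.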
    if (listLength(server.clients) > server.maxclients) {
        char *err = "-ERR max number of clients reached\r\n";

        /* That's a best effort error message, don't check write errors */
        if (write(c->fd,err,strlen(err)) == -1) {
            /* Nothing to do, Just to avoid the warning... */
        }
        // Count the rejected connection
        server.stat_rejected_conn++;
        freeClient(c);
        return;
    }

    /* If the server is running in protected mode (the default) and there
     * is no password set, nor a specific interface is bound, we don't accept
     * requests from non loopback interfaces. Instead we try to explain the
     * user what to do to fix it if needed. */
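    // If the server runs in protected mode (the default), with no password set
    // and no specific interface bound, requests from non-loopback interfaces are
    // refused, and the reply explains how the user can fix this.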
    if (server.protected_mode &&
        server.bindaddr_count == 0 &&
        server.requirepass == NULL &&
        !(flags & CLIENT_UNIX_SOCKET) &&
        ip != NULL)
    {
        if (strcmp(ip,"127.0.0.1") && strcmp(ip,"::1")) {
            // Amusingly long for an error message; an error code plus a pointer to the manual might be nicer.
            char *err =
                "-DENIED Redis is running in protected mode because protected "
                "mode is enabled, no bind address was specified, no "
                "authentication password is requested to clients. In this mode "
                "connections are only accepted from the loopback interface. "
                "If you want to connect from external computers to Redis you "
                "may adopt one of the following solutions: "
                "1) Just disable protected mode sending the command "
                "'CONFIG SET protected-mode no' from the loopback interface "
                "by connecting to Redis from the same host the server is "
                "running, however MAKE SURE Redis is not publicly accessible "
                "from internet if you do so. Use CONFIG REWRITE to make this "
                "change permanent. "
                "2) Alternatively you can just disable the protected mode by "
                "editing the Redis configuration file, and setting the protected "
                "mode option to 'no', and then restarting the server. "
                "3) If you started the server manually just for testing, restart "
                "it with the '--protected-mode no' option. "
                "4) Setup a bind address or an authentication password. "
                "NOTE: You only need to do one of the above things in order for "
                "the server to start accepting connections from the outside.\r\n";
            if (write(c->fd,err,strlen(err)) == -1) {
                /* Nothing to do, Just to avoid the warning... */
            }
            // Count the rejected connection
            server.stat_rejected_conn++;
            freeClient(c);
            return;
        }
    }
    // Count the accepted connection
    server.stat_numconnections++;
    // Apply the connection flags to the client
    c->flags |= flags;
}

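At this point the connection between client and server is established: the server has registered the command reader (readQueryFromClient) on the client socket and waits for the client to send commands.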
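The protected-mode decision in acceptCommonHandler boils down to a small predicate. The sketch below restates it in isolation; guard_config, is_loopback_ip and should_reject are hypothetical names of mine, not Redis identifiers, and the struct only carries the three settings the check reads.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the server configuration used by the check. */
typedef struct {
    int protected_mode;        /* non-zero when protected mode is on */
    int bindaddr_count;        /* number of explicit bind addresses */
    const char *requirepass;   /* NULL when no password is set */
} guard_config;

/* Returns 1 when ip is the textual IPv4 or IPv6 loopback address. */
int is_loopback_ip(const char *ip) {
    return strcmp(ip, "127.0.0.1") == 0 || strcmp(ip, "::1") == 0;
}

/* Returns 1 when the connection should be refused, following the same
 * conditions as the listing above: protected mode on, nothing bound,
 * no password, a TCP connection, and a non-loopback peer. */
int should_reject(const guard_config *cfg, int is_unix_socket, const char *ip) {
    if (!cfg->protected_mode) return 0;
    if (cfg->bindaddr_count != 0) return 0;
    if (cfg->requirepass != NULL) return 0;
    if (is_unix_socket || ip == NULL) return 0;
    return !is_loopback_ip(ip);
}
```

Note that, like the original, this compares strings exactly, so an address such as 127.0.0.2 is treated as non-loopback.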

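3 Creating and releasing a client

Handling a connection request requires creating a client to hold its state. Here is the source of createClient:

// Create a client
client *createClient(int fd) {
    // Allocate the structure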
    client *c = zmalloc(sizeof(client));

    /* passing -1 as fd it is possible to create a non connected client.
     * This is useful since all the commands needs to be executed
     * in the context of a client. When commands are executed in other
     * contexts (for instance a Lua script) we need a non connected client. */
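    // fd == -1 creates a fake client with no network connection, e.g. for running Lua scripts.
    // fd != -1 creates a client backed by a network connection.
    if (fd != -1) {
        // Put the fd in non-blocking mode
        anetNonBlock(NULL,fd);
        // Disable Nagle's algorithm (TCP_NODELAY) so small replies are sent to the client without delay
        anetEnableTcpNoDelay(NULL,fd);
        // Enable TCP keepalive if configured
        if (server.tcpkeepalive)
            anetKeepAlive(NULL,fd,server.tcpkeepalive);
        // Register a read event on the event loop (start receiving command requests)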
        if (aeCreateFileEvent(server.el,fd,AE_READABLE,
            readQueryFromClient, c) == AE_ERR)
        {
            close(fd);
            zfree(c);
            return NULL;
        }
    }
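    // Initialize the fields
    // Select database 0 by default
    selectDb(c,0);
    // Assign the client ID
    c->id = server.next_client_id++;
    // Socket fd
    c->fd = fd;
    // Client name
    c->name = NULL;
    // Write position in the static reply buffer
    c->bufpos = 0;
    // Query (input) buffer
    c->querybuf = sdsempty();
    // Peak size of the query buffer
    c->querybuf_peak = 0;
    // Request protocol type
    c->reqtype = 0;
    // Number of arguments
    c->argc = 0;
    // Argument vector
    c->argv = NULL;
    // Current command and last executed command
    c->cmd = c->lastcmd = NULL;
    // Number of multi bulk arguments still to read
    c->multibulklen = 0;
    // Length of the bulk argument being read (-1: not yet known)
    c->bulklen = -1;
    // Bytes already sent
    c->sentlen = 0;
    // Client flags
    c->flags = 0;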
    // Creation time and time of last interaction
    c->ctime = c->lastinteraction = server.unixtime;
    // Authentication state
    c->authenticated = 0;
    // Replication state, initially none
    c->replstate = REPL_STATE_NONE;
    // Whether to put the slave online (install its write handler) on the first ACK
    c->repl_put_online_on_ack = 0;
    // Replication offset
    c->reploff = 0;
    // Offset received via the ACK command
    c->repl_ack_off = 0;
    // Time the offset was received via the ACK command
    c->repl_ack_time = 0;
    // Listening port of the slave (used when the client is a slave)
    c->slave_listening_port = 0;
    // IP address of the slave
    c->slave_ip[0] = '\0';
    // Capabilities of the slave
    c->slave_capa = SLAVE_CAPA_NONE;
    // Reply list
    c->reply = listCreate();
    // Total bytes held in the reply list
    c->reply_bytes = 0;
    // Time the output buffer soft limit was first reached
    c->obuf_soft_limit_reached_time = 0;
    // Free and dup methods for the reply list
    listSetFreeMethod(c->reply,decrRefCountVoid);
    listSetDupMethod(c->reply,dupClientReplyValue);
    // Blocking type
    c->btype = BLOCKED_NONE;
    // Blocking timeout
    c->bpop.timeout = 0;
    // Dict of the keys this client is blocked on
    c->bpop.keys = dictCreate(&setDictType,NULL);
    // Destination key that receives the pushed element (used by BRPOPLPUSH)
    c->bpop.target = NULL;
    // Number of replicas being waited for
    c->bpop.numreplicas = 0;
    // Replication offset to reach
    c->bpop.reploffset = 0;
    // Global replication offset at the last write
    c->woff = 0;
    // Watched keys
    c->watched_keys = listCreate();
    // Subscribed channels
    c->pubsub_channels = dictCreate(&setDictType,NULL);
    // Subscribed patterns
    c->pubsub_patterns = listCreate();
    // Cached peer ID, i.e. ip:port
    c->peerid = NULL;
    // Free and match methods for the pattern list
    listSetFreeMethod(c->pubsub_patterns,decrRefCountVoid);
    listSetMatchMethod(c->pubsub_patterns,listMatchObjects);
    // If this is not a fake client, append it to the server's client list
    if (fd != -1) listAddNodeTail(server.clients,c);
    // Initialize the transaction (MULTI) state
    initClientMultiState(c);
    return c;
}

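Depending on the fd passed in, createClient builds a client for different scenarios. The fd is the file descriptor returned when the server accepts the client's connect.

  • fd == -1: create a client with no network connection, mainly used when running Lua scripts.
  • fd != -1: a normal client connection was accepted, so a network-backed client is created: a file event is registered to watch the fd for readability and fires when the client sends data. Nagle's algorithm is also disabled at this point.

The remaining fields are covered by the comments. The key point is that createClient registers the readQueryFromClient handler and initializes the client.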
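For readers who have not looked inside anet.c, the two socket tweaks applied to a connected fd can be approximated with plain POSIX calls. This is a sketch under my own naming (set_nonblocking and set_tcp_nodelay are hypothetical helpers); it is not the anet implementation, which also formats error messages.

```c
#include <fcntl.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Switch fd to non-blocking mode so read() and write() return EAGAIN
 * instead of stalling the single-threaded event loop.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return -1;
    return 0;
}

/* Disable Nagle's algorithm on a TCP socket so small replies are not
 * held back waiting to be coalesced. Returns 0 on success, -1 on error
 * (for example when fd is not a TCP socket). */
int set_tcp_nodelay(int fd) {
    int yes = 1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &yes, sizeof(yes)) == -1)
        return -1;
    return 0;
}
```

Both helpers would be called once, right after accept() returns the new fd and before the read handler is registered.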

Next, releasing a client:

/* Remove the specified client from global lists where the client could
 * be referenced, not including the Pub/Sub channels.
 * This is used by freeClient() and replicationCacheMaster(). */
// Remove the given client from the global lists that track client state
void unlinkClient(client *c) {
    listNode *ln;

    /* If this is marked as current client unset it. */
    // If this is the current client, clear the reference
    if (server.current_client == c) server.current_client = NULL;

    /* Certain operations must be done only if the client has an active socket.
     * If the client was already unlinked or if it's a "fake client" the
     * fd is already set to -1. */
    // Only for clients with a live socket: not a fake client and not already unlinked
    if (c->fd != -1) {
        /* Remove from the list of active clients. */
        // Find the client's node in the client list
        ln = listSearchKey(server.clients,c);
        serverAssert(ln != NULL);
        // Delete the node
        listDelNode(server.clients,ln);

        /* Unregister async I/O handlers and close the socket. */
        // Stop watching the client's fd in the event loop
        aeDeleteFileEvent(server.el,c->fd,AE_READABLE);
        aeDeleteFileEvent(server.el,c->fd,AE_WRITABLE);
        // Close the file descriptor
        close(c->fd);
        c->fd = -1;
    }

    /* Remove from the list of pending writes if needed. */
    // Remove the client from the pending-write list if it is queued there
    if (c->flags & CLIENT_PENDING_WRITE) {
        // Find the client in the list of clients with pending writes
        ln = listSearchKey(server.clients_pending_write,c);
        serverAssert(ln != NULL);
        // Delete the node
        listDelNode(server.clients_pending_write,ln);
        // Clear the flag
        c->flags &= ~CLIENT_PENDING_WRITE;
    }

    /* When client was just unblocked because of a blocking operation,
     * remove it from the list of unblocked clients. */
    // If the client was just unblocked, remove it from the unblocked-clients list
    if (c->flags & CLIENT_UNBLOCKED) {
        ln = listSearchKey(server.unblocked_clients,c);
        serverAssert(ln != NULL);
        listDelNode(server.unblocked_clients,ln);
        c->flags &= ~CLIENT_UNBLOCKED;
    }
}
// Release a client
void freeClient(client *c) {
    listNode *ln;

    /* If it is our master that's beging disconnected we should make sure
     * to cache the state to try a partial resynchronization later.
     *
     * Note that before doing this we make sure that the client is not in
     * some unexpected state, by checking its flags. */
    // If this client is our master, its state may need to be cached so that a PSYNC can be attempted later
    if (server.master && c->flags & CLIENT_MASTER) {
        serverLog(LL_WARNING,"Connection with master lost.");
        if (!(c->flags & (CLIENT_CLOSE_AFTER_REPLY|
                          CLIENT_CLOSE_ASAP|
                          CLIENT_BLOCKED|
                          CLIENT_UNBLOCKED)))
        {   // Master disconnected: cache the client so the link can be resumed quickly instead of resynchronizing from scratch
            replicationCacheMaster(c);
            return;
        }
    }

    /* Log link disconnection with slave */
    // Log the lost connection with a slave
    if ((c->flags & CLIENT_SLAVE) && !(c->flags & CLIENT_MONITOR)) {
        serverLog(LL_WARNING,"Connection with slave %s lost.",
            replicationGetSlaveName(c));
    }

    /* Free the query buffer */
    // Free the input (query) buffer
    sdsfree(c->querybuf);
    c->querybuf = NULL;

    /* Deallocate structures used to block on blocking ops. */    
    if (c->flags & CLIENT_BLOCKED) unblockClient(c); // unblock the client
    dictRelease(c->bpop.keys); // release the dict of blocking keys

    /* UNWATCH all the keys */
    // Clear the WATCH state
    unwatchAllKeys(c);
    listRelease(c->watched_keys);

    /* Unsubscribe from all the pubsub channels */
    // Unsubscribe from all channels and patterns
    pubsubUnsubscribeAllChannels(c,0);
    pubsubUnsubscribeAllPatterns(c,0);
    dictRelease(c->pubsub_channels);
    listRelease(c->pubsub_patterns);

    /* Free data structures. */
    listRelease(c->reply); // release the reply list
    freeClientArgv(c); // free the argument objects

    /* Unlink the client: this will close the socket, remove the I/O
     * handlers, and remove references of the client from different
     * places where active clients may be referenced. */
    // Drop all references (closes the socket and removes the client from the event loop)
    unlinkClient(c);

    /* Master/slave cleanup Case 1:
     * we lost the connection with a slave. */
    // The disconnected client is a slave
    if (c->flags & CLIENT_SLAVE) {
        // If the RDB file was being sent to this slave
        if (c->replstate == SLAVE_STATE_SEND_BULK) {
            // Close the fd of the RDB file being sent to the slave
            if (c->repldbfd != -1) close(c->repldbfd);
            // Free the preamble (the RDB file size in string form)
            if (c->replpreamble) sdsfree(c->replpreamble);
        }
        // Pick the list holding this client: the monitor list or the slave list
        list *l = (c->flags & CLIENT_MONITOR) ? server.monitors : server.slaves;
        // Find the client's node
        ln = listSearchKey(l,c);
        serverAssert(ln != NULL);
        // Delete it
        listDelNode(l,ln);
        /* We need to remember the time when we started to have zero
         * attached slaves, as after some time we'll free the replication
         * backlog. */
        // If the slave list is now empty, remember the current time
        if (c->flags & CLIENT_SLAVE && listLength(server.slaves) == 0)
            server.repl_no_slaves_since = server.unixtime;
        refreshGoodSlavesCount(); // refresh the count of healthy slaves
    }

    /* Master/slave cleanup Case 2:
     * we lost the connection with the master. */
    // The disconnected client is our master
    if (c->flags & CLIENT_MASTER) replicationHandleMasterDisconnection();

    /* If this client was scheduled for async freeing we need to remove it
     * from the queue. */
    // If the client was scheduled to be closed, find and remove it from clients_to_close
    if (c->flags & CLIENT_CLOSE_ASAP) {
        ln = listSearchKey(server.clients_to_close,c);
        serverAssert(ln != NULL);
        listDelNode(server.clients_to_close,ln);
    }

    /* Release other dynamically allocated client structure fields,
     * and finally release the client structure itself. */
    // Release the client name, if any
    if (c->name) decrRefCount(c->name);
    zfree(c->argv); // free the argument vector
    freeClientMultiState(c); // clear the transaction state
    sdsfree(c->peerid);
    zfree(c); // free the client structure itself
}

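  4 Receiving client requests

  The client writes a command in RESP protocol format to the socket and waits for the server to return the result. The protocol itself is not covered in this post.

    To trace a client request: as analyzed in createClient(), once a client connects the server obtains a file descriptor fd and registers a readable event on it. When the client sends a command, the AE_READABLE event fires and the callback readQueryFromClient() reads the command from fd into the input buffer querybuf. (The server reads the command through the readable event, via the same numevents = aeApiPoll(eventLoop, tvp) path described in section 2, except that the handler is now readQueryFromClient.) Below is this handler, readQueryFromClient, from networking.c.

// Read the client's query into its query buffer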
void readQueryFromClient(aeEventLoop *el, int fd, void *privdata, int mask) {
    client *c = (client*) privdata;
    int nread, readlen;
    size_t qblen;
    UNUSED(el);
    UNUSED(mask);

    // Default read length, 16K
    readlen = PROTO_IOBUF_LEN;
    /* If this is a multi bulk request, and we are processing a bulk reply
     * that is large enough, try to maximize the probability that the query
     * buffer contains exactly the SDS string representing the object, even
     * at the risk of requiring more read(2) calls. This way the function
     * processMultiBulkBuffer() can avoid copying buffers to create the
     * Redis Object representing the argument. */
    // For a multi bulk request with a large argument, cap readlen at what is still missing from that argument
    if (c->reqtype == PROTO_REQ_MULTIBULK && c->multibulklen && c->bulklen != -1
        && c->bulklen >= PROTO_MBULK_BIG_ARG)
    {
        int remaining = (unsigned)(c->bulklen+2)-sdslen(c->querybuf);

        if (remaining < readlen) readlen = remaining;
    }
    
    // Current length of the query buffer.
    // After a short read, leftover data may remain in the buffer
    // that does not yet form a complete protocol command.
    qblen = sdslen(c->querybuf);
    // Update the buffer's peak size
    if (c->querybuf_peak < qblen) c->querybuf_peak = qblen;
    // Make room in querybuf
    c->querybuf = sdsMakeRoomFor(c->querybuf, readlen);
    // Read the command sent by the client into the input buffer (querybuf)
    nread = read(fd, c->querybuf+qblen, readlen);
    if (nread == -1) { // error handling
        if (errno == EAGAIN) {
            return;
        } else {
            serverLog(LL_VERBOSE, "Reading from client: %s",strerror(errno));
            freeClient(c);
            return;
        }
      // The peer closed the connection (EOF)
    } else if (nread == 0) {
        serverLog(LL_VERBOSE, "Client closed connection");
        freeClient(c);
        return;
    }
    
    // Update the used and free lengths of the input buffer
    sdsIncrLen(c->querybuf,nread);
    // Update the time of the last interaction with the client
    c->lastinteraction = server.unixtime;
    // If this client is our master, advance the replication offset
    if (c->flags & CLIENT_MASTER) c->reploff += nread;
    // Update the count of bytes read from the network
    server.stat_net_input_bytes += nread;
    // If the input buffer exceeds the configured maximum (PROTO_MAX_QUERYBUF_LEN, 1GB),
    // log it and free the client
    if (sdslen(c->querybuf) > server.client_max_querybuf_len) {
        // Render the client info as an sds string
        sds ci = catClientInfoString(sdsempty(),c), bytes = sdsempty();
        // Store an escaped copy of the first 64 bytes of the input buffer in bytes
        bytes = sdscatrepr(bytes,c->querybuf,64);
        // Write to the log
        serverLog(LL_WARNING,"Closing client that reached max query buffer length: %s (qbuf initial bytes: %s)", ci, bytes);
        sdsfree(ci);  // release the strings
        sdsfree(bytes);
        freeClient(c);
        return;
    }
    processInputBuffer(c);  // process the commands in the input buffer
}

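  readQueryFromClient ends by calling processInputBuffer(), which parses the commands in the buffer. Redis accepts two kinds of request, inline commands (for example typed through telnet) and multi bulk commands (as sent by redis-cli), so there are two protocol formats and two parsers: processInlineBuffer and processMultibulkBuffer. They deal with protocol details that matter little to the main flow, so they are not expanded here. The source of processInputBuffer:

// Process the commands in the client's input buffer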
void processInputBuffer(client *c) {
    server.current_client = c;
    /* Keep processing while there is something in the input buffer */
    // Process as much of the query buffer as possible.
    // After a short read, leftover data may remain in the buffer
    // that does not yet form a complete protocol command;
    // it has to wait for the next readable event.
    while(sdslen(c->querybuf)) {
        /* Return if clients are paused. */
        // Clients are paused: stop processing
        if (!(c->flags & CLIENT_SLAVE) && clientsArePaused()) break;

        /* Immediately abort if the client is in the middle of something. */
        // The client is blocked: stop processing
        if (c->flags & CLIENT_BLOCKED) break;

        /* CLIENT_CLOSE_AFTER_REPLY closes the connection once the reply is
         * written to the client. Make sure to not let the reply grow after
         * this flag has been set (i.e. don't process more commands).
         *
         * The same applies for clients we want to terminate ASAP. */
        // The client is flagged to be closed, so there is no point processing more commands
        if (c->flags & (CLIENT_CLOSE_AFTER_REPLY|CLIENT_CLOSE_ASAP)) break;

        /* Determine request type when unknown. */
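        // If the request type is not yet known, determine it.
        // Roughly: multi bulk requests come from regular clients,
        // while inline requests come from telnet.
        if (!c->reqtype) {
            if (c->querybuf[0] == '*') {
                // Multi bulk request
                c->reqtype = PROTO_REQ_MULTIBULK;
            } else { // otherwise an inline request, e.g. from telnet
                c->reqtype = PROTO_REQ_INLINE;
            }
        }

        /* Turn the buffer contents into a command and its arguments */
        // Inline request
        if (c->reqtype == PROTO_REQ_INLINE) {
            // Parse the inline command into objects stored in the client's argument vector
            if (processInlineBuffer(c) != C_OK) break;
        // Multi bulk request
        } else if (c->reqtype == PROTO_REQ_MULTIBULK) {
            // Convert the protocol data in querybuf into objects in the client's argument vector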
            if (processMultibulkBuffer(c) != C_OK) break;
        } else {
            serverPanic("Unknown request type");
        }

        /* Multibulk processing could see a <= 0 length. */
        // No arguments: just reset the client
        if (c->argc == 0) {
            resetClient(c);
        } else {
            /* Only reset the client when the command was executed. */
            // Reset the client once the command has executed successfully
            if (processCommand(c) == C_OK)
                resetClient(c);
            /* freeMemoryIfNeeded may flush slave output buffers. This may result
             * into a slave, that may be the active client, to be freed. */
            if (server.current_client == NULL) break;
        }
    } // Done: clear the current client (used for crash reports)
    server.current_client = NULL;
}

So once a command has been parsed, processCommand is called to execute it.
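Since the two parsers are skipped above, here is a rough idea of what parsing a multi bulk request involves, including the "not enough data yet" case that a short read produces. This is a simplified sketch of mine, not processMultibulkBuffer: parsed_cmd, read_count and parse_multibulk are hypothetical names, arguments are copied into small fixed arrays, and a count below 1 is treated as an error for brevity.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 8        /* assumption: small limits keep the sketch simple */
#define MAX_ARG_LEN 64

typedef struct {
    int argc;
    char argv[MAX_ARGS][MAX_ARG_LEN];
} parsed_cmd;

/* Read "<prefix><digits>\r\n" at buf[*pos].
 * Returns 0 on success, 1 if the line is incomplete, -1 on protocol error. */
static int read_count(const char *buf, size_t len, size_t *pos,
                      char prefix, long *out) {
    if (*pos >= len) return 1;
    if (buf[*pos] != prefix) return -1;
    const char *start = buf + *pos + 1;
    const char *cr = memchr(start, '\r', len - *pos - 1);
    if (cr == NULL || (size_t)(cr - buf) + 1 >= len) return 1;
    if (cr[1] != '\n' || cr == start) return -1;
    long v = 0;
    for (const char *p = start; p < cr; p++) {
        if (*p < '0' || *p > '9') return -1;
        v = v * 10 + (*p - '0');
        if (v > 1000000) return -1;   /* arbitrary sanity cap */
    }
    *out = v;
    *pos = (size_t)(cr - buf) + 2;
    return 0;
}

/* Parse one "*N\r\n$len\r\narg\r\n..." request.
 * Returns the number of bytes consumed when a full command was parsed,
 * 0 when more data is needed, and -1 on a protocol error. */
long parse_multibulk(const char *buf, size_t len, parsed_cmd *out) {
    size_t pos = 0;
    long count, blen;
    int rc = read_count(buf, len, &pos, '*', &count);
    if (rc != 0) return rc == 1 ? 0 : -1;
    if (count < 1 || count > MAX_ARGS) return -1;
    out->argc = 0;
    for (long i = 0; i < count; i++) {
        rc = read_count(buf, len, &pos, '$', &blen);
        if (rc != 0) return rc == 1 ? 0 : -1;
        if (blen >= MAX_ARG_LEN) return -1;
        if (len - pos < (size_t)blen + 2) return 0;   /* payload not complete */
        if (buf[pos + blen] != '\r' || buf[pos + blen + 1] != '\n') return -1;
        memcpy(out->argv[i], buf + pos, (size_t)blen);
        out->argv[i][blen] = '\0';
        pos += (size_t)blen + 2;
        out->argc++;
    }
    return (long)pos;
}
```

Feeding it the bytes of SET key value in RESP form yields argc == 3; feeding it only a prefix of those bytes returns 0, which is the situation where the real server keeps the data in querybuf and waits for the next readable event.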

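5 Executing the command on the server

By this point the server has parsed the client's request and stored the arguments in the client's argv. Execution continues in processCommand, in server.c.

The function is long. It first looks up argv[0] in the command dictionary, then runs a series of checks: whether the client has passed AUTH, whether the number of arguments is correct, whether cluster mode requires redirecting the request, whether the maximum memory limit is hit, and so on. Here the focus is only on the execution step: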

int processCommand(client *c) {
....
/* Exec the command */
    // Execute the command.
    // The client is inside a transaction and the command is not EXEC, DISCARD, MULTI or WATCH
    if (c->flags & CLIENT_MULTI &&
        c->cmd->proc != execCommand && c->cmd->proc != discardCommand &&
        c->cmd->proc != multiCommand && c->cmd->proc != watchCommand)
    {
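        // Every command other than those four is queued in the transaction
        queueMultiCommand(c);
        addReply(c,shared.queued);
    // Execute a normal command
    } else {
        call(c,CMD_CALL_FULL);
        // Record the global replication offset at this write
        c->woff = server.master_repl_offset;
        // If keys that clients are blocked on (e.g. via BLPOP) became ready, serve those clients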
        if (listLength(server.ready_keys))
            handleClientsBlockedOnLists();
    }
    return C_OK;
}

The call function in the else branch is what actually invokes the command implementation:

c->cmd->proc(c);

The client's cmd field is a redisCommand structure, defined in server.h:

// Redis command structure
struct redisCommand {
    // Command name
    char *name;
    // Function pointer to the command implementation:
    // returns void and takes a client *c
    redisCommandProc *proc;
    // Number of arguments
    int arity;
    // Flags in string form
    char *sflags; /* Flags as string representation, one char per flag. */
    // The actual flag bits
    int flags;    /* The actual flags, obtained from the 'sflags' field. */
    /* Use a function to determine keys arguments in a command line.
     * Used for Redis Cluster redirect. */
    // getkeys_proc is a function pointer that returns an array of ints;
    // it extracts the key arguments from the command line
    redisGetKeysProc *getkeys_proc;
    /* What keys should be loaded in background when calling this command? */
    // Which arguments are keys
    int firstkey; /* The first argument that's a key (0 = no keys) */
    int lastkey;  /* The last argument that's a key */
    int keystep;  /* The step between first and last key */
    // microseconds: total time spent executing this command
    // calls: total number of times the command was executed
    long long microseconds, calls;
};

The proc for each command can be found in redisCommandTable in server.c:

   {"set",setCommand,-3,"wm",0,NULL,1,1,1,0,0},
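The table entry above pairs a name with a function pointer and an arity of -3. In processCommand a positive arity means "exactly this many arguments" and a negative one means "at least that many", with the command name itself counted. A minimal sketch of lookup, arity check and dispatch follows; mini_command, lookup_command, arity_ok and dispatch are hypothetical names, the table has only two toy entries, and the real server looks commands up in a dict rather than scanning an array.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

typedef int cmd_proc(int argc, const char **argv);

typedef struct {
    const char *name;
    cmd_proc *proc;
    int arity;   /* N > 0: exactly N args; N < 0: at least -N args */
} mini_command;

/* Toy implementations that just report how many arguments they received. */
static int set_cmd(int argc, const char **argv) { (void)argv; return argc; }
static int get_cmd(int argc, const char **argv) { (void)argv; return argc; }

static const mini_command command_table[] = {
    {"set", set_cmd, -3},
    {"get", get_cmd, 2},
};

/* Case-insensitive linear lookup; returns NULL for unknown commands. */
const mini_command *lookup_command(const char *name) {
    size_t n = sizeof(command_table) / sizeof(command_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcasecmp(command_table[i].name, name) == 0)
            return &command_table[i];
    }
    return NULL;
}

/* Returns 1 when argc satisfies the command's arity. */
int arity_ok(const mini_command *cmd, int argc) {
    if (cmd->arity > 0) return argc == cmd->arity;
    return argc >= -cmd->arity;
}

/* Returns -1 for an unknown command, -2 for a wrong argument count,
 * otherwise whatever the command implementation returns. */
int dispatch(int argc, const char **argv) {
    if (argc < 1) return -1;
    const mini_command *cmd = lookup_command(argv[0]);
    if (cmd == NULL) return -1;
    if (!arity_ok(cmd, argc)) return -2;
    return cmd->proc(argc, argv);
}
```

With this table, SET k v passes the check, SET k fails it, and GET must have exactly one key.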

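setCommand is a String-type command and lives in t_string.c. It checks options such as NX and EX (expire), and ultimately stores the key/value pair in the database dict through setKey and increments server.dirty.

// setGenericCommand() is the lowest-level implementation of SET, SETEX, PSETEX and SETNX.
// flags may be NX or XX, provided by the macros above
// expire is the key's expire time, in the unit given by unit
// ok_reply and abort_reply hold the replies sent to the client; NX and XX affect which is used
// If ok_reply is NULL, "+OK" is used
// If abort_reply is NULL, "$-1" is used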
void setGenericCommand(client *c, int flags, robj *key, robj *val, robj *expire, int unit, robj *ok_reply, robj *abort_reply) {
    long long milliseconds = 0; /* initialized to avoid any harmness warning */

    // An expire time was given
    if (expire) {
        // Read the value of the expire object into milliseconds; on error a default message is sent to the client
        if (getLongLongFromObjectOrReply(c, expire, &milliseconds, NULL) != C_OK)
            return;
        // An expire time <= 0 is an error: report it to the client
        if (milliseconds <= 0) {
            addReplyErrorFormat(c,"invalid expire time in %s",c->cmd->name);
            return;
        }
        // If the unit is seconds, convert to milliseconds
        if (unit == UNIT_SECONDS) milliseconds *= 1000;
    }

    // lookupKeyWrite fetches the value object of key for a write operation.
    // If NX (not exists) is set and the key IS found in the database, or
    // XX (exists) is set and the key is NOT found,
    // reply with abort_reply
    if ((flags & OBJ_SET_NX && lookupKeyWrite(c->db,key) != NULL) ||
        (flags & OBJ_SET_XX && lookupKeyWrite(c->db,key) == NULL))
    {
        addReply(c, abort_reply ? abort_reply : shared.nullbulk);
        return;
    }
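    // Set key to val in the current db
    setKey(c->db,key,val);
    // Mark the database dirty: server.dirty is incremented on every key modification
    server.dirty++;
    // Set the key's expire time;
    // mstime() returns the current Unix time in milliseconds
    if (expire) setExpire(c->db,key,mstime()+milliseconds);
    // Send the "set" keyspace notification, delivered over Pub/Sub to clients listening for such events
    notifyKeyspaceEvent(NOTIFY_STRING,"set",key,c->db->id);
    // Send the "expire" event notification
    if (expire) notifyKeyspaceEvent(NOTIFY_GENERIC,
        "expire",key,c->db->id);
    // Success: send ok_reply to the client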
    addReply(c, ok_reply ? ok_reply : shared.ok);
}
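Two small decisions in setGenericCommand are easy to get backwards: when NX or XX abort the write, and how the expire value becomes an absolute deadline. The sketch below isolates both. The names (SET_NX, SET_XX, EXPIRE_UNIT_SEC, set_condition_holds, expire_deadline_ms) are mine and only mirror the logic of the listing above.

```c
#define SET_NX (1 << 0)   /* only set if the key does not exist */
#define SET_XX (1 << 1)   /* only set if the key already exists */

enum { EXPIRE_UNIT_SEC, EXPIRE_UNIT_MS };

/* Returns 1 when the write may proceed, 0 when it must be aborted. */
int set_condition_holds(int flags, int key_exists) {
    if ((flags & SET_NX) && key_exists) return 0;
    if ((flags & SET_XX) && !key_exists) return 0;
    return 1;
}

/* Turn a relative expire into an absolute deadline in milliseconds.
 * Returns -1 for an invalid (non-positive) expire, matching the
 * "invalid expire time" error path. */
long long expire_deadline_ms(long long now_ms, long long ttl, int unit) {
    if (ttl <= 0) return -1;
    if (unit == EXPIRE_UNIT_SEC) ttl *= 1000;
    return now_ms + ttl;
}
```

For example, SET key value EX 10 issued at time 5000 ms yields a deadline of 15000 ms, while SET key value NX on an existing key is aborted before anything is written.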

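6 Returning the result

setCommand calls addReply to write the result, here shared.ok, into the client's output buffer or reply list. Here is addReply, from networking.c:

// Add obj to the client's reply buffer
void addReply(client *c, robj *obj) {
    // Make sure the client can be written to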
    if (prepareClientToWrite(c) != C_OK) return;

    /* This is an important place where we can avoid copy-on-write
     * when there is a saving child running, avoiding touching the
     * refcount field of the object if it's not needed.
     * If the encoding is RAW and there is room in the static buffer
     * we'll be able to send the object to the client without
     * messing with its page.
     */
    if (sdsEncodedObject(obj)) {
        // First try to copy the content into c->buf, which avoids a memory allocation
        if (_addReplyToBuffer(c,obj->ptr,sdslen(obj->ptr)) != C_OK)
            // If the static reply buffer (c->buf) lacks room, append to the reply list (c->reply), which may allocate
            _addReplyObjectToList(c,obj);
    // Integer-encoded object
    } else if (obj->encoding == OBJ_ENCODING_INT) {
        /* Optimization: if there is room in the static buffer for 32 bytes
         * (more than the max chars a 64 bit integer can take as string) we
         * avoid decoding the object and go for the lower level approach. */
        // Optimization: if c->buf has at least 32 bytes free,
        // copy the integer into c->buf directly as a string
        if (listLength(c->reply) == 0 && (sizeof(c->buf) - c->bufpos) >= 32) {
            char buf[32];
            int len;
            // Convert to a string
            len = ll2string(buf,sizeof(buf),(long)obj->ptr);
            if (_addReplyToBuffer(c,buf,len) == C_OK)
                return;
            /* else... continue with the normal code path, but should never
             * happen actually since we verified there is room. */
        }
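        // The fast path did not apply (the reply list is in use or c->buf lacks room): decode into a string object
        obj = getDecodedObject(obj);
        // Try the static buffer
        if (_addReplyToBuffer(c,obj->ptr,sdslen(obj->ptr)) != C_OK)
            // If that fails, append to the reply list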
            _addReplyObjectToList(c,obj);
        decrRefCount(obj);
    } else {
        serverPanic("Wrong obj->encoding in addReply()");
    }
}

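 addReply first calls prepareClientToWrite, which checks whether the client should receive a reply at all and, if so, flags it as having pending output and queues it in server.clients_pending_write so the reply gets written to its socket. _addReplyToBuffer then tries to place the result in the output buffer (c->buf); if that fails (c->reply already holds data, or the buffer would overflow), _addReplyObjectToList appends it to the c->reply list. Here is prepareClientToWrite, from networking.c:

// Prepare a client for writing
int prepareClientToWrite(client *c) {
    /* If it's the Lua client we always return ok without installing any
     * handler since there is no socket at all. */
    // The fake client used for Lua scripts is always writable: return C_OK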
    if (c->flags & CLIENT_LUA) return C_OK;

    /* CLIENT REPLY OFF / SKIP handling: don't send replies. */
    // If the client turned replies off or asked to skip this reply, return C_ERR
    if (c->flags & (CLIENT_REPLY_OFF|CLIENT_REPLY_SKIP)) return C_ERR;

    /* Masters don't receive replies, unless CLIENT_MASTER_FORCE_REPLY flag
     * is set. */
    // If the client is our master and forced replies are not set, return C_ERR
    if ((c->flags & CLIENT_MASTER) &&
        !(c->flags & CLIENT_MASTER_FORCE_REPLY)) return C_ERR;
    // The fake client used for AOF loading: return C_ERR
    if (c->fd <= 0) return C_ERR; /* Fake client for AOF loading. */

    /* Schedule the client to write the output buffers to the socket only
     * if not already done (there were no pending writes already and the client
     * was yet not flagged), and, for slaves, if the slave can actually
     * receive writes at this stage. */
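    // Schedule the client only if it has no pending replies yet and is not already flagged,
    // and, for slaves, only if it can actually receive writes: either it is not a slave
    // (REPL_STATE_NONE), or it is online and not waiting for its first ACK (quite a complicated condition)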
    if (!clientHasPendingReplies(c) &&
        !(c->flags & CLIENT_PENDING_WRITE) &&
        (c->replstate == REPL_STATE_NONE ||
         (c->replstate == SLAVE_STATE_ONLINE && !c->repl_put_online_on_ack)))
    {
        /* Here instead of installing the write handler, we just flag the
         * client and put it into a list of clients that have something
         * to write to the socket. This way before re-entering the event
         * loop, we can try to directly write to the client sockets avoiding
         * a system call. We'll only really install the write handler if
         * we'll not be able to write the whole reply at once. */
        // Flag the client as having pending output with no write handler installed yet
        c->flags |= CLIENT_PENDING_WRITE;
        // Queue it on the list of clients to write to (or install a write handler for)
        listAddNodeHead(server.clients_pending_write,c);
    }

    /* Authorize the caller to queue in the output buffer of this client. */
    // The caller may now queue data in this client's output buffers
    return C_OK;
}

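Everything above runs in the same event-loop iteration as the command execution itself. Since it is part of producing the result, I cover it here in the section on returning replies.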
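
To make the "static buffer first, list as fallback" rule concrete, here is a small sketch of my own. It is not Redis code: the rb_* names, the 16-byte buffer size, and the plain linked list are assumptions chosen for illustration. The one behavior it copies from the listing above is that once anything sits in the overflow list, new data must also go to the list, otherwise replies would be sent out of order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define RB_BUF_SIZE 16          /* hypothetical; the real buffer is much larger */

typedef struct rb_node {
    struct rb_node *next;
    size_t len;
    char data[];                /* flexible array member holding one chunk */
} rb_node;

typedef struct rb_client {
    char buf[RB_BUF_SIZE];      /* plays the role of c->buf */
    size_t bufpos;              /* bytes used in buf */
    rb_node *head, *tail;       /* overflow list, plays the role of c->reply */
    size_t list_len;
} rb_client;

/* Try the static buffer. Fails when the list is already in use (to keep
 * ordering) or when the chunk does not fit. Returns 1 on success. */
static int rb_add_to_buffer(rb_client *c, const char *s, size_t len) {
    if (c->list_len > 0) return 0;
    if (len > sizeof(c->buf) - c->bufpos) return 0;
    memcpy(c->buf + c->bufpos, s, len);
    c->bufpos += len;
    return 1;
}

/* Append a copy of the chunk to the tail of the overflow list. */
static int rb_add_to_list(rb_client *c, const char *s, size_t len) {
    rb_node *n = malloc(sizeof(*n) + len);
    if (n == NULL) return 0;
    n->next = NULL;
    n->len = len;
    memcpy(n->data, s, len);
    if (c->tail) c->tail->next = n; else c->head = n;
    c->tail = n;
    c->list_len++;
    return 1;
}

/* Buffer first, list as fallback: the same order addReply uses. */
static int rb_add_reply(rb_client *c, const char *s, size_t len) {
    assert(c != NULL && s != NULL);
    if (rb_add_to_buffer(c, s, len)) return 1;
    return rb_add_to_list(c, s, len);
}
```

With a zero-initialized rb_client, a first 10-byte reply lands in buf, a second one overflows into the list, and a later 2-byte reply also goes to the list even though buf still has 6 free bytes.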

********************************************************

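Now back to the beginning of the diagram. On every iteration of the server's event loop in aeMain (source in ae.c), eventLoop->beforesleep is called before aeProcessEvents. The callback is set in main() and points to beforeSleep. Here is the source: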

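// Main function of the event loop
void aeMain(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    // Keep processing until asked to stop
    while (!eventLoop->stop) {
        // Run the hook that precedes event processing
        if (eventLoop->beforesleep != NULL)
            eventLoop->beforesleep(eventLoop);
        // Process due time events and ready file events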
        aeProcessEvents(eventLoop, AE_ALL_EVENTS);
    }
}

beforeSleep is defined in server.c:

// Called on every loop iteration, right before Redis blocks waiting for events
void beforeSleep(struct aeEventLoop *eventLoop) {
    UNUSED(eventLoop);

    /* Call the Redis Cluster before sleep function. Note that this function
     * may change the state of Redis Cluster (from ok to fail or vice versa),
     * so it's a good idea to call it before serving the unblocked clients
     * later in this function. */
    // Run the cluster's own before-sleep hook first; it may flip the cluster state (ok <-> fail)
    if (server.cluster_enabled) clusterBeforeSleep();

    /* Run a fast expire cycle (the called function will return
     * ASAP if a fast cycle is not needed). */
    // On a master, actively expire keys using the fast cycle (about 1 ms)
    if (server.active_expire_enabled && server.masterhost == NULL)
        activeExpireCycle(ACTIVE_EXPIRE_CYCLE_FAST);

    /* Send all the slaves an ACK request if at least one client blocked
     * during the previous event loop iteration. */
    // If at least one client blocked during the previous loop iteration, ask every
    // replica for an ACK: when get_ack_from_slaves is set, send REPLCONF GETACK
    if (server.get_ack_from_slaves) {
        robj *argv[3];

        // Build the argument vector
        argv[0] = createStringObject("REPLCONF",8);
        argv[1] = createStringObject("GETACK",6);
        argv[2] = createStringObject("*",1); /* Not used argument. */
        // Send the request to all replicas
        replicationFeedSlaves(server.slaves, server.slaveseldb, argv, 3);
        // Release the argument objects
        decrRefCount(argv[0]);
        decrRefCount(argv[1]);
        decrRefCount(argv[2]);
        // Clear the flag
        server.get_ack_from_slaves = 0;
    }

    /* Unblock all the clients blocked for synchronous replication
     * in WAIT. */
    // Unblock all clients that are blocked in WAIT
    if (listLength(server.clients_waiting_acks))
        processClientsWaitingReplicas();

    /* Try to process pending commands for clients that were just unblocked. */
    // Process pending input for clients that were just unblocked
    if (listLength(server.unblocked_clients))
        processUnblockedClients();

    /* Write the AOF buffer on disk */
    // Flush the AOF buffer to disk
    flushAppendOnlyFile(0);

    /* Handle writes with pending output buffers. */
    // Handle the clients queued on clients_pending_write: write their output buffers to the fd
    handleClientsWithPendingWrites();
}

The part of beforeSleep that matters for sending results is the call to handleClientsWithPendingWrites, found in networking.c:

/* This function is called just before entering the event loop, in the hope
 * we can just write the replies to the client output buffer without any
 * need to use a syscall in order to install the writable event handler,
 * get it called, and so forth. */
// In other words: try to write the replies straight to the socket here, and fall
// back to installing the writable event handler only when data is left over.
int handleClientsWithPendingWrites(void) {
    listIter li;
    listNode *ln;
    // Length of the list of clients to write to (or install a write handler for)
    int processed = listLength(server.clients_pending_write);
    // Reset the iterator to the head of the list
    listRewind(server.clients_pending_write,&li);
    // Walk the list of clients waiting for a reply
    while((ln = listNext(&li))) {
        // Fetch the current client
        client *c = listNodeValue(ln);
        // Clear the client's pending-write flag
        c->flags &= ~CLIENT_PENDING_WRITE;
        // Remove it from the pending-write list
        listDelNode(server.clients_pending_write,ln);

        /* Try to write buffers to the client socket. */
        // Send the buffered reply to the client
        if (writeToClient(c->fd,c,0) == C_ERR) continue;

        /* If there is nothing left, do nothing. Otherwise install
         * the write handler. */
         // If nothing is left to send, do nothing; otherwise install the writable event handler
        if (clientHasPendingReplies(c) &&
            aeCreateFileEvent(server.el, c->fd, AE_WRITABLE,
                sendReplyToClient, c) == AE_ERR)
        {
            freeClientAsync(c);
        }
    }
    // Return the number of clients processed
    return processed;
}

handleClientsWithPendingWrites calls writeToClient, which checks the static buffer and the reply list and writes their contents to the client.
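
The "flag the client now, flush it right before sleeping" idea can be sketched in a few lines. This is a toy model under my own assumptions, not Redis code: the pw_* names, the fixed-size array standing in for server.clients_pending_write, and the counters that stand in for real socket writes are all hypothetical. Unlike prepareClientToWrite, it also skips the clientHasPendingReplies and replication checks.

```c
#include <assert.h>
#include <stddef.h>

#define PW_PENDING_WRITE (1 << 0)   /* stands in for CLIENT_PENDING_WRITE */
#define PW_MAX_PENDING 8            /* hypothetical capacity of the queue */

typedef struct pw_client {
    int flags;
    int queued_replies;             /* replies waiting to be flushed */
    int flushed;                    /* replies already "written" */
} pw_client;

static pw_client *pw_pending[PW_MAX_PENDING];
static size_t pw_pending_count = 0;

/* Called whenever a reply is queued: enlist the client at most once. */
static int pw_prepare_to_write(pw_client *c) {
    assert(c != NULL);
    if (!(c->flags & PW_PENDING_WRITE)) {
        if (pw_pending_count == PW_MAX_PENDING) return 0;
        c->flags |= PW_PENDING_WRITE;
        pw_pending[pw_pending_count++] = c;
    }
    c->queued_replies++;
    return 1;
}

/* Called once per loop iteration, before blocking on I/O: clear the flag
 * and flush every enlisted client. Returns how many clients were handled. */
static size_t pw_before_sleep(void) {
    size_t processed = pw_pending_count;
    for (size_t i = 0; i < pw_pending_count; i++) {
        pw_client *c = pw_pending[i];
        c->flags &= ~PW_PENDING_WRITE;
        c->flushed += c->queued_replies;
        c->queued_replies = 0;
    }
    pw_pending_count = 0;
    return processed;
}
```

The point of the flag is that a command producing many replies enlists its client only once, so one flush per iteration covers all of them.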

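  If the static buffer and the reply list are not fully drained by that call (for example because the socket returned EAGAIN, or the per-event write limit was reached), handleClientsWithPendingWrites registers sendReplyToClient as the client's write event handler. On a later loop iteration, when the socket becomes writable, sendReplyToClient is invoked and continues the output. Internally sendReplyToClient simply calls writeToClient; the only difference is the handler_installed argument, which tells writeToClient to remove the writable event once everything has been sent. The source is in networking.c: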

/* Write event handler. Just send data to the client. */
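// Write event handler: just send the pending reply data to the client
void sendReplyToClient(aeEventLoop *el, int fd, void *privdata, int mask) {
    UNUSED(el);
    UNUSED(mask);
    // handler_installed=1: once everything is sent, the fd's writable event is removed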
    writeToClient(fd,privdata,1);
}
/* Write data in output buffers to client. Return C_OK if the client
 * is still valid after the call, C_ERR if it was freed. */
// Write the output buffers to the client. Returns C_OK if the client is still valid, C_ERR if it was freed
int writeToClient(int fd, client *c, int handler_installed) {
    ssize_t nwritten = 0, totwritten = 0;
    size_t objlen;
    size_t objmem;
    robj *o;

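    // Keep writing while the client still has pending reply data
    while(clientHasPendingReplies(c)) {
        // The static buffer still holds unsent data
        if (c->bufpos > 0) {
            // Write the remaining buffer bytes to the fd
            nwritten = write(fd,c->buf+c->sentlen,c->bufpos-c->sentlen);
            if (nwritten <= 0) break; // Nothing written (error or would block): leave the loop
            // Update the sent-bytes counters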
            c->sentlen += nwritten;
            totwritten += nwritten;

            /* If the buffer was sent, set bufpos to zero to continue with
             * the remainder of the reply. */
            // sentlen reaching bufpos means the whole static buffer was sent
            if ((int)c->sentlen == c->bufpos) {
                c->bufpos = 0;
                c->sentlen = 0;
            }
        } else { // The static buffer is drained: send from the reply list
            // Take the first reply object, its length, and its memory footprint
            o = listNodeValue(listFirst(c->reply));
            objlen = sdslen(o->ptr);
            objmem = getStringObjectSdsUsedMemory(o);

            if (objlen == 0) { // Empty object: just remove it
                listDelNode(c->reply,listFirst(c->reply));
                c->reply_bytes -= objmem;
                continue;
            }
            // Write the current node's value to the fd
            nwritten = write(fd, ((char*)o->ptr)+c->sentlen,objlen-c->sentlen);
            // Nothing written (error or would block): leave the loop
            if (nwritten <= 0) break;
            // Update the sent-bytes counters
            c->sentlen += nwritten;
            totwritten += nwritten;

            /* If we fully sent the object on head go to the next one */
            // Fully sent: remove the node, reset sentlen, and update the reply list byte count
            if (c->sentlen == objlen) {
                listDelNode(c->reply,listFirst(c->reply));
                c->sentlen = 0;
                c->reply_bytes -= objmem;
            }
        }
        /* Note that we avoid to send more than NET_MAX_WRITES_PER_EVENT
         * bytes, in a single threaded server it's a good idea to serve
         * other clients as well, even if a very large request comes from
         * super fast link that is always able to accept data (in real world
         * scenario think about 'KEYS *' against the loopback interface).
         *
         * However if we are over the maxmemory limit we ignore that and
         * just deliver as much data as it is possible to deliver. */
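        // Stop once more than NET_MAX_WRITES_PER_EVENT bytes were written,
        // unless memory usage is over maxmemory (then keep draining)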
        if (totwritten > NET_MAX_WRITES_PER_EVENT &&
            (server.maxmemory == 0 ||
             zmalloc_used_memory() < server.maxmemory)) break;
    }
    // Update the count of bytes written to the network
    server.stat_net_output_bytes += totwritten;
    // Handle a failed write
    if (nwritten == -1) {
        if (errno == EAGAIN) {
            nwritten = 0;
        } else {
            serverLog(LL_VERBOSE,
                "Error writing to client: %s", strerror(errno));
            freeClient(c);
            return C_ERR;
        }
    }
    // At least some bytes were written
    if (totwritten > 0) {
        /* For clients representing masters we don't count sending data
         * as an interaction, since we always send REPLCONF ACK commands
         * that take some time to just fill the socket output buffer.
         * We just rely on data / pings received for timeout detection. */
         // Unless the client is a master, refresh its last interaction time
        if (!(c->flags & CLIENT_MASTER)) c->lastinteraction = server.unixtime;
    }
    // No reply data left for this client: sending is complete
    if (!clientHasPendingReplies(c)) {
        c->sentlen = 0;
        // Stop listening for the writable event on this client
        if (handler_installed) aeDeleteFileEvent(server.el,c->fd,AE_WRITABLE);

        /* Close connection after entire reply has been sent. */
         // If the client is flagged to be closed once the reply is written, free it
        if (c->flags & CLIENT_CLOSE_AFTER_REPLY) {
            freeClient(c);
            return C_ERR;
        }
    }
    return C_OK;
}

So writeToClient is the function that actually sends the queued replies to the client.
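
The bookkeeping around sentlen is easier to follow in isolation. Below is a minimal sketch of resuming a partially written buffer; everything named sketch_* is hypothetical, and a fake sink that accepts at most 4 bytes per call replaces the real write() so that short writes are easy to trigger. The max_bytes cap imitates the NET_MAX_WRITES_PER_EVENT check, without the maxmemory exception.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long (*sketch_write_fn)(void *ctx, const char *data, size_t len);

typedef struct sketch_out {
    const char *buf;    /* pending reply bytes */
    size_t len;         /* total bytes in buf */
    size_t sentlen;     /* bytes already written, like c->sentlen */
} sketch_out;

/* Write until the buffer is drained, the writer refuses (returns <= 0),
 * or more than max_bytes went out. Returns bytes written by this call. */
static size_t sketch_flush(sketch_out *o, sketch_write_fn w, void *ctx,
                           size_t max_bytes) {
    size_t tot = 0;
    assert(o != NULL && w != NULL);
    while (o->sentlen < o->len) {
        long n = w(ctx, o->buf + o->sentlen, o->len - o->sentlen);
        if (n <= 0) break;              /* error or would block */
        o->sentlen += (size_t)n;        /* resume point for the next call */
        tot += (size_t)n;
        if (tot > max_bytes) break;     /* give other clients a turn */
    }
    return tot;
}

/* Fake socket: takes at most 4 bytes per call and "blocks" when its
 * remaining capacity reaches zero. */
typedef struct sketch_sink {
    char data[64];
    size_t used;
    size_t capacity;
} sketch_sink;

static long sketch_sink_write(void *ctx, const char *data, size_t len) {
    sketch_sink *s = ctx;
    size_t n = len < 4 ? len : 4;
    if (n > s->capacity) n = s->capacity;
    if (n > sizeof(s->data) - s->used) n = sizeof(s->data) - s->used;
    if (n == 0) return -1;              /* would block */
    memcpy(s->data + s->used, data, n);
    s->used += n;
    s->capacity -= n;
    return (long)n;
}
```

When the sink runs out of capacity in the middle of a reply, sketch_flush returns early and sentlen remembers the offset. A second call after capacity is restored picks up exactly where the first one stopped, which is what the installed write handler does for a real client.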

Receiving on the client side

The client reads the RESP-encoded reply from the socket, converts it, and either returns it to the caller or prints it to the console.
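
As a rough illustration of that conversion step, the sketch below classifies a RESP reply by its first byte and extracts the first line. It is my own simplified example, not code from redis-cli or any client library: the resp_* names are hypothetical, and it does not read bulk payloads or nested arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    RESP_SIMPLE_STRING,  /* '+' */
    RESP_ERROR,          /* '-' */
    RESP_INTEGER,        /* ':' */
    RESP_BULK_STRING,    /* '$' */
    RESP_ARRAY,          /* '*' */
    RESP_INVALID
} resp_type;

/* Classify a reply by its type byte and copy the rest of the first line
 * (up to, not including, CRLF) into `line`. Returns RESP_INVALID when the
 * type byte is unknown, no CRLF is found, or `line` is too small. */
static resp_type resp_first_line(const char *reply, size_t len,
                                 char *line, size_t linesz) {
    resp_type t;
    size_t i;
    assert(reply != NULL && line != NULL);
    if (len < 3) return RESP_INVALID;   /* type byte plus CRLF at minimum */
    switch (reply[0]) {
    case '+': t = RESP_SIMPLE_STRING; break;
    case '-': t = RESP_ERROR; break;
    case ':': t = RESP_INTEGER; break;
    case '$': t = RESP_BULK_STRING; break;
    case '*': t = RESP_ARRAY; break;
    default: return RESP_INVALID;
    }
    for (i = 1; i + 1 < len; i++) {
        if (reply[i] == '\r' && reply[i + 1] == '\n') {
            size_t n = i - 1;           /* bytes between type byte and CRLF */
            if (n + 1 > linesz) return RESP_INVALID;
            memcpy(line, reply + 1, n);
            line[n] = '\0';
            return t;
        }
    }
    return RESP_INVALID;
}
```

For a bulk string or an array the extracted line is only the length header; a real client would then go on to read that many bytes or elements.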

I won't go into that here. Next, the implementation of the CLIENT command.

VII. The CLIENT command

CLIENT subcommands in Redis 3.2.12 (source in networking.c):
CLIENT KILL [ip:port] [ID client-id] [TYPE normal|master|slave|pubsub] [ADDR ip:port] [SKIPME yes/no]
CLIENT GETNAME
CLIENT LIST
CLIENT PAUSE timeout
CLIENT REPLY ON|OFF|SKIP
CLIENT SETNAME connection-name
The source follows:

/**
 * Implementation of the CLIENT command
 */
void clientCommand(client *c) {
    listNode *ln;
    listIter li;
    client *client;
    // Implementation of CLIENT LIST
    if (!strcasecmp(c->argv[1]->ptr,"list") && c->argc == 2) {
        /* CLIENT LIST */
        // Collect the info string for all clients
        sds o = getAllClientsInfoString();
        // Add it to the output buffer
        addReplyBulkCBuffer(c,o,sdslen(o));
        sdsfree(o);
    } else if (!strcasecmp(c->argv[1]->ptr,"reply") && c->argc == 3) {
        // Implementation of CLIENT REPLY ON|OFF|SKIP
        /* CLIENT REPLY ON|OFF|SKIP */
        // ON
        if (!strcasecmp(c->argv[2]->ptr,"on")) {
            // Clear the OFF and SKIP flags
            c->flags &= ~(CLIENT_REPLY_SKIP|CLIENT_REPLY_OFF);
            // Reply +OK
            addReply(c,shared.ok);
        // OFF
        } else if (!strcasecmp(c->argv[2]->ptr,"off")) {
            // Set the OFF flag
            c->flags |= CLIENT_REPLY_OFF;
        // SKIP
        } else if (!strcasecmp(c->argv[2]->ptr,"skip")) {
            // Unless OFF is set, flag CLIENT_REPLY_SKIP_NEXT so the next command's reply is skipped
            if (!(c->flags & CLIENT_REPLY_OFF))
                c->flags |= CLIENT_REPLY_SKIP_NEXT;
        } else {
            addReply(c,shared.syntaxerr);
            return;
        }
    // CLIENT KILL
    } else if (!strcasecmp(c->argv[1]->ptr,"kill")) {
        /* CLIENT KILL <ip:port>
         * CLIENT KILL <option> [value] ... <option> [value] */
        char *addr = NULL;
        int type = -1;
        uint64_t id = 0;
        int skipme = 1;
        int killed = 0, close_this_client = 0;

        if (c->argc == 3) {
            /* Old style syntax: CLIENT KILL <addr> */
            // Old form, kept for compatibility: CLIENT KILL ip:port kills by address only
            addr = c->argv[2]->ptr;
            skipme = 0; /* With the old form, you can kill yourself. */
        } else if (c->argc > 3) {
            // New form: filter by [ID client-id] [TYPE master|normal|slave|pubsub] [ADDR ip:port] [SKIPME yes/no]
            int i = 2; /* Next option index. */

            /* New style syntax: parse options. */
            // Parse the options
            while(i < c->argc) {
                int moreargs = c->argc > i+1;

                // CLIENT KILL [ID client-id]
                if (!strcasecmp(c->argv[i]->ptr,"id") && moreargs) {
                    long long tmp;
                    // Read the client ID
                    if (getLongLongFromObjectOrReply(c,c->argv[i+1],&tmp,NULL)
                        != C_OK) return;
                    id = tmp;
                // CLIENT KILL TYPE type, where type is one of [master|normal|slave|pubsub]
                } else if (!strcasecmp(c->argv[i]->ptr,"type") && moreargs) {
                    // Map the type name to a client type
                    type = getClientTypeByName(c->argv[i+1]->ptr);
                    if (type == -1) {
                        addReplyErrorFormat(c,"Unknown client type '%s'",
                            (char*) c->argv[i+1]->ptr);
                        return;
                    }
                // CLIENT KILL [ADDR ip:port]
                } else if (!strcasecmp(c->argv[i]->ptr,"addr") && moreargs) {
                    // Read ip:port
                    addr = c->argv[i+1]->ptr;
                // CLIENT KILL [SKIPME yes/no]
                } else if (!strcasecmp(c->argv[i]->ptr,"skipme") && moreargs) {
                    // With "yes", skipme is set and the calling client is never killed
                    if (!strcasecmp(c->argv[i+1]->ptr,"yes")) {
                        skipme = 1;
                    // With "no", the calling client can be killed as well
                    } else if (!strcasecmp(c->argv[i+1]->ptr,"no")) {
                        skipme = 0;
                    } else {
                        addReply(c,shared.syntaxerr);
                        return;
                    }
                } else {
                    addReply(c,shared.syntaxerr);
                    return;
                }
                i += 2;
            }
        } else {
            addReply(c,shared.syntaxerr);
            return;
        }

        /* Iterate clients killing all the matching clients. */
        // Iterate over all client nodes
        listRewind(server.clients,&li);
        while ((ln = listNext(&li)) != NULL) {
            client = listNodeValue(ln);
            // Check the client against each supplied filter; skip it as soon as one does not match
            if (addr && strcmp(getClientPeerId(client),addr) != 0) continue;
            if (type != -1 && getClientType(client) != type) continue;
            if (id != 0 && client->id != id) continue;
            if (c == client && skipme) continue;

            /* Kill it. */
            // Kill the client; the caller itself is only marked here and closed after its reply is queued
            if (c == client) {
                close_this_client = 1;
            } else {
                freeClient(client);
            }
            killed++; // Count the killed clients
        }

        /* Reply according to old/new format. */
        // Reply to the caller
        if (c->argc == 3) {
            if (killed == 0) // No client matched
                addReplyError(c,"No such client");
            else
                addReply(c,shared.ok);
        } else {  // Send the number of clients killed
            addReplyLongLong(c,killed);
        }

        /* If this client has to be closed, flag it as CLOSE_AFTER_REPLY
         * only after we queued the reply to its output buffers. */
        if (close_this_client) c->flags |= CLIENT_CLOSE_AFTER_REPLY;
    // CLIENT SETNAME connection-name
    } else if (!strcasecmp(c->argv[1]->ptr,"setname") && c->argc == 3) {
        int j, len = sdslen(c->argv[2]->ptr);
        char *p = c->argv[2]->ptr;

        /* Setting the client name to an empty string actually removes
         * the current name. */
        // An empty name clears the current name
        if (len == 0) {
            // Release the old name first
            if (c->name) decrRefCount(c->name);
            c->name = NULL;
            addReply(c,shared.ok);
            return;
        }

        /* Otherwise check if the charset is ok. We need to do this otherwise
         * CLIENT LIST format will break. You should always be able to
         * split by space to get the different fields. */
        // Check that the name contains only printable, non-space ASCII characters
        for (j = 0; j < len; j++) {
            if (p[j] < '!' || p[j] > '~') { /* ASCII is assumed. */
                addReplyError(c,
                    "Client names cannot contain spaces, "
                    "newlines or special characters.");
                return;
            }
        }
        // Release the old name
        if (c->name) decrRefCount(c->name);
        // Set the new name
        c->name = c->argv[2];
        incrRefCount(c->name);
        addReply(c,shared.ok);
    // CLIENT GETNAME: return the client's name
    } else if (!strcasecmp(c->argv[1]->ptr,"getname") && c->argc == 2) {
        if (c->name) // Reply with the name
            addReplyBulk(c,c->name);
        else
            addReply(c,shared.nullbulk);
    // CLIENT PAUSE timeout
    } else if (!strcasecmp(c->argv[1]->ptr,"pause") && c->argc == 3) {
        long long duration;
        // Store the pause time in duration, in milliseconds
        if (getTimeoutFromObjectOrReply(c,c->argv[2],&duration,UNIT_MILLISECONDS)
                                        != C_OK) return;
        // Pause the clients
        pauseClients(duration);
        addReply(c,shared.ok);
    } else {
        addReplyError(c, "Syntax error, try CLIENT (LIST | KILL | GETNAME | SETNAME | PAUSE | REPLY)");
    }
}
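
The matching rule in the CLIENT KILL loop (every filter that was supplied must match, and SKIPME protects the caller) can be pulled out into one predicate. The sketch below is only an illustration: the kill_filter and kill_candidate types are hypothetical stand-ins for the local variables and the client struct in clientCommand, and the "unset" sentinel values (NULL, -1, 0) are taken from the listing above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct kill_filter {
    const char *addr;   /* NULL means "any address" */
    int type;           /* -1 means "any type" */
    uint64_t id;        /* 0 means "any id" */
    int skipme;         /* nonzero: never match the caller itself */
} kill_filter;

typedef struct kill_candidate {
    const char *addr;   /* "ip:port" of the connection */
    int type;
    uint64_t id;
} kill_candidate;

/* Returns 1 when `cand` passes every filter that was actually supplied. */
static int kill_filter_matches(const kill_filter *f,
                               const kill_candidate *cand,
                               const kill_candidate *caller) {
    assert(f != NULL && cand != NULL);
    if (f->addr && strcmp(cand->addr, f->addr) != 0) return 0;
    if (f->type != -1 && cand->type != f->type) return 0;
    if (f->id != 0 && cand->id != f->id) return 0;
    if (cand == caller && f->skipme) return 0;
    return 1;
}
```

The old single-argument form corresponds to a filter with only addr set and skipme equal to 0, which is why that form lets a client kill itself.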

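Summary:

  When I first read this code I could not make sense of how the server sends its replies. It took two days of going back and forth before it clicked.

A quick recap of the call relationships:

1. Creating and freeing clients

   createClient, freeClient

2. Handling client connections

   acceptTcpHandler, acceptUnixHandler --> acceptCommonHandler(), which calls createClient (which registers readQueryFromClient).

3. Receiving commands and queuing replies

   readQueryFromClient --> processInputBuffer() --> 1.1) processMultibulkBuffer() (protocol parsing, not central to the main flow)
                                                --> 1.2) processInlineBuffer()
                                                --> 2) processCommand() (in server.c) --> call() --> c->cmd->proc(c)
                                                                                                 --> addReply()

4. Replying

   addReply() --> 1. prepareClientToWrite
              --> 2. _addReplyToBuffer (writes to c->buf; falls back to _addReplyObjectToList and c->reply)
*************************************************************
   aeMain (ae.c) --> beforesleep (server.c) -->
   handleClientsWithPendingWrites (networking.c) --> 1. writeToClient()
                                                 --> 2. sendReplyToClient() --> writeToClient()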

 

 

References:

http://blog.chinaunix.net/uid-790245-id-3766842.html

  https://www.jianshu.com/p/6188becd2cea
