A Deep Dive into Redis's Event Handling Model

This post walks through the source code of Redis 6.2, pulled directly from GitHub.

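Redis starts in main(), which mainly initializes data, initializes and registers handlers, creates events, and sets up time-event handling. The three core functions called from main() are initServer(), InitServerLast(), and aeMain(server.el). The reactor model used by this version is a main thread plus I/O worker threads, driven by a multiplexer loop (accept + event processing). The figure below shows the basic processing model.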

1. Startup initialization

1.1 Main initialization work in initServer()

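  • Call aeCreateEventLoop to initialize the event loop
  • Create the databases: server.db = zmalloc
  • Listen on the port: listenToPort
  • Bind the time-event callback: aeCreateTimeEvent(server.el, 1, serverCron, NULL, NULL). serverCron is the periodic scheduler that drives recurring work such as checking for expired keys and triggering data backups. For many expensive operations Redis forks a child process that communicates with the parent through a pipe, so reads and writes in the parent are not affected
  • createSocketAcceptHandler(&server.ipfd, acceptTcpHandler) and createSocketAcceptHandler(&server.tlsfd, acceptTLSHandler). When a connection arrives, these accept handlers register the read handler readQueryFromClient, and replies are written through writeToClient. From then on, all Redis reads and writes go through these two functions
  • aeCreateFileEvent(server.el, server.module_blocked_pipe[0], AE_READABLE, moduleBlockedClientPipeReadable,NULL)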

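Underneath, the registration is done with the system call epoll_ctl(state->epfd,op,fd,&ee). This is the case on Linux; other platforms use a different multiplexing backend.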
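To make the epoll_ctl registration above more concrete, here is a minimal Linux-only sketch, loosely modeled on how an event library can wrap epoll. The names mini_add_event, mini_poll_once, MINI_READABLE, and MINI_WRITABLE are made up for illustration and are not Redis identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MINI_READABLE 1 /* hypothetical flag, plays the role of AE_READABLE */
#define MINI_WRITABLE 2 /* hypothetical flag, plays the role of AE_WRITABLE */

/* Register interest in `mask` for `fd` on the epoll instance `epfd`.
 * `old_mask` is what was already registered for this fd (0 = nothing),
 * which decides between EPOLL_CTL_ADD and EPOLL_CTL_MOD. */
static int mini_add_event(int epfd, int fd, int old_mask, int mask) {
    struct epoll_event ee = {0};
    int op = (old_mask == 0) ? EPOLL_CTL_ADD : EPOLL_CTL_MOD;

    mask |= old_mask; /* keep the events that were already registered */
    if (mask & MINI_READABLE) ee.events |= EPOLLIN;
    if (mask & MINI_WRITABLE) ee.events |= EPOLLOUT;
    ee.data.fd = fd;
    return epoll_ctl(epfd, op, fd, &ee);
}

/* Wait up to `timeout_ms` for a single event. Returns the fired mask
 * (0 on timeout or error) and stores the fd that fired in *fired_fd. */
static int mini_poll_once(int epfd, int timeout_ms, int *fired_fd) {
    struct epoll_event ev;
    int mask = 0;

    assert(fired_fd != NULL);
    if (epoll_wait(epfd, &ev, 1, timeout_ms) <= 0) return 0;
    if (ev.events & EPOLLIN) mask |= MINI_READABLE;
    if (ev.events & EPOLLOUT) mask |= MINI_WRITABLE;
    *fired_fd = ev.data.fd;
    return mask;
}
```

A pipe is enough to try it out: register the read end with MINI_READABLE, write one byte to the write end, and mini_poll_once should then report the read end as readable.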

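  • Set the event loop's beforeSleep hook
  • Single-threaded path

writeToClient(c,0)

writeToClient(client *c, int handler_installed)

  • Multi-threaded path: handleClientsWithPendingWritesUsingThreads, which runs depending on the configuration

In practice Redis does not turn on multithreading just because it is configured; the threads are activated only under certain conditions. handleClientsWithPendingWritesUsingThreads in networking.c contains the following check.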

    if (server.io_threads_num == 1 || stopThreadedIOIfNeeded()) {
        return handleClientsWithPendingWrites();
    }

    /* Start threads if needed. */
    if (!server.io_threads_active) startThreadedIO();

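stopThreadedIOIfNeeded() checks whether pending < (server.io_threads_num*2), that is, whether the number of clients waiting for a write is less than twice the configured number of I/O threads. If so, it calls stopThreadedIO() to pause the threads. Pausing is done by taking the threads' mutexes with pthread_mutex_lock (a pthread library call rather than a direct system call), and before pausing, stopThreadedIO() calls handleClientsWithPendingReadsUsingThreads() to handle any clients that are still waiting to be read.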
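The decision described above boils down to a small predicate. The sketch below restates it as a standalone function; should_use_io_threads is a hypothetical name, and it only models the decision, not the side effect of actually pausing the threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the check discussed above.
 * io_threads_num:  configured number of I/O threads (1 = threading disabled)
 * pending_clients: number of clients currently waiting for a write
 * Returns true when the threaded write path would be taken. */
static bool should_use_io_threads(int io_threads_num, int pending_clients) {
    assert(io_threads_num >= 1 && pending_clients >= 0);
    if (io_threads_num == 1) return false;                  /* single-threaded mode */
    if (pending_clients < io_threads_num * 2) return false; /* too little work */
    return true;
}
```

With 4 configured threads, for example, 7 pending clients stay on the single-threaded path, while 8 are enough to use the I/O threads.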

handleClientsWithPendingWrites

  • Registers the write handler sendReplyToClient when a reply could not be written out in full

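1.2 InitServerLast() mainly prepares the I/O threads

  • initThreadedIO(void) creates the threads; this is the multithreaded read/write path, that is, Redis's threaded I/O
  • This is the core of thread creation, and it is where IOThreadMain() is registered as the thread entry function
// The thread is created here with pthread_create

pthread_create(&tid,NULL,IOThreadMain,(void*)(long)i)

The core of IOThreadMain is an endless while(1) loop. The I/O threads do not call epoll_wait themselves: the main thread obtains events from epoll_wait and, when threaded reads are enabled, postponeClientRead adds the client to the head of a pending list instead of reading right away. This prepares the read events for distribution across the I/O threads, each of which then walks its own list. Reads are done by readQueryFromClient(c->conn), and the main parsing function is processInputBuffer(c).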

void *IOThreadMain(void *myid) {
    /* The ID is the thread number (from 0 to server.iothreads_num-1), and is
     * used by the thread to just manipulate a single sub-array of clients. */
    long id = (unsigned long)myid;
    char thdname[16];

    snprintf(thdname, sizeof(thdname), "io_thd_%ld", id);
    redis_set_thread_title(thdname);
    redisSetCpuAffinity(server.server_cpulist);
    makeThreadKillable();

    while(1) {
        /* Wait for start */
        for (int j = 0; j < 1000000; j++) {
            if (getIOPendingCount(id) != 0) break;
        }

        /* Give the main thread a chance to stop this thread. */
        if (getIOPendingCount(id) == 0) {
            pthread_mutex_lock(&io_threads_mutex[id]);
            pthread_mutex_unlock(&io_threads_mutex[id]);
            continue;
        }

        serverAssert(getIOPendingCount(id) != 0);

        /* Process: note that the main thread will never touch our list
         * before we drop the pending count to 0. */
        listIter li;
        listNode *ln;
        listRewind(io_threads_list[id],&li);
        while((ln = listNext(&li))) {
            client *c = listNodeValue(ln);
            if (io_threads_op == IO_THREADS_OP_WRITE) {
                writeToClient(c,0);
            } else if (io_threads_op == IO_THREADS_OP_READ) {
                readQueryFromClient(c->conn);
            } else {
                serverPanic("io_threads_op value is unknown");
            }
        }
        listEmpty(io_threads_list[id]);
        setIOPendingCount(id, 0);
    }
}
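The handoff pattern in IOThreadMain (spin on a pending count, process a private list, then reset the count to 0) can be tried in isolation. The following is a simplified sketch with a single worker thread that sums integers instead of serving clients. Everything prefixed with mini_ is hypothetical, and the stop flag is an addition so that the example can shut down cleanly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_int pending;   /* number of queued items; 0 means the worker is idle */
    atomic_int stop;      /* set to 1 to ask the worker to exit (sketch only) */
    pthread_mutex_t park; /* the main thread could hold this to park the worker */
    const int *items;     /* the worker's private work list */
    long sum;             /* result of processing the list */
    pthread_t tid;
} mini_io_thread;

static void *mini_io_main(void *arg) {
    mini_io_thread *t = arg;
    while (!atomic_load(&t->stop)) {
        /* Spin for a while waiting for work. */
        for (int j = 0; j < 1000000; j++) {
            if (atomic_load(&t->pending) != 0) break;
        }
        /* Still nothing to do: pass through the mutex so that a main
         * thread holding it could keep this thread parked. */
        if (atomic_load(&t->pending) == 0) {
            pthread_mutex_lock(&t->park);
            pthread_mutex_unlock(&t->park);
            continue;
        }
        int n = atomic_load(&t->pending);
        for (int i = 0; i < n; i++) t->sum += t->items[i];
        atomic_store(&t->pending, 0); /* hand the list back to the main thread */
    }
    return NULL;
}

static int mini_io_start(mini_io_thread *t) {
    atomic_init(&t->pending, 0);
    atomic_init(&t->stop, 0);
    t->items = NULL;
    t->sum = 0;
    pthread_mutex_init(&t->park, NULL);
    return pthread_create(&t->tid, NULL, mini_io_main, t);
}

/* Publish n items to the worker and busy-wait until it has processed them. */
static long mini_io_run(mini_io_thread *t, const int *items, int n) {
    assert(n >= 0);
    t->sum = 0;
    t->items = items;
    atomic_store(&t->pending, n); /* the atomic store publishes items and sum */
    while (atomic_load(&t->pending) != 0) { /* spin until the worker is done */ }
    return t->sum;
}

static void mini_io_stop(mini_io_thread *t) {
    atomic_store(&t->stop, 1);
    pthread_join(t->tid, NULL);
    pthread_mutex_destroy(&t->park);
}
```

The point of the design is that the list itself needs no lock: the main thread only touches it while the pending count is 0, and the worker only touches it while the count is non-zero.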

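1.3 aeMain(server.el) starts the event loop

This is the core of the event model. aeProcessEvents calls aeApiPoll, which is where the I/O multiplexing system call happens: the familiar epoll, through epoll_wait.

while (!eventLoop->stop) {
    /* Entry point for event processing */
    aeProcessEvents(eventLoop, AE_ALL_EVENTS|
                               AE_CALL_BEFORE_SLEEP|
                               AE_CALL_AFTER_SLEEP);
}

The processing itself is shown below.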

int aeProcessEvents(aeEventLoop *eventLoop, int flags)
{
    int processed = 0, numevents;

    /* Nothing to do? return ASAP */
    if (!(flags & AE_TIME_EVENTS) && !(flags & AE_FILE_EVENTS)) return 0;

    /* Note that we want to call select() even if there are no
     * file events to process as long as we want to process time
     * events, in order to sleep until the next time event is ready
     * to fire. */
    if (eventLoop->maxfd != -1 ||
        ((flags & AE_TIME_EVENTS) && !(flags & AE_DONT_WAIT))) {
        int j;
        struct timeval tv, *tvp;
        long msUntilTimer = -1;

        if (flags & AE_TIME_EVENTS && !(flags & AE_DONT_WAIT))
            msUntilTimer = msUntilEarliestTimer(eventLoop);

        if (msUntilTimer >= 0) {
            tv.tv_sec = msUntilTimer / 1000;
            tv.tv_usec = (msUntilTimer % 1000) * 1000;
            tvp = &tv;
        } else {
            /* If we have to check for events but need to return
             * ASAP because of AE_DONT_WAIT we need to set the timeout
             * to zero */
            if (flags & AE_DONT_WAIT) {
                tv.tv_sec = tv.tv_usec = 0;
                tvp = &tv;
            } else {
                /* Otherwise we can block */
                tvp = NULL; /* wait forever */
            }
        }

        if (eventLoop->flags & AE_DONT_WAIT) {
            tv.tv_sec = tv.tv_usec = 0;
            tvp = &tv;
        }

        if (eventLoop->beforesleep != NULL && flags & AE_CALL_BEFORE_SLEEP)
            eventLoop->beforesleep(eventLoop);

        /* The epoll_wait system call is made here. It returns only on
         * timeout or when events have occurred; the fired events are
         * then processed in the loop below. */
        numevents = aeApiPoll(eventLoop, tvp);

        /* After sleep callback. */
        if (eventLoop->aftersleep != NULL && flags & AE_CALL_AFTER_SLEEP)
            eventLoop->aftersleep(eventLoop);

        for (j = 0; j < numevents; j++) {
            aeFileEvent *fe = &eventLoop->events[eventLoop->fired[j].fd];
            int mask = eventLoop->fired[j].mask;
            int fd = eventLoop->fired[j].fd;
            int fired = 0; /* Number of events fired for the current fd. */

            /* Normally we execute the readable event first, and the writable
             * event later. This is useful as sometimes we may be able
             * to serve the reply of a query immediately after processing the
             * query.
             *
             * However if AE_BARRIER is set in the mask, our application is
             * asking us to do the reverse: never fire the writable event
             * after the readable. In such a case, we invert the calls.
             * This is useful when, for instance, we want to do things
             * in the beforeSleep() hook, like fsyncing a file to disk,
             * before replying to a client. */
            int invert = fe->mask & AE_BARRIER;

            /* Note the "fe->mask & mask & ..." code: maybe an already
             * processed event removed an element that fired and we still
             * didn't processed, so we check if the event is still valid.
             *
             * Fire the readable event if the call sequence is not
             * inverted. */
            if (!invert && fe->mask & mask & AE_READABLE) {
                fe->rfileProc(eventLoop,fd,fe->clientData,mask);
                fired++;
                fe = &eventLoop->events[fd]; /* Refresh in case of resize. */
            }

            /* Fire the writable event. */
            if (fe->mask & mask & AE_WRITABLE) {
                if (!fired || fe->wfileProc != fe->rfileProc) {
                    fe->wfileProc(eventLoop,fd,fe->clientData,mask);
                    fired++;
                }
            }

            /* If we have to invert the call, fire the readable event now
             * after the writable one. */
            if (invert) {
                fe = &eventLoop->events[fd]; /* Refresh in case of resize. */
                if ((fe->mask & mask & AE_READABLE) &&
                    (!fired || fe->wfileProc != fe->rfileProc))
                {
                    fe->rfileProc(eventLoop,fd,fe->clientData,mask);
                    fired++;
                }
            }

            processed++;
        }
    }
    /* Check time events */
    /* This is where Redis triggers key expiration and RDB generation;
     * some expensive operations are handled by forking a child process. */
    if (flags & AE_TIME_EVENTS)
        processed += processTimeEvents(eventLoop);

    return processed; /* return the number of processed file/time events */
}
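The read-before-write ordering and its AE_BARRIER inversion are easy to miss in the full function. The sketch below isolates just that ordering decision and records which handler would run first. mini_fire_order and the MINI_* flags are hypothetical, and the sketch assumes the read and write handlers are different functions, so the wfileProc != rfileProc check is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MINI_READABLE 1 /* hypothetical counterparts of the AE_* flags */
#define MINI_WRITABLE 2
#define MINI_BARRIER  4

/* Writes into `out` the order in which handlers would be invoked:
 * 'R' for the read handler, 'W' for the write handler.
 * registered: mask the fd was registered with (may include MINI_BARRIER)
 * fired_mask: mask reported by the poll call for this fd
 * `out` must have room for 3 chars. Returns the number of handlers fired. */
static int mini_fire_order(int registered, int fired_mask, char *out) {
    int n = 0;
    int invert = registered & MINI_BARRIER;

    assert(out != NULL);
    /* Normal case: the readable handler runs first. */
    if (!invert && (registered & fired_mask & MINI_READABLE)) out[n++] = 'R';
    /* The writable handler runs in both cases. */
    if (registered & fired_mask & MINI_WRITABLE) out[n++] = 'W';
    /* Inverted case: the readable handler runs after the writable one. */
    if (invert && (registered & fired_mask & MINI_READABLE)) out[n++] = 'R';
    out[n] = '\0';
    return n;
}
```

For an fd registered for both reads and writes where both events fire, the order is read then write; adding MINI_BARRIER to the registration flips it to write then read.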

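2. Main read and write flow

The write path goes roughly as follows: the network layer prepares the data, the Redis server parses it and looks up the command function, and then calls a function such as XXXgenericCommand to process the command. The first thing the command callback does is encoding optimization, which involves Redis's storage encodings and data structures (quicklist, ziplist, skiplist, ht, intset, and so on). The value is then stored in the db, interleaved with incremental rehash progress; while a rehash is in progress, new data is written into ht[1]. See the source code for the exact data layouts.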
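The remark about rehashing, where new data goes into ht[1] while a rehash is in progress, can be illustrated with a toy two-table dictionary. This is a heavily simplified sketch and not Redis's dict implementation: it uses non-negative integer keys, key % size as the hash, one migrated bucket per insert, no duplicate-key handling, and no allocation error handling. All mini_ names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct mini_entry { int key, val; struct mini_entry *next; } mini_entry;
typedef struct { mini_entry **table; size_t size, used; } mini_ht;
typedef struct { mini_ht ht[2]; long rehashidx; } mini_dict; /* -1: not rehashing */

static void mini_ht_init(mini_ht *h, size_t size) {
    h->table = size ? calloc(size, sizeof(mini_entry *)) : NULL;
    h->size = size;
    h->used = 0;
}

static void mini_dict_init(mini_dict *d) {
    mini_ht_init(&d->ht[0], 4);
    mini_ht_init(&d->ht[1], 0);
    d->rehashidx = -1;
}

/* Move one non-empty bucket from ht[0] to ht[1]. When ht[0] becomes
 * empty, ht[1] takes its place and the rehash is finished. */
static void mini_rehash_step(mini_dict *d) {
    if (d->rehashidx < 0) return;
    mini_ht *src = &d->ht[0], *dst = &d->ht[1];
    while ((size_t)d->rehashidx < src->size && src->table[d->rehashidx] == NULL)
        d->rehashidx++;
    if ((size_t)d->rehashidx < src->size) {
        mini_entry *e = src->table[d->rehashidx];
        while (e) {
            mini_entry *next = e->next;
            size_t i = (size_t)e->key % dst->size;
            e->next = dst->table[i];
            dst->table[i] = e;
            src->used--;
            dst->used++;
            e = next;
        }
        src->table[d->rehashidx++] = NULL;
    }
    if (src->used == 0) {
        free(src->table);
        d->ht[0] = d->ht[1];
        mini_ht_init(&d->ht[1], 0);
        d->rehashidx = -1;
    }
}

static void mini_dict_add(mini_dict *d, int key, int val) {
    assert(key >= 0);
    /* Start a rehash into a table twice the size once ht[0] is full. */
    if (d->rehashidx < 0 && d->ht[0].used >= d->ht[0].size) {
        mini_ht_init(&d->ht[1], d->ht[0].size * 2);
        d->rehashidx = 0;
    }
    mini_rehash_step(d); /* every insert pushes the rehash forward a little */
    /* While rehashing, new entries always go into ht[1]. */
    mini_ht *h = (d->rehashidx >= 0) ? &d->ht[1] : &d->ht[0];
    mini_entry *e = malloc(sizeof(*e));
    e->key = key;
    e->val = val;
    size_t i = (size_t)key % h->size;
    e->next = h->table[i];
    h->table[i] = e;
    h->used++;
}

/* Lookups must check both tables while a rehash is in progress. */
static int mini_dict_get(const mini_dict *d, int key, int *val) {
    for (int t = 0; t < 2; t++) {
        const mini_ht *h = &d->ht[t];
        if (h->size == 0) continue;
        for (mini_entry *e = h->table[(size_t)key % h->size]; e; e = e->next)
            if (e->key == key) { *val = e->val; return 1; }
    }
    return 0;
}
```

Because new entries never land in ht[0] during a rehash, ht[0] can only shrink, which guarantees that the migration eventually finishes.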

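2.1 Source code

Read and write events are handled in aeProcessEvents(aeEventLoop *eventLoop, int flags) in ae.c. The core is numevents = aeApiPoll(eventLoop, tvp), which obtains events through a system call: aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) in ae_epoll.c calls the system function below to get the events that have fired, and they are then dispatched to the corresponding handlers registered in initServer().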

epoll_wait(state->epfd,state->events,eventLoop->setsize,
            tvp ? (tvp->tv_sec*1000 + (tvp->tv_usec + 999)/1000) : -1);
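The timeout expression in this call converts a struct timeval back into the millisecond value that epoll_wait expects, rounding microseconds up and mapping a NULL pointer to -1 (block indefinitely). The small sketch below spells out both directions of the conversion; mini_ms_to_timeval and mini_epoll_timeout are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Split a millisecond delay into seconds and microseconds, the same way
 * aeProcessEvents fills in tv from msUntilTimer. */
static struct timeval mini_ms_to_timeval(long ms) {
    struct timeval tv;
    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return tv;
}

/* Convert a timeval into an epoll_wait timeout in milliseconds.
 * NULL means "block until an event arrives" (-1). Sub-millisecond
 * remainders are rounded up so the call never wakes up too early. */
static int mini_epoll_timeout(const struct timeval *tvp) {
    if (tvp == NULL) return -1;
    return (int)(tvp->tv_sec * 1000 + (tvp->tv_usec + 999) / 1000);
}
```

A delay of 1500 ms, for example, becomes tv_sec = 1 and tv_usec = 500000 and converts back to 1500, while a timeval of 1 microsecond is rounded up to a 1 ms timeout.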

Read events are handled by fe->rfileProc(eventLoop,fd,fe->clientData,mask); the corresponding handler is readQueryFromClient. Write events are handled by:

fe->wfileProc(eventLoop,fd,fe->clientData,mask);
fired++;
