[Original] Tornado Source Code Analysis Series (6) [HTTPServer in Detail]

Latest recommended article published on 2024-11-14 19:52:49

weixin_34248023


Tags: operations, python

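This article takes an in-depth look at how HTTPServer works: the basic listen and bind operations, the role of the _handle_events function, and how the number of processes is set from the CPU count. It also explains how HTTPServer handles HTTP requests and passes request information along by creating HTTPRequest objects.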


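Introduction: the previous chapter covered the principles behind HTTPServer; this time we read the source code to pick up more of the details.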

Let's look at how the HTTPServer class is organized:

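The main responsibilities of HTTPServer

1. It provides basic shared operations such as listen and bind.

2. It implements _handle_events(), the callback that gets registered with the IOLoop for the listening socket. This function is the entry point from the IOLoop into the server, and it creates a separate IOStream object for every connection.

3. It provides start(), which starts the HTTPServer and sets the relevant parameters (for example, choosing the number of processes from the number of CPUs).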

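The HTTPServer constructor shows that its most important argument is the callback function, which is used to handle the request object.

For every HTTP request, HTTPConnection builds an HTTPRequest object that holds all the information about that request.

So at the HTTPServer layer, the callback processes this object and then calls request.write to send the result back to the client.

The callback is first registered with HTTPServer and then passed on to HTTPConnection, because the request object is produced by the HTTPConnection object.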
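To make this registration chain concrete, here is a toy sketch. ToyServer, ToyConnection, and ToyRequest are hypothetical stand-ins invented for illustration. They are not Tornado classes, and they skip all socket handling and HTTP parsing. The only thing the sketch tries to show is how one callback travels from the server down to the connection, which then calls it with the request object it built.

```python
class ToyRequest(object):
    """Hypothetical stand-in for HTTPRequest: request data plus a way to reply."""

    def __init__(self, uri, connection):
        self.uri = uri
        self.connection = connection

    def write(self, chunk):
        # The reply goes back through the connection that created this request.
        self.connection.sent.append(chunk)


class ToyConnection(object):
    """Hypothetical stand-in for HTTPConnection: builds requests, fires the callback."""

    def __init__(self, request_callback):
        self.request_callback = request_callback
        self.sent = []

    def on_request_parsed(self, uri):
        # The connection is where the request object is produced.
        request = ToyRequest(uri, self)
        self.request_callback(request)


class ToyServer(object):
    """Hypothetical stand-in for HTTPServer: keeps the callback, passes it down."""

    def __init__(self, request_callback):
        self.request_callback = request_callback

    def on_new_connection(self):
        return ToyConnection(self.request_callback)


def handle_request(request):
    # User-level callback: receives the request and writes the response.
    request.write("You requested %s" % request.uri)


server = ToyServer(handle_request)
conn = server.on_new_connection()
conn.on_request_parsed("/index")
```

After the last line, conn.sent holds the single reply written by handle_request, even though the server and the connection never looked inside the request themselves.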

    def _handle_events(self, fd, events):
        while True:
            try:
                connection, address = self._socket.accept()
            except socket.error, e:
                # The accept queue is empty: return to the IOLoop
                if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
                    return
                raise
            # SSL options
            if self.ssl_options is not None:
                assert ssl, "Python 2.6+ and OpenSSL required for SSL"
                try:
                    connection = ssl.wrap_socket(connection,
                                                 server_side=True,
                                                 do_handshake_on_connect=False,
                                                 **self.ssl_options)
                except ssl.SSLError, err:
                    if err.args[0] == ssl.SSL_ERROR_EOF:
                        return connection.close()
                    else:
                        raise
                except socket.error, err:
                    if err.args[0] == errno.ECONNABORTED:
                        return connection.close()
                    else:
                        raise
            try:
                if self.ssl_options is not None:
                    stream = iostream.SSLIOStream(connection, io_loop=self.io_loop)
                else:
                    # Create one IOStream instance per connection; all later
                    # I/O on the connection is handled by this instance.
                    # The IOLoop handler here is only responsible for accept.
                    stream = iostream.IOStream(connection, io_loop=self.io_loop)

                # Hand the stream, its address, and the callback to HTTPConnection
                # (HTTPConnection is covered later).
                # request_callback is what the demo passed in through
                # httpserver.HTTPServer(handle_request).
                # Many modern HTTP frameworks use this pattern: the
                # handle_request callback is registered layer by layer downward
                # until the request is actually handled. Typically the callback
                # keeps being passed down until some class method is able to
                # hand it a request object.
                HTTPConnection(stream, address, self.request_callback,
                               self.no_keep_alive, self.xheaders)
            except:
                logging.error("Error in connection callback", exc_info=True)

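By constructing an HTTPConnection and passing it stream, address, and request_callback, the callback that handles the request gets registered with HTTPConnection.

One more thing to note: each time a connection arrives, the handler the IOLoop calls for the listening socket only accepts that connection; all later I/O on it is handed over to the IOStream.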
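The accept step can be isolated into a small sketch. The helper name accept_all_pending is made up for this example and is not part of Tornado. It assumes the listening socket has already been put into non-blocking mode, and it mirrors the while True loop in the listing: keep accepting until the kernel reports that the queue is empty.

```python
import errno
import socket


def accept_all_pending(listener):
    """Accept every connection already queued on a non-blocking listener.

    Hypothetical helper for illustration; returns the accepted sockets.
    """
    accepted = []
    while True:
        try:
            connection, address = listener.accept()
        except socket.error as e:
            # EWOULDBLOCK / EAGAIN means the accept queue is empty, so stop
            # and give control back to the event loop.
            if e.args[0] in (errno.EWOULDBLOCK, errno.EAGAIN):
                return accepted
            raise
        accepted.append(connection)
```

Draining the queue in a loop matters because one readiness notification can stand for several pending connections. In the real server each accepted socket would be wrapped in an IOStream at this point rather than collected in a list.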

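In start(), each process gets its own IOLoop, and every one of these IOLoops uses _handle_events() as the callback for the listening socket.

The overall flow around _handle_events() is as follows:

1. It is registered with the IOLoop of the current process.

2. Only the READ event is registered for the listening socket, so all it does is accept new connections; every time a connection arrives, _handle_events() is called back.

3. It accepts the new connection and creates an IOStream object for it. From then on, that IOStream is responsible for all I/O on the connection. This is a layer of abstraction: the IOStream's read and write events are in fact registered with the same per-process IOLoop, only with a different callback, because the descriptor being registered is different.

The IOLoop dispatches by descriptor, in the style of handler[fd](). So the listening socket always triggers _handle_events(), while the fd of an IOStream connection triggers the callbacks registered through read_bytes() and read_until().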
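The dispatch-by-descriptor idea can be sketched with a very small reactor. TinyLoop is a hypothetical class written for this example and is not Tornado's IOLoop: it uses select instead of epoll, only watches for readability, and has no error handling. It is meant to show one thing, that the loop keeps a table from fd to callback and calls whichever callback belongs to the fd that became ready.

```python
import select
import socket


class TinyLoop(object):
    """Toy reactor (illustration only): one callback per file descriptor."""

    def __init__(self):
        self._handlers = {}

    def add_handler(self, fd, handler):
        self._handlers[fd] = handler

    def remove_handler(self, fd):
        self._handlers.pop(fd, None)

    def run_once(self, timeout=1.0):
        # Wait until at least one registered fd is readable (or time out).
        readable, _, _ = select.select(list(self._handlers), [], [], timeout)
        for fd in readable:
            # Look the callback up by fd; it may have been removed meanwhile.
            handler = self._handlers.get(fd)
            if handler is not None:
                handler(fd)
        return len(readable)


loop = TinyLoop()
a_read, a_write = socket.socketpair()
b_read, b_write = socket.socketpair()
calls = []

# Two descriptors, two different callbacks, one shared loop.
loop.add_handler(a_read.fileno(), lambda fd: calls.append(("a", a_read.recv(16))))
loop.add_handler(b_read.fileno(), lambda fd: calls.append(("b", b_read.recv(16))))

b_write.send(b"ping")
loop.run_once()
```

Only the callback registered for b_read runs, because only that descriptor became readable. In the same way, the listening socket's fd leads to _handle_events() and a connection's fd leads to the callbacks of its IOStream.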

Now let's look at the start() function in HTTPServer:

    def start(self, num_processes=1):
        assert not self._started
        self._started = True
        if num_processes is None or num_processes <= 0:
            num_processes = _cpu_count()
        if num_processes > 1 and ioloop.IOLoop.initialized():
            logging.error("Cannot run in multiple processes: IOLoop instance "
                          "has already been initialized. You cannot call "
                          "IOLoop.instance() before calling start()")
            num_processes = 1
        if num_processes > 1:
            logging.info("Pre-forking %d server processes", num_processes)
            # Fork num_processes workers (the CPU count when it was not given)
            for i in range(num_processes):
                if os.fork() == 0:  # fork() returns 0 in the child process
                    import random
                    from binascii import hexlify
                    try:
                        # If available, use the same method as
                        # random.py
                        seed = long(hexlify(os.urandom(16)), 16)
                    except NotImplementedError:
                        # Include the pid to avoid initializing two
                        # processes to the same value
                        seed = int(time.time() * 1000) ^ os.getpid()
                    random.seed(seed)
                    # Create one IOLoop instance for each process
                    self.io_loop = ioloop.IOLoop.instance()
                    # Register the callback on each IOLoop; this is the same
                    # uniform callback mechanism that IOStream uses
                    self.io_loop.add_handler(
                        self._socket.fileno(), self._handle_events,
                        ioloop.IOLoop.READ)
                    return
            # The parent blocks here until a child exits and reaps it, which
            # avoids a zombie process (covered at length in Unix programming texts)
            os.waitpid(-1, 0)
        else:
            if not self.io_loop:
                self.io_loop = ioloop.IOLoop.instance()
            self.io_loop.add_handler(self._socket.fileno(),
                                     self._handle_events,
                                     ioloop.IOLoop.READ)

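As the comments in the code show, the server runs one process, and therefore one IOLoop instance, per CPU. As for why a random seed is generated in the middle, the comment in the source gives a hint: without it, two processes could be initialized to the same value. If anyone knows more about the reason, please let me know.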
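One plausible reading of that hint is that a forked child starts with a copy of the parent's memory, including the state of the random module, so every child would otherwise produce the same "random" sequence. The sketch below imitates this without calling fork(): it copies one generator state into two generators and then reseeds them. The function name fresh_seed and the child_a / child_b objects are made up for the example, and this is an interpretation, not something the Tornado source states in so many words.

```python
import os
import random
import time
from binascii import hexlify


def fresh_seed():
    """Build a per-process seed with the same fallback idea as the listing."""
    try:
        return int(hexlify(os.urandom(16)), 16)
    except NotImplementedError:
        # No OS randomness source: mix the clock with the pid instead.
        return int(time.time() * 1000) ^ os.getpid()


# Imitate fork(): both "children" begin with a copy of the parent's state.
parent = random.Random(12345)
inherited_state = parent.getstate()

child_a = random.Random()
child_b = random.Random()
child_a.setstate(inherited_state)
child_b.setstate(inherited_state)

# Without reseeding, both children draw exactly the same number.
same_a, same_b = child_a.random(), child_b.random()

# After reseeding, their sequences are independent of each other.
child_a.seed(fresh_seed())
child_b.seed(fresh_seed())
diff_a, diff_b = child_a.random(), child_b.random()
```

Here same_a and same_b are equal, while diff_a and diff_b come from independently seeded generators. For a server this matters wherever worker processes generate values that are supposed to be unpredictable or distinct from each other.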

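At the end of start() you can see add_handler registering the listening socket together with _handle_events() in the IOLoop; this is the connection-handling process of HTTPServer described above.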

Summary: I don't feel like writing a summary. To be continued...