Tornado Source Code Analysis

Latest recommended article published on 2022-07-29 13:41:57

LoveIT小五


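Tornado's web framework is implemented in web.py. It is built mainly around two classes: RequestHandler, which is essentially a wrapper around the handling of an HTTP request, and Application, a set of request handlers that together form a web application. The source comment says it best untranslated: "A collection of request handlers that make up a web application".

RequestHandler Analysis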

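RequestHandler is the base class that wraps HTTP request handling. It has quite a few methods, which fall into the following groups:

Interface methods for the HTTP verbs GET/HEAD/POST/DELETE/PATCH/PUT/OPTIONS. In application code, subclasses of RequestHandler override these methods to handle requests as their needs dictate.

Methods for processing the HTTP request, covering headers, page elements and cookies.

A set of response features, including redirect, write (which writes data to the output buffer) and template rendering (render, render_string).

Other helpers, such as finishing the request/response, flushing the output buffer, and handling user authorization.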
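The override pattern in the first point can be modelled in a few lines of plain Python. This is a simplified stand-in rather than Tornado's code: `MiniRequestHandler`, `HelloHandler`, `MethodNotAllowed` and `dispatch` are hypothetical names chosen for illustration, and the real RequestHandler does far more (headers, cookies, templates).

```python
class MethodNotAllowed(Exception):
    """Raised when a handler does not implement the requested HTTP method."""


class MiniRequestHandler:
    """Hypothetical stand-in for a request-handler base class."""

    SUPPORTED_METHODS = ("GET", "HEAD", "POST", "DELETE", "PATCH", "PUT", "OPTIONS")

    def __init__(self):
        self._write_buffer = []  # output buffer filled by write()

    def write(self, chunk):
        """Append a chunk to the output buffer."""
        self._write_buffer.append(chunk)

    def _unsupported(self, *args, **kwargs):
        raise MethodNotAllowed()

    # Every verb is rejected until a subclass overrides it.
    get = head = post = delete = patch = put = options = _unsupported

    def dispatch(self, method):
        """Call the method named after the HTTP verb and return the buffered output."""
        if method not in self.SUPPORTED_METHODS:
            raise MethodNotAllowed(method)
        getattr(self, method.lower())()
        return "".join(self._write_buffer)


class HelloHandler(MiniRequestHandler):
    # Only GET is overridden, so only GET is served.
    def get(self):
        self.write("Hello, ")
        self.write("world")
```

Calling `HelloHandler().dispatch("GET")` returns the buffered text, while any verb the subclass did not override raises `MethodNotAllowed`.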

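Application Analysis

The comment in the source code is very well written: "A collection of request handlers that make up a web application. Instances of this class are callable and can be passed directly to HTTPServer to serve the application." The first constructor argument is a list of (regexp, request_class) tuples that maps URL patterns to the handler for each kind of request, including requests for static files (web.StaticFileHandler). Application implements __call__, which makes its instances callable so that HTTPServer can invoke them. For example, in the following code from the HTTPConnection class in httpserver.py, request_callback is the Application object.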

def _on_headers(self, data):
    # some code...
    self.request_callback(self._request)

__call__ walks the Application's handlers list. Once a URL pattern matches, the request is processed through handler._execute; if no pattern matches, ErrorHandler is invoked instead.
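The matching loop can be sketched with the standard library alone. The names below (`MiniApplication`, `index`, `show_user`, `not_found`) are hypothetical, and plain functions stand in for handler classes; the sketch only shows a callable object that picks the first matching pattern and falls back to an error handler.

```python
import re


class MiniApplication:
    """Hypothetical callable that routes a path to the first matching handler."""

    def __init__(self, handlers, error_handler):
        # Compile each (regexp, handler) pair once; "$" anchors the end of the path.
        self.handlers = [(re.compile(pattern + "$"), handler)
                         for pattern, handler in handlers]
        self.error_handler = error_handler

    def __call__(self, path):
        for pattern, handler in self.handlers:
            match = pattern.match(path)
            if match:
                # Captured groups become positional arguments of the handler.
                return handler(*match.groups())
        # No pattern matched: fall back to the error handler.
        return self.error_handler(path)


def index():
    return "index page"


def show_user(user_id):
    return "user " + user_id


def not_found(path):
    return "404: " + path


app = MiniApplication([(r"/", index), (r"/user/(\d+)", show_user)], not_found)
```

Because the instance defines `__call__`, `app("/user/42")` can be used wherever a plain function would be accepted.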

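Application takes a debug argument at construction time. With debug=True, the program reloads itself automatically whenever its code or modules are modified, which gives an auto-reload feature. The feature is implemented in autoreload.py. The check for whether a reload is needed runs periodically on the IOLoop (the start function in the appendix schedules it through a PeriodicCallback with a default check_time of 500 ms). For every module in sys.modules and every file in _watched_files, it compares the last-modified time recorded by the program with the one reported by the file system; if the two differ, the whole program is restarted.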

def _reload_on_update(modify_times):
    for module in sys.modules.values():
        # module test and some path handling
        _check_file(modify_times, path)
    for path in _watched_files:
        _check_file(modify_times, path)
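The modification-time comparison can be tried in isolation. The sketch below is a reduced variant of the check, not the module's own function: `check_file` returns a boolean instead of restarting the process, and `demo` is a hypothetical helper that simulates an edit on a temporary file with `os.utime`.

```python
import os
import tempfile


def check_file(modify_times, path):
    """Return True if `path` changed since it was first seen."""
    try:
        modified = os.stat(path).st_mtime
    except OSError:
        return False  # missing or unreadable files are ignored
    if path not in modify_times:
        modify_times[path] = modified  # first sighting: just record the time
        return False
    return modify_times[path] != modified


def demo():
    """Record a file, check it unchanged, then check it after a simulated edit."""
    modify_times = {}
    with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as f:
        path = f.name
    try:
        first = check_file(modify_times, path)   # records the mtime
        second = check_file(modify_times, path)  # nothing changed yet
        recorded = modify_times[path]
        # Simulate an edit by moving the modification time forward.
        os.utime(path, (recorded + 10, recorded + 10))
        third = check_file(modify_times, path)
        return first, second, third
    finally:
        os.remove(path)
```

Only the third check reports a change; the first call merely records the timestamp, which is why a freshly started process never reloads immediately.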

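Tornado's autoreload module exposes a main entry point, so test.py can be run with auto-reload as shown below. In my tests, though, its functionality was limited, and it falls somewhat short of Django's autoreload module, which is better encapsulated and more complete. The main reason is that Tornado's implementation is coupled to the ioloop, so autoreload cannot stand alone as an independent module.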

# tornado
python -m tornado.autoreload test.py [args...]

# django
from django.utils import autoreload
autoreload.main(your_main_func)  # your_main_func: placeholder for your own entry-point function

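The asynchronous Method

This method is normally used as a decorator on a request-handling method to make it asynchronous. A request handler decorated with @asynchronous keeps its connection open: until self.finish is called, the connection stays in a waiting state.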
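The effect of the decorator can be modelled without Tornado. In this sketch the decorator only switches off automatic finishing, so the request stays open until `finish` is called explicitly. All names here (`MiniHandler`, `SyncHandler`, `AsyncHandler`, `execute`, `pending_callback`) and the `_auto_finish` / `_finished` flags are assumptions of the model, not a copy of the library's implementation.

```python
import functools


def asynchronous(method):
    """Hypothetical decorator: the request is no longer finished automatically."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self._auto_finish = False
        return method(self, *args, **kwargs)
    return wrapper


class MiniHandler:
    """Hypothetical handler base class with an explicit finished flag."""

    def __init__(self):
        self._auto_finish = True
        self._finished = False

    def finish(self):
        self._finished = True

    def execute(self):
        # Run the handler method, then finish unless the decorator opted out.
        self.get()
        if self._auto_finish and not self._finished:
            self.finish()


class SyncHandler(MiniHandler):
    def get(self):
        pass  # finished automatically when get() returns


class AsyncHandler(MiniHandler):
    @asynchronous
    def get(self):
        # Pretend a slow operation was started; its callback finishes the request later.
        self.pending_callback = self.finish
```

After `execute()`, a `SyncHandler` is already finished, whereas an `AsyncHandler` stays open until its stored callback runs.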

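Summary

The article "Tornado Source Code Analysis 2" gave a flowchart of how Tornado's httpserver works; the call into Application happens in the handle_request ellipse inside the large HTTPConnection box. That article used a simple request-handling function, handle_request. Whether it is handle_request or application, it is in essence a function (a callable object) that the server invokes, once it has accepted the connection and read the HTTP request headers, to process the request and send the response.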

http_server = httpserver.HTTPServer(handle_request)

http_server = httpserver.HTTPServer(application)
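That interchangeability follows from the server only requiring a callable. A minimal sketch, in which `MiniServer`, `MiniApp` and the string "requests" are hypothetical and stand in for the real server and request objects:

```python
class MiniServer:
    """Hypothetical server that hands each request to a stored callable."""

    def __init__(self, request_callback):
        if not callable(request_callback):
            raise TypeError("request_callback must be callable")
        self.request_callback = request_callback

    def handle(self, request):
        # The server does not care whether this is a function or an object.
        return self.request_callback(request)


def handle_request(request):
    return "function saw " + request


class MiniApp:
    # Defining __call__ makes instances usable like a function.
    def __call__(self, request):
        return "application saw " + request


function_server = MiniServer(handle_request)
application_server = MiniServer(MiniApp())
```

Both servers are built the same way; only the object behind `request_callback` differs.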

Appendix: source code of autoreload.py (in the original post, the parts implementing automatic reloading under debug=True were highlighted in red)

#!/usr/bin/env python
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

"""Automatically restart the server when a source file is modified.

Most applications should not access this module directly. Instead, pass the
keyword argument ``debug=True`` to the `tornado.web.Application` constructor.
This will enable autoreload mode as well as checking for changes to templates
and static resources. Note that restarting is a destructive operation
and any requests in progress will be aborted when the process restarts.

This module can also be used as a command-line wrapper around scripts
such as unit test runners. See the `main` method for details.

The command-line wrapper and Application debug modes can be used together.
This combination is encouraged as the wrapper catches syntax errors and
other import-time failures, while debug mode catches changes once
the server has started.

This module depends on `.IOLoop`, so it will not work in WSGI applications
and Google App Engine. It also will not work correctly when `.HTTPServer`'s
multi-process mode is used.

Reloading loses any Python interpreter command-line arguments (e.g. ``-u``)
because it re-executes Python using ``sys.executable`` and ``sys.argv``.
Additionally, modifying these variables will cause reloading to behave
incorrectly.
"""

from __future__ import absolute_import, division, print_function, with_statement

import os
import sys

# sys.path handling
# -----------------
# If a module is run with "python -m", the current directory (i.e. "")
# is automatically prepended to sys.path, but not if it is run as
# "path/to/file.py". The processing for "-m" rewrites the former to
# the latter, so subsequent executions won't have the same path as the
# original.
# Conversely, when run as path/to/file.py, the directory containing
# file.py gets added to the path, which can cause confusion as imports
# may become relative in spite of the future import.
# We address the former problem by setting the $PYTHONPATH environment
# variable before re-execution so the new process will see the correct
# path. We attempt to address the latter problem when tornado.autoreload
# is run as __main__, although we can't fix the general case because
# we cannot reliably reconstruct the original command line
# (http://bugs.python.org/issue14208).

if __name__ == "__main__":
    # This sys.path manipulation must come before our imports (as much
    # as possible - if we introduced a tornado.sys or tornado.os
    # module we'd be in trouble), or else our imports would become
    # relative again despite the future import.
    # There is a separate __main__ block at the end of the file to call main().
    if sys.path[0] == os.path.dirname(__file__):
        del sys.path[0]

import functools
import logging
import os
import pkgutil
import sys
import traceback
import types
import subprocess
import weakref

from tornado import ioloop
from tornado.log import gen_log
from tornado import process
from tornado.util import exec_in

try:
    import signal
except ImportError:
    signal = None

_watched_files = set()
_reload_hooks = []
_reload_attempted = False
_io_loops = weakref.WeakKeyDictionary()

def start(io_loop=None, check_time=500):
    """Begins watching source files for changes using the given `.IOLoop`. """
    io_loop = io_loop or ioloop.IOLoop.current()
    if io_loop in _io_loops:
        return
    _io_loops[io_loop] = True
    if len(_io_loops) > 1:
        gen_log.warning("tornado.autoreload started more than once in the same process")
    add_reload_hook(functools.partial(io_loop.close, all_fds=True))
    modify_times = {}
    callback = functools.partial(_reload_on_update, modify_times)
    scheduler = ioloop.PeriodicCallback(callback, check_time, io_loop=io_loop)
    scheduler.start()

def wait():
    """Wait for a watched file to change, then restart the process.

    Intended to be used at the end of scripts like unit test runners,
    to run the tests again after any source file changes (but see also
    the command-line interface in `main`)
    """
    io_loop = ioloop.IOLoop()
    start(io_loop)
    io_loop.start()

def watch(filename):
    """Add a file to the watch list.

    All imported modules are watched by default.
    """
    _watched_files.add(filename)

def add_reload_hook(fn):
    """Add a function to be called before reloading the process.

    Note that for open file and socket handles it is generally
    preferable to set the ``FD_CLOEXEC`` flag (using `fcntl` or
    ``tornado.platform.auto.set_close_exec``) instead
    of using a reload hook to close them.
    """
    _reload_hooks.append(fn)

def _reload_on_update(modify_times):
    if _reload_attempted:
        # We already tried to reload and it didn't work, so don't try again.
        return
    if process.task_id() is not None:
        # We're in a child process created by fork_processes. If child
        # processes restarted themselves, they'd all restart and then
        # all call fork_processes again.
        return
    for module in sys.modules.values():
        # Some modules play games with sys.modules (e.g. email/__init__.py
        # in the standard library), and occasionally this can cause strange
        # failures in getattr. Just ignore anything that's not an ordinary
        # module.
        if not isinstance(module, types.ModuleType):
            continue
        path = getattr(module, "__file__", None)
        if not path:
            continue
        if path.endswith(".pyc") or path.endswith(".pyo"):
            path = path[:-1]
        _check_file(modify_times, path)
    for path in _watched_files:
        _check_file(modify_times, path)

def _check_file(modify_times, path):
    try:
        modified = os.stat(path).st_mtime
    except Exception:
        return
    if path not in modify_times:
        modify_times[path] = modified
        return
    if modify_times[path] != modified:
        gen_log.info("%s modified; restarting server", path)
        _reload()

def _reload():
    global _reload_attempted
    _reload_attempted = True
    for fn in _reload_hooks:
        fn()
    if hasattr(signal, "setitimer"):
        # Clear the alarm signal set by
        # ioloop.set_blocking_log_threshold so it doesn't fire
        # after the exec.
        signal.setitimer(signal.ITIMER_REAL, 0, 0)
    # sys.path fixes: see comments at top of file. If sys.path[0] is an empty
    # string, we were (probably) invoked with -m and the effective path
    # is about to change on re-exec. Add the current directory to $PYTHONPATH
    # to ensure that the new process sees the same path we did.
    path_prefix = '.' + os.pathsep
    if (sys.path[0] == '' and
            not os.environ.get("PYTHONPATH", "").startswith(path_prefix)):
        os.environ["PYTHONPATH"] = (path_prefix +
                                    os.environ.get("PYTHONPATH", ""))
    if sys.platform == 'win32':
        # os.execv is broken on Windows and can't properly parse command line
        # arguments and executable name if they contain whitespaces. subprocess
        # fixes that behavior.
        subprocess.Popen([sys.executable] + sys.argv)
        sys.exit(0)
    else:
        try:
            os.execv(sys.executable, [sys.executable] + sys.argv)
        except OSError:
            # Mac OS X versions prior to 10.6 do not support execv in
            # a process that contains multiple threads. Instead of
            # re-executing in the current process, start a new one
            # and cause the current process to exit. This isn't
            # ideal since the new process is detached from the parent
            # terminal and thus cannot easily be killed with ctrl-C,
            # but it's better than not being able to autoreload at
            # all.
            # Unfortunately the errno returned in this case does not
            # appear to be consistent, so we can't easily check for
            # this error specifically.
            os.spawnv(os.P_NOWAIT, sys.executable,
                      [sys.executable] + sys.argv)
            sys.exit(0)

_USAGE = """\
Usage:
  python -m tornado.autoreload -m module.to.run [args...]
  python -m tornado.autoreload path/to/script.py [args...]
"""

def main():
    """Command-line wrapper to re-run a script whenever its source changes.

    Scripts may be specified by filename or module name::

        python -m tornado.autoreload -m tornado.test.runtests
        python -m tornado.autoreload tornado/test/runtests.py

    Running a script with this wrapper is similar to calling
    `tornado.autoreload.wait` at the end of the script, but this wrapper
    can catch import-time problems like syntax errors that would otherwise
    prevent the script from reaching its call to `wait`.
    """
    original_argv = sys.argv
    sys.argv = sys.argv[:]
    if len(sys.argv) >= 3 and sys.argv[1] == "-m":
        mode = "module"
        module = sys.argv[2]
        del sys.argv[1:3]
    elif len(sys.argv) >= 2:
        mode = "script"
        script = sys.argv[1]
        sys.argv = sys.argv[1:]
    else:
        print(_USAGE, file=sys.stderr)
        sys.exit(1)

    try:
        if mode == "module":
            import runpy
            runpy.run_module(module, run_name="__main__", alter_sys=True)
        elif mode == "script":
            with open(script) as f:
                global __file__
                __file__ = script
                # Use globals as our "locals" dictionary so that
                # something that tries to import __main__ (e.g. the unittest
                # module) will see the right things.
                exec_in(f.read(), globals(), globals())
    except SystemExit as e:
        logging.basicConfig()
        gen_log.info("Script exited with status %s", e.code)
    except Exception as e:
        logging.basicConfig()
        gen_log.warning("Script exited with uncaught exception", exc_info=True)
        # If an exception occurred at import time, the file with the error
        # never made it into sys.modules and so we won't know to watch it.
        # Just to make sure we've covered everything, walk the stack trace
        # from the exception and watch every file.
        for (filename, lineno, name, line) in traceback.extract_tb(sys.exc_info()[2]):
            watch(filename)
        if isinstance(e, SyntaxError):
            # SyntaxErrors are special: their innermost stack frame is fake
            # so extract_tb won't see it and we have to get the filename
            # from the exception object.
            watch(e.filename)
    else:
        logging.basicConfig()
        gen_log.info("Script exited normally")
    # restore sys.argv so subsequent executions will include autoreload
    sys.argv = original_argv

    if mode == 'module':
        # runpy did a fake import of the module as __main__, but now it's
        # no longer in sys.modules. Figure out where it is and watch it.
        loader = pkgutil.get_loader(module)
        if loader is not None:
            watch(loader.get_filename())

    wait()

if __name__ == "__main__":
    # See also the other __main__ block at the top of the file, which modifies
    # sys.path before our imports
    main()
