一直在用flask写一些小型的后台服务。有余力之下,去研究了一下flask的源码。不得不赞叹flask对于python的运用之炉火纯青。下面开始分析。
flask框架使用了库werkzeug,werkzeug是基于WSGI的,WSGI是什么呢?(Web Server Gateway Interface),WSGI是一个协议,通俗翻译Web服务器网关接口,该协议把接收自客户端的所有请求都转交给这个对象处理。
基于如上背景知识,这里不只是研究了flask的源码,也会把涉及到的werkzeug库的源码,以及werkzeug库使用到的python基本库的源码列出来,力求能够把整个逻辑给陈列清楚。
服务类
Flask:
应用程序类,该类直接面向用户。所有的Flask程序都必须创建这样一个程序实例。简单点儿可以app=Flask(__name__),Flask可以利用这个参数决定程序的根目录。
* 类成员:
Request
Response
* 类方法:
run
run方法调用werkzeug库中的一个run_simple函数来启动BaseWSGIServer。
def run_simple(hostname, port, application, use_reloader=False,
use_debugger=False, use_evalex=True,
extra_files=None, reloader_interval=1,
reloader_type='auto', threaded=False,
processes=1, request_handler=None, static_files=None,
passthrough_errors=False, ssl_context=None):
if use_debugger:
from werkzeug.debug import DebuggedApplication
application = DebuggedApplication(application, use_evalex)
if static_files:
from werkzeug.wsgi import SharedDataMiddleware
application = SharedDataMiddleware(application, static_files)
def inner():
try:
fd = int(os.environ['WERKZEUG_SERVER_FD'])
except (LookupError, ValueError):
fd = None
make_server(hostname, port, application, threaded,
processes, request_handler,
passthrough_errors, ssl_context,
fd=fd).serve_forever()
if use_reloader:
# If we're not running already in the subprocess that is the
# reloader we want to open up a socket early to make sure the
# port is actually available.
if os.environ.get('WERKZEUG_RUN_MAIN') != 'true':
if port == 0 and not can_open_by_fd:
raise ValueError('Cannot bind to a random port with enabled '
'reloader if the Python interpreter does '
'not support socket opening by fd.')
# Create and destroy a socket so that any exceptions are
# raised before we spawn a separate Python interpreter and
# lose this ability.
address_family = select_ip_version(hostname, port)
s = socket.socket(address_family, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind((hostname, port))
if hasattr(s, 'set_inheritable'):
s.set_inheritable(True)
# If we can open the socket by file descriptor, then we can just
# reuse this one and our socket will survive the restarts.
if can_open_by_fd:
os.environ['WERKZEUG_SERVER_FD'] = str(s.fileno())
s.listen(LISTEN_QUEUE)
else:
s.close()
from ._reloader import run_with_reloader
run_with_reloader(inner, extra_files, reloader_interval,
reloader_type)
else:
inner()
make_server创建http服务器,server_forever启动服务监听端口。使用select异步处理。
BaseWSGIServer:
WSGI服务器,继承自HTTPServer。
* 类成员:
app;初始化的时候将Flask的实例传入。
* 类方法:
__init__ ;初始化完成后启动HTTPServer,并将请求处理类WSGIRequestHandler传入。
HTTPServer
HTTP服务器。python基础库(BaseHTTPServer.py)。该类实现了一个server_bind方法用来绑定服务器地址和端口,可不必理会。继承自SocketServer.TCPServer。可以看到,HTTP实际上是基于TCP服务的。
SocketServer.TCPServer
继承自BaseServer。实现了一个fileno的函数,用来获取socket套接字的描述符。这一个是预留给select模块使用的。
请求处理类
BaseRequestHandler
最基础的请求处理类。类成员包括server(HTTPServer)和request。然后在初始化的时候,依次调用setup、handle、finish方法。
SocketServer.StreamRequestHandler
继承自BaseRequestHandler。重写setup和finish方法。setup主要是建立读和写两个描述符,finish用来关闭连接。关键的是handle方法。
BaseHTTPRequestHandler
继承自SocketServer.StreamRequestHandler。利用这个类基本上可以处理简单的http请求了。但Flask只使用了它的几个方法。最重要的是handle方法。
def handle(self):
"""Handle multiple requests if necessary."""
self.close_connection = 1
self.handle_one_request()
while not self.close_connection:
self.handle_one_request()
可以看到,它主要是去调用了handle_one_request方法,并且可以在同一个连接中处理多个request。这个类的handle_one_request方法我们不用管,在
WSGIRequestHandler中我们会重写这个方法。
重点说一下WSGIRequestHandler
WSGIRequestHandler
继承自BaseHTTPRequestHandler。Flask没有自定义请求处理类,使用了WSGI库的WSGIRequestHandler。
类方法make_environ。一个重要的方法,用来创建上下文环境。这个暂且不说。
类方法handle_one_request
def handle_one_request(self):
"""Handle a single HTTP request."""
self.raw_requestline = self.rfile.readline()
if not self.raw_requestline:
self.close_connection = 1
elif self.parse_request():
return self.run_wsgi()
调用run_wsgi方法来处理
def run_wsgi(self):
if self.headers.get('Expect', '').lower().strip() == '100-continue':
self.wfile.write(b'HTTP/1.1 100 Continue\r\n\r\n')
self.environ = environ = self.make_environ()
headers_set = []
headers_sent = []
def write(data):
assert headers_set, 'write() before start_response'
if not headers_sent:
status, response_headers = headers_sent[:] = headers_set
try:
code, msg = status.split(None, 1)
except ValueError:
code, msg = status, ""
self.send_response(int(code), msg)
header_keys = set()
for key, value in response_headers:
self.send_header(key, value)
key = key.lower()
header_keys.add(key)
if 'content-length' not in header_keys:
self.close_connection = True
self.send_header('Connection', 'close')
if 'server' not in header_keys:
self.send_header('Server', self.version_string())
if 'date' not in header_keys:
self.send_header('Date', self.date_time_string())
self.end_headers()
assert isinstance(data, bytes), 'applications must write bytes'
self.wfile.write(data)
self.wfile.flush()
def start_response(status, response_headers, exc_info=None):
if exc_info:
try:
if headers_sent:
reraise(*exc_info)
finally:
exc_info = None
elif headers_set:
raise AssertionError('Headers already set')
headers_set[:] = [status, response_headers]
return write
def execute(app):
application_iter = app(environ, start_response)
try:
for data in application_iter:
write(data)
if not headers_sent:
write(b'')
finally:
if hasattr(application_iter, 'close'):
application_iter.close()
application_iter = None
try:
execute(self.server.app)
except (socket.error, socket.timeout) as e:
self.connection_dropped(e, environ)
except Exception:
if self.server.passthrough_errors:
raise
from werkzeug.debug.tbtools import get_current_traceback
traceback = get_current_traceback(ignore_system_exceptions=True)
try:
# if we haven't yet sent the headers but they are set
# we roll back to be able to set them again.
if not headers_sent:
del headers_set[:]
execute(InternalServerError())
except Exception:
pass
self.server.log('error', 'Error on request:\n%s',
traceback.plaintext)
这个函数比较长,我们只看程序的主干部分,execute(self.server.app)==》application_iter = app(environ, start_response)。app之前已经说过,在创建BaseWSGIServer的时候就把Flask的实例对象传入。这里你可能会疑惑,一个对象怎么还能带参数的调用?这是属于python的语法,只要你在Flask类中定义了__call__方法,当你这样使用的时候,实际上调用的是Flask的__call__方法,类似的是,所有的函数都有__call__方法,这里不再赘述。那么,Flask.__call__到底做什么了呢?我们来看。
def __call__(self, environ, start_response):
"""Shortcut for :attr:`wsgi_app`."""
return self.wsgi_app(environ, start_response)
def wsgi_app(self, environ, start_response):
ctx = self.request_context(environ)
ctx.push()
error = None
try:
try:
response = self.full_dispatch_request()
except Exception as e:
error = e
response = self.make_response(self.handle_exception(e))
return response(environ, start_response)
finally:
if self.should_ignore_error(error):
error = None
ctx.auto_pop(error)
分析wsgi_app方法。
前边两行是创建上下文变量,self.request_context(environ)主要做两件事,第一创建url适配器,第二url适配器去匹配客户端请求路径self.path,然后存放到environ中的PATH_INFO变量中。其中会执行到WSGIRequestHandler的make_environ方法,这里不再叙述。
最重要的处理请求的环节,response = self.full_dispatch_request()。
def full_dispatch_request(self):
"""Dispatches the request and on top of that performs request
pre and postprocessing as well as HTTP exception catching and
error handling.
.. versionadded:: 0.7
"""
self.try_trigger_before_first_request_functions()
try:
request_started.send(self)
rv = self.preprocess_request()
if rv is None:
rv = self.dispatch_request()
except Exception as e:
rv = self.handle_user_exception(e)
response = self.make_response(rv)
response = self.process_response(response)
request_finished.send(self, response=response)
return response
self.try_trigger_before_first_request_functions()是在正式处理请求之前执行before_first_request请求钩子注册的函数,在处理第一个请求之前运行。
然后self.preprocess_request执行before_request请求钩子注册的函数,在每次请求之前运行。
self.dispatch_request重头戏来了,就是这个方法来处理http请求的。怎么处理呢?
def dispatch_request(self):
"""Does the request dispatching. Matches the URL and returns the
return value of the view or error handler. This does not have to
be a response object. In order to convert the return value to a
proper response object, call :func:`make_response`.
.. versionchanged:: 0.7
This no longer does the exception handling, this code was
moved to the new :meth:`full_dispatch_request`.
"""
req = _request_ctx_stack.top.request
if req.routing_exception is not None:
self.raise_routing_exception(req)
rule = req.url_rule
# if we provide automatic options for this URL and the
# request came with the OPTIONS method, reply automatically
if getattr(rule, 'provide_automatic_options', False) \
and req.method == 'OPTIONS':
return self.make_default_options_response()
# otherwise dispatch to the handler for that endpoint
return self.view_functions[rule.endpoint](**req.view_args)
首先自然是获取请求参数req。self.view_functions是什么东西呢?就是存放视图与函数的映射表,即你用@app.route('/', methods=['GET','POST'])的时候存放的映射信息。我们可以来看一下app.route这个装饰器。
def route(self, rule, **options):
"""A decorator that is used to register a view function for a
given URL rule. This does the same thing as :meth:`add_url_rule`
but is intended for decorator usage::
@app.route('/')
def index():
return 'Hello World'
For more information refer to :ref:`url-route-registrations`.
:param rule: the URL rule as string
:param endpoint: the endpoint for the registered URL rule. Flask
itself assumes the name of the view function as
endpoint
:param options: the options to be forwarded to the underlying
:class:`~werkzeug.routing.Rule` object. A change
to Werkzeug is handling of method options. methods
is a list of methods this rule should be limited
to (`GET`, `POST` etc.). By default a rule
just listens for `GET` (and implicitly `HEAD`).
Starting with Flask 0.6, `OPTIONS` is implicitly
added and handled by the standard request handling.
"""
def decorator(f):
endpoint = options.pop('endpoint', None)
self.add_url_rule(rule, endpoint, f, **options)
return f
return decorator
def add_url_rule(self, rule, endpoint=None, view_func=None, **options):
if endpoint is None:
endpoint = _endpoint_from_view_func(view_func)
options['endpoint'] = endpoint
methods = options.pop('methods', None)
# if the methods are not given and the view_func object knows its
# methods we can use that instead. If neither exists, we go with
# a tuple of only `GET` as default.
if methods is None:
methods = getattr(view_func, 'methods', None) or ('GET',)
methods = set(methods)
# Methods that should always be added
required_methods = set(getattr(view_func, 'required_methods', ()))
# starting with Flask 0.8 the view_func object can disable and
# force-enable the automatic options handling.
provide_automatic_options = getattr(view_func,
'provide_automatic_options', None)
if provide_automatic_options is None:
if 'OPTIONS' not in methods:
provide_automatic_options = True
required_methods.add('OPTIONS')
else:
provide_automatic_options = False
# Add the required methods now.
methods |= required_methods
# due to a werkzeug bug we need to make sure that the defaults are
# None if they are an empty dictionary. This should not be necessary
# with Werkzeug 0.7
options['defaults'] = options.get('defaults') or None
rule = self.url_rule_class(rule, methods=methods, **options)
rule.provide_automatic_options = provide_automatic_options
self.url_map.add(rule)
if view_func is not None:
old_func = self.view_functions.get(endpoint)
if old_func is not None and old_func != view_func:
raise AssertionError('View function mapping is overwriting an '
'existing endpoint function: %s' % endpoint)
self.view_functions[endpoint] = view_func
请求处理类的实例化:
当select监听到连接到来时,调用http服务的_handle_request_noblock方法(由BaseServer实现),==》process_request==》finish_request==》self.RequestHandlerClass(request, client_address, self)实例化请求处理类。
下载源码:git clone git@github.com:mitsuhiko/flask.git
注:本文旨在叙述flask的工作流程,所以只讲了最基础的BaseWSGIServer,而没有讲ThreadedWSGIServer等。