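1 Some concepts
1.1 The complete client/server interaction flow
The client sends an HTTP request. The server receives the request, parses and processes it, generates the response content, and sends the response back. The client then receives the response and renders it.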
1.2 WSGI
Python Web Server Gateway Interface: a specification for how a web server/gateway interacts with a web application/framework written in Python.
1.2.1 Web Server/Gateway
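A server program that can handle the HTTP protocol (usually written in C/C++). The WSGI server receives a request from the client, forwards it to the WSGI application, and returns the response produced by the WSGI application to the client.
1.2.2 Web Application/Framework
Receives the request forwarded by the WSGI server, processes it, and returns the result to the WSGI server. A WSGI application can contain a stack of middlewares. A middleware has to play both roles, server and application: to the server it acts as an application, and to the application it acts as a server.
There are many web server implementations (Apache HTTP Server, nginx, IIS, uWSGI, Gunicorn) and many web application frameworks (Django, Flask, Tornado), so the two sides can be mixed and matched.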
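As a minimal sketch of this mix-and-match idea, the standard library's wsgiref.simple_server can host any WSGI application. The names hello_app and serve, and the default host and port, are illustrative assumptions and do not come from the notes above.

```python
from wsgiref.simple_server import make_server


def hello_app(environ, start_response):
    """A framework-free WSGI application (illustrative name)."""
    body = b"Hello from a plain WSGI app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(app, host="127.0.0.1", port=8000):
    """Host any WSGI application on the reference server; blocks until interrupted."""
    with make_server(host, port, app) as httpd:
        httpd.serve_forever()
```

Calling serve(hello_app) would start the reference server. Because the only contract between the two sides is the WSGI calling convention, the same hello_app should also be loadable by other WSGI servers without changes.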
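1.3 Additional notes
- The uwsgi protocol: the native binary protocol of the uWSGI server (an application written in C). It is typically used between a front-end web server such as nginx and the uWSGI server, which in turn calls the web application.
- There are many interface specifications between web servers and web applications, such as CGI and FastCGI. The Java counterpart is Servlet: an application built with a Java web framework that implements the Servlet API can run on any web server that implements the Servlet API. The design of WSGI was strongly inspired by Servlet.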
2 Web Server/Gateway
In WSGI, the server/gateway side provides environ and start_response, and calls the application object provided by the application/framework side.
2.1 environ
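The environ dictionary contains the data required by the CGI specification plus the variables added by the WSGI specification. It may also contain operating-system environment variables and web-server-specific variables; see [4] for details.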
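To get a feel for what environ holds, the sketch below builds a sample dictionary with wsgiref.util.setup_testing_defaults from the standard library and separates the CGI-style keys from the wsgi.* keys. The helper name describe_environ is an assumption made for illustration.

```python
from wsgiref.util import setup_testing_defaults


def describe_environ(environ):
    """Split environ keys into CGI-style variables and WSGI-defined ones."""
    wsgi_keys = sorted(k for k in environ if k.startswith("wsgi."))
    cgi_keys = sorted(k for k in environ if not k.startswith("wsgi."))
    return cgi_keys, wsgi_keys


environ = {}
setup_testing_defaults(environ)  # fills in keys a server would normally provide
cgi_keys, wsgi_keys = describe_environ(environ)
print("CGI-style:", cgi_keys)
print("WSGI-defined:", wsgi_keys)
```

A real server would fill these keys from the incoming request rather than from testing defaults.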
2.2 start_response
start_response is a callable object. A callable can be a function, a method, a class, or an instance that has a __call__ method.
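The four kinds of callable listed above can be checked with the built-in callable(). All names in this sketch are made up for illustration.

```python
def as_function(environ, start_response):
    return [b"function"]


class AsClass:
    """Calling the class itself creates an instance."""


class WithCall:
    """Instances are callable because the class defines __call__."""

    def __call__(self, environ, start_response):
        return [b"instance"]

    def method(self, environ, start_response):
        return [b"method"]


class Plain:
    """No __call__, so instances are not callable."""


candidates = {
    "function": as_function,
    "class": AsClass,
    "bound method": WithCall().method,
    "instance with __call__": WithCall(),
    "instance without __call__": Plain(),
}
for label, obj in candidates.items():
    print(f"{label}: {callable(obj)}")
```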
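2.2.1 Parameters
start_response takes two required positional arguments and one optional argument:
- status: the status string to return, made up of the status code and the reason phrase, for example "200 OK".
- response_headers: the response headers, given as a list of (header_name, header_value) tuples.
- exc_info: a Python sys.exc_info() tuple. It is passed when start_response is called from an error handler, that is, when normal response processing has failed and the application wants to return an error page instead.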
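The shape of the three arguments can be illustrated with a stand-in that only records what it receives. recording_start_response is a hypothetical helper, not part of any server.

```python
import sys

recorded = {}


def recording_start_response(status, response_headers, exc_info=None):
    """Stand-in for a server's start_response that only records its arguments."""
    recorded["status"] = status
    recorded["headers"] = list(response_headers)
    recorded["exc_info"] = exc_info
    return lambda body_data: None  # placeholder for the write callable


# Normal call: status is "<code> <reason phrase>", headers are (name, value) tuples.
recording_start_response("200 OK", [("Content-Type", "text/plain")])
normal_status = recorded["status"]

# Error-handler call: exc_info is passed only from inside an except block.
try:
    raise ValueError("simulated failure")
except ValueError:
    recording_start_response(
        "500 Internal Server Error",
        [("Content-Type", "text/plain")],
        sys.exc_info(),
    )
```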
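2.2.2 Return value
start_response returns a callable write(body_data), whose required argument body_data is a bytestring that forms part of the response body. The write callable is returned to the application in order to support the imperative output APIs that some applications and frameworks need. In other words, the application/framework layer may itself call write to emit parts of the response, instead of handing every piece of body data back to the server/gateway layer through the returned iterable.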
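A rough sketch of an application that uses write() is shown below. The toy make_start_response and the in-memory sink are assumptions standing in for a real server, and header handling is deliberately left out.

```python
import io


def make_start_response(sink):
    """Build a toy start_response whose write() appends to `sink` (a binary stream)."""

    def start_response(status, response_headers, exc_info=None):
        def write(body_data):
            if not isinstance(body_data, bytes):
                raise TypeError("body_data must be a bytestring")
            sink.write(body_data)

        return write

    return start_response


def legacy_app(environ, start_response):
    """Application that pushes part of the body through write()."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"pushed via write()\n")
    return [b"returned via iterable\n"]


sink = io.BytesIO()
for chunk in legacy_app({}, make_start_response(sink)):
    sink.write(chunk)
print(sink.getvalue().decode("ascii"))
```

PEP 3333 describes write() as a hook for existing imperative-style frameworks and recommends that new applications return their body through the iterable instead.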
2.3 Example
# Server/Gateway Side code named server.py
import os
import sys

enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

def unicode_to_wsgi(u):
    # Convert an environment variable to a WSGI "bytes-as-unicode" string
    return u.encode(enc, esc).decode('iso-8859-1')

def wsgi_to_bytes(s):
    return s.encode('iso-8859-1')

def run_with_cgi(application):
    environ = {k: unicode_to_wsgi(v) for k, v in os.environ.items()}
    environ['wsgi.input'] = sys.stdin.buffer
    environ['wsgi.errors'] = sys.stderr
    environ['wsgi.version'] = (1, 0)
    environ['wsgi.multithread'] = False
    environ['wsgi.multiprocess'] = True
    environ['wsgi.run_once'] = True

    if environ.get('HTTPS', 'off') in ('on', '1'):
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'

    headers_set = []
    headers_sent = []

    def write(data):
        out = sys.stdout.buffer

        if not headers_set:
            raise AssertionError("write() before start_response()")

        elif not headers_sent:
            # Before the first output, send the stored headers
            status, response_headers = headers_sent[:] = headers_set
            out.write(wsgi_to_bytes('Status: %s\r\n' % status))
            for header in response_headers:
                out.write(wsgi_to_bytes('%s: %s\r\n' % header))
            out.write(wsgi_to_bytes('\r\n'))

        out.write(data)
        out.flush()

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Re-raise original exception if headers sent
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")

        headers_set[:] = [status, response_headers]

        # Note: error checking on the headers should happen here,
        # *after* the headers are set. That way, if an error
        # occurs, start_response can only be re-called with
        # exc_info set.

        return write

    result = application(environ, start_response)
    try:
        for data in result:
            if data:  # don't send headers until body appears
                write(data)
        if not headers_sent:
            write(b'')  # send headers now if body was empty
    finally:
        if hasattr(result, 'close'):
            result.close()
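2.4 Additional notes
- start_response stores the status and response_headers it receives from the application/framework in the server/gateway-side variable headers_set. The server/gateway then calls write to send the response to the client: first the headers held in headers_set (status, response_headers), then the body bytestrings.
- The server/gateway is responsible for making sure correct headers reach the client. If the application omits a header that HTTP requires, the server/gateway must add it. HTTP header names are case-insensitive, so the server/gateway has to take that into account when it examines the headers supplied by the application.
- start_response must not itself transmit the response headers to the client; that is done by the write callable. start_response should only store the headers, and WSGI requires the server/gateway to send them only after the application yields its first non-empty bytestring or calls write for the first time. The one possible exception is a response whose headers explicitly include a Content-Length of 0, in which case the headers need not wait for a body.
- Delaying the headers ensures that buffered or asynchronous applications can still replace their originally intended output with error output when something fails, for example "200 OK" -> "500 Internal Error".
- An application may call start_response more than once if and only if exc_info is provided. If start_response has already been called during the current invocation of the application, calling it again without exc_info is a fatal error, and the later call raises an exception.
- If exc_info is provided (meaning an error has occurred) and the response headers have already been sent, start_response should re-raise the original exception with raise exc_info[1].with_traceback(exc_info[2]). The re-raised error terminates the application. Once the headers are sent they cannot be overwritten with headers for the error response, and continuing to send the body of the error page would pair the headers of the normal page with the body of the error page. That is not allowed, so the response must be aborted.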
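The rules about repeated start_response calls can be tried out with a toy model of the header bookkeeping. ResponseState, send_headers, and call_with_error are hypothetical names, and the class only mimics the state handling of run_with_cgi without writing anything to a client.

```python
import sys


class ResponseState:
    """Toy model of the header bookkeeping described above (not a real server)."""

    def __init__(self):
        self.headers_set = []
        self.headers_sent = False

    def start_response(self, status, response_headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to replace the headers: re-raise the original error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a reference cycle through the traceback
        elif self.headers_set:
            raise AssertionError("Headers already set!")
        self.headers_set = [status, response_headers]

    def send_headers(self):
        """Pretend the headers have gone out on the wire."""
        self.headers_sent = True


def call_with_error(state):
    """Call start_response from an error handler, as an application would."""
    try:
        raise RuntimeError("simulated failure")
    except RuntimeError:
        state.start_response("500 Internal Server Error", [], sys.exc_info())


state = ResponseState()
state.start_response("200 OK", [])
call_with_error(state)  # allowed: headers not sent yet, so they are replaced
print(state.headers_set[0])

state.send_headers()
try:
    call_with_error(state)  # headers already sent: the original error propagates
except RuntimeError as exc:
    print("re-raised:", exc)
```

In this sketch a second call without exc_info hits the "Headers already set!" branch, a call with exc_info before the headers are sent replaces them, and a call with exc_info afterwards propagates the original exception.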
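3 Web Application/Framework
The application object provided by the application/framework side is also a callable. Its arguments are the environ and start_response supplied by the server/gateway. It returns an iterable that yields zero or more bytestrings: a list of bytestrings, an application written as a generator function that yields bytestrings, or an application written as a class whose instances are iterable. Developers only need to focus on writing their own application and then pass it to the server/gateway side.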
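The generator-function form mentioned above could look roughly like the following. streaming_app and the collect helper (a tiny stand-in for the server side) are illustrative assumptions.

```python
def streaming_app(environ, start_response):
    """Generator-function application: each yield is one body chunk."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    for line_number in range(3):
        yield f"line {line_number}\n".encode("ascii")


def collect(app):
    """Tiny stand-in for the server side: call the app and join its chunks."""
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        return lambda body_data: None

    result = app({}, start_response)
    try:
        body = b"".join(result)
    finally:
        if hasattr(result, "close"):
            result.close()
    return captured["status"], body


status, body = collect(streaming_app)
print(status, body)
```

With a generator function, start_response is not called until the server starts iterating, which is why collect reads the captured status only after consuming the body.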
3.1 Application as a function
# Application/Framework code named main.py
import sys

from server import run_with_cgi

HELLO_WORLD = b"Hello world!\n"
ERROR_INFO = b"Response Process Error!\n"

def simple_app(environ, start_response):
    """Simplest possible application object"""
    try:
        # regular application code here
        status = "200 OK"
        response_headers = [("content-type", "text/plain")]
        start_response(status, response_headers)
        return [HELLO_WORLD]
    except:
        # XXX should trap runtime issues like MemoryError, KeyboardInterrupt
        # in a separate handler before this bare 'except:'...
        status = "500 Internal Error"
        response_headers = [("content-type", "text/plain")]
        start_response(status, response_headers, sys.exc_info())
        return [ERROR_INFO]

if __name__ == '__main__':
    run_with_cgi(simple_app)

'''Run
$ python main.py
Status: 200 OK
content-type: text/plain

Hello world!
'''
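3.2 Application as a class, variant 1
If AppClass itself is passed to run_with_cgi, then calling AppClass as the application executes AppClass(environ, start_response), which returns an AppClass instance. WSGI requires the application to return an iterable, so AppClass must define __iter__. When the server iterates over the instance (the for data in result loop in server.py), __iter__ is called and performs the processing on the application/framework side. Note that __iter__() is implemented as a generator here. A generator is itself an iterator, so AppClass does not need a separate __next__() method.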
# Application/Framework code named main.py
import sys

from server import run_with_cgi

HELLO_WORLD = b"Hello world!\n"
ERROR_INFO = b"Response Process Error!\n"

class AppClass:
    """Produce the same output, but using a class

    Note: 'AppClass' is the "application" here, so calling it
    returns an instance of 'AppClass', which is then the iterable
    return value of the "application callable" as required by
    the spec.
    """

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        try:
            # regular application code here
            status = "200 OK"
            response_headers = [("content-type", "text/plain")]
            self.start(status, response_headers)
            yield HELLO_WORLD
        except:
            # XXX should trap runtime issues like MemoryError, KeyboardInterrupt
            # in a separate handler before this bare 'except:'...
            status = "500 Internal Error"
            response_headers = [("content-type", "text/plain")]
            self.start(status, response_headers, sys.exc_info())
            yield ERROR_INFO

if __name__ == '__main__':
    run_with_cgi(AppClass)
3.3 Application as a class, variant 2
class AppClass1:
    """Produce the same output, but using a class

    Note: If we wanted to use *instances* of 'AppClass' as application
    objects instead, we would have to implement a '__call__'
    method, which would be invoked to execute the application,
    and we would need to create an instance for use by the
    server or gateway.
    """

    def __call__(self, environ, start_response):
        try:
            # regular application code here
            status = "200 OK"
            response_headers = [("content-type", "text/plain")]
            start_response(status, response_headers)
            yield HELLO_WORLD
        except:
            # XXX should trap runtime issues like MemoryError, KeyboardInterrupt
            # in a separate handler before this bare 'except:'...
            status = "500 Internal Error"
            response_headers = [("content-type", "text/plain")]
            start_response(status, response_headers, sys.exc_info())
            yield ERROR_INFO

if __name__ == '__main__':
    app = AppClass1()
    run_with_cgi(app)
4 Middleware
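- Middleware sits between the server/gateway and the application/framework. To the server/gateway it looks like an application/framework; to the application/framework it looks like a server/gateway.
- A Latinator instance, app_with_middleware, is created from the existing AppClass1 instance app, and the server/gateway calls app_with_middleware as the application. Once the server/gateway has prepared environ and start_response, it calls app_with_middleware, which actually executes app_with_middleware.__call__(environ, start_response). That method creates a new "start_response" named start_latin, and the call to the original start_response happens inside start_latin.
- Inside start_latin, work can be done both before and after calling the original start_response. The sample code modifies the response headers before the call and, after it, uses the transform_ok flag to choose which write callable to return.
- Finally, as an application, app_with_middleware must return an iterable of bytestrings as the response body. The return value of app_with_middleware.__call__(environ, start_response) wraps the iterable returned by the original app in the class LatinIter, so that every chunk produced by iteration has been passed through the pig latin transform.
- start_latin wraps start_response in two ways: it can adjust the original status and response_headers arguments, and it can wrap the write callable that start_response returns. When app_with_middleware acts as the application, it receives the write callable provided by the real server/gateway. When it acts as the server/gateway, app receives the callable supplied by app_with_middleware (either write itself or the wrapper write_latin).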
# Middleware code, named middleware.py
from piglatin import translate

class LatinIter:
    """Transform iterated output to piglatin, if it's okay to do so

    Note that the "okayness" can change until the application yields
    its first non-empty bytestring, so 'transform_ok' has to be a mutable
    truth value.
    """

    def __init__(self, result, transform_ok):
        if hasattr(result, 'close'):
            self.close = result.close
        self._next = iter(result).__next__
        self.transform_ok = transform_ok

    def __iter__(self):
        return self

    def __next__(self):
        data = self._next()  # bytestring
        if self.transform_ok:
            # translate() is called on text, so decode first and re-encode after
            return translate(data.decode('iso-8859-1')).encode('iso-8859-1')
        else:
            return data

class Latinator:

    # by default, don't transform output
    transform = False

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        transform_ok = []

        def start_latin(status, response_headers, exc_info=None):
            # Reset ok flag, in case this is a repeat call
            del transform_ok[:]

            for name, value in response_headers:
                if name.lower() == 'content-type' and value == 'text/plain':
                    transform_ok.append(True)
                    # Strip content-length if present, else it'll be wrong
                    response_headers = [(name, value)
                        for name, value in response_headers
                            if name.lower() != 'content-length'
                    ]
                    break

            write = start_response(status, response_headers, exc_info)

            if transform_ok:
                def write_latin(data):
                    # same bytes -> text -> bytes round trip as in LatinIter
                    write(translate(data.decode('iso-8859-1')).encode('iso-8859-1'))
                return write_latin
            else:
                return write

        return LatinIter(self.application(environ, start_latin), transform_ok)
Using the middleware in main.py
```python
from middleware import Latinator
from server import run_with_cgi

class AppClass1:
    pass  # same as the code above

if __name__ == '__main__':
    app = AppClass1()
    app_with_middleware = Latinator(app)
    run_with_cgi(app_with_middleware)

'''Output
Status: 200 OK
content-type: text/plain

ello-Hay orld-way!
'''
```
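For readers without the piglatin package, the same two-sided role can be sketched with a simpler transform. UppercaseMiddleware, plain_app, fake_start_response, and the X-Transformed header are all hypothetical names chosen for illustration, and the sketch does not wrap the write callable the way Latinator does.

```python
class UppercaseMiddleware:
    """Wrap a WSGI application and upper-case every body chunk it produces."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        def patched_start_response(status, response_headers, exc_info=None):
            # Acting as the "server" for the wrapped app: adjust headers first.
            headers = list(response_headers) + [("X-Transformed", "upper")]
            return start_response(status, headers, exc_info)

        # Acting as the "application" for the real server: produce an iterable.
        result = self.application(environ, patched_start_response)
        try:
            for chunk in result:
                yield chunk.upper()
        finally:
            if hasattr(result, "close"):
                result.close()  # forward close() to the wrapped iterable


def plain_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello world\n"]


seen = {}


def fake_start_response(status, response_headers, exc_info=None):
    seen["status"], seen["headers"] = status, response_headers
    return lambda body_data: None


body = b"".join(UppercaseMiddleware(plain_app)({}, fake_start_response))
print(seen["headers"], body)
```

Because bytes.upper() only changes ASCII letters and keeps the length, this sketch can leave any Content-Length header untouched. A transform that changes the length would have to strip or recompute it, as Latinator does.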
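4 Supplement
4.1 Iterator objects
When a class defines __iter__(), that method should return an iterator object, which is an object that implements __next__(). Instances of the class can then be used in a loop. A for loop first calls __iter__() once (here it returns the object itself) and then calls __next__() on every iteration until StopIteration is raised.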
class Fib:
    def __init__(self, max):
        self.max = max

    def __iter__(self):
        print('__iter__ called')
        self.a = 0
        self.b = 1
        return self

    def __next__(self):
        print('__next__ called')
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib

if __name__ == '__main__':
    for i in Fib(3):
        print(i)

'''Output
__iter__ called
__next__ called
0
__next__ called
1
__next__ called
1
__next__ called
2
__next__ called
3
__next__ called
'''
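For comparison, the same sequence can be produced with __iter__ written as a generator function, which is the pattern AppClass uses in section 3.2. FibGen is a hypothetical name for this sketch.

```python
class FibGen:
    """Same sequence as Fib, but __iter__ is a generator function."""

    def __init__(self, max):
        self.max = max

    def __iter__(self):
        a, b = 0, 1
        while a <= self.max:
            yield a
            a, b = b, a + b


values = list(FibGen(3))
print(values)
```

Each call to __iter__ creates a fresh generator, so the class needs no __next__ of its own, and an instance can be iterated more than once without resetting any state by hand.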
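4.2 How a function is identified
Function signature: the function name plus the list of parameter types (some definitions also include the return type). Example: add(int, int)
Function header: the part of a function definition that excludes the body. Example: int add(int a, int b) in int add(int a, int b) { return a+b; }
Function prototype / function declaration: the function name, the parameter list (types, with parameter names optional), and the return type. Example: int add(int, int); or int add(int a, int b);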
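Python has no separate prototypes, but a rough analogue of a signature can be inspected at run time with inspect.signature from the standard library. The annotated add function below is an assumed Python counterpart of the C-style example above.

```python
import inspect


def add(a: int, b: int) -> int:
    """Python counterpart of the C-style add example above."""
    return a + b


signature = inspect.signature(add)
print(add.__name__, signature)

parameter_types = [p.annotation for p in signature.parameters.values()]
print(parameter_types, signature.return_annotation)
```

The annotations are optional metadata in Python and are not enforced at call time, so this is only loosely comparable to a signature in a statically typed language.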
5 References
[1] https://www.cnblogs.com/linxiyue/p/10800020.html
[2] https://www.cnblogs.com/liugp/p/17153418.html
[3] https://www.cnblogs.com/-wenli/p/10884168.html
[4] https://peps.python.org/pep-3333/
[5] https://www.codeswithpankaj.com/post/code-introspection-in-python