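This article draws on http://blog.csdn.net/sraing/article/details/8455242.
Python ships with a library called wsgiref, a server implemented according to the WSGI specification. What follows are notes I took while reading the wsgiref source.
The code of interest is mainly in wsgiref/simple_server.py. As the name suggests, it implements a simple WSGI server; the notes below follow the order in which that server's code is called.
When the program starts, it calls make_server to create a server and passes in the application to run. The default server class is WSGIServer and the default request handler class is WSGIRequestHandler; both are covered later.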
def demo_app(environ,start_response):
    from StringIO import StringIO
    stdout = StringIO()
    print >>stdout, "Hello world!"
    print >>stdout
    h = environ.items(); h.sort()
    for k,v in h:
        print >>stdout, k,'=', repr(v)
    start_response("200 OK", [('Content-Type','text/plain')])
    return [stdout.getvalue()]

def make_server(
    host, port, app, server_class=WSGIServer, handler_class=WSGIRequestHandler
):
    """Create a new WSGI server listening on `host` and `port` for `app`"""
    server = server_class((host, port), handler_class)
    server.set_app(app)
    return server

if __name__ == '__main__':
    httpd = make_server('', 8000, demo_app)
    sa = httpd.socket.getsockname()
    print "Serving HTTP on", sa[0], "port", sa[1], "..."
    import webbrowser
    webbrowser.open('http://localhost:8000/xyz?abc')
    httpd.handle_request()  # serve one request, then exit
Next is WSGIServer:
class WSGIServer(HTTPServer):
    """BaseHTTPServer that implements the Python WSGI protocol"""

    application = None

    def server_bind(self):
        """Override server_bind to store the server name."""
        HTTPServer.server_bind(self)
        self.setup_environ()

    def setup_environ(self):
        # Set up base environment
        env = self.base_environ = {}
        env['SERVER_NAME'] = self.server_name
        env['GATEWAY_INTERFACE'] = 'CGI/1.1'
        env['SERVER_PORT'] = str(self.server_port)
        env['REMOTE_HOST']=''
        env['CONTENT_LENGTH']=''
        env['SCRIPT_NAME'] = ''

    def get_app(self):
        return self.application

    def set_app(self,application):
        self.application = application
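To see what setup_environ leaves in base_environ without opening a port, one option is to build the server unbound and fill in the name and port by hand. This is only a sketch: it assumes Python 3 (where the class still lives in wsgiref.simple_server), and the host name, port, and placeholder_app are made up for illustration. Normally server_bind() sets server_name and server_port for you.

```python
from wsgiref.simple_server import WSGIServer, WSGIRequestHandler


def placeholder_app(environ, start_response):
    # Hypothetical app, only here so set_app/get_app have something to hold.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


# bind_and_activate=False skips server_bind(), so nothing listens on a port.
server = WSGIServer(("127.0.0.1", 0), WSGIRequestHandler, bind_and_activate=False)
try:
    # server_bind() would normally set these two attributes; the values are invented.
    server.server_name = "example.test"
    server.server_port = 8000
    server.setup_environ()
    server.set_app(placeholder_app)

    base_environ = dict(server.base_environ)
    registered_app = server.get_app()
finally:
    server.server_close()

for key in sorted(base_environ):
    print(key, "=", repr(base_environ[key]))
```

Every request later starts from a copy of this dictionary, which is why get_environ in the request handler begins with base_environ.copy().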
WSGIServer inherits from HTTPServer, which is defined in BaseHTTPServer.py. Reading that file together with SocketServer.py gives the following inheritance chain:
+------------+
| BaseServer |
+------------+
|
v
+-----------+ +-----------+
| TCPServer |------->| HTTPServer|
+-----------+ +-----------+
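The same chain can be checked from the interpreter. The sketch below assumes Python 3, where BaseHTTPServer.py and SocketServer.py have become http.server and socketserver; the class names themselves are unchanged. Note that the method resolution order lists the most derived class first, so it reads in the opposite direction to the diagram.

```python
from wsgiref.simple_server import WSGIServer

# Walk the method resolution order, most derived class first.
server_chain = [cls.__name__ for cls in WSGIServer.__mro__]
print(" -> ".join(server_chain))
```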
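The basic server logic is already implemented in BaseServer. TCPServer's __init__() calls server_bind() and then server_activate() to start the server (which just puts the socket into listening mode). Note that this is not a call to BaseServer's serve_forever function.
Worth noting is BaseServer's finish_request function: it instantiates the RequestHandlerClass that was passed in when the BaseServer object was created, and that instance handles the incoming request.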
def finish_request(self, request, client_address):
    """Finish one request by instantiating RequestHandlerClass."""
    self.RequestHandlerClass(request, client_address, self)
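The request object passed in comes from get_request(). That function differs by protocol, so it is implemented in TCPServer; in practice it simply returns what socket.accept() returns, that is, the connected socket together with the client address: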
def get_request(self):
    """Get the request and client address from the socket.

    May be overridden.

    """
    return self.socket.accept()
Next is WSGIRequestHandler:
class WSGIRequestHandler(BaseHTTPRequestHandler):

    server_version = "WSGIServer/" + __version__

    def get_environ(self):
        env = self.server.base_environ.copy()
        env['SERVER_PROTOCOL'] = self.request_version
        env['REQUEST_METHOD'] = self.command
        if '?' in self.path:
            path,query = self.path.split('?',1)
        else:
            path,query = self.path,''

        env['PATH_INFO'] = urllib.unquote(path)
        env['QUERY_STRING'] = query

        host = self.address_string()
        if host != self.client_address[0]:
            env['REMOTE_HOST'] = host
        env['REMOTE_ADDR'] = self.client_address[0]

        if self.headers.typeheader is None:
            env['CONTENT_TYPE'] = self.headers.type
        else:
            env['CONTENT_TYPE'] = self.headers.typeheader

        length = self.headers.getheader('content-length')
        if length:
            env['CONTENT_LENGTH'] = length

        for h in self.headers.headers:
            k,v = h.split(':',1)
            k=k.replace('-','_').upper(); v=v.strip()
            if k in env:
                continue                    # skip content length, type,etc.
            if 'HTTP_'+k in env:
                env['HTTP_'+k] += ','+v     # comma-separate multiple headers
            else:
                env['HTTP_'+k] = v
        return env

    def get_stderr(self):
        return sys.stderr

    def handle(self):
        """Handle a single HTTP request"""

        self.raw_requestline = self.rfile.readline()
        if not self.parse_request(): # An error code has been sent, just exit
            return

        handler = ServerHandler(
            self.rfile, self.wfile, self.get_stderr(), self.get_environ()
        )
        handler.request_handler = self      # backpointer for logging
        handler.run(self.server.get_app())
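The header loop at the end of get_environ is easier to follow in isolation. The function below is my own standalone restatement of that loop, not the library's code: headers_to_environ and the sample header lines are hypothetical, and it takes raw "Name: value" strings instead of the handler's headers object.

```python
def headers_to_environ(header_lines, base_env):
    """Fold raw 'Name: value' header lines into a CGI-style environ dict."""
    env = dict(base_env)
    for line in header_lines:
        name, value = line.split(":", 1)
        key = name.replace("-", "_").upper()
        value = value.strip()
        if key in env:
            # Already stored under its bare CGI name (CONTENT_LENGTH, CONTENT_TYPE).
            continue
        http_key = "HTTP_" + key
        if http_key in env:
            # A repeated header is joined onto the existing value with a comma.
            env[http_key] += "," + value
        else:
            env[http_key] = value
    return env


# Sample input, invented for illustration.
base = {"CONTENT_TYPE": "text/plain", "CONTENT_LENGTH": "5"}
lines = [
    "Host: localhost:8000",
    "Accept: text/html",
    "Accept: text/plain",
    "Content-Length: 5",
]
environ = headers_to_environ(lines, base)
print(environ["HTTP_HOST"], environ["HTTP_ACCEPT"])
```

Splitting with a maximum of one split matters here: the Host value contains its own colon and must stay intact.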
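WSGIRequestHandler inherits from BaseHTTPRequestHandler, which is defined in BaseHTTPServer.py. Reading that file together with SocketServer.py gives the following inheritance chain:
+-------------------+
| BaseRequestHandler|
+-------------------+
         |
         v
+---------------------+        +-----------------------+
| StreamRequestHandler|------->| BaseHTTPRequestHandler|
+---------------------+        +-----------------------+
When a BaseRequestHandler object is constructed, it stores the corresponding server object and then runs setup(), handle() and finish() in turn: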
def __init__(self, request, client_address, server):
    self.request = request
    self.client_address = client_address
    self.server = server
    self.setup()
    try:
        self.handle()
    finally:
        self.finish()
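Because all the work happens inside __init__, merely instantiating the handler class (which is all finish_request does) is enough to process a request. A small sketch, assuming Python 3's socketserver module; RecordingHandler is a hypothetical subclass, and the request and server arguments are dummies because nothing here touches a socket.

```python
import socketserver

calls = []


class RecordingHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler that only records which hooks were invoked."""

    def setup(self):
        calls.append("setup")

    def handle(self):
        calls.append("handle")

    def finish(self):
        calls.append("finish")


# Constructing the object is what triggers setup(), handle() and finish().
RecordingHandler(request=None, client_address=("127.0.0.1", 0), server=None)
print(calls)
```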
Worth noting is the handle function invoked there, which does most of the work of processing a request:
def handle(self):
    """Handle a single HTTP request"""
    self.raw_requestline = self.rfile.readline()
    if not self.parse_request(): # An error code has been sent, just exit
        return
    handler = ServerHandler(
        self.rfile, self.wfile, self.get_stderr(), self.get_environ()
    )
    handler.request_handler = self      # lets ServerHandler log via BaseHTTPRequestHandler's log_request
    handler.run(self.server.get_app())
Note the last three statements. ServerHandler inherits from SimpleHandler, and the inheritance chain is as follows:
+------------+
| BaseHandler|
+------------+
|
v
+--------------+
| SimpleHandler|
+--------------+
class ServerHandler(SimpleHandler):

    server_software = software_version

    def close(self):
        try:
            self.request_handler.log_request(
                self.status.split(' ',1)[0], self.bytes_sent
            )
        finally:
            SimpleHandler.close(self)
The implementation of the run function is in handlers.py:
def run(self, application):
    """Invoke the application"""
    # Note to self: don't move the close()!  Asynchronous servers shouldn't
    # call close() from finish_response(), so if you close() anywhere but
    # the double-error branch here, you'll break asynchronous servers by
    # prematurely closing.  Async servers must return from 'run()' without
    # closing if there might still be output to iterate over.
    try:
        self.setup_environ()
        self.result = application(self.environ, self.start_response)
        self.finish_response()
    except:
        try:
            self.handle_error()
        except:
            # If we get an error handling an error, just give up already!
            self.close()
            raise   # ...and let the actual server figure it out.
Looking one step further, here is the implementation of finish_response:
def finish_response(self):
    """Send any iterable data, then close self and the iterable

    Subclasses intended for use in asynchronous servers will
    want to redefine this method, such that it sets up callbacks
    in the event loop to iterate over the data, and to call
    'self.close()' once the response is finished.
    """
    try:
        if not self.result_is_file() or not self.sendfile():
            for data in self.result:
                self.write(data)
            self.finish_content()
    finally:
        self.close()
As you can see, it simply takes the data returned by the app and writes it out with the write function.
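The run and finish_response path can be exercised without any socket by handing SimpleHandler in-memory streams. This is a sketch under a few assumptions: Python 3 (so the body must be bytes), a hypothetical hello_app, and an environ filled in by wsgiref.util.setup_testing_defaults instead of a real request.

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    # Hypothetical application used only for this demonstration.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


environ = {}
setup_testing_defaults(environ)  # fills in SERVER_PROTOCOL, REQUEST_METHOD, ...

stdin = io.BytesIO()    # stands in for the request body stream (rfile)
stdout = io.BytesIO()   # stands in for the response stream (wfile)
stderr = io.StringIO()  # stands in for the error log

handler = SimpleHandler(stdin, stdout, stderr, environ)
handler.run(hello_app)  # setup_environ -> application(...) -> finish_response

raw_response = stdout.getvalue()
print(raw_response.decode("iso-8859-1"))
```

The status line and headers are written lazily by the first write call, which is why the app's single chunk ends up after a complete header block.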
As for WSGI applications, http://blog.csdn.net/sraing/article/details/8455242 has a good explanation; the following is quoted from that article.
def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [u"This is hello wsgi app".encode('utf8')]
Next we use wsgiref as the WSGI server and have it call this WSGI app, so that one full request and response can be seen directly. The code needs only a small change:
from wsgiref.simple_server import make_server

def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [u"This is hello wsgi app".encode('utf8')]

httpd = make_server('', 8000, simple_app)
print("Serving on port 8000...")
httpd.serve_forever()
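Visit http://127.0.0.1:8000 to see the result.
In addition, as mentioned above, a WSGI app only needs to be a callable object, so it does not have to be a function; an instance that implements the __call__ method works too. Example code: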
from wsgiref.simple_server import make_server

class AppClass:

    def __call__(self, environ, start_response):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        start_response(status, response_headers)
        # bytes literal so the body is also accepted under Python 3
        return [b"hello world!"]

app = AppClass()
httpd = make_server('', 8000, app)
print("Serving on port 8000...")
httpd.serve_forever()
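Since the server only ever calls app(environ, start_response), both kinds of app can be driven by hand in exactly the same way. The sketch below is an illustration only: call_app is a hypothetical helper, the environ is reduced to two keys, and the bodies are bytes on the assumption of Python 3.

```python
def function_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"from a function"]


class InstanceApp:
    def __call__(self, environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"from an instance"]


def call_app(app):
    """Invoke a WSGI callable the way a server would, minus the network."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would keep these for the response header block.
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return captured["status"], body


results = [call_app(function_app), call_app(InstanceApp())]
print(results)
```

call_app never checks what kind of object it was given, which is the whole point of the callable contract.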
WSGI MiddleWare
from wsgiref.simple_server import make_server

URL_PATTERNS = (
    ('hi/', 'say_hi'),
    ('hello/', 'say_hello'),
)

class Dispatcher(object):

    def _match(self, path):
        # First path segment, e.g. '/hi/there' -> 'hi'
        path = path.split('/')[1]
        for url, app in URL_PATTERNS:
            # Compare whole segments; a substring test would also match '' or 'h'
            if path == url.rstrip('/'):
                return app

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '/')
        app = self._match(path)
        if app:
            app = globals()[app]
            return app(environ, start_response)
        else:
            start_response("404 NOT FOUND", [('Content-type', 'text/plain')])
            return [b"Page does not exist!"]

def say_hi(environ, start_response):
    start_response("200 OK", [('Content-type', 'text/html')])
    return [b"kenshin say hi to you!"]

def say_hello(environ, start_response):
    start_response("200 OK", [('Content-type', 'text/html')])
    return [b"kenshin say hello to you!"]

app = Dispatcher()
httpd = make_server('', 8000, app)
print("Serving on port 8000...")
httpd.serve_forever()
As the example above shows, once wrapped by middleware, a simple WSGI app gains URL dispatch. I can then wrap this app in further middleware, for example one that handles authentication:
class Auth(object):

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # TODO
        return self.app(environ, start_response)

app = Dispatcher()
auth_app = Auth(app)
httpd = make_server('', 8000, auth_app)
print("Serving on port 8000...")
httpd.serve_forever()
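The quoted article leaves the check itself as a TODO. One way it might be filled in is sketched below; everything specific here is my own assumption, not part of the original: the TokenAuth name, the X-Auth-Token header, the shared-secret comparison, and the protected_app stand-in. It assumes Python 3, so bodies are bytes.

```python
class TokenAuth:
    """Hypothetical middleware: pass the request on only if a token matches."""

    def __init__(self, app, expected_token):
        self.app = app
        self.expected_token = expected_token

    def __call__(self, environ, start_response):
        # An 'X-Auth-Token' request header arrives as HTTP_X_AUTH_TOKEN in environ.
        if environ.get("HTTP_X_AUTH_TOKEN") != self.expected_token:
            start_response("401 Unauthorized", [("Content-Type", "text/plain")])
            return [b"authentication required"]
        return self.app(environ, start_response)


def protected_app(environ, start_response):
    # Stand-in for the wrapped application (e.g. the Dispatcher above).
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"welcome"]


def call(app, environ):
    """Drive a WSGI callable directly and return (status, body)."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status

    body = b"".join(app(environ, start_response))
    return seen["status"], body


auth_app = TokenAuth(protected_app, expected_token="s3cret")
denied = call(auth_app, {"PATH_INFO": "/"})
allowed = call(auth_app, {"PATH_INFO": "/", "HTTP_X_AUTH_TOKEN": "s3cret"})
print(denied, allowed)
```

The middleware short-circuits by answering the request itself, so the wrapped app is never called for a rejected request.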
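With these layers of middleware, this is starting to feel like a framework. In fact, WSGI-based frameworks such as paste and pylons are assembled exactly this way, from layer upon layer of middleware. A mature framework simply has many more of them, for example: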
def configure(app):
    return ErrorHandlerMiddleware(
        SessionMiddleware(
            IdentificationMiddleware(
                AuthenticationMiddleware(
                    UrlParserMiddleware(app)))))
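Nesting like this fixes the order in which the layers see a request: the outermost wrapper runs first. The sketch below makes that visible with a hypothetical tagging middleware; make_tagging_middleware, the layer names, and inner_app are invented for illustration and are not the real paste or pylons classes.

```python
def make_tagging_middleware(name, log):
    """Build a middleware factory that records its name on every request."""

    def middleware(app):
        def wrapped(environ, start_response):
            log.append(name)  # runs before the wrapped app is called
            return app(environ, start_response)

        return wrapped

    return middleware


def inner_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done"]


log = []
error_handler = make_tagging_middleware("error", log)
session = make_tagging_middleware("session", log)
auth = make_tagging_middleware("auth", log)


def configure(app):
    # Same shape as the nested calls above: the outermost wrapper is listed first.
    return error_handler(session(auth(app)))


stack = configure(inner_app)
body = b"".join(stack({"PATH_INFO": "/"}, lambda status, headers, exc_info=None: None))
print(log, body)
```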
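As long as these middleware components conform to the WSGI specification, they can even be combined and reused across frameworks. For example, pylons' authentication middleware can be used directly by TurboGears.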