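Inspired by the article "python之HTTP模块" (the HTTP module in Python), I went straight to the top-level socketserver.py source.
Reading it made a lot of things click instead of leaving me lost in a sea of junk: without a bird's-eye view of the whole design, some details look arbitrary and are hard to reason out.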
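1. Core prerequisites
The following background makes the source much easier to follow:
1.0 Basic flow of socket communication
- Server: ① wait for and accept data ② process the data
- Client: ① connect to the server ② send and receive data
A small note on a common question about IP addresses and ports:
Because the client is the side that actively looks for the server:
the server first binds (bind) an IP address and port, namely the server's own local IP address and port;
the client then connects (connect) to that IP address and port.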
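1.1 What a socket is and what it does
A simplified view:
- Meaning: a socket can be thought of as either end of the connection, server or client
("Two programs exchange data over a bidirectional communication link; each end of that link is called a socket.")
- Role: it bridges the transport layer and the application layer, which makes network programming easier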
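1.2 Distinguishing socket programming from socketserver programming
Python provides two levels of access to network services:
- 1. Low level: socket programming, which exposes the standard BSD Sockets API for lower-level development. The socket programming flow:
(1) Server
- Create the socket:
socket.socket()
- Bind the socket to a local IP and port:
socket.bind()
- Start listening for connections:
socket.listen()
- Loop, accepting connection requests:
socket.accept()
- Receive data:
socket.recv()
- Send data:
socket.sendall()
- Close the socket (once the transfer is done):
socket.close()
(2) Client
- Create the socket:
socket.socket()
- Connect to the server address:
socket.connect()
- Send data:
socket.sendall()
- Receive data:
socket.recv()
- Close the socket (once the transfer is done):
socket.close()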
For more detail, see the earlier article "Python 网络编程(5):基于socket的网络编程",
or other blogs such as "socket编程".
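The low-level flow listed above can be sketched in one script. This is a minimal illustration, not code from the referenced articles: the helper names `run_echo_server` and `echo_once` are made up for this example, the server runs in a background thread so that both ends fit in one process, and port 0 is used so the operating system picks a free port.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo back whatever arrives (illustrative helper)."""
    conn, _addr = listener.accept()           # accept(): block until a client connects
    with conn:
        data = conn.recv(1024)                # recv(): read up to 1024 bytes
        conn.sendall(data)                    # sendall(): send the reply


def echo_once(message: bytes) -> bytes:
    """Start a one-shot server on localhost, send `message`, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))       # bind(): port 0 lets the OS pick a free port
        listener.listen()                     # listen(): start queueing connections
        host, port = listener.getsockname()   # the address the client must connect to
        worker = threading.Thread(target=run_echo_server, args=(listener,))
        worker.start()
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect((host, port))      # connect(): client reaches the bound address
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply                              # both sockets are closed by the with blocks


if __name__ == "__main__":
    print(echo_once(b"hello"))
```

A single `recv(1024)` is enough here only because the message is tiny and travels over localhost; a real protocol would loop until the whole message has arrived.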
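- 2. High level: socketserver programming, which provides server framework classes that simplify development. In plain terms, socketserver is a higher-level wrapper around socket that simplifies programming and adds features such as concurrency (socketserver programming is the subject of this article). For the same reason, a socketserver program actually involves both layers: socket and socketserver itself.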
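socketserver programming:
(1) Server
- Define a custom Handler class:
class MyHandler(socketserver.BaseRequestHandler):
It inherits from the BaseRequestHandler base class and overrides handle().
- Create a server (socketserver):
server = socketserver.TCPServer((HOST, PORT), MyHandler)
This passes in the Handler class defined above (what matters most is its custom handle() method). The TCPServer constructor is also where initialization such as binding and listening takes place.
- Keep the server in a permanent loop (until interrupted with Ctrl+C):
server.serve_forever()
(2) Client
- Create a socket (there is no socketclient):
Note: there are two common ways to do this: ① a try…finally… statement, or ② the more concise with statement
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
- Connect to the server:
sock.connect((HOST, PORT))
- Send data to the server:
sock.sendall()
- Receive data from the server:
sock.recv()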
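One question worth raising: since there is a socketserver, is there also a socketclient?
No. socketserver places no special demands on the client; a little plain socket code is enough
(see the source and examples below for details)
For more detail, see the earlier article "Python 网络编程:基于socketserver的网络编程", or "socketserver编程".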
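Put together, the socketserver steps above might look like the following sketch. The handler `UpperHandler` and the helper `ask_server` are hypothetical names for this example; the server runs `serve_forever()` in a background thread only so that the plain-socket client can live in the same script.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: reply with the upper-cased request."""

    def handle(self):
        data = self.request.recv(1024)        # self.request is the connected socket
        self.request.sendall(data.upper())


def ask_server(message: bytes) -> bytes:
    """Serve one client request with TCPServer and return the reply."""
    # Port 0 asks the OS for a free port; server_address then holds the real one.
    with socketserver.TCPServer(("127.0.0.1", 0), UpperHandler) as server:
        host, port = server.server_address
        thread = threading.Thread(target=server.serve_forever)
        thread.start()
        try:
            # The client side is ordinary socket code; there is no socketclient.
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
                sock.connect((host, port))
                sock.sendall(message)
                reply = sock.recv(1024)
        finally:
            server.shutdown()                 # ask serve_forever() to return
            thread.join()
    return reply


if __name__ == "__main__":
    print(ask_server(b"hello"))
```

Note that `shutdown()` is called from a different thread than the one running `serve_forever()`; calling it from the same thread would deadlock, as the docstring in the source below points out.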
1.3 The socketserver framework (class inheritance)
The inheritance diagram:
+------------+
| BaseServer |
+------------+
      |
      v
+-----------+        +------------------+
| TCPServer |------->| UnixStreamServer |
+-----------+        +------------------+
      |
      v
+-----------+        +--------------------+
| UDPServer |------->| UnixDatagramServer |
+-----------+        +--------------------+
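In other words, socketserver defines five server classes:
- BaseServer defines the API, but is not meant to be instantiated or used directly.
- TCPServer communicates over TCP/IP sockets.
- UDPServer uses datagram sockets.
- UnixStreamServer and UnixDatagramServer use Unix domain sockets; they inherit from TCPServer and UDPServer respectively and are available only on Unix platforms.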
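The inheritance relationships can be checked directly from the interpreter. A small sketch; `base_names` is a helper written for this example, and the Unix classes are guarded because they exist only on platforms with Unix domain sockets.

```python
import socketserver


def base_names(cls) -> list:
    """Names of the direct base classes of `cls`."""
    return [base.__name__ for base in cls.__bases__]


print(base_names(socketserver.TCPServer))    # ['BaseServer']
print(base_names(socketserver.UDPServer))    # ['TCPServer']

# The Unix-domain classes are only defined where socket.AF_UNIX is available.
if hasattr(socketserver, "UnixStreamServer"):
    print(base_names(socketserver.UnixStreamServer))     # ['TCPServer']
    print(base_names(socketserver.UnixDatagramServer))   # ['UDPServer']
```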
1.4 Two ways socketserver classifies servers
Without getting this straight, it is very easy to be confused as a beginner!
- 1. Socket-based servers:
  - address family:
    - AF_INET{,6}: IP (Internet Protocol) sockets (default)
    - AF_UNIX: Unix domain sockets
    - others, e.g. AF_DECNET are conceivable (see <socket.h>)
  - socket type:
    - SOCK_STREAM (reliable stream, e.g. TCP)
    - SOCK_DGRAM (datagrams, e.g. UDP)
- 2. Request-based servers:
  - client address verification before further looking at the request
    (This is actually a hook for any processing that needs to look
    at the request before anything else, e.g. logging)
  - how to handle multiple requests:
    - synchronous (one request is handled at a time)
    - forking (each request is handled by a new process)
    - threading (each request is handled by a new thread)
I do not fully understand this passage from the module's own description:
The classes in this module favor the server type that is simplest to
write: a synchronous TCP/IP server. This is bad class design, but
saves some typing. (There's also the issue that a deep class hierarchy
slows down method lookups.)
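Both views show up in code as things a subclass can change. In the sketch below, the socket-based choices are plain class attributes, while the request-based "client address verification" is the `verify_request()` hook. `LocalOnlyServer` is a hypothetical class for illustration, and it is created with `bind_and_activate=False` so that nothing is bound or listening.

```python
import socket
import socketserver

# Socket-based view: address family and socket type are class attributes.
print(socketserver.TCPServer.address_family == socket.AF_INET)    # True
print(socketserver.TCPServer.socket_type == socket.SOCK_STREAM)   # True
print(socketserver.UDPServer.socket_type == socket.SOCK_DGRAM)    # True


class LocalOnlyServer(socketserver.TCPServer):
    """Hypothetical server that only serves clients coming from 127.0.0.1."""

    def verify_request(self, request, client_address):
        # Returning False makes the server drop the request before handling it.
        return client_address[0] == "127.0.0.1"


server = LocalOnlyServer(("127.0.0.1", 0), socketserver.BaseRequestHandler,
                         bind_and_activate=False)
try:
    print(server.verify_request(None, ("127.0.0.1", 50000)))   # True
    print(server.verify_request(None, ("10.0.0.7", 50000)))    # False
finally:
    server.server_close()
```

How multiple requests are handled (synchronous, forking, threading) is the third axis; it is selected by mixing ForkingMixIn or ThreadingMixIn into the server class rather than by an attribute.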
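2. Source code
2.0 Background
The conclusions first:
- BaseServer is the base class of all servers ("Base class for server classes.").
It therefore abstracts as much of the behavior common to all servers as it can, such as running the event-listening loop.
In plain terms: both TCPServer and UDPServer call into the base class BaseServer to complete their initialization.
- BaseRequestHandler is the base class of all handlers ("Base class for request handler classes.")
- Every server must be given a handler class derived from BaseRequestHandler that overrides its handle() method
(the source says as much)
The third point matters most for understanding, and it makes the core of the problem clear:
define a custom Handler class and override handle().
(This part belongs in the "Application" section of the article)
Only the more central classes are listed below; UDPServer, DatagramRequestHandler, ThreadingMixIn, ForkingMixIn and others are omitted.
As an aside, the last two provide multithreading (ThreadingMixIn) and multiprocessing (ForkingMixIn) respectively.
The socketserver package provides five Server classes. Used on their own they are all synchronous: the server runs in a single thread, cannot serve several clients at the same time, and handles requests strictly one after another.
The SocketServer framework diagram comes from the article "socketserver模块解析".
A note: the subsections below excerpt the source of a few important classes. With the framework above understood, they should no longer be hard to read; a more detailed walkthrough will have to wait for another time.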
2.1 BaseServer
class BaseServer:
    """Base class for server classes.

    Methods for the caller:

    - __init__(server_address, RequestHandlerClass)
    - serve_forever(poll_interval=0.5)
    - shutdown()
    - handle_request()  # if you do not use serve_forever()
    - fileno() -> int   # for selector

    Methods that may be overridden:

    - server_bind()
    - server_activate()
    - get_request() -> request, client_address
    - handle_timeout()
    - verify_request(request, client_address)
    - server_close()
    - process_request(request, client_address)
    - shutdown_request(request)
    - close_request(request)
    - service_actions()
    - handle_error()

    Methods for derived classes:

    - finish_request(request, client_address)

    Class variables that may be overridden by derived classes or
    instances:

    - timeout
    - address_family
    - socket_type
    - allow_reuse_address

    Instance variables:

    - RequestHandlerClass
    - socket
    """

    timeout = None

    def __init__(self, server_address, RequestHandlerClass):
        """Constructor.  May be extended, do not override."""
        self.server_address = server_address
        self.RequestHandlerClass = RequestHandlerClass
        self.__is_shut_down = threading.Event()
        self.__shutdown_request = False

    def server_activate(self):
        """Called by constructor to activate the server.

        May be overridden.
        """
        pass

    def serve_forever(self, poll_interval=0.5):
        """Handle one request at a time until shutdown.

        Polls for shutdown every poll_interval seconds. Ignores
        self.timeout. If you need to do periodic tasks, do them in
        another thread.
        """
        self.__is_shut_down.clear()
        try:
            # XXX: Consider using another file descriptor or connecting to the
            # socket to wake this up instead of polling. Polling reduces our
            # responsiveness to a shutdown request and wastes cpu at all other
            # times.
            with _ServerSelector() as selector:
                selector.register(self, selectors.EVENT_READ)

                while not self.__shutdown_request:
                    ready = selector.select(poll_interval)
                    # bpo-35017: shutdown() called during select(), exit immediately.
                    if self.__shutdown_request:
                        break
                    if ready:
                        self._handle_request_noblock()

                    self.service_actions()
        finally:
            self.__shutdown_request = False
            self.__is_shut_down.set()

    def shutdown(self):
        """Stops the serve_forever loop.

        Blocks until the loop has finished. This must be called while
        serve_forever() is running in another thread, or it will
        deadlock.
        """
        self.__shutdown_request = True
        self.__is_shut_down.wait()

    def service_actions(self):
        """Called by the serve_forever() loop.

        May be overridden by a subclass / Mixin to implement any code that
        needs to be run during the loop.
        """
        pass

    # The distinction between handling, getting, processing and finishing a
    # request is fairly arbitrary.  Remember:
    #
    # - handle_request() is the top-level call.  It calls selector.select(),
    #   get_request(), verify_request() and process_request()
    # - get_request() is different for stream or datagram sockets
    # - process_request() is the place that may fork a new process or create a
    #   new thread to finish the request
    # - finish_request() instantiates the request handler class; this
    #   constructor will handle the request all by itself

    def handle_request(self):
        """Handle one request, possibly blocking.

        Respects self.timeout.
        """
        # Support people who used socket.settimeout() to escape
        # handle_request before self.timeout was available.
        timeout = self.socket.gettimeout()
        if timeout is None:
            timeout = self.timeout
        elif self.timeout is not None:
            timeout = min(timeout, self.timeout)
        if timeout is not None:
            deadline = time() + timeout

        # Wait until a request arrives or the timeout expires - the loop is
        # necessary to accommodate early wakeups due to EINTR.
        with _ServerSelector() as selector:
            selector.register(self, selectors.EVENT_READ)

            while True:
                ready = selector.select(timeout)
                if ready:
                    return self._handle_request_noblock()
                else:
                    if timeout is not None:
                        timeout = deadline - time()
                        if timeout < 0:
                            return self.handle_timeout()

    def _handle_request_noblock(self):
        """Handle one request, without blocking.

        I assume that selector.select() has returned that the socket is
        readable before this function was called, so there should be no risk of
        blocking in get_request().
        """
        try:
            request, client_address = self.get_request()
        except OSError:
            return
        if self.verify_request(request, client_address):
            try:
                self.process_request(request, client_address)
            except Exception:
                self.handle_error(request, client_address)
                self.shutdown_request(request)
            except:
                self.shutdown_request(request)
                raise
        else:
            self.shutdown_request(request)

    def handle_timeout(self):
        """Called if no new request arrives within self.timeout.

        Overridden by ForkingMixIn.
        """
        pass

    def verify_request(self, request, client_address):
        """Verify the request.  May be overridden.

        Return True if we should proceed with this request.
        """
        return True

    def process_request(self, request, client_address):
        """Call finish_request.

        Overridden by ForkingMixIn and ThreadingMixIn.
        """
        self.finish_request(request, client_address)
        self.shutdown_request(request)

    def server_close(self):
        """Called to clean-up the server.

        May be overridden.
        """
        pass

    def finish_request(self, request, client_address):
        """Finish one request by instantiating RequestHandlerClass."""
        self.RequestHandlerClass(request, client_address, self)

    def shutdown_request(self, request):
        """Called to shutdown and close an individual request."""
        self.close_request(request)

    def close_request(self, request):
        """Called to clean up an individual request."""
        pass

    def handle_error(self, request, client_address):
        """Handle an error gracefully.  May be overridden.

        The default is to print a traceback and continue.
        """
        print('-'*40, file=sys.stderr)
        print('Exception occurred during processing of request from',
              client_address, file=sys.stderr)
        import traceback
        traceback.print_exc()
        print('-'*40, file=sys.stderr)

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.server_close()
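The comment block in the source about "handling, getting, processing and finishing" is easier to follow with a trace. The sketch below overrides each hook only to record its name and then defers to the real implementation; `TracingServer`, `EchoHandler` and `trace_one_request` are names invented for this example. It uses `handle_request()` instead of `serve_forever()`, so no extra thread is needed: the client connects and sends first, and the pending connection is then handled in one call.

```python
import socket
import socketserver


class TracingServer(socketserver.TCPServer):
    """Hypothetical subclass that records the order of the hook calls."""

    def __init__(self, *args, **kwargs):
        self.calls = []
        super().__init__(*args, **kwargs)

    def get_request(self):
        self.calls.append("get_request")
        return super().get_request()

    def verify_request(self, request, client_address):
        self.calls.append("verify_request")
        return super().verify_request(request, client_address)

    def process_request(self, request, client_address):
        self.calls.append("process_request")
        super().process_request(request, client_address)

    def finish_request(self, request, client_address):
        self.calls.append("finish_request")
        super().finish_request(request, client_address)

    def shutdown_request(self, request):
        self.calls.append("shutdown_request")
        super().shutdown_request(request)


class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        self.server.calls.append("handle")
        self.request.sendall(self.request.recv(1024))


def trace_one_request() -> list:
    """Handle exactly one request and return the recorded call order."""
    with TracingServer(("127.0.0.1", 0), EchoHandler) as server:
        with socket.create_connection(server.server_address) as client:
            client.sendall(b"ping")
            server.handle_request()       # one pass through the whole pipeline
            client.recv(1024)
        return server.calls


if __name__ == "__main__":
    print(trace_one_request())
    # ['get_request', 'verify_request', 'process_request',
    #  'finish_request', 'handle', 'shutdown_request']
```

The trace shows that `handle()` runs inside `finish_request()`, because instantiating the handler class is what handles the request.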
2.2 TCPServer
class TCPServer(BaseServer):
    """Base class for various socket-based server classes.

    Defaults to synchronous IP stream (i.e., TCP).

    Methods for the caller:

    - __init__(server_address, RequestHandlerClass, bind_and_activate=True)
    - serve_forever(poll_interval=0.5)
    - shutdown()
    - handle_request()  # if you don't use serve_forever()
    - fileno() -> int   # for selector

    Methods that may be overridden:

    - server_bind()
    - server_activate()
    - get_request() -> request, client_address
    - handle_timeout()
    - verify_request(request, client_address)
    - process_request(request, client_address)
    - shutdown_request(request)
    - close_request(request)
    - handle_error()

    Methods for derived classes:

    - finish_request(request, client_address)

    Class variables that may be overridden by derived classes or
    instances:

    - timeout
    - address_family
    - socket_type
    - request_queue_size (only for stream sockets)
    - allow_reuse_address

    Instance variables:

    - server_address
    - RequestHandlerClass
    - socket
    """

    address_family = socket.AF_INET

    socket_type = socket.SOCK_STREAM

    request_queue_size = 5

    allow_reuse_address = False

    def __init__(self, server_address, RequestHandlerClass, bind_and_activate=True):
        """Constructor.  May be extended, do not override."""
        BaseServer.__init__(self, server_address, RequestHandlerClass)
        self.socket = socket.socket(self.address_family,
                                    self.socket_type)
        if bind_and_activate:
            try:
                self.server_bind()
                self.server_activate()
            except:
                self.server_close()
                raise

    def server_bind(self):
        """Called by constructor to bind the socket.

        May be overridden.
        """
        if self.allow_reuse_address:
            self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind(self.server_address)
        self.server_address = self.socket.getsockname()

    def server_activate(self):
        """Called by constructor to activate the server.

        May be overridden.
        """
        self.socket.listen(self.request_queue_size)

    def server_close(self):
        """Called to clean-up the server.

        May be overridden.
        """
        self.socket.close()

    def fileno(self):
        """Return socket file number.

        Interface required by selector.
        """
        return self.socket.fileno()

    def get_request(self):
        """Get the request and client address from the socket.

        May be overridden.
        """
        return self.socket.accept()

    def shutdown_request(self, request):
        """Called to shutdown and close an individual request."""
        try:
            #explicitly shutdown.  socket.close() merely releases
            #the socket and waits for GC to perform the actual close.
            request.shutdown(socket.SHUT_WR)
        except OSError:
            pass #some platforms may raise ENOTCONN here
        self.close_request(request)

    def close_request(self, request):
        """Called to clean up an individual request."""
        request.close()
class BaseRequestHandler:
    """Base class for request handler classes.

    This class is instantiated for each request to be handled.  The
    constructor sets the instance variables request, client_address
    and server, and then calls the handle() method.  To implement a
    specific service, all you need to do is to derive a class which
    defines a handle() method.

    The handle() method can find the request as self.request, the
    client address as self.client_address, and the server (in case it
    needs access to per-server information) as self.server.  Since a
    separate instance is created for each request, the handle() method
    can define other arbitrary instance variables.
    """

    def __init__(self, request, client_address, server):
        self.request = request
        self.client_address = client_address
        self.server = server
        self.setup()
        try:
            self.handle()
        finally:
            self.finish()

    def setup(self):
        pass

    def handle(self):
        pass

    def finish(self):
        pass


# The following two classes make it possible to use the same service
# class for stream or datagram servers.
# Each class sets up these instance variables:
# - rfile: a file object from which receives the request is read
# - wfile: a file object to which the reply is written
# When the handle() method returns, wfile is flushed properly
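Because all the work happens in `__init__`, the handler lifecycle can be observed without any network at all. In the sketch below the handler is instantiated by hand with `request=None` and `server=None`, which is done purely for illustration (a real server passes a socket and itself); `LifecycleHandler`, `log` and `run_lifecycle` are names made up here. It also shows that `finish()` still runs when `handle()` raises, thanks to the try/finally.

```python
import socketserver

log = []


class LifecycleHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler that records which hooks ran, in order."""

    def setup(self):
        log.append("setup")

    def handle(self):
        log.append("handle")
        raise RuntimeError("simulated failure inside handle()")

    def finish(self):
        log.append("finish")


def run_lifecycle() -> list:
    """Instantiate the handler directly and return the recorded steps."""
    log.clear()
    try:
        # Constructing the handler IS handling the request.
        LifecycleHandler(request=None, client_address=("127.0.0.1", 0), server=None)
    except RuntimeError:
        log.append("error propagated")
    return list(log)


if __name__ == "__main__":
    print(run_lifecycle())
    # ['setup', 'handle', 'finish', 'error propagated']
```

In a running server the propagated exception is what ends up in `handle_error()`, as seen in `_handle_request_noblock()`.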
2.4 StreamRequestHandler
class StreamRequestHandler(BaseRequestHandler):
    """Define self.rfile and self.wfile for stream sockets."""

    # Default buffer sizes for rfile, wfile.
    # We default rfile to buffered because otherwise it could be
    # really slow for large data (a getc() call per byte); we make
    # wfile unbuffered because (a) often after a write() we want to
    # read and we need to flush the line; (b) big writes to unbuffered
    # files are typically optimized by stdio even when big reads
    # aren't.
    rbufsize = -1
    wbufsize = 0

    # A timeout to apply to the request socket, if not None.
    timeout = None

    # Disable nagle algorithm for this socket, if True.
    # Use only when wbufsize != 0, to avoid small packets.
    disable_nagle_algorithm = False

    def setup(self):
        self.connection = self.request
        if self.timeout is not None:
            self.connection.settimeout(self.timeout)
        if self.disable_nagle_algorithm:
            self.connection.setsockopt(socket.IPPROTO_TCP,
                                       socket.TCP_NODELAY, True)
        self.rfile = self.connection.makefile('rb', self.rbufsize)
        if self.wbufsize == 0:
            self.wfile = _SocketWriter(self.connection)
        else:
            self.wfile = self.connection.makefile('wb', self.wbufsize)

    def finish(self):
        if not self.wfile.closed:
            try:
                self.wfile.flush()
            except socket.error:
                # A final socket error may have occurred here, such as
                # the local error ECONNABORTED.
                pass
        self.wfile.close()
        self.rfile.close()
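With `rfile` and `wfile` set up by `setup()`, a handler can treat the connection like a pair of files. A minimal sketch, assuming a line-based protocol invented for this example: `LineHandler` reads one line and writes it back reversed, and `reverse_line` is a helper that drives it with `handle_request()`.

```python
import socket
import socketserver


class LineHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: read one line, write it back reversed."""

    def handle(self):
        line = self.rfile.readline().rstrip(b"\r\n")   # buffered read up to newline
        self.wfile.write(line[::-1] + b"\n")           # unbuffered write (wbufsize = 0)


def reverse_line(text: bytes) -> bytes:
    """Send one line to a LineHandler server and return the reply line."""
    with socketserver.TCPServer(("127.0.0.1", 0), LineHandler) as server:
        with socket.create_connection(server.server_address) as client:
            client.sendall(text + b"\n")
            server.handle_request()                    # serve this one request
            with client.makefile("rb") as reply:
                return reply.readline()


if __name__ == "__main__":
    print(reverse_line(b"abc"))    # b'cba\n'
```

Compared with calling `recv()` on `self.request`, `readline()` takes care of reading until the newline, which is the usual reason to pick StreamRequestHandler over BaseRequestHandler.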
2.5 Multithreading and multiprocessing
(Omitted for now)
A reference that may be useful for this part:
socketserver模块使用与源码分析
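Until this part is written up, here is a small sketch of the threading case only. It assumes nothing beyond the standard library: `ThreadedServer` combines ThreadingMixIn with TCPServer, and the hypothetical `MeetHandler` waits on a two-party barrier, so it can only answer b"met" if two requests are being handled at the same moment.

```python
import socket
import socketserver
import threading

# Two handlers must be inside handle() at the same time to get past this.
barrier = threading.Barrier(2)


class MeetHandler(socketserver.BaseRequestHandler):
    def handle(self):
        try:
            barrier.wait(timeout=5)
            self.request.sendall(b"met")
        except threading.BrokenBarrierError:
            self.request.sendall(b"alone")


class ThreadedServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    """The mixin overrides process_request() to start one thread per request."""


def two_clients() -> list:
    """Connect two clients at once and return what each one received."""
    with ThreadedServer(("127.0.0.1", 0), MeetHandler) as server:
        thread = threading.Thread(target=server.serve_forever)
        thread.start()
        try:
            clients = [socket.create_connection(server.server_address)
                       for _ in range(2)]
            replies = [client.recv(16) for client in clients]
            for client in clients:
                client.close()
        finally:
            server.shutdown()
            thread.join()
    return replies


if __name__ == "__main__":
    print(two_clients())    # [b'met', b'met']
```

With a plain TCPServer in place of `ThreadedServer`, the first handler would block alone until the barrier times out, because the second request is not processed until the first has finished. ForkingMixIn follows the same pattern with processes, on platforms that support fork.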
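3. Application
Only a skeleton is given here; the content was already covered in section 1.2
(distinguishing socket programming from socketserver programming). For details, follow the links mentioned above.
3.1 Based on socket
3.2 Based on socketserver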