Python Network Programming: socket and socketserver Revisited (Reading the SocketServer Source)

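Inspired by the post "python之HTTP模块", I went straight to the top-level socketserver.py source. After reading it, many things suddenly made sense, instead of my drowning in a sea of scattered details. Without a top-down view of the whole design, some parts look baffling and are hard to work out.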

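1. Core prerequisites

The following background makes the source much easier to follow:

1.0 Basic flow of socket communication

  • Server: ① wait for and receive data ② process the data
  • Client: ① connect to the server ② send and receive data

A small note on a common question about IP addresses and ports:
the client has to go and find the server, so
the server first binds (bind) an IP address and port, namely its own local IP address and port;
the client then connects (connect) to that IP address and port.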

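1.1 What a socket is and what it does

A simplified view:

  • Meaning: a socket can be thought of as either end of a server/client connection
    ("Two programs exchange data over a bidirectional communication link; one end of that link is called a socket.")
  • Purpose: it bridges the transport layer and the application layer, which makes network programming easier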

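1.2 socket programming versus socketserver programming

Python provides two levels of access to network services:

  • 1. Low level: socket programming
    Provides the standard BSD Sockets API, suitable for lower-level development

    The socket programming workflow:

    (1) Server

    • Create a socket: socket.socket()
    • Bind the socket to a local IP and port: socket.bind()
    • Start listening for connections: socket.listen()
    • Loop, accepting connection requests: socket.accept()
    • Receive data (on the connection returned by accept()): socket.recv()
    • Send data: socket.sendall()
    • (When the transfer is done) close the socket: socket.close()

    (2) Client

    • Create a socket: socket.socket()
    • Connect to the server address: socket.connect()
    • Send data: socket.sendall()
    • Receive data: socket.recv()
    • (When the transfer is done) close the socket: socket.close()

    For more detail, see the earlier post "Python 网络编程(5):基于socket的网络编程"
    or other blogs: socket编程

  • 2. High level: socketserver programming
    Provides server-centered classes that simplify development. Put plainly, socketserver is a higher-level wrapper around socket that simplifies programming and adds features such as concurrency (socketserver programming is also the subject of this post)

    For the same reason, socketserver programming actually involves both socket and socketserver itself: the server side uses socketserver, while the client side still uses a plain socket.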

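    socketserver programming:

    (1) Server

    • Define a custom Handler class: class MyHandler(socketserver.BaseRequestHandler)
      It inherits from the BaseRequestHandler base class and overrides handle(), which holds the per-request logic.
    • Create a server (socketserver): server = socketserver.TCPServer((HOST, PORT), MyHandler)
      The server's constructor performs initialization such as binding and listening, and the server instantiates the Handler class defined above (mainly to run its custom handle() method) for each request.
    • Keep the server in a permanent loop (until interrupted with Ctrl+C): server.serve_forever()

    (2) Client

    • Create a socket (not a "socketclient"):
      Note: there are two common ways: ① a try…finally… statement or ② the more concise with statement
      with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    • Connect to the server: sock.connect((HOST, PORT))
    • Send data to the server: sock.sendall()
    • Receive data from the server: sock.recv()

One question worth asking: since there is a socketserver, is there also a socketclient?
The answer is no. socketserver places no special demands on the client; a plain socket written by hand is enough.

(See below for the source and examples.)

For more detail, see the earlier post "Python 网络编程:基于socketserver的网络编程", or: socketserver编程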

1.3 The socketserver framework (class inheritance)

The inheritance diagram is as follows:

+------------+
| BaseServer |
+------------+
      |
      v
+-----------+        +------------------+
| TCPServer |------->| UnixStreamServer |
+-----------+        +------------------+
      |
      v
+-----------+        +--------------------+
| UDPServer |------->| UnixDatagramServer |
+-----------+        +--------------------+

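In other words, socketserver defines five server classes:

  • BaseServer defines the API, but is not meant to be instantiated or used directly.
  • TCPServer communicates over TCP/IP sockets.
  • UDPServer uses datagram sockets.
  • UnixStreamServer and UnixDatagramServer use Unix domain sockets; they inherit from TCPServer and UDPServer respectively and are available only on Unix platforms.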
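
The inheritance relationships in the diagram can be checked directly from the interpreter. The following is a small sketch; the helper name describe is made up for this illustration, and the two Unix-only classes are looked up defensively because they may not exist on every platform.

```python
import socketserver

# Server classes named in the list above.
names = ["BaseServer", "TCPServer", "UDPServer",
         "UnixStreamServer", "UnixDatagramServer"]

def describe(name):
    """Return 'Name -> Parent' for a server class, or None if unavailable."""
    cls = getattr(socketserver, name, None)
    if cls is None:
        return None  # e.g. the Unix-only classes on a non-Unix platform
    parent = cls.__bases__[0].__name__
    return f"{name} -> {parent}"

hierarchy = [line for line in map(describe, names) if line is not None]
for line in hierarchy:
    print(line)
```

Note that UDPServer derives from TCPServer, which matches the vertical arrow in the diagram.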

1.4 Two ways socketserver classifies servers

Beginners easily get confused if this distinction is not clear.

  • 1. Socket-based servers

    • address family:
      - AF_INET{,6}: IP (Internet Protocol) sockets (default)
      - AF_UNIX: Unix domain sockets
      - others, e.g. AF_DECNET are conceivable (see <socket.h>)
    • socket type:
      - SOCK_STREAM (reliable stream, e.g. TCP)
      - SOCK_DGRAM (datagrams, e.g. UDP)
  • 2. Request-based servers

    • client address verification before further looking at the request
      (This is actually a hook for any processing that needs to look
      at the request before anything else, e.g. logging)
    • how to handle multiple requests:
      - synchronous (one request is handled at a time)
      - forking (each request is handled by a new process)
      - threading (each request is handled by a new thread)

    I do not fully understand the following explanation from the module:
    The classes in this module favor the server type that is simplest to
    write: a synchronous TCP/IP server. This is bad class design, but
    saves some typing. (There’s also the issue that a deep class hierarchy
    slows down method lookups.)
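
The "client address verification" hook listed above corresponds to the verify_request() method. A minimal sketch, assuming a hypothetical allow list (the class name AllowListServer, the attribute allowed_hosts, and the example addresses are all made up for illustration); the method is called directly here so that no client is needed:

```python
import socketserver

class AllowListServer(socketserver.TCPServer):
    """TCPServer that only accepts clients whose IP is in allowed_hosts."""

    allowed_hosts = {"127.0.0.1"}  # hypothetical allow list for this sketch

    def verify_request(self, request, client_address):
        # For AF_INET sockets client_address is a (host, port) tuple.
        return client_address[0] in self.allowed_hosts

class NoopHandler(socketserver.BaseRequestHandler):
    def handle(self):
        pass

# Port 0 asks the OS for any free port; nothing is actually served here.
with AllowListServer(("127.0.0.1", 0), NoopHandler) as server:
    accepted = server.verify_request(None, ("127.0.0.1", 50000))
    rejected = server.verify_request(None, ("203.0.113.7", 50000))

print(accepted, rejected)
```

When verify_request() returns False, the server skips process_request() and shuts the request down.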

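2. Source code

2.0 Background

The conclusions first:

  • BaseServer is the base class of all servers ("Base class for server classes.").
    It therefore abstracts what all servers have in common as far as possible, for example running the event-listening loop.
    Put plainly: both TCPServer and UDPServer call the BaseServer constructor to complete their initialization.
  • BaseRequestHandler is the base class of all handlers ("Base class for request handler classes.")
  • Every server must be given a handler class derived from BaseRequestHandler, with its handle() method overridden (the source mentions this)

The third point matters most for understanding, and it makes the core of the problem clear:

define a custom Handler class and override its handle() method.
(This part belongs to the "Application" section of this post.)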
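
To see why overriding handle() is enough, it helps to watch what the BaseRequestHandler constructor does. The sketch below instantiates a handler by hand with placeholder arguments (None for the request and the server, and a made-up client address), which is an assumption for illustration only; in real use the server creates the instance.

```python
import socketserver

calls = []

class TracingHandler(socketserver.BaseRequestHandler):
    def setup(self):
        calls.append("setup")

    def handle(self):
        # The constructor has already stored request, client_address, server.
        calls.append(f"handle {self.client_address}")

    def finish(self):
        calls.append("finish")

# Creating the instance is what runs the handler: __init__ calls
# setup(), then handle(), then finish().
TracingHandler(request=None, client_address=("127.0.0.1", 50000), server=None)

print(calls)
```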

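Only the more central classes are listed below; UDPServer, DatagramRequestHandler, ThreadingMixIn, ForkingMixIn and others are omitted.
As a side note, the last two implement multithreading and multiprocessing respectively.

The socketserver package provides five Server classes. Used on their own, these classes only work synchronously: the server is single-threaded, cannot handle requests from several clients at the same time, and processes them one after another in order.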
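
To get past the one-request-at-a-time limit, a mix-in can be combined with a server class. The following is a minimal sketch using ThreadingMixIn; the names UpperHandler, ThreadedServer and ask are invented for this example, and port 0 (any free port) is used only so the demo does not collide with a real service.

```python
import socket
import socketserver
import threading

class UpperHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data = self.request.recv(1024)
        # Reply with the upper-cased payload plus the worker thread's name.
        name = threading.current_thread().name.encode()
        self.request.sendall(data.upper() + b"|" + name)

class ThreadedServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    daemon_threads = True  # worker threads do not block interpreter exit

def ask(address, payload):
    """Send payload and read the reply until the server closes the connection."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(payload)
        chunks = []
        while True:
            chunk = sock.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)

with ThreadedServer(("127.0.0.1", 0), UpperHandler) as server:
    threading.Thread(target=server.serve_forever, daemon=True).start()
    replies = [ask(server.server_address, b"hello"),
               ask(server.server_address, b"world")]
    server.shutdown()

print(replies)
```

The mix-in must come first in the base class list so that its process_request() overrides the one from BaseServer.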

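[Figure: framework diagram of SocketServer, taken from "socketserver模块解析"]
A note: the subsections below excerpt the source of several important classes. With the framework above understood, reading them should not be difficult. A more detailed walkthrough is left for a later time.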

2.1 BaseServer

class BaseServer:

    """Base class for server classes.

    Methods for the caller:

    - __init__(server_address, RequestHandlerClass)
    - serve_forever(poll_interval=0.5)
    - shutdown()
    - handle_request()  # if you do not use serve_forever()
    - fileno() -> int   # for selector

    Methods that may be overridden:

    - server_bind()
    - server_activate()
    - get_request() -> request, client_address
    - handle_timeout()
    - verify_request(request, client_address)
    - server_close()
    - process_request(request, client_address)
    - shutdown_request(request)
    - close_request(request)
    - service_actions()
    - handle_error()

    Methods for derived classes:

    - finish_request(request, client_address)

    Class variables that may be overridden by derived classes or
    instances:

    - timeout
    - address_family
    - socket_type
    - allow_reuse_address

    Instance variables:

    - RequestHandlerClass
    - socket

    """

    timeout = None

    def __init__(self, server_address, RequestHandlerClass):
        """Constructor.  May be extended, do not override."""
        self.server_address = server_address
        self.RequestHandlerClass = RequestHandlerClass
        self.__is_shut_down = threading.Event()
        self.__shutdown_request = False

    def server_activate(self):
        """Called by constructor to activate the server.

        May be overridden.

        """
        pass

    def serve_forever(self, poll_interval=0.5):
        """Handle one request at a time until shutdown.

        Polls for shutdown every poll_interval seconds. Ignores
        self.timeout. If you need to do periodic tasks, do them in
        another thread.
        """
        self.__is_shut_down.clear()
        try:
            # XXX: Consider using another file descriptor or connecting to the
            # socket to wake this up instead of polling. Polling reduces our
            # responsiveness to a shutdown request and wastes cpu at all other
            # times.
            with _ServerSelector() as selector:
                selector.register(self, selectors.EVENT_READ)

                while not self.__shutdown_request:
                    ready = selector.select(poll_interval)
                    # bpo-35017: shutdown() called during select(), exit immediately.
                    if self.__shutdown_request:
                        break
                    if ready:
                        self._handle_request_noblock()

                    self.service_actions()
        finally:
            self.__shutdown_request = False
            self.__is_shut_down.set()

    def shutdown(self):
        """Stops the serve_forever loop.

        Blocks until the loop has finished. This must be called while
        serve_forever() is running in another thread, or it will
        deadlock.
        """
        self.__shutdown_request = True
        self.__is_shut_down.wait()

    def service_actions(self):
        """Called by the serve_forever() loop.

        May be overridden by a subclass / Mixin to implement any code that
        needs to be run during the loop.
        """
        pass

    # The distinction between handling, getting, processing and finishing a
    # request is fairly arbitrary.  Remember:
    #
    # - handle_request() is the top-level call.  It calls selector.select(),
    #   get_request(), verify_request() and process_request()
    # - get_request() is different for stream or datagram sockets
    # - process_request() is the place that may fork a new process or create a
    #   new thread to finish the request
    # - finish_request() instantiates the request handler class; this
    #   constructor will handle the request all by itself

    def handle_request(self):
        """Handle one request, possibly blocking.

        Respects self.timeout.
        """
        # Support people who used socket.settimeout() to escape
        # handle_request before self.timeout was available.
        timeout = self.socket.gettimeout()
        if timeout is None:
            timeout = self.timeout
        elif self.timeout is not None:
            timeout = min(timeout, self.timeout)
        if timeout is not None:
            deadline = time() + timeout

        # Wait until a request arrives or the timeout expires - the loop is
        # necessary to accommodate early wakeups due to EINTR.
        with _ServerSelector() as selector:
            selector.register(self, selectors.EVENT_READ)

            while True:
                ready = selector.select(timeout)
                if ready:
                    return self._handle_request_noblock()
                else:
                    if timeout is not None:
                        timeout = deadline - time()
                        if timeout < 0:
                            return self.handle_timeout()

    def _handle_request_noblock(self):
        """Handle one request, without blocking.

        I assume that selector.select() has returned that the socket is
        readable before this function was called, so there should be no risk of
        blocking in get_request().
        """
        try:
            request, client_address = self.get_request()
        except OSError:
            return
        if self.verify_request(request, client_address):
            try:
                self.process_request(request, client_address)
            except Exception:
                self.handle_error(request, client_address)
                self.shutdown_request(request)
            except:
                self.shutdown_request(request)
                raise
        else:
            self.shutdown_request(request)

    def handle_timeout(self):
        """Called if no new request arrives within self.timeout.

        Overridden by ForkingMixIn.
        """
        pass

    def verify_request(self, request, client_address):
        """Verify the request.  May be overridden.

        Return True if we should proceed with this request.

        """
        return True

    def process_request(self, request, client_address):
        """Call finish_request.

        Overridden by ForkingMixIn and ThreadingMixIn.

        """
        self.finish_request(request, client_address)
        self.shutdown_request(request)

    def server_close(self):
        """Called to clean-up the server.

        May be overridden.

        """
        pass

    def finish_request(self, request, client_address):
        """Finish one request by instantiating RequestHandlerClass."""
        self.RequestHandlerClass(request, client_address, self)

    def shutdown_request(self, request):
        """Called to shutdown and close an individual request."""
        self.close_request(request)

    def close_request(self, request):
        """Called to clean up an individual request."""
        pass

    def handle_error(self, request, client_address):
        """Handle an error gracefully.  May be overridden.

        The default is to print a traceback and continue.

        """
        print('-'*40, file=sys.stderr)
        print('Exception occurred during processing of request from',
            client_address, file=sys.stderr)
        import traceback
        traceback.print_exc()
        print('-'*40, file=sys.stderr)

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.server_close()
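
The docstrings above note that serve_forever() ignores self.timeout while handle_request() respects it. A small sketch of the latter, assuming a hypothetical QuietServer subclass with a 0.2 second timeout and a timed_out flag added purely for illustration:

```python
import socketserver

class QuietServer(socketserver.TCPServer):
    timeout = 0.2      # seconds handle_request() waits for a connection
    timed_out = False  # illustrative flag, not part of socketserver

    def handle_timeout(self):
        # Called by handle_request() when no request arrives in time.
        self.timed_out = True

class NoopHandler(socketserver.BaseRequestHandler):
    def handle(self):
        pass

with QuietServer(("127.0.0.1", 0), NoopHandler) as server:
    # Nobody connects, so this returns once the timeout has expired.
    server.handle_request()

print(server.timed_out)
```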

2.2 TCPServer

class TCPServer(BaseServer):

    """Base class for various socket-based server classes.

    Defaults to synchronous IP stream (i.e., TCP).

    Methods for the caller:

    - __init__(server_address, RequestHandlerClass, bind_and_activate=True)
    - serve_forever(poll_interval=0.5)
    - shutdown()
    - handle_request()  # if you don't use serve_forever()
    - fileno() -> int   # for selector

    Methods that may be overridden:

    - server_bind()
    - server_activate()
    - get_request() -> request, client_address
    - handle_timeout()
    - verify_request(request, client_address)
    - process_request(request, client_address)
    - shutdown_request(request)
    - close_request(request)
    - handle_error()

    Methods for derived classes:

    - finish_request(request, client_address)

    Class variables that may be overridden by derived classes or
    instances:

    - timeout
    - address_family
    - socket_type
    - request_queue_size (only for stream sockets)
    - allow_reuse_address

    Instance variables:

    - server_address
    - RequestHandlerClass
    - socket

    """

    address_family = socket.AF_INET

    socket_type = socket.SOCK_STREAM

    request_queue_size = 5

    allow_reuse_address = False

    def __init__(self, server_address, RequestHandlerClass, bind_and_activate=True):
        """Constructor.  May be extended, do not override."""
        BaseServer.__init__(self, server_address, RequestHandlerClass)
        self.socket = socket.socket(self.address_family,
                                    self.socket_type)
        if bind_and_activate:
            try:
                self.server_bind()
                self.server_activate()
            except:
                self.server_close()
                raise

    def server_bind(self):
        """Called by constructor to bind the socket.

        May be overridden.

        """
        if self.allow_reuse_address:
            self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind(self.server_address)
        self.server_address = self.socket.getsockname()

    def server_activate(self):
        """Called by constructor to activate the server.

        May be overridden.

        """
        self.socket.listen(self.request_queue_size)

    def server_close(self):
        """Called to clean-up the server.

        May be overridden.

        """
        self.socket.close()

    def fileno(self):
        """Return socket file number.

        Interface required by selector.

        """
        return self.socket.fileno()

    def get_request(self):
        """Get the request and client address from the socket.

        May be overridden.

        """
        return self.socket.accept()

    def shutdown_request(self, request):
        """Called to shutdown and close an individual request."""
        try:
            #explicitly shutdown.  socket.close() merely releases
            #the socket and waits for GC to perform the actual close.
            request.shutdown(socket.SHUT_WR)
        except OSError:
            pass #some platforms may raise ENOTCONN here
        self.close_request(request)

    def close_request(self, request):
        """Called to clean up an individual request."""
        request.close()
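
The constructor above binds and listens by default, but bind_and_activate=False lets the caller adjust options first and then run the two steps by hand. A sketch under that assumption (NoopHandler and the variable names are made up; port 0 asks the OS for a free port):

```python
import socketserver

class NoopHandler(socketserver.BaseRequestHandler):
    def handle(self):
        pass

# Defer binding so that options can be changed on the instance first.
server = socketserver.TCPServer(("127.0.0.1", 0), NoopHandler,
                                bind_and_activate=False)
try:
    server.allow_reuse_address = True  # server_bind() will set SO_REUSEADDR
    requested = server.server_address  # still the address passed in
    server.server_bind()               # bind(), then refresh server_address
    server.server_activate()           # listen(request_queue_size)
    actual = server.server_address     # now carries the OS-assigned port
finally:
    server.server_close()

print(requested, actual)
```

This also shows why server_bind() reassigns self.server_address from getsockname(): after binding to port 0 the real port is only known to the socket.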

2.3 BaseRequestHandler

class BaseRequestHandler:

    """Base class for request handler classes.

    This class is instantiated for each request to be handled.  The
    constructor sets the instance variables request, client_address
    and server, and then calls the handle() method.  To implement a
    specific service, all you need to do is to derive a class which
    defines a handle() method.

    The handle() method can find the request as self.request, the
    client address as self.client_address, and the server (in case it
    needs access to per-server information) as self.server.  Since a
    separate instance is created for each request, the handle() method
    can define other arbitrary instance variables.

    """

    def __init__(self, request, client_address, server):
        self.request = request
        self.client_address = client_address
        self.server = server
        self.setup()
        try:
            self.handle()
        finally:
            self.finish()

    def setup(self):
        pass

    def handle(self):
        pass

    def finish(self):
        pass


# The following two classes make it possible to use the same service
# class for stream or datagram servers.
# Each class sets up these instance variables:
# - rfile: a file object from which receives the request is read
# - wfile: a file object to which the reply is written
# When the handle() method returns, wfile is flushed properly

2.4 StreamRequestHandler

class StreamRequestHandler(BaseRequestHandler):

    """Define self.rfile and self.wfile for stream sockets."""

    # Default buffer sizes for rfile, wfile.
    # We default rfile to buffered because otherwise it could be
    # really slow for large data (a getc() call per byte); we make
    # wfile unbuffered because (a) often after a write() we want to
    # read and we need to flush the line; (b) big writes to unbuffered
    # files are typically optimized by stdio even when big reads
    # aren't.
    rbufsize = -1
    wbufsize = 0

    # A timeout to apply to the request socket, if not None.
    timeout = None

    # Disable nagle algorithm for this socket, if True.
    # Use only when wbufsize != 0, to avoid small packets.
    disable_nagle_algorithm = False

    def setup(self):
        self.connection = self.request
        if self.timeout is not None:
            self.connection.settimeout(self.timeout)
        if self.disable_nagle_algorithm:
            self.connection.setsockopt(socket.IPPROTO_TCP,
                                       socket.TCP_NODELAY, True)
        self.rfile = self.connection.makefile('rb', self.rbufsize)
        if self.wbufsize == 0:
            self.wfile = _SocketWriter(self.connection)
        else:
            self.wfile = self.connection.makefile('wb', self.wbufsize)

    def finish(self):
        if not self.wfile.closed:
            try:
                self.wfile.flush()
            except socket.error:
                # A final socket error may have occurred here, such as
                # the local error ECONNABORTED.
                pass
        self.wfile.close()
        self.rfile.close()
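
With StreamRequestHandler, handle() can use the file-like rfile and wfile that setup() creates instead of calling recv() and sendall() directly. A minimal sketch, assuming a hypothetical LineEchoHandler and a single connection served by handle_request() in a helper thread:

```python
import socket
import socketserver
import threading

class LineEchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # rfile and wfile were prepared by setup(); read one line, echo it.
        line = self.rfile.readline()
        self.wfile.write(b"echo: " + line)

with socketserver.TCPServer(("127.0.0.1", 0), LineEchoHandler) as server:
    # handle_request() serves exactly one connection, then returns.
    worker = threading.Thread(target=server.handle_request)
    worker.start()

    with socket.create_connection(server.server_address, timeout=5) as sock:
        sock.sendall(b"hello\n")
        with sock.makefile("rb") as reader:
            reply = reader.readline()

    worker.join()

print(reply)
```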
        

2.5 Multithreading and multiprocessing

(Omitted for now)
A reference that may be useful for this part:
socketserver模块使用与源码分析

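3. Application

This section is only a skeleton. The content was already covered in section 1.2 (socket programming versus socketserver programming); for details, follow the links mentioned above.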

3.1 Based on socket
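
The socket workflow from section 1.2 can be sketched as a single script. This is only an illustration: the server and the client run in one process, the server handles exactly one connection, and port 0 (any free port) and the helper name serve_once are assumptions made for the demo.

```python
import socket
import threading

def serve_once(listener):
    """Accept one connection, echo one message back, then close it."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

# Server side: create, bind, listen.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen()
host, port = listener.getsockname()

worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

# Client side: create, connect, send, receive.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.settimeout(5)
    sock.connect((host, port))
    sock.sendall(b"ping")
    echoed = sock.recv(1024)

worker.join()
listener.close()
print(echoed)
```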

3.2 Based on socketserver
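
The socketserver workflow from section 1.2 looks like this when put together. It is a sketch, not a production setup: serve_forever() runs in a helper thread so that the client can live in the same script, and HOST, PORT (port 0 meaning any free port) and the upper-casing behaviour are choices made for the demo.

```python
import socket
import socketserver
import threading

HOST, PORT = "127.0.0.1", 0  # demo values; port 0 means any free port

class MyHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # self.request is the connected TCP socket for this client.
        data = self.request.recv(1024)
        self.request.sendall(data.upper())

with socketserver.TCPServer((HOST, PORT), MyHandler) as server:
    # serve_forever() blocks, so it runs in a helper thread here.
    thread = threading.Thread(target=server.serve_forever)
    thread.start()

    # The client is a plain socket; there is no "socketclient".
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(5)
        sock.connect(server.server_address)
        sock.sendall(b"hello socketserver")
        received = sock.recv(1024)

    server.shutdown()  # stops serve_forever(); called from another thread
    thread.join()

print(received)
```

In a standalone server script, serve_forever() would normally be called directly in the main thread and stopped with Ctrl+C.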

References:

  1. Inspiring blog post: python之HTTP模块
  2. Source: socketserver.py
  3. Official Python documentation for the socketserver module ("A framework for network servers")
  4. socketserver模块解析
  5. Python SocketServer 源码阅读