OpenStack Source Code Analysis: Neutron Source Code Analysis (Part 2), RPC

self-motivation

Category: OpenStack. Tags: OpenStack, Neutron, RPC, WSGI, GreenPool

Article link: https://blog.csdn.net/happyanger6/article/details/54604693

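The previous article analyzed the source code of the Neutron WSGI application. This one covers another core piece of functionality, RPC, and along the way examines the concurrency model Neutron uses.

Picking up the code from the previous article: once the WSGI service has been started, the RPC workers are started.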

neutron/server/wsgi_eventlet.py:

def eventlet_wsgi_server():
    neutron_api = service.serve_wsgi(service.NeutronApiService)
    start_api_and_rpc_workers(neutron_api)

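As the code shows, Neutron's default concurrency model is built on eventlet. eventlet is a concurrent networking library; underneath it mainly relies on mechanisms such as epoll to implement non-blocking I/O, and it provides coroutines. eventlet itself is not the focus here and will get its own article later; for now it is enough to know that it is what provides the concurrency.

A word on the concurrency model itself: Neutron uses multiple processes plus a GreenPool. The WSGI app and the RPC service each run in separately forked child processes, and inside each child process an eventlet GreenPool is used to raise throughput. You can think of the GreenPool as a thread pool, except that its members are GreenThreads, i.e. coroutines. The process is explained in detail below.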
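The shape of this model can be sketched with the standard library alone. This is only an illustration under stated assumptions, not Neutron code: `os.fork` plays the role of the forked worker processes, and a `ThreadPoolExecutor` stands in for eventlet's GreenPool (real green threads are cooperative coroutines, not OS threads). The names `handle`, `worker_main` and `run_workers` are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    # Stand-in for handling one request inside a worker process.
    return "pid %d handled %s" % (os.getpid(), request)


def worker_main(requests, write_fd):
    # Each child multiplexes its requests over a pool.
    # Neutron uses eventlet.GreenPool here; a thread pool is only a stand-in.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(handle, requests))
    with os.fdopen(write_fd, "w") as out:
        out.write("\n".join(results))


def run_workers(batches):
    """Fork one child per batch and collect what each child reports."""
    children = []
    for batch in batches:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child process
            os.close(read_fd)
            try:
                worker_main(batch, write_fd)
            finally:
                os._exit(0)
        os.close(write_fd)  # parent keeps only the read end
        children.append((pid, read_fd))
    report = {}
    for pid, read_fd in children:
        with os.fdopen(read_fd) as src:
            report[pid] = src.read().splitlines()
        os.waitpid(pid, 0)  # the parent only waits, like neutron-server's main process
    return report
```

Calling `run_workers([["a", "b"], ["c"]])` returns one entry per child pid, each listing the requests that child handled, while the parent process itself handles nothing.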

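The relationships among the classes involved in starting the WSGI service are shown in the class diagram below:

[Figure: class diagram of the WSGI service startup]

The serve_wsgi function is explained below with reference to that class diagram; reading the code alongside the diagram makes things clearer: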

neutron/service.py:

def serve_wsgi(cls):

    try:
        service = cls.create()
        service.start()
    except Exception:
        with excutils.save_and_reraise_exception():
            LOG.exception(_LE('Unrecoverable error: please check log '
                              'for details.'))

    return service

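This creates a NeutronApiService object, which the class diagram shows to be a subclass of WsgiService, and then calls WsgiService's start method to start the service:

service = cls.create()
service.start()

The start method uses the Loader from oslo_service.wsgi to load the WSGI app; this Loader in turn uses paste.deploy, covered in the previous article, to load the app.

Next, look at the start method of the parent class, WsgiService: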

neutron/service.py:

class WsgiService(object):
    """Base class for WSGI based services.

    For each api you define, you must also define these flags:
    :<api>_listen: The address on which to listen
    :<api>_listen_port: The port on which to listen

    """

    def __init__(self, app_name):
        self.app_name = app_name
        self.wsgi_app = None

    def start(self):
        self.wsgi_app = _run_wsgi(self.app_name)

The start method calls _run_wsgi, which, once the app has been loaded, ends up in run_wsgi_app:

def run_wsgi_app(app):
    server = wsgi.Server("Neutron")
    server.start(app, cfg.CONF.bind_port, cfg.CONF.bind_host,
                 workers=_get_api_workers())
    LOG.info(_LI("Neutron service started, listening on %(host)s:%(port)s"),
             {'host': cfg.CONF.bind_host, 'port': cfg.CONF.bind_port})
    return server

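Together with the class diagram, this shows that the WSGI service creates a neutron.wsgi::Server object, which internally uses eventlet.GreenPool, a pool of GreenThreads, and then calls the Server's start method.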

neutron/wsgi.py:

def start(self, application, port, host='0.0.0.0', workers=0):
    """Run a WSGI server with the given application."""
    self._host = host
    self._port = port
    backlog = CONF.backlog

    self._socket = self._get_socket(self._host,
                                    self._port,
                                    backlog=backlog)

    self._launch(application, workers)

def _launch(self, application, workers=0):
    service = WorkerService(self, application, self.disable_ssl)
    if workers < 1:
        # The API service should run in the current process.
        self._server = service
        # Dump the initial option values
        cfg.CONF.log_opt_values(LOG, logging.DEBUG)
        service.start()
        systemd.notify_once()
    else:
        # dispose the whole pool before os.fork, otherwise there will
        # be shared DB connections in child processes which may cause
        # DB errors.
        api.dispose()
        # The API service runs in a number of child processes.
        # Minimize the cost of checking for child exit by extending the
        # wait interval past the default of 0.01s.
        self._server = common_service.ProcessLauncher(cfg.CONF,
                                                      wait_interval=1.0)
        self._server.launch_service(service, workers=workers)

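From the class diagram and the _launch code, the Server object wraps the application in a WorkerService and then uses the ProcessLauncher provided by oslo_service.service to start the WorkerService. (When workers is less than 1, the service is instead started directly in the current process.)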

Wrapping the application in a WorkerService:

service = WorkerService(self, application, self.disable_ssl)

Running the service through ProcessLauncher:

self._server = common_service.ProcessLauncher(cfg.CONF,
                                              wait_interval=1.0)
self._server.launch_service(service, workers=workers)

The main job of ProcessLauncher is to fork as many child processes as workers specifies and then start the WorkerService in each of them:

def launch_service(self, service, workers=1):
    """Launch a service with a given number of workers.

   :param service: a service to launch, must be an instance of
          :class:`oslo_service.service.ServiceBase`
   :param workers: a number of processes in which a service
          will be running
    """
    _check_service_base(service)
    wrap = ServiceWrapper(service, workers)

    LOG.info(_LI('Starting %d workers'), wrap.workers)
    while self.running and len(wrap.children) < wrap.workers:
        self._start_child(wrap)
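The loop above can be imitated in a few lines. The following is a minimal sketch, not the oslo.service implementation: `ServiceBase` here is a local stand-in that models only `start()`, and `MiniLauncher` and `EchoService` are hypothetical names. It assumes a POSIX system where `os.fork` is available.

```python
import abc
import os


class ServiceBase(abc.ABC):
    """Local stand-in for oslo_service.service.ServiceBase (only start() is modelled)."""

    @abc.abstractmethod
    def start(self):
        """Run the service; called inside each child process."""


class EchoService(ServiceBase):
    # Hypothetical service: reports the pid of the process that ran it.
    def __init__(self, write_fd):
        self.write_fd = write_fd

    def start(self):
        os.write(self.write_fd, ("%d\n" % os.getpid()).encode())


class MiniLauncher(object):
    def __init__(self):
        self.children = set()

    def launch_service(self, service, workers=1):
        if not isinstance(service, ServiceBase):
            raise TypeError("service must be a ServiceBase instance")
        # Same shape as the loop above: fork until enough children exist.
        while len(self.children) < workers:
            self._start_child(service)

    def _start_child(self, service):
        pid = os.fork()
        if pid == 0:  # child: run the service, then exit without returning
            status = 0
            try:
                service.start()
            except Exception:
                status = 1
            finally:
                os._exit(status)
        self.children.add(pid)  # parent: remember the child

    def wait(self):
        """Reap every child and return {pid: exit_code}."""
        codes = {}
        while self.children:
            pid = self.children.pop()
            _, status = os.waitpid(pid, 0)
            codes[pid] = os.WEXITSTATUS(status)
        return codes
```

With `workers=3`, three distinct child pids run `start()`, and the parent only keeps track of them and reaps them in `wait()`.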

During startup, WorkerService uses the Server object's GreenPool to spawn a GreenThread that calls eventlet.wsgi.server to run our app. That is what finally brings up the WSGI app service and exposes the RESTful API.

neutron/wsgi.py:

As you can see, the start method calls pool.spawn on self._service, i.e. the Server object, to run the Server's _run method:

class WorkerService(worker.NeutronWorker):

    def start(self):
        super(WorkerService, self).start()
        # When api worker is stopped it kills the eventlet wsgi server which
        # internally closes the wsgi server socket object. This server socket
        # object becomes not usable which leads to "Bad file descriptor"
        # errors on service restart.
        # Duplicate a socket object to keep a file descriptor usable.
        dup_sock = self._service._socket.dup()
        if CONF.use_ssl and not self._disable_ssl:
            dup_sock = sslutils.wrap(CONF, dup_sock)
        self._server = self._service.pool.spawn(self._service._run,
                                                self._application,
                                                dup_sock)
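The comment in start() explains why the socket is duplicated: whoever receives the duplicate may close it without invalidating the original. A small sketch of that property follows. It uses `socket.socketpair()` rather than a listening socket purely to stay self-contained, and the function name is hypothetical.

```python
import socket


def original_survives_dup_close():
    """Show that closing a dup()'ed socket leaves the original usable."""
    original, peer = socket.socketpair()
    try:
        dup_sock = original.dup()  # new file descriptor, same underlying socket
        distinct_fds = dup_sock.fileno() != original.fileno()

        dup_sock.sendall(b"via-dup")
        first = peer.recv(16)

        dup_sock.close()  # what a stopped WSGI server does to its socket

        original.sendall(b"via-original")  # the original still works
        second = peer.recv(16)
        return distinct_fds, first, second
    finally:
        original.close()
        peer.close()
```

Had the original socket object been handed over instead of a duplicate, the close would have left the Server with an unusable descriptor, which is the "Bad file descriptor" situation the comment describes.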

self._service._run is the Server's _run method:

def _run(self, application, socket):
    """Start a WSGI server in a new green thread."""
    eventlet.wsgi.server(socket, application,
                         max_size=self.num_threads,
                         log=LOG,
                         keepalive=CONF.wsgi_keep_alive,
                         socket_timeout=self.client_socket_timeout)

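In the configuration analyzed here workers is 1, so one child process is created to serve the RESTful API, and the eventlet.wsgi.server in that child process ultimately runs inside a GreenThread.

From the WSGI server startup above we know that Neutron runs its services as processes plus GreenPool; the RPC service described below uses the same architecture. We also know that the key object, ProcessLauncher, starts services by creating processes. A service started by ProcessLauncher must be a subclass of oslo_service.service::ServiceBase and implement the start method.

With this groundwork, the RPC startup is easy to analyze.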

def start_api_and_rpc_workers(neutron_api):
    pool = eventlet.GreenPool()

    api_thread = pool.spawn(neutron_api.wait)

    try:
        neutron_rpc = service.serve_rpc()
    except NotImplementedError:
        LOG.info(_LI("RPC was already started in parent process by "
                     "plugin."))
    else:
        rpc_thread = pool.spawn(neutron_rpc.wait)

        plugin_workers = service.start_plugin_workers()
        for worker in plugin_workers:
            pool.spawn(worker.wait)

        # api and rpc should die together.  When one dies, kill the other.
        rpc_thread.link(lambda gt: api_thread.kill())
        api_thread.link(lambda gt: rpc_thread.kill())

    pool.waitall()

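The main process uses a GreenPool to run the wait methods of neutron_api and neutron_rpc (plus one per plugin worker) and then calls waitall to wait for these GreenThreads to finish. In effect, the main process does nothing but wait for the WSGI API and RPC child processes to exit. The link calls ensure that as soon as either the RPC or the API service dies, the other one is stopped as well.

We focus on how neutron_rpc is created: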
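The "die together" wiring can be illustrated without eventlet. In this sketch, which is an analogy and not eventlet's API, `Future.add_done_callback` stands in for `GreenThread.link`, and setting a `threading.Event` stands in for `kill()`. All names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def wait_until_stopped(stop_event, name, finished):
    # Stand-in for launcher.wait(): block until told to stop.
    stop_event.wait()
    finished.append(name)


def run_linked():
    api_stop, rpc_stop = threading.Event(), threading.Event()
    finished = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        api = pool.submit(wait_until_stopped, api_stop, "api", finished)
        rpc = pool.submit(wait_until_stopped, rpc_stop, "rpc", finished)
        # Same intent as rpc_thread.link(lambda gt: api_thread.kill())
        rpc.add_done_callback(lambda fut: api_stop.set())
        api.add_done_callback(lambda fut: rpc_stop.set())
        rpc_stop.set()  # simulate the RPC side going away first
    # Leaving the with-block waits for both tasks, like pool.waitall().
    return finished
```

`run_linked()` returns `["rpc", "api"]`: the API task can only finish after the RPC task's completion callback has released it.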

neutron/service.py:

def serve_rpc():
    plugin = manager.NeutronManager.get_plugin() 
    service_plugins = (
        manager.NeutronManager.get_service_plugins().values())

    if cfg.CONF.rpc_workers < 1:
        cfg.CONF.set_override('rpc_workers', 1)

    # If 0 < rpc_workers then start_rpc_listeners would be called in a
    # subprocess and we cannot simply catch the NotImplementedError.  It is
    # simpler to check this up front by testing whether the plugin supports
    # multiple RPC workers.
    if not plugin.rpc_workers_supported():
        LOG.debug("Active plugin doesn't implement start_rpc_listeners")
        if 0 < cfg.CONF.rpc_workers:
            LOG.error(_LE("'rpc_workers = %d' ignored because "
                          "start_rpc_listeners is not implemented."),
                      cfg.CONF.rpc_workers)
        raise NotImplementedError()

    try:
        # passing service plugins only, because core plugin is among them
        rpc = RpcWorker(service_plugins)
        # dispose the whole pool before os.fork, otherwise there will
        # be shared DB connections in child processes which may cause
        # DB errors.
        LOG.debug('using launcher for rpc, workers=%s', cfg.CONF.rpc_workers)
        session.dispose()
        launcher = common_service.ProcessLauncher(cfg.CONF, wait_interval=1.0)
        launcher.launch_service(rpc, workers=cfg.CONF.rpc_workers)
        if (cfg.CONF.rpc_state_report_workers > 0 and
            plugin.rpc_state_report_workers_supported()):
            rpc_state_rep = RpcReportsWorker([plugin])
            LOG.debug('using launcher for state reports rpc, workers=%s',
                      cfg.CONF.rpc_state_report_workers)
            launcher.launch_service(
                rpc_state_rep, workers=cfg.CONF.rpc_state_report_workers)

        return launcher
    except Exception:
        with excutils.save_and_reraise_exception():
            LOG.exception(_LE('Unrecoverable error: please check log for '
                              'details.'))

plugin = manager.NeutronManager.get_plugin()

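NeutronManager was also mentioned in the previous article. Its main job is to load and initialize the right plugins, such as Ml2Plugin, according to the configuration file. Here its class method get_plugin() is called to obtain the configured core plugin; going through the class method ensures that NeutronManager behaves as a singleton. plugin is therefore the Ml2Plugin instance.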
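The singleton-through-classmethod idea can be sketched as follows. This is an illustrative pattern under assumptions, not NeutronManager's actual code: `Manager` and `FakeCorePlugin` are hypothetical, and the real manager loads plugin classes from configuration rather than hard-coding one.

```python
class FakeCorePlugin(object):
    """Hypothetical stand-in for the configured core plugin."""
    name = "Ml2Plugin"


class Manager(object):
    """Hypothetical manager that owns exactly one core plugin per process."""
    _instance = None

    def __init__(self):
        # In Neutron this is where the configured plugin would be loaded.
        self.plugin = FakeCorePlugin()

    @classmethod
    def get_instance(cls):
        # Create the manager lazily, and only once.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    @classmethod
    def get_plugin(cls):
        # Callers never construct Manager themselves.
        return cls.get_instance().plugin
```

Because every caller goes through `Manager.get_plugin()`, repeated calls return the same plugin object.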

 service_plugins = (
        manager.NeutronManager.get_service_plugins().values())

Next, all service_plugins are fetched. As also covered in the previous article, this ends up returning instances of the following six plugins:

'neutron.plugins.ml2.plugin.Ml2Plugin'

'neutron.services.network_ip_availability.plugin.NetworkIPAvailabilityPlugin'

'neutron.services.auto_allocate.plugin.Plugin'

'neutron.services.timestamp.timestamp_plugin.TimeStampPlugin'

'neutron.services.tag.tag_plugin.TagPlugin'

'neutron.services.l3_router.l3_router_plugin.L3RouterPlugin'

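if cfg.CONF.rpc_workers < 1:
    cfg.CONF.set_override('rpc_workers', 1)

Then the configured number of rpc_workers is read. It defaults to 1, and any value below 1 is overridden to 1. As shown above, this determines how many child processes ProcessLauncher later starts to provide the service.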

if not plugin.rpc_workers_supported():
    LOG.debug("Active plugin doesn't implement start_rpc_listeners")
    if 0 < cfg.CONF.rpc_workers:
        LOG.error(_LE("'rpc_workers = %d' ignored because "
                      "start_rpc_listeners is not implemented."),
                  cfg.CONF.rpc_workers)
    raise NotImplementedError()

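Next comes a check of whether the core plugin (here Ml2Plugin) implements the start_rpc_listeners method. If it does not, NotImplementedError is raised, which start_api_and_rpc_workers catches and logs.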
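One way such an up-front check can be written is to test whether a subclass has replaced the base class's method. The sketch below is an assumption about the general technique, not a copy of Neutron's plugin base class; `PluginBase`, `Ml2Like`, `serve_rpc` and `start_api_and_rpc` are simplified, hypothetical stand-ins.

```python
class PluginBase(object):
    def start_rpc_listeners(self):
        raise NotImplementedError()

    def rpc_workers_supported(self):
        # Supported only if a subclass replaced the base implementation.
        return (type(self).start_rpc_listeners
                is not PluginBase.start_rpc_listeners)


class Ml2Like(PluginBase):
    def start_rpc_listeners(self):
        return ["rpc-server"]


def serve_rpc(plugin, rpc_workers=0):
    rpc_workers = max(rpc_workers, 1)  # same effect as the set_override above
    if not plugin.rpc_workers_supported():
        # Checked before forking, because an exception raised in a child
        # process could not simply be caught by the parent.
        raise NotImplementedError()
    return rpc_workers


def start_api_and_rpc(plugin):
    try:
        workers = serve_rpc(plugin)
    except NotImplementedError:
        return "RPC not started by serve_rpc"
    return "RPC started with %d worker(s)" % workers
```

With `Ml2Like()` the RPC workers are started; with a bare `PluginBase()` the caller takes the NotImplementedError branch instead.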

rpc = RpcWorker(service_plugins)

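Then an RpcWorker is created. It plays the same role as neutron.wsgi::WorkerService above: it also inherits from NeutronWorker, a subclass of ServiceBase, and overrides the start method so that it can be handed to ProcessLauncher to run. Its start method is therefore the key code for bringing up the service: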

neutron/service.py:

class RpcWorker(worker.NeutronWorker):
    """Wraps a worker to be handled by ProcessLauncher"""
    start_listeners_method = 'start_rpc_listeners'

    def __init__(self, plugins):
        self._plugins = plugins
        self._servers = []

    def start(self):
        super(RpcWorker, self).start()
        for plugin in self._plugins:
            if hasattr(plugin, self.start_listeners_method):
                try:
                    servers = getattr(plugin, self.start_listeners_method)()
                except NotImplementedError:
                    continue
                self._servers.extend(servers)

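As you can see, it iterates over all service_plugins, i.e. the six plugins listed above, and checks whether each one implements the "start_rpc_listeners" method, calling it if so. That is all RpcWorker does. The plugins' start_rpc_listeners methods carry out the actual RPC work, mainly by consuming messages from specifically named MQ queues.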
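The effect of that loop is easiest to see with a few fake plugins. The sketch below reuses the same hasattr/getattr logic as RpcWorker.start; the three plugin classes and the `start_listeners` helper are hypothetical and exist only for illustration.

```python
class ListeningPlugin(object):
    def start_rpc_listeners(self):
        # A real plugin would create message-queue consumers here.
        return ["rpc-server-for-ListeningPlugin"]


class AbstractOnlyPlugin(object):
    def start_rpc_listeners(self):
        raise NotImplementedError()


class PlainPlugin(object):
    pass  # no start_rpc_listeners at all


def start_listeners(plugins, method_name="start_rpc_listeners"):
    servers = []
    for plugin in plugins:
        if hasattr(plugin, method_name):
            try:
                servers.extend(getattr(plugin, method_name)())
            except NotImplementedError:
                continue  # method exists but is not implemented: skip
    return servers
```

Only the first plugin contributes a server; the other two are skipped silently, one because the method raises NotImplementedError and the other because the attribute is missing.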

launcher = common_service.ProcessLauncher(cfg.CONF, wait_interval=1.0)
launcher.launch_service(rpc, workers=cfg.CONF.rpc_workers)

This way ProcessLauncher creates workers child processes (1 by default) to provide the RPC service; the concrete RPC functionality is left to the plugins' "start_rpc_listeners" methods.

if (cfg.CONF.rpc_state_report_workers > 0 and
    plugin.rpc_state_report_workers_supported()):
    rpc_state_rep = RpcReportsWorker([plugin])
    LOG.debug('using launcher for state reports rpc, workers=%s',
              cfg.CONF.rpc_state_report_workers)
    launcher.launch_service(
        rpc_state_rep, workers=cfg.CONF.rpc_state_report_workers)

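Then it checks whether rpc_state_report_workers is configured and supported by the core plugin. If so, the specified number of additional child processes is started to run RpcReportsWorker. This worker likewise derives from ServiceBase and provides a start method. The actual RPC functionality is left to the plugin's 'start_rpc_state_reports_listener' method.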

plugin_workers = service.start_plugin_workers()
for worker in plugin_workers:
    pool.spawn(worker.wait)

def start_plugin_workers():
    launchers = []
    # NOTE(twilson) get_service_plugins also returns the core plugin
    for plugin in manager.NeutronManager.get_unique_service_plugins():
        # TODO(twilson) Instead of defaulting here, come up with a good way to
        # share a common get_workers default between NeutronPluginBaseV2 and
        # ServicePluginBase
        for plugin_worker in getattr(plugin, 'get_workers', tuple)():
            print("Plugin start_worker",plugin,plugin_worker)
            launcher = common_service.ProcessLauncher(cfg.CONF)
            launcher.launch_service(plugin_worker)
            launchers.append(launcher)
    return launchers

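Finally, the 'get_workers' method of every plugin is called. This method lets a plugin define its own ServiceBase subclasses to provide plugin-specific services; any such custom ServiceBase is likewise handed to a ProcessLauncher, which creates a process to start it.

At this point Neutron as a whole is up. Both RPC and WSGI are wrapped in classes derived from ServiceBase and handed to ProcessLauncher, which creates processes to start them, and hook methods make it easy for plugins to add the services they need. With the default configuration there end up being three child processes, providing the WSGI API, RPC and rpc_state_reports services respectively.

The main process, which waits through a GreenPool for all child processes to exit: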
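The expression getattr(plugin, 'get_workers', tuple)() deserves a note: when the plugin has no get_workers method, the default `tuple` is returned and then called, which yields an empty tuple, so the inner loop simply does nothing. A small sketch with hypothetical plugin classes:

```python
class PluginWithWorkers(object):
    def get_workers(self):
        # A real plugin would return ServiceBase instances here.
        return ["periodic-sync-worker"]


class PluginWithoutWorkers(object):
    pass


def collect_workers(plugins):
    found = []
    for plugin in plugins:
        # Falls back to tuple(), i.e. an empty tuple, when the hook is absent.
        for plugin_worker in getattr(plugin, "get_workers", tuple)():
            found.append(plugin_worker)
    return found
```

This keeps the caller free of hasattr checks: plugins that do not define the hook contribute nothing.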

neutron 1348 1 0 16:02 ? 00:00:24 /usr/bin/python /usr/bin/neutron-server --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/plugins/ml2/ml2_conf.ini --log-file=/var/log/neutron/neutron-server.log

The three child processes, each providing a different service:

neutron 3275 1348 0 16:03 ? 00:00:00 /usr/bin/python /usr/bin/neutron-server --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/plugins/ml2/ml2_conf.ini --log-file=/var/log/neutron/neutron-server.log

neutron 3276 1348 0 16:03 ? 00:00:23 /usr/bin/python /usr/bin/neutron-server --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/plugins/ml2/ml2_conf.ini --log-file=/var/log/neutron/neutron-server.log

neutron 3277 1348 0 16:03 ? 00:00:22 /usr/bin/python /usr/bin/neutron-server --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/plugins/ml2/ml2_conf.ini --log-file=/var/log/neutron/neutron-server.log