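- API: the HTTP interface into Nova;
- Scheduler: picks the most suitable compute node (physical host) from the available pool to create a virtual machine instance on;
- Conductor: adds a layer of protection around database access, so that Compute never touches the database directly;
- Compute: manages the lifecycle of virtual machines;
- nova-novncproxy: users can reach a VM console in several ways; this one is VNC access from a web browser;
- nova-consoleauth: provides token authentication for access to VM consoles;
- nova-cert: provides x509 certificate support.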
2. The API service

    def main():
        config.parse_args(sys.argv)
        logging.setup(CONF, "nova")
        utils.monkey_patch()
        objects.register_all()
        if 'osapi_compute' in CONF.enabled_apis:
            # NOTE(mriedem): This is needed for caching the nova-compute service
            # version which is looked up when a server create request is made with
            # network id of 'auto' or 'none'.
            objects.Service.enable_min_version_cache()
        log = logging.getLogger(__name__)

        gmr.TextGuruMeditation.setup_autorun(version)

        launcher = service.process_launcher()
        started = 0
        for api in CONF.enabled_apis:
            should_use_ssl = api in CONF.enabled_ssl_apis
            try:
                server = service.WSGIService(api, use_ssl=should_use_ssl)
                launcher.launch_service(server, workers=server.workers or 1)
                started += 1
            except exception.PasteAppNotFound as ex:
                log.warning(
                    _LW("%s. ``enabled_apis`` includes bad values. "
                        "Fix to remove this warning."), six.text_type(ex))

        if started == 0:
            log.error(_LE('No APIs were started. '
                          'Check the enabled_apis config option.'))
            sys.exit(1)

        launcher.wait()

Parsing the arguments

    config.parse_args(sys.argv)

The implementation is:

    CONF = nova.conf.CONF


    def parse_args(argv, default_config_files=None, configure_db=True,
                   init_rpc=True):
        log.register_options(CONF)
        if CONF.glance.debug:
            extra_default_log_levels = ['glanceclient=DEBUG']
        else:
            extra_default_log_levels = ['glanceclient=WARN']
        log.set_defaults(default_log_levels=log.get_default_log_levels() +
                         extra_default_log_levels)
        rpc.set_defaults(control_exchange='nova')
        config.set_middleware_defaults()

        CONF(argv[1:],
             project='nova',
             version=version.version_string(),
             default_config_files=default_config_files)

        if init_rpc:
            rpc.init(CONF)

        if configure_db:
            sqlalchemy_api.configure(CONF)

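This function mostly initializes settings and hands the parameters that nova-api needs at runtime over to cfg. Let's go through it line by line.

First, CONF = nova.conf.CONF. In nova/conf/__init__.py, CONF comes from oslo.config: it is the global instance of the ConfigOpts class that oslo.config defines. It is passed around inside Nova and is the object on which configuration options are registered throughout the code. The logging options are registered in the same way:

    log.register_options(CONF)

The CONF instance is passed into the oslo.log library, which registers the logging-related options on it.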

    if CONF.glance.debug:
        extra_default_log_levels = ['glanceclient=DEBUG']
    else:
        extra_default_log_levels = ['glanceclient=WARN']
    log.set_defaults(default_log_levels=log.get_default_log_levels() +
                     extra_default_log_levels)

This part checks the debug option in the [glance] group and uses it to choose the log level that glanceclient gets in the logging defaults.
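The effect of `name=LEVEL` pairs such as 'glanceclient=WARN' can be sketched with the standard `logging` module alone. This is only an illustration of the idea, not how oslo.log implements it; the helper name `apply_default_log_levels` and the `debug_enabled` flag are made up for the sketch.

```python
import logging


def apply_default_log_levels(pairs):
    """Set a level on each named logger from 'logger_name=LEVEL' strings."""
    for pair in pairs:
        name, _, level_name = pair.partition('=')
        # setLevel accepts a level name such as 'DEBUG' or 'WARN'.
        logging.getLogger(name).setLevel(level_name)


debug_enabled = False  # hypothetical stand-in for CONF.glance.debug
if debug_enabled:
    extra_default_log_levels = ['glanceclient=DEBUG']
else:
    extra_default_log_levels = ['glanceclient=WARN']

apply_default_log_levels(extra_default_log_levels)
```

With the flag off, the `glanceclient` logger only lets warnings and above through, which keeps a chatty client library quiet unless debugging was requested.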

    rpc.set_defaults(control_exchange='nova')

This eventually calls the following:

    def set_transport_defaults(control_exchange):
        cfg.set_defaults(_transport_opts, control_exchange=control_exchange)

So this call simply sets the default of the control_exchange option in _transport_opts: the default exchange is named nova.
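The pattern of "a library defines an option, the application overrides its default before parsing" has a close analogue in the standard library's `argparse`. The sketch below uses that analogue; it is not oslo.config code, and the option name and the placeholder default `'openstack'` are chosen only for illustration.

```python
import argparse

# Stand-in for a library that defines an option with a generic default.
parser = argparse.ArgumentParser(prog='demo')
parser.add_argument('--control-exchange', default='openstack')

# The application overrides the default before anything is parsed,
# in the spirit of rpc.set_defaults(control_exchange='nova').
parser.set_defaults(control_exchange='nova')

no_flags = parser.parse_args([])
with_flag = parser.parse_args(['--control-exchange', 'custom'])
```

An overridden default still loses to an explicit value: `no_flags.control_exchange` picks up the application default, while `with_flag.control_exchange` keeps what was passed on the command line.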

    config.set_middleware_defaults()

Following the code shows that this line sets the default options for oslo.middleware. Next:

    CONF(argv[1:],
         project='nova',
         version=version.version_string(),
         default_config_files=default_config_files)

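Here argv is the first argument passed to parse_args, that is, sys.argv: the command-line arguments given when the service was started. Following the command systemctl start openstack-nova-api leads to the unit file behind openstack-nova-api: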

    # /usr/lib/systemd/system/openstack-nova-api.service
    [Unit]
    Description=OpenStack Nova API Server
    After=syslog.target network.target

    [Service]
    Type=notify
    NotifyAccess=all
    TimeoutStartSec=0
    Restart=always
    User=nova
    ExecStart=/usr/bin/nova-api

    [Install]
    WantedBy=multi-user.target

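Unfortunately, the ExecStart command is not followed by anything like the expected --config-file /etc/nova/nova.conf. sys.argv therefore holds only the program path /usr/bin/nova-api, argv[1:] is empty, and default_config_files is None. When CONF is called in oslo.config with default_config_files unset, it locates nova.conf itself from a set of standard directories and the project name. This is the step that actually loads Nova's configuration into CONF.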
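The fallback lookup can be pictured with a small stand-alone sketch. It is a simplification under stated assumptions, not oslo.config's actual search code: the function name `find_default_config` is hypothetical, and a temporary directory plays the role of /etc/nova.

```python
import os
import tempfile


def find_default_config(project, search_dirs):
    """Return the first existing '<dir>/<project>.conf', or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, project + '.conf')
        if os.path.exists(candidate):
            return candidate
    return None


# A service started as a bare "/usr/bin/nova-api" has no extra arguments.
argv = ['/usr/bin/nova-api']
cli_args = argv[1:]  # empty list: no --config-file was given

with tempfile.TemporaryDirectory() as etc_nova:
    # Create an empty nova.conf in the directory standing in for /etc/nova.
    open(os.path.join(etc_nova, 'nova.conf'), 'w').close()
    found = find_default_config('nova', ['/nonexistent-dir', etc_nova])
    found_name = os.path.basename(found)
```

The point of the sketch is only the order of events: nothing is passed on the command line, so the project name alone decides which file is picked up.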

    if init_rpc:
        rpc.init(CONF)

    if configure_db:
        sqlalchemy_api.configure(CONF)

These two steps initialize the RPC and database configuration. As you can see, there is quite a bit going on even in argument parsing.
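Setting up logging for nova

    logging.setup(CONF, "nova")

Stepping into oslo.log, the setup method first checks whether a dedicated logging configuration file was passed in for Nova; if not, it uses the settings from nova.conf. It then installs an exception hook for nova, so that an uncaught exception in nova-api is intercepted by the hook and written to the log. How the excepthook is created and how it works is not analyzed in detail here; the relevant method is _create_logging_excepthook(product_name).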
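For readers who have not used exception hooks, the mechanism can be sketched with the standard library. This is a minimal illustration of the technique and not a copy of oslo.log's code; the function name `create_logging_excepthook` and the message text are assumptions of the sketch.

```python
import logging
import sys


def create_logging_excepthook(product_name):
    """Build a sys.excepthook replacement that logs uncaught exceptions."""
    def logging_excepthook(exc_type, value, tb):
        logging.getLogger(product_name).critical(
            'Unhandled error', exc_info=(exc_type, value, tb))
    return logging_excepthook


hook = create_logging_excepthook('nova')

# A real service would install it with: sys.excepthook = hook
# Here the hook is called by hand so that the demo does not crash.
try:
    raise RuntimeError('boom')
except RuntimeError:
    hook(*sys.exc_info())
```

Once a hook like this is installed as `sys.excepthook`, the interpreter calls it for any exception that reaches the top level, so the traceback ends up in the service log instead of only on stderr.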
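Monkey patching

    utils.monkey_patch()

Monkey patching means replacing code dynamically at runtime, usually during startup.

Anyone who has used gevent will recognize the gevent.monkey.patch_all() call at the very top of a program. It replaces thread, socket and similar pieces of the standard library, so later code can use sockets exactly as usual without any changes, except that they are now non-blocking.

In a game server I once worked on, many places used import json. We later found that ujson was many times faster than the built-in json, which raised a question: do dozens of files really have to be changed one by one from import json to import ujson as json?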
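In fact it is enough to monkey patch once where the process starts up, because the patch affects the whole process: within one process a module is executed only once. A small demo makes this easier to see:

    import json
    import ujson


    def monkey_patch_json():
        # Swap the stdlib functions for the ujson ones, process-wide.
        json.__name__ = 'ujson'
        json.dumps = ujson.dumps
        json.loads = ujson.loads


    monkey_patch_json()
    print('main.py', json.__name__)
    # sub is a second module of the demo that also does "import json";
    # it gets the already patched module.
    import sub
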
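Back in nova: utils.monkey_patch() checks whether monkey patching is enabled in the configuration file. If it is, it reads the monkey_patch_modules setting and performs the replacements. It is disabled by default.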
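A configuration-driven patch of this kind might look roughly like the sketch below. It assumes that each configured entry names a module plus a decorator to wrap that module's functions with; that reading, the helper names, and the throwaway `demo` module are all assumptions of this sketch rather than Nova's actual code.

```python
import types

CALLS = []


def trace_decorator(name, func):
    """Example decorator: record each call, then delegate."""
    def wrapper(*args, **kwargs):
        CALLS.append(name)
        return func(*args, **kwargs)
    return wrapper


def monkey_patch(enabled, module, decorator):
    """Wrap every plain function of `module` if patching is enabled."""
    if not enabled:
        return
    for attr, value in list(vars(module).items()):
        if isinstance(value, types.FunctionType):
            qualified = '%s.%s' % (module.__name__, attr)
            setattr(module, attr, decorator(qualified, value))


# A throwaway module standing in for a real one such as nova.compute.api.
demo = types.ModuleType('demo')
exec('def add(a, b):\n    return a + b\n', demo.__dict__)

monkey_patch(True, demo, trace_decorator)
result = demo.add(1, 2)
```

Callers of `demo.add` do not change at all, yet every call now passes through the decorator, which is what makes this approach attractive for cross-cutting concerns such as notifications or tracing.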
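Registering all the objects

    objects.register_all()

A glance at the implementation shows that it relies on __import__. __import__ does the same job as the import statement; the difference is that it is a function and takes the module name as a string. It is used here to import everything under nova.objects.

Each class under the nova/objects directory corresponds to a table in the database and is the final interface for database operations: both conductor and api go through objects when they work with the database.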
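One detail of `__import__` is easy to trip over: for a dotted name it returns the top-level package, not the submodule, although the submodule is still loaded. The snippet below shows this with a standard-library module; `importlib.import_module` is included only for comparison.

```python
import importlib
import sys

# __import__ takes the module name as a string. For a dotted name it
# returns the top-level package, but the submodule still gets loaded.
top = __import__('json.decoder')
loaded = 'json.decoder' in sys.modules

# importlib.import_module returns the submodule itself.
sub = importlib.import_module('json.decoder')
```

For a registration function the return value does not matter. What counts is the side effect: the module body runs, and any classes it defines get a chance to register themselves.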
Getting the service launcher

    launcher = service.process_launcher()

Tracing process_launcher() back to its origin leads to oslo.service:

    class ProcessLauncher(object):

        def __init__(self):
            self.children = {}
            self.sigcaught = None
            self.running = True
            rfd, self.writepipe = os.pipe()
            self.readpipe = eventlet.greenio.GreenPipe(rfd, 'r')
            self.handle_signal()

This code creates a ProcessLauncher object. The ProcessLauncher represents the current process; later it starts a number of workers and acts as their parent process. The signal handling set up here mainly catches signal.SIGTERM and signal.SIGINT: when either signal arrives, the running attribute of the ProcessLauncher is set to False.
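The signal bookkeeping described here can be sketched in a few lines. `TinyLauncher` is a hypothetical stand-in that keeps only the `running` and `sigcaught` attributes; it is not the oslo.service class.

```python
import signal


class TinyLauncher(object):
    """Hypothetical stand-in showing only the signal bookkeeping."""

    def __init__(self):
        self.running = True
        self.sigcaught = None

    def handle_signal(self):
        # Register the same handler for both termination signals.
        # signal.signal() must be called from the main thread.
        signal.signal(signal.SIGTERM, self._on_signal)
        signal.signal(signal.SIGINT, self._on_signal)

    def _on_signal(self, signo, frame):
        # Remember which signal arrived and ask the main loop to stop.
        self.sigcaught = signo
        self.running = False


launcher = TinyLauncher()
```

Calling `launcher.handle_signal()` installs the handlers. A main loop written as `while launcher.running:` then ends shortly after a SIGTERM or SIGINT is delivered, instead of the process being killed mid-operation.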
Starting each API

    started = 0
    for api in CONF.enabled_apis:
        should_use_ssl = api in CONF.enabled_ssl_apis
        try:
            server = service.WSGIService(api, use_ssl=should_use_ssl)
            launcher.launch_service(server, workers=server.workers or 1)
            started += 1

CONF.enabled_apis corresponds to this line in the configuration file:

    enabled_apis = ec2, osapi_compute, metadata

Before working out what these are, let's first look at service.WSGIService:

    class WSGIService(service.Service):

        def __init__(self, name, loader=None, use_ssl=False, max_url_len=None):
            self.name = name
            self.binary = 'nova-%s' % name
            self.topic = None
            self.manager = self._get_manager()
            self.loader = loader or wsgi.Loader()
            self.app = self.loader.load_app(name)
            # inherit all compute_api worker counts from osapi_compute
            if name.startswith('openstack_compute_api'):
                wname = 'osapi_compute'
            else:
                wname = name
            self.host = getattr(CONF, '%s_listen' % name, "0.0.0.0")
            self.port = getattr(CONF, '%s_listen_port' % name, 0)
            self.workers = (getattr(CONF, '%s_workers' % wname, None)
                            or processutils.get_worker_count())
            if self.workers and self.workers < 1:
                worker_name = '%s_workers' % name
                msg = (_("%(worker_name)s value of %(workers)s is invalid, "
                         "must be greater than 0")
                       % {'worker_name': worker_name,
                          'workers': str(self.workers)})
                raise exception.InvalidInput(msg)
            self.use_ssl = use_ssl
            self.server = wsgi.Server(name,
                                      self.app,
                                      host=self.host,
                                      port=self.port,
                                      use_ssl=self.use_ssl,
                                      max_url_len=max_url_len)
            # Pull back actual port used
            self.port = self.server.port
            self.backdoor_port = None

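This initializer uses the paste library to build the WSGI app, although the app is only one attribute of the service. wsgi.Server is set up to listen on the corresponding port, so the listening side is already prepared at this point, while the server only really starts serving later. Let's see what each entry of enabled_apis turns into (debug output printed while running the code):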

    name: ec2
    manager: None
    loader: <nova.wsgi.Loader object at 0x2f1ea10>
    app: {(None, '/services/Cloud'): <nova.api.ec2.FaultWrapper object at 0x37a2090>}
    host: 0.0.0.0
    port: 8773
    workers: 1

    name: osapi_compute
    manager: None
    loader: <nova.wsgi.Loader object at 0x2f1edd0>
    app: {(None, '/v3'): <nova.api.openstack.FaultWrapper object at 0x41425d0>, (None, '/v1.1'): <nova.api.openstack.FaultWrapper object at 0x48cdd50>, (None, ''): <nova.api.openstack.FaultWrapper object at 0x48d5850>, (None, '/v2'): <nova.api.openstack.FaultWrapper object at 0x413cb90>}
    host: 0.0.0.0
    port: 8774
    workers: 1

    name: metadata
    manager: <nova.api.manager.MetadataManager object at 0x2f21210>
    loader: <nova.wsgi.Loader object at 0x2f1ee50>
    app: {(None, ''): <nova.api.ec2.FaultWrapper object at 0x48edd50>}
    host: 0.0.0.0
    port: 8775
    workers: 1

name is simply the name of this WSGIService.

manager is set only for metadata. It is configured in the configuration file and looked up as follows:

    def _get_manager(self):
        """Initialize a Manager object appropriate for this service.

        Use the service name to look up a Manager subclass from the
        configuration and initialize an instance. If no class name
        is configured, just return None.

        :returns: a Manager instance, or None.

        """
        fl = '%s_manager' % self.name
        if fl not in CONF:
            return None

        manager_class_name = CONF.get(fl, None)
        if not manager_class_name:
            return None

        manager_class = importutils.import_class(manager_class_name)
        return manager_class()

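loader parses paste.ini and builds the app; the default loader is used everywhere here.

app is the WSGI app generated by paste; incoming requests are all sent to these apps.

host and port are the listening address and port.

workers is the number of workers. By default there is usually one per CPU core; the reason is analyzed later. Continuing with the code:

    self.app = self.loader.load_app(name)
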

    class Loader(object):
        """Used to load WSGI applications from paste configurations."""

        def __init__(self, config_path=None):
            """Initialize the loader, and attempt to find the config.

            :param config_path: Full or relative path to the paste config.
            :returns: None

            """
            self.config_path = None

            config_path = config_path or CONF.wsgi.api_paste_config
            if not os.path.isabs(config_path):
                self.config_path = CONF.find_file(config_path)
            elif os.path.exists(config_path):
                self.config_path = config_path

            if not self.config_path:
                raise exception.ConfigNotFound(path=config_path)

        def load_app(self, name):
            """Return the paste URLMap wrapped WSGI application.

            :param name: Name of the application to load.
            :returns: Paste URLMap object wrapping the requested application.
            :raises: `nova.exception.PasteAppNotFound`

            """
            try:
                LOG.debug("Loading app %(name)s from %(path)s",
                          {'name': name, 'path': self.config_path})
                return deploy.loadapp("config:%s" % self.config_path, name=name)
            except LookupError:
                LOG.exception(_LE("Couldn't lookup app: %s"), name)
                raise exception.PasteAppNotFound(name=name, path=self.config_path)

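The docstring of load_app mentions a "URLMap wrapped" application, and the debug output earlier shows apps keyed by URL prefixes such as '/v2'. The idea of such a map can be sketched in pure Python. `TinyURLMap` is a deliberately simplified, hypothetical stand-in: the real paste URLMap does more, for example host matching and adjusting SCRIPT_NAME and PATH_INFO.

```python
def make_app(label):
    """Build a trivial WSGI app that answers with its own label."""
    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [label.encode('utf-8')]
    return app


class TinyURLMap(object):
    """Dispatch to the app registered for the longest matching prefix."""

    def __init__(self, mapping):
        # Longest prefixes first, so that '/v2' wins over ''.
        self.routes = sorted(mapping.items(),
                             key=lambda item: len(item[0]), reverse=True)

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        for prefix, app in self.routes:
            if path == prefix or path.startswith(prefix + '/'):
                return app(environ, start_response)
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'not found']


urlmap = TinyURLMap({'': make_app('versions'), '/v2': make_app('v2')})


def call(path):
    """Invoke the map once and return (status, body) for inspection."""
    seen = {}
    body = urlmap({'PATH_INFO': path},
                  lambda status, headers: seen.update(status=status))
    return seen['status'], b''.join(body)
```

Calling `call('/v2/servers')` reaches the app registered under '/v2', while `call('/')` falls through to the app registered under the empty prefix. The map is itself a WSGI callable, which is why the service can treat it as a single app.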
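This part is important. In __init__ we can see that the file paste reads is defined by CONF.wsgi.api_paste_config, whose value is api-paste.ini. In other words, load_app loads the requested app from /etc/nova/api-paste.ini. Parsing of that file is not covered further here. Moving on: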

    launcher.launch_service(server, workers=server.workers or 1)

    def launch_service(self, service, workers=1):
        """Launch a service with a given number of workers.

        :param service: a service to launch, must be an instance of
               :class:`oslo_service.service.ServiceBase`
        :param workers: a number of processes in which a service
               will be running
        """
        _check_service_base(service)
        wrap = ServiceWrapper(service, workers)

        LOG.info(_LI('Starting %d workers'), wrap.workers)
        while self.running and len(wrap.children) < wrap.workers:
            self._start_child(wrap)

_check_service_base checks that the service passed in is a ServiceBase instance. wrap is a plain data holder that mainly initializes self.service, self.workers, self.children and self.forktimes. The method to focus on is self._start_child():

    def _start_child(self, wrap):
        if len(wrap.forktimes) > wrap.workers:
            # Limit ourselves to one process a second (over the period of
            # number of workers * 1 second). This will allow workers to
            # start up quickly but ensure we don't fork off children that
            # die instantly too quickly.
            if time.time() - wrap.forktimes[0] < wrap.workers:
                LOG.info(_LI('Forking too fast, sleeping'))
                time.sleep(1)

            wrap.forktimes.pop(0)

        wrap.forktimes.append(time.time())

        pid = os.fork()
        if pid == 0:
            self.launcher = self._child_process(wrap.service)
            while True:
                self._child_process_handle_signal()
                status, signo = self._child_wait_for_exit_or_signal(
                    self.launcher)
                if not _is_sighup_and_daemon(signo):
                    self.launcher.wait()
                    break
                self.launcher.restart()

            os._exit(status)

        LOG.debug('Started child %d', pid)

        wrap.children.add(pid)
        self.children[pid] = wrap

        return pid

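The main job here is the fork. If fork returns 0, the current context is the child process; otherwise we are in the parent.

Start with the parent, whose work is simple: it starts the child, records some bookkeeping (updating wrap), and returns the child's PID.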
Now the child. It first starts the service through launcher = self._child_process(wrap.service) and then waits for signals in a while loop. On a terminating signal it finishes; on SIGHUP while running as a daemon it restarts the service instead. So let's see what launcher = self._child_process(wrap.service) does. The service here is the WSGIService object from above, with the WSGI app inside it:

    def _child_process(self, service):
        self._child_process_handle_signal()

        # Reopen the eventlet hub to make sure we don't share an epoll
        # fd with parent and/or siblings, which would be bad
        eventlet.hubs.use_hub()

        # Close write to ensure only parent has it open
        os.close(self.writepipe)
        # Create greenthread to watch for parent to close pipe
        eventlet.spawn_n(self._pipe_watcher)

        # Reseed random number generator
        random.seed()

        launcher = Launcher(self.conf, restart_method=self.restart_method)
        launcher.launch_service(service)
        return launcher

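It first sets up signal handling and then creates a Launcher object. This object is tied to eventlet and can be thought of as starting a (green) thread. It eventually calls the start method of WSGIService; the entry point is self.run_service, which is scheduled by self.tg.add_thread(self.run_service, service, self.done) in the service's add method: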

    def start(self):
        """Start serving this service using loaded configuration.

        Also, retrieve updated port number in case '0' was passed in, which
        indicates a random port should be used.

        :returns: None

        """
        ctxt = context.get_admin_context()
        service_ref = objects.Service.get_by_host_and_binary(ctxt, self.host,
                                                             self.binary)
        if service_ref:
            _update_service_ref(service_ref)
        else:
            try:
                service_ref = _create_service_ref(self, ctxt)
            except (exception.ServiceTopicExists,
                    exception.ServiceBinaryExists):
                # NOTE(danms): If we race to create a record wth a sibling,
                # don't fail here.
                service_ref = objects.Service.get_by_host_and_binary(
                    ctxt, self.host, self.binary)

        if self.manager:
            self.manager.init_host()
            self.manager.pre_start_hook()
            if self.backdoor_port is not None:
                self.manager.backdoor_port = self.backdoor_port
        self.server.start()
        if self.manager:
            self.manager.post_start_hook()

self.server.start() is what starts the WSGI service.
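To make "starting a WSGI service" concrete, here is a stdlib-only sketch that binds to port 0, reads back the port the OS actually assigned (as the docstring of start() describes for the '0' case), and serves exactly one request. It uses `wsgiref`, a simple single-threaded server, purely for illustration; it is not nova.wsgi.Server, and the names `hello_app`, `QuietHandler` and `serve_one_request` are made up.

```python
import http.client
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


def hello_app(environ, start_response):
    """Minimal WSGI application."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello from a WSGI app']


class QuietHandler(WSGIRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo output clean


def serve_one_request():
    # Port 0 asks the OS for a free port; read the real one back afterwards.
    server = make_server('127.0.0.1', 0, hello_app,
                         handler_class=QuietHandler)
    port = server.server_port
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
    conn.request('GET', '/')
    body = conn.getresponse().read()
    conn.close()
    worker.join()
    server.server_close()
    return port, body
```

`serve_one_request()` returns the assigned port together with the response body. The socket is bound as soon as the server object is created, while requests are only handled once the serving call runs, which is the same split between construction and start seen in WSGIService.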
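Nova's server differs significantly from an HTTP server built on the ordinary multithreaded model. Such a server starts with a single thread listening on a port and creates a separate thread to handle each incoming request, so the number of threads grows with the number of requests. Here the code is "greened" with eventlet, which is essentially coroutines. The coroutines inside one process can only run on one CPU core, and nothing new is spawned at the operating-system level per request, so one worker is created per CPU core through the os.fork() calls shown above. An incoming request is handed to one of the workers and handled there by a coroutine. However many requests arrive, the number of workers stays fixed. Coroutines are quite efficient for HTTP serving.

Listening and serving

    launcher.wait()

Here the launcher watches whether any of its child processes (workers) has died, and simply starts a replacement if one has. If a stop signal is received instead, it kills all child processes, and the launcher, as the parent process, leaves its loop and exits.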
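The "watch the children and respawn them" loop can be sketched with `os.fork` and `os.wait`. This is a simplified illustration for POSIX systems, not the oslo.service implementation: the names are hypothetical, and the `max_restarts` cap exists only so that the demo terminates, whereas a real launcher keeps respawning until it is told to stop.

```python
import os


def start_child(work):
    """Fork once; the child runs `work` and exits, the parent gets the pid."""
    pid = os.fork()
    if pid == 0:
        status = 0
        try:
            work()
        except Exception:
            status = 1
        finally:
            os._exit(status)  # never return into the parent's code path
    return pid


def supervise(work, workers, max_restarts):
    """Keep `workers` children alive, respawning at most `max_restarts`."""
    children = {start_child(work) for _ in range(workers)}
    restarts = 0
    while children:
        pid, _status = os.wait()  # blocks until some child exits
        if pid not in children:
            continue  # not one of ours; ignore it
        children.discard(pid)
        if restarts < max_restarts:
            children.add(start_child(work))
            restarts += 1
    return restarts
```

With a `work` function that returns immediately, `supervise(work, workers=2, max_restarts=3)` forks two children, replaces exited ones three times, and returns once every child has been reaped.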
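Summary

Starting nova-api comes down to reading the configuration files and creating the two global objects TRANSPORT and NOTIFIER used for messaging. It also starts n WSGI servers, one for each API listed in the configuration file. In addition, with n now denoting the number of CPU cores on the system, each WSGI server gets n worker processes, and each worker handles requests with eventlet coroutines.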