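I sat for two hours last night and still failed to finish a single blog post. There is too much to write about and much of it is very detailed: I keep wanting to cover everything, yet the descriptions end up fairly shallow. That tells me I need to level up quickly, so here goes. Enough chatter, on to the topic.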
Author: 虾悠悠, QQ: 617600535, email: leezhoucloud@gmail.com. Feel free to get in touch.
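(1) A few things about periodic_task
1. Everyone knows that periodic_task is a periodic task, but when exactly does it get started?
Answer: if you are familiar with the nova service startup process covered earlier, you will know that periodic_task starts together with a service (for example compute-service). The source is in /nova/service.py: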
# Inside start(): the code that launches periodic_task
if self.periodic_enable:
    if self.periodic_fuzzy_delay:
        initial_delay = random.randint(0, self.periodic_fuzzy_delay)
    else:
        initial_delay = None
    # The call below is what actually starts periodic_task
    self.tg.add_dynamic_timer(self.periodic_tasks,
                              initial_delay=initial_delay,
                              periodic_interval_max=
                                  self.periodic_interval_max)
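To make the timer behaviour concrete, here is a minimal sketch of a "dynamic" timer loop. It assumes the callback returns the number of seconds to idle before the next run and that periodic_interval_max caps that value. The names run_dynamic_timer, pick_initial_delay and max_runs are hypothetical and are not part of nova; the sleep function is injected so the loop can be tried without actually waiting.

```python
import random


def pick_initial_delay(periodic_fuzzy_delay, rng=random):
    """Mirror the snippet above: random jitter, or None when disabled."""
    if periodic_fuzzy_delay:
        return rng.randint(0, periodic_fuzzy_delay)
    return None


def run_dynamic_timer(callback, sleep, initial_delay=None,
                      periodic_interval_max=None, max_runs=3):
    """Call `callback` repeatedly; its return value is the next wait time.

    `sleep` is injected for testing; `max_runs` only keeps the sketch finite
    (a real service would loop until it is stopped).
    """
    if initial_delay:
        sleep(initial_delay)
    for _ in range(max_runs):
        idle = callback()
        if periodic_interval_max is not None:
            # Never wait longer than the configured maximum interval.
            idle = min(idle, periodic_interval_max)
        sleep(idle)
```

The random initial delay is presumably there so that many services started at the same moment do not all fire their periodic tasks in lockstep.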
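Notice that self.tg.add_dynamic_timer() above is passed self.periodic_tasks. That is the method defined in the same file, shown next. It calls manager.periodic_tasks(), which runs every function in the manager that is decorated with @periodic_task.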
def periodic_tasks(self, raise_on_error=False):
    """Tasks to be run at a periodic interval."""
    ctxt = context.get_admin_context()
    return self.manager.periodic_tasks(ctxt, raise_on_error=raise_on_error)
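The idea of "run everything decorated with @periodic_task" can be sketched as follows. This is only an illustration under simplifying assumptions: the decorator here just sets a marker attribute and the manager scans its own methods, whereas the real nova code registers tasks through its own machinery. The _periodic_task marker and the ComputeManager body are hypothetical.

```python
def periodic_task(func):
    """Mark a method as periodic (a stand-in for the real decorator)."""
    func._periodic_task = True
    return func


class Manager(object):
    def periodic_tasks(self, context, raise_on_error=False):
        """Run every method on this object that carries the marker."""
        ran = []
        for name in sorted(dir(self)):
            attr = getattr(self, name)
            if callable(attr) and getattr(attr, "_periodic_task", False):
                try:
                    attr(context)
                    ran.append(name)
                except Exception:
                    # Swallow failures unless the caller asked to see them.
                    if raise_on_error:
                        raise
        return ran


class ComputeManager(Manager):
    def __init__(self):
        self.calls = []

    @periodic_task
    def update_available_resource(self, context):
        self.calls.append(("update_available_resource", context))

    def not_periodic(self, context):
        self.calls.append(("not_periodic", context))
```

Calling ComputeManager().periodic_tasks(ctxt) runs only the decorated method, which is the behaviour the service relies on.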
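Below is a screenshot of the resource refresh code in the compute manager.
(2) The host resource refresh mechanism
From (1) we now know clearly how periodic_task gets started. Next, continuing from the screenshot above, let's briefly walk through how a compute node refreshes its host resources.
1. First, here is the code from that screenshot: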
# In /nova/compute/manager.py
# The periodic host resource refresh function
@periodic_task.periodic_task
def update_available_resource(self, context):
    """See driver.get_available_resource()
    Periodic process that keeps that the compute host's understanding of
    resource availability and usage in sync with the underlying hypervisor.
    :param context: security context
    """
    new_resource_tracker_dict = {}
    nodenames = set(self.driver.get_available_nodes())
    for nodename in nodenames:
        rt = self._get_resource_tracker(nodename)
        rt.update_available_resource(context)
        new_resource_tracker_dict[nodename] = rt
    # Delete orphan compute node not reported by driver but still in db
    compute_nodes_in_db = self._get_compute_nodes_in_db(context)
    for cn in compute_nodes_in_db:
        if cn.get('hypervisor_hostname') not in nodenames:
            LOG.audit(_("Deleting orphan compute node %s") % cn['id'])
            self.conductor_api.compute_node_delete(context, cn)
    self._resource_tracker_dict = new_resource_tracker_dict
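The orphan cleanup at the end of this function boils down to a membership check against the set of node names the driver reported. A minimal standalone sketch, using plain dicts in place of database rows (find_orphan_compute_nodes is a hypothetical helper, not a nova function):

```python
def find_orphan_compute_nodes(driver_nodenames, compute_nodes_in_db):
    """Return DB records whose hostname the driver no longer reports."""
    reported = set(driver_nodenames)
    return [cn for cn in compute_nodes_in_db
            if cn.get("hypervisor_hostname") not in reported]


# Example data: the driver reports only node-a, the DB still has node-b.
db_rows = [
    {"id": 1, "hypervisor_hostname": "node-a"},
    {"id": 2, "hypervisor_hostname": "node-b"},
]
orphans = find_orphan_compute_nodes(["node-a"], db_rows)
```

Here orphans contains only the record with id 2, which is the one the real code would delete.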
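A brief analysis of the code above:
(1) nodenames = set(self.driver.get_available_nodes()) first fetches the compute nodes that are currently available.
(2) rt = self._get_resource_tracker(nodename) gives each node something like a resource_tracker handle that can be used to work with the underlying resources.
rt is actually an instance of the ResourceTracker class in /nova/compute/resource_tracker.py.
(3) rt.update_available_resource(context) then calls the update_available_resource() function in /nova/compute/resource_tracker.py.
The code is as follows: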
@utils.synchronized(COMPUTE_RESOURCE_SEMAPHORE)
def update_available_resource(self, context):
    """Override in-memory calculations of compute node resource usage based
    on data audited from the hypervisor layer.
    Add in resource claims in progress to account for operations that have
    declared a need for resources, but not necessarily retrieved them from
    the hypervisor layer yet.
    """
    LOG.audit(_("Auditing locally available compute resources"))
    resources = self.driver.get_available_resource(self.nodename)
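(4) The key resource update call is resources = self.driver.get_available_resource(self.nodename). The default driver in nova is KVM virtualization, which in turn is built on Libvirt, so we look in /nova/virt/libvirt/driver.py (with the default KVM setup, all communication with virtual machines goes through this driver adapter):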
def get_available_resource(self, nodename):
    """Retrieve resource information.
    This method is called when nova-compute launches, and
    as part of a periodic task that records the results in the DB.
    :param nodename: will be put in PCI device
    :returns: dictionary containing resource info
    """
    # Temporary: convert supported_instances into a string, while keeping
    # the RPC version as JSON. Can be changed when RPC broadcast is removed
    stats = self.host_state.get_host_stats(refresh=True)
    stats['supported_instances'] = jsonutils.dumps(
        stats['supported_instances'])
    return stats
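(5) We can see that the driver in turn calls stats = self.host_state.get_host_stats(refresh=True) to fetch the host's state information.
(6) So let's look at a few related functions of HostState in the same file.
a) The get_host_stats method. As we saw in (5), refresh=True is passed, so the function will, as its docstring says, first "run update":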
def get_host_stats(self, refresh=False):
    """Return the current state of the host.
    If 'refresh' is True, run update the stats first.
    """
    if refresh or not self._stats:
        self.update_status()
    return self._stats
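This is a plain "cache with forced refresh" pattern. A small self-contained sketch shows when update_status actually runs. HostStateSketch, its collect argument and the updates counter are hypothetical and only exist for this illustration; the stats dict is made-up example data, not what libvirt returns.

```python
class HostStateSketch(object):
    def __init__(self, collect):
        self._collect = collect  # hypothetical callable returning a stats dict
        self._stats = {}
        self.updates = 0         # counts how often we really refreshed

    def update_status(self):
        self._stats = self._collect()
        self.updates += 1

    def get_host_stats(self, refresh=False):
        # Refresh when asked to, or when nothing has been cached yet.
        if refresh or not self._stats:
            self.update_status()
        return self._stats


host_state = HostStateSketch(lambda: {"vcpus": 4, "memory_mb": 8192})
host_state.get_host_stats()              # first call: cache empty, refreshes
host_state.get_host_stats()              # second call: served from cache
host_state.get_host_stats(refresh=True)  # forced refresh
```

Since the periodic task always passes refresh=True, each run re-collects the host state instead of reusing the cached copy.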
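b) Now, what does the update_status function actually do?
Its docstring, """Retrieve status info from libvirt.""", states its real purpose: it pulls all of the host's resource information from Libvirt (CPU, memory and so on), which is the host information available to us. Take a look at the screenshot: this is the information compute actually collects.
(7) The previous step stores the collected information in data = {} and then sets _stats = data, so _stats holds all of the host's information. Back in /nova/compute/resource_tracker.py, in the update_available_resource function, the line resources = self.driver.get_available_resource(self.nodename) is what refreshes the resources from the host. At the end of the function there are two key calls:
(A)self._report_final_resource_view(resources)
(B)self._sync_compute_node(context, resources)
Let's look at (A) first. It reports the final free RAM, disk, vCPU and PCI device figures, which the Scheduler will consult in the next step when it evaluates hosts: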
def _report_final_resource_view(self, resources):
    """Report final calculate of free memory, disk, CPUs, and PCI devices,
    including instance calculations and in-progress resource claims. These
    values will be exposed via the compute node table to the scheduler.
    """
    LOG.audit(_("Free ram (MB): %s") % resources['free_ram_mb'])
    LOG.audit(_("Free disk (GB): %s") % resources['free_disk_gb'])
    vcpus = resources['vcpus']
    if vcpus:
        free_vcpus = vcpus - resources['vcpus_used']
        LOG.audit(_("Free VCPUS: %s") % free_vcpus)
    else:
        LOG.audit(_("Free VCPU information unavailable"))
    if 'pci_devices' in resources:
        LOG.audit(_("Free PCI devices: %s") % resources['pci_devices'])
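The only real calculation in this function is the free vCPU count. Here is the same logic as a standalone function that returns the values instead of logging them. final_resource_view is a hypothetical helper and the input numbers are made-up examples.

```python
def final_resource_view(resources):
    """Return the free-resource figures the function above writes to the log."""
    view = {
        "free_ram_mb": resources["free_ram_mb"],
        "free_disk_gb": resources["free_disk_gb"],
    }
    vcpus = resources.get("vcpus")
    # A falsy vcpus value means the information is unavailable.
    view["free_vcpus"] = (vcpus - resources["vcpus_used"]) if vcpus else None
    return view


view = final_resource_view({
    "free_ram_mb": 2048,
    "free_disk_gb": 40,
    "vcpus": 8,
    "vcpus_used": 3,
})
```

With these inputs, view["free_vcpus"] is 5.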
(B) This call clearly updates each compute node's resource state in the database:
self._sync_compute_node(context, resources)
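The post does not show the body of _sync_compute_node, so the following is only a guess at the general shape of such a sync step: create the record the first time, update it afterwards. A plain dict stands in for the compute node table, and sync_compute_node is a hypothetical function, not the nova implementation.

```python
def sync_compute_node(db, nodename, resources):
    """Create or update the record for `nodename` in a dict-based 'table'."""
    if nodename in db:
        db[nodename].update(resources)
        return "updated"
    # Copy so later changes to `resources` do not leak into the table.
    db[nodename] = dict(resources)
    return "created"


table = {}
first = sync_compute_node(table, "node-a", {"free_ram_mb": 2048})
second = sync_compute_node(table, "node-a", {"free_ram_mb": 1024})
```

After the two calls, first is "created", second is "updated", and the table holds the latest free_ram_mb value of 1024.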
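OK, that finally covers the two questions above. I have mostly just described the flow, so it is a bit rough.
I originally wanted to go into more detail, but found that the more I explained, the longer and messier it got.
So I hope you will dig into it a bit more yourselves and work through the details on your own.
If anything above is wrong, or if any experts out there want to give me some pointers, please get in touch right away. I am always online. Haha.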