A Brief Look at @periodic_task and nova's Resource Refresh Mechanism

虾悠悠

Category: OpenStack. Tags: periodic_task, nova, host resource refresh, Python decorators

Article link: https://blog.csdn.net/crazystone86/article/details/14221601

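Last night I sat for two hours and still could not get a post written. There was too much to cover and many of the details are very fine-grained; I wanted broad coverage, yet the description had to stay fairly simple. That tells me I need to improve my skills quickly. Enough preamble, let's get to the point.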

I am 虾悠悠. QQ: 617600535, email: leezhoucloud@gmail.com. Feel free to get in touch.

(I) About periodic_task

1. As everyone knows, a periodic_task is a task that runs periodically. So when exactly does it get started?

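Answer: anyone familiar with the nova service startup process covered earlier will know that periodic_task starts together with a service (for example, the compute service). The source is in /nova/service.py: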

        # Inside start(): the code that launches periodic_task
        if self.periodic_enable:
            if self.periodic_fuzzy_delay:
                initial_delay = random.randint(0, self.periodic_fuzzy_delay)
            else:
                initial_delay = None

            # The call below is what actually starts periodic_task
            self.tg.add_dynamic_timer(self.periodic_tasks,
                                     initial_delay=initial_delay,
                                     periodic_interval_max=
                                        self.periodic_interval_max)

In the self.tg.add_dynamic_timer() call above, the self.periodic_tasks argument is the method shown below, from the same file. It calls manager.periodic_tasks(), which runs every function in the manager that is decorated with @periodic_task.

    def periodic_tasks(self, raise_on_error=False):
        """Tasks to be run at a periodic interval."""
        ctxt = context.get_admin_context()
        return self.manager.periodic_tasks(ctxt, raise_on_error=raise_on_error)
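
To make the "decorated functions get collected and run" idea concrete, here is a minimal sketch. It is not nova's implementation: the `_periodic_task` marker attribute, the `SketchManager` class, and the `heartbeat` and `helper` methods are all hypothetical names I made up, and the decorator here is a plain function rather than the `periodic_task.periodic_task` used in nova.

```python
def periodic_task(func):
    """Mark a method so the manager knows it should be run periodically."""
    func._periodic_task = True  # hypothetical marker attribute, not nova's
    return func


class SketchManager:
    """Hypothetical stand-in for a nova manager."""

    def __init__(self):
        self.calls = []

    @periodic_task
    def update_available_resource(self, context):
        self.calls.append("update_available_resource")

    @periodic_task
    def heartbeat(self, context):  # hypothetical second periodic task
        self.calls.append("heartbeat")

    def helper(self, context):  # undecorated, so it is never picked up
        self.calls.append("helper")

    def periodic_tasks(self, context, raise_on_error=False):
        """Run every method that carries the periodic_task marker."""
        for name in dir(self):
            method = getattr(self, name)
            if not getattr(method, "_periodic_task", False):
                continue
            try:
                method(context)
            except Exception:
                # Swallow failures unless the caller asked for them.
                if raise_on_error:
                    raise


manager = SketchManager()
manager.periodic_tasks("admin-context")
print(manager.calls)  # ['heartbeat', 'update_available_resource']
```

The point is only that the decorator tags functions and a single `periodic_tasks()` entry point walks over the tagged ones; the real code also has to deal with per-task intervals, which this sketch ignores.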

After these steps the periodic work is up and running.
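
The `initial_delay` and `periodic_interval_max` arguments are easier to picture with a toy loop. The sketch below rests on one assumption that the snippet above does not show: that the callback returns the number of seconds it would like to idle before the next run, and that the timer caps this value at `periodic_interval_max`. The names `pick_initial_delay`, `run_dynamic_timer`, `max_runs`, and the injectable `sleep` parameter are mine, added so the loop can be tried without really waiting.

```python
import random
import time


def pick_initial_delay(periodic_fuzzy_delay):
    """Mirror the start() snippet: a random delay, or None when fuzzing is off."""
    if periodic_fuzzy_delay:
        return random.randint(0, periodic_fuzzy_delay)
    return None


def run_dynamic_timer(callback, initial_delay=None, periodic_interval_max=None,
                      max_runs=3, sleep=time.sleep):
    """Call callback max_runs times, idling between runs.

    Assumption: callback returns the seconds it wants to idle (or None).
    """
    if initial_delay:
        sleep(initial_delay)
    for _ in range(max_runs):
        idle = callback() or 0
        if periodic_interval_max is not None:
            # Never idle longer than the configured maximum.
            idle = min(idle, periodic_interval_max)
        sleep(idle)
```

The random initial delay spreads out the first run, so that many services started at the same moment do not all fire their periodic tasks in lockstep.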

Below is a screenshot of the resource refresh code in the compute manager.

(II) Host resource refresh mechanism

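Section (I) showed how periodic_task gets started. Picking up from the screenshot above, this section gives a brief walk-through of how a compute node refreshes its host resources.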

1. First, the code from the screenshot above:

    # In /nova/compute/manager.py
    # Periodic task that refreshes the host's resources
    @periodic_task.periodic_task
    def update_available_resource(self, context):
        """See driver.get_available_resource()

        Periodic process that keeps that the compute host's understanding of
        resource availability and usage in sync with the underlying hypervisor.

        :param context: security context
        """
        new_resource_tracker_dict = {}
        nodenames = set(self.driver.get_available_nodes())
        for nodename in nodenames:
            rt = self._get_resource_tracker(nodename)
            rt.update_available_resource(context)
            new_resource_tracker_dict[nodename] = rt

        # Delete orphan compute node not reported by driver but still in db
        compute_nodes_in_db = self._get_compute_nodes_in_db(context)

        for cn in compute_nodes_in_db:
            if cn.get('hypervisor_hostname') not in nodenames:
                LOG.audit(_("Deleting orphan compute node %s") % cn['id'])
                self.conductor_api.compute_node_delete(context, cn)

        self._resource_tracker_dict = new_resource_tracker_dict
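
The orphan cleanup at the end of this function boils down to a membership test against the node names the driver reported. Here is a small sketch of just that check; `find_orphan_compute_nodes` and the sample rows are hypothetical, and plain dicts stand in for the database records.

```python
def find_orphan_compute_nodes(driver_nodenames, compute_nodes_in_db):
    """Return DB records whose hypervisor_hostname the driver no longer reports."""
    reported = set(driver_nodenames)
    return [cn for cn in compute_nodes_in_db
            if cn.get("hypervisor_hostname") not in reported]


# Hypothetical sample data: three rows in the DB, two nodes still reported.
db_rows = [
    {"id": 1, "hypervisor_hostname": "node-a"},
    {"id": 2, "hypervisor_hostname": "node-b"},
    {"id": 3, "hypervisor_hostname": "node-gone"},
]
orphans = find_orphan_compute_nodes(["node-a", "node-b"], db_rows)
print([cn["id"] for cn in orphans])  # [3]
```

Using a set for the reported names keeps each lookup cheap, which is presumably why the original wraps get_available_nodes() in set() as well.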

A brief analysis of the code above:

(1) nodenames = set(self.driver.get_available_nodes()) first collects the compute nodes that the driver reports as available.

(2) rt = self._get_resource_tracker(nodename) gives each node a handle, a resource tracker, through which the underlying resources can be managed.

rt is an instance of the ResourceTracker class in /nova/compute/resource_tracker.py.

(3) rt.update_available_resource(context) then calls the update_available_resource() function in /nova/compute/resource_tracker.py.

The code is as follows:

    @utils.synchronized(COMPUTE_RESOURCE_SEMAPHORE)
    def update_available_resource(self, context):
        """Override in-memory calculations of compute node resource usage based
        on data audited from the hypervisor layer.

        Add in resource claims in progress to account for operations that have
        declared a need for resources, but not necessarily retrieved them from
        the hypervisor layer yet.
        """
        LOG.audit(_("Auditing locally available compute resources"))
        resources = self.driver.get_available_resource(self.nodename)
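
The decorator on this function is worth a short note: it looks like a named lock, so that only one caller at a time runs code guarded by COMPUTE_RESOURCE_SEMAPHORE. I have not traced nova's utils.synchronized here, so treat the following as a sketch of the general technique using only the standard library. The `synchronized` helper, the `_locks` registry, the `SketchTracker` class, and the string value given to the constant are all my own placeholders.

```python
import functools
import threading

_locks = {}                      # lock name -> threading.Lock
_locks_guard = threading.Lock()  # protects the registry itself


def synchronized(name):
    """Serialize all callers of functions decorated with the same lock name."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with _locks_guard:
                lock = _locks.setdefault(name, threading.Lock())
            with lock:
                return func(*args, **kwargs)
        return wrapper
    return decorator


COMPUTE_RESOURCE_SEMAPHORE = "compute_resources"  # placeholder value


class SketchTracker:
    """Hypothetical stand-in for ResourceTracker."""

    def __init__(self):
        self.audits = 0

    @synchronized(COMPUTE_RESOURCE_SEMAPHORE)
    def update_available_resource(self, context):
        # Read-modify-write that would be racy without the lock.
        current = self.audits
        self.audits = current + 1
```

The reason this matters for resource tracking is that the periodic audit and in-progress resource claims touch the same in-memory numbers, so they must not interleave.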

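(4) The key call for refreshing resources is resources = self.driver.get_available_resource(self.nodename). The default nova driver is KVM virtualization, which is built on libvirt, so we turn to /nova/virt/libvirt/driver.py (with the default KVM setup, all communication with virtual machines goes through this driver adapter):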

    def get_available_resource(self, nodename):
        """Retrieve resource information.

        This method is called when nova-compute launches, and
        as part of a periodic task that records the results in the DB.

        :param nodename: will be put in PCI device
        :returns: dictionary containing resource info
        """

        # Temporary: convert supported_instances into a string, while keeping
        # the RPC version as JSON. Can be changed when RPC broadcast is removed
        stats = self.host_state.get_host_stats(refresh=True)
        stats['supported_instances'] = jsonutils.dumps(
                stats['supported_instances'])
        return stats

(5) As we can see, the driver in turn calls stats = self.host_state.get_host_stats(refresh=True) to obtain the host's status information.

(6) So let's look at the relevant HostState functions in the same file.

a) The get_host_stats method: as seen in (5), refresh=True, so, as the docstring says, the function first runs an update.

    def get_host_stats(self, refresh=False):
        """Return the current state of the host.

        If 'refresh' is True, run update the stats first.
        """
        if refresh or not self._stats:
            self.update_status()
        return self._stats
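
The refresh flag makes HostState behave like a refresh-on-demand cache: the stats are re-collected when the caller asks for it or when nothing has been collected yet, and otherwise the stored copy is returned. Below is a sketch of that pattern. The `SketchHostState` class, the `probe` callable that stands in for the libvirt queries, the `update_count` counter, and the sample fields are all hypothetical.

```python
class SketchHostState:
    """Hypothetical stand-in for HostState with an injectable data source."""

    def __init__(self, probe):
        self._probe = probe      # callable standing in for the libvirt queries
        self._stats = {}
        self.update_count = 0    # only here to make the caching visible

    def update_status(self):
        """Re-collect the host information and store it."""
        self._stats = dict(self._probe())
        self.update_count += 1
        return self._stats

    def get_host_stats(self, refresh=False):
        """Return cached stats, refreshing first if asked or if empty."""
        if refresh or not self._stats:
            self.update_status()
        return self._stats


state = SketchHostState(lambda: {"vcpus": 8, "memory_mb": 16384})
state.get_host_stats()              # cache empty, so this collects once
state.get_host_stats()              # served from the cache
state.get_host_stats(refresh=True)  # forced re-collection
print(state.update_count)  # 2
```

Since the periodic task always passes refresh=True, every run goes back to the hypervisor; the cached path is only taken by callers that leave refresh at its default.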

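b) What does the update_status function actually do?

Its docstring, """Retrieve status info from libvirt.""", states its real purpose: it fetches all of the host's resource information (CPU, memory, and so on) from libvirt, and this is the host information available to us. The screenshot shows exactly what information compute collects.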

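(7) The previous step assigns the collected information to a dict, data = {}, and then sets _stats = data, so _stats holds the host's information. Back in /nova/compute/resource_tracker.py, the line resources = self.driver.get_available_resource(self.nodename) in update_available_resource is what refreshes the resources from the host. At the end of that function there are two key calls: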

(A) self._report_final_resource_view(resources)

(B) self._sync_compute_node(context, resources)

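First, (A): it reports the remaining free RAM, disk, vCPUs, and PCI devices. These are the values the scheduler will consult about the host in the next step, when it makes scheduling decisions.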

    def _report_final_resource_view(self, resources):
        """Report final calculate of free memory, disk, CPUs, and PCI devices,
        including instance calculations and in-progress resource claims. These
        values will be exposed via the compute node table to the scheduler.
        """
        LOG.audit(_("Free ram (MB): %s") % resources['free_ram_mb'])
        LOG.audit(_("Free disk (GB): %s") % resources['free_disk_gb'])

        vcpus = resources['vcpus']
        if vcpus:
            free_vcpus = vcpus - resources['vcpus_used']
            LOG.audit(_("Free VCPUS: %s") % free_vcpus)
        else:
            LOG.audit(_("Free VCPU information unavailable"))

        if 'pci_devices' in resources:
            LOG.audit(_("Free PCI devices: %s") % resources['pci_devices'])

(B) This call, as the name suggests, updates each compute node's resource state in the database:

self._sync_compute_node(context, resources)
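
To tie (A) and (B) together, here is a toy version of the last two steps. It is a sketch under assumptions, not nova's code: `free_resource_view` returns the figures instead of logging them, `sync_compute_node` writes to a plain dict standing in for the compute node table, and I am assuming create-if-missing behavior for the row. Both function names and the sample numbers are hypothetical.

```python
def free_resource_view(resources):
    """Compute the figures that _report_final_resource_view logs."""
    view = {
        "free_ram_mb": resources["free_ram_mb"],
        "free_disk_gb": resources["free_disk_gb"],
    }
    vcpus = resources["vcpus"]
    # Mirror the original: free vCPUs are only known when vcpus is reported.
    view["free_vcpus"] = vcpus - resources["vcpus_used"] if vcpus else None
    return view


def sync_compute_node(table, nodename, resources):
    """Create or update the row for nodename in a dict standing in for the DB."""
    row = table.setdefault(nodename, {})
    row.update(resources)
    return row


resources = {"free_ram_mb": 12288, "free_disk_gb": 80,
             "vcpus": 8, "vcpus_used": 2}
compute_nodes = {}
sync_compute_node(compute_nodes, "node-a", resources)
print(free_resource_view(resources)["free_vcpus"])  # 6
```

Running the two steps again on the next periodic tick simply overwrites the row, which is how the scheduler ends up seeing reasonably fresh numbers.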

That finally covers both questions. I have only described the overall flow, so it is a bit rough.

I had planned to go into more detail, but found that the more I explained, the longer and messier it got.

So I hope readers will dig into the details on their own.

If anything above is wrong, or if an expert has advice for me, please contact me right away. I am always online.
