OpenStack -- Host Resource Refresh Mechanism

1. A periodic task reports resources at regular intervals


    # The execution interval of this periodic task can be set in the
    # configuration file (CONF.update_resources_interval)
    @periodic_task.periodic_task(spacing=CONF.update_resources_interval)
    def update_available_resource(self, context):
        """See driver.get_available_resource()

        Periodic process that keeps that the compute host's understanding of
        resource availability and usage in sync with the underlying hypervisor.

        :param context: security context
        """
        # Query the compute_nodes table for the compute nodes recorded in the database
        compute_nodes_in_db = self._get_compute_nodes_in_db(context,
                                                            use_slave=True)
        # Ask the virt driver for the available nodes
        nodenames = set(self.driver.get_available_nodes())
        # Update the resource information for each node
        for nodename in nodenames:
            self.update_available_resource_for_node(context, nodename)

        rt = self._get_resource_tracker()
        # Delete orphan compute node not reported by driver but still in db
        for cn in compute_nodes_in_db:
            if cn.hypervisor_hostname not in nodenames:
                LOG.info(_LI("Deleting orphan compute node %(id)s "
                             "hypervisor host is %(hh)s, "
                             "nodes are %(nodes)s"),
                             {'id': cn.id, 'hh': cn.hypervisor_hostname,
                              'nodes': nodenames})
                cn.destroy()
                rt.remove_node(cn.hypervisor_hostname)
                # Delete the corresponding resource provider in placement,
                # along with any associated allocations and inventory.
                # TODO(cdent): Move use of reportclient into resource tracker.
                self.scheduler_client.reportclient.delete_resource_provider(
                    context, cn, cascade=True)
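
The audit cadence comes from the update_resources_interval option referenced in the decorator above. A minimal nova.conf sketch for a compute node (the 60-second value is purely illustrative):

    [DEFAULT]
    # Run the resource audit every 60 seconds; 0 runs it at the default
    # periodic task interval, and a negative value disables the task.
    update_resources_interval = 60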

2. Update the resources of a single node

    def update_available_resource_for_node(self, context, nodename):
        # Get (or create) the ResourceTracker object
        rt = self._get_resource_tracker()
        try:
            # Update the resource information for this node
            rt.update_available_resource(context, nodename)
        except exception.ComputeHostNotFound:
            # NOTE(comstud): We can get to this case if a node was
            # marked 'deleted' in the DB and then re-added with a
            # different auto-increment id. The cached resource
            # tracker tried to update a deleted record and failed.
            # Don't add this resource tracker to the new dict, so
            # that this will resolve itself on the next run.
            LOG.info(_LI("Compute node '%s' not found in "
                         "update_available_resource."), nodename)
            # TODO(jaypipes): Yes, this is inefficient to throw away all of the
            # compute nodes to force a rebuild, but this is only temporary
            # until Ironic baremetal node resource providers are tracked
            # properly in the report client and this is a tiny edge case
            # anyway.
            self._resource_tracker = None
            return
        except Exception:
            LOG.exception(_LE("Error updating resources for node "
                          "%(node)s."), {'node': nodename})

3. Update the available resources

    def update_available_resource(self, context, nodename):
        """Override in-memory calculations of compute node resource usage based
        on data audited from the hypervisor layer.

        Add in resource claims in progress to account for operations that have
        declared a need for resources, but not necessarily retrieved them from
        the hypervisor layer yet.

        :param nodename: Temporary parameter representing the Ironic resource
                         node. This parameter will be removed once Ironic
                         baremetal resource nodes are handled like any other
                         resource in the system.
        """
        LOG.debug("Auditing locally available compute resources for "
                  "%(host)s (node: %(node)s)",
                  {'node': nodename,
                   'host': self.host})
        # Call the virt driver to get the available resources on this node
        resources = self.driver.get_available_resource(nodename)
        # NOTE(jaypipes): The resources['hypervisor_hostname'] field now
        # contains a non-None value, even for non-Ironic nova-compute hosts. It
        # is this value that will be populated in the compute_nodes table.
        resources['host_ip'] = CONF.my_ip

        # We want the 'cpu_info' to be None from the POV of the
        # virt driver, but the DB requires it to be non-null so
        # just force it to empty string
        if "cpu_info" not in resources or resources["cpu_info"] is None:
            resources["cpu_info"] = ''

        self._verify_resources(resources)

        self._report_hypervisor_resource_view(resources)
        # Update the hypervisor resource records in the database
        self._update_available_resource(context, resources)

4. The driver layer collects resource information

    def get_available_resource(self, nodename):
        """Retrieve resource information.

        This method is called when nova-compute launches, and
        as part of a periodic task that records the results in the DB.

        :param nodename: unused in this driver
        :returns: dictionary containing resource info
        """

        disk_info_dict = self._get_local_gb_info()
        data = {}

        # NOTE(dprince): calling capabilities before getVersion works around
        # an initialization issue with some versions of Libvirt (1.0.5.5).
        # See: https://bugzilla.redhat.com/show_bug.cgi?id=1000116
        # See: https://bugs.launchpad.net/nova/+bug/1215593
        data["supported_instances"] = self._get_instance_capabilities()

        data["vcpus"] = self._get_vcpu_total()
        data["memory_mb"] = self._host.get_memory_mb_total()
        data["local_gb"] = disk_info_dict['total']
        data["vcpus_used"] = self._get_vcpu_used()
        data["memory_mb_used"] = self._host.get_memory_mb_used()
        data["local_gb_used"] = disk_info_dict['used']
        data["hypervisor_type"] = self._host.get_driver_type()
        data["hypervisor_version"] = self._host.get_version()
        data["hypervisor_hostname"] = self._host.get_hostname()
        # TODO(berrange): why do we bother converting the
        # libvirt capabilities XML into a special JSON format ?
        # The data format is different across all the drivers
        # so we could just return the raw capabilities XML
        # which 'compare_cpu' could use directly
        #
        # That said, arch_filter.py now seems to rely on
        # the libvirt drivers format which suggests this
        # data format needs to be standardized across drivers
        data["cpu_info"] = jsonutils.dumps(self._get_cpu_info())

        disk_free_gb = disk_info_dict['free']
        disk_over_committed = self._get_disk_over_committed_size_total()
        available_least = disk_free_gb * units.Gi - disk_over_committed
        data['disk_available_least'] = available_least / units.Gi

        data['pci_passthrough_devices'] = \
            self._get_pci_passthrough_devices()

        numa_topology = self._get_host_numa_topology()
        if numa_topology:
            data['numa_topology'] = numa_topology._to_json()
        else:
            data['numa_topology'] = None

        return data
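
For reference, the returned dictionary looks roughly like the following on a small KVM host; every value below is invented purely for illustration:

    {
        'vcpus': 8, 'vcpus_used': 2,
        'memory_mb': 16384, 'memory_mb_used': 4096,
        'local_gb': 200, 'local_gb_used': 50,
        'disk_available_least': 140,
        'hypervisor_type': 'QEMU',
        'hypervisor_version': 2010000,
        'hypervisor_hostname': 'compute-1',
        'cpu_info': '{"arch": "x86_64", ...}',        # JSON string
        'supported_instances': [('x86_64', 'kvm', 'hvm')],
        'numa_topology': None,                        # JSON string or None
        'pci_passthrough_devices': '[]',
    }

Note that host_ip is not set by the driver; as shown in step 3, the resource tracker adds it afterwards from CONF.my_ip.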

5. Verify the collected resources


    # Compare the collected resource keys against the keys that must be reported
    def _verify_resources(self, resources):
        resource_keys = ["vcpus", "memory_mb", "local_gb", "cpu_info",
                         "vcpus_used", "memory_mb_used", "local_gb_used",
                         "numa_topology"]

        missing_keys = [k for k in resource_keys if k not in resources]
        if missing_keys:
            reason = _("Missing keys: %s") % missing_keys
            raise exception.InvalidInput(reason=reason)
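
If the driver omits any of these keys, the tracker aborts the update. A hypothetical call missing 'numa_topology' (a sketch, not output captured from a real deployment):

    resources = {'vcpus': 8, 'memory_mb': 16384, 'local_gb': 200,
                 'cpu_info': '', 'vcpus_used': 2, 'memory_mb_used': 4096,
                 'local_gb_used': 50}
    rt._verify_resources(resources)
    # raises exception.InvalidInput: Missing keys: ['numa_topology']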

6. Calculate usage and report the resources

    def _update_available_resource(self, context, resources):

        # initialize the compute node object, creating it
        # if it does not already exist.
        # Fetch the node's resource record from the database
        self._init_compute_node(context, resources)

        nodename = resources['hypervisor_hostname']

        # if we could not init the compute node the tracker will be
        # disabled and we should quit now
        if self.disabled(nodename):
            return

        # Grab all instances assigned to this node:
        instances = objects.InstanceList.get_by_host_and_node(
            context, self.host, nodename,
            expected_attrs=['system_metadata',
                            'numa_topology',
                            'flavor', 'migration_context'])

        # Now calculate usage based on instance utilization:
        self._update_usage_from_instances(context, instances, nodename)

        # Grab all in-progress migrations:
        migrations = objects.MigrationList.get_in_progress_by_host_and_node(
                context, self.host, nodename)

        self._pair_instances_to_migrations(migrations, instances)
        self._update_usage_from_migrations(context, migrations, nodename)

        # Detect and account for orphaned instances that may exist on the
        # hypervisor, but are not in the DB:
        orphans = self._find_orphaned_instances()
        self._update_usage_from_orphans(orphans, nodename)

        cn = self.compute_nodes[nodename]

        # NOTE(yjiang5): Because pci device tracker status is not cleared in
        # this periodic task, and also because the resource tracker is not
        # notified when instances are deleted, we need remove all usages
        # from deleted instances.
        self.pci_tracker.clean_usage(instances, migrations, orphans)
        dev_pools_obj = self.pci_tracker.stats.to_device_pools_obj()
        cn.pci_device_pools = dev_pools_obj

        self._report_final_resource_view(nodename)

        metrics = self._get_host_metrics(context, nodename)
        # TODO(pmurray): metrics should not be a json string in ComputeNode,
        # but it is. This should be changed in ComputeNode
        cn.metrics = jsonutils.dumps(metrics)

        # update the compute_node
        # Report the hypervisor information
        self._update(context, cn)
        LOG.debug('Compute_service record updated for %(host)s:%(node)s',
                  {'host': self.host, 'node': nodename})
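
Conceptually, _update_usage_from_instances charges each instance's flavor footprint against the totals reported by the driver. The helper below (a made-up name, not Nova code) only illustrates that idea; the real accounting also covers migrations, orphans and PCI devices as shown above:

    def _update_usage_from_instances_sketch(cn, instances):
        # Reset usage, then charge every instance's flavor against the
        # compute node record (purely illustrative accounting).
        cn.vcpus_used = 0
        cn.memory_mb_used = 0
        cn.local_gb_used = 0
        for inst in instances:
            cn.vcpus_used += inst.flavor.vcpus
            cn.memory_mb_used += inst.flavor.memory_mb
            cn.local_gb_used += inst.flavor.root_gb + inst.flavor.ephemeral_gb
        cn.free_ram_mb = cn.memory_mb - cn.memory_mb_used
        cn.free_disk_gb = cn.local_gb - cn.local_gb_used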

7. Report the resources to the scheduler

    def _update(self, context, compute_node):
        """Update partial stats locally and populate them to Scheduler."""
        if not self._resource_change(compute_node):
            return
        # Persist the stats to the Scheduler
        self.scheduler_client.update_resource_stats(compute_node)
        if self.pci_tracker:
            self.pci_tracker.save(context)
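
The _resource_change check keeps this step cheap: stats are pushed to the scheduler only when the ComputeNode object differs from the copy cached during the previous update. A rough sketch of that comparison (the method name exists in the tracker, but the body here is illustrative):

    def _resource_change(self, compute_node):
        # Compare against the copy saved after the last report; only
        # push an update when something actually changed (illustrative;
        # relies on copy and nova.objects.base.obj_equal_prims).
        nodename = compute_node.hypervisor_hostname
        old_compute = self.old_resources[nodename]
        if not obj_base.obj_equal_prims(compute_node, old_compute):
            self.old_resources[nodename] = copy.deepcopy(compute_node)
            return True
        return False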

