I've recently been studying the OpenStack source code. Since I'm particularly interested in OpenStack's scheduling algorithms, I started with nova-scheduler.
nova-scheduler is a component of OpenStack Nova; it is responsible for selecting a suitable compute node for a guest VM.
The class SchedulerManager (scheduler/manager.py) is the entry point of the scheduler. Below is an excerpt of the code involved in launching a VM.
class SchedulerManager(manager.Manager):
    """Chooses a host to run instances on."""

    target = messaging.Target(version='3.0')

    # Class initializer: load the configured scheduler driver
    def __init__(self, scheduler_driver=None, *args, **kwargs):
        if not scheduler_driver:
            scheduler_driver = CONF.scheduler_driver
        self.driver = importutils.import_object(scheduler_driver)
        self.compute_rpcapi = compute_rpcapi.ComputeAPI()
        super(SchedulerManager, self).__init__(service_name='scheduler',
                                               *args, **kwargs)

    # Launch instance(s)
    def run_instance(self, context, request_spec, admin_password,
                     injected_files, requested_networks, is_first_time,
                     filter_properties, legacy_bdm_in_spec):
        """Tries to call schedule_run_instance on the driver.

        Sets instance vm_state to ERROR on exceptions.
        """
        # Get the UUIDs of the instances to launch
        instance_uuids = request_spec['instance_uuids']
        with compute_utils.EventReporter(context, 'schedule', *instance_uuids):
            try:
                # Delegate to the driver's schedule_run_instance method
                return self.driver.schedule_run_instance(context,
                        request_spec, admin_password, injected_files,
                        requested_networks, is_first_time, filter_properties,
                        legacy_bdm_in_spec)
            except exception.NoValidHost as ex:
                # don't re-raise
                self._set_vm_state_and_notify('run_instance',
                                              {'vm_state': vm_states.ERROR,
                                               'task_state': None},
                                              context, ex, request_spec)
            except Exception as ex:
                with excutils.save_and_reraise_exception():
                    self._set_vm_state_and_notify('run_instance',
                                                  {'vm_state': vm_states.ERROR,
                                                   'task_state': None},
                                                  context, ex, request_spec)
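Note the asymmetry in the two exception handlers: NoValidHost only marks the instance as errored, while any other exception is re-raised after the notification thanks to oslo's save_and_reraise_exception. The sketch below is a simplified reimplementation of that pattern, just to illustrate the mechanics; it is not the actual excutils code.

```python
import contextlib
import sys

# A simplified sketch of the save_and_reraise_exception pattern used above.
# The real helper lives in oslo's excutils module; this version only shows
# the idea and is meant to be used inside an `except` block.
@contextlib.contextmanager
def save_and_reraise_exception():
    exc_info = sys.exc_info()  # remember the exception currently in flight
    try:
        yield  # run the cleanup/notification code in the with-block
    except Exception:
        # the cleanup itself failed: let the new exception propagate instead
        raise
    else:
        # cleanup succeeded: re-raise the original exception
        raise exc_info[1].with_traceback(exc_info[2])
```

This lets run_instance record the error state without swallowing the original exception, so callers further up the stack still see it.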
OpenStack currently provides three schedulers:
1. ChanceScheduler 2. FilterScheduler 3. CachingScheduler
To make this easy to extend, Nova provides a base class,
Scheduler (scheduler/driver.py). Subclass it and implement its interface methods, and you have your own scheduler. This is the strategy pattern from design patterns.
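The strategy pattern here means the host-selection policy is swapped by swapping the driver class. The toy sketch below illustrates the shape of that design; the class and method names are simplified illustrations, not Nova's actual driver interface.

```python
import random

# A hypothetical sketch of the strategy pattern behind scheduler drivers.
# The real base class lives in nova/scheduler/driver.py; names and host
# data structures here are simplified for illustration.
class Scheduler(object):
    """Base strategy: subclasses implement the host-selection policy."""
    def select_host(self, hosts, request_spec):
        raise NotImplementedError()

class RandomScheduler(Scheduler):
    """A toy policy in the spirit of ChanceScheduler: pick any host."""
    def select_host(self, hosts, request_spec):
        return random.choice(hosts)

class MostFreeRamScheduler(Scheduler):
    """A toy policy: pick the host with the most free RAM."""
    def select_host(self, hosts, request_spec):
        return max(hosts, key=lambda h: h['free_ram_mb'])
```

The rest of the manager code never needs to know which policy is active; it only calls the driver's interface methods.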
The default scheduler is FilterScheduler; it can be changed via the scheduler_driver option in Nova's configuration file (etc/nova/nova.conf).
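A minimal configuration fragment might look like the following; the exact import path shown here matches the era of the code in this article and may differ between Nova releases.

```ini
# etc/nova/nova.conf — selecting the scheduler driver
[DEFAULT]
scheduler_driver = nova.scheduler.filter_scheduler.FilterScheduler
```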
FilterScheduler involves two main phases: Filtering and Weighting.
Filtering prunes the candidate compute hosts against a set of rules (CPU cores, memory, disk, and so on); the surviving hosts are then weighted, sorted by weight, and the best one is chosen to launch the VM.
The diagram below is taken from the official OpenStack documentation.
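The two phases can be sketched as a simple pipeline. The host dictionaries and the RAM-based weigher below are illustrative assumptions, not Nova's actual data structures or weighers.

```python
# A minimal sketch of the two FilterScheduler phases: filter out hosts that
# cannot fit the request, then weigh the survivors and sort best-first.
def filter_hosts(hosts, spec):
    """Keep only hosts with enough free RAM and disk for the request."""
    return [h for h in hosts
            if h['free_ram_mb'] >= spec['ram_mb']
            and h['free_disk_gb'] >= spec['disk_gb']]

def weigh_hosts(hosts):
    """Score hosts (here: simply by free RAM) and sort best-first."""
    return sorted(hosts, key=lambda h: h['free_ram_mb'], reverse=True)

def schedule(hosts, spec):
    """Run both phases and return the winning host, or None."""
    weighed = weigh_hosts(filter_hosts(hosts, spec))
    return weighed[0] if weighed else None
```

In the real FilterScheduler each phase is pluggable: filters and weighers are independent classes that can be enabled and combined through configuration.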
Below is the schedule_run_instance method of the FilterScheduler class:
    def schedule_run_instance(self, context, request_spec,
                              admin_password, injected_files,
                              requested_networks, is_first_time,
                              filter_properties, legacy_bdm_in_spec):
        """Provisions instances that need to be scheduled

        Applies filters and weighers on request properties to get a list of
        compute hosts and calls them to spawn instance(s).
        """
        # Build the notification payload: {'request_spec': request_spec}
        payload = dict(request_spec=request_spec)
        self.notifier.info(context, 'scheduler.run_instance.start', payload)

        instance_uuids = request_spec.get('instance_uuids')
        LOG.info(_("Attempting to build %(num_instances)d instance(s) "
                   "uuids: %(instance_uuids)s"),
                 {'num_instances': len(instance_uuids),
                  'instance_uuids': instance_uuids})
        LOG.debug("Request Spec: %s" % request_spec)

        # check retry policy. Rather ugly use of instance_uuids[0]...
        # but if we've exceeded max retries... then we really only
        # have a single instance.
        scheduler_utils.populate_retry(filter_properties,
                                       instance_uuids[0])
        # Get the ordered list of weighed candidate hosts
        weighed_hosts = self._schedule(context, request_spec,
                                       filter_properties)

        # NOTE: Pop instance_uuids as individual creates do not need the
        # set of uuids. Do not pop before here as the upper exception
        # handler for NoValidHost needs the uuid to set error state
        instance_uuids = request_spec.pop('instance_uuids')

        # NOTE(comstud): Make sure we do not pass this through. It
        # contains an instance of RpcContext that cannot be serialized.
        filter_properties.pop('context', None)

        for num, instance_uuid in enumerate(instance_uuids):
            request_spec['instance_properties']['launch_index'] = num

            try:
                try:
                    weighed_host = weighed_hosts.pop(0)
                    LOG.info(_("Choosing host %(weighed_host)s "
                               "for instance %(instance_uuid)s"),
                             {'weighed_host': weighed_host,
                              'instance_uuid': instance_uuid})
                except IndexError:
                    raise exception.NoValidHost(reason="")

                self._provision_resource(context, weighed_host,
                                         request_spec,
                                         filter_properties,
                                         requested_networks,
                                         injected_files, admin_password,
                                         is_first_time,
                                         instance_uuid=instance_uuid,
                                         legacy_bdm_in_spec=legacy_bdm_in_spec)
            except Exception as ex:
                # NOTE(vish): we don't reraise the exception here to make sure
                #             that all instances in the request get set to
                #             error properly
                driver.handle_schedule_error(context, ex, instance_uuid,
                                             request_spec)
            # scrub retry host list in case we're scheduling multiple
            # instances:
            retry = filter_properties.get('retry', {})
            retry['hosts'] = []

        self.notifier.info(context, 'scheduler.run_instance.end', payload)
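The per-instance loop consumes the weighed host list in order: each instance pops the next-best host, and running out of candidates raises NoValidHost. A toy version of just that consumption logic, with simplified names and no provisioning, might look like this:

```python
# A toy version of the per-instance loop above: each requested instance
# takes the next-best weighed host; running out raises NoValidHost.
# Names here are illustrative, not Nova's actual classes.
class NoValidHost(Exception):
    pass

def assign_hosts(instance_uuids, weighed_hosts):
    """Map each instance UUID to the next host on the weighed list."""
    placements = {}
    for uuid in instance_uuids:
        if not weighed_hosts:
            raise NoValidHost("ran out of candidate hosts")
        placements[uuid] = weighed_hosts.pop(0)
    return placements
```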
This method calls _schedule on the same class to obtain the list of candidate hosts.
_schedule contains both the filtering step (host_manager.get_filtered_hosts) and the weighting step (host_manager.get_weighed_hosts).
get_filtered_hosts and get_weighed_hosts will be covered in the next post.