(1) Starting the scheduler
The scheduler is started by the script nova/bin/nova-scheduler.
Its contents are as follows:
if __name__ == '__main__':
    utils.default_flagfile()
    flags.FLAGS(sys.argv)
    logging.setup()
    utils.monkey_patch()
    server = service.Service.create(binary='nova-scheduler')
    service.serve(server)
    service.wait()
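Here, utils.default_flagfile() deals with the configuration file: if no configuration file is given on the command line, it falls back to one at a default location.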
def default_flagfile(filename='nova.conf', args=None):
    if args is None:
        args = sys.argv
    for arg in args:
        if arg.find('flagfile') != -1:
            return arg[arg.index('flagfile') + len('flagfile') + 1:]
    else:
        if not os.path.isabs(filename):
            # turn relative filename into an absolute path
            script_dir = os.path.dirname(inspect.stack()[-1][1])
            filename = os.path.abspath(os.path.join(script_dir, filename))
        if not os.path.exists(filename):
            filename = "./nova.conf"
            if not os.path.exists(filename):
                filename = '/etc/nova/nova.conf'
        if os.path.exists(filename):
            flagfile = '--flagfile=%s' % filename
            args.insert(1, flagfile)
            return filename
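Lookup order for the configuration file: the file named by the --flagfile command-line argument (if present, it is used and nothing else is checked) --> the directory of the running script --> the current directory --> /etc/nova/nova.conf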
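The lookup order can be sketched as a standalone function. This is a minimal illustration rather than nova's code: resolve_flagfile and its parameters are hypothetical names, and the directories are passed in explicitly so the function can be tried without touching sys.argv or the real filesystem layout.

```python
import os


def resolve_flagfile(args, script_dir, cwd, filename="nova.conf",
                     system_path="/etc/nova/nova.conf"):
    """Return the config file path, or None if no candidate exists.

    Hypothetical helper for illustration only. Unlike nova's
    default_flagfile(), it does not modify the argument list.
    """
    # 1. An explicit --flagfile=<path> argument always wins.
    for arg in args:
        if arg.startswith("--flagfile="):
            return arg[len("--flagfile="):]
    # 2. Script directory, 3. current directory, 4. system-wide location.
    candidates = (
        os.path.join(script_dir, filename),
        os.path.join(cwd, filename),
        system_path,
    )
    for candidate in candidates:
        if os.path.exists(candidate):
            return candidate
    return None
```

Passing the directories as arguments is a design choice made for this sketch; the real function derives them from the call stack and the process working directory.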
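flags.FLAGS(sys.argv) loads the configuration parameters into flags; FLAGS then holds all configuration values.
logging.setup() initializes the logging module.
The service is created as follows: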
@classmethod
def create(cls, host=None, binary=None, topic=None, manager=None,
           report_interval=None, periodic_interval=None,
           periodic_fuzzy_delay=None):
    """Instantiates class and passes back application object.

    :param host: defaults to FLAGS.host
    :param binary: defaults to basename of executable
    :param topic: defaults to bin_name - 'nova-' part
    :param manager: defaults to FLAGS.<topic>_manager
    :param report_interval: defaults to FLAGS.report_interval
    :param periodic_interval: defaults to FLAGS.periodic_interval
    :param periodic_fuzzy_delay: defaults to FLAGS.periodic_fuzzy_delay

    """
    if not host:
        host = FLAGS.host
    if not binary:
        binary = os.path.basename(inspect.stack()[-1][1])
    if not topic:
        topic = binary.rpartition('nova-')[2]
    if not manager:
        manager = FLAGS.get('%s_manager' % topic, None)
    if report_interval is None:
        report_interval = FLAGS.report_interval
    if periodic_interval is None:
        periodic_interval = FLAGS.periodic_interval
    if periodic_fuzzy_delay is None:
        periodic_fuzzy_delay = FLAGS.periodic_fuzzy_delay
    service_obj = cls(host, binary, topic, manager,
                      report_interval=report_interval,
                      periodic_interval=periodic_interval,
                      periodic_fuzzy_delay=periodic_fuzzy_delay)

    return service_obj
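Here report_interval is described in the flag help as:
seconds between nodes reporting state to datastore
manager is the class name given by "scheduler_manager=..." in the configuration file (the flag name is built as '%s_manager' % topic, and the topic here is 'scheduler').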
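To make the defaulting rules concrete, here is a small sketch of how the topic and the manager flag name fall out of binary='nova-scheduler'. The helper name derive_topic_and_flag is hypothetical; the two expressions inside it are the ones used in create().

```python
def derive_topic_and_flag(binary):
    """Apply the same defaulting expressions as Service.create()."""
    # 'nova-scheduler'.rpartition('nova-') gives ('', 'nova-', 'scheduler'),
    # so element [2] is the part after the 'nova-' prefix.
    topic = binary.rpartition("nova-")[2]
    manager_flag = "%s_manager" % topic
    return topic, manager_flag


print(derive_topic_and_flag("nova-scheduler"))
# prints ('scheduler', 'scheduler_manager')
```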
The corresponding initializer is:
def __init__(self, host, binary, topic, manager, report_interval=None,
             periodic_interval=None, periodic_fuzzy_delay=None,
             *args, **kwargs):
    self.host = host
    self.binary = binary
    self.topic = topic
    self.manager_class_name = manager
    manager_class = utils.import_class(self.manager_class_name)
    self.manager = manager_class(host=self.host, *args, **kwargs)
    self.report_interval = report_interval
    self.periodic_interval = periodic_interval
    self.periodic_fuzzy_delay = periodic_fuzzy_delay
    super(Service, self).__init__(*args, **kwargs)
    self.saved_args, self.saved_kwargs = args, kwargs
    self.timers = []
A service contains one manager and listens on the message queue for its topic. It also periodically reports its state to the database.
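The "one service, one manager loaded by class name" relationship can be sketched as follows. This is an illustration under assumptions: it uses the standard importlib module instead of nova's utils.import_class (whose implementation is not shown here), ToyService is a hypothetical class, and collections.OrderedDict merely stands in for a real manager class.

```python
import importlib


def import_class(import_str):
    """Return the class named by a dotted path like 'pkg.module.ClassName'."""
    module_name, _, class_name = import_str.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


class ToyService(object):
    """Toy service: owns exactly one manager built from a class name."""

    def __init__(self, host, topic, manager_class_name):
        self.host = host
        self.topic = topic
        self.manager_class_name = manager_class_name
        manager_class = import_class(manager_class_name)
        self.manager = manager_class()


# OrderedDict is only a placeholder for a manager class.
svc = ToyService("node1", "scheduler", "collections.OrderedDict")
print(type(svc.manager).__name__)
# prints OrderedDict
```

Loading the manager by name is what lets the configuration file decide which manager class a service runs without changing the service code.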
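I looked most closely at the file nova/service.py.
(a) The Launcher class
It manages service objects, including starting and stopping them. Each service is started with eventlet.spawn, that is, in a new green thread (a coroutine scheduled by eventlet, not an operating-system thread).
Internally, the Launcher class keeps a list of all the service objects it has started.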
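A rough sketch of the Launcher idea follows. It is not nova's implementation: ToyLauncher and EchoService are hypothetical names, and threading.Thread is used only as a stand-in for eventlet.spawn so that the example runs with the standard library alone.

```python
import threading


class EchoService(object):
    """Hypothetical service with the start/stop interface a launcher needs."""

    def __init__(self, name):
        self.name = name
        self.state = "new"

    def start(self):
        self.state = "running"

    def stop(self):
        self.state = "stopped"


class ToyLauncher(object):
    """Tracks services in a list and runs each one concurrently."""

    def __init__(self):
        self._services = []  # list of (service, worker) pairs

    def launch_server(self, service):
        # nova spawns a green thread here; a thread is only a stand-in.
        worker = threading.Thread(target=service.start)
        self._services.append((service, worker))
        worker.start()

    def wait(self):
        for _service, worker in self._services:
            worker.join()

    def stop(self):
        for service, _worker in self._services:
            service.stop()
```

A caller would create one ToyLauncher, call launch_server for each service, then wait; stop walks the same list to shut every service down.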
(b) The Service class
(2) The SchedulerManager class
scheduler_driver_opt = cfg.StrOpt('scheduler_driver',
        default='nova.scheduler.multi.MultiScheduler',
        help='Default driver to use for the scheduler')

FLAGS = flags.FLAGS
FLAGS.register_opt(scheduler_driver_opt)


class SchedulerManager(manager.Manager):
    """Chooses a host to run instances on."""

    def __init__(self, scheduler_driver=None, *args, **kwargs):
        if not scheduler_driver:
            scheduler_driver = FLAGS.scheduler_driver
        self.driver = utils.import_object(scheduler_driver)
        super(SchedulerManager, self).__init__(*args, **kwargs)
By default the driver is nova.scheduler.multi.MultiScheduler.
When a request to start a virtual machine arrives, SchedulerManager.run_instance is called:
def run_instance(self, context, topic, *args, **kwargs):
    """Tries to call schedule_run_instance on the driver.
    Sets instance vm_state to ERROR on exceptions
    """
    args = (context,) + args
    try:
        return self.driver.schedule_run_instance(*args, **kwargs)
    except exception.NoValidHost as ex:
        # don't reraise
        self._set_vm_state_and_notify('run_instance',
                                      {'vm_state': vm_states.ERROR},
                                      context, ex, *args, **kwargs)
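It in turn calls the driver's schedule_run_instance method. If scheduling fails (NoValidHost in the code shown), the virtual machine's state is set to ERROR and the exception is not re-raised.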
def schedule_run_instance(self, *args, **kwargs):
    return self.drivers['compute'].schedule_run_instance(*args, **kwargs)
The 'compute' driver used here is set by the following option:
cfg.StrOpt('compute_scheduler_driver',
           default='nova.scheduler.'
                   'filter_scheduler.FilterScheduler',
           help='Driver to use for scheduling compute calls'),
So virtual machine requests are handed over to the FilterScheduler class.
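The chain manager --> MultiScheduler --> per-topic driver, together with the "set ERROR, do not re-raise" behaviour, can be sketched with stand-in classes. Everything prefixed Fake, the local NoValidHost exception, and the string 'error' are hypothetical simplifications; only the delegation pattern is taken from the code above.

```python
class NoValidHost(Exception):
    """Stand-in for nova's exception of the same name."""


class FakeFilterScheduler(object):
    """Picks the first host, or fails if there is none."""

    def __init__(self, hosts):
        self.hosts = hosts

    def schedule_run_instance(self, request_spec):
        if not self.hosts:
            raise NoValidHost()
        return self.hosts[0]


class FakeMultiScheduler(object):
    """Holds one driver per topic and forwards calls to the right one."""

    def __init__(self, compute_driver):
        self.drivers = {"compute": compute_driver}

    def schedule_run_instance(self, *args, **kwargs):
        return self.drivers["compute"].schedule_run_instance(*args, **kwargs)


class FakeSchedulerManager(object):
    def __init__(self, driver):
        self.driver = driver
        self.vm_state = None

    def run_instance(self, request_spec):
        try:
            return self.driver.schedule_run_instance(request_spec)
        except NoValidHost:
            # Do not re-raise; record the failure instead.
            self.vm_state = "error"
            return None
```

With hosts available, run_instance returns the chosen host; with an empty host list it returns None and leaves vm_state set to 'error'.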
The schedule_run_instance method of the FilterScheduler class:
def schedule_run_instance(self, context, request_spec, *args, **kwargs):
    """This method is called from nova.compute.api to provision
    an instance. We first create a build plan (a list of WeightedHosts)
    and then provision.

    Returns a list of the instances created.
    """
    elevated = context.elevated()
    num_instances = request_spec.get('num_instances', 1)
    LOG.debug(_("Attempting to build %(num_instances)d instance(s)") %
              locals())

    payload = dict(request_spec=request_spec)
    notifier.notify(notifier.publisher_id("scheduler"),
                    'scheduler.run_instance.start', notifier.INFO, payload)

    weighted_hosts = self._schedule(context, "compute", request_spec,
                                    *args, **kwargs)

    if not weighted_hosts:
        raise exception.NoValidHost(reason="")

    # NOTE(comstud): Make sure we do not pass this through. It
    # contains an instance of RpcContext that cannot be serialized.
    kwargs.pop('filter_properties', None)

    instances = []
    for num in xrange(num_instances):
        if not weighted_hosts:
            break
        weighted_host = weighted_hosts.pop(0)

        request_spec['instance_properties']['launch_index'] = num
        instance = self._provision_resource(elevated, weighted_host,
                                            request_spec, kwargs)

        if instance:
            instances.append(instance)

    notifier.notify(notifier.publisher_id("scheduler"),
                    'scheduler.run_instance.end', notifier.INFO, payload)
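The provisioning loop in this method pairs each launch index with the next-best host and stops early when the hosts run out. A stripped-down sketch of just that loop, with a hypothetical function name and plain strings in place of WeightedHost objects:

```python
def assign_hosts(weighted_hosts, num_instances):
    """Pair launch indexes with hosts, taking the best host first.

    Stops early if there are fewer hosts than requested instances.
    """
    remaining = list(weighted_hosts)  # copy so the caller's list is untouched
    assignments = []
    for launch_index in range(num_instances):
        if not remaining:
            break
        assignments.append((launch_index, remaining.pop(0)))
    return assignments


print(assign_hosts(["host-a", "host-b"], 3))
# prints [(0, 'host-a'), (1, 'host-b')]
```

Note that asking for three instances with only two hosts yields two assignments and no error, which matches the break in the original loop.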
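(3) The scheduling algorithm
(a) Getting the cost functions and their weights: get_cost_functions
Each topic has its own set of cost functions and weights. If the result for a topic was cached earlier, it is returned directly:
if topic in self.cost_function_cache:
    return self.cost_function_cache[topic]
The cost functions are the ones listed under least_cost_functions in the configuration file: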
###### (FloatOpt) How much weight to give the fill-first cost function. A negative value will reverse behavior: e.g. spread-first
# compute_fill_first_cost_fn_weight=-1.0
###### (ListOpt) Which cost functions the LeastCostScheduler should use
# least_cost_functions="nova.scheduler.least_cost.compute_fill_first_cost_fn"
###### (FloatOpt) How much weight to give the noop cost function
# noop_cost_fn_weight=1.0
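Multiple cost functions are separated by commas, and each cost function has its own weight option.
Finally, get_cost_functions collects all the cost functions together with their weights and returns them.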
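A simplified sketch of the caching and weight lookup follows. It makes several assumptions: the CONFIG dictionary stands in for the real flags, the "<function name>_weight" naming is inferred from the sample options above, the body of the fill-first cost function is illustrative, and the sketch ignores how nova decides which functions belong to which topic.

```python
def compute_fill_first_cost_fn(host):
    # Illustrative cost: the host's free RAM.
    return host["free_ram_mb"]


def noop_cost_fn(host):
    return 1


# Hypothetical stand-ins for the configuration values.
CONFIG = {
    "least_cost_functions": ["compute_fill_first_cost_fn", "noop_cost_fn"],
    "compute_fill_first_cost_fn_weight": -1.0,
    "noop_cost_fn_weight": 1.0,
}
AVAILABLE = {
    "compute_fill_first_cost_fn": compute_fill_first_cost_fn,
    "noop_cost_fn": noop_cost_fn,
}
_cost_function_cache = {}


def get_cost_functions(topic="compute"):
    """Return [(weight, cost_fn), ...] for a topic, cached after first use."""
    if topic in _cost_function_cache:
        return _cost_function_cache[topic]
    cost_fns = []
    for name in CONFIG["least_cost_functions"]:
        weight = CONFIG["%s_weight" % name]
        cost_fns.append((weight, AVAILABLE[name]))
    _cost_function_cache[topic] = cost_fns
    return cost_fns
```

A second call with the same topic returns the cached list object rather than rebuilding it.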
(b) Getting information on all available hosts: HostManager.get_all_host_states
def get_all_host_states(self, context, topic):
    """Returns a dict of all the hosts the HostManager knows about. Also,
    each of the consumable resources in HostState are pre-populated and
    adjusted based on data in the db.

    For example:
    {'192.168.1.100': HostState(), ...}

    Note: this can be very slow with a lot of instances.
    InstanceType table isn't required since a copy is stored
    with the instance (in case the InstanceType changed since the
    instance was created)."""
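i. First, all available, not-yet-filtered hosts are read from the database into a dictionary, host_state_map.
ii. All compute nodes are stored in the compute_nodes table of the nova database.
iii. The necessary information for each node (such as memory capacity, number of virtual CPUs, and disk capacity) is read from that table and placed in host_state_map.
iv. Information on all instances is read from the nova.instances table.
v. For each instance, the resources it occupies are subtracted from the entry for its host in host_state_map, which leaves each host's remaining resources.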
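Steps i to v amount to "start from each node's capacity, then subtract what its instances use". A minimal sketch with plain dictionaries in place of database rows and HostState objects; the function name and the field names (memory_mb, local_gb, root_gb) are assumptions made for the example, not a statement about nova's schema.

```python
def build_host_state_map(compute_nodes, instances):
    """Return {host: remaining resources} after subtracting instance usage."""
    host_state_map = {}
    # Steps i-iii: seed the map with each node's total capacity.
    for node in compute_nodes:
        host_state_map[node["host"]] = {
            "free_ram_mb": node["memory_mb"],
            "free_disk_gb": node["local_gb"],
        }
    # Steps iv-v: subtract what each instance consumes on its host.
    for inst in instances:
        state = host_state_map.get(inst["host"])
        if state is None:
            continue  # instance is not on a known compute node
        state["free_ram_mb"] -= inst["memory_mb"]
        state["free_disk_gb"] -= inst["root_gb"]
    return host_state_map


nodes = [{"host": "node1", "memory_mb": 8192, "local_gb": 100}]
vms = [{"host": "node1", "memory_mb": 2048, "root_gb": 20}]
print(build_host_state_map(nodes, vms))
# prints {'node1': {'free_ram_mb': 6144, 'free_disk_gb': 80}}
```

Because every instance row is visited, the cost grows with the number of instances, which is the slowness the docstring warns about.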
(c) Filtering out unsuitable hosts with the configured filters
The filters can be set in the configuration file.
scheduler_available_filters defines which filters are available; it may be specified more than once. For example:
scheduler_driver=nova.scheduler.distributed_scheduler.FilterScheduler
scheduler_available_filters=nova.scheduler.filters.standard_filters
scheduler_available_filters=myfilter.MyFilter
scheduler_default_filters=RamFilter,ComputeFilter,MyFilter
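The filtering step can be pictured as "keep a host only if every filter accepts it". The sketch below is an assumption-laden illustration: the classes are toy versions written for this example (not nova's RamFilter or a real MyFilter), and the host_passes signature and dictionary layout are simplified.

```python
class ToyRamFilter(object):
    """Pass hosts that have enough free RAM for the request."""

    def host_passes(self, host_state, request):
        return host_state["free_ram_mb"] >= request["memory_mb"]


class ToyDenyListFilter(object):
    """Example custom filter: reject hosts named on a deny list."""

    def __init__(self, denied=()):
        self.denied = set(denied)

    def host_passes(self, host_state, request):
        return host_state["host"] not in self.denied


def filter_hosts(hosts, filters, request):
    """Keep only the hosts accepted by every filter."""
    return [
        host for host in hosts
        if all(f.host_passes(host, request) for f in filters)
    ]


hosts = [
    {"host": "node1", "free_ram_mb": 512},
    {"host": "node2", "free_ram_mb": 4096},
    {"host": "node3", "free_ram_mb": 8192},
]
filters = [ToyRamFilter(), ToyDenyListFilter(denied=["node3"])]
print([h["host"] for h in filter_hosts(hosts, filters, {"memory_mb": 2048})])
# prints ['node2']
```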
(d) Choosing the most suitable host according to the cost functions and their weights
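A sketch of the selection step, assuming each host's score is the sum of weight times cost over all cost functions and that the lowest score wins. The function names and the free-RAM cost function are illustrative, not nova's implementation.

```python
def free_ram_cost(host):
    # Illustrative cost function: the host's free RAM in MB.
    return host["free_ram_mb"]


def weighted_sum(host, weighted_fns):
    """Total cost of a host: sum of weight * cost over all functions."""
    return sum(weight * fn(host) for weight, fn in weighted_fns)


def pick_best_host(hosts, weighted_fns):
    """Return the host with the lowest total weighted cost."""
    return min(hosts, key=lambda host: weighted_sum(host, weighted_fns))


hosts = [
    {"host": "node1", "free_ram_mb": 1024},
    {"host": "node2", "free_ram_mb": 4096},
]
# Positive weight: the fuller host is cheapest (fill first).
print(pick_best_host(hosts, [(1.0, free_ram_cost)])["host"])
# prints node1
# Negative weight: the emptier host is cheapest (spread first).
print(pick_best_host(hosts, [(-1.0, free_ram_cost)])["host"])
# prints node2
```

Flipping the sign of the weight reverses the ordering, which is the fill-first versus spread-first behaviour described in the comment on compute_fill_first_cost_fn_weight.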