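I. Overview
A virtual machine can be booted in several ways, which differ in the source the VM is created from and the destination it is stored on.
The "source" is where the VM comes from: a volume (with bootable set to true), a snapshot, or an image. The "destination" is where the VM is stored after creation: a volume, or a backend specified by the hypervisor.
Booting from a volume here means that the VM's destination is a volume; its source can be a volume (with bootable set to true), a snapshot, or an image.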
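II. How nova interacts with glance and cinder during boot
Before going into the boot operation itself, consider the following figure:
The full set of nova boot parameters is as follows:
Of these, --block-device determines most of what is needed to boot a VM. Its source and dest keys are described in detail here: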
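dest | source | Description | Shortcut |
volume | volume | Attached directly to the compute node | When boot_index=0, equivalent to --boot-volume <volume_id> |
volume | snapshot | Calls cinder to create a new volume from the snapshot, then attaches it to the compute node | When boot_index=0, equivalent to --snapshot <snapshot_id> |
volume | image | Calls cinder to create a new volume from the image, then attaches it to the compute node | When boot_index=0, equivalent to --image <image> |
local | image | Creates an ephemeral partition on the hypervisor, copies the image into it, and boots the VM | Equivalent to a normal boot from image |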
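The table can be read as a lookup from a (source, dest) pair to an action. The sketch below only illustrates that reading: BDM_ACTIONS and describe_bdm are hypothetical names, and the action strings paraphrase the table rather than anything in nova.

```python
# Hypothetical lookup table summarising the source/dest rows of the table.
# The action strings paraphrase the table; they are not taken from nova.
BDM_ACTIONS = {
    ("volume", "volume"): "attach the existing volume to the compute node",
    ("snapshot", "volume"): "create a volume from the snapshot via cinder, then attach it",
    ("image", "volume"): "create a volume from the image via cinder, then attach it",
    ("image", "local"): "create an ephemeral disk on the hypervisor and copy the image into it",
}


def describe_bdm(source, dest):
    """Return the action for a (source, dest) pair listed in the table."""
    try:
        return BDM_ACTIONS[(source, dest)]
    except KeyError:
        raise ValueError(
            "combination not listed in the table: source=%s, dest=%s"
            % (source, dest))


if __name__ == "__main__":
    print(describe_bdm("image", "volume"))
```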
Example:
Goal | Create a VM that boots from a volume | |
Method | Method 1 | Method 2 |
Command | nova boot --flavor <flavor_id> --block-device source=image,id=<image_id>,dest=volume,size=1,shutdown=preserve,bootindex=0 Boot-from-volume-1 | cinder create --image <image_id> --size 1 Bootable_volume |
 | | nova boot --flavor <flavor_id> --volume <volume_id> --block-device source=volume,id=<volume_id>,dest=volume,size=1,shutdown=preserve,bootindex=0 Boot-from-volume-2 |
Description | Creates a volume (Bootable=True) from the image and attaches it to the compute node | First creates a volume from the image, then boots the VM on the newly created volume |
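The --block-device value in Method 1 is a comma-separated list of key=value pairs. As a rough sketch of how such a string breaks down, the hypothetical helper parse_block_device below splits it into a dict; novaclient's own parsing and validation may well differ.

```python
def parse_block_device(spec):
    """Split 'key=value,key=value' into a dict (hypothetical helper).

    Whitespace around items is ignored; an item without '=' is rejected.
    """
    result = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % item)
        result[key] = value
    return result


if __name__ == "__main__":
    spec = ("source=image,id=IMAGE_ID,dest=volume,"
            "size=1,shutdown=preserve,bootindex=0")
    print(parse_block_device(spec))
```

All values stay strings in this sketch; converting size or bootindex to integers is left to the caller.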
Creating with Method 1:
- List the images
- List the available flavors
- Create the VM
- Check the current volumes
- Check the current VMs
III. Code-level analysis
1. Overall flow
2. Call path
novaclient.boot
→ nova.api.openstack.compute.create
→ nova.conductor.computetaskAPI.build_instances
→ nova.conductor.rpcapi.computetaskAPI.build_instances
→ nova.conductor.manager.build_instances
→ nova.compute.rpcapi.computeAPI.build_and_run_instance
→ nova.compute.manager.build_and_run_instance
→ nova.compute.manager._do_build_and_run_instance
→ nova.compute.manager._build_and_run_instance
→ nova.compute.manager._build_resources
→ nova.compute.manager._prep_block_device
→ nova.virt.driver.get_block_device_info
→ nova.virt.block_device.convert_all_volumes
→ nova.virt.block_device._convert_block_devices
→ nova.virt.block_device.attach_block_devices
→ nova.virt.driver.load_compute_driver.spawn
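IV. Detailed analysis of block device resource creation
1. Overview
To make the code analysis that follows easier to read, here is a brief flowchart:
2. Code analysis of volume-related device creation
The following is the detailed code analysis of volume-related device creation:
2.1 Entry point: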
    def _build_resources(...):  # excerpt; signature and except clauses elided
        try:
            self._default_block_device_names(instance, image_meta,
                                             block_device_mapping)
            LOG.debug('Start building block device mappings for instance.',
                      instance=instance)
            instance.vm_state = vm_states.BUILDING
            instance.task_state = task_states.BLOCK_DEVICE_MAPPING
            instance.save()
            block_device_info = self._prep_block_device(context, instance,
                                                        block_device_mapping)
            resources['block_device_info'] = block_device_info
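This code creates the block devices for the VM. It first sets the VM's state and saves it, then hands over to _prep_block_device, which gathers and wraps the block device information and passes it to virt.block_device.attach_block_devices for creation.
Next, look at _prep_block_device in detail: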
    def _prep_block_device(...):  # excerpt; signature and except clauses elided
        try:
            self._add_missing_dev_names(bdms, instance)
            block_device_info = driver.get_block_device_info(instance, bdms)
            # Extract the volume mappings that attach_block_devices works on
            mapping = driver.block_device_info_get_mapping(block_device_info)
            driver_block_device.attach_block_devices(
                mapping, context, instance, self.volume_api, self.driver,
                do_check_attach=do_check_attach,
                wait_func=self._await_block_device_map_created)
            self._block_device_info_to_legacy(block_device_info)
            return block_device_info
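Here driver_block_device.attach_block_devices is the final entry point. Its most important argument is mapping, which is wrapped by driver.block_device_info_get_mapping, while the underlying data is obtained by driver.get_block_device_info. Stepping into that function: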
    def get_block_device_info(...):  # excerpt; signature elided
        from nova.virt import block_device as virt_block_device

        block_device_info = {
            'root_device_name': instance.root_device_name,
            'ephemerals': virt_block_device.convert_ephemerals(
                block_device_mapping),
            'block_device_mapping':
                virt_block_device.convert_all_volumes(*block_device_mapping)
        }
        swap_list = virt_block_device.convert_swap(block_device_mapping)
        block_device_info['swap'] = virt_block_device.get_swap(swap_list)
        return block_device_info
Here ephemerals relates to dest=local, and block_device_mapping relates to dest=volume. First, look at how block_device_mapping is built:
    def convert_all_volumes(...):  # excerpt; signature elided
        source_volume = convert_volumes(volume_bdms)
        source_snapshot = convert_snapshots(volume_bdms)
        source_image = convert_images(volume_bdms)
        source_blank = convert_blanks(volume_bdms)
        return [vol for vol in
                itertools.chain(source_volume, source_snapshot,
                                source_image, source_blank)]
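From this one can guess that source_volume, source_snapshot, source_image, and source_blank each correspond to the boot-time handling of their respective source type. In the code they are defined as: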
    convert_volumes = functools.partial(_convert_block_devices,
                                        DriverVolumeBlockDevice)
    convert_snapshots = functools.partial(_convert_block_devices,
                                          DriverSnapshotBlockDevice)
    convert_images = functools.partial(_convert_block_devices,
                                       DriverImageBlockDevice)
    convert_blanks = functools.partial(_convert_block_devices,
                                       DriverBlankBlockDevice)
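_convert_block_devices iterates over the devices passed in through the mapping and wraps each device that is present in its corresponding class, namely the Driver<Volume/Snapshot/Image/Blank>BlockDevice classes referred to in Section III (code-level analysis), which carry the methods that do the work.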
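The partial-based converters can be illustrated with stand-in classes. In this sketch the two classes are simplified placeholders that reject a mapping by raising ValueError when its source_type does not match (an assumption made for the example; the real classes in nova do more), and _convert_block_devices keeps only the mappings a class accepts.

```python
import functools


class DriverVolumeBlockDevice(object):
    """Stand-in class: accepts only mappings whose source_type matches."""
    _valid_source = "volume"

    def __init__(self, bdm):
        if bdm.get("source_type") != self._valid_source:
            # Assumption for this sketch: a mismatch is signalled by ValueError
            raise ValueError("not a %s mapping" % self._valid_source)
        self.bdm = bdm


class DriverSnapshotBlockDevice(DriverVolumeBlockDevice):
    _valid_source = "snapshot"


def _convert_block_devices(device_type, block_device_mapping):
    """Wrap every mapping that device_type accepts; skip the others."""
    devices = []
    for bdm in block_device_mapping:
        try:
            devices.append(device_type(bdm))
        except ValueError:
            pass
    return devices


# Each converter is the generic function with its first argument pre-bound.
convert_volumes = functools.partial(_convert_block_devices,
                                    DriverVolumeBlockDevice)
convert_snapshots = functools.partial(_convert_block_devices,
                                      DriverSnapshotBlockDevice)

if __name__ == "__main__":
    bdms = [{"source_type": "volume"}, {"source_type": "snapshot"}]
    print(len(convert_volumes(bdms)), len(convert_snapshots(bdms)))
```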
Back to the final entry point mentioned above:
    def attach_block_devices(...):  # excerpt; signature elided
        def _log_and_attach(bdm):
            context = attach_args[0]
            instance = attach_args[1]
            if bdm.get('volume_id'):
                LOG.info(_LI('Booting with volume %(volume_id)s at '
                             '%(mountpoint)s'),
                         {'volume_id': bdm.volume_id,
                          'mountpoint': bdm['mount_device']},
                         context=context, instance=instance)
            elif bdm.get('snapshot_id'):
                LOG.info(_LI('Booting with volume snapshot %(snapshot_id)s at '
                             '%(mountpoint)s'),
                         {'snapshot_id': bdm.snapshot_id,
                          'mountpoint': bdm['mount_device']},
                         context=context, instance=instance)
            elif bdm.get('image_id'):
                LOG.info(_LI('Booting with volume-backed-image %(image_id)s at '
                             '%(mountpoint)s'),
                         {'image_id': bdm.image_id,
                          'mountpoint': bdm['mount_device']},
                         context=context, instance=instance)
            else:
                LOG.info(_LI('Booting with blank volume at %(mountpoint)s'),
                         {'mountpoint': bdm['mount_device']},
                         context=context, instance=instance)

            bdm.attach(*attach_args, **attach_kwargs)

        map(_log_and_attach, block_device_mapping)
        return block_device_mapping
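The two most important operations in this function are map and bdm.attach. map calls _log_and_attach on every element of block_device_mapping, whose elements are instances of the classes described above, and bdm.attach then invokes the attach method of the corresponding class. The attach methods of the different device types are analyzed below.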
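A minimal sketch of this dispatch, using a hypothetical FakeBDM class and attach_all function in place of nova's objects. It uses an explicit loop rather than map, because on Python 3 map returns a lazy iterator and a bare map(...) call would never invoke attach; the excerpt relies on Python 2's eager map.

```python
class FakeBDM(object):
    """Hypothetical stand-in for a Driver*BlockDevice object."""

    def __init__(self, name):
        self.name = name
        self.attached_with = None

    def attach(self, *args, **kwargs):
        # Record the arguments so the caller can see what was forwarded
        self.attached_with = (args, kwargs)


def attach_all(block_device_mapping, *attach_args, **attach_kwargs):
    """Forward the same arguments to attach() on every mapping."""
    def _log_and_attach(bdm):
        bdm.attach(*attach_args, **attach_kwargs)

    # Explicit loop: a bare map() would be lazy on Python 3
    for bdm in block_device_mapping:
        _log_and_attach(bdm)
    return block_device_mapping


if __name__ == "__main__":
    bdms = attach_all([FakeBDM("root"), FakeBDM("data")],
                      "context", "instance", wait_func=None)
    for bdm in bdms:
        print(bdm.name, bdm.attached_with)
```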
2.2 source=volume, dest=volume
The attach method of the DriverVolumeBlockDevice class, in detail:
    def attach(self, context, instance, volume_api, virt_driver,
               do_check_attach=True, do_driver_attach=False, **kwargs):
        volume = volume_api.get(context, self.volume_id)
        if do_check_attach:
            volume_api.check_attach(context, volume, instance=instance)

        volume_id = volume['id']
        context = context.elevated()

        connector = virt_driver.get_volume_connector(instance)
        connection_info = volume_api.initialize_connection(context,
                                                           volume_id,
                                                           connector)
        if 'serial' not in connection_info:
            connection_info['serial'] = self.volume_id
        self._preserve_multipath_id(connection_info)

        # Skipped when do_driver_attach is False
        if do_driver_attach:
            encryption = encryptors.get_encryption_metadata(
                context, volume_api, volume_id, connection_info)

            try:
                virt_driver.attach_volume(
                    context, connection_info, instance,
                    self['mount_device'], disk_bus=self['disk_bus'],
                    device_type=self['device_type'], encryption=encryption)
            except Exception:
                with excutils.save_and_reraise_exception():
                    LOG.exception(_LE("Driver failed to attach volume "
                                      "%(volume_id)s at %(mountpoint)s"),
                                  {'volume_id': volume_id,
                                   'mountpoint': self['mount_device']},
                                  context=context, instance=instance)
                    volume_api.terminate_connection(context, volume_id,
                                                    connector)
        self['connection_info'] = connection_info
        if self.volume_size is None:
            self.volume_size = volume.get('size')

        mode = 'rw'
        if 'data' in connection_info:
            mode = connection_info['data'].get('access_mode', 'rw')
        if volume['attach_status'] == "detached":
            self.save()
            try:
                volume_api.attach(context, volume_id, instance.uuid,
                                  self['mount_device'], mode=mode)
            except Exception:
                with excutils.save_and_reraise_exception():
                    if do_driver_attach:
                        try:
                            virt_driver.detach_volume(connection_info,
                                                      instance,
                                                      self['mount_device'],
                                                      encryption=encryption)
                        except Exception:
                            LOG.warning(_LW("Driver failed to detach volume "
                                            "%(volume_id)s at %(mount_point)s."),
                                        {'volume_id': volume_id,
                                         'mount_point': self['mount_device']},
                                        exc_info=True, context=context,
                                        instance=instance)
                    volume_api.terminate_connection(context, volume_id,
                                                    connector)
                    volume_api.detach(context, volume_id)
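The do_driver_attach argument passed in is False, so virt_driver.attach_volume is not used to attach the volume at this point. Instead, volume_api.attach is called further down to register the attachment with cinder, and the volume is connected to the guest later, when the virt driver spawns the instance using the connection_info saved here. The reason is that this is boot from volume: the VM boots directly from the volume but has not been started yet, so the virt driver is not asked to attach the volume to it directly.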
2.3 source=snapshot, dest=volume
    def attach(self, context, instance, volume_api, virt_driver,
               wait_func=None, do_check_attach=True):
        if not self.volume_id:
            av_zone = _get_volume_create_az_value(instance)
            snapshot = volume_api.get_snapshot(context, self.snapshot_id)
            vol = volume_api.create(context, self.volume_size, '', '',
                                    snapshot, availability_zone=av_zone)
            if wait_func:
                self._call_wait_func(context, wait_func, volume_api, vol['id'])
            self.volume_id = vol['id']

        # Call the volume attach now
        super(DriverSnapshotBlockDevice, self).attach(
            context, instance, volume_api, virt_driver,
            do_check_attach=do_check_attach)
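DriverSnapshotBlockDevice is a subclass of DriverVolumeBlockDevice. It fetches the snapshot through volume_api.get_snapshot (which calls cinder), calls volume_api.create to create a volume from that snapshot, and finally calls the parent class's attach method, which was covered above.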
2.4 source=image, dest=volume
    def attach(self, context, instance, volume_api, virt_driver,
               wait_func=None, do_check_attach=True):
        if not self.volume_id:
            av_zone = _get_volume_create_az_value(instance)
            vol = volume_api.create(context, self.volume_size, '', '',
                                    image_id=self.image_id,
                                    availability_zone=av_zone)
            if wait_func:
                self._call_wait_func(context, wait_func, volume_api, vol['id'])
            self.volume_id = vol['id']

        super(DriverImageBlockDevice, self).attach(
            context, instance, volume_api, virt_driver,
            do_check_attach=do_check_attach)
DriverImageBlockDevice is a subclass of DriverVolumeBlockDevice. It calls volume_api.create to create a volume from the image and finally calls the parent class's attach method, which was covered above.
2.5 source=blank, dest=volume
    def attach(self, context, instance, volume_api, virt_driver,
               wait_func=None, do_check_attach=True):
        if not self.volume_id:
            vol_name = instance.uuid + '-blank-vol'
            av_zone = _get_volume_create_az_value(instance)
            vol = volume_api.create(context, self.volume_size, vol_name, '',
                                    availability_zone=av_zone)
            if wait_func:
                self._call_wait_func(context, wait_func, volume_api, vol['id'])
            self.volume_id = vol['id']

        super(DriverBlankBlockDevice, self).attach(
            context, instance, volume_api, virt_driver,
            do_check_attach=do_check_attach)
DriverBlankBlockDevice is a subclass of DriverVolumeBlockDevice. It calls volume_api.create to create a volume of the requested size and finally calls the parent class's attach method, which was covered above.
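Sections 2.3 to 2.5 share one pattern: if the mapping has no volume_id yet, create a volume from the given source, record its id, then defer to the parent attach. The sketch below shows that pattern with hypothetical stand-ins (VolumeAPIStub, VolumeBDM, ImageBDM); the argument lists are reduced and do not match nova's.

```python
class VolumeAPIStub(object):
    """Hypothetical stand-in for the cinder-facing volume API."""

    def __init__(self):
        self.created = []

    def create(self, size, **source):
        vol = {"id": "vol-%d" % (len(self.created) + 1),
               "size": size, "source": source}
        self.created.append(vol)
        return vol


class VolumeBDM(object):
    """Parent class: attaches a volume that already exists."""

    def __init__(self, volume_id=None, volume_size=None):
        self.volume_id = volume_id
        self.volume_size = volume_size
        self.attached = False

    def attach(self, volume_api):
        self.attached = True


class ImageBDM(VolumeBDM):
    """Child class: creates the volume first if there is none yet."""

    def __init__(self, image_id, **kwargs):
        super(ImageBDM, self).__init__(**kwargs)
        self.image_id = image_id

    def attach(self, volume_api):
        if not self.volume_id:
            vol = volume_api.create(self.volume_size, image_id=self.image_id)
            self.volume_id = vol["id"]
        # Hand over to the common attach logic in the parent class
        super(ImageBDM, self).attach(volume_api)


if __name__ == "__main__":
    api = VolumeAPIStub()
    bdm = ImageBDM("image-1", volume_size=1)
    bdm.attach(api)
    print(bdm.volume_id, bdm.attached)
```

Because the creation step is guarded by the volume_id check, calling attach a second time does not create another volume.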
2.6 source=image/blank, dest=local
    if 'disk.local' in disk_mapping:
        disk_image = image('disk.local')
        fn = functools.partial(self._create_ephemeral,
                               fs_label='ephemeral0',
                               os_type=instance.os_type,
                               is_block_dev=disk_image.is_block_dev)
        fname = "ephemeral_%s_%s" % (ephemeral_gb, file_extension)
        size = ephemeral_gb * units.Gi
        disk_image.cache(fetch_func=fn,
                         context=context,
                         filename=fname,
                         size=size,
                         ephemeral_size=ephemeral_gb)

    for idx, eph in enumerate(driver.block_device_info_get_ephemerals(
            block_device_info)):
        disk_image = image(blockinfo.get_eph_disk(idx))
        specified_fs = eph.get('guest_format')
        if specified_fs and not self.is_supported_fs_format(specified_fs):
            msg = _("%s format is not supported") % specified_fs
            raise exception.InvalidBDMFormat(details=msg)
        fn = functools.partial(self._create_ephemeral,
                               fs_label='ephemeral%d' % idx,
                               os_type=instance.os_type,
                               is_block_dev=disk_image.is_block_dev)
        size = eph['size'] * units.Gi
        fname = "ephemeral_%s_%s" % (eph['size'], file_extension)
        disk_image.cache(fetch_func=fn, context=context, filename=fname,
                         size=size, ephemeral_size=eph['size'],
                         specified_fs=specified_fs)
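This code lives in nova/virt/libvirt/driver.py. It checks whether disk.local is in disk_mapping and, if so, obtains disk_image and creates the image-backed disk locally. For blank devices created locally, it reads the separate local storage entries from block_device_info and creates the local storage from them. In both cases creation goes through disk_image.cache. disk_image is ultimately determined by image_type, whose possible values are <raw, flat, qcow2, lvm, rbd, ploop, default>. Each value maps to a class of the same name in nova/virt/libvirt/imagebackend.py, and all of these classes derive from Image, which implements the cache method. cache calls the fetch_func passed in, which here is fn. Stepping into fn, the disk is ultimately created on the command line with qemu-img create -f <disk_format> <path> <size>.
Local storage of the swap type is created in a similar way and is not described in detail here.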
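The role of fetch_func can be sketched as follows. This assumes, for illustration, that cache only invokes fetch_func when the target file does not exist yet; cache and create_ephemeral here are simplified hypothetical functions that write a small marker file instead of running qemu-img.

```python
import functools
import os
import tempfile

GiB = 1024 ** 3  # the value that units.Gi stands for


def cache(base_dir, filename, fetch_func, **kwargs):
    """Call fetch_func(target=..., **kwargs) only if the file is missing."""
    target = os.path.join(base_dir, filename)
    if not os.path.exists(target):
        fetch_func(target=target, **kwargs)
    return target


def create_ephemeral(target, ephemeral_size, fs_label):
    # Stand-in: writes a marker file where the real driver creates a disk
    with open(target, "w") as f:
        f.write("%s:%dG" % (fs_label, ephemeral_size))


if __name__ == "__main__":
    base_dir = tempfile.mkdtemp()
    ephemeral_gb = 1
    # Pre-bind the arguments that cache() does not know about
    fn = functools.partial(create_ephemeral, fs_label="ephemeral0")
    fname = "ephemeral_%s_%s" % (ephemeral_gb, "default")
    path = cache(base_dir, fname, fn, ephemeral_size=ephemeral_gb)
    print(path, ephemeral_gb * GiB)
```

Pre-binding with functools.partial lets cache stay generic: it only needs to supply target and whatever keyword arguments it was given.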
3. Code analysis of local device creation
3.1 Entry point:
    def spawn(self, context, instance, image_meta, injected_files,
              admin_password, network_info=None, block_device_info=None):
        disk_info = blockinfo.get_disk_info(CONF.libvirt.virt_type,
                                            instance,
                                            image_meta,
                                            block_device_info)
        self._create_image(context, instance,
                           disk_info['mapping'],
                           network_info=network_info,
                           block_device_info=block_device_info,
                           files=injected_files,
                           admin_pass=admin_password)
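self._create_image is the entry function for creating local devices. It receives instance, disk_info, network_info, block_device_info, injected_files, and admin_password. The arguments relevant to local devices are disk_info, which carries the system disk information, and block_device_info, which carries the block device information; _create_image uses both to create the corresponding devices.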
3.2 Creating the kernel
    if disk_images['kernel_id']:
        fname = imagecache.get_cache_fname(disk_images['kernel_id'])
        raw('kernel').cache(fetch_func=libvirt_utils.fetch_raw_image,
                            context=context,
                            filename=fname,
                            image_id=disk_images['kernel_id'])
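The values in disk_images come from instance. In raw('kernel').cache, raw picks the backend from the images_type value in the configuration file, which can be one of <lvm, ploop, qcow2, rbd, raw, flat>; each value maps to a class of the same name in nova/virt/libvirt/imagebackend.py. The cache call therefore invokes the cache method of that class. All of these classes derive from Image, which implements cache, and cache calls the fetch_func passed in.
Here fetch_func is libvirt_utils.fetch_raw_image. Stepping into it leads to the download method in nova/image/glance.py, which goes through self._get_transfer_module to the download method of the FileTransfer class in nova/image/download/file.py and finally calls lv_utils.copy_image. That method copies the disk image to an existing directory. It supports two transports, rsync and ssh, selected by the remote_filesystem_transport configuration option.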
3.3 Creating the ramdisk
    if disk_images['ramdisk_id']:
        fname = imagecache.get_cache_fname(disk_images['ramdisk_id'])
        raw('ramdisk').cache(fetch_func=libvirt_utils.fetch_raw_image,
                             context=context,
                             filename=fname,
                             image_id=disk_images['ramdisk_id'])
fetch_func is the same method as in the kernel case and is not repeated here.
3.4 Creating ephemeral disks
    if 'disk.local' in disk_mapping:
        disk_image = image('disk.local')
        fn = functools.partial(self._create_ephemeral,
                               fs_label='ephemeral0',
                               os_type=instance.os_type,
                               is_block_dev=disk_image.is_block_dev)
        fname = "ephemeral_%s_%s" % (ephemeral_gb, file_extension)
        size = ephemeral_gb * units.Gi
        disk_image.cache(fetch_func=fn,
                         context=context,
                         filename=fname,
                         size=size,
                         ephemeral_size=ephemeral_gb)

    for idx, eph in enumerate(driver.block_device_info_get_ephemerals(
            block_device_info)):
        disk_image = image(blockinfo.get_eph_disk(idx))
        specified_fs = eph.get('guest_format')
        if specified_fs and not self.is_supported_fs_format(specified_fs):
            msg = _("%s format is not supported") % specified_fs
            raise exception.InvalidBDMFormat(details=msg)
        fn = functools.partial(self._create_ephemeral,
                               fs_label='ephemeral%d' % idx,
                               os_type=instance.os_type,
                               is_block_dev=disk_image.is_block_dev)
        size = eph['size'] * units.Gi
        fname = "ephemeral_%s_%s" % (eph['size'], file_extension)
        disk_image.cache(fetch_func=fn, context=context, filename=fname,
                         size=size, ephemeral_size=eph['size'],
                         specified_fs=specified_fs)
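There are two kinds of ephemeral disk here. One is created as a system disk (disk.local), with its size determined by the VM's flavor; the other is created as a blank volume from the entries in block_device_info. Both were covered in detail in Section 2.6 above.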
3.5 Creating swap
    if 'disk.swap' in disk_mapping:
        mapping = disk_mapping['disk.swap']
        swap_mb = 0

        swap = driver.block_device_info_get_swap(block_device_info)
        if driver.swap_is_usable(swap):
            swap_mb = swap['swap_size']
        elif (inst_type['swap'] > 0 and
              not block_device.volume_in_mapping(
                  mapping['dev'], block_device_info)):
            swap_mb = inst_type['swap']

        if swap_mb > 0:
            size = swap_mb * units.Mi
            image('disk.swap').cache(fetch_func=self._create_swap,
                                     context=context,
                                     filename="swap_%s" % swap_mb,
                                     size=size,
                                     swap_mb=swap_mb)
Following the analysis above, self._create_swap, which is the fetch_func here, ultimately runs qemu-img create -f <disk_format> <path> <size> on the command line to create the disk, and then runs mkswap to format what was just created.
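The choice of swap size in the excerpt can be condensed into a small function. pick_swap_mb is a hypothetical name, and the usability check is reduced to "a swap entry exists and its swap_size is positive", which is an assumption and not the exact logic of driver.swap_is_usable.

```python
MiB = 1024 ** 2  # the value that units.Mi stands for


def pick_swap_mb(bdm_swap, flavor_swap_mb, swap_dev_is_volume):
    """Return the swap size in MB, or 0 when no swap disk is needed.

    bdm_swap: dict with a 'swap_size' key, or None (simplified assumption)
    flavor_swap_mb: swap size requested by the flavor
    swap_dev_is_volume: True if the swap device name is already taken
        by a volume in the block device mapping
    """
    if bdm_swap and bdm_swap.get("swap_size", 0) > 0:
        # A usable swap entry in the block device info takes precedence
        return bdm_swap["swap_size"]
    if flavor_swap_mb > 0 and not swap_dev_is_volume:
        # Otherwise fall back to the flavor's swap size
        return flavor_swap_mb
    return 0


if __name__ == "__main__":
    swap_mb = pick_swap_mb({"swap_size": 512}, 1024, False)
    print("swap_%s" % swap_mb, swap_mb * MiB)
```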
3.6 Creating the image
    def _create_and_inject_local_root(self, context, instance,
                                      booted_from_volume, suffix, disk_images,
                                      network_info, admin_pass, files,
                                      inject_files, fallback_from_host):
        # File injection only if needed
        need_inject = (not configdrive.required_by(instance) and
                       inject_files and CONF.libvirt.inject_partition != -2)

        # NOTE(ndipanov): Even if disk_mapping was passed in, which
        # currently happens only on rescue - we still don't want to
        # create a base image.
        if not booted_from_volume:
            root_fname = imagecache.get_cache_fname(disk_images['image_id'])
            size = instance.flavor.root_gb * units.Gi

            if size == 0 or suffix == '.rescue':
                size = None

            backend = self.image_backend.image(instance, 'disk' + suffix,
                                               CONF.libvirt.images_type)
            if instance.task_state == task_states.RESIZE_FINISH:
                backend.create_snap(libvirt_utils.RESIZE_SNAPSHOT_NAME)
            if backend.SUPPORTS_CLONE:
                def clone_fallback_to_fetch(*args, **kwargs):
                    try:
                        backend.clone(context, disk_images['image_id'])
                    except exception.ImageUnacceptable:
                        libvirt_utils.fetch_image(*args, **kwargs)
                fetch_func = clone_fallback_to_fetch
            else:
                fetch_func = libvirt_utils.fetch_image
            self._try_fetch_image_cache(backend, fetch_func, context,
                                        root_fname, disk_images['image_id'],
                                        instance, size, fallback_from_host)

            if need_inject:
                self._inject_data(backend, instance, network_info, admin_pass,
                                  files)
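This part is implemented by _create_and_inject_local_root. It first checks whether the VM boots from a volume. If not, it obtains the backend image type from the images_type option in the configuration file and then checks whether the backend supports cloning. If it does, the backend driver's clone method is called, which ends up in rbd.RBD.clone.
If cloning is not supported, or if clone raises ImageUnacceptable, libvirt_utils.fetch_image is executed. It first calls nova.image.api.download to download the image into the target directory, then parses the image information, checks whether the image is in raw format and converts it if not, and finally renames the directory.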
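The selection of fetch_func can be sketched in isolation. ImageUnacceptable, make_fetch_func, and the backend classes below are hypothetical stand-ins that mirror only the control flow of the excerpt: use clone when the backend supports it, and fall back to fetching when clone rejects the image.

```python
class ImageUnacceptable(Exception):
    """Stand-in for the exception a backend raises when it cannot clone."""


def make_fetch_func(backend, image_id, fetch_image):
    """Return the callable that will later be used to obtain the image."""
    if getattr(backend, "SUPPORTS_CLONE", False):
        def clone_fallback_to_fetch(*args, **kwargs):
            try:
                backend.clone(image_id)
            except ImageUnacceptable:
                # Cloning was refused, so download the image instead
                fetch_image(*args, **kwargs)
        return clone_fallback_to_fetch
    return fetch_image


class CloningBackend(object):
    SUPPORTS_CLONE = True

    def __init__(self, accept):
        self.accept = accept
        self.cloned = None

    def clone(self, image_id):
        if not self.accept:
            raise ImageUnacceptable(image_id)
        self.cloned = image_id


if __name__ == "__main__":
    fetched = []
    backend = CloningBackend(accept=False)
    fetch_func = make_fetch_func(backend, "image-1", fetched.append)
    fetch_func("/tmp/base-image")
    print(backend.cloned, fetched)
```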