The impact of introducing nova placement on scheduling (by quqi99)

Author: Zhang Hua  Posted on: 2020-09-17
Copyright: This article may be reproduced freely, but please credit the original source and author with a hyperlink and keep this copyright notice.

nova cell v2

nova cell v2 splits the nova DB into three databases (nova, nova_api and nova_cell0; instance data lives only in the cell that hosts the instance, while shared data lives in nova_api). Three tables in nova_api (nova_api.host_mappings, nova_api.instance_mappings and nova_api.cell_mappings) let you go straight from an instance to its cell_id and from there to that cell's DB and MQ information, so nova-api can operate on the cell's DB and MQ directly. This lets nova-compute scale horizontally to many more physical nodes, and the nova-api node no longer needs a nova-cells service: nova-api and nova-scheduler are enough.

mysql> pager less -S
PAGER set to 'less -S'
mysql> show tables;
mysql> select instance_uuid,cell_id from instance_mappings;
+--------------------------------------+---------+
| instance_uuid                        | cell_id |
+--------------------------------------+---------+
| 4039ed4e-d0a1-46ba-99a5-68bc84421b42 |       2 |

mysql> select instance_uuid,cell_id from nova_api.instance_mappings;

mysql> select * from host_mappings;
+---------------------+------------+----+---------+-------------------------------------+
| created_at          | updated_at | id | cell_id | host                                |
+---------------------+------------+----+---------+-------------------------------------+
| 2020-09-16 06:18:26 | NULL       |  1 |       2 | juju-3ba760-ceilometer-15.cloud.sts |

mysql> select transport_url,name,database_connection from cell_mappings;
+----------------------------------------------------------------------------------------------------------+-------+-----------------------------------------------------------------------------+
| transport_url                                                                                            | name  | database_connection                                                         |
+----------------------------------------------------------------------------------------------------------+-------+-----------------------------------------------------------------------------+
| none:///                                                                                                 | cell0 | mysql+pymysql://nova:4Hjrdj5yMTkG6V9nxNpqrfVdhtJ5Tnww@10.5.0.103/nova_cell0 |
| rabbit://nova:wSz5LjscfBqKnhVWKBZnrXdwS5Kz6TByz9jKfm2xKHbCRYPPSbcnqFwPTnCp8VpP@10.5.0.199:5672/openstack | cell1 | mysql+pymysql://nova:4Hjrdj5yMTkG6V9nxNpqrfVdhtJ5Tnww@10.5.0.103/nova       |
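
To make the lookup path concrete, here is a minimal Python sketch (an illustration only; it assumes python3-pymysql is installed and the credentials below are placeholders) that resolves an instance UUID to its cell's MQ and DB URLs through instance_mappings and cell_mappings:

import pymysql

# Placeholder credentials - replace with the real nova_api DB account.
conn = pymysql.connect(host='10.5.0.103', user='nova', password='***', database='nova_api')

def lookup_cell(instance_uuid):
    """Return (cell name, transport_url, database_connection) for an instance."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT cm.name, cm.transport_url, cm.database_connection"
            " FROM instance_mappings im JOIN cell_mappings cm ON im.cell_id = cm.id"
            " WHERE im.instance_uuid = %s", (instance_uuid,))
        return cur.fetchone()

print(lookup_cell('4039ed4e-d0a1-46ba-99a5-68bc84421b42'))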

So when you hit an error like this:

openstack server delete 2ebf1b2d-f679-4265-9c4b-71420dace71a
No server with a name or ID of 2ebf1b2d-f679-4265-9c4b-71420dace71a

Note: the following commands must be run on the nova-cloud-controller node.

sudo nova-manage cell_v2 list_cells
sudo nova-manage cell_v2 map_instances --cell_uuid <cell-id-from-above>
openstack server delete 2ebf1b2d-f679-4265-9c4b-71420dace71a

In addition:

sudo nova-manage cell_v2 discover_hosts --verbose

NOTE: after emptying the nova_api.host_mappings table, running 'nova-manage cell_v2 discover_hosts --verbose' on the ncc node keeps reporting 'Found 0 unmapped computes in cell'. To recover, first delete the service with 'openstack compute service delete', then restart nova-compute, and finally run 'nova-manage cell_v2 discover_hosts --verbose' again; this time it will report 'Found 1 unmapped computes in cell'.

nova placement API

The nova placement API was introduced in Newton; nova-scheduler calls placement-api during scheduling. It is mainly used to track the Inventory and Usage of Resource Providers (compute nodes, external storage pools, external IP allocation pools, etc.). Since Pike, the Placement API must be enabled to help the nova-scheduler service pick compute nodes, replacing the earlier RAMFilter, CoreFilter and DiskFilter. The main concepts are:

  • Resource Class: the kind of resource. The placement API implements three standard resource classes by default (DISK_GB, MEMORY_MB, VCPU) and also provides an interface for custom resource classes.
  • Resource Provider: the object that actually provides resources, e.g. a compute node or a storage pool.
  • Inventory: the inventory of resources owned by a resource provider, e.g. a compute node's vCPU, Disk and RAM inventories.
  • Resource Allocations: the allocation state, i.e. the mapping between Resource Class, Resource Provider and Consumer; it records how much of a given resource class a consumer has used.
  • Provider Aggregate: a resource aggregate, similar to the HostAggregate concept.
  • Traits: resource characteristics. Different resource providers may have different traits. A trait describes a characteristic of a resource provider; it cannot be consumed, but some workflows need this information, e.g. marking an available disk as an SSD helps the scheduler better match an instance boot request.
# Note: when deleting records from the compute_nodes table, also update the uuid field in the resource_providers table accordingly.
# Of course, the compute_nodes records don't have to be deleted; the resource-update thread should automatically refresh the dirty usage in them (e.g. pinned_cpus).
mysql> select * from placement.resource_providers;
+---------------------+---------------------+----+--------------------------------------+-------------------------------------+------------+----------+------------------+--------------------+
| created_at          | updated_at          | id | uuid                                 | name                                | generation | can_host | root_provider_id | parent_provider_id |
+---------------------+---------------------+----+--------------------------------------+-------------------------------------+------------+----------+------------------+--------------------+
| 2020-09-16 06:18:15 | 2020-09-16 10:19:33 |  1 | a7081054-ee03-44b8-ae21-f20e0535cfc1 | juju-3ba760-ceilometer-15.cloud.sts |         19 |     NULL |                1 |               NULL |

# for the field resource_class_id, 0 means VCPU, 1 means MEMORY_MB, 2 means DISK_GB
mysql> select * from placement.inventories;
+---------------------+------------+----+----------------------+-------------------+-------+----------+----------+----------+-----------+------------------+
| created_at          | updated_at | id | resource_provider_id | resource_class_id | total | reserved | min_unit | max_unit | step_size | allocation_ratio |
+---------------------+------------+----+----------------------+-------------------+-------+----------+----------+----------+-----------+------------------+
| 2020-09-16 06:18:15 | NULL       |  1 |                    1 |                 0 |     2 |        0 |        1 |        2 |         1 |               16 |
| 2020-09-16 06:18:15 | NULL       |  2 |                    1 |                 1 |  3944 |      512 |        1 |     3944 |         1 |              1.5 |
| 2020-09-16 06:18:15 | NULL       |  3 |                    1 |                 2 |    38 |        0 |        1 |       38 |         1 |                1 |
mysql> select * from placement.allocations;
+---------------------+------------+----+----------------------+--------------------------------------+-------------------+------+
| created_at          | updated_at | id | resource_provider_id | consumer_id                          | resource_class_id | used |
+---------------------+------------+----+----------------------+--------------------------------------+-------------------+------+
| 2020-09-16 08:45:44 | NULL       | 16 |                    1 | 64cb10fd-246f-4864-b06d-687d59c47c2c |                 2 |    1 |
| 2020-09-16 08:45:44 | NULL       | 17 |                    1 | 64cb10fd-246f-4864-b06d-687d59c47c2c |                 1 |   64 |
| 2020-09-16 08:45:44 | NULL       | 18 |                    1 | 64cb10fd-246f-4864-b06d-687d59c47c2c |                 0 |    1 |

Placement CLI

sudo apt install python3-osc-placement -y

$ openstack resource provider list
+--------------------------------------+-------------------------------------+------------+
| uuid                                 | name                                | generation |
+--------------------------------------+-------------------------------------+------------+
| a7081054-ee03-44b8-ae21-f20e0535cfc1 | juju-3ba760-ceilometer-15.cloud.sts |         19 |
+--------------------------------------+-------------------------------------+------------+

$  openstack resource provider inventory list a7081054-ee03-44b8-ae21-f20e0535cfc1
+----------------+------------------+----------+----------+----------+-----------+-------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total |
+----------------+------------------+----------+----------+----------+-----------+-------+
| VCPU           |             16.0 |        1 |        2 |        0 |         1 |     2 |
| MEMORY_MB      |              1.5 |        1 |     3944 |      512 |         1 |  3944 |
| DISK_GB        |              1.0 |        1 |       38 |        0 |         1 |    38 |
+----------------+------------------+----------+----------+----------+-----------+-------+

$ openstack resource provider usage show a7081054-ee03-44b8-ae21-f20e0535cfc1
+----------------+-------+
| resource_class | usage |
+----------------+-------+
| VCPU           |     3 |
| MEMORY_MB      |   192 |
| DISK_GB        |     3 |
+----------------+-------+
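
For reference, placement treats an inventory as able to satisfy a request when (total - reserved) * allocation_ratio - used >= requested, and the requested amount also has to respect min_unit/max_unit/step_size. A minimal sketch using the VCPU numbers above (not placement's actual code, just the same arithmetic):

def can_fit(total, reserved, allocation_ratio, used, requested,
            min_unit=1, max_unit=None, step_size=1):
    """Mirror placement's capacity check for a single inventory record."""
    capacity = (total - reserved) * allocation_ratio
    unit_ok = (requested >= min_unit and
               (max_unit is None or requested <= max_unit) and
               requested % step_size == 0)
    return unit_ok and used + requested <= capacity

# VCPU inventory above: total=2, reserved=0, allocation_ratio=16.0, used=3, max_unit=2
print(can_fit(2, 0, 16.0, 3, requested=1, max_unit=2))   # True: 3 + 1 <= 32
print(can_fit(2, 0, 16.0, 3, requested=3, max_unit=2))   # False: 3 > max_unit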

More on Traits

If Inventory and Allocation help a ResourceProvider manage quantities, then traits help manage characteristics. For example, a user may need to attach an 80G disk (quantity) to an instance but also require it to be an SSD (characteristic), so the StorageResourceProvider needs to be marked as SSD or not. Traits are similar to tags (https://github.com/openstack/os-traits).
The resource_provider_traits table therefore links the resource_providers table with traits:
mysql> select * from resource_provider_traits where resource_provider_id = 2;
+---------------------+------------+----------+----------------------+
| created_at          | updated_at | trait_id | resource_provider_id |
+---------------------+------------+----------+----------------------+
| 2020-10-26 12:30:09 | NULL       |       59 |                    2 |

mysql> select * from traits;
+---------------------+------------+-----+---------------------------------------+
| created_at          | updated_at | id  | name                                  |
+---------------------+------------+-----+---------------------------------------+
| 2020-10-26 12:27:34 | NULL       |  59 | COMPUTE_DEVICE_TAGGING                |

How is this used?
1, The cloud deployer creates an aggregate representing all the compute nodes in row 1, racks 6 through 10:
AGG_UUID=`openstack aggregate create r1rck0610`
# for all compute nodes in the system that are in racks 6-10 in row 1...
openstack aggregate add host $AGG_UUID $HOSTNAME

2, The cloud deployer creates a ResourceProvider representing the NFS share:
RP_UUID=`openstack resource-provider create "/mnt/nfs/row1racks0610/" \
    --aggregate-uuid=$AGG_UUID`

3, The cloud deployer updates the resource provider’s capacity of shared disk:
openstack resource-provider set inventory $RP_UUID \
    --resource-class=DISK_GB \
    --total=100000 --reserved=1000 \
    --min-unit=50 --max-unit=10000 --step-size=10 \
    --allocation-ratio=1.0

4, The cloud deployer adds the STORAGE_SSD trait
openstack resource-provider trait add $RP_UUID STORAGE_SSD

5, Scheduling based on traits - https://docs.openstack.org/ironic/queens/install/configure-nova-flavors.html
openstack --os-baremetal-api-version 1.37 baremetal node add trait \
  $NODE_UUID CUSTOM_TRAIT1 HW_CPU_X86_VMX
nova flavor-key my-baremetal-flavor set trait:CUSTOM_TRAIT1=required
nova flavor-key my-baremetal-flavor set trait:HW_CPU_X86_VMX=required
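
For a flavor like this, nova-scheduler ends up asking placement for allocation candidates with the required traits passed along. A minimal sketch of that request (the endpoint, token and resource amounts below are placeholders; the required= parameter needs placement microversion 1.17 or later):

import requests

PLACEMENT = 'http://placement.example.com:8778'        # placeholder endpoint
HEADERS = {'X-Auth-Token': '<token>',                   # placeholder token
           'OpenStack-API-Version': 'placement 1.17'}

params = {
    'resources': 'VCPU:1,MEMORY_MB:512,DISK_GB:10',
    'required': 'CUSTOM_TRAIT1,HW_CPU_X86_VMX',
}
resp = requests.get(PLACEMENT + '/allocation_candidates', params=params, headers=HEADERS)
print(list(resp.json().get('provider_summaries', {})))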

one bug

For example, in the scenario described in https://bugs.launchpad.net/nova/+bug/1679750, an instance is created on hostA and hostA then dies. Deleting the instance at that point cannot really delete it (nova-compute is dead, after all), so the corresponding records in the allocations table are left behind. When hostA comes back up, nova-compute's init_host -> _complete_partial_deletion path, together with the chain below, is what cleans things up:

pre_start_hook -> update_available_resource -> nova/compute/manager.py#update_available_resource_for_node -> update_available_resource -> _update_available_resource -> _remove_deleted_instances_allocations

When a nova-compute is removed, deleting the service and compute_nodes records does not delete the placement resource provider or the host mapping records.
Since nova-compute is dead it cannot clean up after itself, so the fix makes nova-api also delete the records in the allocations table when it deletes instances - https://review.opendev.org/#/c/580498/

another bug

Another bug: a customer performed routine maintenance on a machine (shut it down and brought it back up) and then reported that an instance previously on that host had been rebuilt on a remote machine, asking for the root cause. The following logs were found on the nova-api side, so the local-delete mechanism was clearly triggered:

var/log/nova/nova-api-os-compute.log:2022-06-09 00:06:13.699 68432 WARNING nova.compute.api [req-608d8025-c117-4dc6-9e0f-d2ee1da3f74e bf1b9f5bb95d409ca23a9ce477e94145 d730ab6a0a334ccea1577cd6b725a82d - f1f0b64a74f9408b8ba3506e6f4f6e67 f1f0b64a74f9408b8ba3506e6f4f6e67] [instance: 65472e49-a426-4b0c-8ed4-781c52f68d3d] instance's host llw-nfvi-az1-sv-com-02 is down, deleting from database
var/log/nova/nova-api-os-compute.log:2022-06-09 00:06:17.413 68432 INFO nova.scheduler.client.report [req-608d8025-c117-4dc6-9e0f-d2ee1da3f74e bf1b9f5bb95d409ca23a9ce477e94145 d730ab6a0a334ccea1577cd6b725a82d - f1f0b64a74f9408b8ba3506e6f4f6e67 f1f0b64a74f9408b8ba3506e6f4f6e67] Deleted allocation for instance 65472e49-a426-4b0c-8ed4-781c52f68d3d
var/log/nova/nova-api-os-compute.log:2022-06-09 00:06:17.498 68432 INFO nova.osapi_compute.wsgi.server [req-608d8025-c117-4dc6-9e0f-d2ee1da3f74e bf1b9f5bb95d409ca23a9ce477e94145 d730ab6a0a334ccea1577cd6b725a82d - f1f0b64a74f9408b8ba3506e6f4f6e67 f1f0b64a74f9408b8ba3506e6f4f6e67] 10.252.17.65,127.0.0.1 "DELETE /v2.1/d730ab6a0a334ccea1577cd6b725a82d/servers/65472e49-a426-4b0c-8ed4-781c52f68d3d HTTP/1.1" status: 204 len: 405 time: 3.9009511

Based on the code analysis below, local_delete appears to be triggered only when someone (or an API caller) deletes an instance:

In _delete, nova-api has a local_delete mechanism: based on heartbeats and service_down_time it decides whether the nova-compute service is DOWN (either nova-compute has died, or it is alive but not functional, which the heartbeat/service_down_time mechanism can also detect). If is_local_delete=True and the cell is not empty, _local_delete is called.
Only soft_delete, _delete_instance and delete call _delete, i.e. _delete is only invoked when an instance is being deleted. So local_delete is only triggered when someone explicitly deletes an instance and the is_local_delete=True condition is met; see also: https://bugs.launchpad.net/nova/+bug/1679750

In other words, deleting an instance mainly goes down one of two paths (instances in vm_states.SHELVED or vm_states.SHELVED_OFFLOADED are handled differently); see the sketch below:
is_local_delete = True  -> local_delete()
is_local_delete = False -> compute_rpcapi.terminate_instance()
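
A simplified sketch of that decision (illustration only, not the exact nova code; the parameter names are made up):

# Roughly what nova/compute/api.py:_delete() decides, per the analysis above.
def delete_instance(vm_state, compute_service_is_up, cell_is_known):
    if vm_state in ('shelved', 'shelved_offloaded'):
        return 'handled by the shelved-specific path'
    is_local_delete = not compute_service_is_up   # judged via heartbeat + service_down_time
    if is_local_delete and cell_is_known:
        return '_local_delete(): API side removes DB records (and allocations)'
    return 'compute_rpcapi.terminate_instance(): ask nova-compute to do the delete'

print(delete_instance('active', compute_service_is_up=False, cell_is_known=True))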

In addition, nova-compute has logs like these:

./nova-compute.log:2022-06-08 23:01:08.651 19391 INFO nova.virt.libvirt.driver [req-d920209e-a181-41c9-8e57-1ada3850b81b 6e4b921233de47499c204ad695d893e5 f4d010bb96704bd7891351697103d4f5 - a473f51a33534c0fbf1febb216be04ba a473f51a33534c0fbf1febb216be04ba] [instance: 65472e49-a426-4b0c-8ed4-781c52f68d3d] Instance shutdown successfully after 23 seconds.
./nova-compute.log:2022-06-08 23:01:08.653 19391 INFO nova.virt.libvirt.driver [-] [instance: 65472e49-a426-4b0c-8ed4-781c52f68d3d] Instance destroyed successfully.
./nova-compute.log:2022-06-08 23:01:23.299 19391 INFO nova.compute.manager [-] [instance: 65472e49-a426-4b0c-8ed4-781c52f68d3d] VM Stopped (Lifecycle Event)
./nova-compute.log:2022-06-09 00:10:12.120 4572 ERROR oslo_messaging.rpc.server [req-681cc4a6-cad6-480b-bcba-392eed949412 bf1b9f5bb95d409ca23a9ce477e94145 d730ab6a0a334ccea1577cd6b725a82d - f1f0b64a74f9408b8ba3506e6f4f6e67 f1f0b64a74f9408b8ba3506e6f4f6e67] Exception during message handling: InstanceNotFound_Remote: Instance 65472e49-a426-4b0c-8ed4-781c52f68d3d could not be found.
./nova-compute.log:InstanceNotFound: Instance 65472e49-a426-4b0c-8ed4-781c52f68d3d could not be found.
./nova-compute.log:2022-06-09 00:10:12.120 4572 ERROR oslo_messaging.rpc.server InstanceNotFound_Remote: Instance 65472e49-a426-4b0c-8ed4-781c52f68d3d could not be found.
./nova-compute.log:2022-06-09 00:10:12.120 4572 ERROR oslo_messaging.rpc.server InstanceNotFound: Instance 65472e49-a426-4b0c-8ed4-781c52f68d3d could not be found.
./nova-compute.log:2022-06-09 00:15:03.899 4572 INFO nova.virt.libvirt.driver [req-7b2fd0cd-a807-425b-9df1-ed4a888d495b - - - - -] [instance: 65472e49-a426-4b0c-8ed4-781c52f68d3d] Deleting instance files /var/lib/nova/instances/65472e49-a426-4b0c-8ed4-781c52f68d3d_del
./nova-compute.log:2022-06-09 00:15:03.902 4572 INFO nova.virt.libvirt.driver [req-7b2fd0cd-a807-425b-9df1-ed4a888d495b - - - - -] [instance: 65472e49-a426-4b0c-8ed4-781c52f68d3d] Deletion of /var/lib/nova/instances/65472e49-a426-4b0c-8ed4-781c52f68d3d_del complete
./nova-compute.log:2022-06-09 00:15:05.010 4572 WARNING nova.virt.libvirt.driver [req-7b2fd0cd-a807-425b-9df1-ed4a888d495b - - - - -] Periodic task is updating the host stat, it is trying to get disk instance-00000116, but disk file was removed by concurrent operations such as resize.: OSError: [Errno 2] No such file or directory: '/var/lib/nova/instances/65472e49-a426-4b0c-8ed4-781c52f68d3d/disk.config'

The timeline is as follows:

  • 2022-06-08 23:01:23, the instance was stopped - presumably when the host was shut down for maintenance
  • 2022-06-09 00:06:17, nova-api deleted the allocations for the instance - https://review.opendev.org/c/openstack/nova/+/580498/1/nova/compute/api.py#2107
  • 2022-06-09 00:10:04, nova-compute was restarted - 2022-06-09 00:10:04.220 4572 INFO nova.service [-] Starting compute node (version 17.0.13) - presumably when the host was booted again after maintenance
  • 2022-06-09 00:10:12, nova-compute reported 'InstanceNotFound_Remote' because the following path was triggered:
pre_start_hook -> update_available_resource -> nova/compute/manager.py#update_available_resource_for_node -> update_available_resource -> _update_available_resource -> _remove_deleted_instances_allocations
  • 2022-06-09 00:15:03, nova-compute started to delete the instance files /var/lib/nova/instances/65472e49-a426-4b0c-8ed4-781c52f68d3d_del

So what exactly triggered the local_delete? nova-api has no periodic thread that keeps polling whether a host is down; the host-down check that triggers local_delete only happens when an instance is being deleted.

debug

NOTE: the tables resource_providers, inventories and allocations are in the placement DB rather than in nova_api
select * from nova_api.host_mappings;
select * from nova_api.cell_mappings;
select * from placement.resource_providers where name like '%xxx%';
select * from nova.compute_nodes where host like '%bagon%' or hypervisor_hostname like '%xxx%';
select * from placement.inventories where resource_provider_id in (select id from placement.resource_providers where name like '%xxx%');
select * from placement.allocations where resource_provider_id in (select id from placement.resource_providers where name like '%xxx%') order by consumer_id,resource_provider_id,resource_class_id;
select uuid, host, node, vcpus, memory_mb, vm_state, power_state, task_state, root_gb, ephemeral_gb, cell_name,deleted from nova.instances where uuid in (select consumer_id from placement.allocations where resource_provider_id in (select id from placement.resource_providers where name like '%xxx%')) order by uuid;

20201230 update - another bug

"select numa_topology from nova.compute_nodes where hypervisor_hostname=‘cloud3.xxx.com’\G"显示cell0上的pinned_cpus将所有CPU全用完了导致nova-schedule无法继续调度报“Filter NUMATopologyFilter returned 0 hosts"这种错。
The code analysis below shows that the periodic update_available_resource task should normally be able to correct such stale database records automatically.

pre_start_hook -> update_available_resource -> _update_available_resource -> _update_usage_from_instances -> _update_usage_from_instance -> _update_usage -> numa_usage_from_instance_numa

host_cell.pinned_cpus from the database is used as the initial value of pinned_cpus. Note in particular that host_cell.pinned_cpus is not taken directly from the database: by running the self._copy_resources(cn, resources) method below, host_cell.pinned_cpus effectively always starts out empty.

def _init_compute_node(self, context, resources):
    ...
    if nodename in self.compute_nodes:
        cn = self.compute_nodes[nodename]
        self._copy_resources(cn, resources)
        self._setup_pci_tracker(context, cn, resources)
        return False

It decides whether to add CPUs to or remove them from pinned_cpus based on the free flag.

./nova/virt/hardware.py#numa_usage_from_instance_numa
def numa_usage_from_instance_numa(host_topology, instance_topology, free=False):
    ...
    for host_cell in host_topology.cells:
        new_cell = objects.NUMACell(
            id=host_cell.id,
            cpuset=shared_cpus,
            pcpuset=dedicated_cpus,
            memory=host_cell.memory,
            cpu_usage=0,
            memory_usage=0,
            mempages=host_cell.mempages,
            pinned_cpus=host_cell.pinned_cpus,
            siblings=host_cell.siblings)
        ...
        if free:
            if (instance_cell.cpu_thread_policy ==
                    fields.CPUThreadAllocationPolicy.ISOLATE):
                new_cell.unpin_cpus_with_siblings(pinned_cpus)
            else:
                new_cell.unpin_cpus(pinned_cpus)

free is determined by "free = sign == -1" (look carefully: the right-hand side is a double-equals comparison, the left-hand side is a single-equals assignment).

def _update_usage(self, usage, nodename, sign=1):
    ...
    free = sign == -1
    cn.numa_topology = hardware.numa_usage_from_instance_numa(
        host_numa_topology, instance_numa_topology, free)._to_json()

def _update_usage_from_instance():
    is_new_instance = uuid not in self.tracked_instances
    is_removed_instance = not is_new_instance and (is_removed or
        instance['vm_state'] in vm_states.ALLOW_RESOURCE_REMOVAL)
    if is_new_instance:
        self.tracked_instances.add(uuid)
        sign = 1
    if is_removed_instance:
        self.tracked_instances.remove(uuid)
        sign = -1
    ...
    self._update_usage(self._get_usage_dict(instance, instance), nodename, sign=sign)

So as long as update_available_resource runs, the dirty records must get corrected. The fact that they were not corrected means update_available_resource had not been running. The logs showed that placement was being accessed through an endpoint starting with http rather than https, which made the placement API unavailable; update_available_resource then failed when calling placement, so it had not run again since 2020-10-26. See https://bugs.launchpad.net/charm-nova-compute/+bug/1826382 for details.

2020-10-26 15:43:34.459 1393 WARNING keystoneauth.discover [req-5dcdc394-2784-40d2-984c-54fe261f36f0 - - - - -] Failed to contact the endpoint at http://placement-int.xxx.com:8778 for discovery. Fallback to using that endpoint as the base url.
2020-10-26 15:43:34.463 1393 ERROR nova.compute.manager [req-5dcdc394-2784-40d2-984c-54fe261f36f0 - - - - -] Could not retrieve compute node resource provider 8bd4062b-84c7-4aab-ade7-31dc01695878 and therefore unable to error out any instances stuck in BUILDING state. Error: Failed to retrieve allocations for resource provider 8bd4062b-84c7-4aab-ade7-31dc01695878:

For setting up a NUMA test environment see https://blog.csdn.net/quqi99/article/details/51993512. Note that defining isolcpus in grub does not stop nova from using those CPUs; nova has vcpu_pin_set specifically for that purpose.

Another related one: https://zhhuabj.blog.csdn.net/article/details/50988089

20211214 update - FQDN hostname test

Note: the root cause turned out to be that after deleting the stale resource-provider record, nova-compute was not restarted, so 'openstack compute service list' still showed the old host record. Also remember to restart neutron-openvswitch-agent, which can be confirmed with 'openstack network agent list'.

1, delete the nova.services and nova.compute_nodes records

delete from nova.services where host='juju-6ae090-focal2-9.cloud.sts';
delete from nova.compute_nodes where hypervisor_hostname='juju-6ae090-focal2-9.cloud.sts';

2, The records in the two tables nova.services and nova.compute_nodes will be recreated automatically after nova-compute restarts

juju ssh nova-compute/1 -- sudo systemctl restart nova-compute

3, run the 'discover_hosts' command on the nova-cloud-controller unit

root@juju-6ae090-focal2-8:/home/ubuntu# nova-manage cell_v2 discover_hosts --verbose
Found 2 cell mappings.
Skipping cell0 since it does not contain hosts.
Getting computes from cell 'cell1': 4473067d-4c91-459f-93ae-79cb1e1203c7
/usr/lib/python3/dist-packages/pymysql/cursors.py:170: Warning: (3719, "'utf8' is currently an alias for the character set UTF8MB3, but will be an alias for UTF8MB4 in a future release. Please consider using UTF8MB4 in order to be unambiguous.")
  result = self._query(query)
Checking host mapping for compute host 'juju-6ae090-focal2-9.cloud.sts': 1a4600d5-f889-429e-bb67-00498f3166ab
Found 0 unmapped computes in cell: 4473067d-4c91-459f-93ae-79cb1e1203c7

4, check nova-compute service is there, and juju-6ae090-focal2-9.cloud.sts is in cell1

$ openstack compute service list |grep nova-compute
|  9 | nova-compute   | juju-6ae090-focal2-9.cloud.sts | nova     | enabled | up    | 2021-12-14T07:47:23.000000 |

# nova-manage cell_v2 list_hosts 
+-----------+--------------------------------------+--------------------------------+
| Cell Name |              Cell UUID               |            Hostname            |
+-----------+--------------------------------------+--------------------------------+
|   cell1   | 4473067d-4c91-459f-93ae-79cb1e1203c7 | juju-6ae090-focal2-9.cloud.sts |
+-----------+--------------------------------------+--------------------------------+

5, create an instance for the test, then I hit a ResourceProviderCreationFailed exception in nova-compute.log

2021-12-14 07:50:41.044 24622 ERROR nova.compute.manager [req-b7b0a884-3e07-430d-bd1d-7933cf06befb - - - - -] Error updating resources for node juju-6ae090-focal2-9.cloud.sts.: nova.exception.ResourceProviderCreationFailed: Failed to create resource provider juju-6ae090-focal2-9.cloud.sts
...
2021-12-14 07:50:41.044 24622 ERROR nova.compute.manager nova.exception.ResourceProviderCreationFailed: Failed to create resource provider juju-6ae090-focal2-9.cloud.sts

6, It looks like the uuid in placement.resource_providers conflicts with the one in nova.compute_nodes

mysql> select id,uuid,name from placement.resource_providers where name='juju-6ae090-focal2-9.cloud.sts';
+----+--------------------------------------+--------------------------------+
| id | uuid                                 | name                           |
+----+--------------------------------------+--------------------------------+
|  2 | cd83ab34-5407-4c88-98bb-60afc100abbf | juju-6ae090-focal2-9.cloud.sts |
+----+--------------------------------------+--------------------------------+
mysql> select id, hypervisor_hostname, host_ip, host, uuid from nova.compute_nodes where host='juju-6ae090-focal2-9.cloud.sts';
+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| id | hypervisor_hostname            | host_ip   | host                           | uuid                                 |
+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
|  3 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9.cloud.sts | 1a4600d5-f889-429e-bb67-00498f3166ab |
+----+--------------------------------+-----------+--------------------------------+--------------------------------------+

so modify it.

update placement.resource_providers set uuid='1a4600d5-f889-429e-bb67-00498f3166ab' where name='juju-6ae090-focal2-9.cloud.sts';

7, then it works again.

Related debugging commands:
select id, hypervisor_hostname, host_ip, host, uuid from nova.compute_nodes;
select * from nova_api.host_mappings;
openstack host list
openstack compute service list
openstack server show <id>

Two other tests:

# TEST 4

1, create a dead record with 'openstack compute service delete 9', and also change its host from 'juju-6ae090-focal2-9.cloud.sts' to 'juju-6ae090-focal2-9' with 'update nova.compute_nodes set host='juju-6ae090-focal2-9' where id=3;'

mysql> select deleted_at, id, hypervisor_hostname, host_ip, host, uuid  from nova.compute_nodes where host like 'juju-6ae090-focal2-9%';
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| deleted_at          | id | hypervisor_hostname            | host_ip   | host                           | uuid                                 |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| 2021-12-14 09:21:26 |  3 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9 | 1a4600d5-f889-429e-bb67-00498f3166ab |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+

2, restart nova-compute service to recreate nova.compute_nodes record automatically by 'juju ssh nova-compute/1 -- sudo systemctl restart nova-compute'

mysql> select deleted_at, id, hypervisor_hostname, host_ip, host, uuid  from nova.compute_nodes where host like 'juju-6ae090-focal2-9%';
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| deleted_at          | id | hypervisor_hostname            | host_ip   | host                           | uuid                                 |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| NULL                |  4 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9.cloud.sts | 5ed16f18-f83a-4109-a8ef-74b8e5d81218 |
| 2021-12-14 09:21:26 |  3 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9 | 1a4600d5-f889-429e-bb67-00498f3166ab |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+

3, double confirm the uuids in placement.resource_providers and nova.compute_nodes are the same (both are 5ed16f18-f83a-4109-a8ef-74b8e5d81218).

mysql> select id,uuid,name from placement.resource_providers;
+----+--------------------------------------+--------------------------------+
| id | uuid                                 | name                           |
+----+--------------------------------------+--------------------------------+
| 52 | 5ed16f18-f83a-4109-a8ef-74b8e5d81218 | juju-6ae090-focal2-9.cloud.sts |
+----+--------------------------------------+--------------------------------+

4, double confirm nova-compute service is there

$ openstack compute service list |grep nova-compute
| 10 | nova-compute   | juju-6ae090-focal2-9.cloud.sts | nova     | enabled | up    | 2021-12-14T09:36:36.000000 |

5, it works


# TEST 5

1, Let's change the host values as follows:

update nova.compute_nodes set host='juju-6ae090-focal2-9.cloud.sts' where id=3;
update nova.compute_nodes set host='juju-6ae090-focal2-9' where id=4;

mysql> select deleted_at, id, hypervisor_hostname, host_ip, host, uuid  from nova.compute_nodes where host like 'juju-6ae090-focal2-9%';
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| deleted_at          | id | hypervisor_hostname            | host_ip   | host                           | uuid                                 |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| NULL                |  4 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9           | 5ed16f18-f83a-4109-a8ef-74b8e5d81218 |
| 2021-12-14 09:21:26 |  3 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9.cloud.sts | 1a4600d5-f889-429e-bb67-00498f3166ab |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+

2, change host to juju-6ae090-focal2-9 in nova_api.host_mappings as well

update nova_api.host_mappings set host='juju-6ae090-focal2-9' where host='juju-6ae090-focal2-9.cloud.sts';
mysql> select * from nova_api.host_mappings;
+---------------------+------------+----+---------+----------------------+
| created_at          | updated_at | id | cell_id | host                 |
+---------------------+------------+----+---------+----------------------+
| 2021-12-14 09:28:02 | NULL       |  2 |       2 | juju-6ae090-focal2-9 |
+---------------------+------------+----+---------+----------------------+

3, but 'openstack compute service list' still shows juju-6ae090-focal2-9.cloud.sts

$ openstack compute service list |grep nova-compute
| 10 | nova-compute   | juju-6ae090-focal2-9.cloud.sts | nova     | enabled | up    | 2021-12-14T09:51:26.000000 |

4, so it didn't work

5, change host to juju-6ae090-focal2-9 in both nova.conf and neutron.conf, then restart nova-compute and neutron-openvswitch-agent; now we have two nova-compute services.

# grep -r 'juju-6ae090-focal2-9' /etc/nova/
/etc/nova/nova.conf:host = juju-6ae090-focal2-9
# grep -r 'juju-6ae090-focal2-9' /etc/neutron/
/etc/neutron/neutron.conf:host = juju-6ae090-focal2-9

$ openstack compute service list |grep nova-compute
| 10 | nova-compute   | juju-6ae090-focal2-9.cloud.sts | nova     | enabled | down  | 2021-12-14T10:07:45.000000 |
| 11 | nova-compute   | juju-6ae090-focal2-9           | nova     | enabled | up    | 2021-12-14T10:09:08.000000 |

The uuid is 5ed16f18-f83a-4109-a8ef-74b8e5d81218, so the row with id=4 in nova.compute_nodes will be used.

mysql> select * from placement.resource_providers;
+---------------------+---------------------+----+--------------------------------------+--------------------------------+------------+------------------+--------------------+
| created_at          | updated_at          | id | uuid                                 | name                           | generation | root_provider_id | parent_provider_id |
+---------------------+---------------------+----+--------------------------------------+--------------------------------+------------+------------------+--------------------+
| 2021-12-14 09:28:01 | 2021-12-14 10:11:18 | 52 | 5ed16f18-f83a-4109-a8ef-74b8e5d81218 | juju-6ae090-focal2-9.cloud.sts |          8 |               52 |               NULL |
+---------------------+---------------------+----+--------------------------------------+--------------------------------+------------+------------------+--------------------+
mysql> select deleted_at, id, hypervisor_hostname, host_ip, host, uuid  from nova.compute_nodes where host like 'juju-6ae090-focal2-9%';
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| deleted_at          | id | hypervisor_hostname            | host_ip   | host                           | uuid                                 |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| NULL                |  4 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9           | 5ed16f18-f83a-4109-a8ef-74b8e5d81218 |
| NULL                |  5 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9.cloud.sts | d9898d66-182c-4494-93a2-95e656fc1001 |
| 2021-12-14 09:21:26 |  3 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9.cloud.sts | 1a4600d5-f889-429e-bb67-00498f3166ab |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+

6, it works as well.

So:

  • Both nova.conf and neutron.conf contain a host setting, e.g. host=juju-6ae090-focal2-9
  • When an instance is created, its cell id is found via instance_mappings, and from that the MQ and DB addresses are found via cell_mappings (run 'nova-manage cell_v2 list_cells' on the ncc unit to see the cell information). Separately, the host is found via host_mappings (running 'nova-manage cell_v2 discover_hosts --verbose' on the ncc unit kept reporting: Found 0 unmapped computes in cell)
mysql> select instance_uuid,cell_id from nova_api.instance_mappings;
+--------------------------------------+---------+
| instance_uuid                        | cell_id |
+--------------------------------------+---------+
| 0be2dce7-c85b-4354-82ea-c08524382565 |       2 |
mysql> select * from nova_api.host_mappings;
+---------------------+------------+----+---------+----------------------+
| created_at          | updated_at | id | cell_id | host                 |
+---------------------+------------+----+---------+----------------------+
| 2021-12-14 11:05:45 | NULL       |  4 |       2 | juju-6ae090-focal2-9 |
+---------------------+------------+----+---------+----------------------+
  • Don't forget the service:
$ openstack compute service list |grep nova-compute
| 10 | nova-compute   | juju-6ae090-focal2-9.cloud.sts | nova     | enabled | down  | 2021-12-14T10:07:45.000000 |
| 12 | nova-compute   | juju-6ae090-focal2-9           | nova     | enabled | up    | 2021-12-14T11:10:31.000000 |

mysql> select * from nova.services where host like 'juju-6ae090-focal2-9%';
+---------------------+---------------------+---------------------+----+--------------------------------+--------------+---------+--------------+----------+---------+--------------->
| created_at          | updated_at          | deleted_at          | id | host                           | binary       | topic   | report_count | disabled | deleted | disabled_reaso>
+---------------------+---------------------+---------------------+----+--------------------------------+--------------+---------+--------------+----------+---------+--------------->
| 2021-12-14 11:05:33 | 2021-12-14 11:12:30 | NULL                | 12 | juju-6ae090-focal2-9           | nova-compute | compute |           42 |        0 |       0 | NULL          >
| 2021-12-14 10:08:02 | 2021-12-14 11:04:46 | 2021-12-14 11:04:48 | 11 | juju-6ae090-focal2-9           | nova-compute | compute |          339 |        0 |      11 | NULL          >
| 2021-12-14 09:28:00 | 2021-12-14 10:07:45 | NULL                | 10 | juju-6ae090-focal2-9.cloud.sts | nova-compute | compute |          238 |        0 |       0 | NULL          >
| 2021-12-14 07:44:38 | 2021-12-14 09:21:17 | 2021-12-14 09:21:26 |  9 | juju-6ae090-focal2-9.cloud.sts | nova-compute | compute |          580 |        0 |       9 | NULL          >
+---------------------+---------------------+---------------------+----+--------------------------------+--------------+---------+--------------+----------+---------+--------------->
  • compute_nodes usually goes together with services; compute_nodes has a uuid, which in turn links it to the resource_provider
mysql> select deleted_at, id, hypervisor_hostname, host_ip, host, uuid  from nova.compute_nodes where host like 'juju-6ae090-focal2-9%';
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| deleted_at          | id | hypervisor_hostname            | host_ip   | host                           | uuid                                 |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
| NULL                |  6 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9           | 1f39f93f-9ecf-4284-8001-af34c1d2f34c |
| 2021-12-14 11:04:48 |  4 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9           | 5ed16f18-f83a-4109-a8ef-74b8e5d81218 |
| NULL                |  5 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9.cloud.sts | d9898d66-182c-4494-93a2-95e656fc1001 |
| 2021-12-14 09:21:26 |  3 | juju-6ae090-focal2-9.cloud.sts | 10.5.4.56 | juju-6ae090-focal2-9.cloud.sts | 1a4600d5-f889-429e-bb67-00498f3166ab |
+---------------------+----+--------------------------------+-----------+--------------------------------+--------------------------------------+
  • In the resource_providers table, the link to compute_nodes is via uuid, while its id is what allocations and related tables reference, so the name here actually doesn't matter much. The name is juju-6ae090-focal2-9.cloud.sts because the earlier tests used juju-6ae090-focal2-9.cloud.sts and the resource_provider was never deleted afterwards, so it still shows the FQDN.
    mysql> select * from placement.resource_providers;
    +---------------------+---------------------+----+--------------------------------------+--------------------------------+------------+------------------+--------------------+
    | created_at          | updated_at          | id | uuid                                 | name                           | generation | root_provider_id | parent_provider_id |
    +---------------------+---------------------+----+--------------------------------------+--------------------------------+------------+------------------+--------------------+
    | 2021-12-14 11:05:35 | 2021-12-14 11:09:29 | 54 | 1f39f93f-9ecf-4284-8001-af34c1d2f34c | juju-6ae090-focal2-9.cloud.sts |          3 |               54 |               NULL |
    +---------------------+---------------------+----+--------------------------------------+--------------------------------+------------+------------------+--------------------+

20240102 - duplicate instance uuid in nova/nova_cell0

A VM has clearly been deleted but still shows up in 'openstack server list'. That happens when nova/nova_cell0 ends up with a duplicate instance uuid (nova_cell0 stores instances that could not be scheduled), possibly caused by network problems. The DB can be fixed as follows:

1. Double-check the cell mappings first using `select * from nova_api.cell_mappings\G` to confirm the mappings:
cell0 id=2 uuid=00000000-0000-0000-0000-000000000000
cell1 id=5 uuid=03e1129c-2952-4512-874b-e45bc8f280a2
2. Use the instance UUID to check which database contains an alive entry:
# select * from nova.instances where uuid='<-instance_uuid->' and deleted=0\G
# select * from nova_cell0.instances where uuid='<-instance_uuid->' and deleted=0\G
3. Modify the instance mapping:
If nova has a live entry, the mapping should point to cell1.
If nova_cell0 has a live entry, the mapping should point to cell0.
Confirm the current mapping with `select cell_id from nova_api.instance_mappings where instance_uuid='<-instance_uuid->'\G` command.
If the cell ID doesn't match, delete it and map instances again:
# delete from nova_api.instance_mappings where instance_uuid='<-instance_uuid->' [PLEASE EXECUTE THIS WITH CAUTION]
# nova-manage cell_v2 map_instances --cell_uuid <-cell_uuid->
4. After confirming that the instance maps to the desired cell, delete the instance with the command:
openstack server delete <-instance_uuid->
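
The checks in steps 2 and 3 can also be scripted. A minimal sketch (placeholder credentials; same caveats as the manual queries above):

import pymysql

conn = pymysql.connect(host='10.5.0.103', user='nova', password='***')  # placeholder

def check_instance_mapping(instance_uuid):
    """Report which DB holds the live instance row and which cell it is mapped to."""
    with conn.cursor() as cur:
        for db in ('nova', 'nova_cell0'):
            cur.execute("SELECT count(*) FROM {}.instances"
                        " WHERE uuid=%s AND deleted=0".format(db), (instance_uuid,))
            print(db, 'live rows:', cur.fetchone()[0])
        cur.execute("SELECT cell_id FROM nova_api.instance_mappings"
                    " WHERE instance_uuid=%s", (instance_uuid,))
        print('mapped cell_id:', cur.fetchone())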

20240603

We hit an issue where migration from an AMD machine to an Intel machine failed; it turned out the Intel machine had fewer CPUs than the AMD one, so the key is to set cpu-allocation-ratio, and cpu-allocation-ratio must be set on the nova-compute nodes rather than on the nova-cloud-controller node.
For example, on a machine with only 2 pCPUs, its max_unit and total should be 2.

$ openstack resource provider inventory list df00a126-2390-4dca-b057-c4d3b443c545 |head -n 4
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total | used |
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| VCPU           |              4.0 |        1 |        2 |        0 |         1 |     2 |    3 |

>>> import libvirt
>>> conn = libvirt.open("qemu:///system")
>>> cpu_nums = conn.getCPUMap()[0]
>>> print(cpu_nums)
2

root@juju-99d74e-ovn-11:/home/ubuntu# nc 127.0.0.1 4444
> /usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py(7558)_get_vcpu_available()
-> online_cpus = self._host.get_online_cpus()
(Pdb) bt
  /usr/lib/python3/dist-packages/eventlet/greenthread.py(221)main()
-> result = function(*args, **kwargs)
  /usr/lib/python3/dist-packages/oslo_service/loopingcall.py(150)_run_loop()
-> result = func(*self.args, **self.kw)
  /usr/lib/python3/dist-packages/nova/service.py(307)periodic_tasks()
-> return self.manager.periodic_tasks(ctxt, raise_on_error=raise_on_error)
  /usr/lib/python3/dist-packages/nova/manager.py(104)periodic_tasks()
-> return self.run_periodic_tasks(context, raise_on_error=raise_on_error)
  /usr/lib/python3/dist-packages/oslo_service/periodic_task.py(216)run_periodic_tasks()
-> task(self, context)
  /usr/lib/python3/dist-packages/nova/compute/manager.py(10258)update_available_resource()
-> self._update_available_resource_for_node(context, nodename,
  /usr/lib/python3/dist-packages/nova/compute/manager.py(10167)_update_available_resource_for_node()
-> self.rt.update_available_resource(context, nodename,
  /usr/lib/python3/dist-packages/nova/compute/resource_tracker.py(884)update_available_resource()
-> resources = self.driver.get_available_resource(nodename)
  /usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py(9294)get_available_resource()
-> data["vcpus"] = len(self._get_vcpu_available())
> /usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py(7558)_get_vcpu_available()
-> online_cpus = self._host.get_online_cpus()

(Pdb) l
742  
743             :returns: set of online CPUs, raises libvirtError on error
744             """
745             cpus, cpu_map, online = self.get_connection().getCPUMap()
746  
747  ->         online_cpus = set()
748             for cpu in range(cpus):
749                 if cpu_map[cpu]:
750                     online_cpus.add(cpu)
751  
752             return online_cpus
(Pdb) p cpus
2

Creating a VM at this point shows the following error:

root@juju-99d74e-ovn-10:/home/ubuntu# grep -r '5a7a8e59-71a4-4624-a368-a024803600bd' /var/log/nova/nova-scheduler.log
2024-06-03 05:03:39.536 2881454 DEBUG nova.scheduler.manager [req-634c42c7-8800-497b-82a1-545b5589cdfe 0389d2c5def94e4ca14366aa7a7a5228 e8c92bf3cd804ae694ef49749daf6eea - 4c83f99642134191b11ad2139afd3497 4c83f99642134191b11ad2139afd3497] Starting to schedule for instances: ['5a7a8e59-71a4-4624-a368-a024803600bd'] select_destinations /usr/lib/python3/dist-packages/nova/scheduler/manager.py:141

root@juju-99d74e-ovn-10:/home/ubuntu# grep -r 'req-634c42c7-8800-497b-82a1-545b5589cdfe' /var/log/nova/nova-scheduler.log
...
2024-06-03 05:03:40.056 2881454 INFO nova.scheduler.manager [req-634c42c7-8800-497b-82a1-545b5589cdfe 0389d2c5def94e4ca14366aa7a7a5228 e8c92bf3cd804ae694ef49749daf6eea - 4c83f99642134191b11ad2139afd3497 4c83f99642134191b11ad2139afd3497] Got no allocation candidates from the Placement API. This could be due to insufficient resources or a temporary occurrence as compute nodes start up.

nova-scheduler calls the placement /allocation_candidates API; it can also be tested quickly with the osc-placement CLI:

 openstack allocation candidate list --resource VCPU=3

To make 'openstack allocation candidate list --resource VCPU=3' work, you can set max_unit and total to 3 with the following command:

openstack resource provider inventory class set --allocation_ratio 4.0 --total 3 --max_unit 3 df00a126-2390-4dca-b057-c4d3b443c545 VCPU

But the values set by the command above will be overwritten after a VM is created, via this path (instance_claim -> _update -> _update_to_placement -> update_provider_tree #_get_vcpu_available()). Does that mean a change like the following is needed?

$ git diff ./nova/virt/libvirt/driver.py
diff --git a/nova/virt/libvirt/driver.py b/nova/virt/libvirt/driver.py
index b1851296ac..9404cb03c5 100644
--- a/nova/virt/libvirt/driver.py
+++ b/nova/virt/libvirt/driver.py
@@ -9314,9 +9314,9 @@ class LibvirtDriver(driver.ComputeDriver):
         # forbids reporting inventory with total=0
         if vcpus:
             result[orc.VCPU] = {
-                'total': vcpus,
+                'total': vcpus * ratios[orc.VCPU],
                 'min_unit': 1,
-                'max_unit': vcpus,
+                'max_unit': vcpus * ratios[orc.VCPU],
                 'step_size': 1,
                 'allocation_ratio': ratios[orc.VCPU],
                 'reserved': CONF.reserved_host_cpus,

Probably the change above is not needed, because with the same 10-CPU server and a cpu-allocation-ratio of 2, you can use 20 vCPUs on that server. Example:

  • 20 VMs with 1 CPU
  • 10 VMS with 2 CPU
  • 4 VMS with 5 CPU
  • 2 VM with 10 CPU.
    But you cannot allocate 1 VM with > 10 CPUs. That is, a single VM's vCPU count cannot exceed the total CPU count of the physical node, so from this perspective it can be argued there is no bug.
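
The same arithmetic written out (total=10, reserved=0, allocation_ratio=2; the aggregate capacity is (total - reserved) * allocation_ratio, while a single VM is still capped by max_unit, which defaults to the physical total):

total, reserved, allocation_ratio = 10, 0, 2.0
max_unit = total                                   # one VM cannot exceed the host's pCPUs
capacity = (total - reserved) * allocation_ratio   # 20 vCPUs usable across all VMs

for vcpus_per_vm in (1, 2, 5, 10, 12):
    if vcpus_per_vm > max_unit:
        print(vcpus_per_vm, 'vCPU VM: rejected (exceeds max_unit)')
    else:
        print(vcpus_per_vm, 'vCPU VM:', int(capacity // vcpus_per_vm), 'VMs fit')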

20240904 - vGPU placement

GPU SR-IOV also distinguishes a PF (the real GPU) from VFs (slices).

$ lspci | grep NVIDIA
25:00.0 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
...
25:03.6 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)

mdevs (mediated devices) are an intermediate layer sitting between the VMs and the physical GPU.

#https://docs.openstack.org/charm-guide/latest/admin/vgpu.html
sudo nvidia-smi
juju deploy ch:nova-compute-nvidia-vgpu --channel=yoga/stable
juju integrate nova-compute-nvidia-vgpu:nova-vgpu nova-compute:nova-vgpu
juju attach nova-compute-nvidia-vgpu nvidia-vgpu-software=./nvidia-vgpu-ubuntu-510_510.47.03_amd64.deb
juju exec -a nova-compute-nvidia-vgpu -- sudo reboot
juju config nova-compute-nvidia-vgpu vgpu-device-mappings="{'nvidia-610': ['0000:25:02.3', '0000:25:00.5']}"
$ juju run nova-compute-nvidia-vgpu/0 list-vgpu-types
  ...
  nvidia-604, 0000:25:02.3, NVIDIA A10-12A, num_heads=1, frl_config=60, framebuffer=12288M, max_resolution=1280x1024, max_instance=2

openstack resource provider list
openstack resource provider inventory list xxx
openstack flavor set <flavor-name> --property resources:VGPU=1
openstack resource provider allocation show <vm-uuid>
  
openstack trait create CUSTOM_VGPU_PLACEMENT
for uuid in $(openstack resource provider list | grep pci_0000 | grep -e 0000_25_00_4 -e 0000_25_00_5 | awk '{print $2}'); 
   do openstack resource provider trait set --trait CUSTOM_VGPU_PLACEMENT $uuid
done
openstack flavor set 21e3177c-4879-429d-9f81-53199f38ec59 --property trait:CUSTOM_VGPU_PLACEMENT=required

