使用OpenStack创建VM,遇到No valid host was found 错误的分析

对于一些刚接触OpenStack的新人,辛苦两天终于把OpenStack部署好了,建立实例时却失败了。这是一件很郁闷的事情。
大家试想一下,如果你准备启动一台物理服务器,如果服务器的CPU,内存,存储发生错误,在开机自检阶段就过不了,即意味着服务器挂掉,无法启动了。
同样,生成VM时常见的报错 " No valid host was found. There are not enough hosts available "也基本是CPU,内存,存储出错导致的。创建VM时错误地使用了external类型的网络,也会产生这个报错。

1. CPU虚拟化参数配置错误

查看nova-compute 日志:

couldn't obtain the vcpu count fromdomain id: 769f95ac-d8da-41be-8e29-f326f03a762f, exception: Requested operationis not valid: cpu affinity is not supported

**分析:**日志出现绑定CPU失败的错误,立刻想到和CPU虚拟化相关。/etc/nova/nova.conf
中的virt_type参数设置得不对
**处理:**修改compute节点的配置文件/etc/nova/nova.conf

 如果compute节点是物理机或开启嵌套虚拟化(CPU硬件加速)的虚拟机: virt_type=kvm
 如果compute节点是未开启嵌套虚拟化的虚拟机:virt_type=qemu

2. 内存不足导致报错

 用一个规格比较高的flavor创建实例:

这里写图片描述
nova-conductor.log 报错:

qemu-kvm:cannot set up guest memory 'pc.ram': Cannot allocate memory\n"]

nova-scheduler.log :
Filter RetryFilter returned 0 hosts

**分析:**日志已经给出原因:无法分配内存,即内存空间不足。特别是RetryFilter没有筛选出可以提供符合flavor中资源数量的host,这时应该去确认host中的资源是否不够用了。
**处理:**增加计算节点内存

3. 存储相关的原因导致报错

  1. 用自制的Ubuntu镜像创建实例时,没有配置合适的volume大小
    这里写图片描述
    nova日志报错:

    Image 1c63d80b-dc44-4785-bf20-4cdb47d7b2c6 is unacceptable: Imagevirtual size is 18GB and doesn't fit in a volume of size 1GB.
    

    **分析和处理:**镜像中的文件系统是18GB,上图红圈中的参数必须大于等于18GB

  2. Block Device Mapping is Invalid 也是很常见错误

    **分析和处理:**通常是由于cinder或ceph等backend配置错误引起的块存储设备报错。这时就需要查看cinder和ceph的服务状态是否正常,这里有几个常用命令:
    ceph -s
    vgs
    cinder service-list
    cinder list
    lsblk
    进一步排查原因的话还需要分析日志。

  3. 权限不足(导致没有可用的存储)
    查看nova-compute 日志:

    InvalidDiskInfo:Disk info file is invalid: qemu-img failed to execute on/var/lib/libvirt/images/centos7.0.qcow2 : Unexpected error while runningcommand.
    2017-01-1911:32:27.736 25979 ERROR nova.compute.manager Command: /usr/bin/python2 -moslo_concurrency.prlimit --as=1073741824 --cpu=2 -- env LC_ALL=C info /var/lib/libvirt/images/centos7.0.qcow2
    2017-01-1911:32:27.736 25979 ERROR nova.compute.manager Exit code: 1
    2017-01-1911:32:27.736 25979 ERROR nova.compute.manager Stdout: u''
    2017-01-1911:32:27.736 25979 ERROR nova.compute.manager Stderr: u"qemu-img: Couldnot open '/var/lib/libvirt/images/centos7.0.qcow2': Could not open '/var/lib/libvirt/images/centos7.0.qcow2':Permission denied\n"
    InvalidDiskInfo:Disk info file is invalid: qemu-img failed to execute on/var/lib/libvirt/images/centos7.0.qcow2 : Unexpected error while runningcommand.
    2017-01-1911:32:27.736 25979 ERROR nova.compute.manager Command: /usr/bin/python2 -moslo_concurrency.prlimit --as=1073741824 --cpu=2 -- env LC_ALL=C info /var/lib/libvirt/images/centos7.0.qcow2
    2017-01-1911:32:27.736 25979 ERROR nova.compute.manager Exit code: 1
    2017-01-1911:32:27.736 25979 ERROR nova.compute.manager Stdout: u''
    2017-01-1911:32:27.736 25979 ERROR nova.compute.manager Stderr: u"qemu-img: Couldnot open '/var/lib/libvirt/images/centos7.0.qcow2': 
    Could not open '/var/lib/libvirt/images/centos7.0.qcow2':Permission denied\n"
    

    分析:/var/lib/libvirt/images/centos7.0.qcow2是我之前做实验用virt-manager在host OS创建的虚拟机,也就是说这个虚拟机镜像被virt-manager所管理,导致openstack没有权限使用启动这个虚拟机(即没有可用的存储)。
    **处理:**在host os 上,卸载virt-manager

4. 创建虚拟机时的错误操作

外部网络(Provider / external / public network)一般不能直接用来创建VM(除非该外部网络同时是shared network)这里写图片描述

如果创建VM时直接使用了外部网络,则报no valid host was found 错误。查看拓扑图会发现这台VM没有链路。
这里写图片描述

controller节点 nova-conductor日志

==> nova-conductor.log <==
2017-10-16 11:02:23.153 2090 ERROR nova.scheduler.utils [req-370255d8-f462-4090-9203-8db574e8f589 0817b805f51e4877a383dd401a318bee b0d993d7027e457189f70bae70f870a9 - - -] [instance: 9b2e0b9e-6d16-4576-b289-075761e3d449] Error from last host: compute (node compute): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2058, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 9b2e0b9e-6d16-4576-b289-075761e3d449 was re-scheduled: Binding failed for port 41b8cdb2-3b1d-4661-bb47-f567c57bda5f, please check neutron logs for more information.\n']


2017-10-16 11:02:23.226 2090 WARNING nova.scheduler.utils [req-370255d8-f462-4090-9203-8db574e8f589 0817b805f51e4877a383dd401a318bee b0d993d7027e457189f70bae70f870a9 - - -] Failed to compute_task_build_instances: No valid host was found. There are not enough hosts available.
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 84, in select_destinations
    filter_properties)

  File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 90, in select_destinations
    raise exception.NoValidHost(reason=reason)

NoValidHost: No valid host was found. There are not enough hosts available.

2017-10-16 11:02:23.227 2090 WARNING nova.scheduler.utils [req-370255d8-f462-4090-9203-8db574e8f589 0817b805f51e4877a383dd401a318bee b0d993d7027e457189f70bae70f870a9 - - -] [instance: 9b2e0b9e-6d16-4576-b289-075761e3d449] Setting instance to ERROR state.

controller节点 neutron-server日志

2017-10-16 11:02:22.260 2189 ERROR neutron.plugins.ml2.managers [req-cdcb2a89-98f5-45e7-827f-812a01dc4dd6 0ebb9d3a4d52454ca74d1c45a382795a ed964295d9ca4f878fe4c25478aaeca0 - - -] Failed to bind port 41b8cdb2-3b1d-4661-bb47-f567c57bda5f on host compute

compute节点nova-compute日志

2017-10-16 11:02:23.708 1408 ERROR nova.compute.manager [-] Instance failed network setup after 1 attempt(s)
2017-10-16 11:02:23.708 1408 ERROR nova.compute.manager Traceback (most recent call last):
2017-10-16 11:02:23.708 1408 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1564, in _allocate_network_async
2017-10-16 11:02:23.708 1408 ERROR nova.compute.manager     dhcp_options=dhcp_options)
2017-10-16 11:02:23.708 1408 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 744, in allocate_for_instance
2017-10-16 11:02:23.708 1408 ERROR nova.compute.manager     self._delete_ports(neutron, instance, created_port_ids)
2017-10-16 11:02:23.708 1408 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 195, in __exit__

2017-10-16 11:02:24.004 1408 DEBUG nova.compute.utils [req-370255d8-f462-4090-9203-8db574e8f589 0817b805f51e4877a383dd401a318bee b0d993d7027e457189f70bae70f870a9 - - -] [instance: 9b2e0b9e-6d16-4576-b289-075761e3d449] Binding failed for port 41b8cdb2-3b1d-4661-bb47-f567c57bda5f, please check neutron logs for more information. 

compute节点/var/log/neutron下的openvswitch-agent.log日志

2017-10-16 11:02:23.138 1417 INFO neutron.agent.securitygroups_rpc [req-cdcb2a89-98f5-45e7-827f-812a01dc4dd6 0ebb9d3a4d52454ca74d1c45a382795a ed964295d9ca4f878fe4c25478aaeca0 - - -] Security group member updated [u'312e84c2-ff27-460f-802a-10eaabb3bd19']
2017-10-16 11:02:23.652 1417 INFO neutron.agent.securitygroups_rpc [req-2493c53e-7b58-47f5-a2af-5f6c49af0f5e 0ebb9d3a4d52454ca74d1c45a382795a ed964295d9ca4f878fe4c25478aaeca0 - - -] Security group member updated [u'312e84c2-ff27-460f-802a-10eaabb3bd19']
2017-10-16 11:02:24.358 1417 INFO neutron.agent.common.ovs_lib [req-1241fe20-da43-457f-9ec8-365111248723 - - - - -] Port 41b8cdb2-3b1d-4661-bb47-f567c57bda5f not present in bridge br-int
2017-10-16 11:02:24.359 1417 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-1241fe20-da43-457f-9ec8-365111248723 - - - - -] port_unbound(): net_uuid None not in local_vlan_map
2017-10-16 11:02:24.360 1417 INFO neutron.agent.securitygroups_rpc [req-1241fe20-da43-457f-9ec8-365111248723 - - - - -] Remove device filter for [u'41b8cdb2-3b1d-4661-bb47-f567c57bda5f']

**分析:**上述日志中有大量port相关报错。根本原因是上面提到的创建虚拟机时不能直接使用外部网络(即无法给VM分配port)。如果要访问外部网络,必须经过路由器中转。

**处理:**使用内部网络(私有网络/private网络)或者shared类型的外部网络创建虚拟机

网络的思考:##

通常情况下,物理机不会因为自身网络/网卡问题而无法启动,同样虚拟机也不会在启动过程中由于网络问题而出现No valid host was found 错误 。例外的场景是在Openstack中增加第三方SDN控制器后,SDN控制器的某些错误配置会引起创建VM时会报No valid host was found 报错; 以及错误的使用外部网络创建虚拟机时也会报No valid host was found错误。

5. 没有可用的服务

查看日志可以发现报错:
这里写图片描述

**分析:**日志已经给出原因:ServiceNotFound , 即未找到服务供OpenStack使用。
**处理:**有服务没有安装好,或者安装好了没启动。没启动的话就用手动用命令将其启动。

小结

查看日志的技巧——重点看scheduler 日志中的Filter字段来确定哪种资源不足

  • 6
    点赞
  • 45
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值