Summary of Common OpenStack Errors


The following notes cover problems hit during installation and deployment. OpenStack releases differ significantly in their components, so the installation procedure changes completely from version to version. I have successfully deployed both Essex and Grizzly. The release in between, Folsom, I never got working and did not spend much time on: the Quantum component in Folsom was still immature and had many network-connectivity problems, success stories online are rare, and most people ran Folsom with nova-network instead.

By Grizzly, Quantum had become stable enough for normal use, and after considerable experimentation I can now deploy a multi-node environment successfully. Below are problems encountered while deploying both Essex and Grizzly. There is little material on this in Chinese; much of what follows came from foreign sites. Often the same log message can have several different root causes, so you have to analyze the underlying mechanism carefully to pinpoint the real problem. Errors are nothing to fear: working through them deepens your understanding of the system, which is a good thing.

As for installation, there are one-click automated deployment tools such as devstack and onestack. If you are a beginner I do not recommend them; you will obviously learn nothing from them. If everything happens to work, count yourself lucky for the moment, but as soon as a step in the middle fails you will be completely lost, and later maintenance is also very hard: you may spend far more time troubleshooting, because you never learned which steps were performed or which settings were made. These tools are mostly meant for spinning up development environments quickly; a real production environment should be built step by step, which also makes problems quick to locate.

This article only summarizes error messages seen during deployment and gives their fixes. These are cases I hit and resolved in my own environment; yours may differ, so treat this as a reference only.

1. Check whether the services are healthy:

root@control:~# nova-manage service list
Binary           Host                                 Zone             Status     State Updated_At
nova-cert        control                              internal         enabled    :-)   2013-04-26 02:29:44
nova-conductor   control                              internal         enabled    :-)   2013-04-26 02:29:42
nova-consoleauth control                              internal         enabled    :-)   2013-04-26 02:29:44
nova-scheduler   control                              internal         enabled    :-)   2013-04-26 02:29:47
nova-compute     node-01                              nova             enabled    :-)   2013-04-26 02:29:46
nova-compute     node-02                              nova             enabled    :-)   2013-04-26 02:29:46
nova-compute     node-03                              nova             enabled    :-)   2013-04-26 02:29:42


If every service shows a smiley (:-)), nova is healthy. If a service shows XXX, check its logs under /var/log/nova/; the logs usually reveal the cause of the error.
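The State column can also be checked mechanically. A minimal sketch, fed with sample output so it is self-contained; on a real controller you would pipe `nova-manage service list` into the same filter (hostnames and timestamps below are illustrative):

```shell
# Flag services whose State column reads XXX (down).
flag_down() {
    awk 'NR > 1 && $5 == "XXX" {print $1 " on " $2 " is down"}'
}
printf '%s\n' \
  'Binary           Host     Zone      Status   State Updated_At' \
  'nova-cert        control  internal  enabled  :-)   2013-04-26 02:29:44' \
  'nova-compute     node-02  nova      enabled  XXX   2013-04-20 11:10:02' \
  | flag_down
```

This prints one line per down service, which is handy in a cron job or monitoring check.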

2. libvirt errors

 

python2.7/dist-packages/nova/virt/libvirt/connection.py", line 338, in _connect
2013-03-09 17:05:42 TRACE nova return libvirt.openAuth(uri, auth, 0)
2013-03-09 17:05:42 TRACE nova File "/usr/lib/python2.7/dist-packages/libvirt.py", line 102, in openAuth
2013-03-09 17:05:42 TRACE nova if ret is None:raise libvirtError('virConnectOpenAuth() failed')
2013-03-09 17:05:42 TRACE nova libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
2013-03-09 22:05:41.909+0000: 12466: info : libvirt version: 0.9.8
2013-03-09 22:05:41.909+0000: 12466: error : virNetServerMDNSStart:460 : internal error Failed to create mDNS client: Daemon not running


Solution:

When this error appears, first check /var/log/libvirt/libvirtd.log, which will report: libvirt-bin service will not start without dbus installed.

Then run ps -e | grep dbus to confirm dbus is running, and install the missing dependency with apt-get install lxc.

3. Failed to add image

Error:
Failed to add image. Got error:
The request returned 500 Internal Server Error

Solution:

This is an environment-variable problem. Add the following to /etc/profile (note the export keyword, so the variables reach child processes):

export OS_AUTH_KEY="openstack"
export OS_AUTH_URL="http://localhost:5000/v2.0/"
export OS_PASSWORD="openstack"
export OS_TENANT_NAME="admin"
export OS_USERNAME="admin"

Then run source /etc/profile. You can also set these variables without touching profile, but then they only last for the current session and are gone after a reboot, so keeping them in profile saves a lot of trouble.

4. Zombie instances

Zombie instances usually arise from shutting down nova or the underlying VM uncleanly, or from errored instances that refuse to be deleted. Use virsh list to check whether the underlying VM is still running; if it is, stop it, then go into the database and delete the records directly.

Nova instance not found

Local file storage of the image files.

Error:
2013-03-09 17:58:08 TRACE nova raise exception.InstanceNotFound(instance_id=instance_name)
2013-03-09 17:58:08 TRACE nova InstanceNotFound: Instance instance-00000002 could not be found.
2013-03-09 17:58:08 TRACE nova

 

Solution:

Delete the zombie instances from the database, or drop the database and recreate it:

a. Drop and recreate the database:

$ mysql -u root -p
DROP DATABASE nova;

Recreate the DB:
CREATE DATABASE nova;
GRANT ALL PRIVILEGES ON nova.* TO 'novadbadmin'@'%' IDENTIFIED BY '<password>';
quit

Resync the schema:
nova-manage db sync

b. Delete the instances from the database:

#!/bin/bash
mysql -uroot -pmysql << _ESXU_
use nova;
DELETE a FROM nova.security_group_instance_association AS a
  INNER JOIN nova.instances AS b ON a.instance_uuid = b.uuid
  WHERE b.uuid = '$1';
DELETE FROM nova.instance_info_caches WHERE instance_uuid = '$1';
DELETE FROM nova.instances WHERE uuid = '$1';
_ESXU_

Save the above as delete_instance.sh, then run sh delete_instance.sh <instance_uuid>;

the instance UUID can be found with nova list.

5. Keystone NoHandlers

Error:
root@openstack-dev-r910:/home/brent/openstack# ./keystone_data.sh
No handlers could be found for logger "keystoneclient.client"
Unable to authorize user
No handlers could be found for logger "keystoneclient.client"
Unable to authorize user
No handlers could be found for logger "keystoneclient.client"
Unable to authorize user


Solution:

This error is usually caused by a mistake in keystone_data.sh: its admin_token must match the one in /etc/keystone/keystone.conf. Then confirm that keystone.conf contains:

[catalog]
driver = keystone.catalog.backends.templated.TemplatedCatalog
template_file = /etc/keystone/default_catalog.templates

6. Wiping the components for a clean reinstall:

#!/bin/bash
mysql -uroot -popenstack -e "drop database nova;"
mysql -uroot -popenstack -e "drop database glance;"
mysql -uroot -popenstack -e "drop database keystone;"
apt-get purge nova-api nova-cert nova-common nova-compute \
  nova-compute-kvm nova-doc nova-network nova-objectstore \
  nova-scheduler nova-vncproxy nova-volume python-nova python-novaclient
apt-get autoremove
rm -rf /var/lib/glance
rm -rf /var/lib/keystone/
rm -rf /var/lib/nova/
rm -rf /var/lib/mysql


Running the script above uninstalls the installed components and drops their databases, sparing you a full OS reinstall.

7. Access denied for user 'keystone'@'localhost' (using password: YES)

# keystone-manage db_sync
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
sqlalchemy.exc.OperationalError: (OperationalError) (1045, "Access denied for user 'keystone'@'openstack1' (using password: YES)") None None


Solution:

Check the database connection string in keystone.conf; it should read:

[sql]

 connection = mysql://keystone:openstack@localhost:3306/keystone


8. nova-compute going down and clock synchronization

You will often find nova-compute down or misbehaving, with nova-manage showing its state as XXX.

The usual cause is that the nova-compute host's clock differs from the controller's. nova-compute periodically writes its own host time into the services table in the database.

The controller checks nova-compute's liveness by subtracting that last update time from its own clock; if the difference exceeds a threshold (the exact value is in the code, something like 15 seconds) it considers nova-compute abnormal.

At that point nova-manage shows nova-compute as XXX, and when you create a VM, nova-scheduler.log complains that no valid host was found. The other service nodes behave the same way; this is nova's heartbeat mechanism. Keeping the clocks of all nodes in a nova environment synchronized is therefore essential. Make sure time is synchronized!

If in the dashboard nova-compute keeps flipping between red and green, synchronize the clocks strictly, or find the check in the code and raise that threshold.
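The heartbeat check above boils down to simple clock arithmetic. A sketch (GNU date assumed; the 60-second threshold is nova's usual service_down_time default, not a value taken from this environment, and the timestamps are illustrative):

```shell
# Controller-side liveness check, sketched: a service is down when
# now - updated_at exceeds the threshold.
updated_at='2013-04-26 02:29:42'   # last heartbeat written by nova-compute
now='2013-04-26 02:31:00'          # controller clock
threshold=60                       # service_down_time (assumed default)
drift=$(( $(date -u -d "$now" +%s) - $(date -u -d "$updated_at" +%s) ))
if [ "$drift" -gt "$threshold" ]; then
    echo "down (drift ${drift}s)"
else
    echo "up (drift ${drift}s)"
fi
```

This also shows why unsynchronized clocks flap the state: the "drift" mixes real staleness with clock skew between the two hosts.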

9. noVNC cannot connect to instances

noVNC is a frequent source of problems and much has been written about configuring it. The configuration itself is simple (only four parameters) and mostly just works when correct, but I still ran into several issues while installing it.

a. "Connection Refused"

The control node may be unable to resolve the compute node's hostname when it receives the VNC request, so it cannot connect to the instance running there.

It may also be that the current browser is unsupported or cannot reach the service. Add the compute node's IP-to-hostname mapping to /etc/hosts on the control node.
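The hosts fix can be sketched like this (writing to a scratch file so the example is harmless to run; on a real control node append to /etc/hosts itself, and the IP/hostname pair here is made up):

```shell
# Append a compute node's IP/hostname mapping, guarded so it is idempotent.
# A scratch file stands in for /etc/hosts; the address is illustrative.
hosts=$(mktemp)
grep -q 'node-01' "$hosts" || echo '192.168.80.31 node-01' >> "$hosts"
grep 'node-01' "$hosts"
rm -f "$hosts"
```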

b. "failed connect to server"

This error has many possible causes, configuration mistakes among them. In our environment it occurred because the package mirror had been updated, so components were installed at mismatched versions and could not work together; the fix was to use a local mirror. Note also that noVNC requires a browser with WebSocket and HTML5 support; Chrome is recommended.

10. Cinder error: unable to log in to the dashboard

The following error appears:

TypeError at /admin/
hasattr(): attribute name must be string

Request Method: GET
Request URL: http://192.168.80.21/horizon/admin/
Django Version: 1.4.5
Exception Type: TypeError
Exception Value: hasattr(): attribute name must be string
Exception Location: /usr/lib/python2.7/dist-packages/cinderclient/client.py in __init__, line 78
Python Executable: /usr/bin/python
Python Version: 2.7.3
Python Path:
['/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../..',
 '/usr/lib/python2.7',
 '/usr/lib/python2.7/plat-linux2',
 '/usr/lib/python2.7/lib-tk',
 '/usr/lib/python2.7/lib-old',
 '/usr/lib/python2.7/lib-dynload',
 '/usr/local/lib/python2.7/dist-packages',
 '/usr/lib/python2.7/dist-packages',
 '/usr/share/openstack-dashboard/',
 '/usr/share/openstack-dashboard/openstack_dashboard']

Server time: Fri, 29 Mar 2013 12:51:09 +0000


Solution:

The apache2 error log shows:

ERROR:django.request:Internal Server Error: /horizon/admin/
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/django/core/handlers/base.py", line 111, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 38, in dec
    return view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 86, in dec
    return view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 54, in dec
    return view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 38, in dec
    return view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/horizon/decorators.py", line 86, in dec
    return view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/django/views/generic/base.py", line 48, in view
    return self.dispatch(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/django/views/generic/base.py", line 69, in dispatch
    return handler(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/horizon/tables/views.py", line 155, in get
    handled = self.construct_tables()
  File "/usr/lib/python2.7/dist-packages/horizon/tables/views.py", line 146, in construct_tables
    handled = self.handle_table(table)
  File "/usr/lib/python2.7/dist-packages/horizon/tables/views.py", line 118, in handle_table
    data = self._get_data_dict()
  File "/usr/lib/python2.7/dist-packages/horizon/tables/views.py", line 182, in _get_data_dict
    self._data = {self.table_class._meta.name: self.get_data()}
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/dashboards/admin/overview/views.py", line 41, in get_data
    data = super(GlobalOverview, self).get_data()
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/usage/views.py", line 34, in get_data
    self.usage.get_quotas()
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/usage/base.py", line 115, in get_quotas
    _("Unable to retrieve quota information."))
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/usage/base.py", line 112, in get_quotas
    self.quotas = quotas.tenant_quota_usages(self.request)
  File "/usr/lib/python2.7/dist-packages/horizon/utils/memoized.py", line 33, in __call__
    value = self.func(*args)
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/usage/quotas.py", line 115, in tenant_quota_usages
    disabled_quotas=disabled_quotas):
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/usage/quotas.py", line 98, in get_tenant_quota_data
    tenant_id=tenant_id)
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/usage/quotas.py", line 80, in _get_quota_data
    quotasets.append(getattr(cinder, method_name)(request, tenant_id))
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/api/cinder.py", line 123, in tenant_quota_get
    c_client = cinderclient(request)
  File "/usr/share/openstack-dashboard/openstack_dashboard/wsgi/../../openstack_dashboard/api/cinder.py", line 59, in cinderclient
    http_log_debug=settings.DEBUG)
  File "/usr/lib/python2.7/dist-packages/cinderclient/v1/client.py", line 69, in __init__
    cacert=cacert)
  File "/usr/lib/python2.7/dist-packages/cinderclient/client.py", line 78, in __init__
    if hasattr(requests, logging):
TypeError: hasattr(): attribute name must be string


The error says that at line 78 of cinderclient's client.py, the attribute name passed to hasattr() must be a string.
Fix the code:

 

# vim /usr/lib/python2.7/dist-packages/cinderclient/client.py

78     if hasattr(requests, logging):   # change to: if hasattr(requests, 'logging'):
79         requests.logging.getLogger(requests.__name__).addHandler(ch)


Restart apache2:

 /etc/init.d/apache2 restart

After that the dashboard loads without errors and creating a volume works as well.

11. Unable to attach cinder volume to VM

While testing OpenStack's volume service, attaching an LVM volume to an instance failed. This is not actually a cinder bug; it is an iSCSI attach problem.

The error from nova-compute.log on the compute node:

2012-07-24 14:33:08 TRACE nova.rpc.amqp ProcessExecutionError: Unexpected error while running command.
2012-07-24 14:33:08 TRACE nova.rpc.amqp Command: sudo nova-rootwrap iscsiadm -m node -T iqn.2010-10.org.openstack:volume-00000011 -p 192.168.0.23:3260 --rescan
2012-07-24 14:33:08 TRACE nova.rpc.amqp Exit code: 255
2012-07-24 14:33:08 TRACE nova.rpc.amqp Stdout: ''
2012-07-24 14:33:08 TRACE nova.rpc.amqp Stderr: 'iscsiadm: No portal found.\n'

 

The error means the storage exported by the iSCSI target was not found. Many OpenStack resources suggest adding these two options:

iscsi_ip_prefix=192.168.80      # the internal subnet of the OpenStack environment

iscsi_ip_address=192.168.80.22  # the internal IP of the volume host

But that did not solve it. It then turned out that the attach failed exactly when nova.conf contained the option iscsi_helper=tgtadm.

Testing and reading the logs revealed the rule: iscsi_helper=tgtadm requires the tgt service, while the iscsitarget service requires iscsi_helper=ietadm instead.

In my test environment both tgt and iscsitarget were installed and running (installing nova-common pulls in tgt as a dependency, which is easy to miss). With iscsi_helper=tgtadm set in nova.conf, port 3260 turned out to be held by iscsitarget, so the attach failed. Pick whichever target service suits you and keep only the matching pair: tgt with iscsi_helper=tgtadm, or iscsitarget with iscsi_helper=ietadm.
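In nova.conf that pairing looks like this (keep exactly one of the two lines, matching whichever target service you leave running):

```ini
# /etc/nova/nova.conf -- pick ONE helper, matching the running target service
iscsi_helper=tgtadm      # pair with the tgt service
# iscsi_helper=ietadm    # pair with the iscsitarget service instead
```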

12. glance index errors out:

Authorization Failed: Unable to communicate with identity service: {"error": {"message": "An unexpected error prevented the server from fulfilling your request. Command 'openssl' returned non-zero exit status 3", "code": 500, "title": "Internal Server Error"}}. (HTTP 500)


In Grizzly, testing glance index produced the error above.

The message says glance failed to authenticate against keystone. The keystone log shows:

2013-03-04 12:40:58    ERROR [keystone.common.cms] Signing error: Error opening signer certificate /etc/keystone/ssl/certs/signing_cert.pem
139803495638688:error:02001002:system library:fopen:No such file or directory:bss_file.c:398:fopen('/etc/keystone/ssl/certs/signing_cert.pem','r')
139803495638688:error:20074002:BIO routines:FILE_CTRL:system lib:bss_file.c:400:
unable to load certificate
2013-03-04 12:40:58    ERROR [root] Command 'openssl' returned non-zero exit status 3
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/keystone/common/wsgi.py", line 231, in __call__
    result = method(context, **params)
  File "/usr/lib/python2.7/dist-packages/keystone/token/controllers.py", line 118, in authenticate
    CONF.signing.keyfile)
  File "/usr/lib/python2.7/dist-packages/keystone/common/cms.py", line 140, in cms_sign_token
    output = cms_sign_text(text, signing_cert_file_name, signing_key_file_name)
  File "/usr/lib/python2.7/dist-packages/keystone/common/cms.py", line 135, in cms_sign_text
    raise subprocess.CalledProcessError(retcode, "openssl")
CalledProcessError: Command 'openssl' returned non-zero exit status 3

In Grizzly, keystone's default token format is PKI, which requires signing certificates; earlier releases used UUID. Change keystone.conf:

token_format = UUID

Try again and the error is gone.

13. Building images

The main point worth stressing here is building Windows images, which is trickier because Windows needs extra drivers loaded.

Download the virtio drivers: Windows has no built-in virtio support, and VMs managed through OpenStack need it. Two drivers are required, one for the disk and one for the NIC: virtio-win-0.1-30.iso and virtio-win-1.1.16.vfd. Two steps deserve emphasis:

1. Create the image:

kvm -m 512 -boot d -drive file=win2003server.img,cache=writeback,if=virtio,boot=on -fda virtio-win-1.1.16.vfd -cdrom windows2003_x64.iso -vnc :10


2. Boot the system:

kvm -m 1024 -drive file=win2003server.img,if=virtio,boot=on -cdrom virtio-win-0.1-30.iso -net nic,model=virtio -net user -boot c -nographic -vnc :8


The points to note are if=virtio,boot=on with -fda virtio-win-1.1.16.vfd, and the virtio-win-0.1-30.iso used when booting the system: these supply the disk and NIC drivers respectively. Without them the installer finds no hard disk, and instances created from the finished image have no NIC driver, so after installation you must boot the image again and update the NIC driver to virtio.

14. Deleting zombie volumes

If the cinder service is unhealthy, creating volumes can leave zombie volumes behind. When they cannot be deleted from Horizon, delete them manually on the server:

Command: lvremove /dev/nova-volumes/volume-000002

Note that the full path is required, otherwise the removal fails. If it complains:

"Can't remove open logical volume", try stopping the related services and deleting again. Afterwards, also clear the matching records from the volumes table in the cinder database.

 

Performance issues

GRE performance

Problem

With Neutron in GRE mode under the default configuration, VM egress performance is extremely poor. Bug report: https://bugs.launchpad.net/neutron/+bug/1252900


Expired token buildup

Keystone stores tokens in the token table of its database and never deletes expired ones, so the table grows abnormally large. See https://bugs.launchpad.net/ubuntu/+source/keystone/+bug/1032633

2013-12-05 19:13:11.732 2625 WARNING keystone.common.controller [-] RBAC: Invalid token
2013-12-05 19:13:11.732 2625 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from 192.168.1.165

Fix

http://www.sebastien-han.fr/blog/2012/12/12/cleanup-keystone-tokens/
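The cleanup in the linked post boils down to periodically deleting expired rows. A sketch that just prints the statement a cron job would feed to mysql against the keystone database (table and column names per keystone's schema; credentials omitted):

```shell
# Emit the SQL a nightly cron job would run, e.g.:
#   mysql -ukeystone -p<password> keystone < cleanup.sql
cat <<'SQL'
DELETE FROM token WHERE expires <= NOW();
SQL
```

Run it off-peak at first: on a badly bloated table the initial DELETE can take a while.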


nova-compute fails to start

** (process:11739): WARNING **: Error connecting to bus: org.freedesktop.DBus.Error.FileNotFound: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
process 11739: arguments to dbus_connection_get_data() were incorrect, assertion "connection != NULL" failed in file dbus-connection.c line 5804.

Fix

Restart the messagebus service:

[root@compute1 ~]# /etc/init.d/messagebus start
Starting system message bus:                               [  OK  ]

ovs-agent fails to start

2014-01-26 00:58:07.074 29610 TRACE neutron   File "/usr/lib/python2.6/site-packages/neutron/agent/linux/ip_lib.py", line 81, in _execute
2014-01-26 00:58:07.074 29610 TRACE neutron     root_helper=root_helper)
2014-01-26 00:58:07.074 29610 TRACE neutron   File "/usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py", line 62, in execute
2014-01-26 00:58:07.074 29610 TRACE neutron     raise RuntimeError(m)
2014-01-26 00:58:07.074 29610 TRACE neutron RuntimeError:
2014-01-26 00:58:07.074 29610 TRACE neutron Command: ['ip', '-o', 'link', 'show', 'br-int']
2014-01-26 00:58:07.074 29610 TRACE neutron Exit code: 255
2014-01-26 00:58:07.074 29610 TRACE neutron Stdout: ''
2014-01-26 00:58:07.074 29610 TRACE neutron Stderr: 'Device "br-int" does not exist.\n'
2014-01-26 00:58:07.074 29610 TRACE neutron

Fix

Add br-int:

[root@controller1 neutron]# ovs-vsctl add-br br-int
[root@controller1 neutron]# ovs-vsctl show
acb40cab-1fa0-48a0-a48c-56c89e1acfcd
   	Bridge br-int
       	Port br-int
           	Interface br-int
               	type: internal
   	ovs_version: "1.10.2"
[root@controller1 neutron]# ip -o link show br-int
5: br-int: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN \    link/ether 3e:d6:38:4e:28:43 brd ff:ff:ff:ff:ff:ff

qemu-kvm: failed to initialize spice server

A spice configuration problem: the listen parameter on the nova-compute node is wrong.


qemu-kvm error

2014-03-13 18:01:11.312 12413 WARNING nova.virt.disk.api [req-4cb3d0ef-d70b-4383-a122-c070a62f757f 6965226966304bd5a3ae07587d5ef958 d2390e6dd4ce4b48866be0d3d1417c01] Ignoring error injecting data into image (Error mounting /share/instances/be363098-6749-42ea-84e0-824fdb1c8e59/disk with libguestfs (command failed: LC_ALL=C '/usr/libexec/qemu-kvm' -nographic -help
errno: File exists

Fix

[root@controller1 ~]# ln -s /usr/bin/qemu-kvm /usr/libexec/qemu-kvm
[root@controller1 ~]# ls -l /usr/libexec/qemu-kvm
lrwxrwxrwx 1 root root 17 Mar 27 17:15 /usr/libexec/qemu-kvm -> /usr/bin/qemu-kvm

Windows image injection problem

[root@compute2 /data/nova/instances/0071e60b-a0b6-41fa-b484-ede5448d87b9]#guestmount -a disk -i --ro /mnt/
guestmount: no operating system was found on this disk

libguestfs-winsupport is not installed; install it.

Fix

[root@compute2 /root] yum install libguestfs-winsupport

Havana 2013.2.1 bugs

UnboundLocalError: local variable 'instance_dir' during live migration

Fix the code in nova/virt/libvirt/driver.py by initializing the variable at the top of the function:

def pre_live_migration(self, context, instance, block_device_info,((Havana 2013.2.1))
    # add this initialization inside the function:
    instance_dir = None

UnboundLocalError: local variable 'network_name' in nova/network/neutronv2/api.py, line 964

Likewise, initialize the variable at the top of the function:

def _nw_info_build_network(self, port, networks, subnets):
    network_name = None  # add this initialization

Snapshots not shown in the dashboard

Edit /usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py.

Around line 1307, change:
metadata = {'is_public': False,      # change False here to True

Libvirt error summary

Failed to start domain

[root@node1 ~]# virsh start vm01
error: Failed to start domain vm01
error: internal error process exited while connecting to monitor: Could not access KVM kernel module: No such file or directory
failed to initialize KVM: No such file or directory
No accelerator found!
Fix

The message means QEMU could not find the kvm kernel module during initialization. Make sure the kernel has the KVM module available and that CPU VT is enabled in the firmware.

[root@node1 ~]# modprobe kvm    # load the kvm module
Reboot, enter the BIOS, and set the virtualization option under the advanced (CPU) settings to Enabled.
[root@node1 ~]# lsmod | grep kvm    # show loaded modules
kvm_intel              54394  3
kvm                   317536  1 kvm_intel

VM migration errors

Error:
[root@node1 ~]# virsh migrate --live 1 qemu+tcp://node2 --p2p --tunnelled --unsafe
error: operation failed: Failed to connect to remote libvirt URI qemu+tcp://node2
Fix

Append /system to the URI; 'system' corresponds to root-level access.


Error:
[root@node1 ~]# virsh migrate --live 2 qemu+tcp://node2/system --p2p --tunnelled
error: Unsafe migration: Migration may lead to data corruption if disks use cache != none
Fix

Add the --unsafe flag and migrate again.


Error:
[root@node1 ~]# virsh migrate --live 2 qemu+tcp://192.168.0.121/system --p2p --tunnelled --unsafe
error: Timed out during operation: cannot acquire state change lock
Fix

This error sometimes also occurs when starting a VM; restart the libvirtd process.


Error:
[root@node1 ~]# virsh migrate 5 --live qemu+tcp://node2/system
error: Unable to read from monitor: Connection reset by peer
Fix

Check that vncserver_listen in OpenStack's nova.conf is configured correctly.


Error:
error: internal error Attempt to migrate guest to the same host 00020003-0004-0005-0006-000700080009
Fix

Check whether the two nodes report the same system UUID; if they do, adjust the libvirt configuration. View it with:

[root@controller1 ~]# dmidecode -s system-uuid
63897446-817B-0010-B604-089E01B33744

In /etc/libvirt/libvirtd.conf the host_uuid line is commented out; uncomment it and change its value.

On each of the two machines, replace the host_uuid value with the output of cat /proc/sys/kernel/random/uuid.
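A sketch of producing a ready-to-paste line from that kernel UUID source (run it separately on each node so the values differ; Linux assumed):

```shell
# Produce a host_uuid line for /etc/libvirt/libvirtd.conf.
uuid=$(cat /proc/sys/kernel/random/uuid)
echo "host_uuid = \"$uuid\""
```

Restart libvirtd afterwards so the new UUID takes effect.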


ssh-agent does not start automatically

On CentOS, ssh-agent does not start automatically; it can be launched from a startup script at /etc/profile.d/ssh-agent.sh.

[root@test ~]# vim /etc/profile.d/ssh-agent.sh
#!/bin/sh
if [ -f ~/.agent.env ]; then
    . ~/.agent.env >/dev/null
    if ! kill -0 $SSH_AGENT_PID >/dev/null 2>&1; then
        echo "Stale agent file found. Spawning new agent..."
        eval `ssh-agent | tee ~/.agent.env`
        ssh-add
    fi
else
    echo "Starting ssh-agent..."
    eval `ssh-agent | tee ~/.agent.env`
    ssh-add
fi

Stale hostnames in nova-manage service list, shown with state XXX: how to delete them

Delete the matching rows (by the binary column) from the services table in the nova database. Note the foreign-key constraint: set the deleted column to 1 in both the services and compute_nodes tables.
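A sketch of that cleanup as SQL (the hostname 'dead-node' is a placeholder; double-check with a SELECT first, then run the statements against the nova database):

```shell
# Print the statements that soft-delete a stale host's rows, observing the
# compute_nodes -> services foreign key ordering described above.
cat <<'SQL'
UPDATE compute_nodes SET deleted = 1
  WHERE service_id IN (SELECT id FROM services WHERE host = 'dead-node');
UPDATE services SET deleted = 1 WHERE host = 'dead-node';
SQL
```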


 

