swift stat
fails with ('Connection aborted.', error(111, 'Connection refused'))
The log shows:
Feb 24 14:24:52 datanode2 abrt: detected unhandled Python exception in '/usr/bin/swift-proxy-server'
Feb 24 14:24:52 datanode2 abrtd: Directory 'pyhook-2016-02-24-14:24:52-5774' creation detected
Feb 24 14:24:52 datanode2 abrt-server[5891]: Saved Python crash dump of pid 5774 to /var/spool/abrt/pyhook-2016-02-24-14:24:52-5774
Feb 24 14:24:52 datanode2 abrtd: Package 'openstack-swift-proxy' isn't signed with proper key
Feb 24 14:24:52 datanode2 abrtd: 'post-create' on '/var/spool/abrt/pyhook-2016-02-24-14:24:52-5774' exited with 1
Feb 24 14:24:52 datanode2 abrtd: Deleting problem directory '/var/spool/abrt/pyhook-2016-02-24-14:24:52-5774'
Feb 24 14:24:55 datanode2 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team
Feb 24 14:25:06 datanode2 polkitd[6436]: started daemon version 0.96 using authority implementation `local' version `0.96'
Feb 24 14:25:07 datanode2 rtkit-daemon[6447]: Sucessfully made thread 6445 of process 6445 (/usr/bin/pulseaudio) owned by '42' high priority at nice level -11.
Feb 24 14:25:07 datanode2 pulseaudio[6445]: pid.c: Stale PID file, overwriting.
Checking the service shows that the process is dead:
[root@datanode2 ~]# service openstack-swift-proxy status
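openstack-swift-proxy dead but pid file exists
Removing ceilometer from the pipeline in /etc/swift/proxy-server.conf and restarting openstack-swift-proxy resolves the problem.
But what if we do want ceilometer in the pipeline?
With ceilometer in the pipeline, swift-init all restart prints the following: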
Traceback (most recent call last):
  File "/usr/bin/swift-proxy-server", line 23, in <module>
    sys.exit(run_wsgi(conf_file, 'proxy-server', default_port=8080, **options))
  File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 389, in run_wsgi
    loadapp(conf_path, global_conf=global_conf)
  File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 323, in loadapp
    return ctx.create()
  File "/usr/lib/python2.6/site-packages/paste/deploy/loadwsgi.py", line 710, in create
    return self.object_type.invoke(self)
  File "/usr/lib/python2.6/site-packages/paste/deploy/loadwsgi.py", line 207, in invoke
    app = filter(app)
  File "/usr/lib/python2.6/site-packages/ceilometer/objectstore/swift_middleware.py", line 194, in ceilometer_filter
    return CeilometerMiddleware(app, conf)
  File "/usr/lib/python2.6/site-packages/ceilometer/objectstore/swift_middleware.py", line 79, in __init__
    service.prepare_service([])
  File "/usr/lib/python2.6/site-packages/ceilometer/service.py", line 146, in prepare_service
    cfg.CONF(argv[1:], project='ceilometer')
  File "/usr/lib/python2.6/site-packages/oslo/config/cfg.py", line 1638, in __call__
    raise ConfigFilesNotFoundError(self._namespace.files_not_found)
oslo.config.cfg.ConfigFilesNotFoundError: Failed to read some config files: /usr/share/ceilometer/ceilometer-dist.conf,/etc/ceilometer/ceilometer.conf
Fix: run the following command on every directory named in the error:
chmod -R 777 /usr/share/ceilometer/
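Before loosening permissions, it may help to confirm which of the files in the error are actually missing or unreadable. The helper below is a minimal sketch: check_readable is a hypothetical name, not part of swift or ceilometer, and it only tests access for the user who runs it, so it should be run as the account the proxy server runs under (which account that is depends on the installation).

```shell
# Hypothetical helper: report, for each path given, whether it is
# missing, unreadable by the current user, or fine.
check_readable() {
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
        elif [ ! -r "$f" ]; then
            echo "unreadable: $f"
        else
            echo "ok: $f"
        fi
    done
}

# Example call with the two files named in the ConfigFilesNotFoundError:
#   check_readable /usr/share/ceilometer/ceilometer-dist.conf /etc/ceilometer/ceilometer.conf
```

If only read access turns out to be missing, a narrower change than 777 may be enough, since 777 also grants write access to every user.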
Error: couldn't connect to server 127.0.0.1:27017 at src/mongo/shell/mongo.js
Solution:
Use --host to specify the address that matches bind_ip in /etc/mongodb.conf.
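The lookup of bind_ip can be scripted. This is a rough sketch that assumes the old key = value style of /etc/mongodb.conf with a single address in bind_ip; mongo_bind_ip is a hypothetical helper name, and a comma-separated list of addresses would be printed as-is and would need to be split by hand.

```shell
# Hypothetical helper: print the value of the first uncommented
# "bind_ip = ..." line in the config file given as $1.
mongo_bind_ip() {
    sed -n 's/^[[:space:]]*bind_ip[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" | head -n 1
}

# Example: connect to whatever address mongod is bound to.
#   mongo --host "$(mongo_bind_ip /etc/mongodb.conf)"
```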
ceilometer does not pick up image.download
Adding control_exchange=glance to /etc/glance/glance-api.conf and restarting the glance-api service resolves the problem.
ERROR: The server has either erred or is incapable of performing the requested operation. (HTTP 500)
Solution:
On the controller node, comment out the security_group_api line in /etc/nova/nova.conf:
firewall_driver = nova.virt.firewall.NoopFirewallDriver
#security_group_api = neutron
Restart all nova services on the controller node:
service openstack-nova-api restart
service openstack-nova-cert restart
service openstack-nova-consoleauth restart
service openstack-nova-scheduler restart
service openstack-nova-conductor restart
service openstack-nova-novncproxy restart
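The six restarts above can be wrapped in a loop. This is only a convenience sketch: restart_nova_services is a hypothetical function name, and the service list is copied from the commands above, so it should be adjusted if a node runs a different set of nova services.

```shell
# Hypothetical helper: restart the controller's nova services in the
# same order as listed above, stopping at the first failure.
restart_nova_services() {
    for svc in api cert consoleauth scheduler conductor novncproxy; do
        service "openstack-nova-$svc" restart || return 1
    done
}

# Example:
#   restart_nova_services
```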
Instances cannot ping each other
Solution:
+------------------+--------------------------------------------------+
| Field            | Value                                            |
+------------------+--------------------------------------------------+
| allocation_pools | {"start": "192.168.2.2", "end": "192.168.2.254"} |
| cidr             | 192.168.2.0/24                                   |
| dns_nameservers  |                                                  |
| enable_dhcp      | True                                             |
| gateway_ip       | 192.168.2.1                                      |
| host_routes      |                                                  |
| id               | 552cf391-c30c-440c-ac27-5962fcd0d6a1             |
| ip_version       | 4                                                |
| name             | demo-subnet                                      |
| network_id       | 22bec80b-f9dd-40f0-9d29-9d8fd36c50e5             |
| tenant_id        | e8e3b94c7d0c45578c9fbbf0d341e5a2                 |
+------------------+--------------------------------------------------+
source demo-openrc.sh
nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
nova secgroup-add-rule default tcp 22 22 0.0.0.0/0
Then run the following on the network node:
sudo iptables -t nat -A POSTROUTING -o eth2 -j MASQUERADE
Here eth2 is the interface I use to reach the external network.
That did not seem to help, so next I ran this inside the VM:
sudo route add -net 192.168.2.0 gw 192.168.2.1 netmask 255.255.255.0 dev eth0
It failed with SIOCADDRT: No such process
According to https://support.symantec.com/en_US/article.TECH142841.html, this error usually has one of two causes:
There are multiple known causes for this error:
- You attempted to set a route specific to an interface which was not up at the time you ran the command.
- You attempted to set a route for a network before setting a host route for the gateway which handles traffic for that network.
ifconfig shows that eth0 is up, so my first guess was the second cause.
I ran sudo route add -host 192.168.2.1 dev eth0
and then re-ran sudo route add -net 192.168.2.0 gw 192.168.2.1 netmask 255.255.255.0 dev eth0
It still made no difference.
But when I restarted eth0, I noticed this:
udhcpc(v1.20.1) started
sending discover...
sending discover...
sending discover...
No lease, failing
neutron agent-list
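shows nothing
Solution:
Nothing is listed when the command is run on the controller node, but the agents are visible when it is run on the network node.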
Feb 29 15:02:26 datanode2 object-expirer: STDOUT: ERROR:root:Error connecting to memcached: 127.0.0.1:11211#012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/swift/common/memcached.py", line 239, in _get_conns#012 fp, sock = self._client_cache[server].get()#012 File "/usr/lib/python2.6/site-packages/swift/common/memcached.py", line 135, in get#012 fp, sock = self.create()#012 File "/usr/lib/python2.6/site-packages/swift/common/memcached.py", line 128, in create#012 sock.connect((host, int(port)))#012 File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 192, in connect#012 socket_checkerr(fd)#012 File "/usr/lib/python2.6/site-packages/eventlet/greenio.py", line 46, in socket_checkerr#012 raise socket.error(err, errno.errorcode[err])#012error: [Errno 111] ECONNREFUSED (txn: tx998ec7d3f8704fd6bb70b-0056d3ed02)
Feb 29 15:02:26 datanode2 object-expirer: Pass completed in 0s; 0 objects expired
nova list fails with:
ERROR: An unexpected error prevented the server from fulfilling your request. (HTTP 500)
/var/log/messages shows kernel: nf_conntrack: table full, dropping packet.
Then, in /var/log/keystone/keystone.log, I found:
2016-04-23 17:37:15.721 5861 TRACE keystone.common.wsgi OperationalError: (OperationalError) (2003, "Can't connect to MySQL server on 'controller' (111)") None None
Running sysctl net.netfilter.nf_conntrack_count
shows
net.netfilter.nf_conntrack_count = 65536
Running sysctl net.netfilter.nf_conntrack_max
shows
net.netfilter.nf_conntrack_max = 65536
In other words, the number of tracked connections has reached the limit. Why is that?
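To keep an eye on how close the table is to the limit, the two values can be turned into a percentage. The function below is a small sketch with a hypothetical name (conntrack_usage_pct); it takes the count and the maximum as arguments and uses integer arithmetic, so the result is rounded down.

```shell
# Hypothetical helper: print conntrack table usage as a whole percentage.
# $1 = current entry count, $2 = maximum number of entries.
conntrack_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Example with live values (sysctl -n prints only the value):
#   conntrack_usage_pct "$(sysctl -n net.netfilter.nf_conntrack_count)" \
#                       "$(sysctl -n net.netfilter.nf_conntrack_max)"
```

With the values shown above (65536 and 65536) this prints 100, which matches the "table full" message in /var/log/messages.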
Logging in to mysql from the local machine works, but running the following command fails:
mysql> select database();
No connection. Trying to reconnect...
Connection id: 2
Current database: keystone
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111)
ERROR:
Can't connect to the server
The 111 here matches the 111 in keystone.log.
After changing mysql.sock to mysql2.sock in /etc/my.cnf, I got this error:
[root@datanode2 ~]# mysql -uroot -pzhangchen -Dkeystone
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
After changing it back to mysql.sock, it failed again:
mysql> show tables;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
ERROR:
Can't connect to the server