Copyright notice: This article may be freely reproduced, but please clearly indicate the original source, the author information and this copyright notice in the form of a hyperlink. (Author: 张华, published: 2018-08-30)
Problem
cinder-volume is very slow when deleting volumes, and while that is happening other operations such as creating volumes get no response and produce no logs.
Theory
The idea behind green threads is that they do not share data with each other: each green thread has its own private data objects, and they are non-blocking - when one green thread's I/O is not ready, it yields and another green thread carries on. This makes it possible to run a large number of non-blocking green threads efficiently inside a single process.
So the philosophy of green threads is not to share data objects. Green threads can, however, share objects through tpool.Proxy; for example, the eventlet.pools.Pool mechanism can be used to build a pool of httplib2.Http instances that is shared, to a degree, between different green threads (see a blog post I wrote five years ago - https://blog.csdn.net/quqi99/article/details/9114577 ). This kind of sharing has one problem though: when a native thread in the pool raises an exception without returning explicitly, it seems the native thread cannot yield, and as a result all green threads end up blocked as well. The test program below demonstrates this (a sketch of the eventlet.pools.Pool pattern follows it):
import thread
import eventlet
import time
orig = time
from eventlet import tpool
eventlet.monkey_patch()

class MyException(Exception):
    pass

class FOO(object):
    def foo(self, char, starting_ident):
        id = thread.get_ident()
        print "native {} exec foo({})".format(char, id)
        try:
            raise MyException()
        finally:
            # REPLACE with pass to reproduce failure
            # return
            pass

def stuff(char):
    print "entering green thread"
    while True:
        print "green exec foo({})".format(char)
        f = tpool.Proxy(FOO())
        f.foo(char, thread.get_ident())
        print "green finished foo({})".format(char)
        time.sleep(1)

if __name__ == "__main__":
    g = eventlet.greenthread.spawn(stuff, 'A')
    g = eventlet.greenthread.spawn(stuff, 'B')
    print "done"
    while True:
        time.sleep(1)
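For reference, the eventlet.pools.Pool sharing mentioned above looks roughly like the following - a minimal sketch, assuming httplib2 is installed; the HttpPool name, pool size and URL are only illustrative:

import eventlet
eventlet.monkey_patch()

import httplib2
from eventlet import pools

class HttpPool(pools.Pool):
    # Each pool slot lazily creates its own httplib2.Http instance, so a
    # bounded number of Http objects is shared by all green threads.
    def create(self):
        return httplib2.Http(timeout=10)

http_pool = HttpPool(max_size=4)

def fetch(url):
    # item() checks an Http instance out of the pool and puts it back on
    # exit, so two green threads never use the same instance at once.
    with http_pool.item() as http:
        resp, _content = http.request(url, 'GET')
        return resp.status

pile = eventlet.GreenPile()
for u in ['http://example.com'] * 4:
    pile.spawn(fetch, u)
print(list(pile))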
The rados.Rados instance comes from the Ceph Python bindings (the rados module shipped alongside python-rbd), and it in turn spawns native threads to connect to the RADOS cluster.
If every green thread instantiates its own rados.Rados and thereby starts a native thread, those native threads synchronize data through the Python interpreter process, and native threads are not non-blocking. So once one native thread is running a long task without yielding, the other green threads never get a chance to run, and at that point no operation can be performed on any image. That is why this patch (https://review.openstack.org/#/c/175555/ ) introduced tpool.Proxy.
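Conceptually, what that patch did was wrap the blocking RADOS client in tpool.Proxy so that its method calls run in eventlet's pool of native worker threads instead of blocking the hub. A minimal sketch of the idea, not the actual patch; the helper name, rados_id and pool name are made up:

import eventlet
eventlet.monkey_patch()

import rados
from eventlet import tpool

def connect_to_rados_proxied(conffile='/etc/ceph/ceph.conf', pool='volumes'):
    # Method calls on the proxy are executed in eventlet's native worker
    # threads, so the calling green thread yields while librados blocks
    # inside connect() / open_ioctx().
    client = tpool.Proxy(rados.Rados(rados_id='cinder', conffile=conffile))
    client.connect()
    # Results returned by a proxied call are not proxied automatically,
    # so the ioctx is wrapped again explicitly.
    ioctx = tpool.Proxy(client.open_ioctx(pool))
    return client, ioctx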
But this patch (https://review.openstack.org/#/c/197710/ ) reverted it, on the grounds that having the spawned non-blocking native threads import Python modules can cause a deadlock ("According to Python documentation, code can lead to a deadlock if the spawned thread directly or indirectly attempts to import a module."). So for _connect_to_rados the code went back to the old approach where every green thread instantiates its own rados.Rados and thereby starts a native thread, but the earlier problem can be mitigated by configuring rados_connect_timeout (if a green thread still has not connected when the timeout expires, it gives up and yields so that the other green threads can run).
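After the revert, the shape of the per-green-thread connection with the rados_connect_timeout mitigation is roughly the following - a sketch only, not the real _connect_to_rados; conffile, rados_id and pool are illustrative:

import rados

def connect_to_rados(conffile='/etc/ceph/ceph.conf', pool='volumes', timeout=5):
    # Each green thread builds its own client; librados spawns native
    # threads behind the scenes. With a non-negative timeout, a connection
    # attempt that cannot finish within `timeout` seconds fails instead of
    # blocking indefinitely, so the green thread gives up and the others
    # get a chance to run.
    client = rados.Rados(rados_id='cinder', conffile=conffile)
    if timeout >= 0:
        client.connect(timeout=timeout)
    else:
        client.connect()
    try:
        ioctx = client.open_ioctx(pool)
    except rados.Error:
        client.shutdown()
        raise
    return client, ioctx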
However, only the _connect_to_rados path was switched back to the old approach; other code still goes through tpool, for example _get_usage_info:
def RBDProxy(self):
    return tpool.Proxy(self.rbd.RBD())

def _get_usage_info(self):
    total_provisioned = 0
    with RADOSClient(self) as client:
        for t in self.RBDProxy().list(client.ioctx):
            with RBDVolumeProxy(self, t, read_only=True) as v:
                ...

class RBDVolumeProxy(object):
    def __init__(self, driver, name, pool=None, snapshot=None,
                 read_only=False, remote=None, timeout=None):
        client, ioctx = driver._connect_to_rados(pool, remote, timeout)
        if snapshot is not None:
            snapshot = utils.convert_str(snapshot)
        try:
            self.volume = driver.rbd.Image(ioctx,
                                           utils.convert_str(name),
                                           snapshot=snapshot,
                                           read_only=read_only)
            self.volume = tpool.Proxy(self.volume)
        except driver.rbd.Error:
            LOG.exception("error opening rbd image %s", name)
            driver._disconnect_from_rados(client, ioctx)
            raise
From the code above:
- RBDProxy wraps rbd.RBD in tpool.Proxy, so its calls run in a non-blocking native thread.
- RBDVolumeProxy likewise wraps the rbd.Image it opens in tpool.Proxy, so image operations also run in a non-blocking native thread.
- _get_usage_info runs periodically; if a volume is deleted while it is running, RBDVolumeProxy may no longer find the image and reports the error below: Image volume-1f3aa3d5-5639-4a68-be07-14f3214320c6 is not found. _get_usage_info /usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py
- Remember that RBDVolumeProxy runs in a native thread; when an ImageNotFound exception is raised inside it without an explicit return, we hit the green-thread problem described above. That green thread is blocked; the other green threads can still yield normally, but one by one they hit the same ImageNotFound exception as well, so the number of blocked green threads keeps growing. (To verify this, add another green thread running an endless loop to the test program above - see the sketch after this list - and you will see that the stuck green threads do not affect the one running the endless loop.)
- The best workaround is to set rbd_exclusive_cinder_pool=True so that _get_usage_info above is not called (see the cinder.conf example below).
- I filed a bug for this - https://bugs.launchpad.net/cinder/+bug/1789828
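As mentioned in the list above, this is the kind of extra green thread that can be added to the test program to verify the behaviour - a minimal sketch; heartbeat() is a made-up name and it deliberately never touches the tpool proxy:

def heartbeat():
    # Only sleeps and prints, never calls into tpool, so it keeps running
    # even after the 'A' and 'B' green threads get stuck behind the broken
    # native thread.
    while True:
        print "heartbeat still alive"
        time.sleep(1)

eventlet.greenthread.spawn(heartbeat)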
The relevant log:
2018-08-29 06:57:41.604 1586622 DEBUG cinder.volume.drivers.rbd [req-f885ea4a-3e98-4aad-bb3f-102240aebca1 10cbcda6a7854fa79cfc37dc1945cb6d 5d5d0f0ab738467f8ca813dd41432afa - a51502c6e125414fbba0cc95decd86c5 a51502c6e125414fbba0cc95decd86c5] deleting rbd volume volume-2d5951be-25bf-4313-a706-593664f6cd2e delete_volume /usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py:977
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd [req-f885ea4a-3e98-4aad-bb3f-102240aebca1 10cbcda6a7854fa79cfc37dc1945cb6d 5d5d0f0ab738467f8ca813dd41432afa - a51502c6e125414fbba0cc95decd86c5 a51502c6e125414fbba0cc95decd86c5] error opening rbd image volume-005a04e7-a113-4ebb-bd77-1d9d3221d8f2: ImageNotFound: [errno 2] error opening image volume-005a04e7-a113-4ebb-bd77-1d9d3221d8f2 at snapshot None
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd Traceback (most recent call last):
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd File "/usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py", line 147, in __init__
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd read_only=read_only)
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd File "rbd.pyx", line 1392, in rbd.Image.__init__ (/build/ceph-B2ToPL/ceph-12.2.4/obj-x86_64-linux-gnu/src/pybind/rbd/pyrex/rbd.c:13545)
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd ImageNotFound: [errno 2] error opening image volume-005a04e7-a113-4ebb-bd77-1d9d3221d8f2 at snapshot None
2018-08-29 06:57:41.610 1586622 ERROR cinder.volume.drivers.rbd
2018-08-29 06:57:41.612 1586622 DEBUG cinder.volume.drivers.rbd [req-f885ea4a-3e98-4aad-bb3f-102240aebca1 10cbcda6a7854fa79cfc37dc1945cb6d 5d5d0f0ab738467f8ca813dd41432afa - a51502c6e125414fbba0cc95decd86c5 a51502c6e125414fbba0cc95decd86c5] Image volume-005a04e7-a113-4ebb-bd77-1d9d3221d8f2 is not found. _get_usage_info /usr/lib/python2.7/dist-packages/cinder/volume/drivers/rbd.py:409
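The workaround from the list above would look roughly like this in cinder.conf - the [ceph] backend section name and the timeout value are only examples, use whatever your RBD backend section is called:

[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
# The pool is used exclusively by Cinder, so the driver can track
# provisioned capacity itself and skip the periodic _get_usage_info scan.
rbd_exclusive_cinder_pool = True
# Optional: bound how long a single RADOS connection attempt may block.
rados_connect_timeout = 5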
Test
1, A script to create 100 volumes
#!/bin/bash -eu
. ~/stsstack-bundles/novarcv3_project
openstack project list| grep " admin "| awk '{print $2}'| xargs -l openstack quota set --volumes 200
TOKEN="`openstack token issue| grep ' id '| awk '{print $4}'`"
c_ep="`curl -s -XGET -H "X-Auth-Token: $TOKEN" "$OS_AUTH_URL/auth/catalog"| jq --raw-output '.catalog[] | select(.name | contains("cinderv3")).endpoints[] | select(.interface | contains("admin")).url'`"
echo "Cinder endpoint is $c_ep"
for i in {1..100}; do
(payload="`cat << EOF | python | sed 's/"/\\"/g'
import json
vol = {"volume": { "size": 1, "name": "vol"}}
print json.dumps(vol)
EOF`"
curl -s -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" -d "$payload" ${c_ep}/volumes) &
done
2, Delete these 100 volumes
openstack volume list| egrep -v "^\+-+|ID"| awk '{print $2}'| xargs openstack volume delete
3, On the ceph-mon node, watch the number of volumes
watch -n 1 'rbd -p cinder-ceph ls| wc -l'
4, On the cinder-volume node, watch the number of threads under the cinder-volume process
watch -n 1 'ps -eLf| egrep "[c]inder-volume"| wc -l'