Notes on an OpenStack VM boot failure caused by a Ceph problem


Environment: a private cloud running OpenStack Rocky deployed with kolla; storage is Ceph deployed with ceph-ansible.

1. Some VMs in OpenStack failed to boot.

2. The console log of a failing VM showed the mount operation hanging. At the same time an alert came in that ceph osd 522 was down, so the suspicion was that the down OSD had broken the VM's volume mount. But the ceph volumes pool has 3 replicas, so losing a single OSD should not make an OpenStack volume mount fail. A quick check of what osd 522 actually serves is sketched after the log excerpt below.

...
[ 1323.307101] INFO: task mount:1808 blocked for more than 120 seconds.
[ 1323.310831] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1323.315932] mount           D ffff95a2db265140     0  1808      1 0x00000004
...
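To narrow down whether osd 522 really held data these VMs depend on, a few checks like the following can be run from a monitor node (a sketch; exact output formats vary between Ceph releases):

# which host carries osd.522, and is it really marked down?
ceph osd find 522
ceph osd tree | grep -w 522
# PGs whose up/acting set includes osd 522; the number before the dot in the pgid is the pool id
ceph pg ls-by-osd 522
# cross-check those pool ids against the pool list
ceph osd lspools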

3. Checked the volume used by this VM and found the volumes_attached field empty. The VM had been created automatically by a developer using the Go OpenStack SDK (a quick way to distinguish boot-from-volume from an ephemeral root disk is sketched after the output below).

[root@ansible002 openstack]# openstack server show nifi-prometheus
...                      |
| volumes_attached                    | id='38a7340c-41e8-469d-8f42-8b3d938b96c6'       ...
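A rough way to tell whether such an instance boots from a Cinder volume (in the volumes pool) or from a Nova ephemeral disk (in the vms pool), with <instance-uuid> standing in for the real id:

# boot-from-volume instances normally show an empty image field and a non-empty volumes_attached;
# an ephemeral root disk shows up as an <instance-uuid>_disk rbd image in the vms pool
openstack server show <instance-uuid> -c id -c image -c volumes_attached
rbd ls vms | grep <instance-uuid>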

4. Looking at the ceph vms pool, it contained far more rbd images than expected (a way to map these image names back to instance names is sketched after the listing):

[root@mtr01 ~]# rbd ls vms
06fcb738-ac26-444e-88fc-002613ea6ae7_disk
0f207373-6ad9-417d-9991-ada1ea0f3890_disk
2629e5c3-9975-4b5d-b16f-a13a3b5e0045_disk
2f500940-a463-49e6-b2de-2e5faa808f4c_disk
3be7be75-c3dd-48a4-a70a-37481500cc79_disk
3dd20d6c-7fcc-4693-8a36-b7bcc752781e_disk
45ccadd9-e45c-4129-9169-4635ed9d3a55_disk
4969f198-9970-4ef2-9e87-4a7f5e607b1a_disk
502e0067-504b-4ddc-aee0-3511d1df1a0a_disk
71c03292-6dc2-45aa-8ab3-e94332456134_disk
73c89b93-0e32-45dd-9cda-ed78d5b75b90_disk
933117bb-7d01-4a42-9048-1b1f3cca7007_disk
95773b3f-0d63-46e4-ae38-fbe343ea1b39_disk
9a123ef5-8ed0-4ac7-ada7-7985945d0536_disk
9d941359-ee46-4ef2-b0fd-66b0a097e90b_disk
a024c9ce-b9e4-46d0-bcb4-c8fcbd24f9d4_disk
a1d49186-e691-4d62-a1f9-544e32b75f50_disk
d0e2a25e-e8e3-47fc-8c71-24610299f910_disk
ee4cb3f5-4aac-4cb2-89ec-8c83e3f8df1f_disk
f6610ca9-9892-4065-af69-2e68858f9133_disk
fa204342-61a8-4a58-82f1-a875626b91fd_disk
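The image names are instance UUIDs with a _disk suffix, so they can be mapped back to instance names with the openstack client; a sketch (assumes admin credentials that can see instances in all projects):

for img in $(rbd ls vms); do
    # strip the _disk suffix to recover the instance UUID
    uuid=${img%_disk}
    echo -n "$uuid -> "
    openstack server show "$uuid" -c name -f value
done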

5. The Go OpenStack SDK apparently placed these instances' disks in the vms pool. That pool has a replica count of 1, and some of the PGs holding the disks of the VMs that failed to boot happened to sit on the down osd 522, so their only copy became unavailable (a way to verify this mapping is sketched after the pool dump below).

[root@mtr01 ~]# ceph osd dump | grep pool
pool 1 'images' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 15046 lfor 0/645 flags hashpspool stripe_width 0 expected_num_objects 1 application rbd
pool 2 'volumes' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 4096 pgp_num 4096 last_change 15038 lfor 0/609 flags hashpspool stripe_width 0 expected_num_objects 1 application rbd
pool 3 'backups' replicated size 1 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 10520 flags hashpspool stripe_width 0 expected_num_objects 1 application rbd
pool 4 'vms' replicated size 1 min_size 1 crush_rule 0 object_hash rjenkins pg_num 4096 pgp_num 4096 last_change 15382 lfor 0/725 flags hashpspool stripe_width 0 expected_num_objects 1 application rbd
...
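To verify that a given image in the size-1 vms pool really had objects on osd 522, its objects can be mapped to PGs and OSDs; a sketch, with <instance-uuid> as a placeholder (listing all objects in a large pool can be slow):

# the object name prefix used by this rbd image
rbd info vms/<instance-uuid>_disk | grep block_name_prefix
# map a handful of its objects to PG and OSD
rados -p vms ls | grep <block_name_prefix> | head -5 | while read obj; do
    ceph osd map vms "$obj"
done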

6. Recreate the problematic PGs (use with caution: make sure you can accept losing the data in these PGs; the data in the affected VMs is lost and those VMs have to be rebuilt).

[root@mtr01 ~]# ceph health detail
HEALTH_ERR Reduced data availability: 7 pgs stale; 28 stuck requests are blocked > 4096 sec. Implicated osds 24,117,123,209,510
PG_AVAILABILITY Reduced data availability: 7 pgs stale
    pg 4.308 is stuck stale for 104354.280352, current state stale+active+clean, last acting [522]
    pg 4.82d is stuck stale for 104354.279537, current state stale+active+clean, last acting [522]
    pg 4.b31 is stuck stale for 104354.279306, current state stale+active+clean, last acting [522]
    pg 4.cab is stuck stale for 104354.279203, current state stale+active+clean, last acting [522]
    pg 4.f8f is stuck stale for 104354.280896, current state stale+active+clean, last acting [522]
    pg 4.ff7 is stuck stale for 104354.280929, current state stale+active+clean, last acting [522]
    pg 10.c8 is stuck stale for 104354.280195, current state stale+active+clean, last acting [522]
REQUEST_STUCK 28 stuck requests are blocked > 4096 sec. Implicated osds 24,117,123,209,510
    17 ops are blocked > 134218 sec
    1 ops are blocked > 67108.9 sec
    2 ops are blocked > 33554.4 sec
    8 ops are blocked > 16777.2 sec
    osd.117 has stuck requests > 33554.4 sec
    osd.510 has stuck requests > 67108.9 sec
    osds 24,123,209 have stuck requests > 134218 sec
[root@mtr01 ~]# ceph pg 10.c8 query
Error ENOENT: i don't have pgid 10.c8
[root@mtr01 ~]# ceph osd force-create-pg 4.308
pg 4.308 now creating, ok

Then repeat ceph osd force-create-pg 4.82d and so on for each of the remaining stale PGs.
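Instead of typing each pgid by hand, the stale PGs can be pulled out of ceph health detail and recreated in a loop (a sketch; review the extracted list first, because force-create-pg irreversibly discards whatever data those PGs held):

for pg in $(ceph health detail | awk '/is stuck stale for/ {print $2}'); do
    echo "recreating $pg"
    ceph osd force-create-pg "$pg"
done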

[root@mtr01 ~]# ceph health detail
HEALTH_OK

7. Set the replica count of the vms pool to 3.

[root@mtr01 ~]# ceph osd pool set vms size 3
set pool 4 size to 3
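Raising size from 1 to 3 makes Ceph backfill two extra copies of every object in the pool, so expect recovery traffic. It is probably also worth bumping min_size, and the backups pool in the dump above is size 1 as well; a sketch, where min_size 2 is an assumption rather than something from the incident:

# stop accepting writes once fewer than 2 copies are available (assumed policy, adjust to your durability needs)
ceph osd pool set vms min_size 2
# the backups pool is also size 1 and could be handled the same way
# ceph osd pool set backups size 3
# watch the backfill triggered by the size change
ceph -s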