Ceph CRUSH rules in practice and handling the "PGs unknown" problem

Problem description:

[root@ceph-mon01 ~]# ceph -s

  cluster:

    id:     92d4f66b-94a6-4c40-8941-734f3c44eb4f

    health: HEALTH_ERR

            1 filesystem is offline

            1 filesystem is online with fewer MDS than max_mds

            1 pools have many more objects per pg than average

            Reduced data availability: 256 pgs inactive

  services:

    mon: 3 daemons, quorum ceph-mon01,ceph-mon03,ceph-mon02 (age 5d)

    mgr: ceph-mon03(active, since 5d), standbys: ceph-mon02, ceph-mon01

    mds: cephfs:0

    osd: 9 osds: 9 up (since 43h), 9 in (since 43h); 224 remapped pgs

    rgw: 1 daemon active (ceph-mon01)

  task status:

  data:

    pools:   9 pools, 480 pgs

    objects: 34.60k objects, 8.5 GiB

    usage:   128 GiB used, 142 GiB / 270 GiB avail

    pgs:     172995/103797 objects misplaced (166.667%)

             256 unknown

             224 active+clean+remapped

Resolution process

ceph health detail

...

PG_AVAILABILITY Reduced data availability: 1024 pgs inactive

    pg 4.3c8 is stuck inactive for 246794.767182, current state unknown, last acting []

    pg 4.3ca is stuck inactive for 246794.767182, current state unknown, last acting []
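Before touching the CRUSH map, it can help to confirm which PGs are stuck and to look at one of them directly. A minimal sketch (the pg id 4.3c8 is taken from the health detail above; while a PG is in the unknown state with an empty acting set, the query may return little or nothing useful):

# list all PGs that are stuck in an inactive state
ceph pg dump_stuck inactive

# query one of the stuck PGs listed above
ceph pg 4.3c8 query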

1. Check the OSD tree (note that there are two possible replica entry points here: datacenter0 and the root default)

[root@ceph-mon01 ~]# ceph osd tree

ID  CLASS WEIGHT  TYPE NAME                   STATUS REWEIGHT PRI-AFF

 -9       0.26367 datacenter datacenter0                             

-10       0.26367     room room0                                     

-11       0.08789         rack rack0                                 

 -3       0.08789             host ceph-osd01                        

  0   hdd 0.02930                 osd.0           up  1.00000 1.00000

  1   hdd 0.02930                 osd.1           up  1.00000 1.00000

  7   hdd 0.02930                 osd.7           up  1.00000 1.00000

-12       0.08789         rack rack1                                 

 -1             0 root default                                   
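To confirm that there really are two separate entry points, datacenter0 carrying all of the weight and an empty root default, the CRUSH hierarchy can also be printed on its own; a quick sketch:

# print the CRUSH hierarchy; all hosts sit under datacenter0, root default has weight 0
ceph osd crush tree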

2. View the crushmap (there is only one crush rule, id 0, and its replica entry point is default)

[root@ceph-mon01 ~]# ceph osd getcrushmap -o test.bin

33

[root@ceph-mon01 ~]# crushtool -d test.bin -o test.txt

[root@ceph-mon01 ~]# cat test.txt

# begin crush map

tunable choose_local_tries 0

tunable choose_local_fallback_tries 0

…..

# rules

rule replicated_rule {

        id 0

        type replicated

        min_size 1

        max_size 10

        step take default

        step chooseleaf firstn 0 type host

        step emit

}
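The same rule can also be inspected without decompiling the map, using the crush rule subcommands; a short sketch:

# list all crush rules by name
ceph osd crush rule ls

# dump a single rule as JSON, including its take/chooseleaf/emit steps
ceph osd crush rule dump replicated_rule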

3. List the existing pools

[root@ceph-mon01 ~]# ceph osd pool ls

.rgw.root

default.rgw.control

default.rgw.meta

default.rgw.log

default.rgw.buckets.index

default.rgw.buckets.non-ec

default.rgw.buckets.data

cephfs_data

cephfs_metadata

4. Check which crush_rule the existing pools use (in this example they all use crush_rule 0)

[root@ceph-mon01 ~]# ceph osd dump |grep crush_rule

pool 1 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 47 flags hashpspool stripe_width 0 application rgw

pool 2 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 49 flags hashpspool stripe_width 0 application rgw

pool 3 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode warn last_change 51 flags hashpspool stripe_width 0 application rgw
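With many pools, a small shell loop over the pool list gives the same information per pool without grepping the full osd dump; a sketch:

for p in $(ceph osd pool ls); do
    echo -n "$p  "
    ceph osd pool get "$p" crush_rule
done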

5. Modify the crushmap

This approach is recommended only for operators who are already familiar with CRUSH configuration; use it with caution on production clusters.
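If you would rather not hand-edit the decompiled map at all, an equivalent rule can usually be created in a single command with ceph osd crush rule create-replicated (an alternative sketch, not the method used below; datacenter_rule and the host failure domain match the rule defined in step 5.2):

# create a replicated rule rooted at datacenter0 with host as the failure domain
ceph osd crush rule create-replicated datacenter_rule datacenter0 host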

5.1 Export the crush map

Export Ceph's binary crush map and convert it to text format.

# export the binary crush map to the file test.bin

ceph osd getcrushmap -o test.bin

# use crushtool to decompile the binary data in test.bin into text, saved as test.txt

crushtool -d test.bin -o test.txt

5.2 Edit test.txt

# 1. Add a new rule whose entry point is datacenter0 instead of default (the original replicated_rule is left unchanged; datacenter_rule is appended below it)

rule replicated_rule {

    id 0

    type replicated

    min_size 1

    max_size 10

    step take default

    step chooseleaf firstn 0 type host

    step emit

}

rule datacenter_rule {                  # rule name; can be referenced when creating a pool

    id 1                                # rule id set to 1

    type replicated                     # pool type is replicated (erasure is the other mode)

    min_size 1                          # minimum number of replicas a pool using this rule may specify

    max_size 10                         # maximum number of replicas a pool using this rule may specify

    step take datacenter0               # entry point from which PGs look for replicas

    step chooseleaf firstn 0 type host  # choose leaves depth-first; failure domain is host

    step emit                           # end of rule

}

# end crush map

5.3 Import the rewritten crush map back into the cluster

# compile test.txt back into binary form as new.bin

crushtool -c test.txt -o new.bin

# import new.bin into the cluster

ceph osd setcrushmap -i new.bin
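The compiled map can also be dry-run tested with crushtool (ideally before running setcrushmap) to make sure the new rule actually maps PGs to OSDs; a hedged sketch assuming rule id 1 and 3 replicas:

# simulate placements for rule id 1 with 3 replicas; any bad mappings point to a broken rule
crushtool -i new.bin --test --show-statistics --rule 1 --num-rep 3
crushtool -i new.bin --test --show-bad-mappings --rule 1 --num-rep 3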

5.4 Change the crush_rule of the existing pools

After importing the new crush map, the crush_rule of the previously existing pools must be updated as well; otherwise their PGs will stay in the unknown state and the cluster will never reach active+clean.

ceph osd pool set cephfs_data crush_rule datacenter_rule

ceph osd pool set cephfs_metadata crush_rule datacenter_rule
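After switching the rule, the PGs should peer and leave the unknown state within a short time; progress can be watched with, for example:

# summary of PG states; the unknown count should drop to 0
ceph pg stat

# or watch the full cluster status refresh every couple of seconds
watch -n 2 ceph -s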

5.5 Check the cluster status

ceph osd dump | grep crush_rule    # the crush_rule id used by the cephfs pools is now 1

pool 8 'cephfs_data' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode warn last_change 180 flags hashpspool stripe_width 0 application cephfs

pool 9 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode warn last_change 183 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
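As an extra sanity check (not part of the original log), a single PG of cephfs_data (pool id 8 above) can be mapped to verify that it now gets a non-empty up/acting OSD set; pg 8.0 is used purely as an example:

# show the osdmap epoch and the up/acting OSD sets for PG 8.0
ceph pg map 8.0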

5.6 Check the crush_rule name used by each pool

[root@ceph-mon01 ceph-ceph-mon01]# ceph osd pool ls

.rgw.root

default.rgw.control

default.rgw.meta

default.rgw.log

default.rgw.buckets.index

default.rgw.buckets.non-ec

default.rgw.buckets.data

cephfs_data

cephfs_metadata

[root@ceph-mon01 ceph-ceph-mon01]# ceph osd pool get default.rgw.log crush_rule

crush_rule: replicated_rule

[root@ceph-mon01 ceph-ceph-mon01]# ceph osd pool get cephfs_data crush_rule

crush_rule: datacenter_rule

The earlier "pgs: 53.333% pgs unknown" symptom is gone (256 of the 480 PGs, i.e. 53.333%, had been unknown); the cluster status now shows:

[root@ceph-mon01 ~]# ceph -s

  cluster:

    id:     92d4f66b-94a6-4c40-8941-734f3c44eb4f

    health: HEALTH_ERR

            1 filesystem is offline

            1 filesystem is online with fewer MDS than max_mds

            1 pools have many more objects per pg than average

  services:

    mon: 3 daemons, quorum ceph-mon01,ceph-mon03,ceph-mon02 (age 5d)

    mgr: ceph-mon03(active, since 5d), standbys: ceph-mon02, ceph-mon01

    mds: cephfs:0

    osd: 9 osds: 9 up (since 45h), 9 in (since 45h); 224 remapped pgs

    rgw: 1 daemon active (ceph-mon01)

  task status:

  data:

    pools:   9 pools, 480 pgs

    objects: 34.60k objects, 8.5 GiB

    usage:   128 GiB used, 142 GiB / 270 GiB avail

    pgs:     172995/103797 objects misplaced (166.667%)

             256 active+clean

             224 active+clean+remapped

    recovery: 2.0 MiB/s, 7 objects/s

[root@ceph-mon01 ceph-ceph-mon01]#
