1. View the last few cephadm log entries
ceph log last cephadm
2. When a daemon is in an error or stopped state, restart it with the following command
ceph orch daemon restart rgw.rgw.ceph3.sfepof
3. The following error is reported:
overall HEALTH_WARN 1 pools have many more objects per pg than average
ceph health detail  # view detailed cluster warnings
# From the warning details, identify the pool whose PG count needs adjusting, then run the commands below.
# The target num can be derived from: Total PGs = (OSDs * 100) / pool size (replica count), rounded to the nearest power of 2.
ceph osd pool set <pool-name> pg_num 64
ceph osd pool set <pool-name> pgp_num 64
ceph osd pool set <pool-name> pg_num_min 64
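The formula above can be sketched in shell; the OSD count and replica size here are hypothetical example values, not taken from any real cluster:

```shell
# Hypothetical example: 30 OSDs, replica count (pool size) of 3
osds=30
size=3
target=$(( osds * 100 / size ))           # (30 * 100) / 3 = 1000

# Round up to the next power of 2, then step back down
# if the lower power of 2 is actually closer to the target
p=1
while (( p < target )); do p=$(( p * 2 )); done
if (( p - target > target - p / 2 )); then p=$(( p / 2 )); fi

echo "pg_num = $p"                        # pg_num = 1024
```

The resulting value is what you would pass as the num in the `ceph osd pool set <pool-name> pg_num` command above.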
4. Cluster warning
clock skew detected on mon.ceph-node2
1. Check NTP time synchronization on each node
2. Restart the mon
systemctl restart ceph-e1ba1fb4-0b00-11ec-b24d-781dbacebe19@mon.ceph-node2.service
5. An OSD shows as down/out
ceph orch daemon restart osd.<id>
6. Cluster warning 1
HEALTH_WARN 1 clients failing to respond to cache pressure
Fix: evict the problem client
$ ceph tell mds.0 session evict id=558067
7. Cluster warning 2
[WRN] RECENT_CRASH: 2 daemons have recently crashed
mon.ceph-node1 crashed on host ceph-node1 at 2022-05-12T06:46:57.743001Z
mon.ceph-node1 crashed on host ceph-node1 at 2022-05-15T06:46:41.164404Z
Fix: archive the crash reports
$ ceph crash archive-all
8. Cluster warning 3
HEALTH_ERR 3 scrub errors; Possible data damage: 3 pgs inconsistent
[ERR] OSD_SCRUB_ERRORS: 3 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 3 pgs inconsistent
pg 21.2f is active+clean+inconsistent, acting [7,15,11]
pg 22.6 is active+clean+scrubbing+deep+inconsistent+repair, acting [15,26,2]
pg 22.c is active+clean+scrubbing+deep+inconsistent+repair, acting [27,6,11]
Fix: repair the inconsistent PG data (run for each PG listed in the warning)
$ ceph pg repair 21.2f