OpenStack Ceph Troubleshooting


Ceph's automatic rebalancing does a reasonable job of keeping replicas in sync and storage evenly distributed, but in some situations it causes problems of its own: when a large number of servers in the cluster shut down at the same time, or when disks fail, the rebalancing mechanism can put considerable strain on the cluster.
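If a mass shutdown is planned rather than accidental, you can reduce this churn by setting the standard OSD maintenance flags before the outage and clearing them afterwards. A minimal sketch (the flags below are standard Ceph OSD flags, run from any admin node):

# Stop OSDs from being marked out and stop recovery/backfill while the nodes are down
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover

# ... perform the maintenance and bring the servers back up ...

ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset noout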


Symptoms:
All cloud instances in the OpenStack environment are extremely slow or completely unusable. All Ceph OSDs show as up (or some OSD goes down by itself), and the cluster keeps rebalancing.
1. Check the Ceph cluster health status

ceph health detail

This typically shows how many OSDs and PGs the cluster has and what state each of them is in.

ceph -s

If the output shows that data is rebalancing but there is no recovery io for a long time, or none at all, something has probably gone wrong during the rebalance. Go back to health detail and check which OSDs the stuck PGs are sitting on.

pg 3.1a5 is stuck unclean for 27164.858187, current state active+remapped+wait_backfill, last acting [4,14]
pg 3.22 is stuck unclean for 14383.232098, current state active+remapped+wait_backfill, last acting [3,23]
pg 5.25 is stuck unclean for 11660.300263, current state active+remapped+wait_backfill, last acting [20,23]

For example, to dig into osd.4, first find out which node osd.4 lives on:

ceph osd tree
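In the tree output, OSDs are grouped under their host buckets, so the host listed above osd.4 is the node to log into. Depending on the Ceph version, ceph osd find can also print the host directly (just a convenience, not required for the steps below):

ceph osd find 4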

Then log into that node and watch osd.4's log:

tail -f /var/log/ceph/ceph-osd.4.log

The log usually gives plenty of clues about what is wrong.
1. Sometimes a node is reported with "no reply", meaning it did not respond (or send heartbeats) during peering. Log into that node and check the corresponding OSD's log in the same way; the snippet below shows a quick way to filter for heartbeat failures.
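A quick filter for missed heartbeats, assuming the same default log path and OSD id used above (adjust both to your environment):

grep "heartbeat_check: no reply" /var/log/ceph/ceph-osd.4.log | tail -n 20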
2. Analyze the PG states from the health detail output.
Find the problematic PGs, work out which OSDs and nodes they sit on, log into the corresponding node, and query the PG:

ceph pg x.xx query

This usually reports the PG's fault and its cause, so each case can be handled accordingly.
If the PG cannot be queried on that node, remove the OSD so that its PGs are redirected to other nodes, and then run the query again.
To remove the OSD:

ceph osd reweight osd.1 0.0

Depending on how the rebalancing progresses, continue with the following commands:

ceph osd crush reweight osd.1 0
stop ceph-osd id=1
ceph osd crush rm osd.1
ceph osd crush rm node-2
ceph osd rm 1
ceph auth del osd.1
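Once the OSD (and, if it was the last one on that host, the host bucket) has been removed, it is worth confirming that it is really gone before waiting for the rebalance. A minimal check, using the same osd.1 as above:

# Both commands should return nothing once osd.1 is fully removed
ceph osd tree | grep -w "osd.1"
ceph osd dump | grep -w "osd.1"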

You can also try to repair the PGs:

ceph pg repair x.xx

Generally, once all PGs end up in an active+ state, you only need to wait for the rebalance to finish. The same approach also covers situations such as pg down+peering. The sketch below shows two ways to keep an eye on progress.
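Two convenient ways to watch recovery are streaming the cluster log and listing only the PGs that are still stuck (both are standard ceph commands; the set of dump_stuck filters varies slightly between releases):

ceph -w
ceph pg dump_stuck unclean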
3. PG with unfound objects
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#failures-osd-unfound
The excerpt below is taken from the Ceph documentation linked above:

Second, you can identify which OSDs have been probed or might contain data:

ceph pg 2.4 query

"recovery_state": [
     { "name": "Started\/Primary\/Active",
       "enter_time": "2012-03-06 15:15:46.713212",
       "might_have_unfound": [
         { "osd": 1,
           "status": "osd is down"}]},
In this case, for example, the cluster knows that osd.1 might have data, but it is down. The full range of possible states include:

  • already probed
  • querying
  • OSD is down
  • not queried (yet)
Sometimes it simply takes some time for the cluster to query possible locations.

It is possible that there are other locations where the object can exist that are not listed. For example, if a ceph-osd is stopped and taken out of the cluster, the cluster fully recovers, and due to some future set of failures ends up with an unfound object, it won’t consider the long-departed ceph-osd as a potential location to consider. (This scenario, however, is unlikely.)

If all possible locations have been queried and objects are still lost, you may have to give up on the lost objects. This, again, is possible given unusual combinations of failures that allow the cluster to learn about writes that were performed before the writes themselves are recovered. To mark the “unfound” objects as “lost”:

ceph pg 2.5 mark_unfound_lost revert|delete

Here the final argument specifies how the cluster should deal with lost objects.

  • The "delete" option will forget about them entirely.
  • The "revert" option (not available for erasure coded pools) will either roll back to a previous version of the object or (if it was a new object) forget about it entirely. Use this with caution, as it may confuse applications that expected the object to exist. (Use with caution.)
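Before marking anything as lost, it is worth re-checking how many objects are still reported as unfound; for example (the PG id is just the example id from the excerpt above):

# Cluster-wide summary of PGs with unfound objects
ceph health detail | grep unfound

# Per-PG list of the missing objects
ceph pg 2.4 list_missing

If the counts do not drop to zero after every candidate OSD has been queried, marking the objects lost as shown above is the last resort.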