Ceph: Fixing an OSD that is down

During a routine inspection today, I found that one OSD in the Ceph cluster was down.
Checking the dashboard:
[Dashboard screenshot: cluster health warning showing an OSD down]
Clicking through to the details shows which node's OSD is down.
[Dashboard screenshot: OSD detail view identifying the affected host]
Next, check the OSD status from the command line.
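Before the detailed checks below, a quick summary of how many OSDs are up and in can be had with the following (a minimal check; the exact output wording varies by Ceph release):

[root@ceph01 ~]# ceph osd stat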
①. Check the cluster status:

[root@ceph01 ~]# ceph -s
  cluster:
    id:     240a5732-02e5-11eb-8f5a-000c2945a4b1
    health: HEALTH_WARN
            Degraded data redundancy: 3972/11916 objects degraded (33.333%), 64 pgs degraded, 65 pgs undersized
            65 pgs not deep-scrubbed in time
            65 pgs not scrubbed in time

  services:
    mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 8d)
    mgr: ceph02.zopypt(active, since 10w), standbys: ceph03.ucynxg, ceph01.suwmox
    mds: cephfs:1 {0=cephfs.ceph02.axdsbo=up:active} 4 up:standby
    osd: 3 osds: 2 up (since 5w), 2 in (since 5w)

  data:
    pools:   3 pools, 65 pgs
    objects: 3.97k objects, 1.8 GiB
    usage:   6.0 GiB used, 2.0 TiB / 2.0 TiB avail
    pgs:     3972/11916 objects degraded (33.333%)
             64 active+undersized+degraded
             1  active+undersized

  io:
    client:   596 B/s wr, 0 op/s rd, 0 op/s wr
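The 33.333% degraded figure is consistent with one of three OSDs being down: 3.97k objects with three replicas gives 11916 object copies, and roughly every object has lost one copy. To list the affected PGs and the OSDs they map to, the health detail output can help (a standard command; its output is omitted here because it is long and cluster-specific):

[root@ceph01 ~]# ceph health detail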

②. Check the OSD tree:

[root@ceph01 ~]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME             STATUS  REWEIGHT  PRI-AFF
-1         3.00000  root default
-3         1.00000      host sjyt-ceph01
 0    hdd  1.00000          osd.0           down         0  1.00000
-5         1.00000      host sjyt-ceph02
 1    hdd  1.00000          osd.1             up   1.00000  1.00000
-7         1.00000      host sjyt-ceph03
 2    hdd  1.00000          osd.2             up   1.00000  1.00000
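The tree shows osd.0 on host sjyt-ceph01 is down, with its reweight at 0. If the host is not obvious from the tree (for example, in a larger cluster), the OSD's location can also be looked up directly (standard commands; the id 0 comes from the tree above):

[root@ceph01 ~]# ceph osd find 0
[root@ceph01 ~]# ceph osd metadata 0 | grep hostname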

Resolution:

An alternative approach is described in the referenced article "ceph修复osd为down的情况" (fixing an OSD in the down state).

①. Restart the OSD service on the faulty node
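This cluster was deployed with cephadm, so each OSD runs in a podman container managed by a systemd unit named ceph-<fsid>@osd.<id>.service, where the fsid is the cluster id shown by ceph -s. If the exact unit name is unknown, it can be listed first (a standard systemd query; the glob pattern is an assumption based on the unit naming seen below):

[root@sjyt-ceph01 ~]# systemctl list-units 'ceph-*@osd.*'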

[root@sjyt-ceph01 ~]# systemctl status ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1@osd.0.service
● ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1@osd.0.service - Ceph osd.0 for 240a5732-02e5-11eb-8f5a-000c2945a4b1
   Loaded: loaded (/etc/systemd/system/ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1@.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Mon 2021-02-01 19:24:37 CST; 1 months 5 days ago
  Process: 320045 ExecStopPost=/bin/bash /var/lib/ceph/240a5732-02e5-11eb-8f5a-000c2945a4b1/osd.0/unit.poststop (code=exited, status=0/SUCCESS)
  Process: 320033 ExecStop=/bin/podman stop ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1-osd.0 (code=exited, status=125)
  Process: 153844 ExecStart=/bin/bash /var/lib/ceph/240a5732-02e5-11eb-8f5a-000c2945a4b1/osd.0/unit.run (code=exited, status=0/SUCCESS)
  Process: 153833 ExecStartPre=/bin/podman rm ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1-osd.0 (code=exited, status=1/FAILURE)
 Main PID: 153844 (code=exited, status=0/SUCCESS)

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
[root@sjyt-ceph01 ~]# systemctl start ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1@osd.0.service
[root@sjyt-ceph01 ~]# systemctl status ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1@osd.0.service
● ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1@osd.0.service - Ceph osd.0 for 240a5732-02e5-11eb-8f5a-000c2945a4b1
   Loaded: loaded (/etc/systemd/system/ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1@.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-03-09 10:19:07 CST; 1s ago
  Process: 320045 ExecStopPost=/bin/bash /var/lib/ceph/240a5732-02e5-11eb-8f5a-000c2945a4b1/osd.0/unit.poststop (code=exited, status=0/SUCCESS)
  Process: 320033 ExecStop=/bin/podman stop ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1-osd.0 (code=exited, status=125)
  Process: 2770303 ExecStartPre=/bin/podman rm ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1-osd.0 (code=exited, status=1/FAILURE)
 Main PID: 2770314 (bash)
    Tasks: 13 (limit: 23968)
   Memory: 31.2M
   CGroup: /system.slice/system-ceph\x2d240a5732\x2d02e5\x2d11eb\x2d8f5a\x2d000c2945a4b1.slice/ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1@osd.0.service
           ├─2770314 /bin/bash /var/lib/ceph/240a5732-02e5-11eb-8f5a-000c2945a4b1/osd.0/unit.run
           └─2770413 /bin/podman run --rm --net=host --ipc=host --privileged --group-add=disk --name ceph-240a5732-02e5-11eb-8f5a-000c2945a4b1-osd.0 -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=sjyt
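The unit is active again and the podman container for osd.0 is running. On a cephadm-managed cluster the same restart can also be issued through the orchestrator instead of calling systemctl directly, and the daemon's logs can be pulled if the start fails (equivalent alternatives, assuming the cephadm orchestrator module is enabled):

[root@sjyt-ceph01 ~]# ceph orch daemon restart osd.0
[root@sjyt-ceph01 ~]# cephadm logs --name osd.0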

②. Check the OSD status again

[root@sjyt-ceph01 ~]# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME             STATUS  REWEIGHT  PRI-AFF
-1         3.00000  root default
-3         1.00000      host sjyt-ceph01
 0    hdd  1.00000          osd.0             up   1.00000  1.00000
-5         1.00000      host sjyt-ceph02
 1    hdd  1.00000          osd.1             up   1.00000  1.00000
-7         1.00000      host sjyt-ceph03
 2    hdd  1.00000          osd.2             up   1.00000  1.00000

③. Check the cluster status again

[root@sjyt-ceph01 ~]# ceph -s
  cluster:
    id:     240a5732-02e5-11eb-8f5a-000c2945a4b1
    health: HEALTH_WARN
            Degraded data redundancy: 2654/11916 objects degraded (22.273%), 39 pgs degraded, 39 pgs undersized
            64 pgs not deep-scrubbed in time
            64 pgs not scrubbed in time

  services:
    mon: 3 daemons, quorum sjyt-ceph01,sjyt-ceph02,sjyt-ceph03 (age 8d)
    mgr: sjyt-ceph02.zopypt(active, since 10w), standbys: sjyt-ceph03.ucynxg, sjyt-ceph01.suwmox
    mds: cephfs:1 {0=cephfs.sjyt-ceph02.axdsbo=up:active} 4 up:standby
    osd: 3 osds: 3 up (since 8m), 3 in (since 8m); 39 remapped pgs

  data:
    pools:   3 pools, 65 pgs
    objects: 3.97k objects, 1.8 GiB
    usage:   9.4 GiB used, 3.0 TiB / 3.0 TiB avail
    pgs:     1.538% pgs not active
             2654/11916 objects degraded (22.273%)
             38 active+undersized+degraded+remapped+backfill_wait
             25 active+clean
             1  active+undersized+degraded+remapped+backfilling
             1  peering

  io:
    client:   1.5 KiB/s wr, 0 op/s rd, 0 op/s wr
    recovery: 2.7 MiB/s, 1 keys/s, 1 objects/s

Once the OSD is back up, data begins recovering onto the restored OSD node, and the degraded-object percentage shown by ceph -s keeps dropping as backfill proceeds.
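To follow the backfill until the cluster returns to HEALTH_OK, the status can be watched continuously (standard commands; recovery time depends on how much data was degraded):

[root@sjyt-ceph01 ~]# ceph -w
[root@sjyt-ceph01 ~]# watch -n 10 ceph -s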
