1. Scenario
Current cluster state:
# ceph -s
cluster e6ccdfaa-a729-4638-bcde-e539b1e7a28d
health HEALTH_OK
monmap e1: 3 mons at {bdc2=172.16.251.2:6789/0,bdc3=172.16.251.3:6789/0,bdc4=172.16.251.4:6789/0}
election epoch 82, quorum 0,1,2 bdc2,bdc3,bdc4
osdmap e3132: 27 osds: 26 up, 26 in
flags sortbitwise
pgmap v13259021: 4096 pgs, 4 pools, 2558 GB data, 638 kobjects
7631 GB used, 89048 GB / 96680 GB avail
4096 active+clean
client io 34720 kB/s wr, 0 op/s rd, 69 op/s wr
The cluster health shows OK, but of the 27 OSDs only 26 are up and 26 in: one OSD is down and out.
Background: OSD states
up: the daemon is running and can serve IO;
down: the daemon is not running and cannot serve IO;
in: the OSD holds data (participates in the data distribution);
out: the OSD holds no data (its data has been migrated elsewhere)
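The next command finds down OSDs with a simple grep; as a hedged sketch, the same check can also catch OSDs that were reweighted out. The awk filter below is my own, and the heredoc stands in for real `ceph osd tree` output so the snippet runs standalone; on a live cluster you would pipe `ceph osd tree` into the same awk instead:

```shell
# Flag OSDs that are down, or whose REWEIGHT has dropped to 0 (out).
# The heredoc is sample `ceph osd tree` output; replace it with the
# real command's output on a live cluster.
down_osds=$(awk '$3 ~ /^osd\./ && ($4 == "down" || $5 == 0) {
    printf "%s is %s (reweight %s)\n", $3, $4, $5
}' <<'EOF'
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 98.04483 root default
 0  3.63129      osd.0     down          0  1.00000
 1  3.63129      osd.1       up    1.00000  1.00000
EOF
)
echo "$down_osds"
```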
# ceph osd tree |grep down
0 3.63129 osd.0 down 0 1.00000
This means the osd.0 daemon is not running and the OSD no longer holds any data, where "data" refers to objects stored by the Ceph cluster. Both points can be verified.
- First, the daemon:
# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since 四 2017-04-06 09:26:04 CST; 1h 2min ago
Process: 480723 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Process: 480669 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 480723 (code=exited, status=1/FAILURE)
4月 06 09:26:04 bdc2 systemd[1]: Unit ceph-osd@0.service entered failed state.
4月 06 09:26:04 bdc2 systemd[1]: ceph-osd@0.service failed.
4月 06 09:26:04 bdc2 systemd[1]: ceph-osd@0.service holdoff time over, scheduling restart.
4月 06 09:26:04 bdc2 systemd[1]: start request repeated too quickly for ceph-osd@0.service
4月 06 09:26:04 bdc2 systemd[1]: Failed to start Ceph object storage daemon.
4月 06 09:26:04 bdc2 systemd[1]: Unit ceph-osd@0.service entered failed state.
4月 06 09:26:04 bdc2 systemd[1]: ceph-osd@0.service failed.
Check osd.0's log:
# tail -f /var/log/ceph/ceph-osd.0.log
2017-04-06 09:26:04.531004 7f75f33d5800 0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-04-06 09:26:04.531520 7f75f33d5800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-04-06 09:26:04.531528 7f75f33d5800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-04-06 09:26:04.531548 7f75f33d5800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-04-06 09:26:04.532318 7f75f33d5800 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-04-06 09:26:04.532384 7f75f33d5800 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-04-06 09:26:04.730841 7f75f33d5800 -1 filestore(/var/lib/ceph/osd/ceph-0) Error initializing leveldb : IO error: /var/lib/ceph/osd/ceph-0/current/omap/MANIFEST-004467: Input/output error
2017-04-06 09:26:04.730870 7f75f33d5800 -1 osd.0 0 OSD:init: unable to mount object store
2017-04-06 09:26:04.730879 7f75f33d5800 -1 ** ERROR: osd init failed: (1) Operation not permitted
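When an OSD refuses to start, a quick triage step is to pull only the error-level lines (those logged with level -1) out of its log. A minimal sketch, with a trimmed copy of the lines above embedded so it runs standalone; on the real node you would grep /var/log/ceph/ceph-osd.0.log instead:

```shell
# Extract the error-level (" -1 ") lines from an OSD log excerpt.
errors=$(grep ' -1 ' <<'EOF'
2017-04-06 09:26:04.531548 7f75f33d5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-04-06 09:26:04.730841 7f75f33d5800 -1 filestore(/var/lib/ceph/osd/ceph-0) Error initializing leveldb : IO error
2017-04-06 09:26:04.730870 7f75f33d5800 -1 osd.0 0 OSD:init: unable to mount object store
EOF
)
echo "$errors"
```

Here the leveldb Input/output error on the omap MANIFEST is the root cause that prevents the object store from mounting.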
- Next, the data:
# cd /var/lib/ceph/osd/ceph-0/current
# ls -lrt |tail -10
drwxr-xr-x 2 ceph ceph 58 4月 2 00:45 4.2e9_TEMP
drwxr-xr-x 2 ceph ceph 58 4月 2 00:45 4.355_TEMP
drwxr-xr-x 2 ceph ceph 58 4月 2 00:45 4.36c_TEMP
drwxr-xr-x 2 ceph ceph 58 4月 2 00:45 4.3ae_TEMP
drwxr-xr-x 2 ceph ceph 58 4月 2 00:46 4.3b2_TEMP
drwxr-xr-x 2 ceph ceph 58 4月 2 00:46 4.3e8_TEMP
drwxr-xr-x 2 ceph ceph 58 4月 2 00:46 4.3ea_TEMP
-rw-r--r--. 1 ceph ceph 10 4月 2 08:53 commit_op_seq
drwxr-xr-x. 2 ceph ceph 349 4月 5 10:01 omap
-rw-r--r--. 1 ceph ceph 0 4月 6 09:26 nosnap
Pick any two of the PGs and inspect them:
# ceph pg dump|grep 4.3ea
dumped all in format plain
4.3ea 2 0 0 0 0 8388608 254 254 active+clean 2017-04-06 01:55:04.754593 1322'254 3132:122 [26,2,12] 26 [26,2,12] 26 1322'254 2017-04-06 01:55:04.754546 1322'254 2017-04-02 00:46:12.611726
# ceph pg dump|grep 4.3e8
dumped all in format plain
4.3e8 1 0 0 0 0 4194304 1226 1226 active+clean 2017-04-06 01:26:43.827061 1323'1226 3132:127 [2,15,5] 2 [2,15,5] 2 1323'1226 2017-04-06 01:26:43.827005 1323'1226 2017-04-06 01:26:43.827005
As the acting sets show, the three replicas of 4.3ea and 4.3e8 sit on OSDs [26,2,12] and [2,15,5] respectively; neither set includes osd.0.
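This membership check can be scripted: match a given OSD id inside the [a,b,c] acting sets of `ceph pg dump` output. A sketch with my own field trimming; the heredoc holds a shortened copy of the two rows above, and on a live cluster you would pipe `ceph pg dump` instead:

```shell
# Report any PG whose acting set still contains the target OSD id.
# The regex anchors the id between [ or , and , or ] so that e.g.
# osd 2 matches [2,15,5] but osd 0 does not match [26,2,12].
target=0
hits=$(awk -v osd="$target" '
    $0 ~ ("\\[([0-9]+,)*" osd "(,[0-9]+)*\\]")' <<'EOF'
4.3ea 2 0 0 0 0 8388608 254 254 active+clean [26,2,12] 26 [26,2,12] 26
4.3e8 1 0 0 0 0 4194304 1226 1226 active+clean [2,15,5] 2 [2,15,5] 2
EOF
)
if [ -z "$hits" ]; then msg="no PGs on osd.$target"; else msg="$hits"; fi
echo "$msg"
```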
- Summary:
The picture is clear: the cluster is currently healthy and has already kicked osd.0 out. Its daemon cannot start (restarting the service reproduces the error in the log above; the fix is unknown), and the data it held is no longer counted in the cluster. In other words this OSD is dead weight, so the plan is to remove it from the cluster completely, zap the disk, and then add it back.
2. Removing the OSD
- Take the OSD out of the cluster (run on the admin node)
# ceph osd out 0 (in ceph osd tree, its REWEIGHT value becomes 0)
- Stop the service (run on the target node)
# systemctl stop ceph-osd@0 (in ceph osd tree, its state becomes DOWN)
Since this OSD was already out and down, those first two steps are effectively no-ops here.
- Remove it from the CRUSH map
# ceph osd crush remove osd.0
- Delete its authentication key
# ceph auth del osd.0
- Remove the OSD
# ceph osd rm 0
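For reference, the ceph-side steps so far can be collected into one helper. A sketch, not the original procedure verbatim: the CEPH variable and the dry-run convention are my own additions so the commands can be previewed before running them for real:

```shell
# Remove an OSD from the cluster in the order used above.
# CEPH defaults to the real binary; set CEPH=echo first for a dry run.
CEPH=${CEPH:-ceph}

remove_osd() {
    osd_id=$1
    $CEPH osd out "$osd_id"               # REWEIGHT drops to 0
    # systemctl stop ceph-osd@"$osd_id"   # this one runs on the OSD's node
    $CEPH osd crush remove "osd.$osd_id"  # remove from the CRUSH map
    $CEPH auth del "osd.$osd_id"          # delete its cephx key
    $CEPH osd rm "$osd_id"                # drop the OSD record itself
}

# Dry run, printing the commands instead of executing them:
CEPH=echo
remove_osd 0
```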
- Unmount the data directory
# df -h |grep ceph-0
/dev/sdc1 3.7T 265G 3.4T 8% /var/lib/ceph/osd/ceph-0
# umount /var/lib/ceph/osd/ceph-0
umount: /var/lib/ceph/osd/ceph-0: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
The unmount fails; use fuser to see what is holding the mount:
# fuser -m -v /var/lib/ceph/osd/ceph-0
USER        PID ACCESS COMMAND
/var/lib/ceph/osd/ceph-0:
root kernel mount /var/lib/ceph/osd/ceph-0
root 212444 ..c.. bash
Kill the bash process that is holding it:
# kill -9 212444
or let fuser do the killing:
# fuser -m -v -i -k /var/lib/ceph/osd/ceph-0
Unmount again:
# umount /var/lib/ceph/osd/ceph-0
This time it succeeds.
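The busy-unmount dance (try umount, kill the holders with fuser, retry) can be wrapped in a small helper. A sketch only; the UMOUNT/FUSER variables are my own convention so the happy path can be exercised without a real mount, and on a real node you may prefer fuser's -i flag to confirm each kill interactively:

```shell
# Unmount a path, killing any processes that hold it open if needed.
# UMOUNT/FUSER default to the real tools; override them to dry-run.
UMOUNT=${UMOUNT:-umount}
FUSER=${FUSER:-fuser}

free_and_umount() {
    mnt=$1
    if $UMOUNT "$mnt" 2>/dev/null; then
        return 0                      # unmounted cleanly
    fi
    echo "$mnt is busy, killing the processes using it"
    $FUSER -m -k "$mnt"               # add -i to confirm each kill
    $UMOUNT "$mnt"
}
```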
- Zap the disk. The df -h output above already showed that ceph-0 corresponds to /dev/sdc (this can also be checked with ceph-disk list):
# ceph-disk zap /dev/sdc
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.
****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
Check the cluster status at this point:
# ceph -s
cluster e6ccdfaa-a729-4638-bcde-e539b1e7a28d
health HEALTH_WARN
170 pgs backfill_wait
10 pgs backfilling
362 pgs degraded
362 pgs recovery_wait
436 pgs stuck unclean
recovery 5774/2136302 objects degraded (0.270%)
recovery 342126/2136302 objects misplaced (16.015%)
monmap e1: 3 mons at {bdc2=172.16.251.2:6789/0,bdc3=172.16.251.3:6789/0,bdc4=172.16.251.4:6789/0}
election epoch 82, quorum 0,1,2 bdc2,bdc3,bdc4
osdmap e3142: 26 osds: 26 up, 26 in; 180 remapped pgs
flags sortbitwise
pgmap v13264634: 4096 pgs, 4 pools, 2558 GB data, 639 kobjects
7651 GB used, 89029 GB / 96680 GB avail
5774/2136302 objects degraded (0.270%)
342126/2136302 objects misplaced (16.015%)
3554 active+clean
362 active+recovery_wait+degraded
170 active+remapped+wait_backfill
10 active+remapped+backfilling
recovery io 354 MB/s, 89 objects/s
client io 1970 kB/s wr, 0 op/s rd, 88 op/s wr
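The degraded and misplaced percentages in the status above are simply ratios over the total number of object instances; recomputing them from the raw counts:

```shell
# Recompute the recovery percentages from the object counts in the
# `ceph -s` output above (5774 degraded, 342126 misplaced, 2136302 total).
deg=$(awk 'BEGIN { printf "%.3f", 5774 / 2136302 * 100 }')
mis=$(awk 'BEGIN { printf "%.3f", 342126 / 2136302 * 100 }')
echo "degraded ${deg}%, misplaced ${mis}%"
# prints: degraded 0.270%, misplaced 16.015%
```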
Wait for recovery to finish and the cluster to return to HEALTH_OK before adding a new OSD. (As for why one should wait until OK before adding, I don't know either.)
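Waiting for HEALTH_OK can be scripted as a simple polling loop. A sketch under my own conventions: the CEPH variable is a name I introduced so the loop can be exercised with a stub instead of a live cluster, and on real Jewel `ceph health` prints exactly "HEALTH_OK" when the cluster is clean:

```shell
# Poll cluster health until it reports HEALTH_OK.
# CEPH defaults to the real binary; point it at a stub for testing.
CEPH=${CEPH:-ceph}

wait_for_health_ok() {
    interval=${1:-30}    # seconds between polls
    until [ "$($CEPH health 2>/dev/null)" = "HEALTH_OK" ]; do
        echo "cluster still recovering, sleeping ${interval}s"
        sleep "$interval"
    done
    echo "cluster is HEALTH_OK"
}

# On the admin node: wait_for_health_ok 30
```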
3. Adding the OSD back
The disk behind the removed OSD has already been zapped clean, so re-adding it is straightforward; just use the ceph-deploy tool:
# ceph-deploy --overwrite-conf osd create bdc2:/dev/sdc
When the command finishes, the OSD has been re-added, and it even got id 0 again:
# df -h |grep ceph-0
/dev/sdc1 3.7T 74M 3.7T 1% /var/lib/ceph/osd/ceph-0
Freshly created, with barely any space used.
# ceph-disk list |grep osd
/dev/sdc1 ceph data, active, cluster ceph, osd.0, journal /dev/sdc2
/dev/sdd1 ceph data, active, cluster ceph, osd.1, journal /dev/sdd2
/dev/sde1 ceph data, active, cluster ceph, osd.2, journal /dev/sde2
/dev/sdf1 ceph data, active, cluster ceph, osd.3, journal /dev/sdf2
Check the cluster status again: it is back to 27 OSDs; now just wait for it to return to OK.
# ceph -s
cluster e6ccdfaa-a729-4638-bcde-e539b1e7a28d
health HEALTH_WARN
184 pgs backfill_wait
6 pgs backfilling
374 pgs degraded
374 pgs recovery_wait
83 pgs stuck unclean
recovery 4605/2114056 objects degraded (0.218%)
recovery 298454/2114056 objects misplaced (14.118%)
monmap e1: 3 mons at {bdc2=172.16.251.2:6789/0,bdc3=172.16.251.3:6789/0,bdc4=172.16.251.4:6789/0}
election epoch 82, quorum 0,1,2 bdc2,bdc3,bdc4
osdmap e3501: 27 osds: 27 up, 27 in; 190 remapped pgs
flags sortbitwise
pgmap v13275552: 4096 pgs, 4 pools, 2558 GB data, 639 kobjects
7647 GB used, 92751 GB / 100398 GB avail
4605/2114056 objects degraded (0.218%)
298454/2114056 objects misplaced (14.118%)
3532 active+clean
374 active+recovery_wait+degraded
184 active+remapped+wait_backfill
6 active+remapped+backfilling
recovery io 264 MB/s, 67 objects/s
client io 1737 kB/s rd, 63113 kB/s wr, 60 op/s rd, 161 op/s wr
Reference: [http://www.cnblogs.com/sammyliu/p/5555218.html](http://www.cnblogs.com/sammyliu/p/5555218.html)