ceph - remove a monitor (MANUAL)

Before removing a mon node, make sure the cluster will still be able to reach a healthy state afterwards. For example, when going from 5 mons to 4, at least 3 of the remaining 4 mons must be healthy; only then is the Ceph storage cluster healthy (monitors need a majority to form a quorum).
Removing a mon node falls into two cases: removing a healthy mon, and removing an unhealthy mon.
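Before removing anything, it is worth checking how many monitors are currently in quorum. A minimal check using standard ceph CLI commands, run from any node with an admin keyring (output omitted here):

# show quorum members and the current monmap
ceph quorum_status --format json-pretty
# one-line summary: monmap epoch, monitor list, current quorum
ceph mon stat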

First, let's look at how to remove a healthy mon node:
Connect to that mon node and stop the ceph-mon process (or stop the ceph-mon service).
[root@mon5 ~]# ps -ewf|grep mon
root         127       1  0 Dec09 pts/0    00:00:24 ceph-mon -i mon5 --public-addr 172.17.0.10 --mon-data /data01/ceph/mon/ceph-5
root         202      28  0 11:24 pts/0    00:00:00 grep --color=auto mon
[root@mon5 ~]# kill 127
[root@mon5 ~]# ps -ewf|grep mon
root         204      28  0 11:24 pts/0    00:00:00 grep --color=auto mon


After the ceph-mon process has been stopped:
[root@mon5 ~]# ceph -s
    cluster f649b128-963c-4802-ae17-5a76f36c4c76
     health HEALTH_WARN 1 mons down, quorum 0,1,2,3 mon1,mon2,mon3,mon4
     monmap e3: 5 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0,mon3=172.17.0.4:6789/0,mon4=172.17.0.9:6789/0,mon5=172.17.0.10:6789/0}, election epoch 16, quorum 0,1,2,3 mon1,mon2,mon3,mon4
     osdmap e29: 4 osds: 4 up, 4 in
      pgmap v2384: 128 pgs, 1 pools, 0 bytes data, 0 objects
            41210 MB used, 1596 GB / 1636 GB avail
                 128 active+clean


Remove the monitor from the cluster with the following command:
[root@mon5 ~]# ceph mon remove mon5
removed mon.mon5 at 172.17.0.10:6789/0, there are now 4 monitors

[root@mon5 ~]# ceph -s
    cluster f649b128-963c-4802-ae17-5a76f36c4c76
     health HEALTH_OK
     monmap e4: 4 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0,mon3=172.17.0.4:6789/0,mon4=172.17.0.9:6789/0}, election epoch 20, quorum 0,1,2,3 mon1,mon2,mon3,mon4
     osdmap e29: 4 osds: 4 up, 4 in
      pgmap v2384: 128 pgs, 1 pools, 0 bytes data, 0 objects
            41210 MB used, 1596 GB / 1636 GB avail
                 128 active+clean


It is recommended to edit the configuration file on all mon nodes and remove the deleted monitor from the mon initial members and mon host settings.
(Editing ceph.conf is only for tidiness; mon nodes actually discover each other through the monmap, so the configuration file change is not strictly required.)
/etc/ceph/ceph.conf 

mon initial members = mon1, mon2, mon3, mon4
mon host = 172.17.0.2, 172.17.0.3, 172.17.0.4, 172.17.0.9
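
Optionally, the data directory of the removed monitor can be archived or deleted on the old node so the daemon cannot accidentally be started again. A hedged sketch, reusing the --mon-data path from the example above (adjust it to your own layout):

# on mon5: move the now-unused monitor store out of the way
mv /data01/ceph/mon/ceph-5 /data01/ceph/mon/ceph-5.removed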


Next, let's make the Ceph storage cluster unhealthy: shut down 2 of the 4 mon nodes, leaving 2 running.
[root@localhost ceph]# ssh 172.17.0.9
root@172.17.0.9's password: 
Last login: Tue Dec  9 17:26:03 2014 from 172.17.42.1

[root@mon4 ~]# ps -ewf|grep mon
root         223       1  0 Dec09 pts/0    00:00:24 ceph-mon -i mon4 --public-addr 172.17.0.9 --mon-data /data01/ceph/mon/ceph-4
root         278     261  0 11:29 pts/1    00:00:00 grep --color=auto mon

[root@mon4 ~]# kill 223

[root@mon4 ~]# ceph -s
    cluster f649b128-963c-4802-ae17-5a76f36c4c76
     health HEALTH_WARN 1 mons down, quorum 0,1,2 mon1,mon2,mon3
     monmap e4: 4 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0,mon3=172.17.0.4:6789/0,mon4=172.17.0.9:6789/0}, election epoch 22, quorum 0,1,2 mon1,mon2,mon3
     osdmap e29: 4 osds: 4 up, 4 in
      pgmap v2392: 128 pgs, 1 pools, 0 bytes data, 0 objects
            41219 MB used, 1596 GB / 1636 GB avail
                 128 active+clean

[root@localhost ceph]# ssh 172.17.0.4
root@172.17.0.4's password: 
Last login: Tue Dec  9 17:26:32 2014 from 172.17.42.1

[root@mon3 ~]# service ceph -a stop mon.mon3
=== mon.mon3 === 
Stopping Ceph mon.mon3 on mon3...kill 903...done

[root@mon3 ~]# ceph -s
2014-12-10 11:33:47.004145 7f24641a0700  0 -- :/1001200 >> 172.17.0.4:6789/0 pipe(0x7f2460023190 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f2460023420).fault
^CError connecting to cluster: InterruptedOrTimeoutError

Because the cluster has 4 mon nodes and 2 of them have now been shut down, the Ceph cluster is in an unhealthy state (2 of 4 monitors is not a majority, so no quorum can be formed). To make it healthy again, we can either start those two monitors again, or remove them from the cluster so that only 2 mon nodes remain.
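
Note that once quorum is lost, ceph -s hangs as shown above. To see what a monitor that is still running thinks of the cluster, its local admin socket can be queried instead, which does not require quorum. A sketch, assuming the default admin socket path /var/run/ceph/ceph-mon.<id>.asok:

# on mon1 or mon2 (whose ceph-mon is still running): ask the monitor for its
# own view of the monmap and election state; this works without quorum
ceph --admin-daemon /var/run/ceph/ceph-mon.mon1.asok mon_status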

Next, let's remove mon nodes from a cluster that has become unhealthy because there are not enough monitors left to form a quorum:
(Because the cluster is already unhealthy, the monitors cannot simply be removed with ceph mon remove; instead the monmap has to be extracted (dumped), edited, and injected back, so that the surviving monitors are enough to form a quorum and the cluster can recover.)
First, decide which healthy mon nodes should remain. In this example they are mon1 and mon2.
mon3 and mon4 are both down; assume they cannot be recovered, or are simply no longer needed.
Then stop all the mons that will be kept, in this example mon1 and mon2:
[root@mon1 tmp]# ps -efw|grep mon
root         560       1  0 Dec09 ?        00:01:22 /usr/bin/ceph-mon -i mon1 --pid-file /var/run/ceph/mon.mon1.pid -c /etc/ceph/ceph.conf --cluster ceph
root        1302    1228  0 11:38 pts/1    00:00:00 grep --color=auto mon
[root@mon1 tmp]# kill 560

[root@mon2 ~]# ps -ewf|grep mon
root         220       1  0 Dec09 ?        00:00:28 /usr/bin/ceph-mon -i mon2 --pid-file /var/run/ceph/mon.mon2.pid -c /etc/ceph/ceph.conf --cluster ceph
root         684     665  0 11:39 pts/0    00:00:00 grep --color=auto mon
[root@mon2 ~]# kill 220

Then connect to the mon node, extract (dump) the monmap, and remove the unwanted monitors (mon3, mon4) from the dumped monmap:
[root@mon2 ~]# ceph-mon -i mon2 --extract-monmap /tmp/monmap
2014-12-10 11:39:39.775643 7ff70b6af880 -1 wrote monmap to /tmp/monmap

[root@mon2 ~]# monmaptool /tmp/monmap --rm mon3
monmaptool: monmap file /tmp/monmap
monmaptool: removing mon3
monmaptool: writing epoch 4 to /tmp/monmap (3 monitors)

[root@mon2 ~]# monmaptool /tmp/monmap --rm mon4
monmaptool: monmap file /tmp/monmap
monmaptool: removing mon4
monmaptool: writing epoch 4 to /tmp/monmap (2 monitors)
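
Before injecting, the edited monmap can be inspected to confirm that only mon1 and mon2 are left (monmaptool --print is part of the same monmaptool utility):

# print the contents of the edited monmap; it should now list only mon1 and mon2
monmaptool --print /tmp/monmap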


Inject the modified monmap back into this mon node's monitor store:
[root@mon2 ~]# ceph-mon -i mon2 --inject-monmap /tmp/monmap

Start the monitor:
[root@mon2 ~]# /usr/bin/ceph-mon -i mon2 --pid-file /var/run/ceph/mon.mon2.pid -c /etc/ceph/ceph.conf --cluster ceph

Do the same on the other node:
[root@mon1 ~]# ceph-mon -i mon1 --extract-monmap /tmp/monmap
2014-12-10 11:43:16.959514 7fe2231b5880 -1 wrote monmap to /tmp/monmap

[root@mon1 ~]# monmaptool /tmp/monmap --rm mon3
monmaptool: monmap file /tmp/monmap
monmaptool: removing mon3
monmaptool: writing epoch 4 to /tmp/monmap (3 monitors)

[root@mon1 ~]# monmaptool /tmp/monmap --rm mon4
monmaptool: monmap file /tmp/monmap
monmaptool: removing mon4
monmaptool: writing epoch 4 to /tmp/monmap (2 monitors)

[root@mon1 ~]# ceph-mon -i mon1 --inject-monmap /tmp/monmap

[root@mon1 ~]# /usr/bin/ceph-mon -i mon1 --pid-file /var/run/ceph/mon.mon1.pid -c /etc/ceph/ceph.conf --cluster ceph

The Ceph storage cluster now has only 2 mon nodes, and both are up, so the cluster is healthy.
[root@mon1 ~]# ceph -s
    cluster f649b128-963c-4802-ae17-5a76f36c4c76
     health HEALTH_OK
     monmap e5: 2 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0}, election epoch 26, quorum 0,1 mon1,mon2
     osdmap e29: 4 osds: 4 up, 4 in
      pgmap v2395: 128 pgs, 1 pools, 0 bytes data, 0 objects
            41223 MB used, 1596 GB / 1636 GB avail
                 128 active+clean

If the removed monitors will never be added back, it is recommended to edit the configuration file on the remaining nodes mon1 and mon2 and remove them there as well.
(Again, editing ceph.conf is only for tidiness; mon nodes discover each other through the monmap, so the configuration file change is not strictly required.)
/etc/ceph/ceph.conf
......
mon initial members = mon1, mon2
mon host = 172.17.0.2, 172.17.0.3


That covers removing monitors. Next, mon3, mon4 and mon5 can be added back using the method described in the previous post.
Because the monitor data for mon3, mon4 and mon5 is still on disk, adding them back is simple (see the sketch below for the case where a monitor's data directory has been lost).
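
If, on the other hand, a removed monitor's data directory had been lost, the monitor store would have to be recreated before the daemon could start. A rough sketch of the usual manual procedure, using illustrative /tmp paths and mon5's data path from this post:

# fetch the current monmap and the mon. keyring from the running cluster,
# then initialise an empty monitor store for mon5
ceph mon getmap -o /tmp/monmap
ceph auth get mon. -o /tmp/mon.keyring
ceph-mon -i mon5 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring --mon-data /data01/ceph/mon/ceph-5

In this example, though, the data directories are intact, so the monitors can simply be added back and started: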
[root@mon1 ~]# ceph mon add mon3 172.17.0.4
port defaulted to 6789; added mon.mon3 at 172.17.0.4:6789/0
[root@mon1 ~]# ceph -s
    cluster f649b128-963c-4802-ae17-5a76f36c4c76
     health HEALTH_WARN 1 mons down, quorum 0,1 mon1,mon2
     monmap e6: 3 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0,mon3=172.17.0.4:6789/0}, election epoch 28, quorum 0,1 mon1,mon2
     osdmap e29: 4 osds: 4 up, 4 in
      pgmap v2410: 128 pgs, 1 pools, 0 bytes data, 0 objects
            41234 MB used, 1596 GB / 1636 GB avail
                 128 active+clean
[root@mon1 ~]# ssh 172.17.0.4
root@172.17.0.4's password: 
Last login: Wed Dec 10 11:31:54 2014 from 172.17.42.1
[root@mon3 ~]# service ceph start mon.mon3
=== mon.mon3 === 
Starting Ceph mon.mon3 on mon3...
Starting ceph-create-keys on mon3...
[root@mon3 ~]# ceph -s
    cluster f649b128-963c-4802-ae17-5a76f36c4c76
     health HEALTH_OK
     monmap e6: 3 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0,mon3=172.17.0.4:6789/0}, election epoch 30, quorum 0,1,2 mon1,mon2,mon3
     osdmap e29: 4 osds: 4 up, 4 in
      pgmap v2411: 128 pgs, 1 pools, 0 bytes data, 0 objects
            41234 MB used, 1596 GB / 1636 GB avail
                 128 active+clean

But make sure the cluster will stay healthy even while the newly added monitors are not yet started. With 3 monitors healthy and in quorum, 2 more can be added at once (3 of 5 is still a majority).
[root@mon3 ~]# ceph mon add mon4 172.17.0.9
port defaulted to 6789; added mon.mon4 at 172.17.0.9:6789/0
[root@mon3 ~]# ceph mon add mon5 172.17.0.10
port defaulted to 6789; added mon.mon5 at 172.17.0.10:6789/0
[root@mon4 ~]# /usr/bin/ceph-mon -i mon4 --pid-file /var/run/ceph/mon.mon4.pid -c /etc/ceph/ceph.conf --cluster ceph --mon-data /data01/ceph/mon/ceph-4
[root@mon4 ~]# ceph -s
    cluster f649b128-963c-4802-ae17-5a76f36c4c76
     health HEALTH_WARN 1 mons down, quorum 0,1,2,3 mon1,mon2,mon3,mon4
     monmap e8: 5 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0,mon3=172.17.0.4:6789/0,mon4=172.17.0.9:6789/0,mon5=172.17.0.10:6789/0}, election epoch 36, quorum 0,1,2,3 mon1,mon2,mon3,mon4
     osdmap e29: 4 osds: 4 up, 4 in
      pgmap v2417: 128 pgs, 1 pools, 0 bytes data, 0 objects
            41241 MB used, 1596 GB / 1636 GB avail
                 128 active+clean

[root@mon5 ~]# /usr/bin/ceph-mon -i mon5 --pid-file /var/run/ceph/mon.mon5.pid -c /etc/ceph/ceph.conf --cluster ceph --mon-data /data01/ceph/mon/ceph-5
[root@mon5 ~]# ceph -s
    cluster f649b128-963c-4802-ae17-5a76f36c4c76
     health HEALTH_OK
     monmap e8: 5 mons at {mon1=172.17.0.2:6789/0,mon2=172.17.0.3:6789/0,mon3=172.17.0.4:6789/0,mon4=172.17.0.9:6789/0,mon5=172.17.0.10:6789/0}, election epoch 38, quorum 0,1,2,3,4 mon1,mon2,mon3,mon4,mon5
     osdmap e29: 4 osds: 4 up, 4 in
      pgmap v2418: 128 pgs, 1 pools, 0 bytes data, 0 objects
            41243 MB used, 1596 GB / 1636 GB avail
                 128 active+clean
