Environment
Ceph cluster nodes
This article uses the ceph-deploy tool to deploy a community-edition Ceph cluster running the Nautilus release. The cluster's mon, mgr, and osd services are deployed on the following three VM nodes.
- ceph-node1 (roles: mon/mgr/osd)
- ceph-node2 (roles: mon/mgr/osd)
- ceph-node3 (roles: mon/mgr/osd)
We will run the ceph-deploy tool from a dedicated ceph-deploy node to deploy the cluster (note: ceph-deploy supports community Ceph releases only up to Nautilus). This node also acts as the primary NTP server.
Ceph cluster host environment
This article simulates physical servers with VirtualBox VMs. When configuring the VMs, attach two extra storage disks (sdb and sdc) to each of the three Ceph cluster VMs; 50GB per disk is enough.
Give each VM two NICs, one Bridged and one Host-Only, serving the public subnet used for external access to the cluster ("192.168.1.0" in this article) and the private subnet used for internal cluster traffic ("192.168.99.0"), respectively.
After creating the VMs, perform a minimal installation of CentOS 7.x or RHEL 7.x on all nodes. During installation, assign each node a hostname and static IP addresses according to the following list.
- ceph-deploy:192.168.1.200/24、192.168.99.200/24
- ceph-node1:192.168.1.201/24、192.168.99.201/24
- ceph-node2:192.168.1.202/24、192.168.99.202/24
- ceph-node3:192.168.1.203/24、192.168.99.203/24
Since the two NICs of a VirtualBox VM are named "enp0s3" and "enp0s8" by default, the two IP addresses can be set with the following commands.
$ nmcli con modify enp0s3 ipv4.method manual ipv4.addresses 192.168.1.XXX/24
$ nmcli con modify enp0s8 ipv4.method manual ipv4.addresses 192.168.99.XXX/24
$ systemctl restart network
Deploying the Ceph cluster with ceph-deploy
Note: unless stated otherwise, every command in this article is executed as the root user on the ceph-deploy node (another user would also work).
Preparing the node environment
Setting environment variables
Run the following commands to record the commonly used parameters as environment variables.
$ cat >> ~/.bashrc << EOF
CEPH_VER=nautilus ## Ceph release to install
CEPH_DEPLOY_DIR=~/ceph-deploy ## directory holding the cluster deployment files
PUBLIC_SUBNET=192.168.1 ## public (external access) subnet
PRIVATE_SUBNET=192.168.99 ## private (internal traffic) subnet
CEPH_DEPLOY_NODE=ceph-deploy ## node that performs the Ceph deployment
CEPH_NODE1=ceph-node1 ## Ceph node 1
CEPH_NODE2=ceph-node2 ## Ceph node 2
CEPH_NODE3=ceph-node3 ## Ceph node 3
CEPH_NODE_LIST="ceph-node1 ceph-node2 ceph-node3" ## all Ceph cluster nodes
EOF
$ source ~/.bashrc
Setting up hosts
To keep things simple, this article resolves the four hostnames through the hosts file rather than DNS. If your environment provides a DNS service, you can skip this step.
Run the following commands to write the hostnames into the hosts file and copy it to the other three hosts.
$ cat >> /etc/hosts << EOF
$PUBLIC_SUBNET.200 $CEPH_DEPLOY_NODE
$PUBLIC_SUBNET.201 $CEPH_NODE1
$PUBLIC_SUBNET.202 $CEPH_NODE2
$PUBLIC_SUBNET.203 $CEPH_NODE3
EOF
$ for i in $CEPH_NODE_LIST; do scp /etc/hosts root@${i}:/etc/hosts; done
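The four entries above can also be generated from the subnet variable instead of being typed by hand. A minimal sketch (the `gen_hosts` helper name, the scratch path, and the hard-coded host list are illustrative, not part of the original setup):

```shell
# gen_hosts: hypothetical helper that derives the hosts-file entries
# from PUBLIC_SUBNET, mirroring the heredoc above (octets start at 200).
PUBLIC_SUBNET=192.168.1   # assumed to match the value set in ~/.bashrc
gen_hosts() {
  local n=200
  for host in ceph-deploy ceph-node1 ceph-node2 ceph-node3; do
    printf '%s.%s %s\n' "$PUBLIC_SUBNET" "$n" "$host"
    n=$((n + 1))
  done
}
gen_hosts > /tmp/hosts.demo   # write to a scratch file for review
cat /tmp/hosts.demo
```

Reviewing the generated file before appending it to /etc/hosts avoids fanning a typo out to every node.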
Generating a key pair for passwordless login
Run the following commands to generate a key pair and copy the public key to all nodes.
$ ssh-keygen -t rsa -b 2048 -P '' -f ~/.ssh/id_rsa
$ for i in $CEPH_DEPLOY_NODE $CEPH_NODE_LIST; do ssh-copy-id root@${i} -f; done
Setting the node hostnames
Run the following command to set the hostnames of the four nodes.
$ for i in $CEPH_DEPLOY_NODE $CEPH_NODE_LIST; do ssh ${i} hostnamectl set-hostname ${i}; ssh ${i} hostname; done
After this completes, log out of all nodes and log back in.
Disabling the firewall and SELinux
To avoid network-access restrictions, run the following commands to stop the firewall and disable SELinux on every node (in production you should keep both enabled).
$ for i in $CEPH_DEPLOY_NODE $CEPH_NODE_LIST; do ssh ${i} systemctl stop firewalld; ssh ${i} systemctl disable firewalld; done
$ for i in $CEPH_DEPLOY_NODE $CEPH_NODE_LIST; do ssh ${i} sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config; ssh ${i} setenforce 0; done
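Before running that `sed` expression on every node, it can be sanity-checked against a throwaway copy of the config (a local sketch; `/tmp/selinux.test` is just a scratch path):

```shell
# Exercise the same sed expression on a scratch file: only the
# SELINUX= line should change, while SELINUXTYPE= stays untouched.
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > /tmp/selinux.test
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /tmp/selinux.test
cat /tmp/selinux.test
```

The `^SELINUX=` anchor requires the `=` immediately after `SELINUX`, so the `SELINUXTYPE=` line is not matched.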
Configuring the Yum repos
Run the following commands to create the repo files and yum sources, and distribute them to the nodes.
$ cat > /etc/yum.repos.d/CentOS-Base.repo << EOF
[base]
name=CentOS Base
baseurl=http://mirrors.aliyun.com/centos/7/os/x86_64/
gpgcheck=0
[updates]
name=CentOS Updates
baseurl=http://mirrors.aliyun.com/centos/7/updates/x86_64/
gpgcheck=0
[extras]
name=CentOS Extras
baseurl=http://mirrors.aliyun.com/centos/7/extras/x86_64/
gpgcheck=0
EOF
$ for i in $CEPH_NODE_LIST; do scp /etc/yum.repos.d/CentOS-Base.repo root@${i}:/etc/yum.repos.d/; done
$ cat > /etc/yum.repos.d/epel.repo << EOF
[epel]
name=Extra Packages for Enterprise Linux 7
baseurl=http://mirrors.aliyun.com/epel/7/x86_64
gpgcheck=0
EOF
$ for i in $CEPH_NODE_LIST; do scp /etc/yum.repos.d/epel.repo root@${i}:/etc/yum.repos.d/; done
$ cat > /etc/yum.repos.d/ceph.repo << EOF
[Ceph]
name=Ceph packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-$CEPH_VER/el7/x86_64/
gpgcheck=0
[Ceph-noarch]
name=Ceph noarch packages
baseurl=https://mirrors.aliyun.com/ceph/rpm-$CEPH_VER/el7/noarch/
gpgcheck=0
EOF
$ for i in $CEPH_NODE_LIST; do scp /etc/yum.repos.d/ceph.repo root@${i}:/etc/yum.repos.d/; done
Installing and configuring the NTP service
Run the following commands to install the NTP service on the four VM nodes, with the ceph-deploy node acting as the primary NTP server.
$ for i in $CEPH_DEPLOY_NODE $CEPH_NODE_LIST; do ssh ${i} yum install ntp -y; done
Note that the ceph-deploy node cannot use itself as its own NTP source; give it a config that serves its local clock, and give the three Ceph nodes a config that syncs to it.
$ cat > /etc/ntp.conf << EOF
restrict default nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict ::1
server 127.127.1.0 iburst
fudge 127.127.1.0 stratum 10
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
disable monitor
EOF
$ cat > ~/ntp.conf << EOF
restrict default nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict ::1
server $CEPH_DEPLOY_NODE iburst
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
disable monitor
EOF
$ for i in $CEPH_NODE_LIST; do scp ~/ntp.conf root@${i}:/etc/; done
$ for i in $CEPH_DEPLOY_NODE $CEPH_NODE_LIST; do ssh ${i} systemctl restart ntpd; ssh ${i} systemctl enable ntpd; ssh ${i} ntpq -pn; done
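A quick way to confirm the nodes have actually converged is to look at the offset column of `ntpq -pn`. The helper below is a sketch of such a check; the 50 ms threshold and the inlined sample output are assumptions, not values from this deployment:

```shell
# offsets_ok: reads `ntpq -pn` output on stdin and fails if any
# peer's absolute offset (column 9, in milliseconds) exceeds 50 ms.
offsets_ok() {
  awk 'NR > 2 { off = $9; if (off < 0) off = -off; if (off > 50) bad = 1 }
       END { exit bad }'
}
# Illustrative sample output piped through the check:
printf '%s\n' \
  '     remote     refid    st t when poll reach delay offset jitter' \
  '=================================================================' \
  '*192.168.1.200 LOCAL(0)  11 u   32   64  377  0.21   1.53   0.31' \
  | offsets_ok && echo "clocks within 50 ms"
```

On a live node you would run `ntpq -pn | offsets_ok` instead of piping in sample text.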
Creating the Ceph cluster
Unless noted otherwise, all of the following commands are run inside $CEPH_DEPLOY_DIR.
$ mkdir $CEPH_DEPLOY_DIR
$ cd $CEPH_DEPLOY_DIR
Installing the ceph-deploy program
$ yum install ceph-deploy -y
Installing the Ceph packages on all cluster nodes
Use either one of the following commands to install the Ceph packages on all cluster nodes.
$ for i in $CEPH_NODE_LIST; do ceph-deploy install $i; done
or
$ for i in $CEPH_NODE_LIST; do ssh ${i} yum -y install ceph ceph-osd ceph-mds ceph-mon ceph-radosgw; done
Creating the cluster's mon configuration
First create a Ceph cluster with a single mon service, using ceph-node1 as the node that runs the first mon. Then inspect the generated cluster configuration files in the deployment directory.
$ ceph-deploy new --cluster-network $PRIVATE_SUBNET.0/24 --public-network $PUBLIC_SUBNET.0/24 $CEPH_NODE1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy new --cluster-network 192.168.99.0/24 --public-network 192.168.1.0/24 ceph-node1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] func : <function new at 0x7fb44978ee60>
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fb448efedd0>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] ssh_copykey : True
[ceph_deploy.cli][INFO ] mon : ['ceph-node1']
[ceph_deploy.cli][INFO ] public_network : 192.168.1.0/24
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] cluster_network : 192.168.99.0/24
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] fsid : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO ] making sure passwordless SSH succeeds
[ceph-mon1][DEBUG ] connected to host: deploy
[ceph-mon1][INFO ] Running command: ssh -CT -o BatchMode=yes ceph-node1
[ceph-mon1][DEBUG ] connected to host: ceph-node1
[ceph-mon1][DEBUG ] detect platform information from remote host
[ceph-mon1][DEBUG ] detect machine type
[ceph-mon1][DEBUG ] find the location of an executable
[ceph-mon1][INFO ] Running command: /usr/sbin/ip link show
[ceph-mon1][INFO ] Running command: /usr/sbin/ip addr show
[ceph-mon1][DEBUG ] IP addresses found: [u'192.168.99.201', u'192.168.1.201']
[ceph_deploy.new][DEBUG ] Resolving host ceph-node1
[ceph_deploy.new][DEBUG ] Monitor ceph-node1 at 192.168.1.201
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-node1']
[ceph_deploy.new][DEBUG ] Monitor addrs are [u'192.168.1.201']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
$ ls -al $CEPH_DEPLOY_DIR
total 12
-rw-r--r--. 1 root root 266 Oct 21 04:46 ceph.conf
-rw-r--r--. 1 root root 3245 Oct 21 04:46 ceph-deploy-ceph.log
-rw-------. 1 root root 73 Oct 21 04:46 ceph.mon.keyring
Copying the cluster configuration and keys to all cluster nodes
Generate the keyring files used for internal cluster administration from the cluster configuration, then copy them to each node.
$ ceph-deploy mon create-initial
$ ls -al $CEPH_DEPLOY_DIR
total 132
-rw-------. 1 root root 113 Oct 21 09:26 ceph.bootstrap-mds.keyring
-rw-------. 1 root root 113 Oct 21 09:26 ceph.bootstrap-mgr.keyring
-rw-------. 1 root root 113 Oct 21 09:26 ceph.bootstrap-osd.keyring
-rw-------. 1 root root 113 Oct 21 09:26 ceph.bootstrap-rgw.keyring
-rw-------. 1 root root 151 Oct 21 09:26 ceph.client.admin.keyring
-rw-r--r--. 1 root root 267 Oct 21 09:25 ceph.conf
-rw-r--r--. 1 root root 104137 Oct 21 09:40 ceph-deploy-ceph.log
-rw-------. 1 root root 73 Oct 21 09:25 ceph.mon.keyring
$ for i in $CEPH_NODE_LIST; do ceph-deploy admin $i; done
Check the Ceph cluster status and confirm that the only service in the cluster so far is "mon: 1 daemons".
$ ssh $CEPH_NODE1 ceph -s
cluster:
id: 390a28c3-9c6f-42fb-aa14-846aa60bfcca
health: HEALTH_OK
services:
mon: 1 daemons, quorum ceph-node1 (age 30m)
mgr: no daemons active
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Creating the cluster's mgr service
Run the following command to create the cluster's mgr service on the $CEPH_NODE1 node.
$ ceph-deploy mgr create $CEPH_NODE1
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-node1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] mgr : [('ceph-node1', 'ceph-node1')]
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f2946087b48>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mgr at 0x7f29469771b8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-node1:ceph-node1
[ceph-node1][DEBUG ] connected to host: ceph-node1
[ceph-node1][DEBUG ] detect platform information from remote host
[ceph-node1][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO ] Distro info: Red Hat Enterprise Linux Server 7.7 Maipo
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-node1
[ceph-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node1][WARNIN] mgr keyring does not exist yet, creating one
[ceph-node1][DEBUG ] create a keyring file
[ceph-node1][DEBUG ] create path recursively if it doesn't exist
[ceph-node1][INFO ] Running command: ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-node1 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-node1/keyring
[ceph-node1][INFO ] Running command: systemctl enable ceph-mgr@ceph-node1
[ceph-node1][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-node1.service to /usr/lib/systemd/system/ceph-mgr@.service.
[ceph-node1][INFO ] Running command: systemctl start ceph-mgr@ceph-node1
[ceph-node1][INFO ] Running command: systemctl enable ceph.target
Check the cluster status again; "mgr: ceph-node1(active, since 2m)" confirms that the cluster now has an active mgr service.
$ ssh $CEPH_NODE1 ceph -s
cluster:
id: 390a28c3-9c6f-42fb-aa14-846aa60bfcca
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph-node1 (age 75m)
mgr: ceph-node1(active, since 2m)
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
Adding osd services to the cluster
First run the following command to inspect the storage on all cluster nodes; each of them should have "sdb" and "sdc" disks.
$ for i in $CEPH_NODE_LIST; do ssh $i lsblk; done
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 100G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 99G 0 part
├─rhel-root 253:0 0 97G 0 lvm /
└─rhel-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 50G 0 disk
sdc 8:32 0 50G 0 disk
sr0 11:0 1 1024M 0 rom
...
Run the following command to add the "sdb" and "sdc" disks of every cluster node to that node's osd service.
$ for i in $CEPH_NODE_LIST; do ceph-deploy osd create --data /dev/sdb $i; ceph-deploy osd create --data /dev/sdc $i; done
Run the following command again to check the osd state on every cluster node; note that "sdb" and "sdc" now carry the OSD LVM volumes.
$ for i in $CEPH_NODE_LIST; do ssh $i lsblk; done
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 100G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 99G 0 part
├─rhel-root 253:0 0 97G 0 lvm /
└─rhel-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 50G 0 disk
└─ceph--d16739ba--8373--484e--85df--8e75ae62c8ec-osd--block--50b96cba--c256--4672--8a66--ae7252622c04 253:2 0 50G 0 lvm
sdc 8:32 0 50G 0 disk
└─ceph--44692265--8cef--4c6d--b391--ad3098113c4c-osd--block--7d6a97f9--330d--45dd--94b2--485040c1d27f 253:3 0 50G 0 lvm
sr0 11:0 1 1024M 0 rom
...
Run the following commands to check the cluster's osd status. "osd: 6 osds" confirms that the cluster is now running six osd services.
$ ssh $CEPH_NODE1 ceph -s
cluster:
id: 390a28c3-9c6f-42fb-aa14-846aa60bfcca
health: HEALTH_OK
services:
mon: 1 daemons, quorum ceph-node1 (age 27m)
mgr: ceph-node1(active, since 26m)
osd: 6 osds: 6 up (since 22s), 6 in (since 22s)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 294 GiB / 300 GiB avail
pgs:
$ ssh $CEPH_NODE1 ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.29279 root default
-3 0.09760 host ceph-node1
0 hdd 0.04880 osd.0 up 1.00000 1.00000
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-5 0.09760 host ceph-node2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
3 hdd 0.04880 osd.3 up 1.00000 1.00000
-7 0.09760 host ceph-node3
4 hdd 0.04880 osd.4 up 1.00000 1.00000
5 hdd 0.04880 osd.5 up 1.00000 1.00000
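The WEIGHT column above is simply each device's capacity expressed in TiB: 50 GiB / 1024 = 0.048828..., which appears here cut to four decimal places (0.0488, displayed as 0.04880), and each host's weight is the sum of its two osds (0.09760). Treat the truncation step as an assumption inferred from this output:

```shell
# 50 GiB expressed in TiB, and the same value cut to four decimal
# places as it shows up in the WEIGHT column (0.0488 -> 0.04880).
awk 'BEGIN {
  w = 50 / 1024                       # 0.048828125 TiB
  printf "exact:     %.9f\n", w
  printf "truncated: %.4f\n", int(w * 10000) / 10000
  printf "per host:  %.4f\n", 2 * int(w * 10000) / 10000
}'
```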
Adding mon services to the cluster
First check the mon status of the cluster and confirm that there is currently only one mon service.
$ ssh $CEPH_NODE1 ceph mon stat
e1: 1 mons at {ceph-node1=[v2:192.168.1.201:3300/0,v1:192.168.1.201:6789/0]}, election epoch 5, leader 0 ceph-node1, quorum 0 ceph-node1
$ ssh $CEPH_NODE1 ceph mon_status --format json-pretty
{
"name": "ceph-node1",
"rank": 0,
"state": "leader",
"election_epoch": 5,
"quorum": [
0
],
...
"mons": [
{
"rank": 0,
"name": "ceph-node1",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "192.168.1.201:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "192.168.1.201:6789",
"nonce": 0
}
]
},
"addr": "192.168.1.201:6789/0",
"public_addr": "192.168.1.201:6789/0"
}
]
},
...
Run the following command to add mon services on the $CEPH_NODE2 and $CEPH_NODE3 nodes (ceph-deploy mon add handles one host per invocation, as the log below shows).
$ for i in $CEPH_NODE2 $CEPH_NODE3; do ceph-deploy mon add $i; done
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon add ceph-node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f6a26b222d8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['ceph-node2']
[ceph_deploy.cli][INFO ] func : <function mon at 0x7f6a26d89488>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph-node2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node2
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph-node2
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 192.168.1.202
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-node2 ...
[ceph-node2][DEBUG ] connected to host: ceph-node2
...
Check the mon status again and confirm that the cluster now has three mon services; "leader 0 ceph-node1" shows that the "ceph-node1" node is the leader.
$ ssh $CEPH_NODE1 ceph mon stat
e3: 3 mons at {ceph-node1=[v2:192.168.1.201:3300/0,v1:192.168.1.201:6789/0],ceph-node2=[v2:192.168.1.202:3300/0,v1:192.168.1.202:6789/0],ceph-node3=[v2:192.168.1.203:3300/0,v1:192.168.1.203:6789/0]}, election epoch 14, leader 0 ceph-node1, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
$ ssh $CEPH_NODE1 ceph mon dump
epoch 3
fsid 390a28c3-9c6f-42fb-aa14-846aa60bfcca
last_changed 2020-10-22 08:26:46.560192
created 2020-10-21 09:26:43.566557
min_mon_release 14 (nautilus)
0: [v2:192.168.1.201:3300/0,v1:192.168.1.201:6789/0] mon.ceph-node1
1: [v2:192.168.1.202:3300/0,v1:192.168.1.202:6789/0] mon.ceph-node2
2: [v2:192.168.1.203:3300/0,v1:192.168.1.203:6789/0] mon.ceph-node3
dumped monmap epoch 3
$ ssh $CEPH_NODE1 ceph -s
cluster:
id: 390a28c3-9c6f-42fb-aa14-846aa60bfcca
health: HEALTH_WARN
clock skew detected on mon.ceph-node2, mon.ceph-node3
services:
mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3 (age 9m)
mgr: ceph-node1(active, since 21h)
osd: 6 osds: 6 up (since 20h), 6 in (since 20h)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 294 GiB / 300 GiB avail
pgs:
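The "clock skew detected" warning above normally clears on its own once ntpd on ceph-node2/ceph-node3 converges against the ceph-deploy node. If it persists, one common remedy (an assumption here, not a step from the original walkthrough) is to stop ntpd, step the clock once with ntpdate, and restart ntpd on the skewed nodes. The sketch below only prints the commands as a dry run so they can be reviewed first:

```shell
# Build the per-node resync command list as a dry run; pipe the
# output to `sh` only after reviewing it.
CEPH_DEPLOY_NODE=ceph-deploy            # assumed values from ~/.bashrc
CEPH_SKEWED_NODES="ceph-node2 ceph-node3"
cmds=$(for i in $CEPH_SKEWED_NODES; do
  printf 'ssh %s systemctl stop ntpd\n' "$i"
  printf 'ssh %s ntpdate %s\n' "$i" "$CEPH_DEPLOY_NODE"
  printf 'ssh %s systemctl start ntpd\n' "$i"
done)
echo "$cmds"
```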
Adding mgr services to the cluster
Run the following command to create mgr services on the $CEPH_NODE2 and $CEPH_NODE3 nodes.
$ ceph-deploy mgr create $CEPH_NODE2 $CEPH_NODE3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-node2 ceph-node3
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] mgr : [('ceph-node2', 'ceph-node2'), ('ceph-node3', 'ceph-node3')]
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f42dbdf7b48>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mgr at 0x7f42dc6e71b8>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-node2:ceph-node2 ceph-node3:ceph-node3
[ceph-node2][DEBUG ] connected to host: ceph-node2
...
Check the cluster status again and confirm that there are now three mgr services: the mgr on ceph-node1 is active, while those on the other two nodes are in standby.
$ ssh $CEPH_NODE1 ceph -s
cluster:
id: 390a28c3-9c6f-42fb-aa14-846aa60bfcca
health: HEALTH_WARN
clock skew detected on mon.ceph-node2, mon.ceph-node3
services:
mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3 (age 35m)
mgr: ceph-node1(active, since 21h), standbys: ceph-node2, ceph-node3
osd: 6 osds: 6 up (since 21h), 6 in (since 21h)
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 6.0 GiB used, 294 GiB / 300 GiB avail
pgs:
Checking the cluster's running services
Run the following commands to view the mon, mgr, and osd processes on the $CEPH_NODE1 node.
$ ssh $CEPH_NODE1 systemctl list-units | grep ceph-mon
ceph-mon@ceph-node1.service loaded active running Ceph cluster monitor daemon
ceph-mon.target loaded active active ceph target allowing to start/stop all ceph-mon@.service instances at once
$ ssh $CEPH_NODE1 systemctl list-units | grep ceph-mgr
ceph-mgr@ceph-node1.service loaded active running Ceph cluster manager daemon
ceph-mgr.target loaded active active ceph target allowing to start/stop all ceph-mgr@.service instances at once
$ ssh $CEPH_NODE1 systemctl list-units | grep ceph-osd
var-lib-ceph-osd-ceph\x2d0.mount loaded active mounted /var/lib/ceph/osd/ceph-0
var-lib-ceph-osd-ceph\x2d1.mount loaded active mounted /var/lib/ceph/osd/ceph-1
ceph-osd@0.service loaded active running Ceph object storage daemon osd.0
ceph-osd@1.service loaded active running Ceph object storage daemon osd.1
ceph-osd.target loaded active active ceph target allowing to start/stop all ceph-osd@.service instances at once
Check the running state of the mgr service on the $CEPH_NODE1 node.
$ ssh $CEPH_NODE1 systemctl status ceph-mgr@$CEPH_NODE1.service
● ceph-mgr@ceph-node1.service - Ceph cluster manager daemon
Loaded: loaded (/usr/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-10-21 11:09:48 EDT; 1 day 4h ago
Main PID: 989 (ceph-mgr)
CGroup: /system.slice/system-ceph\x2dmgr.slice/ceph-mgr@ceph-node1.service
└─989 /usr/bin/ceph-mgr -f --cluster ceph --id ceph-node1 --setuser ceph --setgroup ceph
Oct 21 11:35:00 ceph-node1 systemd[1]: [/usr/lib/systemd/system/ceph-mgr@.service:22] Unknown lvalue 'ProtectControlGroups' in section 'Service'
Oct 21 11:35:00 ceph-node1 systemd[1]: [/usr/lib/systemd/system/ceph-mgr@.service:24] Unknown lvalue 'ProtectKernelModules' in section 'Service'
Oct 21 11:35:00 ceph-node1 systemd[1]: [/usr/lib/systemd/system/ceph-mgr@.service:25] Unknown lvalue 'ProtectKernelTunables' in section 'Service'
Oct 21 11:35:00 ceph-node1 systemd[1]: [/usr/lib/systemd/system/ceph-mgr@.service:14] Unknown lvalue 'LockPersonality' in section 'Service'
Oct 21 11:35:00 ceph-node1 systemd[1]: [/usr/lib/systemd/system/ceph-mgr@.service:18] Unknown lvalue 'MemoryDenyWriteExecute' in section 'Service'
Oct 21 11:35:00 ceph-node1 systemd[1]: [/usr/lib/systemd/system/ceph-mgr@.service:22] Unknown lvalue 'ProtectControlGroups' in section 'Service'
Oct 21 11:35:00 ceph-node1 systemd[1]: [/usr/lib/systemd/system/ceph-mgr@.service:24] Unknown lvalue 'ProtectKernelModules' in section 'Service'
Oct 21 11:35:00 ceph-node1 systemd[1]: [/usr/lib/systemd/system/ceph-mgr@.service:25] Unknown lvalue 'ProtectKernelTunables' in section 'Service'
Reference
https://www.bilibili.com/video/BV1ug4y187S9?p=17