1. Architecture
| IP | hostname | role | config |
| --- | --- | --- | --- |
| 11.0.1.3 | ceph01 | adm,mon,mgr | 1C+2.5G+20G |
| 11.0.1.4 | ceph02 | mon,mgr | 1C+2.5G+20G |
| 11.0.1.5 | ceph03 | mon,mgr | 1C+2.5G+20G |
| 11.0.1.6 | ceph04 | osd | 1C+2.5G+20G+5*20G |
| 11.0.1.7 | ceph05 | osd | 1C+2.5G+20G+5*20G |
| 11.0.1.8 | ceph06 | osd | 1C+2.5G+20G+5*20G |
2. Environment initialization
Set the hostname
hostnamectl set-hostname ceph01
Repeat on the other nodes with their respective hostnames.
Disable SELinux
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
Disable the firewall
systemctl disable firewalld --now
Configure time synchronization (chrony is installed by default on CentOS 7.9)
Install and enable the chrony service on all Ceph nodes, especially the monitor nodes, to avoid failures caused by clock drift:
yum install -y chrony
systemctl enable --now chronyd
If time synchronization is not set up, running `cephadm shell` will fail.
Configure hostname resolution in /etc/hosts
cat >> /etc/hosts << EOF
11.0.1.3 ceph01
11.0.1.4 ceph02
11.0.1.5 ceph03
11.0.1.6 ceph04
11.0.1.7 ceph05
11.0.1.8 ceph06
EOF
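Every node needs the same /etc/hosts entries. Rather than editing each node by hand, the file can be pushed out once passwordless SSH (configured in the next step) is in place; the IPs below are taken from the architecture table:

```shell
# Push the updated /etc/hosts to the remaining nodes.
# Run this only after passwordless SSH has been configured (next step).
for ip in 11.0.1.4 11.0.1.5 11.0.1.6 11.0.1.7 11.0.1.8; do
  scp /etc/hosts "root@$ip:/etc/hosts"
done
```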
Configure passwordless SSH
Generate an SSH key pair on ceph01 and copy it to the other cluster nodes:
ssh-keygen -t rsa -P ''
for i in `tail -n 6 /etc/hosts | awk '{print $1}'`; do ssh-copy-id $i;done
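A quick way to confirm passwordless login works before proceeding (hostnames as defined in /etc/hosts above):

```shell
# Each node should print its own hostname without prompting for a password.
# BatchMode makes ssh fail outright instead of prompting if key auth is broken.
for host in ceph01 ceph02 ceph03 ceph04 ceph05 ceph06; do
  ssh -o BatchMode=yes "root@$host" hostname
done
```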
Install Python 3
yum install python3 -y
Install Docker
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce docker-ce-cli containerd.io -y
systemctl enable docker && systemctl start docker
Configure the Ceph yum repository
Note: choose one of the two configurations below according to your CentOS version.
CentOS 7
# Configure this on all nodes, or configure it once and distribute it to the other nodes
# Note: the EPEL repository must also be configured
vi /etc/yum.repos.d/ceph.repo
[Ceph]
name=Ceph packages for $basearch
baseurl=https://mirrors.aliyun.com/ceph/rpm-octopus/el7/$basearch
enabled=1
gpgcheck=0
type=rpm-md
[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch
enabled=1
gpgcheck=0
type=rpm-md
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-octopus/el7/SRPMS
enabled=1
gpgcheck=0
type=rpm-md
CentOS 8
# Configure this on all nodes, or configure it once and distribute it to the other nodes
# Note: the EPEL repository must also be configured
vi /etc/yum.repos.d/ceph.repo
[Ceph]
name=Ceph packages for $basearch
baseurl=https://mirrors.aliyun.com/ceph/rpm-octopus/el8/$basearch
enabled=1
gpgcheck=0
type=rpm-md
[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-octopus/el8/noarch
enabled=1
gpgcheck=0
type=rpm-md
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-octopus/el8/SRPMS
enabled=1
gpgcheck=0
type=rpm-md
3. Deploy Ceph
Run the following on ceph01:
1. Fetch the octopus release of the cephadm script
curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
2. Make it executable
chmod +x cephadm && cp cephadm /usr/bin/cephadm
cephadm is just a Python 3 script; it can be run directly without being installed.
3. Run the bootstrap command
mkdir -p /etc/ceph
cephadm bootstrap --mon-ip 11.0.1.3
This command performs the following steps:
- Creates a monitor daemon
- Generates an SSH key pair and adds the public key to /root/.ssh/authorized_keys
- Writes a minimal configuration needed for cluster communication to /etc/ceph/ceph.conf
- Writes a copy of the client.admin secret key to /etc/ceph/ceph.client.admin.keyring
- Writes a copy of the public key to /etc/ceph/ceph.pub
Sample output:
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 6e2a9c0c-a133-11ed-9914-000c2965979c
Verifying IP 11.0.1.3 port 3300 ...
Verifying IP 11.0.1.3 port 6789 ...
Mon IP 11.0.1.3 is in CIDR network 11.0.1.0/24
Pulling container image quay.io/ceph/ceph:v15...
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network...
Creating mgr...
Verifying port 9283 ...
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Wrote config to /etc/ceph/ceph.conf
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/10)...
mgr not available, waiting (2/10)...
mgr not available, waiting (3/10)...
mgr not available, waiting (4/10)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 5...
Mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to to /etc/ceph/ceph.pub
Adding key to root@localhost's authorized_keys...
Adding host ceph01...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Enabling mgr prometheus module...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for Mgr epoch 13...
Mgr epoch 13 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:
URL: https://ceph01:8443/
User: admin
Password: e5odjghru3
You can access the Ceph CLI with:
sudo /usr/bin/cephadm shell --fsid 6e2a9c0c-a133-11ed-9914-000c2965979c -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/docs/master/mgr/telemetry/
Bootstrap complete.
Note: bootstrap generates a random dashboard password; save these credentials now for accessing the dashboard later.
URL: https://ceph01:8443/
User: admin
Password: e5odjghru3
The following components are now running:
- ceph-mgr: the Ceph manager daemon
- ceph-mon: the Ceph monitor
- ceph-crash: the crash-report collection module
- prometheus: the Prometheus monitoring service
- grafana: the dashboard for monitoring data
- alertmanager: the Prometheus alerting component
- node_exporter: the Prometheus node metrics collector
Optionally, install the package containing all the ceph commands on the node, including `ceph`, `rbd`, and `mount.ceph` (for mounting CephFS file systems):
cephadm add-repo --release octopus
cephadm install ceph-common
Enabling the ceph CLI
cephadm does not require any Ceph packages to be installed on the host. The `cephadm shell` command launches a bash shell inside a container that has all the Ceph packages installed. By default, if configuration and keyring files are found in /etc/ceph on the host, they are passed into the container environment so that the shell is fully functional. Note that when executed on a MON host, `cephadm shell` infers its configuration from the MON container instead of using the default configuration. If `--mount <path>` is given, the host path (file or directory) will appear under /mnt inside the container.
Enter the containerized shell:
cephadm shell
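Typing `cephadm shell -- ...` before every command gets tedious; a small wrapper function on the admin node can hide it (a convenience sketch, not part of the official procedure):

```shell
# Wrapper: forward any ceph command into the cephadm shell container.
ceph() {
  cephadm shell -- ceph "$@"
}
# Example: 'ceph status' now runs inside the container transparently.
```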
Check the status of all daemons:
cephadm shell -- ceph orch ps
Check the status of a specific daemon type:
# mgr daemons
[root@ceph01 ~]# ceph orch ps --daemon-type mgr
NAME HOST STATUS REFRESHED AGE VERSION IMAGE NAME IMAGE ID CONTAINER ID
mgr.ceph01.wefeij ceph01 running (22m) 6m ago 22m 15.2.17 quay.io/ceph/ceph:v15 93146564743f 9c9e45c74508
# mon daemons
[root@ceph01 ~]# ceph orch ps --daemon-type mon
NAME HOST STATUS REFRESHED AGE VERSION IMAGE NAME IMAGE ID CONTAINER ID
mon.ceph01 ceph01 running (22m) 7m ago 22m 15.2.17 quay.io/ceph/ceph:v15 93146564743f 3166e1452aa1
# mds daemons
[root@ceph01 ~]# ceph orch ps --daemon-type mds
No daemons reported
Cluster status
[root@ceph01 ~]# ceph status
cluster:
id: 6e2a9c0c-a133-11ed-9914-000c2965979c
health: HEALTH_WARN
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph01 (age 23m)
mgr: ceph01.wefeij(active, since 22m)
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs:
[root@ceph01 ceph]# ceph -v
ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Add hosts to the Ceph cluster
Install the cluster's public SSH key in the root user's authorized_keys file on each new node:
cd /etc/ceph/
ssh-copy-id -f -i ceph.pub root@ceph02
ssh-copy-id -f -i ceph.pub root@ceph03
ssh-copy-id -f -i ceph.pub root@ceph04
ssh-copy-id -f -i ceph.pub root@ceph05
ssh-copy-id -f -i ceph.pub root@ceph06
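The five commands above can equivalently be written as a loop:

```shell
# Distribute the cluster public key to every new node.
for host in ceph02 ceph03 ceph04 ceph05 ceph06; do
  ssh-copy-id -f -i /etc/ceph/ceph.pub "root@$host"
done
```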
Disable automatic mon and mgr deployment
ceph orch apply mon --unmanaged
ceph orch apply mgr --unmanaged
Without this step, cephadm automatically deploys mon and mgr daemons on every host that is added. A typical Ceph cluster runs three to five monitor daemons; for clusters with more than five nodes, the official documentation recommends five monitors.
ceph orch host add ceph02
ceph orch host add ceph03
ceph orch host add ceph04
ceph orch host add ceph05
ceph orch host add ceph06
The cluster now has six nodes.
List the hosts
[root@ceph01 ceph]# ceph orch host ls
HOST ADDR LABELS STATUS
ceph01 11.0.1.3 _admin
ceph02 11.0.1.4
ceph03 11.0.1.5
ceph04 11.0.1.6
ceph05 11.0.1.7
ceph06 11.0.1.8
6 hosts in cluster
Deploy monitors
Label the nodes that should run a mon daemon:
ceph orch host label add ceph01 mon
ceph orch host label add ceph02 mon
ceph orch host label add ceph03 mon
Deploy monitors according to the label:
ceph orch apply mon label:mon
Alternatively, deploy monitors on a specific set of hosts:
ceph orch apply mon "<host1,host2,host3,...>"
Note: make sure the first (bootstrap) host is included in the list.
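The label-based placement can also be expressed as a service specification file and applied with `ceph orch apply -i`; a minimal sketch (the filename `mon-spec.yaml` is arbitrary):

```shell
# Equivalent to 'ceph orch apply mon label:mon', but as a reusable spec file.
cat > mon-spec.yaml << 'EOF'
service_type: mon
placement:
  label: mon
EOF
ceph orch apply -i mon-spec.yaml
```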
Once the labels are applied, Ceph automatically extends the monitor and manager daemons to the other two nodes. On those nodes, the following containers are now running:
[root@ceph02 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
31edb4e5c332 quay.io/ceph/ceph "/usr/bin/ceph-mon -…" 47 minutes ago Up 47 minutes ceph-e40909f8-a099-11ed-be19-000c2965979c-mon-ceph02
289c64d51284 quay.io/prometheus/node-exporter:v1.3.1 "/bin/node_exporter …" 52 minutes ago Up 52 minutes ceph-e40909f8-a099-11ed-be19-000c2965979c-node-exporter-ceph02
78e8a886d6e0 quay.io/ceph/ceph "/usr/bin/ceph-mgr -…" 52 minutes ago Up 52 minutes ceph-e40909f8-a099-11ed-be19-000c2965979c-mgr-ceph02-hlvadx
326b20981e54 quay.io/ceph/ceph "/usr/bin/ceph-crash…" 52 minutes ago Up 52 minutes ceph-e40909f8-a099-11ed-be19-000c2965979c-crash-ceph02
Verify
[root@ceph01 ceph]# ceph status
Inferring fsid e40909f8-a099-11ed-be19-000c2965979c
Using recent ceph image quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45
cluster:
id: e40909f8-a099-11ed-be19-000c2965979c
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph01,ceph03,ceph02 (age 61m)
mgr: ceph01.lfvymn(active, since 69m), standbys: ceph02.hlvadx, ceph03.qggqsn
osd: 15 osds: 15 up (since 47m), 15 in (since 48m)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 449 KiB
usage: 725 MiB used, 299 GiB / 300 GiB avail
pgs: 1 active+clean
Deploy OSDs
List all available devices on the nodes:
[root@ceph01 ceph]# ceph orch device ls
Inferring fsid e40909f8-a099-11ed-be19-000c2965979c
Using recent ceph image quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45
HOST PATH TYPE DEVICE ID SIZE AVAILABLE REFRESHED REJECT REASONS
ceph04 /dev/sdb hdd 21.4G Yes 2m ago
ceph04 /dev/sdc hdd 21.4G Yes 2m ago
ceph04 /dev/sdd hdd 21.4G Yes 2m ago
ceph04 /dev/sde hdd 21.4G Yes 2m ago
ceph04 /dev/sdf hdd 21.4G Yes 2m ago
ceph05 /dev/sdb hdd 21.4G Yes 12s ago
ceph05 /dev/sdc hdd 21.4G Yes 12s ago
ceph05 /dev/sdd hdd 21.4G Yes 12s ago
ceph05 /dev/sde hdd 21.4G Yes 12s ago
ceph05 /dev/sdf hdd 21.4G Yes 12s ago
ceph06 /dev/sdb hdd 21.4G Yes 2m ago
ceph06 /dev/sdc hdd 21.4G Yes 2m ago
ceph06 /dev/sdd hdd 21.4G Yes 2m ago
ceph06 /dev/sde hdd 21.4G Yes 2m ago
ceph06 /dev/sdf hdd 21.4G Yes 2m ago
A storage device is considered available if all of the following conditions are met:
- The device has no partitions.
- The device has no LVM state.
- The device is not mounted.
- The device does not contain a file system.
- The device does not contain a Ceph BlueStore OSD.
- The device is larger than 5 GB.
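A device that fails these checks because of leftover partitions or LVM state can be reset with `ceph orch device zap`; note that this destroys all data on the device:

```shell
# Wipe a previously used device so cephadm considers it available again.
# WARNING: destroys all data on the device (host/device here are examples).
ceph orch device zap ceph04 /dev/sdb --force
```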
Create OSDs
Method 1: create an OSD from a specific device on a specific host
cephadm shell -- ceph orch daemon add osd ceph04:/dev/sdb
cephadm shell -- ceph orch daemon add osd ceph04:/dev/sdc
cephadm shell -- ceph orch daemon add osd ceph04:/dev/sdd
cephadm shell -- ceph orch daemon add osd ceph04:/dev/sde
cephadm shell -- ceph orch daemon add osd ceph04:/dev/sdf
cephadm shell -- ceph orch daemon add osd ceph05:/dev/sdb
cephadm shell -- ceph orch daemon add osd ceph05:/dev/sdc
cephadm shell -- ceph orch daemon add osd ceph05:/dev/sdd
cephadm shell -- ceph orch daemon add osd ceph05:/dev/sde
cephadm shell -- ceph orch daemon add osd ceph05:/dev/sdf
cephadm shell -- ceph orch daemon add osd ceph06:/dev/sdb
cephadm shell -- ceph orch daemon add osd ceph06:/dev/sdc
cephadm shell -- ceph orch daemon add osd ceph06:/dev/sdd
cephadm shell -- ceph orch daemon add osd ceph06:/dev/sde
cephadm shell -- ceph orch daemon add osd ceph06:/dev/sdf
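The fifteen commands above can be collapsed into a nested loop over the OSD hosts and the device names reported by `ceph orch device ls`:

```shell
# Add one OSD per data disk on each OSD host.
for host in ceph04 ceph05 ceph06; do
  for dev in sdb sdc sdd sde sdf; do
    cephadm shell -- ceph orch daemon add osd "$host:/dev/$dev"
  done
done
```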
Method 2: consume any available and unused storage device
cephadm shell -- ceph orch apply osd --all-available-devices
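Note that method 2 creates an ongoing rule: any qualifying device that appears later is consumed automatically. The orchestrator documentation describes an `unmanaged` parameter to suppress that behavior; worth verifying against your exact octopus version before relying on it:

```shell
# Consume the currently available devices without automatically
# consuming devices added in the future (flag per the orchestrator docs;
# confirm your cephadm version accepts it).
cephadm shell -- ceph orch apply osd --all-available-devices --unmanaged=true
```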
Verify
[root@ceph01 ceph]# cephadm shell -- ceph orch device ls
Inferring fsid e40909f8-a099-11ed-be19-000c2965979c
Using recent ceph image quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45
HOST PATH TYPE DEVICE ID SIZE AVAILABLE REFRESHED REJECT REASONS
ceph04 /dev/sdb hdd 21.4G 6m ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph04 /dev/sdc hdd 21.4G 6m ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph04 /dev/sdd hdd 21.4G 6m ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph04 /dev/sde hdd 21.4G 6m ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph04 /dev/sdf hdd 21.4G 6m ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph05 /dev/sdb hdd 21.4G 1s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph05 /dev/sdc hdd 21.4G 1s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph05 /dev/sdd hdd 21.4G 1s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph05 /dev/sde hdd 21.4G 1s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph05 /dev/sdf hdd 21.4G 1s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph06 /dev/sdb hdd 21.4G 72s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph06 /dev/sdc hdd 21.4G 72s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph06 /dev/sdd hdd 21.4G 72s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph06 /dev/sde hdd 21.4G 72s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
ceph06 /dev/sdf hdd 21.4G 72s ago Insufficient space (<10 extents) on vgs, LVM detected, locked
[root@ceph01 ceph]# cephadm shell -- ceph status
Inferring fsid e40909f8-a099-11ed-be19-000c2965979c
Using recent ceph image quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45
cluster:
id: e40909f8-a099-11ed-be19-000c2965979c
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph01,ceph03,ceph02 (age 15m)
mgr: ceph01.lfvymn(active, since 22m), standbys: ceph02.hlvadx
osd: 15 osds: 15 up (since 92s), 15 in (since 110s)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 449 KiB
usage: 725 MiB used, 299 GiB / 300 GiB avail
pgs: 1 active+clean
[root@ceph01 ceph]# cephadm shell -- ceph osd tree
Inferring fsid e40909f8-a099-11ed-be19-000c2965979c
Using recent ceph image quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.29228 root default
-3 0.09743 host ceph04
0 hdd 0.01949 osd.0 up 1.00000 1.00000
1 hdd 0.01949 osd.1 up 1.00000 1.00000
2 hdd 0.01949 osd.2 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
4 hdd 0.01949 osd.4 up 1.00000 1.00000
-5 0.09743 host ceph05
5 hdd 0.01949 osd.5 up 1.00000 1.00000
6 hdd 0.01949 osd.6 up 1.00000 1.00000
7 hdd 0.01949 osd.7 up 1.00000 1.00000
8 hdd 0.01949 osd.8 up 1.00000 1.00000
14 hdd 0.01949 osd.14 up 1.00000 1.00000
-7 0.09743 host ceph06
9 hdd 0.01949 osd.9 up 1.00000 1.00000
10 hdd 0.01949 osd.10 up 1.00000 1.00000
11 hdd 0.01949 osd.11 up 1.00000 1.00000
12 hdd 0.01949 osd.12 up 1.00000 1.00000
13 hdd 0.01949 osd.13 up 1.00000 1.00000
This completes the basic Ceph environment.