layout: post
title: 使用Rook安装Ceph
catalog: true
tag: [K8S, Ceph]
1. Rook简介
Rook是一个云原生的存储管理工具,当前支持 Ceph
、Cassandra
、NFS
- 主要解决了云原生场景下Ceph的部署、使用、维护等问题,可见的例如
- rgw、nfs的高可用,都可以通过ing-service-pod来实现
- 部署非常快
- 也带来了一些挑战 – 依赖K8S,Ceph使用容器化管理之后,故障率会提升、维护难度会增大。
2. Rook架构
简言之,rook以云原生方式(Operator)实现了Ceph的生命周期管理和Ceph资源的使用
3. Rook部署
3.1. 前提
- 要有一个K8S集群
- K8S集群的worker节点需要至少有一个空闲的块设备作为Ceph的OSD
3.2. 环境描述
如果使用Ceph O版及之后的版本部署会比较顺利,这里使用N版,部署时出现了一点点问题
组件名称 | 版本 | 备注 |
---|---|---|
操作系统 | CentOS Linux release 7.8.2003 (Core) | |
内核 | 5.13.2-1.el7.elrepo.x86_64 | |
K8S | v1.19.7 | |
Ceph | 14.2.22 | |
Rook | v1.7.0 |
3.3. 部署
3.3.1. 获取rook代码
gh repo clone rook/rook
# 切到v1.7.0tag
git checkout v1.7.0
3.3.2. 修改变量
cd cluster/examples/kubernetes/ceph
3.3.2.1. cluster.yaml
只修改了下面两个有注释的地方
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
name: rook-ceph
namespace: rook-ceph # namespace:cluster
spec:
cephVersion:
# 指定Ceph版本
image: quay.io/ceph/ceph:v14.2.22
allowUnsupported: false
dataDirHostPath: /var/lib/rook
skipUpgradeChecks: false
continueUpgradeAfterChecksEvenIfNotHealthy: false
waitTimeoutForHealthyOSDInMinutes: 10
mon:
count: 3
allowMultiplePerNode: false
mgr:
count: 1
modules:
- name: pg_autoscaler
enabled: true
dashboard:
enabled: true
ssl: true
monitoring:
enabled: false
rulesNamespace: rook-ceph
network:
crashCollector:
disable: false
cleanupPolicy:
confirmation: ""
sanitizeDisks:
method: quick
dataSource: zero
iteration: 1
allowUninstallWithVolumes: false
annotations:
labels:
resources:
removeOSDsIfOutAndSafeToRemove: false
storage: # cluster level storage configuration and selection
useAllNodes: true
useAllDevices: true
config:
# 指定Ceph磁盘个数和盘符 可以用正则匹配
osdsPerDevice: "1" # this value can be overridden at the node or device level
devices: "vdb"
disruptionManagement:
managePodBudgets: true
osdMaintenanceTimeout: 30
pgHealthCheckTimeout: 0
manageMachineDisruptionBudgets: false
machineDisruptionBudgetNamespace: openshift-machine-api
healthCheck:
daemonHealth:
mon:
disabled: false
interval: 45s
osd:
disabled: false
interval: 60s
status:
disabled: false
interval: 60s
livenessProbe:
mon:
disabled: false
mgr:
disabled: false
osd:
disabled: false
3.3.2.2. operator.yaml
只修改了csi版本相关信息
kind: ConfigMap
apiVersion: v1
metadata:
name: rook-ceph-operator-config
namespace: rook-ceph # namespace:operator
data:
ROOK_LOG_LEVEL: "INFO"
ROOK_CSI_ENABLE_CEPHFS: "true"
ROOK_CSI_ENABLE_RBD: "true"
ROOK_CSI_ENABLE_GRPC_METRICS: "false"
CSI_ENABLE_CEPHFS_SNAPSHOTTER: "true"
CSI_ENABLE_RBD_SNAPSHOTTER: "true"
CSI_FORCE_CEPHFS_KERNEL_CLIENT: "true"
CSI_RBD_FSGROUPPOLICY: "ReadWriteOnceWithFSType"
CSI_CEPHFS_FSGROUPPOLICY: "None"
# 修改CSI镜像版本,与Ceph版本相匹配
# 版本修改参照 http://elrond.wang/2021/06/19/Kubernetes%E9%9B%86%E6%88%90Ceph/
# 默认使用的镜像被墙,不用代理拉不下来
ROOK_CSI_ALLOW_UNSUPPORTED_VERSION: "true"
ROOK_CSI_CEPH_IMAGE: "quay.io/cephcsi/cephcsi:v3.2-canary"
ROOK_CSI_REGISTRAR_IMAGE: "quay.io/k8scsi/csi-node-driver-registrar:v2.0.1"
ROOK_CSI_RESIZER_IMAGE: "quay.io/k8scsi/csi-resizer:v1.0.1"
ROOK_CSI_PROVISIONER_IMAGE: "quay.io/k8scsi/csi-provisioner:v2.0.4"
ROOK_CSI_SNAPSHOTTER_IMAGE: "quay.io/k8scsi/csi-snapshotter:v4.0.0"
ROOK_CSI_ATTACHER_IMAGE: "quay.io/k8scsi/csi-attacher:v3.0.2"
ROOK_OBC_WATCH_OPERATOR_NAMESPACE: "true"
ROOK_ENABLE_FLEX_DRIVER: "false"
ROOK_ENABLE_DISCOVERY_DAEMON: "false"
ROOK_CEPH_COMMANDS_TIMEOUT_SECONDS: "15"
CSI_ENABLE_VOLUME_REPLICATION: "false"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: rook-ceph-operator
namespace: rook-ceph # namespace:operator
labels:
operator: rook
storage-backend: ceph
spec:
selector:
matchLabels:
app: rook-ceph-operator
replicas: 1
template:
metadata:
labels:
app: rook-ceph-operator
spec:
serviceAccountName: rook-ceph-system
containers:
- name: rook-ceph-operator
image: rook/ceph:v1.7.0
args: ["ceph", "operator"]
volumeMounts:
- mountPath: /var/lib/rook
name: rook-config
- mountPath: /etc/ceph
name: default-config-dir
env:
- name: ROOK_CURRENT_NAMESPACE_ONLY
value: "false"
- name: ROOK_DISCOVER_DEVICES_INTERVAL
value: "60m"
- name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
value: "false"
- name: ROOK_ENABLE_SELINUX_RELABELING
value: "true"
- name: ROOK_ENABLE_FSGROUP
value: "true"
- name: ROOK_DISABLE_DEVICE_HOTPLUG
value: "false"
- name: DISCOVER_DAEMON_UDEV_BLACKLIST
value: "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
- name: ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS
value: "5"
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumes:
- name: rook-config
emptyDir: {}
- name: default-config-dir
emptyDir: {}
3.3.2.3. 其他配置
- 文件系统相关配置
filesystem.yaml
- 对象存储相关配置
object.yaml
- nfs相关配置
nfs.yaml
3.3.3. 开始部署
# 部署operator
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
# 部署Ceph集群
kubectl create -f cluster.yaml
# 部署RGW
kubectl create -f object.yaml
# 部署NFS
kubectl create -f nfs.yaml
# 部署tool-box
kubectl create -f toolbox.yaml
3.3.4. 部署成功的状态
3.3.4.1. po状态
kubectl get po -n rook-ceph
# 输出如下
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-nk6jm 3/3 Running 0 12s
csi-cephfsplugin-provisioner-666c6886db-lktgs 6/6 Running 0 11s
csi-cephfsplugin-provisioner-666c6886db-pw27p 6/6 Running 0 11s
csi-cephfsplugin-xh8kh 3/3 Running 0 12s
csi-cephfsplugin-zl2tm 3/3 Running 0 12s
csi-rbdplugin-5bjzv 3/3 Running 0 14s
csi-rbdplugin-cl2wc 3/3 Running 0 14s
csi-rbdplugin-provisioner-769b8d6f5-9tdzw 6/6 Running 0 13s
csi-rbdplugin-provisioner-769b8d6f5-xfgpx 6/6 Running 0 13s
csi-rbdplugin-wgbfc 3/3 Running 0 14s
rook-ceph-crashcollector-172.16.80.177-6856778cd6-vbdbp 1/1 Running 0 7d15h
rook-ceph-crashcollector-172.16.80.181-84c6474ccf-29jwc 1/1 Running 0 7d15h
rook-ceph-crashcollector-172.16.80.60-5d7c647d48-6rxgx 1/1 Running 0 34m
rook-ceph-mds-myfs-a-568d689779-wr8p7 1/1 Running 0 7d15h
rook-ceph-mds-myfs-b-6f75c5c7f5-vl942 1/1 Running 0 7d15h
rook-ceph-mgr-a-7d6cdd489d-2wh92 1/1 Running 0 7d15h
rook-ceph-mon-a-594d7bcdd4-jrkcm 1/1 Running 0 7d15h
rook-ceph-mon-b-d846c949-gmrkv 1/1 Running 0 7d15h
rook-ceph-mon-c-868856f488-v264b 1/1 Running 0 7d15h
rook-ceph-nfs-my-nfs-a-9fb6cbff6-cvhbc 2/2 Running 0 7d15h
rook-ceph-nfs-my-nfs-b-5dfcdf6454-kktdg 2/2 Running 0 7d15h
rook-ceph-nfs-my-nfs-c-78f748c5f6-kxszz 2/2 Running 0 7d15h
rook-ceph-operator-6b88ff7b4c-nckkl 1/1 Running 0 7d17h
rook-ceph-osd-0-898c48bcc-vvq4g 1/1 Running 0 7d15h
rook-ceph-osd-1-7d65ff8c46-wmclv 1/1 Running 0 7d15h
rook-ceph-osd-2-6856cd76c6-n2rqh 1/1 Running 0 7d15h
rook-ceph-osd-prepare-172.16.80.177-rxqlz 0/1 Completed 0 35m
rook-ceph-osd-prepare-172.16.80.181-mzrnx 0/1 Completed 0 35m
rook-ceph-osd-prepare-172.16.80.60-twls6 0/1 Completed 0 35m
rook-ceph-rgw-my-store-a-6465b6f48f-stwp4 1/1 Running 0 34m
rook-ceph-tools-744c55f865-qpx8s 1/1 Running 0 7d16h
3.3.5. 集群状态
进入toolbox查看
# 进入toolbox
kubectl exec -it -n rook-ceph rook-ceph-tools-744c55f865-qpx8s -- bash
# 执行命令
[root@rook-ceph-tools-744c55f865-qpx8s /]# ceph -s
cluster:
id: e8bc8a28-0c21-48a0-b676-ada1f26bbc41
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 7d)
mgr: a(active, since 6d)
mds: myfs:1 {0=myfs-a=up:active} 1 up:standby-replay
osd: 3 osds: 3 up (since 7d), 3 in (since 7d)
rgw: 1 daemon active (my.store.a)
task status:
data:
pools: 9 pools, 144 pgs
objects: 286 objects, 19 KiB
usage: 3.0 GiB used, 1.5 TiB / 1.5 TiB avail
pgs: 144 active+clean
io:
client: 1.6 KiB/s rd, 170 B/s wr, 2 op/s rd, 0 op/s wr
到这里就部署完毕,可以直接使用了
4. 部署中的问题
4.1. csi镜像无法拉到
国内无法访问 k8s.gcr.io, 所以需要将镜像地址替换为红帽的镜像仓库,参照上面代码中的注释,各个版本对应的镜像版本不一样,按实际情况替换
4.2. 删除rook时卡住
cephcluster
删除一直卡着
- 解决方法:
在另一个窗口执行
kubectl -n rook-ceph patch crd cephclusters.ceph.rook.io --type merge -p '{"metadata":{"finalizers": [null]}}'
- 参照
- https://rook.io/docs/rook/v1.1/ceph-teardown.html
- https://github.com/rook/rook/issues/4410
4.3. 重装rook是mon只初始化了一个,后面的部署都卡住
operater一直打印如下日志
kubectl logs rook-ceph-operator-6b88ff7b4c-nckkl -n rook-ceph -f
2021-01-07 15:44:39.084593 I | op-mon: waiting for mon quorum with [a]
2021-01-07 15:44:39.095582 I | op-mon: mons running: [a]
2021-01-07 15:44:59.929839 I | op-mon: mons running: [a]
2021-01-07 15:45:20.704740 I | op-mon: mons running: [a]
2021-01-07 15:45:41.473138 I | op-mon: mons running: [a]
2021-01-07 15:46:02.242541 I | op-mon: mons running: [a]
2021-01-07 15:46:23.001123 I | op-mon: mons running: [a]
2021-01-07 15:46:43.853226 I | op-mon: mons running: [a]
2021-01-07 15:47:04.651949 I | op-mon: mons running: [a]
2021-01-07 15:47:25.420815 I | op-mon: mons running: [a]
2021-01-07 15:47:46.238038 I | op-mon: mons running: [a]
2021-01-07 15:48:07.077166 I | op-mon: mons running: [a]
2021-01-07 15:48:27.858133 I | op-mon: mons running: [a]
get po
只有一个mon,且也没有osd的pod
- 解决方式
- 删除mon数据目录, 默认为
/var/lib/rook
, 参照 https://rook.io/docs/rook/v1.7/ceph-common-issues.html#monitors-are-the-only-pods-running
- 删除mon数据目录, 默认为
5. 最后
- rook 文档相对详尽
- rook 社区活跃
- rook用来做测试非常方便
- rook的一些高可用实现思路可以参考并用于进程版
- 生产集群用rook的收益与风险不成正比,生产应谨慎考虑之后再做使用