I. Environment Preparation
A 4-node Kubernetes 1.13.2 cluster is deployed here with kubeadm. Cluster node information:
Role   | Hostname | IP           | OS         | Kernel
master | node0    | 192.168.2.20 | CentOS 7.5 | 3.10.0-862.el7.x86_64
node1  | node1    | 192.168.2.21 | CentOS 7.5 | 3.10.0-862.el7.x86_64
node2  | node2    | 192.168.2.22 | CentOS 7.5 | 3.10.0-862.el7.x86_64
node3  | node3    | 192.168.2.23 | CentOS 7.5 | 3.10.0-862.el7.x86_64
II. Deploy the Rook Operator
1 Clone the Rook GitHub repository to the local machine:
git clone https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph/
2 Apply the operator.yaml file to deploy the Rook system components:
[chen@node0 ceph]$ kubectl create -f operator.yaml
namespace/rook-ceph-system created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystems.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephnfses.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstores.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstoreusers.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/volumes.rook.io created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
role.rbac.authorization.k8s.io/rook-ceph-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
serviceaccount/rook-ceph-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-global created
deployment.apps/rook-ceph-operator created
As shown above, this creates the following resources:
1) namespace: rook-ceph-system; all Rook-related pods created afterwards live in this namespace
2) CRDs: six CRDs under the ceph.rook.io group, plus volumes.rook.io (a quick check is sketched after this list)
3) role & clusterrole: RBAC access control for those resources
4) serviceaccount: the ServiceAccount used by the pods Rook creates
5) deployment: rook-ceph-operator, which deploys the Rook Ceph components
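To confirm the CRDs were registered, you can simply list them; a minimal check (the exact set of CRDs depends on the Rook version):
kubectl get crd | grep rook.io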
3 Check the created pods. While rook-ceph-operator is being deployed, it triggers deployment of the Agent and Discover pods as DaemonSets across the cluster. The operator creates two pods, rook-discover and rook-ceph-agent, on each schedulable host (here only the three worker nodes, since the kubeadm master keeps its default NoSchedule taint):
[chen@node0 ceph]$ kubectl -n rook-ceph-system get pod
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-jpzjn 1/1 Running 0 9s
rook-ceph-agent-mhr8t 1/1 Running 0 9s
rook-ceph-agent-qzm92 1/1 Running 0 9s
rook-ceph-operator-b996864dd-2n72s 1/1 Running 0 11s
rook-discover-7sp72 1/1 Running 0 9s
rook-discover-mfw78 1/1 Running 0 9s
rook-discover-nfsqf 1/1 Running 0 9s
III. Create the Rook Cluster
1 Once the pods created by operator.yaml are all in the Running state, the Rook cluster can be deployed.
Apply the cluster.yaml file:
[chen@node0 ceph]$ kubectl create -f cluster.yaml
namespace/rook-ceph created
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-osd created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
cephcluster.ceph.rook.io/rook-ceph created
As shown above, this creates the following resources:
1) namespace: rook-ceph; all subsequent Ceph-cluster-related pods live in this namespace
2) serviceaccount: the ServiceAccounts used by the Ceph cluster's pods
3) role & rolebinding: RBAC access control
4) cluster: rook-ceph, the CephCluster resource describing the Ceph cluster to create (a quick status check is sketched after this list)
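To watch the cluster come up, you can also query the CephCluster resource itself; a minimal sketch (column output varies by Rook version):
kubectl -n rook-ceph get cephcluster rook-ceph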
2 After the Ceph cluster is deployed, check the created pods; the number of OSD pods depends on how many usable disks your nodes have:
[chen@node0 ceph]$ kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
rook-ceph-mgr-a-dfcbfd5d-2vnjb 1/1 Running 0 4m58s
rook-ceph-mon-a-6646f9c786-d4vsn 1/1 Running 0 5m33s
rook-ceph-mon-b-6f6db5d649-dpnnk 1/1 Running 0 5m27s
rook-ceph-mon-c-6bf994999b-bcwpd 1/1 Running 0 5m15s
rook-ceph-osd-0-86b5f9566f-k579f 1/1 Running 0 4m27s
rook-ceph-osd-1-5fd8d4f57-q2lsb 1/1 Running 0 4m17s
rook-ceph-osd-2-666d8d65bb-t2lfr 1/1 Running 0 4m15s
rook-ceph-osd-3-dc4b4c4dd-l5gkr 1/1 Running 0 4m17s
rook-ceph-osd-4-84b749589b-5qgxh 1/1 Running 0 4m15s
rook-ceph-osd-prepare-node1-8zzwr 0/2 Completed 0 4m34s
rook-ceph-osd-prepare-node2-nqjc4 0/2 Completed 0 4m34s
rook-ceph-osd-prepare-node3-gsh7f 0/2 Completed 0 4m34s
As you can see, the deployed Ceph cluster includes:
Ceph Monitors: three ceph-mon daemons are started by default; configurable in cluster.yaml
Ceph Mgr: one is started by default; configurable in cluster.yaml
Ceph OSDs: started according to the storage configuration in cluster.yaml; by default on every node with usable disks (see the sketch after this list)
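For reference, the parts of cluster.yaml that drive these defaults look roughly like the following. This is a sketch based on the Rook 0.9-era example manifest; the Ceph image tag and exact field set may differ in your copy of the file.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v13        # assumed tag; keep whatever your manifest ships with
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3                    # number of ceph-mon daemons
    allowMultiplePerNode: false
  dashboard:
    enabled: true               # enables the mgr dashboard used in the next section
  storage:
    useAllNodes: true           # place OSDs on every node...
    useAllDevices: true         # ...and consume every unused raw device found there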
IV. Create the Ceph Dashboard
1 The Ceph dashboard is already enabled by default in cluster.yaml. Check the dashboard's service:
[chen@node0 ceph]$ kubectl get service -n rook-ceph | grep dashboard
rook-ceph-mgr-dashboard ClusterIP 10.98.161.159 <none> 8443/TCP 6m13s
2 rook-ceph-mgr-dashboard listens on port 8443. Create a NodePort-type service so the dashboard can be reached from outside the cluster (the manifest is sketched after the output below):
[chen@node0 ceph]$ kubectl apply -f dashboard-external-https.yaml
service/rook-ceph-mgr-dashboard-external-https created
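For reference, dashboard-external-https.yaml from the Rook repository defines roughly the following Service; labels and selectors may differ slightly between Rook versions.
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
spec:
  type: NodePort                # exposes port 8443 on a high port of every node
  ports:
  - name: dashboard
    port: 8443
    protocol: TCP
    targetPort: 8443
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph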
3 Check the dashboard services again; the port exposed for external access is 30490:
[chen@node0 ceph]$ kubectl get service -n rook-ceph | grep dashboard
rook-ceph-mgr-dashboard ClusterIP 10.98.161.159 <none> 8443/TCP 6m13s
rook-ceph-mgr-dashboard-external-https NodePort 10.96.212.169 <none> 8443:30490/TCP 24s
4 Obtain the dashboard login username and password:
[chen@node0 ~]$ MGR_POD=`kubectl get pod -n rook-ceph | grep mgr | awk '{print $1}'`
[chen@node0 ~]$ kubectl -n rook-ceph logs $MGR_POD | grep password
debug 2019-05-05 03:22:23.136 7f1d715f9700 0 log_channel(audit) log [DBG] : from='client.183382 10.244.1.96:0/3054456848' entity='client.admin' cmd=[{"username": "admin", "prefix": "dashboard set-login-credentials", "password": "dXTEL21y2O", "target": ["mgr", ""], "format": "json"}]: dispatch
Find the username and password fields in that log line; here they are admin and dXTEL21y2O.
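If you only want the password, a one-liner sketch that assumes GNU grep and the log format shown above:
kubectl -n rook-ceph logs $MGR_POD | grep -oP '"password": "\K[^"]+' | tail -1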
5 Open a browser and go to any node's IP plus the NodePort; here the master node's IP is used:
https://192.168.2.20:30490/#/login
After logging in, the dashboard overview page is displayed.
V. Deploy the Ceph Toolbox
1 The Ceph cluster started by default has Ceph authentication enabled, so exec-ing into one of the Ceph component pods does not let you query cluster status or run CLI commands. Instead, deploy the Ceph Toolbox with toolbox.yaml:
[chen@node0 ceph]$ kubectl apply -f toolbox.yaml
deployment.apps/rook-ceph-tools created
2 After it deploys successfully, check the pod:
[chen@node0 ceph]$ kubectl -n rook-ceph get pods -o wide | grep ceph-tools
rook-ceph-tools-76c7d559b6-w7lxd 1/1 Running 0 16s 192.168.2.21 node1 <none> <none>
3 You can then exec into this pod and run Ceph CLI commands:
[chen@node0 ceph]$ kubectl -n rook-ceph exec -it rook-ceph-tools-76c7d559b6-w7lxd bash
bash: warning: setlocale: LC_CTYPE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_COLLATE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_MESSAGES: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_NUMERIC: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory
[root@node1 /]# ceph status
  cluster:
    id:     cf01857e-fb20-4fd8-86d6-5673a374d507
    health: HEALTH_WARN
            clock skew detected on mon.a, mon.c

  services:
    mon: 3 daemons, quorum b,a,c
    mgr: a(active)
    osd: 5 osds: 5 up, 5 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   74 GiB used, 2.0 TiB / 2.1 TiB avail
    pgs:
[root@node1 /]# ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
2.1 TiB 2.0 TiB 74 GiB 3.44
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
[root@node1 /]# cat /etc/ceph/ceph.conf
[global]
mon_host = 10.106.233.249:6789,10.104.221.124:6789,10.111.134.249:6789
[client.admin]
keyring = /etc/ceph/keyring
[root@node1 /]# cat /etc/ceph/keyring
[client.admin]
key = AQA8ZblcUUIpCxAAeR+hIXC1KY8xhmKt1AfJUw==
[root@node1 /]# cat rbdmap
cat: rbdmap: No such file or directory
[root@node1 /]# cat /etc/ceph/rbdmap
# RbdDevice Parameters
#poolname/imagename id=client,keyring=/etc/ceph/ceph.client.keyring
[root@node1 /]# ceph osd status
+----+-------+-------+-------+--------+---------+--------+---------+-----------+
| id | host | used | avail | wr ops | wr data | rd ops | rd data | state |
+----+-------+-------+-------+--------+---------+--------+---------+-----------+
| 0 | node2 | 27.2G | 71.0G | 0 | 0 | 0 | 0 | exists,up |
| 1 | node1 | 1026M | 930G | 0 | 0 | 0 | 0 | exists,up |
| 2 | node3 | 1026M | 930G | 0 | 0 | 0 | 0 | exists,up |
| 3 | node1 | 22.7G | 75.5G | 0 | 0 | 0 | 0 | exists,up |
| 4 | node3 | 22.1G | 76.1G | 0 | 0 | 0 | 0 | exists,up |
+----+-------+-------+-------+--------+---------+--------+---------+-----------+
[root@node1 /]# rados df
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR
total_objects 0
total_used 74 GiB
total_avail 2.0 TiB
total_space 2.1 TiB
[root@node1 /]# ceph osd pool create test_pool 64
pool 'test_pool' created
[root@node1 /]# ceph osd pool get test_pool size
size: 1
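The pool came up with this cluster's default replication size of 1, i.e. a single copy of each object. If you want replicated data, the size can be raised with the standard Ceph command, for example (adjust the count to the number of OSD hosts available):
ceph osd pool set test_pool size 2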
[root@node1 /]# ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
2.1 TiB 2.0 TiB 74 GiB 3.44
POOLS:
NAME ID USED %USED MAX AVAIL OBJECTS
test_pool 1 0 B 0 1.4 TiB 0