Deploying Ceph with Rook
Ceph is a software solution designed to address storage problems. It provides object, block, and file system storage in a single platform, something we rarely see in other storage systems.
Ceph is powerful! It is one of the most reliable and stable software platforms available for storage. When I first heard about Ceph a week ago, my first thought was, "Is it new? How reliable is it?" Ceph has been around since 2006, its first stable release came out around 2012, and it has been in active development ever since, with the latest release only a few days ago. When I say it is powerful, I am referring to its dependability, agility, flexibility, and the array of different storage types it supports.
Why Ceph on OpenShift?
Ceph is a great fit for today's cloud technologies. It is safe and secure, it handles heavy workloads, and it scales. So why not?
What is Rook and what does it do here?
The Ceph cluster we talked about above is managed by the Rook operator. As rook.io puts it, "It turns distributed storage systems into self-managing, self-scaling, self-healing storage services".
Here, Rook also deploys the osd, mon, mgr, and rgw daemons of the Ceph cluster as Kubernetes pods.
Now let's get started…
This assumes you have the OpenShift client CLI (oc) set up and are logged in.
You can clone the required YAML files from the git repo below.
git clone https://github.com/Streaming-multiple-video-sources-Edge/Ceph-setup-YAML-files.git
cd <location where the files are downloaded>
We start by deploying the common resources:
oc create -f common.yaml
If you list the projects, you will see a newly created rook-ceph project. Change the current project to rook-ceph:
oc project rook-ceph
We spoke about the Rook operator in the previous section; we deploy it next. The operator monitors the storage daemons and ensures the health of the processes.
oc apply -f https://raw.githubusercontent.com/rook/rook/v1.3.6/cluster/examples/kubernetes/ceph/operator-openshift.yaml
or
oc create -f operator-openshift.yaml
Now you can see the rook pods up and running.
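To double-check, you can list the pods in the project; the rook-ceph-operator pod (and, depending on the version, a few helper pods such as rook-discover) should be in the Running state:
oc -n rook-ceph get pods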
Now we can move on to deploying the Ceph cluster. There are two main things to keep in mind before deploying this cluster:
- In this yaml file, we would set the storage to at least 100 GB if we wanted to use the cluster for large-scale tasks; for this example, 10 GB is sufficient. Set the size according to your needs.
- By default in this yaml the StorageClass name is commented out, but the class name gp2 refers to the default class for AWS instances. To deploy onto the cluster's default StorageClass, you can use the yaml file that has it commented out (see the sketch just after this list).
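For orientation, the parts of cluster-on-pvc.yaml that the two points above refer to look roughly like this; the snippet below is a trimmed approximation of the Rook v1.3 example, so names and defaults in your copy may differ:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    volumeClaimTemplate:
      spec:
        storageClassName: gp2   # AWS default; comment out to use the cluster's default StorageClass
        resources:
          requests:
            storage: 10Gi
  storage:
    storageClassDeviceSets:
      - name: set1
        count: 3
        portable: true
        volumeClaimTemplates:
          - metadata:
              name: data
            spec:
              storageClassName: gp2   # same note as above
              volumeMode: Block
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 10Gi       # raise to 100Gi or more for large-scale workloads
Once you are happy with the sizes, create the cluster: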
oc create -f cluster-on-pvc.yaml
Next, we set up the object store:
oc create -f object-openshift.yaml
Now we create a new Ceph object store user named ceph-demo-user by running the command below:
oc create -f object-user.yaml
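For reference, a CephObjectStoreUser manifest such as object-user.yaml generally looks like the following; the user and store names are taken from this post, while the display name is just an assumption:
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
  name: ceph-demo-user
  namespace: rook-ceph
spec:
  store: my-store
  displayName: "Ceph demo user"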
Once all the above commands have run, the osd, mon, mgr, and rgw pods will be deployed within a few minutes. Check that all of these pods are up and running to confirm that the setup is correct up to this point.
Now we need to create a route for the Ceph cluster to handle external traffic:
oc create -f route.yaml
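The route simply exposes the RGW service. Rook names that service rook-ceph-rgw-<store-name>, so for the my-store object store a minimal sketch of route.yaml could look like this (the repo's actual file may differ slightly):
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: ceph-route
  namespace: rook-ceph
spec:
  to:
    kind: Service
    name: rook-ceph-rgw-my-store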
Next, set up the toolbox so we can run commands to inspect the Ceph cluster:
oc create -f toolbox.yaml
To access the toolbox container, run the command below. Once you are inside the toolbox pod, you can check the status of your Ceph cluster.
oc -n rook-ceph exec -it $(oc -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
Sometimes it takes a while for the connection to be established and the command may fail with an error; just re-run it a few times, or wait and try again until the connection comes up.
Useful Ceph commands in the toolbox pod
To check the cluster status:
bash-4.2$ ceph status
This returns information about the cluster health, services, and data.
bash-4.2$ ceph status
  cluster:
    id:     XXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum a,b,c (age 8m)
    mgr: a(active, since 7m)
    osd: 3 osds: 3 up (since 6m), 3 in (since 6m)
    rgw: 1 daemon active (my.store.a)
  data:
    pools:   7 pools, 80 pgs
    objects: 203 objects, 4.2 KiB
    usage:   3.0 GiB used, 297 GiB / 300 GiB avail
    pgs:     80 active+clean
Sometimes we might instead get:
health: HEALTH_WARN
clock skew detected on mon.b, mon.c
This happens because the clocks are out of sync. You can either proceed as normal, since it is not a big issue, or terminate the mon b and mon c pods and wait for them to be recreated (this happens automatically).
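If you do decide to recycle them, you can find the mon pods by their app label and delete the affected ones; the operator brings them back on its own (pod names below are placeholders):
oc -n rook-ceph get pods -l app=rook-ceph-mon
oc -n rook-ceph delete pod <mon-b-pod-name> <mon-c-pod-name>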
You can refer to this documentation for more information on Ceph commands: https://docs.ceph.com/docs/giant/rados/operations/control/.
Test our Ceph cluster
Now that our Ceph cluster is up, we will test it by uploading and downloading files.
First, you need the S3 connection variables; you can get them using the commands below.
# run the below command to get the end point URL
oc get route ceph-route -o jsonpath={.spec.host}
# run the below command to get access key and secret access key
oc get secrets rook-ceph-object-user-my-store-ceph-demo-user -o jsonpath={.data}
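The values in that secret are base64-encoded. Assuming the usual Rook key names (AccessKey and SecretKey), you can decode them directly:
oc get secrets rook-ceph-object-user-my-store-ceph-demo-user -o jsonpath='{.data.AccessKey}' | base64 --decode
oc get secrets rook-ceph-object-user-my-store-ceph-demo-user -o jsonpath='{.data.SecretKey}' | base64 --decode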
Fill in the details you obtained above into the blanks below in the Python code mentioned in the next section.
s3_endpoint_url = ""
s3_access_key_id = ""
s3_secret_access_key = ""
Now you can just run the demo and test your Ceph cluster!
Common issues
I faced a lot, and I mean A LOT, of issues in this process, but several people and blogs out there helped me along the way.
rook — https://github.com/rook/rook/blob/master/Documentation/common-issues.md
ceph — https://github.com/rook/rook/blob/master/Documentation/ceph-common-issues.md
Testing out our cluster!
Finally, we run the Python code to test the upload and download features.
NOTE: Keep a video file ready to test it out.
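firstCeph.py boils down to something like the minimal boto3 sketch below; the bucket and file names here are illustrative, and the actual script in the repo has a few more steps (so the line numbers mentioned further down refer to that file, not to this sketch):
import boto3

# Fill these in using the values gathered in the previous section
# (remember to include the scheme, e.g. "http://<route-host>").
s3_endpoint_url = ""
s3_access_key_id = ""
s3_secret_access_key = ""

# Configure the S3 client and resource against the Ceph RGW endpoint.
s3_client = boto3.client(
    "s3",
    endpoint_url=s3_endpoint_url,
    aws_access_key_id=s3_access_key_id,
    aws_secret_access_key=s3_secret_access_key,
)
s3_resource = boto3.resource(
    "s3",
    endpoint_url=s3_endpoint_url,
    aws_access_key_id=s3_access_key_id,
    aws_secret_access_key=s3_secret_access_key,
)

# Create a bucket for the demo (name is illustrative).
bucket_name = "ceph-demo-bucket"
s3_resource.create_bucket(Bucket=bucket_name)

# Upload a local video file, then download it again to verify the cluster.
s3_client.upload_file("demo-video.mp4", bucket_name, "demo-video.mp4")
s3_client.download_file(bucket_name, "demo-video.mp4", "downloaded-demo-video.mp4")

# List what is now in the bucket.
for obj in s3_resource.Bucket(bucket_name).objects.all():
    print(obj.key, obj.size)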
In the above code, you need to fill in the 'endpoint URL', 'access key ID', and 'secret access key' using the commands mentioned in the last section.
Briefly, the code starts by configuring the Boto S3 client and the S3 resource. Then we set up the bucket and upload a demo video file.
In line 62 of firstCeph.py, we specify the video file that we are going to upload, and in line 72 we specify the download details; we can include the location and the file name we plan to use here.
python firstCeph.py
Now you can see the same file you uploaded being downloaded to the location specified in line 72. You can go to the location used in the code, or specify your own location and look there.
Deleting the Cluster
Deleting the Ceph cluster is a meticulous process; missing a step or not running the commands below in order will leave stray pods running, and tracking them down becomes tedious.
You will need Helm installed before running the commands below.
oc delete -f <all_above_mentioned_YAML_files>
oc -n rook-ceph patch clusters.ceph.rook.io rook-ceph -p '{"metadata":{"finalizers": []}}' --type=merge
oc -n rook-ceph delete cephcluster rook-ceph
helm delete --purge rook
oc delete namespace rook-ceph
Connect to each machine and delete /var/lib/rook, or the path specified by dataDirHostPath.
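For example, on each node (assuming the default dataDirHostPath was kept):
sudo rm -rf /var/lib/rook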
After trying this simple upload, you can try multipart uploads and other storage types, and build different applications on top of Ceph.
Hope you had fun going through the blog! Your feedback would be appreciated.
Thank you for your time! It is a great time to be part of the open-source development community.