Chaos Engineering with Litmus and Okteto Cloud

Cloud Native applications are, by definition, highly distributed, elastic, resistant to failure and loosely coupled. That’s easy to say, and even diagram. But how do we validate that our applications will perform as expected under different failure conditions?

Enter Chaos engineering. Chaos engineering is the discipline of experimenting on a software system in production in order to build confidence in the system’s capability to withstand turbulent and unexpected conditions. Chaos Engineering is a great tool to help us find weaknesses and misconfiguration in our services. It is particularly important for Cloud Native applications, which, due to their distributed and elastic nature, need to be resilient by default.

Litmus is a CNCF sandbox project for practicing Chaos Engineering in Cloud Native environments. Litmus provides a chaos operator, a large set of chaos experiments in its hub, detailed documentation, a quick demo, and a friendly community. In this post we’ll show you how to use Litmus and Okteto together to start chaos testing your applications in a few seconds.

Chaos Testing with Litmus

When chaos testing an application with LitmusChaos, there are four components that you’ll need to keep in mind.

Chaos Operator

This is the core part of LitmusChaos. The operator is in charge of executing the experiments, and reporting the results once the experiment is finished.

You can install it directly from the command line, with the official Helm chart, or from the Okteto Cloud catalog.

Chaos Experiment

This is the chaos action that will be performed on your application. Experiments range from Kubernetes-specific actions, like deleting a pod or hogging the network, to application-specific actions, like randomly deleting an OpenEBS drive.

The LitmusChaos community maintains an online hub of chaos experiments.

Chaos Engine

The Chaos Engine is the link between the chaos experiment and the application under test. This is where you specify the parameters of your experiment, such as its duration and enable/disable policies (e.g., enabling or disabling monitoring), as well as information on how to find the targets of the experiment (typically, the application under test).

Application Under Test

This is the application that will be the “target” of the chaos experiment. Currently, LitmusChaos supports Deployments, StatefulSets, and DaemonSets. Under the default configuration, you need to add the litmuschaos.io/chaos: "true" annotation to the resource so that the Chaos Operator can find it, and to prevent other applications from being affected.
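If you are wiring this up by hand rather than using a pre-configured environment, the opt-in step could look roughly like the sketch below. It assumes the target is a Deployment and wraps the standard kubectl annotate command; the function name enable_chaos is just an illustrative helper, not part of Litmus or Okteto.

```shell
# Opt a Deployment in to chaos testing by adding the annotation
# that the Chaos Operator looks for under its default configuration.
# enable_chaos is a hypothetical helper name used for illustration.
enable_chaos() {
  deployment="$1"
  # --overwrite makes the command safe to re-run if the annotation already exists.
  kubectl annotate "deployment/$deployment" litmuschaos.io/chaos="true" --overwrite
}
```

For example, enable_chaos hello-world would annotate the sample application used later in this post.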

Prerequisites

To chaos-test your application you’ll need to install:

  1. The okteto CLI.
  2. A free Okteto Cloud account.
  3. kubectl configured to talk to Okteto Cloud.
  4. Your favorite IDE or text editor.

Deploy your Chaos-ready Development Environment

You can always install every component by hand. But instead, I’m taking advantage of Okteto’s pre-configured development environments. Just click on the Develop on Okteto button below and deploy your chaos-ready development environment:

This will automatically deploy the following resources on your Okteto Cloud account:

[Image: the resources deployed on the Okteto Cloud account]

Chaos Test the Application

Now that we have our development environment, let's chaos test the application. For this example, we are using the traditional Hello World application, deployed with two replicas. Click on the link and call it a few times to verify that it works fine.

With the application running, we are ready to start the chaos experiment. In Litmus-speak, this means creating the ChaosEngine resource. Create a file engine.yaml, open it in your favorite IDE, and paste the content below:

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: pod-killer-chaos
spec:
  annotationCheck: 'true'
  engineState: 'active'
  appinfo:
    applabel: 'app=hello-world'
    appkind: 'deployment'
  chaosServiceAccount: default
  monitoring: false
  jobCleanUpPolicy: 'delete'
  experiments:
    - name: pod-delete
      spec:
        components:
          env:
            - name: KILL_COUNT
              value: '1'
            - name: TOTAL_CHAOS_DURATION
              value: '60s'
            - name: CHAOS_INTERVAL

The ChaosEngine resource has three main sections:

  • appinfo: This tells the Litmus operator which application to target. You have to specify a label selector and the type of resource.
  • experiments: A list of experiments to run. In this case, we are running the Pod Delete experiment.
  • experiments.spec.components: The experiment-specific value overrides. In this case, we are telling the experiment to kill 1 pod over 60 seconds. The available values come from the ChaosExperiment resource.
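To see which values an experiment accepts before overriding them, one option is to read them straight from the ChaosExperiment resource. The sketch below assumes the experiment keeps its defaults under .spec.definition.env, which may differ between Litmus versions; list_experiment_env is a hypothetical helper name.

```shell
# Print the default environment variables of a ChaosExperiment as NAME=VALUE lines.
# Assumption: the defaults live under .spec.definition.env in this Litmus version.
# list_experiment_env is a hypothetical helper name used for illustration.
list_experiment_env() {
  experiment="$1"
  kubectl get chaosexperiment "$experiment" \
    -o jsonpath='{range .spec.definition.env[*]}{.name}={.value}{"\n"}{end}'
}
```

Calling list_experiment_env pod-delete should list the variables that can be overridden in the env section of the ChaosEngine.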

Start the chaos experiment by creating the ChaosEngine resource with kubectl:

$ kubectl apply -f engine.yaml
chaosengine.litmuschaos.io/pod-killer-chaos created

Witness the Chaos

The experiment will kill one of our application’s pods. If you run the command below once the experiment has started, you’ll see how a random pod is killed and then automatically recreated:

$ kubectl get pod -l=app=hello-world
NAME                           READY   STATUS              RESTARTS   AGE
hello-world-75947547d4-2fcbc   1/1     Running             0          57m
hello-world-75947547d4-c6wsv   0/1     ContainerCreating   0          10s

While the experiment is running, keep refreshing the browser. Notice how the responses show different pod names, but the calls are never interrupted? That’s because our application is resilient to pod destruction 💪🏻!
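If you would rather not refresh the browser by hand, a small loop can make the same check. This is only a sketch: probe_app is a hypothetical helper, the URL you pass is whatever endpoint Okteto Cloud assigned to your application, and the PROBE_INTERVAL variable is an assumption introduced here to control the pause between requests.

```shell
# Call the application repeatedly and count how many requests fail.
# probe_app and PROBE_INTERVAL are hypothetical names used for illustration.
probe_app() {
  url="$1"
  requests="${2:-30}"
  failures=0
  i=0
  while [ "$i" -lt "$requests" ]; do
    # --fail makes curl return a non-zero exit code on HTTP errors.
    if ! curl --silent --fail --max-time 2 --output /dev/null "$url"; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
    sleep "${PROBE_INTERVAL:-1}"
  done
  echo "$failures of $requests requests failed"
}
```

Run it with your application's URL while the experiment is in progress. If the application is resilient to pod deletion, the failure count should stay at zero.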

When an experiment is created, a ChaosResult resource is created to hold the result of the experiment. The status.verdict key is set to Awaited while the experiment is in progress. Once it finishes, it changes to either Pass or Fail.

$ kubectl describe chaosresult pod-killer-chaos-pod-delete
Name:         pod-killer-chaos-pod-delete
Namespace:    rberrelleza
Labels:       name=pod-killer-chaos-pod-delete
Annotations:  <none>
API Version:  litmuschaos.io/v1alpha1
Kind:         ChaosResult
Metadata:
  Creation Timestamp:  2020-08-05T21:14:05Z
  Generation:          5
  Resource Version:    165298631
  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/rberrelleza/chaosresults/pod-killer-chaos-pod-delete
  UID:                 a7f50d28-1f14-4a03-9013-94e72d69eb72
Spec:
  Engine:      pod-killer-chaos
  Experiment:  pod-delete
Status:
  Experimentstatus:
    Fail Step:  N/A
    Phase:      Running
    Verdict:    Awaited
Events:
  Type    Reason   Age   From                     Message
  ----    ------   ----  ----                     -------
  Normal  Summary  45m   experiment-l0k004-5x2fj  pod-delete experiment has been Passed
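Instead of re-running describe, you could poll the verdict until it settles. The sketch below assumes the verdict is exposed at .status.experimentstatus.verdict, matching the describe output of this Litmus version (newer releases may use a different field name). wait_for_verdict and POLL_SECONDS are hypothetical names.

```shell
# Poll a ChaosResult until its verdict is Pass or Fail, then print it.
# Assumption: the field path .status.experimentstatus.verdict matches this Litmus version.
# wait_for_verdict and POLL_SECONDS are hypothetical names used for illustration.
wait_for_verdict() {
  result_name="$1"
  max_attempts="${2:-30}"
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    verdict="$(kubectl get chaosresult "$result_name" \
      -o jsonpath='{.status.experimentstatus.verdict}')"
    case "$verdict" in
      Pass|Fail)
        echo "$verdict"
        return 0
        ;;
    esac
    # Still Awaited (or the resource does not exist yet): wait and retry.
    attempt=$((attempt + 1))
    sleep "${POLL_SECONDS:-5}"
  done
  echo "timed out waiting for a verdict" >&2
  return 1
}
```

For the experiment in this post, that would be wait_for_verdict pod-killer-chaos-pod-delete.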

Extra Chaos

In a future post I’ll show how you can take your chaos testing to the next level and write your own application-specific experiments. Can’t wait? Karthik from MayaData wrote a pretty cool getting started guide. And it happens to use the okteto CLI as part of the dev flow. How cool is that?

Litmus has a monthly community call where the community gets together and talks about their cool use cases and needs. The team was nice enough to invite me to this month’s call to demo the workflow I showed you in this post. It’s a great place to talk with and learn from other practitioners.

Conclusion

In this post, we showed how you can deploy a replicable development environment that includes an application, the LitmusChaos operator, and your chaos experiment, all in one click. Then, we ran a chaos experiment, validating that our application is resilient to a pod failure.

This is a great example of how you can use Okteto to accelerate your entire team. One person configures the app with the chaos tools, and everyone else can create their own namespace on demand, deploy a pre-configured, chaos-ready development environment, and start running experiments without having to think twice about installation scripts or infrastructure configuration.

Let’s keep the conversation going! Join the Okteto and Litmus communities to talk more about Cloud Native development and Chaos Engineering.

Thanks to Karthik Satchitanand and Prithvi Raj for reading drafts of this.

Translated from: https://medium.com/okteto/chaos-engineering-with-litmus-and-okteto-cloud-2232ccfa6672
