Introduction to Chaos Mesh


This document describes the concepts, use cases, core strengths, and the architecture of Chaos Mesh.


Chaos Mesh Overview

Chaos Mesh is an open source, cloud-native Chaos Engineering platform. It offers many types of fault simulation and strong capabilities for orchestrating fault scenarios. With Chaos Mesh, you can conveniently simulate the abnormalities that might occur in real development, testing, and production environments, and uncover potential problems in the system. To lower the barrier to entry for Chaos Engineering projects, Chaos Mesh provides a visual interface: you can design Chaos scenarios on the Web UI and monitor the status of Chaos experiments.


Core strengths

As the industry's leading Chaos testing platform, Chaos Mesh has the following core strengths:

  • Stable core capabilities: Chaos Mesh originated from the core testing platform of TiDB, and from its initial release it inherited much of TiDB's existing testing experience.
  • Widely validated: Chaos Mesh is used by numerous companies and organizations, such as Tencent and Meituan. It is also used in the testing systems of many well-known distributed systems, such as Apache APISIX and RabbitMQ.
  • An easy-to-use system: Chaos Mesh makes full use of automation, with graphical operations and Kubernetes-based usage.
  • Cloud native: Chaos Mesh supports Kubernetes environments with its powerful automation ability.
  • Various fault simulation scenarios: Chaos Mesh covers most of the basic fault simulation scenarios in distributed system testing.
  • Flexible experiment orchestration capabilities: You can design your own Chaos experiment scenarios on the platform, including combinations of multiple experiments and application status checks.
  • High security: Chaos Mesh is designed with multiple layers of security control.
  • An active community: Chaos Mesh is an incubating project hosted by CNCF. It has a growing number of contributors and adopters all over the world.
  • Easily extensible: It is easy to add new fault test types and functions to Chaos Mesh.


Architecture overview

Chaos Mesh is built on Kubernetes CRDs (Custom Resource Definitions). To manage different Chaos experiments, Chaos Mesh defines multiple CRD types based on different fault types and implements a separate Controller for each kind of CRD object. Chaos Mesh primarily contains three components:

  • Chaos Dashboard: The visualization component of Chaos Mesh. Chaos Dashboard offers a set of user-friendly web interfaces through which users can manipulate and observe Chaos experiments. Chaos Dashboard also provides an RBAC permission management mechanism.
  • Chaos Controller Manager: The core logical component of Chaos Mesh. Chaos Controller Manager is primarily responsible for the scheduling and management of Chaos experiments. This component contains several CRD Controllers, such as the Workflow Controller, the Scheduler Controller, and the Controllers for the various fault types.
  • Chaos Daemon: The main executive component. Chaos Daemon runs in DaemonSet mode and has the Privileged permission by default (which can be disabled). This component mainly interferes with specific network devices, file systems, and kernels by entering the target Pod's namespace.
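Because each fault type is its own CRD, an experiment is just a Kubernetes object. The following is a minimal sketch, not taken from the original article, that builds a NetworkChaos object as a plain Python dict and prints it as JSON. The helper name `network_delay_chaos`, the object name `delay-demo`, the `default` namespace, and the `app: web` label are all illustrative assumptions; the field layout follows the `chaos-mesh.org/v1alpha1` API as commonly documented, so check it against the version you run.

```python
import json


def network_delay_chaos(name, namespace, app_label, latency="10ms", duration="30s"):
    """Build a NetworkChaos custom resource as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "NetworkChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "delay",          # the fault to inject
            "mode": "one",              # pick one matching Pod
            "selector": {
                "namespaces": [namespace],
                "labelSelectors": {"app": app_label},
            },
            "delay": {"latency": latency},
            "duration": duration,       # how long the fault lasts
        },
    }


# All concrete values below are assumptions made for the example.
manifest = network_delay_chaos("delay-demo", "default", "web")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`, at which point the Controller responsible for NetworkChaos objects would pick it up.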


As shown in the above image, the overall architecture of Chaos Mesh can be divided into three parts from top to bottom:

  • User input and observation: A user operation (User) starts the flow, and the input reaches the Kubernetes API Server. Users do not interact directly with the Chaos Controller Manager. Every user operation is eventually reflected as a change to a Chaos resource (such as a change to a NetworkChaos resource).
  • Monitor resource changes, schedule Workflow, and carry out Chaos experiments: The Chaos Controller Manager only accepts events from the Kubernetes API Server. These events describe changes to a Chaos resource, such as the creation of a new Workflow object or Chaos object.
  • Injection of a specific node fault: The Chaos Daemon component is primarily responsible for accepting commands from the Chaos Controller Manager, entering the target Pod's namespace, and performing the specific fault injection, for example by setting TC network rules or by starting a stress-ng process to compete for CPU or memory resources.
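The three layers above can be sketched as a toy simulation. This is not Chaos Mesh code: `FakeApiServer`, `FakeDaemon`, `plan_command`, `reconcile`, and the `targets` field are hypothetical stand-ins (real experiments select Pods through a selector), and the `tc` and `stress-ng` command lines, including the `eth0` interface, only illustrate the kind of action the daemon takes, not its actual invocations.

```python
from dataclasses import dataclass, field


@dataclass
class FakeApiServer:
    """Stand-in for the Kubernetes API Server: it only queues resource events."""
    events: list = field(default_factory=list)

    def apply(self, resource: dict) -> None:
        # Layer 1: every user operation ends up as a Chaos resource change.
        self.events.append(("ADDED", resource))


@dataclass
class FakeDaemon:
    """Stand-in for Chaos Daemon: it records commands instead of running them."""
    commands: list = field(default_factory=list)

    def inject(self, pod: str, command: list) -> None:
        # Layer 3: the real daemon would enter the Pod's namespace first.
        self.commands.append((pod, command))


def plan_command(resource: dict) -> list:
    """Map a Chaos resource to an illustrative node-level command."""
    spec = resource["spec"]
    if resource["kind"] == "NetworkChaos" and spec["action"] == "delay":
        latency = spec["delay"]["latency"]
        return ["tc", "qdisc", "add", "dev", "eth0", "root", "netem", "delay", latency]
    if resource["kind"] == "StressChaos":
        workers = spec["stressors"]["cpu"]["workers"]
        return ["stress-ng", "--cpu", str(workers)]
    raise ValueError(f"unsupported resource kind: {resource['kind']}")


def reconcile(api: FakeApiServer, daemon: FakeDaemon) -> None:
    # Layer 2: the controller reacts only to events from the API server.
    while api.events:
        event_type, resource = api.events.pop(0)
        if event_type != "ADDED":
            continue
        for pod in resource["spec"]["targets"]:  # hypothetical field
            daemon.inject(pod, plan_command(resource))


api = FakeApiServer()
daemon = FakeDaemon()
api.apply({"kind": "NetworkChaos",
           "spec": {"action": "delay", "delay": {"latency": "100ms"}, "targets": ["web-0"]}})
api.apply({"kind": "StressChaos",
           "spec": {"stressors": {"cpu": {"workers": 2}}, "targets": ["db-0"]}})
reconcile(api, daemon)
for pod, command in daemon.commands:
    print(pod, " ".join(command))
```

The point of the sketch is the direction of the arrows: the user only ever writes to the API server, the controller only ever reads events from it, and the daemon only ever acts on instructions from the controller.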


 
