Chaos Mesh is a cloud-native Chaos Engineering platform that orchestrates chaos on Kubernetes environments. At the current stage, it has the following components:
- Chaos Operator: the core component for chaos orchestration. Fully open-sourced.
- Chaos Dashboard: a web UI for managing, designing, and monitoring chaos experiments. Under development.
Chaos Mesh is a versatile chaos engineering solution that features all-around fault injection methods for complex systems on Kubernetes, covering faults in Pod, network, file system, and even the kernel.
Chaos Mesh is one of the better chaos engines for Kubernetes because:
- It has attracted heavy community support in a short amount of time, and it's a CNCF Sandbox project.
- It offers a Kubernetes-native experience by leveraging the Operator Pattern and CRDs, which lets you treat experiments as infrastructure as code (IaC) in your pipelines.
- If you have followed best practices by applying plenty of labels and annotations to your Deployments, then there is no need to modify your apps for your chaos experiments.
- There is a wide variety of experiment types, not just Pod killing.
- It installs with a Helm chart, and you have complete control over the engine through CRDs.
Don't let the word "mesh" in the project name mislead you. This project is unrelated to service meshes like Istio and Conduit. Hopefully, in the future, it will leverage the features of a service mesh, but for now, the two are unrelated.
In this scenario you will learn how to:
- Install Chaos Mesh onto Kubernetes.
- Install and label applications to make them eligible targets for chaos.
- Design and deliver chaos experiments.
- Observe the chaos engine exercise your experiments against the cluster objects.
Chaos Mesh is an emerging open source project started in Q4 2019. It is filled with many of the experiment features you would expect to write for Chaos testing. The project is under active development as a Sandbox project with CNCF. This Katacoda scenario will be updated as it evolves.
The project has taken the right native architecture path by using the Kubernetes Operator Pattern. It defines a collection of CRDs, and its controller accepts experiment declarations from you in the form of YAML manifests. These YAML files are expected to be infrastructure as code and part of your CI/CD pipeline along with your other tests.
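To make the idea of an experiment declaration concrete, here is a minimal sketch that writes a PodChaos manifest to a temporary file. The experiment name, the target namespace, and the `app: web` label are hypothetical values chosen for illustration, and the exact set of required fields can differ between Chaos Mesh releases, so check the documentation for the version you install.

```shell
# Sketch only: the name "pod-kill-example", the "default" namespace, and the
# "app: web" label are assumptions, not values taken from this scenario.
manifest="$(mktemp)"

cat > "$manifest" <<'EOF'
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-example
spec:
  action: pod-kill        # kill the selected Pod
  mode: one               # pick a single Pod from the matching set
  selector:
    namespaces:
      - default
    labelSelectors:
      app: web            # only Pods carrying this label are eligible
EOF

echo "Wrote experiment manifest to $manifest"

# On a cluster where Chaos Mesh is already installed, the file could then be
# submitted like any other Kubernetes object:
#   kubectl apply -f "$manifest"
```

Because the experiment is an ordinary manifest, it can live in version control next to your application manifests and be applied from a pipeline step.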
With these steps you have learned:
- ✔ Install Chaos Mesh onto Kubernetes
- ✔ Install and label applications to make them eligible targets for chaos
- ✔ Design and deliver chaos experiments
- ✔ Observe the chaos engine exercise your experiments against the cluster objects
In the last year we've seen Chaos Engineering move from a much talked-about idea to an accepted, mainstream approach to improving and assuring distributed system resilience. As organizations large and small begin to implement Chaos Engineering as an operational process, we're learning how to apply these techniques safely at scale. The approach is definitely not for everyone, and to be effective and safe, it requires organizational support at scale. -- ThoughtWorks Radar
- Chaos Mesh project
- Chaos Mesh documentation
- K8s Chaos Dive, Chaos-Mesh Part 1, Craig Morten
- Principles of Chaos Engineering
- Fallacies of Distributed Computing Explained (PDF)
Your Kubernetes Cluster
For this scenario, Katacoda has just started a fresh Kubernetes cluster for you. Verify that it's ready for your use:
kubectl version --short && \
kubectl get nodes && \
kubectl get componentstatus
It should list a 2-node cluster, and the control plane components should report Healthy. If the cluster is not healthy, try again in a few moments. If it's still not functioning, refresh the browser tab to start a fresh scenario instance before proceeding.
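If you would rather check readiness with a script than read the table by eye, the following is a small sketch of one way to do it. The helper name `count_not_ready` is hypothetical, and the sample rows are made-up data in the shape of `kubectl get nodes --no-headers` output, not output captured from this scenario.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` style rows on
# stdin and prints how many nodes have a STATUS other than exactly "Ready".
# A status such as "Ready,SchedulingDisabled" is counted as not ready.
count_not_ready() {
  awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Made-up sample rows for illustration only.
sample='controlplane   Ready      master   5m   v1.18.0
node01         NotReady   <none>   4m   v1.18.0'

printf '%s\n' "$sample" | count_not_ready
```

Against the live cluster, the same helper could be fed with `kubectl get nodes --no-headers | count_not_ready`; a result of 0 means every node reports Ready.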
The Helm package manager used for installing applications on Kubernetes is also available:
helm version --short
You can administer your cluster with the kubectl CLI tool or use the visual Kubernetes dashboard. The Dashboard can be accessed from the tab labeled Kubernetes Dashboard above the command line. When the Dashboard first appears, it will prompt for an access token. At any time you can run this script to access the Dashboard token:
This script will display the token in the terminal. Copy the green text using your browser's copy feature, then paste the token into the prompt when the Dashboard is accessed. If the Dashboard is still starting up, Katacoda will report an access error. Once the dashboard Pod reports the status Running, it can be accessed:
kubectl get pods -n kube-system -l app.kubernetes.io/name=kubernetes-dashboard
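Instead of re-running the command above until the Pod is up, you could block until it is ready. This is a sketch, not part of the scenario: the function name `wait_for_dashboard`, the `KUBECTL` override, and the default timeout are assumptions. It relies on `kubectl wait`, which waits for the Pod's Ready condition rather than the Running phase.

```shell
# Hypothetical helper: block until the dashboard Pod reports Ready.
# $1 is an optional timeout (default 120s, an arbitrary choice).
# KUBECTL may be overridden, e.g. KUBECTL=echo for a dry run that only
# prints the arguments instead of contacting a cluster.
wait_for_dashboard() {
  timeout="${1:-120s}"
  "${KUBECTL:-kubectl}" wait \
    --namespace kube-system \
    --for=condition=Ready pod \
    --selector app.kubernetes.io/name=kubernetes-dashboard \
    --timeout="$timeout"
}

# Dry run: show the arguments that would be passed to kubectl.
KUBECTL=echo wait_for_dashboard 30s
```

On the scenario cluster, calling `wait_for_dashboard` with no override returns once the Pod is ready, or fails when the timeout expires.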