"Strength emerges from the heart of chaos." — Unknown
Think about pilots: they must log at least 1,500 hours of flight experience before flying a commercial plane. Even after earning their license, they spend countless hours in simulators, practicing emergency scenarios.
Why do they do this?
- When a pilot encounters a sudden in-flight emergency, they need an instinctive understanding of how to handle it. This intuition comes from repeated practice and facing simulated crises in controlled environments. Practice makes perfect.
- Chaos engineering is also like pilot training for emergencies. Engineers intentionally cause disruptions to see how systems respond, find weaknesses, and improve them. Regular chaos experiments help teams understand how their systems behave under stress, ensuring they can handle real-world challenges and keep running smoothly—just like pilots navigating through turbulence.
I recently started using LitmusChaos, and it has been both thrilling and insightful. In this blog post, I will share a practical walkthrough based on my journey, highlighting the key steps and including common errors and how to tackle them.
<<Table of Contents>>
Overview of Chaos Engineering
Getting Started with LitmusChaos
Accessing the ChaosCenter UI
Creating Your First Chaos Experiment
Common Errors and Solutions
Wrapping Up
->Overview of Chaos Engineering<-
Chaos engineering makes a system more resilient by deliberately introducing failures into it. You may be wondering why anyone would do such a thing. The goal is to find and fix weaknesses before they cause real problems.
What is LitmusChaos?
LitmusChaos is an open-source tool for running chaos experiments on Kubernetes. It provides a framework and a set of tools for simulating real-world failures, so you can test how your clusters and applications handle them.
Prerequisites:
Before diving in, you should have the following ready:
A Kubernetes cluster (Minikube, EKS, etc)
kubectl installed and configured
Helm installed
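Before going further, it can help to confirm that the command-line tools are actually on your PATH. Here is a minimal sketch of such a check. The helper name missing_tools is my own and is not part of kubectl or Helm, and the check only looks for the binaries, not for a reachable cluster.

```shell
# Sketch: report which of the required CLI tools are missing from PATH.
# missing_tools is a hypothetical helper, not part of kubectl or Helm.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # command -v exits non-zero when the tool cannot be found
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Print the missing tools without the leading space; empty when all are present
  echo "$missing" | sed 's/^ //'
}

result=$(missing_tools kubectl helm)
if [ -n "$result" ]; then
  echo "Missing prerequisites: $result"
else
  echo "All prerequisites found."
fi
```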
Step 1: Install LitmusChaos:
To get started with LitmusChaos, you need a Kubernetes cluster. Minikube is a convenient choice for practice because it runs a single-node cluster on your local machine.
- Add the LitmusChaos Helm repository:
helm repo add litmuschaos https://litmuschaos.github.io/litmus-helm/
helm repo update
- Install LitmusChaos:
kubectl create ns litmus
helm install litmus litmuschaos/litmus --namespace litmus
- Verify the installation:
kubectl get pods -n litmus
Once the installation has settled, the pods in the litmus namespace should show a Running status.
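If you would rather not read the pod list by eye, the check can be scripted. The sketch below counts pods whose status is neither Running nor Completed. It assumes the default column layout of 'kubectl get pods --no-headers', where STATUS is the third column, and count_unhealthy_pods is a hypothetical helper name.

```shell
# Sketch: count pods that are neither Running nor Completed.
# Reads the output of `kubectl get pods --no-headers` on stdin;
# assumes STATUS is the third whitespace-separated column.
count_unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Usage against a live cluster:
#   kubectl get pods -n litmus --no-headers | count_unhealthy_pods
```

A result of 0 means every pod in the namespace has reached a healthy state. Anything else is a hint to look at the individual pods.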
Common Errors and Solutions:
Error: 'Error: could not find tiller'
- Solution: This error is specific to Helm 2, which relied on the in-cluster Tiller component. Make sure you’re using Helm 3.
Error: 'Unable to connect to the server: dial tcp ....: i/o timeout'
- Solution: Make sure your Kubernetes cluster is running and 'kubectl' is properly configured.
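To rule out the Helm 2 case quickly, you can look at the major version reported by 'helm version --short'. This is a small sketch. The helper name helm_major is my own, and the version strings in the comments are only illustrative.

```shell
# Sketch: extract the major version from `helm version --short` output.
# helm_major is a hypothetical helper name.
helm_major() {
  # Drop everything up to the first "v", keep the digits before the first dot.
  # Only the first line is used, in case the output has several lines.
  printf '%s\n' "$1" | sed 's/^[^v]*v\([0-9][0-9]*\)\..*/\1/' | head -n 1
}

# Usage:
#   helm_major "$(helm version --short)"
# Illustrative inputs: "v3.14.0+g3fc9f4b" gives 3, "Client: v2.16.9+g8ad7037" gives 2.
```

If the result is 2, upgrade to Helm 3 before retrying the install.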
Step 2: Access the ChaosCenter UI:
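The frontend service created by the Helm chart is of type ClusterIP, so it is only reachable from inside the cluster. To access the ChaosCenter UI, we need to expose it through a NodePort.
You can list the Litmus services with this command: 'kubectl get svc -n litmus'
- Expose the frontend deployment through a NodePort service (the deployment name can differ between chart versions, so check it with 'kubectl get deploy -n litmus' if the command below fails):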
kubectl expose deployment litmusportal-frontend --type=NodePort --name=litmus-frontend -n litmus
- Get the NodePort assigned:
kubectl get svc litmus-frontend -n litmus
- Access the ChaosCenter UI:
Open any browser and navigate to 'http://<node-ip>:<nodeport>'.
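The node IP and the NodePort can also be looked up from the command line and combined into the URL. This is a rough sketch, assuming the service is named litmus-frontend in the litmus namespace as in this post, and that the first node's InternalIP is reachable from your machine. Both function names are my own.

```shell
# Sketch: build the ChaosCenter URL from a node IP and a NodePort.
chaoscenter_url() {
  # $1 = node IP, $2 = NodePort
  printf 'http://%s:%s\n' "$1" "$2"
}

# Look both values up from the cluster (not called here; needs a live cluster).
# Assumes the litmus-frontend service in the litmus namespace from this post.
print_chaoscenter_url() {
  # InternalIP of the first node; some setups need the ExternalIP instead
  node_ip=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
  # Port assigned to the first port entry of the NodePort service
  node_port=$(kubectl get svc litmus-frontend -n litmus -o jsonpath='{.spec.ports[0].nodePort}')
  chaoscenter_url "$node_ip" "$node_port"
}
```

On Minikube, 'minikube ip' prints the node IP, and 'minikube service litmus-frontend -n litmus --url' is another way to get a working URL when the node IP is not directly reachable.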
Common Errors and Solutions:
Error: 'Error from server (NotFound)'
- Solution: Ensure the service name is correct and that the frontend deployment is running. Use 'kubectl get pods -n litmus' to check the status of the pods.
Error: 'Unable to connect to the server: connection refused'
I also ran into this error while using the ChaosCenter UI.
- Solution: Ensure your Kubernetes cluster is accessible and the NodePort is correctly exposed. Use 'kubectl describe svc litmus-frontend -n litmus' for more details.
Step 3: Create Your First Chaos Experiment
Log in to ChaosCenter:
- Use the default admin credentials (admin/litmus).
Create a new project:
- Navigate to the Projects section and create a new project.
Install Chaos Agents:
- Follow the UI instructions to install Chaos Agents in your desired namespace.
Create an experiment:
- Go to the Experiments section and create a new experiment.
- Select a predefined experiment, such as 'pod-delete'.
Schedule and run the experiment:
- Configure the experiment parameters and schedule it.
- Run the experiment and observe the results in the ChaosCenter UI.
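Besides the UI, the outcome of a run is also recorded in the cluster as ChaosResult resources, which you can inspect with kubectl. The sketch below summarizes their verdicts. The field path '.status.experimentStatus.verdict' is what I assume from recent Litmus versions and may differ in older ones, '<namespace>' stands for wherever your Chaos Agents are installed, and summarize_verdicts is a hypothetical helper name.

```shell
# Sketch: summarize ChaosResult verdicts.
# Reads "name<TAB>verdict" lines on stdin, prints "name: verdict",
# and exits with status 1 if any verdict is "Fail".
summarize_verdicts() {
  awk -F '\t' '
    { print $1 ": " $2 }
    $2 == "Fail" { failed = 1 }
    END { exit failed + 0 }
  '
}

# Usage against a live cluster (field path and namespace are assumptions):
#   kubectl get chaosresults -n <namespace> \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.experimentStatus.verdict}{"\n"}{end}' \
#     | summarize_verdicts
```

The non-zero exit status on a failed verdict makes the helper easy to drop into a script or CI job later on.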
Wrapping Up
Exploring chaos engineering with LitmusChaos has been an enlightening experience for me. It has not only helped me understand my system's resilience better but also made me realize the importance of being prepared for unexpected failures. By following these steps and being aware of common errors, you can integrate chaos engineering practices into your Kubernetes environment smoothly and improve your system's reliability.
This blog shares my journey of getting started with chaos engineering using LitmusChaos. I hope you find it informative and helpful in your efforts to build more resilient systems.
If you liked this, a star would be a wonderful way to say thanks! ⭐ Your support means a lot.