darshankerkar/k8podguard

Kubernetes AutoHeal Operator

This is a custom Kubernetes controller (operator) written in Go that monitors pod health across the cluster. When a pod enters the CrashLoopBackOff state, the operator deletes it so that the owning controller creates a fresh replacement.

Why this exists

This project demonstrates cloud-native systems engineering, familiarity with Kubernetes internals, client-go, and reconciliation loops.

MVP Phase 1 Features

  • Authenticates and connects to a Kubernetes cluster (in-cluster or via kubeconfig).
  • Polls all namespaces for Pods on a periodic basis.
  • Detects when a Container falls into CrashLoopBackOff.
  • Automatically deletes the problematic Pod so that the backing ReplicaSet/Deployment starts a fresh instance.
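
The detection step in the list above can be sketched as follows. This is a minimal illustration, not the repository's code. The `Pod`, `ContainerStatus`, `ContainerState`, and `ContainerStateWaiting` types are simplified stand-ins for the client-go/corev1 types, and `crashLoopingContainer` is a hypothetical helper name. In the real API the same check would read `pod.Status.ContainerStatuses[i].State.Waiting.Reason`.

```go
package main

import "fmt"

// Simplified stand-ins for the corev1 status types (assumption: only the
// fields needed for the check are modelled here).
type ContainerStateWaiting struct{ Reason string }

type ContainerState struct{ Waiting *ContainerStateWaiting }

type ContainerStatus struct {
	Name  string
	State ContainerState
}

type Pod struct {
	Namespace string
	Name      string
	Statuses  []ContainerStatus
}

// Reason string the kubelet reports for a container that keeps crashing.
const crashLoopReason = "CrashLoopBackOff"

// crashLoopingContainer returns the name of the first container that is
// waiting with reason CrashLoopBackOff, and whether one was found.
func crashLoopingContainer(p Pod) (string, bool) {
	for _, cs := range p.Statuses {
		if cs.State.Waiting != nil && cs.State.Waiting.Reason == crashLoopReason {
			return cs.Name, true
		}
	}
	return "", false
}

func main() {
	pods := []Pod{
		{Namespace: "default", Name: "healthy-pod", Statuses: []ContainerStatus{{Name: "app"}}},
		{Namespace: "default", Name: "broken-pod", Statuses: []ContainerStatus{
			{Name: "app", State: ContainerState{Waiting: &ContainerStateWaiting{Reason: crashLoopReason}}},
		}},
	}
	for _, p := range pods {
		if c, ok := crashLoopingContainer(p); ok {
			fmt.Printf("pod %s/%s: container %s is in %s\n", p.Namespace, p.Name, c, crashLoopReason)
		}
	}
}
```

A pod is flagged as soon as any one of its containers is crash-looping, which matches the per-container wording of the feature list.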

Testing Locally

Prerequisites

  • Go 1.20+
  • A running Kubernetes cluster (e.g., kind, minikube, or Docker Desktop Kubernetes)
  • kubectl configured to communicate with your cluster

1. Build and Run the Operator

Clone this repository and run the operator locally; it connects using your local kubeconfig (~/.kube/config):

go run main.go

(You should see: Starting K8s AutoHeal Operator...)

2. Simulate a Failing Pod

You can apply a faulty pod to see the auto-healing in action. In a new terminal, create a broken deployment:

kubectl apply -f manifest/broken-deployment.yaml

Wait a few minutes, then run kubectl get pods; the pod's status should show CrashLoopBackOff. Within ~15 seconds of the pod reaching that state, the operator will log:

[ALERT] Pod broken-pod-xxxx in namespace default is in CrashLoopBackOff. Proceeding to heal...
[ACTION] Deleting Pod broken-pod-xxxx to force restart...
[SUCCESS] Pod broken-pod-xxxx successfully deleted for restart.

3. Deploy in-cluster (Future)

Build the docker image:

docker build -t k8podguard:v1 .

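Before deploying, set up Role-Based Access Control (RBAC) so the operator's service account has permission to list and delete pods. Because the operator polls all namespaces, this requires a ClusterRole and ClusterRoleBinding rather than a namespaced Role.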

About

A Kubernetes operator built with Golang for monitoring pod health, detecting failures, and automatically recovering unhealthy workloads in real time. The project focuses on self-healing infrastructure workflows, automated recovery mechanisms, and production-style Kubernetes operations.
