Merged
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
title: Use Cilium as Network Plugin
title: "Network Plugin: Cilium"
---

| status: | date: | decision-makers: |
@@ -1,40 +1,40 @@
---
title: "Use Talos OS as the Preferred Operating System for Kubernetes Operations"
title: "OS: Talos Linux"
date: "2025-02-25"
---


| status: | date: | decision-makers: |
| --- | --- | --- |
| proposed | 2025-02-25 | Sofus Albertsen |
| approved | 2025-02-25 | Sofus Albertsen |

## Context and Problem Statement

Choosing the right operating system for your Kubernetes cluster is crucial for stability, security, and operational efficiency. The OS should be optimized for container workloads, minimize overhead, and integrate well with Infrastructure as Code (IaC) practices.

## Considered Options

* Talos OS
* Talos Linux
* Red Hat OpenShift
* SUSE Rancher (RancherOS/RKE)

## Decision Outcome

Chosen option: **Talos OS**, because its minimal footprint, API-driven configuration, and singular focus on Kubernetes make it ideal for automated infrastructure management and reduce operational overhead.
Chosen option: **Talos Linux**, because its minimal footprint, API-driven configuration, and singular focus on Kubernetes make it ideal for automated infrastructure management and reduce operational overhead.

Talos OS's immutable architecture and security-focused design further enhance its suitability for Kubernetes deployments, giving you a minimal attack surface from the OS point of view. As an example, the OS does not have any shell, so no bash scripts can be executed.
Talos's immutable architecture and security-focused design further enhance its suitability for Kubernetes deployments, giving you a minimal attack surface from the OS point of view. As an example, the OS does not have any shell, so no bash scripts can be executed.

OpenShift and Rancher were considered, but their comprehensive feature sets, while beneficial in some scenarios, introduce increased complexity and overhead.

While their dashboards can simplify initial setup, they can also encourage "click-ops" and deviate from IaC best practices. These platforms might be suitable if existing Red Hat or SUSE expertise is a primary driver, but because they are full-fledged operating systems underneath, they introduce more operational overhead than Talos.

### Consequences

* **Good:** Talos OS's minimal package selection gives it a smaller attack surface.
* **Good:** The API-driven configuration of Talos OS allows for seamless integration with IaC tools like Terraform, enabling fully automated cluster provisioning and management.
* **Good:** The immutable infrastructure of Talos OS simplifies updates and adds resiliency because of its dual boot bank setup.
* **Good:** Talos's minimal package selection gives it a smaller attack surface.
* **Good:** The API-driven configuration of Talos allows for seamless integration with IaC tools like Terraform, enabling fully automated cluster provisioning and management.
* **Good:** The immutable infrastructure of Talos simplifies updates and adds resiliency because of its dual boot bank setup.
* **Good:** The "two package" approach simplifies maintenance (day 2 operations) and reduces the likelihood of OS-related issues, as all known package combinations can be tested by the vendor.

* **Bad:** The learning curve for Talos OS might be steeper initially for teams unfamiliar with its API-driven approach.
* **Bad:** The learning curve for Talos might be steeper initially for teams unfamiliar with its API-driven approach.
* **Bad:** The lack of a graphical user interface might be a drawback for some users accustomed to traditional OS management.
* **Bad:** Talos is a newer project than OpenShift or Rancher, so its community and available resources might be smaller.
@@ -1,5 +1,5 @@
---
title: "Longhorn_as_storage_solution"
title: "Storage Solution: Longhorn"
date: "2025-03-18"
---

6 changes: 3 additions & 3 deletions docs/hardware_ready/_index.md
@@ -5,8 +5,8 @@ title: Getting your hardware ready

| Problem domain | Description | Reason for importance | Tool recommendation |
|:---:|:---:|:---:|:---:|
| Kubernetes Node Operating System | The Operating System running on each of the hosts that will be part of your Kubernetes cluster | Choosing the right OS will be the foundation for building a production-grade Kubernetes cluster | [Talos OS](hardware_ready/ADRs/talos_as_os.md) |
| Storage solution | The underlying storage capabilities which Kubernetes will leverage to provide persistence for stateful workloads | Choosing the right storage solution for your cluster's needs is important, as it involves many trade-offs, e.g. redundancy vs. complexity | [Longhorn](Longhorn_as_storage_solution.md) |
| OS | The Operating System running on each of the hosts that will be part of your Kubernetes cluster | Choosing the right OS will be the foundation for building a production-grade Kubernetes cluster | [Talos Linux](hardware_ready/ADRs/os_talos_linux.md) |
| Storage solution | The underlying storage capabilities which Kubernetes will leverage to provide persistence for stateful workloads | Choosing the right storage solution for your cluster's needs is important, as it involves many trade-offs, e.g. redundancy vs. complexity | [Longhorn](storage_solution_longhorn.md) |
| Container Runtime (CRI) | The software that is responsible for running containers | You need a working container runtime on each node in your cluster, so that the kubelet can launch pods and their containers | |
| Network plugin (CNI) | Plugin used for cluster networking | A CNI plugin is required to implement the Kubernetes network model | [Cilium](Cilium_as_network_plugin.md) |
| Network plugin (CNI) | Plugin used for cluster networking | A CNI plugin is required to implement the Kubernetes network model | [Cilium](network_plugin_cilium.md) |
| Virtualisation | An optional layer between your hardware and your Kubernetes tech stack | In some scenarios it might be beneficial to abstract the underlying hardware away and have everything running in virtual machines | |
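
The renamed ADR files in this change appear to follow a `<problem domain>_<tool>.md` pattern that mirrors the new `"Domain: Tool"` titles. The following is a minimal Python sketch of that mapping, inferred only from the titles and links visible in this diff; the helper name `adr_filename` is hypothetical and is not part of the repository.

```python
import re


def adr_filename(title: str) -> str:
    """Derive a snake_case ADR filename from a "Domain: Tool" title.

    Assumption: the convention is inferred from the renames in this diff,
    not from any documented rule in the repository.
    """
    # Lowercase, then collapse every run of non-alphanumerics into "_".
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return f"{slug}.md"


for title in ("OS: Talos Linux", "Storage Solution: Longhorn", "Network Plugin: Cilium"):
    print(f"{title} -> {adr_filename(title)}")
```

A check like this could be run over the ADR front matter to spot links in the index tables that no longer match a file name after a rename.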
@@ -1,11 +1,11 @@
---
title: "Harbor as Image Registry"
title: "Image Registry: Harbor"
date: "2025-03-31"
---

| status: | date: | decision-makers: |
| --- | --- | --- |
| proposed | 2025-03-31 | Kasper Møller |
| approved | 2025-03-31 | Kasper Møller |

## Context and Problem Statement

@@ -1,12 +1,12 @@
---
title: "Hashicorp Vault as Secret Management"
title: "Secret Management: Hashicorp Vault"
date: "2025-04-14"
---


| status: | date: | decision-makers: |
| --- | --- | --- |
| proposed | 2025-04-14 | Kasper Møller |
| approved | 2025-04-14 | Kasper Møller |

## Context and Problem Statement

4 changes: 2 additions & 2 deletions docs/working_with_k8s/_index.md
@@ -6,8 +6,8 @@ title: Working With Kubernetes

| Problem domain | Description | Reason for importance | Tool recommendation |
|:---:|:---:|:---:|:---:|
| Image Registry | A common place to store and fetch images | High availability, secure access control | [Harbor](ADRs/harbor_as_image_registry.md) |
| Secret Management | Securely store and manage sensitive information like passwords and API keys | Prevent unauthorized access and data leaks | [HashiCorp Vault](ADRs/hashicorp_vault_as_secret_management.md) |
| Image Registry | A common place to store and fetch images | High availability, secure access control | [Harbor](ADRs/image_registry_habour.md) |
| Secret Management | Securely store and manage sensitive information like passwords and API keys | Prevent unauthorized access and data leaks | [HashiCorp Vault](ADRs/secret_management_hashicorp_vault.md) |
| Ingress Controller / Gateway API | Manage external access to services in the cluster | Enable routing, load balancing, and secure communication | |
| GitOps / Deployment Pipelines | Automate application deployments using Git as the source of truth | Ensure consistency, traceability, and faster deployments | |
| Monitoring Infrastructure | Observe and analyze the health and performance of the cluster and applications | Proactive issue detection and resolution | |