This is what the cluster looks like:
What it's made of:
- 3 Raspberry Pi 4 (8GB)
- 1 gigabit Ethernet 5-port switch
- 1 1TB Lexar ES3 USB SSD
- 1 80mm fan
- 3 25cm Cat6 Ethernet cables
- a very short USB-C 10Gbps cable
- some M3 threaded inserts and screws
- a 3D printed rack
The rack is a remix of this one. I've included the STLs that I remixed/designed, namely the vented sleds for the Pi 4s and the SSD, and the side fan mount.
Here is a top view diagram of the main components:
This is the repo that governs almost all of the cluster. The bootstrapping is done using ansible, from 3 SSH-accessible machines (the Pi 4s in this case).
From here, Flux will create everything that is declared in `k8s/`, decrypt what's secret using a private key, and keep the stack in sync.

In `k8s/` there are 2 main folders:

- `infra`, which represents what's needed for the cluster to function:
  - a StorageClass through an NFS provisioner
  - an IngressController with Traefik (actually 2: one private, one public)
  - cert-manager for pulling certs for my domain
  - a Cloudflare tunnel for exposing part of my services to the outside world
  - Tailscale (not deployed using gitops - yet) for accessing my private services from wherever
- an `apps` folder, composed of the actual services running on the cluster:
  - adguard for DNS/DHCP
  - gitea for local git and CI/CD
  - paperless-ngx for my important files
  - immich for photo backups and sync
  - vaultwarden as my password manager
  - filebrowser for file sharing
  - glance as my internet homepage
  - kromgo for exposing stats publicly
  - octoprint for controlling my 3D printer
  - and some other stuff like monitoring, a blog, static sites, etc.
- there is also an `appchart` folder. It's a Helm chart that eases the deployment of simple services.
I try to adhere to gitops/automation principles. Some things aren't automated, but it's mainly toil (one-time things during setup, critical upgrades, some provisioning...). 95% of the infrastructure should be deployable by following these instructions (assuming the data and encryption keys are known).
Requirements and basic stack:
- ansible: infrastructure automation
- flux: cluster state mgmt
- sops + age: encryption
- git: change management
brew install git ansible fluxcd/tap/flux sops age
This assumes you have the decryption key `age.agekey`, and the env var configured:
SOPS_AGE_KEY_FILE=age.agekey
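If you need to generate a key pair from scratch instead (the repo's existing secrets would then have to be re-encrypted for the new public key), age can create one:
# generates age.agekey containing the private key, and prints the matching public key
age-keygen -o age.agekey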
If you want to encrypt an already created file (e.g. a k8s Secret spec):
sops encrypt -i <file.yaml>
If you want to edit an encrypted file in place (e.g. modify a value in an encrypted Secret/ConfigMap) using $EDITOR:
sops k8s/apps/services/beaver/beaver-config.yaml
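To simply print the decrypted content to stdout without editing (useful for a quick check), the decrypt subcommand works the same way (older sops versions use `sops -d` instead):
sops decrypt k8s/apps/services/beaver/beaver-config.yaml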
It is assumed that SSH key auth is configured on the nodes (`ssh-copy-id`), with passwordless sudo (`<user> ALL=(ALL) NOPASSWD: ALL` in visudo).
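A minimal sketch of that prep, with a placeholder user/hostname, using a sudoers drop-in file rather than editing through visudo:
# from your workstation: copy your public key to each node
ssh-copy-id pi@node1.local
# then on each node: allow passwordless sudo via a drop-in, and check the syntax
echo 'pi ALL=(ALL) NOPASSWD: ALL' | sudo tee /etc/sudoers.d/010-pi-nopasswd
sudo visudo -c
With that in place, run the install playbook: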
cd ansible
ansible-playbook -i inventory.yaml -l lampone cluster-install.yaml
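If the playbook doesn't already drop a kubeconfig on your workstation, it can be pulled from the k3s default location on the master node (user, hostname and IP below are placeholders):
# fetch the kubeconfig and point it at the node instead of localhost
ssh pi@master-node "sudo cat /etc/rancher/k3s/k3s.yaml" | sed 's/127.0.0.1/<master-node-ip>/' > ~/.kube/config
kubectl get nodes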
- Get a GitHub token and set it as an env var:
export GITHUB_TOKEN=xxx
- Run the bootstrap commands:
# pre create the decryption key
kubectl create ns flux-system
kubectl create secret generic sops-age --namespace=flux-system --from-file=age.agekey
# bootstrap flux
flux bootstrap github \
--owner=k0rventen \
--repository=lampone \
--branch=main \
--path=./k8s/flux
- Things should start to deploy! :magic:
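A few standard flux/kubectl commands to follow the reconciliation:
# sanity check the flux install
flux check
# watch the kustomizations and helmreleases converge
flux get kustomizations -A
flux get helmreleases -A
# and the resulting workloads
kubectl get pods -A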
To update the cluster, set `k3s_version` in the ansible inventory (this should be updated by renovate), then:
ansible-playbook -i inventory.yaml cluster-update.yaml
Follow along with `k get nodes -w`.
I try to follow a 3-2-1 backup rule. The 'live' data is on the NFS SSD. It's backed up daily onto the same SSD (mainly for rollbacks and potential local re-deployments). For disaster-recovery situations, it's also backed up daily onto an HDD offsite, which can be accessed through my tailnet.
The backup tool is restic. It's installed and configured on the NFS server using ansible. There is a 'sidecar' unit that sends a report through Discord if the backup fails.
- Init the local repo
cd /nfs
restic init -r nfs-backups
- Init the remote repo
Create a `mnt-backup.mount` systemd mount unit on the remote server to mount/unmount the backup disk:
coco@remote_server:~ $ cat /etc/systemd/system/mnt-backup.mount
[Unit]
Description=Restic Backup External Disk mount
[Mount]
What=/dev/disk/by-label/backup
Where=/mnt/backup
Type=ext4
Options=defaults
[Install]
WantedBy=multi-user.target
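Then load and activate the mount on the remote server:
sudo systemctl daemon-reload
sudo systemctl enable --now mnt-backup.mount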
Init the repo from the NFS server (this assumes passwordless SSH auth):
restic init -r sftp:<remote_server_ip>:/mnt/backup/nfs-backups
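To confirm the remote repo is reachable from the NFS server (this will prompt for the repository password):
restic -r sftp:<remote_server_ip>:/mnt/backup/nfs-backups snapshots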
- Create a systemd credential with the repo password (on the NFS server) and set the value of `restic_systemd_creds` in the ansible inventory:
> systemd-ask-password -n | sudo systemd-creds encrypt --name=restic -p - -
🔐 Password: *************************
SetCredentialEncrypted=restic: \
...
- Create a Discord webhook for your channel and set the `discord_webhook` key in the inventory accordingly.
- Deploy the restic config using ansible:
ansible-playbook -i inventory.yaml restic-install.yaml
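Once a backup has run, the local snapshots can be listed from the NFS server (the repo path comes from the init step above; it prompts for the repo password):
restic -r /nfs/nfs-backups snapshots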
A staging environment can be deployed using vagrant, installed either via Homebrew:
brew tap hashicorp/tap
brew install hashicorp/tap/vagrant
or via apt:
sudo apt install virtualbox vagrant --no-install-recommends
Then create the staging env:
# launch
vagrant up
# add the nodes ssh config
vagrant ssh-config >> $HOME/.ssh/config
# deploy the cluster
cd ansible
ansible-playbook -i inventory.yaml -l staging cluster-install.yaml
# get the kubectl config
cd ..
vagrant ssh -c "kubectl config view --raw" staging-master > $HOME/.kube/configs/staging
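# point kubectl at the staging config (assuming it isn't already merged into your default kubeconfig)
export KUBECONFIG=$HOME/.kube/configs/staging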
# test
kubectl get nodes
Then bootstrap the cluster using flux as described in the bootstrap section above, ideally using a develop branch.
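Concretely, that would be the same bootstrap command as before with the branch swapped (assuming a develop branch exists on the repo):
flux bootstrap github \
--owner=k0rventen \
--repository=lampone \
--branch=develop \
--path=./k8s/flux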