Lighthouse Trust Anchor — Deployment Guide

Environment: trust-anchor.dep.dev.rciam.grnet.gr (Debian 12 VM)
Role: OpenID Federation Trust Anchor for RI / e-Infra AAI federation
Stack: Lighthouse + Caddy (TLS termination) via Docker Compose

Upstream references:

| Resource | URL |
|---|---|
| GitHub repository | https://github.com/go-oidfed/lighthouse |
| Documentation | https://go-oidfed.github.io/lighthouse/ |
| Configuration reference | https://go-oidfed.github.io/lighthouse/config/ |
| Endpoints reference | https://go-oidfed.github.io/lighthouse/config/endpoints/ |
| Docker Hub image | https://hub.docker.com/r/oidfed/lighthouse |

Table of Contents

  1. Architecture Overview
  2. Prerequisites
  3. Deployment (Ansible)
  4. Manual Deployment (alternative)
  5. Signing Keys
  6. Public Key Extraction & Distribution
  7. Enrolling Subordinate Entities
  8. Verify the Trust Anchor
  9. Key Rotation
  10. Operations & Maintenance
  11. Hardening Checklist
  12. Troubleshooting

1. Architecture Overview

Internet
    │  :443 (HTTPS)
    ▼
┌─────────────────────────────────────┐
│  Caddy (ta-caddy)                   │  ← TLS termination, Let's Encrypt
│  trust-anchor.dep.dev.rciam.grnet.gr│    
└────────────┬────────────────────────┘
             │ :7672 (HTTP, internal Docker network)
             ▼
┌─────────────────────────────────────┐
│  Lighthouse (ta-lighthouse)         │  ← OpenID Federation Trust Anchor
│  oidfed/lighthouse:0.20.3           │
│  127.0.0.1:7673 (Admin API)         │  ← SSH tunnel only (see ADMIN_API.md)
└─────────────────────────────────────┘

Key design decisions:

  • Lighthouse is not directly exposed to the internet. Only Caddy is.
  • Caddy handles TLS automatically via ACME HTTP-01 challenge (Let's Encrypt).
  • Caddy blocks /enroll (HTTP 403) — the admin enrollment endpoint is only reachable via SSH tunnel to port 7672 on the VM loopback interface.
  • Entity ID = public HTTPS URL. It is baked directly into config.yaml by Ansible — Lighthouse (Go binary) does NOT expand environment variables in config files.
  • Signing keys are generated by Lighthouse on first boot (auto_generate_keys: true). On subsequent runs, Ansible sets it to false so existing keys are preserved.

2. Prerequisites

On the VM

| Requirement | Version | Notes |
|---|---|---|
| Debian | 12 (bookworm) x86_64 | Ansible playbook handles Docker install |
| SSH access | key-based | Must have sudo privileges |

Docker and all other dependencies are installed by the Ansible playbook. No manual setup on the VM is needed.

On the control node (your machine)

| Requirement | Install |
|---|---|
| Ansible ≥ 2.15 | pip install ansible |
| community.docker collection ≥ 3.10 | ansible-galaxy collection install -r ansible/requirements.yml |
| SSH key for the target VM | Must have sudo on the VM |

Network / Firewall

| Port | Protocol | Direction | Purpose |
|---|---|---|---|
| 22 | TCP | Inbound | SSH admin access |
| 80 | TCP | Inbound | ACME HTTP-01 challenge (Let's Encrypt) |
| 443 | TCP | Inbound | HTTPS, federation endpoints |

Port 80 must be reachable before first deployment. Caddy needs it for the ACME challenge.
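The firewall table above can be spot-checked from the control node with a short probe. This is a minimal sketch, not part of the playbook; the function name `probe_port` is hypothetical. Before the first deployment nothing listens on port 80 yet, so "refused" is the expected result there and suggests packets reach the VM, while "timeout" suggests they are being dropped on the way. Some firewalls actively reject connections, so "refused" alone is not conclusive.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connection attempt as open, refused, timeout, or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"        # something accepted the connection
    except ConnectionRefusedError:
        return "refused"         # host answered, but nothing is listening
    except (socket.timeout, TimeoutError):
        return "timeout"         # no answer at all, often a dropped packet
    except OSError:
        return "error"           # DNS failure, unreachable network, etc.


def probe_required_ports(host: str, ports=(22, 80, 443)) -> dict:
    """Probe each inbound port from the firewall table and collect the results."""
    return {port: probe_port(host, port) for port in ports}
```

Calling `probe_required_ports("trust-anchor.dep.dev.rciam.grnet.gr")` returns a dictionary keyed by port number.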


3. Deployment (Ansible)

3.1 Configure variables

Edit ansible/group_vars/trust_anchors.yml:

| Variable | Default | Notes |
|---|---|---|
| lighthouse_entity_id | https://trust-anchor.dep.dev.rciam.grnet.gr | Must match DNS + TLS cert CN |
| deploy_dir | /opt/lighthouse-ta | Deployment root on the remote host |
| lighthouse_image | oidfed/lighthouse:main | Pin to a version tag for production |
| caddy_image | caddy:2-alpine | |
| deploy_user | YOUR_SSH_USER | Override via -e or edit locally |
| deploy_group | YOUR_SSH_USER | Override via -e or edit locally |

3.2 Run the playbook

# Install Ansible collection dependency
ansible-galaxy collection install -r ansible/requirements.yml

# Deploy (replace YOUR_SSH_USER with your actual SSH username)
ansible-playbook -i ansible/inventory.ini ansible/deploy.yml \
    -u YOUR_SSH_USER \
    -e deploy_user=YOUR_SSH_USER \
    -e deploy_group=YOUR_SSH_USER \
    --private-key ~/.ssh/your_key

WSL users: copy your SSH key to the WSL native filesystem first. Files under /mnt/c usually cannot hold the 0600 permissions that ssh requires:

cp /mnt/c/Users/YOU/.ssh/your_key ~/.ssh/your_key
chmod 600 ~/.ssh/your_key

3.3 What the playbook does (in order)

  1. System packages — installs ca-certificates, curl, openssl, python3-pip
  2. Docker Engine — installs Docker CE + Compose plugin from official Docker APT repo
  3. Directory layout — creates /opt/lighthouse-ta/ tree
  4. Configuration files — renders docker-compose.yml, config.yaml, Caddyfile from Jinja2 templates with the correct entity_id and auto_generate_keys value
  5. Docker Compose — pulls images and starts the stack
  6. Post-checks — verifies TLS, /.well-known/openid-federation, /list, /fetch, /resolve endpoints, then prints the live iss claim

3.4 First deploy vs. re-deploy

| Scenario | auto_generate_keys | Signing keys |
|---|---|---|
| First deploy (no keys on host) | true | Lighthouse generates federation_ES256.pem + federation_ES256f.pem + keys.jwks |
| Re-deploy (keys already exist) | false | Existing keys are preserved, no regeneration |

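The playbook is idempotent: re-running it changes only what actually differs. The one expected change is on the first re-run after the initial deploy, when auto_generate_keys flips from true to false (see §5.3).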
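The first-deploy versus re-deploy decision can be pictured as a simple file-existence check. The sketch below is an illustration only: the playbook's actual detection logic is not shown in this guide, and treating the presence of either PEM file as "keys exist" is an assumption, as is the function name.

```python
from pathlib import Path

# File names taken from the table above.
PEM_NAMES = ("federation_ES256.pem", "federation_ES256f.pem")


def auto_generate_keys_value(signing_dir) -> bool:
    """Return True only when no signing key is present yet (first deploy).

    Assumption: a single existing PEM is enough to disable generation,
    so that a partially populated directory is never overwritten.
    """
    signing = Path(signing_dir)
    return not any((signing / name).is_file() for name in PEM_NAMES)
```

With the default deploy_dir, the directory to pass would be /opt/lighthouse-ta/lighthouse/data/signing.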


4. Manual Deployment (alternative)

If not using Ansible:

# 1. Install Docker (see https://docs.docker.com/engine/install/debian/)
# 2. Clone the repo
git clone <repo-url> /opt/lighthouse-ta
cd /opt/lighthouse-ta

# 3. Edit lighthouse/config.yaml and set entity_id to your actual URL:
#    entity_id: "https://your-domain.example.org"

# 4. Edit caddy/Caddyfile and set the hostname to your domain

# 5. Start the stack
docker compose up -d

# 6. Lighthouse generates signing keys on first boot; confirm they exist
ls -l lighthouse/data/signing/

# 7. Back up the keys (see §5)

# 8. Set auto_generate_keys: false in lighthouse/config.yaml

# 9. Restart so the change takes effect
docker compose restart lighthouse

5. Signing Keys

5.1 How keys work

Lighthouse generates two private key files on first boot and derives a public JWKS from them:

| File | Purpose |
|---|---|
| federation_ES256.pem | Current signing key |
| federation_ES256f.pem | Future key (pre-staged for rollover) |
| keys.jwks | Public JWKS, auto-rebuilt on every startup from both PEMs |

All three files live in <deploy_dir>/lighthouse/data/signing/.

5.2 Back up the keys

Back up both .pem files immediately after the first deployment. They are the only files in this directory that cannot be recovered if lost; keys.jwks is rebuilt from them on every startup.

# From your local machine
scp YOUR_USER@trust-anchor.dep.dev.rciam.grnet.gr:/opt/lighthouse-ta/lighthouse/data/signing/federation_ES256.pem ./
scp YOUR_USER@trust-anchor.dep.dev.rciam.grnet.gr:/opt/lighthouse-ta/lighthouse/data/signing/federation_ES256f.pem ./

Store both in a secrets manager (Bitwarden, 1Password, HashiCorp Vault). Ensure at least two people have access.

5.3 Verify keys are locked

After first deployment, re-run the Ansible playbook. It will detect existing keys and set auto_generate_keys: false automatically. Verify on the VM:

grep auto_generate /opt/lighthouse-ta/lighthouse/config.yaml
# Expected: auto_generate_keys: false

6. Public Key Extraction & Distribution

Federation members need the Trust Anchor's public key (JWKS) to validate trust chains.

6.1 Extract from the live endpoint (recommended)

curl -s https://trust-anchor.dep.dev.rciam.grnet.gr/.well-known/openid-federation | \
    python3 -c "
import sys, json, base64
token = sys.stdin.read().strip()
payload_b64 = token.split('.')[1]
payload_b64 += '=' * (-len(payload_b64) % 4)
payload = json.loads(base64.urlsafe_b64decode(payload_b64))
print(json.dumps(payload['jwks'], indent=2))
" | tee ta-public-jwks.json

6.2 Extract from the keys.jwks file on the host

# On the VM
sudo cat /opt/lighthouse-ta/lighthouse/data/signing/keys.jwks | python3 -m json.tool

6.3 Distribute to federation members

Share ta-public-jwks.json (or the JWKS content) with federation member administrators. They need to configure it in their OpenID Federation client/broker as the Trust Anchor's trusted public key.

The JWKS will contain two keys (current + future). Members should trust both.
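Before sharing the file, it can help to confirm that it lists the expected two keys and contains no private key material. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, and the set of private parameter names comes from the JWK specifications (RFC 7518), not from Lighthouse.

```python
# Private-key members defined for EC, RSA, and symmetric JWKs (RFC 7518).
PRIVATE_PARAMS = {"d", "p", "q", "dp", "dq", "qi", "k"}


def summarize_jwks(jwks: dict) -> list:
    """Return (kid, kty, crv) for each key; raise if private material is present."""
    summary = []
    for key in jwks.get("keys", []):
        leaked = PRIVATE_PARAMS.intersection(key)
        if leaked:
            raise ValueError(
                f"key {key.get('kid')!r} contains private parameters: {sorted(leaked)}"
            )
        summary.append((key.get("kid"), key.get("kty"), key.get("crv")))
    return summary
```

Loading ta-public-jwks.json with `json.load` and passing the result to `summarize_jwks` should yield two entries, one for the current key and one for the future key.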


7. Enrolling Subordinate Entities

Full reference: ADMIN_API.md

7.1 How enrollment works

Lighthouse 0.20.x provides three ways to manage subordinates:

| Method | Interface | Recommended for | Auto-fetches keys? |
|---|---|---|---|
| /enroll | GET /enroll?sub=... on port 7672 | Enrolling a live entity; no auth beyond the SSH tunnel | Yes |
| /enroll-request | GET /enroll-request?sub=... (public) | Subordinate self-service; request stays pending until admin approval | Yes |
| Admin API | POST /api/v1/admin/subordinates on port 7673 | Removal, metadata, lifetimes, key updates | No; you must supply jwks |

/enroll and /enroll-request fetch and verify the subordinate's Entity Configuration live, extract the JWKS, and write it to the database, with no downtime. The Admin API does not fetch keys: POST /api/v1/admin/subordinates takes an entity_id and, because the default status is active, also requires a jwks in the body; without it the call fails with "status cannot be active without keys". To enroll a live entity, prefer /enroll.
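If the Admin API route is used anyway, the jwks has to come from somewhere, typically the subordinate's own Entity Configuration. The sketch below shows one way to assemble the request body from that JWT. It is an illustration under assumptions: the function names are hypothetical, only the entity_id and jwks fields are taken from this guide, and the JWT signature is NOT verified here, which /enroll does for you.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a compact JWT."""
    payload_b64 = token.strip().split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def build_subordinate_body(entity_config_jwt: str) -> str:
    """Build a JSON body with entity_id and jwks for POST .../subordinates."""
    payload = decode_jwt_payload(entity_config_jwt)
    # An Entity Configuration is self-issued, so iss and sub must match.
    if payload.get("iss") != payload.get("sub"):
        raise ValueError("not an Entity Configuration: iss and sub differ")
    return json.dumps({"entity_id": payload["sub"], "jwks": payload["jwks"]})
```

The resulting string can be passed to curl with -d in place of the hand-written body. Because nothing is verified, compare the keys with the subordinate's operator out-of-band before enrolling.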

7.2 Prerequisite: SSH tunnel to the Admin API

The Admin API runs on port 7673, bound to the VM's loopback interface. Access requires an SSH tunnel.

# Open the tunnel (keep this terminal open)
ssh -L 7673:localhost:7673 YOUR_USER@trust-anchor.dep.dev.rciam.grnet.gr

Verify:

curl -s -o /dev/null -w '%{http_code}' http://localhost:7673/api/v1/admin/docs
# Expected: 200

7.3 Enroll a subordinate

Recommended — /enroll (auto-fetches the entity's keys):

# Separate tunnel to the main server port (see §7.6); keep this terminal open
ssh -L 7672:localhost:7672 YOUR_USER@trust-anchor.dep.dev.rciam.grnet.gr

# In a second terminal
curl -i "http://localhost:7672/enroll?sub=https://some-idp.example.org"
# Expected: 201 Created

Admin API alternative (you must supply the jwks yourself):

curl -s -u "admin:YOUR_PASSWORD" \
  -X POST http://localhost:7673/api/v1/admin/subordinates \
  -H 'Content-Type: application/json' \
  -d '{"entity_id":"https://some-idp.example.org","jwks":{"keys":[ ... ]}}'
# Expected: 201 Created

Error responses:

| Status | Meaning |
|---|---|
| 400 Bad Request | Invalid request, e.g. "status cannot be active without keys" (no jwks supplied while status is, or defaults to, active); or, for /enroll, the entity's /.well-known/openid-federation is unreachable or not a valid JWT |
| 409 Conflict | Already enrolled; POST is not idempotent, so update via the jwks/status endpoints instead |

7.4 Verify enrollment

# List all enrolled subordinates
curl -s https://trust-anchor.dep.dev.rciam.grnet.gr/list

# Fetch the signed statement for a specific subordinate
curl -s "https://trust-anchor.dep.dev.rciam.grnet.gr/fetch?sub=https://some-idp.example.org"

# Resolve the full trust chain (end-to-end test)
curl -s "https://trust-anchor.dep.dev.rciam.grnet.gr/resolve?sub=https://some-idp.example.org&trust_anchor=https://trust-anchor.dep.dev.rciam.grnet.gr"

The /resolve call is the definitive end-to-end test — 200 OK with a JWT means the full chain from the subordinate to the TA validates correctly.
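The curl commands above pass the entity identifiers unencoded, which works for simple URLs. If an identifier contains characters such as ? or &, the query string needs percent-encoding. A minimal sketch, with a hypothetical helper name:

```python
from urllib.parse import urlencode


def resolve_url(base: str, sub: str, trust_anchor: str) -> str:
    """Build a /resolve URL with both query parameters percent-encoded."""
    query = urlencode({"sub": sub, "trust_anchor": trust_anchor})
    return f"{base.rstrip('/')}/resolve?{query}"
```

For this deployment, both `base` and `trust_anchor` would be https://trust-anchor.dep.dev.rciam.grnet.gr, and the returned string can be handed to curl as-is.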

7.5 Remove a subordinate (Admin API)

# URL-encode the entity identifier in the path
curl -s -u "admin:YOUR_PASSWORD" -X DELETE \
  "http://localhost:7673/api/v1/admin/subordinates/https%3A%2F%2Fsome-idp.example.org"
# Expected: 204 No Content
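The encoded path segment in the DELETE call can be produced rather than typed by hand. A small sketch with a hypothetical helper name; the path prefix is the one used in the command above:

```python
from urllib.parse import quote


def subordinate_path(entity_id: str) -> str:
    """Return the Admin API path for one subordinate, entity ID fully encoded."""
    # safe="" makes quote() encode "/" and ":" as well, so the whole
    # entity identifier stays a single path segment.
    return "/api/v1/admin/subordinates/" + quote(entity_id, safe="")
```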

7.6 The /enroll endpoint

The /enroll GET endpoint fetches the subordinate's Entity Configuration, extracts its JWKS, and enrolls it in one call — the simplest way to enroll a live entity. It is blocked by Caddy on port 443 and requires an SSH tunnel to port 7672 (the main federation server port, separate from the Admin API port 7673).

# Tunnel to port 7672
ssh -L 7672:localhost:7672 YOUR_USER@trust-anchor.dep.dev.rciam.grnet.gr

# Enroll (in a second terminal)
curl -i "http://localhost:7672/enroll?sub=https://some-idp.example.org"

This endpoint can only enroll — it cannot remove, list, or manage metadata. Use the Admin API for all other operations.


8. Verify the Trust Anchor

8.1 Quick health check

BASE="https://trust-anchor.dep.dev.rciam.grnet.gr"

echo "=== Entity Configuration ==="
curl -sI "${BASE}/.well-known/openid-federation" | head -3

echo "=== List ==="
curl -s "${BASE}/list"

echo "=== Fetch ==="
curl -sI "${BASE}/fetch" | head -3

echo "=== Resolve ==="
curl -sI "${BASE}/resolve" | head -3

8.2 Decode the Entity Configuration JWT

curl -s https://trust-anchor.dep.dev.rciam.grnet.gr/.well-known/openid-federation | \
    python3 -c "
import sys, json, base64
token = sys.stdin.read().strip()
payload_b64 = token.split('.')[1]
payload_b64 += '=' * (-len(payload_b64) % 4)
payload = json.loads(base64.urlsafe_b64decode(payload_b64))
print(json.dumps(payload, indent=2))
"

Expected fields: iss, sub (both = entity_id), jwks, metadata.federation_entity.* with all endpoint URLs.
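The expected fields can also be checked programmatically once the payload has been decoded as above. This is a minimal sketch with a hypothetical function name. It checks only the fields named in this section and does not verify the JWT signature.

```python
def entity_configuration_problems(payload: dict, entity_id: str) -> list:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    # iss and sub must both equal the configured entity_id.
    for claim in ("iss", "sub"):
        if payload.get(claim) != entity_id:
            problems.append(
                f"{claim} is {payload.get(claim)!r}, expected {entity_id!r}"
            )
    # The Trust Anchor must publish at least one key.
    if not payload.get("jwks", {}).get("keys"):
        problems.append("jwks is missing or empty")
    # The endpoint URLs live under metadata.federation_entity.
    if not payload.get("metadata", {}).get("federation_entity"):
        problems.append("metadata.federation_entity is missing or empty")
    return problems
```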

8.3 TLS certificate check

echo | openssl s_client -connect trust-anchor.dep.dev.rciam.grnet.gr:443 \
    -servername trust-anchor.dep.dev.rciam.grnet.gr 2>/dev/null | \
    openssl x509 -noout -issuer -subject -dates
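For monitoring, the notAfter date printed by the command above can be turned into a number of remaining days. The sketch below uses only the Python standard library; the function names are hypothetical and no alerting threshold is implied by this guide.

```python
import socket
import ssl
import time


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter string of the certificate the server presents."""
    context = ssl.create_default_context()  # also validates chain and hostname
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now=None) -> float:
    """Convert a notAfter string such as 'Jan  1 00:00:00 2030 GMT' to days left."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400
```

A call such as `days_until_expiry(fetch_not_after("trust-anchor.dep.dev.rciam.grnet.gr"))` gives the remaining lifetime. Caddy renews automatically, so a steadily shrinking value would point at a renewal problem.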

9. Key Rotation

Key rotation is a coordinated process: federation members configure the TA's public key statically (see §6.3), so they must receive the new key before the old one is retired.

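  1. Generate a new key (do not delete the old one):

    openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:P-256 \
        -out /opt/lighthouse-ta/lighthouse/data/signing/federation_ES256_new.pem

  2. Publish the new public key out-of-band to all federation members and announce a transition period (e.g., 30 days).

  3. Enable automatic_key_rollover in config.yaml, if your Lighthouse version supports it (check the configuration reference):

    signing:
      automatic_key_rollover:
        enabled: true
        interval: "30d"

  4. Restart Lighthouse and verify that both keys appear in the Entity Configuration's jwks.

  5. After all members have updated, remove the old key.

Note: Lighthouse already pre-stages a future key (federation_ES256f.pem, see §5.1). Whether it also picks up a manually generated file such as federation_ES256_new.pem is not confirmed here; check the configuration reference before relying on step 1.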
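The check in step 4 can be made explicit by comparing the published JWKS before and after the restart. This is a minimal sketch, assuming every key carries a kid; the function name is hypothetical.

```python
def rotation_diff(before: dict, after: dict) -> dict:
    """Compare two JWKS documents by kid and report what changed."""
    old = {key["kid"] for key in before.get("keys", [])}
    new = {key["kid"] for key in after.get("keys", [])}
    return {
        "retained": sorted(old & new),  # keys members already trust
        "added": sorted(new - old),     # keys members still need to learn
        "removed": sorted(old - new),   # should stay empty during the transition
    }
```

During the transition period, "removed" should be empty and "added" should contain the new key. A non-empty "removed" list before members have updated means their trust chains may stop validating.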


10. Operations & Maintenance

Logs

sudo docker compose -f /opt/lighthouse-ta/docker-compose.yml logs -f lighthouse
sudo docker compose -f /opt/lighthouse-ta/docker-compose.yml logs -f caddy

Restart a service

cd /opt/lighthouse-ta
sudo docker compose restart lighthouse
sudo docker compose restart caddy

Update Lighthouse image

cd /opt/lighthouse-ta
sudo docker compose pull lighthouse
sudo docker compose up -d lighthouse

oidfed/lighthouse:main, the default lighthouse_image, is a rolling tag. Pin a version tag (the architecture diagram in §1 shows 0.20.3) for production stability.

Backup

| Path | Contents | Frequency |
|---|---|---|
| lighthouse/data/signing/*.pem | Signing private keys | Once + after any rotation |
| postgres_data Docker volume | Enrolled entities, metadata, signing key history | Daily |
| caddy/data/ | TLS cert + ACME account | Weekly (auto-renews anyway) |

Auto-start on reboot

Docker with restart: unless-stopped handles container restarts. Ensure Docker itself starts:

sudo systemctl enable docker
sudo systemctl enable containerd

11. Hardening Checklist

  • SSH: disable password authentication
  • Firewall: only ports 22, 80, 443 open
  • Docker daemon: not exposed over TCP
  • Signing key: chmod 600 on PEM files
  • Signing key: backed up to secrets vault
  • auto_generate_keys: false after first boot
  • HSTS header enabled in Caddyfile (default: yes)
  • Log rotation configured (Caddy: 50 MB x 10 files)
  • Regular docker compose pull for security patches
  • /enroll blocked by Caddy on port 443 (default: yes — respond /enroll 403)
  • Ports 7672 and 7673 (Admin API) bound to loopback only (e.g. 127.0.0.1:7672:7672), not publicly reachable
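The chmod 600 item in the checklist can be audited with a few lines of Python. This is a minimal sketch with a hypothetical function name. It flags any .pem file that grants access to group or others, which is slightly stricter than checking for exactly 600.

```python
import stat
from pathlib import Path


def loosely_permissioned_pems(signing_dir) -> list:
    """Return names of .pem files readable or writable by group/others."""
    offenders = []
    for pem in sorted(Path(signing_dir).glob("*.pem")):
        mode = stat.S_IMODE(pem.stat().st_mode)  # permission bits only
        if mode & 0o077:                         # any group/other bit set
            offenders.append(pem.name)
    return offenders
```

Run it on the VM against /opt/lighthouse-ta/lighthouse/data/signing (reading that directory may require sudo). An empty list means the PEM files are owner-only.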

12. Troubleshooting

/enroll returns 403

Caddy blocks /enroll on port 443 by design. Use the SSH tunnel:

ssh -L 7672:localhost:7672 YOUR_USER@trust-anchor.dep.dev.rciam.grnet.gr
# then in another terminal:
curl -i "http://localhost:7672/enroll?sub=https://some-idp.example.org"

Caddy fails to obtain TLS certificate

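Usually port 80 is not reachable from the internet, or DNS does not point to this VM. Check both:

# Should print this VM's public IP address
dig +short trust-anchor.dep.dev.rciam.grnet.gr
# Should show a listener on port 80 once the stack is running
sudo ss -tlnp | grep ':80 '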
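Where dig is not installed, the DNS half of this check can be done with the Python standard library. This is a minimal sketch; the function names are hypothetical, and the expected IP address is something you supply yourself.

```python
import socket


def resolved_addresses(host: str) -> set:
    """Return every IP address the system resolver gives for host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def points_to(host: str, expected_ip: str) -> bool:
    """True if host resolves to expected_ip; False on mismatch or lookup failure."""
    try:
        return expected_ip in resolved_addresses(host)
    except socket.gaierror:
        return False
```

Note that this uses the local resolver. Let's Encrypt resolves the name from the public internet, so a split-horizon DNS setup can make the two views differ.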

Entity Configuration returns wrong entity_id

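The entity_id in config.yaml is incorrect. Fix the value and restart. If you deploy with Ansible, also correct lighthouse_entity_id in ansible/group_vars/trust_anchors.yml; otherwise the next playbook run re-renders config.yaml with the old value.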

sudo nano /opt/lighthouse-ta/lighthouse/config.yaml
sudo docker compose -f /opt/lighthouse-ta/docker-compose.yml restart lighthouse

Important: Lighthouse reads config.yaml once at startup. Editing the file has no effect until you restart the container.

Config changes not reflected

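Lighthouse reads config.yaml only at startup, so restart the container and inspect the startup log:

sudo docker compose -f /opt/lighthouse-ta/docker-compose.yml restart lighthouse
sudo docker compose -f /opt/lighthouse-ta/docker-compose.yml logs lighthouse | tail -20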

Ansible post-checks fail

The post-checks run on the remote host. If they fail, SSH into the VM and test manually:

curl -s https://trust-anchor.dep.dev.rciam.grnet.gr/.well-known/openid-federation | head -c 100

Check all endpoints at once

BASE="https://trust-anchor.dep.dev.rciam.grnet.gr"
echo "=== Entity Configuration ===" && curl -sI "${BASE}/.well-known/openid-federation" | head -3
echo "=== List ===" && curl -s "${BASE}/list"
echo "=== Fetch ===" && curl -sI "${BASE}/fetch" | head -3
echo "=== Resolve ===" && curl -sI "${BASE}/resolve" | head -3