
Prepare ssl-manager-wrapped Docker images for goaggregator, nostr-relay, and faucet (local-infra HAProxy deployment) #321

@vrogojin

Description


Motivation

The 2026-05-28 testnet aggregator outage (sustained 503 from goggregator-test.unicity.network) blocked every e2e and soak run that depends on the aggregator pointer path. The companion sphere.telco PRs (#320, #321, #322, #323) made the wallet UI fully configurable — six endpoints can now be overridden from a gear-icon Settings modal or sphereDev.set* console helpers — but pointing the wallet at a local stack only helps if there IS a local stack to point at.

tests/e2e/local-infra/ today boots only a Nostr relay (docker-compose.yml) and a faucet helper (faucet.ts). The full aggregator-go stack lives at tests/e2e/local-infra/.aggregator-go/ but is gitignored (developer-local clone with its own three-container compose). None of these are reachable over real domain names with proper TLS — they only work from inside Docker on the dev's laptop.

We need three production-shaped, ssl-manager-wrapped Docker images so the local stack can be deployed under real domains via the existing HAProxy + Let's Encrypt automation, both for e2e/soak runs in CI and as a fallback when the testnet aggregator is down.

Existing pattern (DO NOT change in this issue — replicate it)

Every TLS'd service on the host today follows the same three-piece structure:

  1. Dockerfile: FROM ghcr.io/unicitynetwork/ssl-manager:latest, with the service binary added on top. The base image bundles certbot, curl, jq, tini, python3, the registration scripts, etc.

  2. entrypoint.sh: on container start, runs (in order):

    • ssl-setup.sh (in the base) → acquires/renews the Let's Encrypt cert for $SSL_DOMAIN.
    • haproxy-register.sh register (in the base) → POSTs {domain, container, http_port, https_port, extra_ports} to http://${HAPROXY_HOST}:8404/v1/backends.
    • exec the actual service binary.
  3. run-<service>.sh: sources ssl-manager/run-lib.sh, sets CONTAINER_NAME / IMAGE_NAME / APP_NET / HEALTH_PORT and the hooks (app_parse_args, app_env_args, app_health_check, etc.), then calls ssl_manager_run "$@". The library handles argument parsing (--domain, --ssl-email, --haproxy-host, --no-ssl, --no-haproxy), network setup, container creation, port polling, and color-coded health checks.
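
The entrypoint ordering described above could look roughly like the sketch below. It is not the entrypoint from ipfs-storage. It assumes ssl-setup.sh and haproxy-register.sh are on PATH in the base image. SKIP_SSL, SKIP_HAPROXY and DRY_RUN are hypothetical variables, and /usr/local/bin/faucet is a hypothetical binary path; all are used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch of entrypoint.sh for an ssl-manager-wrapped service.
# Assumed: ssl-setup.sh and haproxy-register.sh are on PATH in the base image.
# Hypothetical (illustration only): SKIP_SSL, SKIP_HAPROXY, DRY_RUN.
set -eu

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Cert first, then HAProxy registration, then hand over to the service.
start_service() {
  [ "${SKIP_SSL:-0}" = "1" ] || run ssl-setup.sh
  [ "${SKIP_HAPROXY:-0}" = "1" ] || run haproxy-register.sh register
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "exec $*"
  else
    exec "$@"   # replace the shell so it does not sit between the init process and the service
  fi
}

# Only act when a service command is given,
# e.g. entrypoint.sh /usr/local/bin/faucet (hypothetical path).
if [ "$#" -gt 0 ]; then
  start_service "$@"
fi
```

Running it with DRY_RUN=1 prints the three steps in order without touching certbot or HAProxy, which is handy when reviewing a new wrapper image.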

Best end-to-end references on the host:

  • /home/vrogojin/ipfs-storage/ — complete Dockerfile + entrypoint + run-ipfs.sh + sidecar
  • /home/vrogojin/ssl-manager/examples/run-fulcrum.sh — WSS + EXTRA_PORTS (relevant for the Nostr relay)
  • /home/vrogojin/ssl-manager/INTEGRATION.md — full step-by-step guide

HAProxy is dynamic-registration based. frontend https-in :443 mode tcp does SNI passthrough — HAProxy does NOT terminate TLS; each container holds its own Let's Encrypt cert (acquired via the certbot inside ssl-manager). New services don't touch haproxy.cfg; they POST to /v1/backends via haproxy-register.sh and HAProxy adds them dynamically. EXTRA_PORTS (JSON array env var) is the escape hatch for ports other than 80/443.
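
For reference, the registration call can be approximated by hand as below. Treat this as a sketch under assumptions: the field names and the URL come from the description above, but the value types (numeric ports, extra_ports passed through as a raw JSON array) and the function names build_backend_payload / register_backend are guesses for illustration. haproxy-register.sh in the base image remains the source of truth.

```shell
#!/usr/bin/env bash
set -eu

# Build the registration body. Field names are taken from the issue text;
# numeric ports and a raw-JSON extra_ports value are assumptions.
build_backend_payload() {
  local domain="$1" container="$2" http_port="$3" https_port="$4"
  local extra_ports="${5:-[]}"
  printf '{"domain":"%s","container":"%s","http_port":%d,"https_port":%d,"extra_ports":%s}\n' \
    "$domain" "$container" "$http_port" "$https_port" "$extra_ports"
}

# POST the body to the HAProxy registration API.
# Needs curl and a reachable HAPROXY_HOST; not called by this sketch itself.
register_backend() {
  curl --fail --silent --show-error \
    -X POST -H 'Content-Type: application/json' \
    --data "$(build_backend_payload "$@")" \
    "http://${HAPROXY_HOST}:8404/v1/backends"
}
```

A manual registration would then be something like `HAPROXY_HOST=haproxy register_backend faucet-dev.unicity.network faucet 80 443`, where `haproxy` stands in for whatever host name the HAProxy container has on the shared network.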

Per-service spec

1. Faucet (simplest — do this first)

| Property | Value |
| --- | --- |
| Upstream image | ghcr.io/unicitynetwork/agentic-hosting/faucet:local (see tests/e2e/local-infra/faucet.ts:41) |
| Port shape | Single HTTP REST endpoint |
| TLS | In-container certbot → HTTPS on :443 |
| Domain | faucet-dev.unicity.network (TBD; see open decisions) |
| Required env | FAUCET_MNEMONIC, FAUCET_NOSTR_RELAYS, FAUCET_API_KEY (see faucet.ts for the full list) |

Deliverables:

  • Dockerfile extending ssl-manager + the upstream faucet binary.
  • entrypoint.sh doing ssl-setup → haproxy-register → exec faucet.
  • run-faucet.sh modeled on run-ipfs.sh.
  • README documenting required env vars + boot procedure.

Why faucet first: single process, single port, no internal multi-container dependencies. Best to confirm the pattern works end-to-end on this before tackling the relay or the aggregator.
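
A possible shape for run-faucet.sh, as a sketch only. The variable and hook names (CONTAINER_NAME, IMAGE_NAME, APP_NET, HEALTH_PORT, app_env_args, ssl_manager_run) come from the pattern described earlier, but the hook contract assumed here (print extra docker run arguments, one per line) is a guess and should be checked against run-ipfs.sh. The image tag, network name and SSL_MANAGER_DIR are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of run-faucet.sh. Variable and hook names come from the issue text;
# their exact contracts in run-lib.sh are assumptions.
set -eu

CONTAINER_NAME="faucet"
IMAGE_NAME="faucet-ssl:local"    # hypothetical image tag
APP_NET="faucet-net"             # hypothetical network name
HEALTH_PORT=443

# Hook (assumed contract): print extra `docker run` arguments, one per line.
# `-e NAME` without a value forwards NAME from the caller's environment, so
# secrets such as FAUCET_MNEMONIC never appear in this script or in the image.
app_env_args() {
  local name
  for name in FAUCET_MNEMONIC FAUCET_NOSTR_RELAYS FAUCET_API_KEY; do
    if ! printenv "$name" >/dev/null; then
      echo "missing required env var: $name" >&2
      return 1
    fi
    printf '%s\n' "-e" "$name"
  done
}

# SSL_MANAGER_DIR is a hypothetical override for the ssl-manager checkout.
RUN_LIB="${SSL_MANAGER_DIR:-./ssl-manager}/run-lib.sh"
if [ -f "$RUN_LIB" ]; then
  . "$RUN_LIB"
  ssl_manager_run "$@"
else
  echo "run-lib.sh not found at $RUN_LIB; nothing launched" >&2
fi
```

Forwarding the variables by name keeps the mnemonic out of the image and out of the script, which lines up with the "inject via secret at runtime" option under open decisions.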

2. Nostr relay

| Property | Value |
| --- | --- |
| Upstream image | ghcr.io/unicitynetwork/unicity-tokens-relay:sha-1e1b544 (the same SHA pinned in tests/e2e/local-infra/docker-compose.yml) |
| Port shape | WebSocket on TCP :8080 |
| TLS | In-container certbot → WSS on :443 |
| Domain | relay-dev.unicity.network (TBD) |
| Storage | SQLite volume; keep it persistent so we can docker exec sqlite3 to inspect events |

Deliverables:

  • Dockerfile extending ssl-manager + nostr-rs-relay binary (or copy from the pinned image).
  • entrypoint.sh doing ssl-setup → haproxy-register → exec the relay.
  • run-relay.sh.
  • Decide: terminate TLS in-container (matches existing pattern) OR pass through via EXTRA_PORTS and terminate at the relay. Recommendation: in-container, matches IPFS / Fulcrum.

3. Aggregator (most complex — three-container stack)

| Property | Value |
| --- | --- |
| Upstream | Vendored .aggregator-go/docker-compose.yml (currently developer-local, gitignored) |
| Stack | bft-root + bft-aggregator-genesis-gen + mongo + the aggregator Go binary itself. Optional ui on :3000. |
| Internal port | Aggregator JSON-RPC :11003 |
| TLS | A tiny ssl-manager-wrapped reverse proxy (nginx or HAProxy in mode http) fronts the aggregator on :443. The three backend containers run on a private Docker network with no TLS. |
| Domain | aggregator-dev.unicity.network (TBD) |
| Mongo storage | Persistent volume, required for chain continuity. Wipe between soak runs unless the test depends on cumulative state. |
| Genesis bootstrap | First run mints fresh BFT root + trust-base + partition genesis. Need to extract /genesis/trust-base.json from bft-aggregator-genesis-gen so wallets can load it via oracle.trustBasePath (Node) or sphereDev.setSkipTrustBase(true) (browser, expedient). |

Deliverables:

  • Decision on whether to commit .aggregator-go/ as a git-submodule under tests/e2e/local-infra/aggregator-go/ or to vendor a minimal aggregator-compose.yml that pins the same image SHAs.
  • A wrapper image OR a run-aggregator.sh that orchestrates the existing compose + a small ssl-manager-wrapped proxy container.
  • Export of trust-base.json to a path consumers can read (e.g., a named volume or an HTTP endpoint at https://aggregator-dev.unicity.network/.well-known/trust-base.json).
  • README documenting genesis-state management — when to wipe, how to reset, what wallets to use against a fresh stack.
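
The trust-base export could be as small as the sketch below. Assumptions: the genesis container is actually named bft-aggregator-genesis-gen (compose may prefix it), and the proxy container serves static files from a web root directory. Both function names and the web root layout are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Copy trust-base.json out of the genesis container. docker cp also works on
# a stopped container, which suits a one-shot generator.
# The default container name is an assumption.
extract_trust_base() {
  local dest="$1" container="${2:-bft-aggregator-genesis-gen}"
  docker cp "${container}:/genesis/trust-base.json" "$dest"
}

# Publish the file under <webroot>/.well-known/trust-base.json.
# Copy to a temp name first, then rename, so readers never see a partial file.
publish_trust_base() {
  local src="$1" webroot="$2" tmp
  if [ ! -s "$src" ]; then
    echo "trust base missing or empty: $src" >&2
    return 1
  fi
  mkdir -p "$webroot/.well-known"
  tmp="$webroot/.well-known/trust-base.json.tmp"
  cp "$src" "$tmp"
  mv "$tmp" "$webroot/.well-known/trust-base.json"
}
```

A run-aggregator.sh could call `extract_trust_base /tmp/trust-base.json` once the genesis step has finished and then `publish_trust_base /tmp/trust-base.json <webroot>`. The same file would also serve Node clients through oracle.trustBasePath.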

Open decisions (please weigh in before implementation starts)

  1. Domain names. Suggested faucet-dev.unicity.network / relay-dev.unicity.network / aggregator-dev.unicity.network. Confirm + ensure DNS A-records exist (or arrange wildcards).

  2. Where these images live. Three options:

    • In sphere-sdk: under tests/e2e/local-infra/<service>-image/ — directly adjacent to the helpers that use them. Easiest to keep in sync with the SDK's contract; CI can build them on-demand.
    • In a new unicity-infra repo: keeps service infra separate from SDK source. Cleaner separation of concerns; harder to keep version-locked with SDK.
    • In each service's own repo (aggregator-go, faucet, nostr relay): the most "correct" home but requires PRs across multiple repos. Recommended path if these services are also used by other Unicity consumers.
  3. Image registry. Push built images to ghcr.io/unicity-sphere/ or ghcr.io/unicitynetwork/, or inherit the upstream images' namespace?

  4. Faucet identity. Each faucet instance needs a Nostr identity (mnemonic). Generate fresh per environment? Bake into image (BAD)? Inject via secret at runtime?

  5. Aggregator trust-base distribution. The wallets connecting to the local aggregator need its trust-base.json. Three delivery options:

    • HTTP endpoint at https://aggregator-dev.unicity.network/.well-known/trust-base.json (cleanest for browser wallets).
    • Static file checked into tests/e2e/local-infra/aggregator-trust-base.json and consumed by global-setup.ts. Brittle — needs regeneration on every aggregator wipe.
    • sphereDev.setSkipTrustBase(true) and ignore the cryptographic check entirely. Expedient for browser dev but inappropriate for soak.
  6. Soak-run lifecycle. Should the soak harness wipe the aggregator MongoDB + relay SQLite between runs, or test cumulative state? If wiping is the default, the genesis bootstrap needs to be deterministic and re-runnable. If keeping, the harness needs a versioning story.

Acceptance criteria

  • docker pull works for all three images from a public-readable registry (or a private one the CI has credentials for).
  • Running each run-<service>.sh --domain <domain> --ssl-email <email> from a clean host:
    • acquires a Let's Encrypt cert via certbot
    • registers itself with the production HAProxy via the /v1/backends API
    • passes the script's built-in health checks (green ticks across the board)
    • is reachable from the public internet on https://<domain> within the script's polling window
  • sphere-telco Settings modal flow works end-to-end: open gear → fill in aggregator-dev.unicity.network / relay-dev.unicity.network / faucet-dev.unicity.network → Save → wallet reinit → all four banner pills green within one probe cycle.
  • The aggregator's trust-base is reachable / loadable by the wallet WITHOUT requiring skipTrustBase (browser may temporarily depend on that flag; soak-test Node clients must load the real trust-base file).
  • tests/e2e/local-infra/global-setup.ts gains an E2E_LOCAL_AGGREGATOR=1 mode that brings up all three services and exports the relevant URLs / pubkeys for the test harness — companion to the existing E2E_LOCAL_INFRA=1 flag.
  • INTEGRATION.md-style docs explain how to deploy each service from scratch on a new host (assumes ssl-manager + HAProxy stack already running).
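
The "reachable on https://<domain>" criterion can be smoke-tested with a sketch like the one below. It is independent of the health checks built into run-lib.sh. The probe path (/), the 120 s / 5 s defaults, and the POLL_TIMEOUT / POLL_INTERVAL variables are assumptions for illustration; the relay and the faucet may need a different path or a WebSocket-aware probe.

```shell
#!/usr/bin/env bash
set -eu

# Retry a command every $2 seconds until it succeeds or about $1 seconds
# of waiting have accumulated (time spent inside the command is not counted).
wait_until() {
  local timeout="$1" interval="$2" waited=0
  shift 2
  until "$@"; do
    waited=$((waited + interval))
    [ "$waited" -le "$timeout" ] || return 1
    sleep "$interval"
  done
}

# Succeeds when https://<domain>/ answers with a non-error status.
# curl verifies the certificate chain by default, so this also checks the cert.
probe_https() {
  curl --fail --silent --max-time 5 --output /dev/null "https://$1/"
}

# Check every domain and report all failures instead of stopping at the first.
check_domains() {
  local domain failed=0
  for domain in "$@"; do
    if wait_until "${POLL_TIMEOUT:-120}" "${POLL_INTERVAL:-5}" probe_https "$domain"; then
      echo "ok $domain"
    else
      echo "FAIL $domain"
      failed=1
    fi
  done
  return "$failed"
}
```

For example: `check_domains faucet-dev.unicity.network relay-dev.unicity.network aggregator-dev.unicity.network`, once the domain names are confirmed.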

Related context

  • Companion app PRs that consume these endpoints once they exist:
    • sphere#320 — wire dev oracle override (oracle.url, skipVerification) into SphereProvider
    • sphere#321 — extend to all five remaining endpoints (Nostr, IPFS, Faucet, Market, plus the existing aggregator)
    • sphere#322 — Settings modal GUI surface for the same overrides
    • sphere#323 — BaseModal z-index fix so the modal's footer isn't occluded by the bottom nav
  • sphere-sdk PR #320, "fix(profile)(#319): auto-clear BLOCKED on successful pointer poll for transient-connectivity reasons": BLOCKED auto-clear on transient connectivity (the 2026-05-28 outage motivator)
  • tests/e2e/local-infra/global-setup.ts — existing local-infra entrypoint to extend
  • /home/vrogojin/ssl-manager/INTEGRATION.md (and examples/run-fulcrum.sh) — reference for the wrap pattern
  • /home/vrogojin/ipfs-storage/ — closest complete working example
