63 changes: 63 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,66 @@

All notable changes to `@unicitylabs/infra-probe` are documented in this file. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and the project adheres to [Semantic Versioning](https://semver.org/).

## [0.5.0] — 2026-05-24

### Changed (breaking)

- **Faucet probe rewritten to do an actual end-to-end mint-and-verify.** Previous versions sent a deliberately invalid nametag and accepted the faucet's `Nametag not found` rejection as proof of life. That confirms the HTTP request/parse/resolve pipeline is up, but it cannot catch a faucet bug where the HTTP layer accepts a real mint request, returns `success:true`, and yet no token is ever delivered. The new probe drives the full happy path:
1. Spin up an ephemeral Sphere wallet (in-memory storage adapter — still no disk writes), mint a single-use nametag on the L3 aggregator, publish the kind:30078 binding on the Nostr relay.
2. POST `/api/v1/faucet/request` for 1 raw unit (1e-6) of USDU. Capture `data.amountInSmallestUnits` and `data.requestId` from the response.
3. Subscribe to `transfer:incoming` on the wallet and wait up to 10 s for a kind:31113 token-transfer event addressed to our pubkey. The SDK handles NIP-04 decryption + Token deserialization.
4. Assert the delivered token's `coinId == USDU` and `amount == amountInSmallestUnits` from the HTTP response — independent proof the mint actually landed.
- **Check names changed**: `request` + `health` → `wallet-setup` + `request` + `receipt`. All three are `critical:true`, so a failure on any one of them drives the verdict to `unreachable`. Existing JSON consumers that filter on the previous check names will see different keys and need to update.
- **Faucet probe now depends on `@unicitylabs/sphere-sdk` (≈ 74 MB install).** This violates the project's "Minimal dependencies" hard rule, accepted as a scoped exception because re-implementing the wallet+token+nametag stack from scratch would be ~300 LoC of mirrored SDK code touching half a dozen wire formats. The other five probes still operate on raw formats with only `ws` + `@noble/curves`. Rationale and trade-offs recorded in [CLAUDE.md](./CLAUDE.md#the-faucet-exception).
- **End-to-end faucet probe wall-clock is now ~8–12 s** (up from <500 ms). Pre-flight gates that pass a tight `--timeout` may need to relax it; the orchestration layer auto-bumps the faucet's per-probe ceiling to at least 30 s for this reason.
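
Steps 3 and 4 of the new probe amount to a bounded wait followed by a field-by-field comparison. The following is a minimal sketch of that pattern, not the probe's actual code. It assumes a hypothetical `subscribe(handler)` function standing in for the wallet's `transfer:incoming` subscription; `waitForTransfer`, `verifyReceipt`, and the `expectedCoinId` parameter are illustrative names, and the token is assumed to expose `coinId` and `amount` fields as in step 4.

```js
// Hypothetical helper: resolve with the first delivered token, or reject
// once the receipt budget (10 s in step 3) has elapsed.
function waitForTransfer(subscribe, timeoutMs = 10_000) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`no transfer received within ${timeoutMs} ms`));
    }, timeoutMs);
    // `subscribe` is an assumed stand-in for the SDK's event subscription.
    subscribe((token) => {
      clearTimeout(timer);
      resolve(token);
    });
  });
}

// Hypothetical helper: compare the delivered token against the HTTP response.
// BigInt avoids precision loss if amounts arrive as strings or large numbers.
function verifyReceipt(response, token, expectedCoinId = 'USDU') {
  if (token.coinId !== expectedCoinId) {
    return { name: 'receipt', status: 'fail', critical: true,
             message: `unexpected coinId ${token.coinId}` };
  }
  if (BigInt(token.amount) !== BigInt(response.amountInSmallestUnits)) {
    return { name: 'receipt', status: 'fail', critical: true,
             message: `amount mismatch: got ${token.amount}` };
  }
  return { name: 'receipt', status: 'pass', critical: true,
           message: 'delivered token matches the faucet response' };
}
```

Keeping the comparison separate from the wait means the comparison can be unit-tested without a network, in the same spirit as the existing network-free smoke tests.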

### Added

- **Per-run cost (intentional):** the faucet probe now leaves one kind:30078 event on the Nostr relay + one nametag NFT on the L3 aggregator + consumes 1 USDU raw unit (1e-6 USDU) from faucet quota per run. Documented in CLAUDE.md "Stateless on the relay/gateway side, with one exception". Pace probe runs accordingly.

### Why

A live testnet run on 2026-05-23 showed the previous faucet probe reporting `✓ request "Nametag not found"` — visually confusing to operators (the success tick next to an error-shaped message reads as a contradiction) AND non-authoritative (it didn't prove the faucet's send path actually delivers tokens). The new probe makes both clear: a healthy verdict means a real mint, signed by the faucet's pubkey, landed in a probe-controlled wallet within the receipt budget.

## [0.4.2] — 2026-05-20

### Added

- **`local` network** in `NETWORKS` — endpoints for the hermetic local docker-compose stack at `tests/e2e/local-infra/` in sphere-sdk: aggregator `http://127.0.0.1:3001`, Nostr relay `ws://127.0.0.1:7777`, IPFS gateway `http://127.0.0.1:8082`. Fulcrum and Market still point at testnet (no local counterpart yet). Faucet is `null` because the local faucet is DM-driven (no HTTP /request endpoint), so the existing HTTP probe doesn't apply.
- Use: `unicity-infra-probe --network local --only aggregator,ipfs,nostr` to verify a local stack is healthy.

### Fixed

- **Aggregator `health` check false-negative against the standalone `BFT_ENABLED=false` mode.** The parser only accepted `{"status":"healthy", "database":"ok"}` (BFT mode) and rejected the standalone shape `{"status":"ok", "role":"standalone", "details":{"database":"connected"}}` as `degraded` even when the aggregator was fully functional. Now the parser accepts both shapes; the verdict drops to `degraded` only when the body genuinely reports unhealthy state.
- **Why two shapes:** the production aggregator runs behind BFT consensus and reports the per-shard / per-database matrix. The standalone aggregator (used by sphere-sdk's e2e local stack via `BFT_ENABLED=false`) reports a flat `{status:'ok', role:'standalone'}` because there are no shards or peer aggregators to enumerate. Both are correct for their mode.
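
A parser that accepts both shapes could look roughly like the sketch below. This is an illustration under assumptions, not the code in the aggregator probe: the function name `classifyHealthBody` is hypothetical, and requiring the `database` fields exactly as shown in the two example bodies is a guess at how strict the real parser is.

```js
// Sketch only: classify an aggregator /health body in either reported shape.
function classifyHealthBody(body) {
  if (body === null || typeof body !== 'object') {
    return { status: 'fail', message: 'health body is not a JSON object' };
  }
  // BFT mode: {"status":"healthy","database":"ok"}
  const bftOk = body.status === 'healthy' && body.database === 'ok';
  // Standalone mode:
  // {"status":"ok","role":"standalone","details":{"database":"connected"}}
  const standaloneOk =
    body.status === 'ok' &&
    body.role === 'standalone' &&
    body.details?.database === 'connected';
  if (bftOk || standaloneOk) {
    const mode = standaloneOk ? 'standalone' : 'bft';
    return { status: 'pass', message: `healthy (${mode} mode)` };
  }
  // Anything else is treated as a genuine report of unhealthy state.
  return { status: 'fail', message: `unhealthy body: ${JSON.stringify(body)}` };
}
```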

## [0.4.1] — 2026-05-20

### Fixed

- **Aggregator false-negative — broken write path reported as `degraded` (exit 1) instead of `unreachable` (exit 2).** A real testnet outage (HTTP 401 Unauthorized on `submit_commitment`) was classified as merely "degraded" because the verdict counted ALL failed checks uniformly: one fail → degraded, two fails → unreachable. The canonical functional check was no more important than a diagnostic JSON-RPC plane probe. Downstream e2e pre-flight gates that only fired on exit code 2 silently let test suites run against an aggregator that couldn't accept commitments, producing 35 phantom `Submit failed: [object Object]` failures on the SDK side (sphere-sdk issue #191 is the companion fix that makes those failures legible).
- **Fix:** `submit_commitment` and `get_inclusion_proof` checks now carry a `critical: true` field. The aggregator probe's verdict computation has been extracted into a pure `computeAggregatorStatus(checks)` function and exported for testability. The rule: any critical-check fail → `unreachable`, regardless of how many liveness checks pass. The legacy "≥ 2 fails → unreachable" rule is preserved for non-critical checks.
- **Test coverage:** 8 new network-free unit tests pin the rule in `tests/smoke.test.mjs` — healthy path, single-fail degraded, multi-fail unreachable, each critical check independently, explicit `critical: false`, and the warn-vs-critical-fail interaction.

### Notes

- The `critical: true` marker on a check is the design language going forward for "this check IS the gate". Other probes (nostr publish, ipfs add+fetch roundtrip, fulcrum tx-index) may adopt the same pattern in follow-up work where their functional-check failures should similarly drive `unreachable`. This release scopes the change tight to the reported regression.
- The aggregator probe's `get_inclusion_proof` check is currently too lenient: it accepts any `result` object as success, including responses that don't actually prove the just-submitted commitment landed. That leniency is a separate problem worth tightening (see the live-probe output: a check that reports success while `submit_commitment` fails with HTTP 401 is unreliable). Deferred to a follow-up commit since the user-visible classification is already fixed.

## [0.4.0] — 2026-05-04

### Added

- **`faucet` probe** for the Unicity test faucet (`https://faucet.unicity.network`). The faucet issues real test tokens to e2e wallets identified by registered nametag; many SDK e2e suites (`uxf-send-receive`, `pointer-roundtrip`, `migrate-to-profile-conservation`, `profile-export-roundtrip`) silently time out at the wallet-funding step when the faucet is down. The probe converts a 240 s "Faucet top-up timed out" into a 1-2 s clean SKIP with a precise "faucet unreachable" message.
- **`request` check** — POST `/api/v1/faucet/request` with a deliberately-invalid probe nametag. Healthy faucet responds with HTTP 4xx + structured `{success: false, error: "Nametag not found: …"}`. Exercises the full HTTP/parse/nametag-resolve/response-shaping pipeline WITHOUT consuming actual faucet quota.
- **`health` check** — best-effort GET `/health`. Absence is not a failure; presence cross-checks backend pressure.
- **`NETWORKS[*].faucet`** field per network. `null` on mainnet (no faucet by design), `https://faucet.unicity.network` on testnet and dev. The probe layer treats `null` as a clean skip with verdict `healthy` — cleaner than emitting a misleading "unreachable" verdict against a default URL that doesn't apply.
- **`SERVICES`** updated to 6 entries: `['nostr', 'aggregator', 'ipfs', 'fulcrum', 'market', 'faucet']`. Stable ordering preserved; `faucet` appended at the end so existing `--only` invocations are unaffected.

### Notes

- The probe nametag (`infra-probe-do-not-mint-zk7q3xa9p2v`) is deliberately synthetic and unlikely to collide with a real user's nametag. If the faucet ever returns `success: true` for the probe call (i.e., MINTS tokens to the probe nametag), the verdict downgrades to `degraded` with a clear "validation may be broken" message — that's a real signal worth surfacing.
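
The classification described for the 0.4.0 `request` check can be sketched as follows. The function name `classifyProbeNametagResponse` is hypothetical, and mapping the unexpected `success: true` case to a `warn` check (which then yields a `degraded` verdict) is an assumption about how the downgrade is expressed.

```js
// Sketch only: interpret the faucet's answer to the synthetic probe nametag.
function classifyProbeNametagResponse(httpStatus, body) {
  // A mint to the synthetic nametag means validation let a bogus request through.
  if (body?.success === true) {
    return { status: 'warn',
             message: 'faucet minted to the probe nametag; validation may be broken' };
  }
  // Expected healthy answer: HTTP 4xx with a structured "Nametag not found" error.
  const rejectedAsExpected =
    httpStatus >= 400 && httpStatus < 500 &&
    body?.success === false &&
    typeof body.error === 'string' &&
    body.error.startsWith('Nametag not found');
  if (rejectedAsExpected) {
    return { status: 'pass', message: body.error };
  }
  return { status: 'fail', message: `unexpected response (HTTP ${httpStatus})` };
}
```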

## [0.3.0] — 2026-05-03

First publishable release. Three rounds of probe-correctness work since the initial cut, plus documentation and packaging hardening.
@@ -42,6 +102,9 @@ Functional probes added (write+read+verify across all 5 services). Not published

Initial release. Liveness-only probes for nostr, aggregator, ipfs, fulcrum. Pretty + JSON output. Documented exit codes. Not published to npm.

[0.4.2]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.4.2
[0.4.1]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.4.1
[0.4.0]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.4.0
[0.3.0]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.3.0
[0.2.0]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.2.0
[0.1.0]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.1.0
61 changes: 54 additions & 7 deletions CLAUDE.md
@@ -1,4 +1,6 @@
# CLAUDE.md — Project context for AI coding agents
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

This file gives an AI agent everything it needs to work on `@unicitylabs/infra-probe` without re-deriving the design from scratch. Read this before making any non-trivial changes.

@@ -11,9 +13,26 @@ This file gives an AI agent everything it needs to work on `@unicitylabs/infra-p
- **IPFS gateway** (Kubo HTTP API + `/ipfs/*` path)
- **L1 Fulcrum** (Electrum-protocol over WSS, the Unicity ALPHA blockchain front)
- **Market / Intent database** (semantic-search REST API)
- **Faucet** (HTTP `/request` endpoint — testnet/dev only)

It is the canonical pre-flight gate for any e2e test suite that hits the live testnet/mainnet, and a hand-tool smoke test when something feels off.

## Common commands

```sh
npm test # full smoke suite (network-free)
node --test tests/smoke.test.mjs # same, explicit
node --test --test-name-pattern='computeAggregatorStatus' tests/smoke.test.mjs # single test
npm start # probe testnet (default)
npm run probe:testnet # explicit testnet
npm run probe:mainnet # explicit mainnet
npm run probe:json # JSON output
node ./bin/unicity-infra-probe.mjs --network local --only nostr,aggregator,ipfs # local docker-compose stack
node ./bin/unicity-infra-probe.mjs --help # full CLI surface
```

Networks: `mainnet`, `testnet` (default), `dev`, `local`. The `local` profile points at the docker-compose stack in sphere-sdk's e2e harness (`E2E_FULL_LOCAL_STACK=1`); only `nostr`, `aggregator`, and `ipfs` have local counterparts — the rest still point at testnet so a single command can surface "your local stack is fine; the public service it depends on isn't."

## What it explicitly is *not*

- Not a continuous-monitoring system (use Grafana/Prometheus for that).
@@ -24,10 +43,23 @@ It is the canonical pre-flight gate for any e2e test suite that hits the live te
## Hard rules

- **No build step.** Source files are runnable Node.js ESM. No TypeScript, no transpiler, no bundler. If you find yourself wanting one, stop and reconsider — it'd compromise the "clone-and-run-anywhere" property.
- **Minimal dependencies.** Only `ws` and `@noble/curves` are allowed. Adding a third dependency requires a strong justification in the commit message.
- **No SDK coupling.** The probe must NOT import from `@unicitylabs/sphere-sdk` or `@unicitylabs/state-transition-sdk`. The wire formats those SDKs use are reverse-engineered into this repo's source so the probe stays independent of SDK release cycles. If a wire format changes upstream, mirror it here in plain code with a comment pointing back to the SDK source.
- **Network-only — no local state.** The probe never writes to disk, never reads config files. Every input is CLI args + env vars. Every output is stdout/stderr.
- **Stateless on the relay/gateway side too.** Probes use ephemeral keypairs, `?pin=false` for IPFS adds, etc., so a successful probe leaves no persisted artifact on the upstream service.
- **Minimal dependencies, with one exception.** The default-allowed deps are `ws` and `@noble/curves`. The faucet probe additionally depends on `@unicitylabs/sphere-sdk` (and its ~40 transitive deps) because the only way to verify a real mint round-trips end-to-end is to drive the full wallet path — see "The faucet exception" below. Adding any other third dependency requires a strong justification in the commit message.
- **No SDK coupling, with one exception.** Probes must NOT import from `@unicitylabs/sphere-sdk` or `@unicitylabs/state-transition-sdk`. The wire formats those SDKs use are reverse-engineered into this repo's source so the probe stays independent of SDK release cycles. **Exception:** `src/probes/faucet.mjs` uses sphere-sdk directly — see "The faucet exception" below.
- **Network-only — no local state.** The probe never writes to disk, never reads config files. Every input is CLI args + env vars. Every output is stdout/stderr. The faucet probe upholds this even while using sphere-sdk (it supplies an `InMemoryStorage` adapter to satisfy the SDK's `StorageProvider` interface without touching the filesystem; see the class at the bottom of `src/probes/faucet.mjs`).
- **Stateless on the relay/gateway side too, with one exception.** Probes use ephemeral keypairs, `?pin=false` for IPFS adds, etc., so a successful probe leaves no persisted artifact on the upstream service. **Exception:** the faucet probe leaves a kind:30078 nametag-binding event on the Nostr relay and a nametag NFT on the L3 aggregator per run. Both are accepted costs of end-to-end verification.

### The faucet exception

The faucet has no probe-only mode and no direct-pubkey path — it requires a `unicityId` (nametag) that resolves to a registered identity. Verifying that a real mint actually lands therefore requires the probe to *be* a (one-shot) Unicity wallet: generate a mnemonic, mint a nametag on the L3 aggregator, publish the kind:30078 binding event on the relay, issue the faucet request against that nametag, and wait for the kind:31113 transfer event to arrive.

Re-implementing that from scratch is ~300 LoC of mirrored SDK code touching half a dozen wire formats (RequestId, Authenticator, NIP-04 ECDH+AES, Token deserialization, TokenCoinData parsing, ...). Pulling in `@unicitylabs/sphere-sdk` is the pragmatic alternative even though it's a ~74 MB install with ~40 transitive deps. The faucet probe is the ONE place this trade-off is taken. All other probes (`nostr`, `aggregator`, `ipfs`, `fulcrum`, `market`) MUST keep operating on raw wire formats with only `ws` + `@noble/curves`.

Cost per faucet probe run, for downstream operators to factor into how often they run it:
- One nametag NFT minted on the L3 aggregator (unreclaimed)
- One USDU raw unit (1e-6 USDU) consumed from faucet quota
- One kind:30078 event left on the Nostr relay

If the faucet ever grows a probe-only mode (no nametag resolve required, or a no-quota coin), revisit this exception and pull sphere-sdk back out.

## Folder layout (canonical — don't reorganise without strong reason)

@@ -52,7 +84,8 @@ unicity-infra-probe/
│ ├── aggregator.mjs
│ ├── ipfs.mjs
│ ├── fulcrum.mjs
│ └── market.mjs
│ ├── market.mjs
│ └── faucet.mjs
└── tests/
└── smoke.test.mjs # Node --test runner; CI-friendly, no network
```
@@ -65,7 +98,7 @@ Every probe function returns this shape (extra service-specific fields are allow

```js
{
service: string, // 'nostr' | 'aggregator' | 'ipfs' | 'fulcrum' | 'market' | ...
service: string, // 'nostr' | 'aggregator' | 'ipfs' | 'fulcrum' | 'market' | 'faucet' | ...
endpoint: string, // human-readable URL
status: 'healthy' | 'degraded' | 'unreachable' | 'error',
latencyMs: number, // overall probe wall-clock
@@ -75,6 +108,7 @@ Every probe function returns this shape (extra service-specific fields are allow
status: 'pass' | 'fail' | 'warn',
latencyMs?: number,
message?: string,
critical?: boolean, // see "Critical checks" below
... // service-specific fields fine (e.g. eventCount)
}
],
@@ -84,6 +118,10 @@ Every probe function returns this shape (extra service-specific fields are allow
}
```

### Critical checks (verdict-driving)

Some checks are flagged `critical: true` because their failure means the service is effectively unusable for its primary purpose, even if other checks pass. The canonical example is the aggregator: an HTTP 401 on `submit_commitment` while `health` returns 200 is **unreachable**, not degraded — wallets cannot transact. The legacy "count fails" rule is preserved as a fallback (≥2 fails → unreachable), but any single critical fail short-circuits to `unreachable`. See `computeAggregatorStatus` in `src/probes/aggregator.mjs` and the pinned tests in `tests/smoke.test.mjs` (`computeAggregatorStatus: *`). The motivating incident was sphere-sdk #191, where a degraded verdict let an e2e gate pass against a broken write path.
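
The rule above fits in a few lines. The sketch below is an illustration of the stated rule, not a copy of `computeAggregatorStatus`; the name `sketchAggregatorStatus` is hypothetical, and it ignores `warn` checks, whose effect on the verdict is not described in this section.

```js
// Sketch only: derive a verdict from a list of checks.
function sketchAggregatorStatus(checks) {
  const fails = checks.filter((c) => c.status === 'fail');
  // Any critical failure short-circuits, no matter how many checks pass.
  if (fails.some((c) => c.critical === true)) return 'unreachable';
  // Legacy fallback for non-critical checks: two or more fails.
  if (fails.length >= 2) return 'unreachable';
  if (fails.length === 1) return 'degraded';
  return 'healthy';
}
```

Because the function is pure (checks in, verdict out), it can be pinned by network-free unit tests.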

The status enum and check fields are **shape-stable public API**. Don't rename or repurpose them without bumping the major version.

## Verdict logic
@@ -124,6 +162,15 @@ When in doubt, **probe with raw cURL or a minimal raw-WebSocket script first** t

Latency thresholds are inherently per-service. A 3 s GET is degraded; a 3 s semantic-search call is healthy. Don't apply uniform thresholds across services. Each probe owns its own threshold logic; document the reasoning inline (see the 10 s threshold for `search` in `src/probes/market.mjs`).
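
One way to keep thresholds per-probe is a shared comparison helper that each probe calls with its own constant. The sketch below is illustrative: the helper name `latencyStatus` and both constant names are hypothetical, the 2 s GET budget is an assumed value chosen only so that a 3 s GET reads as slow, and reporting a slow call as a `warn` check is likewise an assumption. Only the 10 s `search` budget comes from the text above.

```js
// Sketch only: one comparison helper, but each probe passes its own budget.
function latencyStatus(latencyMs, slowAfterMs) {
  return latencyMs > slowAfterMs ? 'warn' : 'pass';
}

// Assumed value for illustration: a plain GET is expected to be fast.
const GET_SLOW_AFTER_MS = 2_000;
// From the text above: semantic search is allowed up to 10 s.
const SEARCH_SLOW_AFTER_MS = 10_000;

// The same 3 s measurement is classified differently per service.
const getCheck = latencyStatus(3_000, GET_SLOW_AFTER_MS);
const searchCheck = latencyStatus(3_000, SEARCH_SLOW_AFTER_MS);
```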

## Network-specific opt-outs (`null` endpoints)

A network may legitimately have no counterpart for a given service. Examples in `src/networks.mjs`:

- `mainnet.faucet = null` — mainnet has no faucet by design.
- `local.faucet = null` — the local stack uses a DM-driven faucet (NIP-04 over the local relay), not HTTP.

The orchestration layer (`runProbes` in `src/index.mjs`) interprets `null` as "skip this probe cleanly" — it emits a synthetic `healthy` result with a `config` check explaining the skip, rather than running the HTTP probe against a wrong URL and reporting a misleading `unreachable`. The smoke test `every NETWORKS entry has the required endpoint set` enforces that `faucet` is **declared** in every network (so the field is never silently missing), while accepting `null` as a valid value. When adding a new optional service, follow this pattern — don't default to a URL that doesn't apply.
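
The skip behaviour can be sketched as a thin wrapper around a probe function. This is a simplified illustration, not the code in `runProbes`: the wrapper name `runProbeOrSkip`, the `'(none)'` endpoint placeholder, and the `name` field on the check are assumptions.

```js
// Sketch only: run a probe, or emit a synthetic healthy result for a null endpoint.
async function runProbeOrSkip(service, endpoint, probeFn) {
  if (endpoint === null) {
    return {
      service,
      endpoint: '(none)', // placeholder; the real value is an assumption
      status: 'healthy',
      latencyMs: 0,
      checks: [
        {
          name: 'config',
          status: 'pass',
          message: `no ${service} endpoint on this network; probe skipped`,
        },
      ],
    };
  }
  // A declared endpoint is probed normally.
  return probeFn(endpoint);
}
```

Checking for `null` explicitly, rather than for any falsy value, keeps a missing (`undefined`) field distinguishable from a deliberate opt-out, which matches the smoke test's requirement that the field always be declared.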

## Commit messages

Conventional Commits, scope = service or area: