29 changes: 29 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,33 @@

All notable changes to `@unicitylabs/infra-probe` are documented in this file. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and the project adheres to [Semantic Versioning](https://semver.org/).

## [0.4.1] — 2026-05-20

### Fixed

- **Aggregator false-negative — broken write path reported as `degraded` (exit 1) instead of `unreachable` (exit 2).** A real testnet outage (HTTP 401 Unauthorized on `submit_commitment`) was classified as merely "degraded" because the verdict counted ALL failed checks uniformly: one fail → degraded, two fails → unreachable. The canonical functional check was no more important than a diagnostic JSON-RPC plane probe. Downstream e2e pre-flight gates that only fired on exit code 2 silently let test suites run against an aggregator that couldn't accept commitments, producing 35 phantom `Submit failed: [object Object]` failures on the SDK side (sphere-sdk issue #191 is the companion fix that makes those failures legible).
- **Fix:** `submit_commitment` and `get_inclusion_proof` checks now carry a `critical: true` field. The aggregator probe's verdict computation has been extracted into a pure `computeAggregatorStatus(checks)` function and exported for testability. The rule: any critical-check fail → `unreachable`, regardless of how many liveness checks pass. The legacy "≥ 2 fails → unreachable" rule is preserved for non-critical checks.
- **Test coverage:** 8 new network-free unit tests pin the rule in `tests/smoke.test.mjs` — healthy path, single-fail degraded, multi-fail unreachable, each critical check independently, explicit `critical: false`, and the warn-vs-critical-fail interaction.
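
The verdict rule described in this entry can be sketched as a small pure function. This is a minimal illustration that follows the rule as stated here; the check names `http_reachable` and `jsonrpc_plane` in the usage lines are hypothetical placeholders, not names taken from the probe.

```javascript
// Sketch of the verdict rule: a failed critical check dominates;
// otherwise the verdict depends on how many checks failed or were slow.
function computeAggregatorStatus(checks) {
  const criticalFail = checks.some((c) => c.critical === true && c.status === 'fail');
  if (criticalFail) return 'unreachable';
  const failed = checks.filter((c) => c.status === 'fail').length;
  const slow = checks.filter((c) => c.status === 'warn').length;
  if (failed >= 2) return 'unreachable';
  if (failed > 0 || slow > 0) return 'degraded';
  return 'healthy';
}

// Hypothetical check list resembling the reported outage:
// liveness passes, but the write path is rejected.
const outage = [
  { name: 'http_reachable', status: 'pass' },
  { name: 'jsonrpc_plane', status: 'pass' },
  { name: 'submit_commitment', status: 'fail', critical: true },
  { name: 'get_inclusion_proof', status: 'pass', critical: true },
];

console.log(computeAggregatorStatus(outage)); // unreachable
```

Under the previous uniform counting, the same input would have had a single fail and therefore produced `degraded`.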

### Notes

- The `critical: true` marker on a check is the design language going forward for "this check IS the gate". Other probes (nostr publish, ipfs add+fetch roundtrip, fulcrum tx-index) may adopt the same pattern in follow-up work where their functional-check failures should similarly drive `unreachable`. This release scopes the change tightly to the reported regression.
- The aggregator probe's `get_inclusion_proof` check is currently too lenient: it accepts any `result` object as success, including responses that don't actually prove the just-submitted commitment landed. That looseness is a separate issue worth tightening (see the live-probe output: a check that returns success while `submit_commitment` fails with HTTP 401 is unreliable). Deferred to a follow-up commit since the user-visible classification is already fixed.
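
For pre-flight gates that consume the verdict, one way to make the blocking policy explicit is sketched below. The helper names `exitCodeFor` and `shouldBlock` are hypothetical and not part of this package, and the `healthy` = 0 mapping is an assumption; this changelog only states `degraded` = exit 1 and `unreachable` = exit 2.

```javascript
// Hypothetical helpers, not part of infra-probe's API.
// degraded=1 and unreachable=2 come from the changelog; healthy=0 is assumed.
const EXIT_CODES = { healthy: 0, degraded: 1, unreachable: 2 };

function exitCodeFor(status) {
  const code = EXIT_CODES[status];
  if (code === undefined) throw new Error(`unknown status: ${status}`);
  return code;
}

// strict=false blocks only on unreachable (exit 2);
// strict=true also blocks on degraded (exit 1).
function shouldBlock(status, { strict = false } = {}) {
  const code = exitCodeFor(status);
  return strict ? code >= 1 : code >= 2;
}

console.log(shouldBlock('degraded')); // false
console.log(shouldBlock('degraded', { strict: true })); // true
console.log(shouldBlock('unreachable')); // true
```

A gate that blocks only on exit code 2 relies on the probe classifying write-path failures as `unreachable`, which is the behaviour this release fixes.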

## [0.4.0] — 2026-05-04

### Added

- **`faucet` probe** for the Unicity test faucet (`https://faucet.unicity.network`). The faucet issues real test tokens to e2e wallets identified by registered nametag; many SDK e2e suites (`uxf-send-receive`, `pointer-roundtrip`, `migrate-to-profile-conservation`, `profile-export-roundtrip`) silently time out at the wallet-funding step when the faucet is down. The probe converts a 240 s "Faucet top-up timed out" into a 1-2 s clean SKIP with a precise "faucet unreachable" message.
- **`request` check** — POST `/api/v1/faucet/request` with a deliberately-invalid probe nametag. Healthy faucet responds with HTTP 4xx + structured `{success: false, error: "Nametag not found: …"}`. Exercises the full HTTP/parse/nametag-resolve/response-shaping pipeline WITHOUT consuming actual faucet quota.
- **`health` check** — best-effort GET `/health`. Absence is not a failure; presence cross-checks backend pressure.
- **`NETWORKS[*].faucet`** field per network. `null` on mainnet (no faucet by design), `https://faucet.unicity.network` on testnet and dev. The probe layer treats `null` as a clean skip with verdict `healthy` — cleaner than emitting a misleading "unreachable" verdict against a default URL that doesn't apply.
- **`SERVICES`** updated to 6 entries: `['nostr', 'aggregator', 'ipfs', 'fulcrum', 'market', 'faucet']`. Stable ordering preserved; `faucet` appended at the end so existing `--only` invocations are unaffected.
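
The `null`-faucet handling can be sketched as follows. This is an illustration of the skip behaviour described above, not the shipped code: `skippedFaucetResult` and `resolveFaucetTask` are hypothetical names, and the probe function is passed in as a parameter so the sketch stays network-free.

```javascript
// Hypothetical helper: the verdict reported when a network has no faucet.
function skippedFaucetResult(network) {
  return {
    service: 'faucet',
    endpoint: `(none for ${network})`,
    status: 'healthy',
    latencyMs: 0,
    checks: [{
      name: 'config',
      status: 'pass',
      latencyMs: 0,
      message: `network ${network} has no faucet by design; skipped`,
    }],
    timestamp: new Date().toISOString(),
  };
}

// Hypothetical helper: returns a thunk that either runs the supplied probe
// against the configured faucet URL or resolves to the clean-skip verdict.
function resolveFaucetTask(cfg, network, probeFaucet, { timeoutMs = 20000 } = {}) {
  if (cfg.faucet) return () => probeFaucet(cfg.faucet, { timeoutMs });
  return () => Promise.resolve(skippedFaucetResult(network));
}
```

Reporting `healthy` for the skip keeps a mainnet run from counting a service that does not exist there as a failure.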

### Notes

- The probe nametag (`infra-probe-do-not-mint-zk7q3xa9p2v`) is deliberately synthetic and unlikely to collide with a real user's nametag. If the faucet ever returns `success: true` for the probe call (i.e., MINTS tokens to the probe nametag), the verdict downgrades to `degraded` with a clear "validation may be broken" message — that's a real signal worth surfacing.
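
The classification of the `request` check's response can be sketched as below. This is a minimal sketch under stated assumptions: the function name `classifyFaucetResponse` is hypothetical, and mapping the `success: true` case to a `warn` check (to yield the `degraded` verdict) is an assumption about how the downgrade is implemented.

```javascript
// Hypothetical classifier for the `request` check.
// Takes the HTTP status code and the parsed JSON body (or null if unparseable).
function classifyFaucetResponse(httpStatus, body) {
  // The probe nametag must never be funded: success suggests validation is broken.
  if (body && body.success === true) {
    return { status: 'warn', message: 'faucet accepted the probe nametag; validation may be broken' };
  }
  // Expected healthy shape: HTTP 4xx with a structured rejection.
  const is4xx = httpStatus >= 400 && httpStatus < 500;
  if (is4xx && body && body.success === false && typeof body.error === 'string') {
    return { status: 'pass', message: `cleanly rejected probe nametag ("${body.error}")` };
  }
  // Anything else (5xx, missing body, wrong shape) is treated as a failure.
  return { status: 'fail', message: `unexpected response (HTTP ${httpStatus})` };
}

console.log(classifyFaucetResponse(404, { success: false, error: 'Nametag not found' }).status); // pass
console.log(classifyFaucetResponse(200, { success: true }).status); // warn
console.log(classifyFaucetResponse(502, null).status); // fail
```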

## [0.3.0] — 2026-05-03

First publishable release. Three rounds of probe-correctness work since the initial cut, plus documentation and packaging hardening.
@@ -42,6 +69,8 @@ Functional probes added (write+read+verify across all 5 services). Not published

Initial release. Liveness-only probes for nostr, aggregator, ipfs, fulcrum. Pretty + JSON output. Documented exit codes. Not published to npm.

[0.4.1]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.4.1
[0.4.0]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.4.0
[0.3.0]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.3.0
[0.2.0]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.2.0
[0.1.0]: https://github.com/unicitynetwork/infra-probe/releases/tag/v0.1.0
9 changes: 7 additions & 2 deletions README.md
@@ -2,7 +2,7 @@

Tiny availability + performance probe for [Unicity Network](https://unicity.network) infrastructure.

-Designed as a **pre-flight gate** for end-to-end test suites and a **5-second smoke test** when something feels off. Runs five parallel probes — Nostr relay, L3 Aggregator, IPFS gateway, L1 Fulcrum, and the Market intent database — exercises both the **liveness** of each endpoint and the **functional write+read+verify path** real wallet flows depend on, then reports per-check latency in either a colored human-readable format or single-line JSON.
+Designed as a **pre-flight gate** for end-to-end test suites and a **5-second smoke test** when something feels off. Runs six parallel probes — Nostr relay, L3 Aggregator, IPFS gateway, L1 Fulcrum, the Market intent database, and the test Faucet — exercises both the **liveness** of each endpoint and the **functional write+read+verify path** real wallet flows depend on, then reports per-check latency in either a colored human-readable format or single-line JSON.

```
✅ aggregator https://goggregator-test.unicity.network
@@ -47,7 +47,12 @@ Designed as a **pre-flight gate** for end-to-end test suites and a **5-second sm
✓ feed-recent 1016ms 10 listing(s) returned (1016ms)
Status: HEALTHY (2/2 checks passed)

-Summary: 5 HEALTHY, 0 DEGRADED, 0 UNREACHABLE (of 5)
+✅ faucet https://faucet.unicity.network
+✓ request 168ms cleanly rejected probe-nametag (168ms; "Nametag not found: …")
+✓ health 13ms HTTP 200 (13ms)
+Status: HEALTHY (2/2 checks passed)
+
+Summary: 6 HEALTHY, 0 DEGRADED, 0 UNREACHABLE (of 6)
```

## Install
5 changes: 3 additions & 2 deletions package.json
@@ -1,7 +1,7 @@
{
"name": "@unicitylabs/infra-probe",
-"version": "0.3.0",
-"description": "Availability + performance probe for Unicity Network infrastructure (Nostr relay, Aggregator, IPFS gateway, L1 Fulcrum). Pre-flight check for e2e tests; CI-friendly JSON output.",
+"version": "0.4.1",
+"description": "Availability + performance probe for Unicity Network infrastructure (Nostr relay, Aggregator, IPFS gateway, L1 Fulcrum, Market, Faucet). Pre-flight check for e2e tests; CI-friendly JSON output.",
"type": "module",
"license": "MIT",
"author": "Unicity Network <https://unicity.network>",
@@ -20,6 +20,7 @@
"nostr",
"ipfs",
"fulcrum",
"faucet",
"monitoring",
"preflight"
],
21 changes: 20 additions & 1 deletion src/index.mjs
@@ -14,8 +14,9 @@ import { probeAggregator } from './probes/aggregator.mjs';
import { probeIpfsGateway } from './probes/ipfs.mjs';
import { probeFulcrum } from './probes/fulcrum.mjs';
import { probeMarket } from './probes/market.mjs';
import { probeFaucet } from './probes/faucet.mjs';

-export const SERVICES = ['nostr', 'aggregator', 'ipfs', 'fulcrum', 'market'];
+export const SERVICES = ['nostr', 'aggregator', 'ipfs', 'fulcrum', 'market', 'faucet'];

/**
* @param {object} options
@@ -41,6 +42,24 @@ export async function runProbes({ network = 'testnet', only, timeoutMs = 20_000,
ipfs: () => probeIpfsGateway(cfg.ipfsGateways[0], { timeoutMs }),
fulcrum: () => probeFulcrum(cfg.fulcrum, { timeoutMs }),
market: () => probeMarket(cfg.marketApi, { timeoutMs }),
// Mainnet has no faucet — `cfg.faucet` is null there. Return a
// skipped-cleanly verdict rather than running probeFaucet against
// a default URL that doesn't apply to the network being probed.
faucet: () => cfg.faucet
? probeFaucet(cfg.faucet, { timeoutMs })
: Promise.resolve({
service: 'faucet',
endpoint: '(none for ' + network + ')',
status: 'healthy',
latencyMs: 0,
checks: [{
name: 'config',
status: 'pass',
latencyMs: 0,
message: `network ${network} has no faucet by design — skipped`,
}],
timestamp: new Date().toISOString(),
}),
};

const tasks = requested.map((name) => {
6 changes: 6 additions & 0 deletions src/networks.mjs
@@ -18,6 +18,10 @@ export const NETWORKS = {
],
fulcrum: 'wss://fulcrum.unicity.network:50004',
marketApi: 'https://market-api.unicity.network',
// Mainnet has no faucet — `null` makes the probe skip cleanly when
// run with `--only faucet` against mainnet, rather than emitting a
// misleading "unreachable" verdict.
faucet: null,
},
testnet: {
label: 'Testnet',
@@ -30,6 +34,7 @@ export const NETWORKS = {
],
fulcrum: 'wss://fulcrum.unicity.network:50004',
marketApi: 'https://market-api.unicity.network',
faucet: 'https://faucet.unicity.network',
},
dev: {
label: 'Development',
@@ -42,6 +47,7 @@ export const NETWORKS = {
],
fulcrum: 'wss://fulcrum.unicity.network:50004',
marketApi: 'https://market-api.unicity.network',
faucet: 'https://faucet.unicity.network',
},
};

55 changes: 52 additions & 3 deletions src/probes/aggregator.mjs
@@ -117,6 +117,7 @@ export async function probeAggregator(url, { timeoutMs = 10_000, apiKey = DEFAUL
checks.push({
name: 'submit_commitment',
status: 'fail',
critical: true,
latencyMs,
message: json?.error
? `aggregator rejected: ${typeof json.error === 'string' ? json.error : json.error?.message ?? JSON.stringify(json.error)}`
@@ -129,12 +130,13 @@ export async function probeAggregator(url, { timeoutMs = 10_000, apiKey = DEFAUL
checks.push({
name: 'submit_commitment',
status: latencyMs > 3_000 ? 'warn' : 'pass',
critical: true,
latencyMs,
message: `accepted (status=${status}, ${latencyMs}ms)`,
});
}
} catch (err) {
-checks.push({ name: 'submit_commitment', status: 'fail', latencyMs: Date.now() - submitStart, message: errMsg(err) });
+checks.push({ name: 'submit_commitment', status: 'fail', critical: true, latencyMs: Date.now() - submitStart, message: errMsg(err) });
}

// ---- 4. get_inclusion_proof for the just-submitted commitment ----
@@ -169,23 +171,70 @@ export async function probeAggregator(url, { timeoutMs = 10_000, apiKey = DEFAUL
checks.push({
name: 'get_inclusion_proof',
status: latencyMs > 3_000 ? 'warn' : 'pass',
critical: true,
latencyMs,
message: `proof returned in ${latencyMs}ms`,
});
} else {
checks.push({
name: 'get_inclusion_proof',
status: 'fail',
critical: true,
latencyMs,
message: `no proof within ${latencyMs}ms (last: ${lastErr ?? 'unknown'})`,
});
}
}

return finalize(computeAggregatorStatus(checks));
}

/**
* Compute the aggregate verdict for the aggregator probe from its checks.
*
* Exported for unit testing the rule in isolation (network-free).
*
* **Rule** (encodes the CLAUDE.md "False-negative discipline" principle —
* functional checks are authoritative for the verdict):
*
* 1. If ANY check marked `critical: true` is `fail` → `unreachable`.
* A wallet cannot transact through this aggregator regardless of how
* many liveness checks pass. `submit_commitment` and
* `get_inclusion_proof` are the canonical critical checks: they
* exercise the read/write code path real wallets depend on. Returning
* `degraded` here would let the CLI exit code 1 (degraded) instead of
* 2 (unreachable), so any e2e pre-flight script that only gates on
* `unreachable` would miss the outage and let tests run anyway — a
* false-negative that produces noise downstream (see the issue #191
* companion report on the sphere-sdk side: 35/35 nametag-mint e2e
* tests fail against a "degraded" testnet that's actually broken at
* the write path).
*
* 2. Else if ≥ 2 non-critical checks fail → `unreachable`. Multiple
* simultaneous liveness failures suggest the service is broadly
* broken even if no single failure is critical.
*
* 3. Else if exactly 1 non-critical check fails OR any check is `warn` →
* `degraded`.
*
* 4. Else → `healthy`.
*
* The previous logic counted ALL fails uniformly (single fail = degraded,
* two fails = unreachable). That treated `submit_commitment` — the very
* write path the aggregator exists to serve — as no more important than a
* diagnostic JSON-RPC plane probe. Issue: a real 401-on-submit outage was
* reported as `degraded` (exit 1) instead of `unreachable` (exit 2), so
* the e2e preflight gate didn't fire and let the test suite run anyway,
* producing 35 phantom failures with opaque error messages.
*/
export function computeAggregatorStatus(checks) {
const criticalFail = checks.some((c) => c.critical === true && c.status === 'fail');
if (criticalFail) return 'unreachable';
const failed = checks.filter((c) => c.status === 'fail').length;
const slow = checks.filter((c) => c.status === 'warn').length;
-const status = failed > 0 ? (failed >= 2 ? 'unreachable' : 'degraded') : slow > 0 ? 'degraded' : 'healthy';
-return finalize(status);
+if (failed >= 2) return 'unreachable';
+if (failed > 0 || slow > 0) return 'degraded';
+return 'healthy';
}

// ---------------------------------------------------------------------------