Recipient silently drops uxf-cid bundles: investigate Nostr-delivery + CID-fetch path #396

@vrogojin

Symptom

When a UXF send genuinely promotes to CID-over-Nostr (uxf-cid wire payload), the recipient may silently drop the bundle: the SDK's payments.receive() returns no new transfers, no transfer:incoming event fires, no transfer:failed event fires, and the bundle never reaches the wallet. The sender, meanwhile, sees a clean Status: submitted and its wallet state evolves as if delivery succeeded.

How it was surfaced

Reproduced during the #394 STRICT-mode soak (before #394b raised RELAY_SAFE_CAP_BYTES from 96 KiB to 512 KiB) — manual-test-roundtrip-391.sh 4-hop A→B→A→B→A scenario:

  • HOP 4 bundle (~121 KB) exceeded the then-current 96 KiB cap → auto-promoted to CID.
  • Bob's CLI: ✓ Transfer successful! Status: submitted (no error).
  • Bob's wallet state correct (UCT: 0.5 = expected change after sending 98.5).
  • Alice's CLI: Checking for incoming transfers... No new transfers found.
  • Alice's balance unchanged on the UCT she should have received.

Bypassed by PR #395 — the 512 KiB cap keeps realistic multi-hop chains inline, so this code path is now exercised only by genuinely huge bundles that don't occur in normal use. But the bug remains for anyone whose chain genuinely produces a > 512 KiB bundle, and for any caller that explicitly passes delivery: { kind: 'force-cid' }.
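The promotion rule described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: `chooseWireFormat`, the `DeliveryOption` type, and the `'auto'` kind are hypothetical names, and whether the real comparison is strict (`>`) is an assumption. Only `RELAY_SAFE_CAP_BYTES`, the 96 KiB and 512 KiB values, and `delivery: { kind: 'force-cid' }` come from this issue.

```typescript
// Hypothetical model of the inline-vs-CID routing decision.
// 'auto' is an illustrative placeholder for "no explicit delivery override".
type DeliveryOption = { kind: 'auto' } | { kind: 'force-cid' };
type WireFormat = 'inline' | 'uxf-cid';

const KIB = 1024;

function chooseWireFormat(
  bundleBytes: number,
  capBytes: number, // stands in for RELAY_SAFE_CAP_BYTES
  delivery: DeliveryOption = { kind: 'auto' },
): WireFormat {
  // An explicit force-cid request always routes through CID.
  if (delivery.kind === 'force-cid') return 'uxf-cid';
  // Assumed rule: bundles above the cap are promoted to CID delivery.
  return bundleBytes > capBytes ? 'uxf-cid' : 'inline';
}

const hop4Bytes = 121000; // ~121 KB, the HOP 4 bundle from the soak

console.log(chooseWireFormat(hop4Bytes, 96 * KIB));  // old cap: promoted to CID
console.log(chooseWireFormat(hop4Bytes, 512 * KIB)); // new cap: stays inline
console.log(chooseWireFormat(1000, 512 * KIB, { kind: 'force-cid' })); // forced CID
```

Under these assumptions the same HOP 4 bundle flips from `uxf-cid` to `inline` once the cap is raised, which is why the soak no longer reaches the affected path.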

Companion log evidence

The soak log shows the at-least-once durability layer flagging many Nostr events around the same time:

[Nostr] [AT-LEAST-ONCE] TOKEN_TRANSFER 965fc075beac exhausted 3 durability replay attempts —
  advancing cursor; operator should investigate local OrbitDB/IPFS-pin/publish failures.

10+ such warnings appeared over the soak window. We don't know whether HOP 4's uxf-cid event was among them: the warning does not say which transfer attempt a given TOKEN_TRANSFER event belongs to.
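One way to check this after the fact is to correlate the warnings with a known event ID. The sketch below parses the warning format quoted above and tests whether a full event ID matches. The regex is inferred from the single sample line, and treating `965fc075beac` as a prefix of the Nostr event ID is an assumption; `parseExhaustedReplay` and `wasCursorAdvancedPast` are hypothetical helper names.

```typescript
// Regex inferred from the one sample warning in this issue (an assumption).
const AT_LEAST_ONCE_RE =
  /\[AT-LEAST-ONCE\]\s+(\w+)\s+([0-9a-f]+)\s+exhausted\s+(\d+)\s+durability replay attempts/;

interface ExhaustedReplay {
  kind: string;     // e.g. TOKEN_TRANSFER
  idPrefix: string; // assumed to be a truncated event ID
  attempts: number;
}

function parseExhaustedReplay(line: string): ExhaustedReplay | null {
  const m = AT_LEAST_ONCE_RE.exec(line);
  if (!m) return null;
  return { kind: m[1], idPrefix: m[2], attempts: Number(m[3]) };
}

// True if any warning in the log refers to the given full event ID.
function wasCursorAdvancedPast(logLines: string[], fullEventId: string): boolean {
  return logLines.some((line) => {
    const parsed = parseExhaustedReplay(line);
    return parsed !== null && fullEventId.startsWith(parsed.idPrefix);
  });
}
```

This only helps if the sender's side records the full event ID of the uxf-cid publish, which the current logs may not do.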

Two candidate sub-bugs (need diagnostic logging to disambiguate)

Sub-bug A — Nostr event delivery gap

One of the following happens on Alice's since-bounded transport subscription:

  • the subscription doesn't pick up the uxf-cid event at all, OR
  • the relay drops the event before Alice's short-lived CLI process subscribes, OR
  • the since cursor is advanced (per the at-least-once warning above) past the event before Alice's process even runs.
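The third case follows from how a Nostr `since` filter works: per NIP-01, a filter matches only events whose `created_at` is greater than or equal to `since`. A minimal model of that rule, with hypothetical names (`NostrEventLike`, `matchesSince`, `visibleEvents`) and made-up timestamps:

```typescript
// Minimal subset of a Nostr event; only the fields this model needs.
interface NostrEventLike {
  id: string;
  created_at: number; // unix seconds
}

// NIP-01 filter rule: an event matches when created_at >= since.
function matchesSince(ev: NostrEventLike, since: number): boolean {
  return ev.created_at >= since;
}

// Events a since-bounded subscription would still be able to receive.
function visibleEvents(events: NostrEventLike[], since: number): NostrEventLike[] {
  return events.filter((ev) => matchesSince(ev, since));
}

// Illustrative timestamps only, not taken from the soak log.
const hop4Event: NostrEventLike = { id: 'hop4', created_at: 1000 };

console.log(matchesSince(hop4Event, 900));  // cursor behind the event: still visible
console.log(matchesSince(hop4Event, 1001)); // cursor advanced past it: never delivered
```

If the cursor is persisted after the "exhausted 3 durability replay attempts" path runs, every later subscription starts past the event, and no amount of re-running the receive command would bring it back.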

Sub-bug B — CID-fetch silent-drop on receive

Alice's transport sees the uxf-cid event and hands it off to the recipient pipeline (IngestWorkerPool → cid-fetcher). The fetch from cidFetchGateways then fails (gateway timeout, pin not yet propagated, IPFS DAG block missing, etc.), and the failure is swallowed without surfacing as a transfer:failed event or any other operator-visible signal.

The two are not mutually exclusive — both could contribute.

Reproduction plan

  1. Force a bundle larger than RELAY_SAFE_CAP_BYTES. Easiest: pass delivery: { kind: 'force-cid' } on a 3-token chain so even a small bundle routes through CID. Alternative: extend the round-trip soak with more hops until the chain genuinely overflows 512 KiB.
  2. Run with SDK debug logging enabled at:
    • NostrTransportProvider.subscribe — log every event received, the since cursor, and which subscription matched.
    • IngestWorkerPool — log every bundle handoff + outcome.
    • cid-fetcher (modules/payments/transfer/cid-fetcher.ts) — log every gateway attempt + outcome.
  3. Localize which sub-bug fires (A, B, or both).
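Once the three logging points exist, step 3 can be mechanical. The sketch below assumes each log point emits a structured record keyed by event ID; the `TraceRecord` shape, the stage names, and `localize` are all hypothetical and would need to be mapped onto whatever the real debug output looks like.

```typescript
// Hypothetical structured trace emitted by the three logging points above.
type Stage =
  | 'event-received'   // NostrTransportProvider.subscribe saw the event
  | 'bundle-handoff'   // IngestWorkerPool accepted the bundle
  | 'gateway-attempt'  // cid-fetcher tried one gateway
  | 'bundle-finalized'; // bundle reached the wallet

interface TraceRecord {
  stage: Stage;
  eventId: string;
  ok: boolean;
}

type Verdict = 'sub-bug-A' | 'sub-bug-B' | 'delivered';

function localize(records: TraceRecord[], eventId: string): Verdict {
  const mine = records.filter((r) => r.eventId === eventId);
  // No record that the transport ever saw the event: delivery gap.
  if (!mine.some((r) => r.stage === 'event-received')) return 'sub-bug-A';
  // Seen but never finalized: the drop happened in the receive pipeline.
  const finalized = mine.some((r) => r.stage === 'bundle-finalized' && r.ok);
  return finalized ? 'delivered' : 'sub-bug-B';
}
```

For a single event, A masks B: if the event never arrives, the fetch path is not exercised at all. Confirming that both sub-bugs exist therefore needs several runs, or a run where delivery is known to succeed and the gateways are made to fail.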

Acceptance

  • An observable failure event (probably transfer:cid-fetch-failed or transfer:bundle-dropped) surfaces silent drops to operators.
  • The receive path either:
    • Successfully fetches and finalizes the bundle, OR
    • Surfaces a clear error to the application AND retries via the at-least-once recovery layer.
  • An e2e gate (extension of manual-test-roundtrip-391.sh or a new soak) that forces CID delivery and verifies end-to-end receipt — fails CI if the receive path silently swallows a CID bundle.
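The first two acceptance points could look roughly like this on the fetch side. This is a sketch under stated assumptions, not the contents of cid-fetcher.ts: `fetchBundle`, `FetchFn`, and `Emit` are hypothetical, the gateway list stands in for cidFetchGateways, and the event name is the tentative `transfer:cid-fetch-failed` suggested above. The retry via the at-least-once layer is left to the caller.

```typescript
// Hypothetical signatures; the real cid-fetcher API may differ.
type FetchFn = (gateway: string, cid: string) => Promise<Uint8Array>;
type Emit = (event: string, payload: unknown) => void;

interface GatewayError {
  gateway: string;
  reason: string;
}

async function fetchBundle(
  cid: string,
  gateways: string[], // stands in for cidFetchGateways
  fetchFn: FetchFn,
  emit: Emit,
): Promise<Uint8Array | null> {
  const errors: GatewayError[] = [];
  for (const gateway of gateways) {
    try {
      // First gateway that succeeds wins.
      return await fetchFn(gateway, cid);
    } catch (err) {
      // Record every failure instead of swallowing it.
      errors.push({ gateway, reason: err instanceof Error ? err.message : String(err) });
    }
  }
  // All gateways failed: surface it so the drop is no longer silent.
  emit('transfer:cid-fetch-failed', { cid, errors });
  return null;
}
```

Returning `null` after emitting keeps the decision about retrying with the caller, so the at-least-once layer can avoid advancing its cursor for this event. An e2e gate can then assert that either a bundle arrives or this event fires, and fail if neither happens.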

Why this is a follow-up rather than a #394 blocker

Practical multi-hop chains (3-token, 4-hop) produce bundles of roughly 100-150 KiB, well inside the new 512 KiB envelope. Realistic user flows therefore never trigger CID delivery today. The bug remains a latent gap that becomes relevant only when bundles legitimately exceed 512 KiB or when callers explicitly opt into force-cid.
