perf(profile/ipfs): enable CAR-batching at Unicity Kubo gateway + switch SDK to /dag/import + /dag/export #370

@vrogojin

Goal

Replace the SDK's current per-block IPFS round-trip pattern (/api/v0/dag/put per block, with DEFAULT_PIN_CONCURRENCY=10 opening many parallel HTTP connections) with single-round-trip CAR-level push and pull using Kubo's /api/v0/dag/import (push) and /api/v0/dag/export (pull) endpoints. Eliminates the throttling-under-burst behaviour that caused several spurious soak failures during #364 validation.

Background

The SDK currently uses /api/v0/dag/put per block because the Unicity IPFS gateway (unicity-ipfs1.dyndns.org) does NOT expose /api/v0/dag/import — confirmed by profile/ipfs-client.ts:644 (existing code comment) and verified during #364 investigation (curl -X POST https://unicity-ipfs1.dyndns.org/api/v0/dag/import returns HTTP 404).

This forces the per-block dance: for a 250-block CAR, 10 concurrent workers × 25 sequential rounds = ~250 HTTP requests against the gateway, with each round costing ~80-150 ms RTT (roughly 2-4 s of wall time per CAR). The Kubo container's rate limiter (MAX_PINS_PER_SECOND=100) brushes the upper bound; under burst (multiple wallets flushing simultaneously during §D.1 pre-clear snapshots) it tips over and the SDK sees IPFS dag/put failed on all gateways: fetch failed.
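The per-block cost can be checked with a small calculation. This is only a sketch: the helper name and its parameters are hypothetical, the inputs (250 blocks, concurrency 10, 80-150 ms RTT) come from the paragraph above, and the model assumes each round of parallel requests costs one full RTT while ignoring server-side processing and connection setup.

```typescript
// Hypothetical helper: estimates request count and wall time for the
// per-block dag/put path. Assumes one RTT per round of parallel requests.
function estimatePerBlockPin(blockCount: number, concurrency: number, rttMs: number) {
  const requests = blockCount;                        // one dag/put per block
  const rounds = Math.ceil(blockCount / concurrency); // sequential rounds of parallel requests
  const wallMs = rounds * rttMs;                      // each round costs roughly one RTT
  return { requests, rounds, wallMs };
}

const low = estimatePerBlockPin(250, 10, 80);    // { requests: 250, rounds: 25, wallMs: 2000 }
const high = estimatePerBlockPin(250, 10, 150);  // { requests: 250, rounds: 25, wallMs: 3750 }
const serial = estimatePerBlockPin(250, 1, 100); // { requests: 250, rounds: 250, wallMs: 25000 }
```

With a single dag/import the request count per CAR is 1 regardless of block count, which is why a per-pin rate limit stops being the relevant throttle.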

Two coordinated changes — both required for the win

Part 1 — Operator side (Unicity IPFS Kubo gateway)

Expose /api/v0/dag/import and /api/v0/dag/export on unicity-ipfs1.dyndns.org (and on any sibling gateways the SDK will fan out to).

  • haproxy/nginx config: whitelist these two endpoints alongside the existing /api/v0/dag/put, /api/v0/block/get, etc. Kubo's API surface is intentionally restricted at the operator layer; this is purely an ACL change.
  • Rate-limit revisit: with one HTTP request per CAR (instead of N), the existing MAX_PINS_PER_SECOND=100 is no longer the right throttle. Move to a per-IP request/sec limit (e.g. 5-10 req/sec) and a per-IP byte/sec cap if needed.
  • Acceptance:
    • curl -X POST -F 'file=@small.car' 'https://unicity-ipfs1.dyndns.org/api/v0/dag/import?pin=true' returns HTTP 200 with {"Root":{"Cid":{"/":"bafy..."}}} lines for each root.
    • curl -X POST 'https://unicity-ipfs1.dyndns.org/api/v0/dag/export?arg=<cid>' --output got.car returns a valid CAR file containing the CID and all descendants.

Part 2 — SDK side (sphere-sdk)

Rewrite profile/ipfs-client.ts to prefer CAR-batched endpoints when available, falling back to the per-block path for legacy gateways:

Push: replace pinCarBlocksToIpfs per-block loop with single CAR dag/import

// New: single round-trip
async function pinCarToIpfs(gateway, carBytes, expectedRootCid, timeoutMs) {
  const form = new FormData();
  form.append('file', new Blob([carBytes]), 'bundle.car');
  const url = `${gateway}/api/v0/dag/import?pin=true`;
  const response = await fetch(url, { method: 'POST', body: form, signal: AbortSignal.timeout(timeoutMs) });
  // Kubo returns NDJSON; each line is a {Root, Stats, …} envelope.
  // Verify expectedRootCid appears in the Root list.
}
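The response handling is left as comments in the snippet above. A minimal sketch of that step follows, assuming each root line has the shape shown in the Part 1 acceptance check ({"Root":{"Cid":{"/":"..."}}}). The helper names are hypothetical, the PinErrorMsg check is an assumption about the envelope rather than something this issue specifies, and comparing CID strings verbatim assumes both sides use the same CID version and multibase encoding.

```typescript
// Shape of one NDJSON line, limited to the fields this sketch reads.
interface DagImportLine {
  Root?: { Cid?: { '/': string }; PinErrorMsg?: string };
  Stats?: unknown;
}

// Hypothetical helper: extract root CIDs from a dag/import NDJSON body.
function parseDagImportRoots(ndjson: string): string[] {
  const roots: string[] = [];
  for (const line of ndjson.split('\n')) {
    if (line.trim() === '') continue; // skip blank lines and the trailing newline
    const parsed = JSON.parse(line) as DagImportLine;
    const cid = parsed.Root?.Cid?.['/'];
    if (cid === undefined) continue; // not a Root line (e.g. Stats)
    // Assumption: a non-empty PinErrorMsg means the root was imported but not pinned.
    if (parsed.Root?.PinErrorMsg) {
      throw new Error(`IPFS dag/import could not pin ${cid}: ${parsed.Root.PinErrorMsg}`);
    }
    roots.push(cid);
  }
  return roots;
}

// Hypothetical helper: fail loudly when the expected root was not reported.
function assertRootImported(ndjson: string, expectedRootCid: string): void {
  const roots = parseDagImportRoots(ndjson);
  if (roots.indexOf(expectedRootCid) === -1) {
    throw new Error(`IPFS dag/import did not report root ${expectedRootCid}`);
  }
}
```

Inside pinCarToIpfs this would be called as assertRootImported(await response.text(), expectedRootCid) after checking response.ok.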

Pull: replace BFS over /api/v0/block/get with single /dag/export

profile/ipfs-client.ts:fetchCarFromIpfs and http-block-broker.ts currently walk the dag-cbor link graph one block at a time. After Part 1, the gateway returns the whole subtree in one CAR via /dag/export?arg=<root>. We then feed it through the existing @ipld/car CarReader directly.
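The pull side could look roughly like the sketch below, assuming the gateway exposes /dag/export as described in Part 1. The names dagExportUrl and fetchCarViaDagExport are hypothetical, and the fetch function is injected (rather than using the global) only so the sketch can be exercised without a network; the timeout handling from the push snippet is omitted here for brevity.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface ResponseLike {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}
type FetchLike = (url: string, init: { method: string }) => Promise<ResponseLike>;

// Hypothetical helper: build the dag/export URL for a root CID.
function dagExportUrl(gateway: string, rootCid: string): string {
  const base = gateway.replace(/\/+$/, ''); // tolerate trailing slashes
  return `${base}/api/v0/dag/export?arg=${encodeURIComponent(rootCid)}`;
}

// Hypothetical helper: pull the whole subtree as one CAR in a single request.
async function fetchCarViaDagExport(
  fetchImpl: FetchLike,
  gateway: string,
  rootCid: string,
): Promise<Uint8Array> {
  const response = await fetchImpl(dagExportUrl(gateway, rootCid), { method: 'POST' });
  if (!response.ok) {
    throw new Error(`IPFS dag/export failed: HTTP ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

The returned bytes can then be handed to CarReader.fromBytes from @ipld/car, which is the reader the existing code already uses.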

Capability discovery

The SDK shouldn't crash on legacy gateways without the new endpoints. Probe once per gateway URL at startup (cached in-process):

const supportsImport = await probeEndpoint(gateway, '/api/v0/dag/import');
const supportsExport = await probeEndpoint(gateway, '/api/v0/dag/export');

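If supportsImport === true, use pinCarToIpfs; otherwise fall back to pinCarBlocksToIpfs. The export path follows the same rule: use /dag/export when supportsExport === true, otherwise keep the per-block BFS.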
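One way the probe and its in-process cache might be written is sketched below. Several things here are assumptions, not part of this issue: the probe is a bodyless POST, a 404 or 405 is read as "endpoint not exposed" (the two statuses named for the fallback test in the acceptance criteria), and any other status is read as "the request reached Kubo". The third fetchImpl parameter is added purely for testability and differs from the two-argument call shown above.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface StatusResponse {
  status: number;
}
type ProbeFetch = (url: string, init: { method: string }) => Promise<StatusResponse>;

// Assumption: a proxy that does not whitelist the endpoint answers 404 or 405;
// any other status means the endpoint is routed through to Kubo.
function statusIndicatesSupport(status: number): boolean {
  return status !== 404 && status !== 405;
}

// In-process cache keyed by gateway + path, so each endpoint is probed once.
const probeCache = new Map<string, Promise<boolean>>();

function probeEndpoint(gateway: string, path: string, fetchImpl: ProbeFetch): Promise<boolean> {
  const key = `${gateway}${path}`;
  let pending = probeCache.get(key);
  if (pending === undefined) {
    pending = fetchImpl(key, { method: 'POST' }).then((res) => statusIndicatesSupport(res.status));
    // Do not cache transport failures; a later call will probe again.
    pending.catch(() => probeCache.delete(key));
    probeCache.set(key, pending);
  }
  return pending;
}
```

Caching the promise rather than the resolved boolean means concurrent callers during startup share one in-flight probe instead of each issuing their own.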

Acceptance criteria

  • Pin path: 250-block CAR pins via a SINGLE HTTP request. Wall-time per pin drops from ~25 s (worst-case serial) / ~2.5 s (10-way concurrent) to ~0.3-0.8 s.
  • Fetch path: 250-block tree fetch via a SINGLE HTTP request. Eliminates the per-block-RTT recovery cost called out in the #363 post-mortem of #360 (§D.4 attribution).
  • MAX_PINS_PER_SECOND rate-limit collisions no longer surface as IPFS dag/put failed on all gateways.
  • New unit tests:
    • pinCarToIpfs happy-path against a mock 200 NDJSON response.
    • Fallback to pinCarBlocksToIpfs on 404 / 405 from /dag/import.
    • fetchCarFromIpfs returns identical bytes when comparing /dag/export path vs the legacy BFS path (golden-file equivalence).
  • Integration: manual-test-full-recovery.sh soak with SPHERE_PERF=1 shows:
    • ipfs.fetchCar.totalMs drops by 5-10× vs current baseline
    • New counters ipfs.dagImport.totalMs and ipfs.dagExport.totalMs fire
    • Old ipfs.fetchBlock.gatewayHit count drops to zero when gateway advertises /dag/export

Why this is a single coupled issue (operator + SDK)

The SDK can't switch over until the gateway exposes the endpoints (the existing comment at profile/ipfs-client.ts:644 explicitly documents why we landed on per-block in the first place). Exposing them on the gateway adds no value until the SDK switches over. Coordinated rollout: the gateway change lands first, the SDK probe falls back gracefully for any gateway that hasn't been updated yet, and once all gateways are updated everyone is on the fast path.

Ordering / blocks

Refs
