Skip to content

test(producer): add chunk-boundary fixtures per first-party adapter#852

Merged
jrusso1020 merged 3 commits into
mainfrom
05-14-test_producer_add_chunk-boundary_fixtures_per_first-party_adapter
May 15, 2026
Merged

test(producer): add chunk-boundary fixtures per first-party adapter#852
jrusso1020 merged 3 commits into
mainfrom
05-14-test_producer_add_chunk-boundary_fixtures_per_first-party_adapter

Conversation

@jrusso1020
Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 commented May 14, 2026

Description

Phase 4 of the distributed rendering plan: test fixtures (see DISTRIBUTED-RENDERING-PLAN.md §11 Phase 4 + §10 test strategy). This is PR 4.7 of the Phase 4 remainder — the final fixture PR.

This PR adds six per-adapter chunk-boundary fixtures under tests/distributed/{gsap,anime,three,lottie,css,waapi}-boundary/ plus a single bun:test driver (chunkBoundary.test.ts) that exercises each fixture's seek-determinism contract. Each fixture is a 60-frame composition (2s @ 30fps, 320×180) that drives the named adapter through the HyperFrames runtime's seek hook. The test renders each at chunkSize=60 (N=1 chunk, no seams) and chunkSize=15 (N=4 chunks, three seams at frames 15/30/45), then asserts every PNG frame is byte-identical across the two runs.

Why png-sequence: mp4 bitstreams encode keyframe placement directly. At chunkSize=60 libx264 emits 1 IDR; at chunkSize=15 it emits 4 IDRs at frames 0/15/30/45. Those are legitimately different bytes even when the captured pixels are identical. png-sequence's assemble path merges chunk frame directories with no re-encode, so per-frame byte equality is exactly pixel equality — the strongest contract a distributed render can satisfy.

Fixture design (each is ~60 lines of HTML):

  • gsap-boundary — single GSAP tl.to(...) driving translateX + rotation linearly across 2s.
  • anime-boundary — anime.js v4 timeline registered via window.__hfAnime.
  • three-boundary — minimal Three.js scene; cube rotation derived from window.__hfThreeTime.
  • lottie-boundary — inline Lottie JSON (rectangle layer animating position+rotation) loaded via lottie-web and registered via window.__hfLottie.
  • css-boundary — pure @keyframes animation; the HyperFrames CSS adapter seeks via animation-delay.
  • waapi-boundaryelement.animate() with linear keyframes; runtime sets currentTime per frame.

The fixtures intentionally omit meta.json so the regression-harness discovery skips them with a clear missing meta.json log (they're driven exclusively by chunkBoundary.test.ts). The test passes rejectOnSystemFonts: false because some adapter bundles (notably anime.js's IIFE) embed CSS-shaped strings inside their JS source — font-family: ui-monospace, monospace for internal devtools styling — which validateNoSystemFonts's document-wide regex would otherwise false-positive on every adapter fixture that loads such a bundle. The fixtures display no text, so the relaxed font validation doesn't affect the contract under test.

The 7th test case is a layout sanity check that asserts every expected *-boundary fixture directory exists.

Testing

  • bun test packages/producer/src/services/distributed/chunkBoundary.test.ts — 7 tests pass on host (6 adapters × byte-identical N=1 vs N=4 + the layout check, 41.6s)
  • bun test packages/producer/src/services/distributed/ — all 49 distributed unit tests pass (43.6s)
  • bun run --cwd packages/producer docker:test:distributed font-variant-numeric many-cuts gsap-letters-render-compat style-1-prod sub-composition-video mp4-h264-sdr png-sequence mov-prores mp4-h265-sdr -- --sequential — full smoke set + all four prior stacked fixtures pass (9/9)
  • bunx oxlint + bunx oxfmt --check clean
  • bunx tsc --noEmit (producer package) clean

After this stack lands

The Phase 4 fixture set is complete: PR 4.6 (#844) pins cross-worker idempotency; 4.2/4.3/4.4/4.5 (#845/#851/#847/#848) prove each format produces correct chunked output at the fixture's minPsnr; 4.3-pre (#850) added the codec knob H.265 needed; and this PR proves each first-party adapter's seek-determinism survives chunk seams. Phase 5 (CLI surface for hyperframes plan/chunk/assemble) and Phase 6 (AWS Lambda turnkey) are unblocked.

🤖 Generated with Claude Code

@jrusso1020 jrusso1020 force-pushed the 05-14-test_producer_add_mp4_h.265_sdr_distributed_fixture branch from dadb453 to 32c5f88 Compare May 14, 2026 23:31
@jrusso1020 jrusso1020 force-pushed the 05-14-test_producer_add_chunk-boundary_fixtures_per_first-party_adapter branch from feb1b60 to 3a0388f Compare May 14, 2026 23:31
miguel-heygen
miguel-heygen previously approved these changes May 14, 2026
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed as part of the fixture stack. LGTM.

Copy link
Copy Markdown
Collaborator

@vanceingalls vanceingalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the largest fixture PR in the Phase 4 stack — chunkBoundary.test.ts end-to-end plus all six adapter fixtures, cross-referenced against the runtime adapters in packages/core/src/runtime/adapters/ and plan.ts. Prior coverage: @miguel-heygen approved on the re-review. Findings below are gaps.

Strengths

  • png-sequence framing of the boundary contract (chunkBoundary.test.ts:6-12): textbook "say WHY, not WHAT" — calling out that mp4's IDR placement is a legitimately-different-bytes confound and that png-sequence's no-re-encode assemble path round-trips pixel equality is exactly the level of explanation a future reader needs. This is the rationale I'd want to inherit if I were debugging a flake in three months.
  • Cross-adapter symmetry: every fixture drives a 40×40 box from translateX 0 → 280 + rotate 0 → 360 linearly across 2s. Same input geometry, same adapter-specific seek path, same boundary frames (15/30/45). Symmetric coverage was the explicit asymmetry concern from the review brief — this PR doesn't have it.
  • Adapter-hook fidelity: spot-checked __hfAnime / __hfLottie / __hfThreeTime + hf-seek against packages/core/src/runtime/adapters/{animejs,lottie,three}.ts and document.getAnimations() against waapi.ts / css.ts — each fixture's contract matches the real seek path the runtime invokes in chunk workers.

Important

  • important — Asymmetric soft-skip leaves N=4 unguarded (chunkBoundary.test.ts:126-143). Only planAndAssemble(workOne) is wrapped in the HOST_CHROME_FAILURE_PATTERNS try/catch; the workFour call right after is not. Cold-Chrome / SwiftShader flakes are exactly the kind of thing that can succeed on the first render and trip on the second — when that happens on a developer host, the existing convention says "soft skip, Docker covers it," but this test will hard-fail instead. Either wrap both renders in the same skip pattern, or document why N=4 is deliberately treated as hard (and the convention says it shouldn't be).
  • important — Vacuously-passing length assertion (chunkBoundary.test.ts:156-160). expect(framesOne.length).toBe(framesFour.length) will pass if both runs silently produce 0 frames (or 30, or any matching subset). The PR description states the contract is "60-frame composition (2s @ 30fps)" — pin the absolute count explicitly: expect(framesOne.length).toBe(60). Without that, a regression that truncates both renders identically (e.g. a probe that misreads duration as 0s) shows as a green test.
  • important — External CDN dependencies on test fixtures. Every fixture loads gsap, anime, lottie, or three from cdnjs / jsdelivr at test time. CI hermeticity, supply-chain risk, and CDN-outage flake all become this test's failure surface. Worth comparing against how packages/producer/tests/animejs-adapter/src/index.html (the existing in-repo adapter fixture) handles its bundles — if that one also CDN-loads, this is a pre-existing systemic gap and a follow-up ticket; if it ships local copies, this PR diverges from the established pattern and should match it.

Nits

  • nit — CDN version drift within the PR. gsap-boundary/src/index.html pins gsap@3.12.2 (cdnjs), anime-boundary/src/index.html pins gsap@3.14.2 (jsdelivr). Both gsap timelines are empty duration-drivers so it doesn't affect the contract, but pinning one version + one CDN across all six fixtures saves a future "why this discrepancy" hunt.
  • nit — Comment on the conditional spread understates the safety (regression-harness-distributed.ts simplify commit). The comment says "codec: undefined is a no-op" but plan.ts:490-495 actually throws if config.codec !== undefined for any non-mp4 format. The && input.codec !== undefined half of the spread guard is doing real work — worth reflecting in the comment so the next refactorer doesn't drop it.
  • nit — PR description test count vs. host soft-skip. Description claims "7 tests pass on host" but the host soft-skip path means an adapter that tripped SwiftShader would skip silently and still report green. Worth confirming the Docker harness run covered all six adapters end-to-end (no skips), since the host run alone isn't sufficient evidence of the contract holding.

Notes

  • Reviewed under Rule 5 with mergeable_state: UNSTABLE: pulled gh pr checks — only fail-class indicators are Player perf / preview-regression Detect changes jobs that were CANCELLED (superseded by later runs that passed). The long-tail regression-shards are still IN_PROGRESS at review time — not gating my verdict on code merits, but worth a glance before merge.
  • Did not engage Rames or Magi findings (none posted at review time).

Verdict: COMMENT
Reasoning: Two important findings (asymmetric soft-skip + vacuous length assertion) are tightening, not blocking — the test still catches real boundary regressions, just leaves a couple of false-green windows. Comfortable shipping with the asks addressed in this PR or as fast-follow.

Review by Vai

jrusso1020 added a commit that referenced this pull request May 14, 2026
…ndexing

Address @vanceingalls and @miguel-heygen review findings on #852:

1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was
   wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome
   flake on the N=4 call would hard-fail instead of soft-skip. Factor a
   local runRender() helper and wrap both calls.

2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe(
   framesFour.length)' passes when both runs produce 0 frames. Pin the
   absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that
   identically truncates both renders shows red.

3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr
   while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on
   cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture
   diverges. (gsap is an empty duration-driver in all six fixtures so
   the version was never load-bearing — but the divergence reads as
   intentional and isn't.)

4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm">
   but outputFormat includes "png-sequence". The isPngSequence ternary
   short-circuits before png-sequence can reach the indexing site, but TS
   can't narrow through that. Add an explicit cast at the indexing site
   (not the lookup definition — over-widening to include "png-sequence":
   undefined would defeat the existence guarantee).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@jrusso1020 jrusso1020 force-pushed the 05-14-test_producer_add_mp4_h.265_sdr_distributed_fixture branch from 32c5f88 to ee4fc8c Compare May 14, 2026 23:52
@jrusso1020
Copy link
Copy Markdown
Collaborator Author

Thanks @vanceingalls, @miguel-heygen — all important findings addressed in commit 2610b303:

  1. Asymmetric soft-skip — both planAndAssemble(workOne) and planAndAssemble(workFour) are now wrapped in the same HOST_CHROME_FAILURE_PATTERNS catch via a local runRender() helper. A cold-Chrome / SwiftShader flake on either render skips cleanly instead of hard-failing on N=4.

  2. Vacuously-passing length assertion — added EXPECTED_FRAME_COUNT = 60 and expect(framesOne.length).toBe(EXPECTED_FRAME_COUNT) + same for framesFour. A regression that identically truncates both renders (e.g. probe stage reading duration as 0s) now shows red.

  3. CDN version driftanime-boundary loaded gsap@3.14.2 from jsdelivr while every other boundary fixture loaded 3.12.2 from cdnjs. Unified on cdnjs/3.12.2.

  4. @miguel-heygen VIDEO_EXT narrowing — added an explicit cast at the indexing site (VIDEO_EXT[outputFormat as "mp4" | "mov" | "webm"]) rather than widening the lookup to include "png-sequence". The isPngSequence ternary above the indexing site is the runtime guard.

External-CDN dependency on test fixtures is a real systemic concern (matches the pre-existing pattern in tests/animejs-adapter/) — leaving for a follow-up that converts all adapter fixtures to vendored bundles.

jrusso1020 added a commit that referenced this pull request May 15, 2026
…ndexing

Address @vanceingalls and @miguel-heygen review findings on #852:

1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was
   wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome
   flake on the N=4 call would hard-fail instead of soft-skip. Factor a
   local runRender() helper and wrap both calls.

2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe(
   framesFour.length)' passes when both runs produce 0 frames. Pin the
   absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that
   identically truncates both renders shows red.

3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr
   while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on
   cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture
   diverges. (gsap is an empty duration-driver in all six fixtures so
   the version was never load-bearing — but the divergence reads as
   intentional and isn't.)

4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm">
   but outputFormat includes "png-sequence". The isPngSequence ternary
   short-circuits before png-sequence can reach the indexing site, but TS
   can't narrow through that. Add an explicit cast at the indexing site
   (not the lookup definition — over-widening to include "png-sequence":
   undefined would defeat the existence guarantee).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@jrusso1020 jrusso1020 force-pushed the 05-14-test_producer_add_chunk-boundary_fixtures_per_first-party_adapter branch from 2610b30 to d519030 Compare May 15, 2026 00:05
@jrusso1020 jrusso1020 force-pushed the 05-14-test_producer_add_mp4_h.265_sdr_distributed_fixture branch 2 times, most recently from fb7a0a4 to 6dbdac8 Compare May 15, 2026 02:22
jrusso1020 added a commit that referenced this pull request May 15, 2026
…ndexing

Address @vanceingalls and @miguel-heygen review findings on #852:

1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was
   wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome
   flake on the N=4 call would hard-fail instead of soft-skip. Factor a
   local runRender() helper and wrap both calls.

2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe(
   framesFour.length)' passes when both runs produce 0 frames. Pin the
   absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that
   identically truncates both renders shows red.

3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr
   while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on
   cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture
   diverges. (gsap is an empty duration-driver in all six fixtures so
   the version was never load-bearing — but the divergence reads as
   intentional and isn't.)

4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm">
   but outputFormat includes "png-sequence". The isPngSequence ternary
   short-circuits before png-sequence can reach the indexing site, but TS
   can't narrow through that. Add an explicit cast at the indexing site
   (not the lookup definition — over-widening to include "png-sequence":
   undefined would defeat the existence guarantee).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@jrusso1020 jrusso1020 force-pushed the 05-14-test_producer_add_chunk-boundary_fixtures_per_first-party_adapter branch from d519030 to 19f0f88 Compare May 15, 2026 02:22
@jrusso1020 jrusso1020 changed the base branch from 05-14-test_producer_add_mp4_h.265_sdr_distributed_fixture to graphite-base/852 May 15, 2026 02:46
Address findings from a three-agent code-review pass over the Phase 4 stack:

- regression-harness: hoist `readdirSync` out of the per-checkpoint
  failure-extraction loop (was running 20 redundant syscalls on every
  failing png-sequence test). Drop redundant `existsSync` guards before
  `mkdirSync(recursive: true)` and `rmSync(force: true)`. Replace the
  three-deep ternary that built the output filename suffix with a
  single `Record<format, ext>` lookup.
- regression-harness-distributed: flatten the `format === "mp4" ? {...} : {...}`
  branching in the `plan()` call into a single config object with a
  conditional spread. `plan()` already accepts `codec: undefined` for
  non-mp4 formats, so the duplicate object was unnecessary.
- chunkBoundary.test: rename the stale "byte-identical mp4" test title
  to "byte-identical frames" (the test now uses png-sequence). Trim the
  10-line comment justifying `rejectOnSystemFonts: false` to the
  essential WHY.
- renderChunk / plan.test / regression-harness: drop trailing-edge
  comment phrases that pinned the prose to the PR's calendar context
  ("today", "v1.5", "pre-codec-knob output", section-numbered cross-
  references to the planning doc).

No behavior change. All 49 distributed unit tests pass. Smoke + four
distributed format fixtures pass in --mode=distributed-simulated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
jrusso1020 added a commit that referenced this pull request May 15, 2026
…ndexing

Address @vanceingalls and @miguel-heygen review findings on #852:

1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was
   wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome
   flake on the N=4 call would hard-fail instead of soft-skip. Factor a
   local runRender() helper and wrap both calls.

2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe(
   framesFour.length)' passes when both runs produce 0 frames. Pin the
   absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that
   identically truncates both renders shows red.

3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr
   while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on
   cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture
   diverges. (gsap is an empty duration-driver in all six fixtures so
   the version was never load-bearing — but the divergence reads as
   intentional and isn't.)

4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm">
   but outputFormat includes "png-sequence". The isPngSequence ternary
   short-circuits before png-sequence can reach the indexing site, but TS
   can't narrow through that. Add an explicit cast at the indexing site
   (not the lookup definition — over-widening to include "png-sequence":
   undefined would defeat the existence guarantee).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@jrusso1020 jrusso1020 force-pushed the 05-14-test_producer_add_chunk-boundary_fixtures_per_first-party_adapter branch from 19f0f88 to b4bc224 Compare May 15, 2026 02:46
@graphite-app graphite-app Bot changed the base branch from graphite-base/852 to main May 15, 2026 02:47
@graphite-app graphite-app Bot dismissed miguel-heygen’s stale review May 15, 2026 02:47

The base branch was changed.

…ndexing

Address @vanceingalls and @miguel-heygen review findings on #852:

1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was
   wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome
   flake on the N=4 call would hard-fail instead of soft-skip. Factor a
   local runRender() helper and wrap both calls.

2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe(
   framesFour.length)' passes when both runs produce 0 frames. Pin the
   absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that
   identically truncates both renders shows red.

3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr
   while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on
   cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture
   diverges. (gsap is an empty duration-driver in all six fixtures so
   the version was never load-bearing — but the divergence reads as
   intentional and isn't.)

4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm">
   but outputFormat includes "png-sequence". The isPngSequence ternary
   short-circuits before png-sequence can reach the indexing site, but TS
   can't narrow through that. Add an explicit cast at the indexing site
   (not the lookup definition — over-widening to include "png-sequence":
   undefined would defeat the existence guarantee).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@jrusso1020 jrusso1020 force-pushed the 05-14-test_producer_add_chunk-boundary_fixtures_per_first-party_adapter branch from b4bc224 to 6b3ad09 Compare May 15, 2026 02:47
Comment thread packages/producer/tests/distributed/anime-boundary/src/index.html Dismissed
Comment thread packages/producer/tests/distributed/css-boundary/src/index.html Dismissed
Comment thread packages/producer/tests/distributed/gsap-boundary/src/index.html Dismissed
Comment thread packages/producer/tests/distributed/lottie-boundary/src/index.html Dismissed
Comment thread packages/producer/tests/distributed/lottie-boundary/src/index.html Dismissed
Comment thread packages/producer/tests/distributed/three-boundary/src/index.html Dismissed
Comment thread packages/producer/tests/distributed/waapi-boundary/src/index.html Dismissed
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-stamping after rebase. LGTM.

@jrusso1020 jrusso1020 merged commit cc6bc2e into main May 15, 2026
39 checks passed
Copy link
Copy Markdown
Collaborator Author

Merge activity

@jrusso1020 jrusso1020 deleted the 05-14-test_producer_add_chunk-boundary_fixtures_per_first-party_adapter branch May 15, 2026 03:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants