test(producer): add chunk-boundary fixtures per first-party adapter#852
Conversation
dadb453 to
32c5f88
Compare
feb1b60 to
3a0388f
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
Re-reviewed as part of the fixture stack. LGTM.
vanceingalls
left a comment
There was a problem hiding this comment.
Reviewed the largest fixture PR in the Phase 4 stack — chunkBoundary.test.ts end-to-end plus all six adapter fixtures, cross-referenced against the runtime adapters in packages/core/src/runtime/adapters/ and plan.ts. Prior coverage: @miguel-heygen approved on the re-review. Findings below are gaps.
Strengths
- png-sequence framing of the boundary contract (
chunkBoundary.test.ts:6-12): textbook "say WHY, not WHAT" — calling out that mp4's IDR placement is a legitimately-different-bytes confound and that png-sequence's no-re-encode assemble path round-trips pixel equality is exactly the level of explanation a future reader needs. This is the rationale I'd want to inherit if I were debugging a flake in three months. - Cross-adapter symmetry: every fixture drives a 40×40 box from
translateX 0 → 280+rotate 0 → 360linearly across 2s. Same input geometry, same adapter-specific seek path, same boundary frames (15/30/45). Symmetric coverage was the explicit asymmetry concern from the review brief — this PR doesn't have it. - Adapter-hook fidelity: spot-checked
__hfAnime/__hfLottie/__hfThreeTime+hf-seekagainstpackages/core/src/runtime/adapters/{animejs,lottie,three}.tsanddocument.getAnimations()againstwaapi.ts/css.ts— each fixture's contract matches the real seek path the runtime invokes in chunk workers.
Important
important— Asymmetric soft-skip leaves N=4 unguarded (chunkBoundary.test.ts:126-143). OnlyplanAndAssemble(workOne)is wrapped in the HOST_CHROME_FAILURE_PATTERNS try/catch; theworkFourcall right after is not. Cold-Chrome / SwiftShader flakes are exactly the kind of thing that can succeed on the first render and trip on the second — when that happens on a developer host, the existing convention says "soft skip, Docker covers it," but this test will hard-fail instead. Either wrap both renders in the same skip pattern, or document why N=4 is deliberately treated as hard (and the convention says it shouldn't be).important— Vacuously-passing length assertion (chunkBoundary.test.ts:156-160).expect(framesOne.length).toBe(framesFour.length)will pass if both runs silently produce 0 frames (or 30, or any matching subset). The PR description states the contract is "60-frame composition (2s @ 30fps)" — pin the absolute count explicitly:expect(framesOne.length).toBe(60). Without that, a regression that truncates both renders identically (e.g. a probe that misreads duration as 0s) shows as a green test.important— External CDN dependencies on test fixtures. Every fixture loadsgsap,anime,lottie, orthreefrom cdnjs / jsdelivr at test time. CI hermeticity, supply-chain risk, and CDN-outage flake all become this test's failure surface. Worth comparing against howpackages/producer/tests/animejs-adapter/src/index.html(the existing in-repo adapter fixture) handles its bundles — if that one also CDN-loads, this is a pre-existing systemic gap and a follow-up ticket; if it ships local copies, this PR diverges from the established pattern and should match it.
Nits
nit— CDN version drift within the PR.gsap-boundary/src/index.htmlpinsgsap@3.12.2(cdnjs),anime-boundary/src/index.htmlpinsgsap@3.14.2(jsdelivr). Both gsap timelines are empty duration-drivers so it doesn't affect the contract, but pinning one version + one CDN across all six fixtures saves a future "why this discrepancy" hunt.nit— Comment on the conditional spread understates the safety (regression-harness-distributed.tssimplify commit). The comment says "codec: undefinedis a no-op" butplan.ts:490-495actually throws ifconfig.codec !== undefinedfor any non-mp4 format. The&& input.codec !== undefinedhalf of the spread guard is doing real work — worth reflecting in the comment so the next refactorer doesn't drop it.nit— PR description test count vs. host soft-skip. Description claims "7 tests pass on host" but the host soft-skip path means an adapter that tripped SwiftShader would skip silently and still report green. Worth confirming the Docker harness run covered all six adapters end-to-end (no skips), since the host run alone isn't sufficient evidence of the contract holding.
Notes
- Reviewed under Rule 5 with
mergeable_state: UNSTABLE: pulledgh pr checks— only fail-class indicators arePlayer perf/preview-regressionDetect changesjobs that wereCANCELLED(superseded by later runs that passed). The long-tailregression-shardsare stillIN_PROGRESSat review time — not gating my verdict on code merits, but worth a glance before merge. - Did not engage Rames or Magi findings (none posted at review time).
Verdict: COMMENT
Reasoning: Two important findings (asymmetric soft-skip + vacuous length assertion) are tightening, not blocking — the test still catches real boundary regressions, just leaves a couple of false-green windows. Comfortable shipping with the asks addressed in this PR or as fast-follow.
Review by Vai
…ndexing Address @vanceingalls and @miguel-heygen review findings on #852: 1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome flake on the N=4 call would hard-fail instead of soft-skip. Factor a local runRender() helper and wrap both calls. 2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe( framesFour.length)' passes when both runs produce 0 frames. Pin the absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that identically truncates both renders shows red. 3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture diverges. (gsap is an empty duration-driver in all six fixtures so the version was never load-bearing — but the divergence reads as intentional and isn't.) 4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm"> but outputFormat includes "png-sequence". The isPngSequence ternary short-circuits before png-sequence can reach the indexing site, but TS can't narrow through that. Add an explicit cast at the indexing site (not the lookup definition — over-widening to include "png-sequence": undefined would defeat the existence guarantee). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
32c5f88 to
ee4fc8c
Compare
|
Thanks @vanceingalls, @miguel-heygen — all important findings addressed in commit
External-CDN dependency on test fixtures is a real systemic concern (matches the pre-existing pattern in |
…ndexing Address @vanceingalls and @miguel-heygen review findings on #852: 1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome flake on the N=4 call would hard-fail instead of soft-skip. Factor a local runRender() helper and wrap both calls. 2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe( framesFour.length)' passes when both runs produce 0 frames. Pin the absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that identically truncates both renders shows red. 3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture diverges. (gsap is an empty duration-driver in all six fixtures so the version was never load-bearing — but the divergence reads as intentional and isn't.) 4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm"> but outputFormat includes "png-sequence". The isPngSequence ternary short-circuits before png-sequence can reach the indexing site, but TS can't narrow through that. Add an explicit cast at the indexing site (not the lookup definition — over-widening to include "png-sequence": undefined would defeat the existence guarantee). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
2610b30 to
d519030
Compare
fb7a0a4 to
6dbdac8
Compare
…ndexing Address @vanceingalls and @miguel-heygen review findings on #852: 1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome flake on the N=4 call would hard-fail instead of soft-skip. Factor a local runRender() helper and wrap both calls. 2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe( framesFour.length)' passes when both runs produce 0 frames. Pin the absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that identically truncates both renders shows red. 3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture diverges. (gsap is an empty duration-driver in all six fixtures so the version was never load-bearing — but the divergence reads as intentional and isn't.) 4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm"> but outputFormat includes "png-sequence". The isPngSequence ternary short-circuits before png-sequence can reach the indexing site, but TS can't narrow through that. Add an explicit cast at the indexing site (not the lookup definition — over-widening to include "png-sequence": undefined would defeat the existence guarantee). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
d519030 to
19f0f88
Compare
Address findings from a three-agent code-review pass over the Phase 4 stack:
- regression-harness: hoist `readdirSync` out of the per-checkpoint
failure-extraction loop (was running 20 redundant syscalls on every
failing png-sequence test). Drop redundant `existsSync` guards before
`mkdirSync(recursive: true)` and `rmSync(force: true)`. Replace the
three-deep ternary that built the output filename suffix with a
single `Record<format, ext>` lookup.
- regression-harness-distributed: flatten the `format === "mp4" ? {...} : {...}`
branching in the `plan()` call into a single config object with a
conditional spread. `plan()` already accepts `codec: undefined` for
non-mp4 formats, so the duplicate object was unnecessary.
- chunkBoundary.test: rename the stale "byte-identical mp4" test title
to "byte-identical frames" (the test now uses png-sequence). Trim the
10-line comment justifying `rejectOnSystemFonts: false` to the
essential WHY.
- renderChunk / plan.test / regression-harness: drop trailing-edge
comment phrases that pinned the prose to the PR's calendar context
("today", "v1.5", "pre-codec-knob output", section-numbered cross-
references to the planning doc).
No behavior change. All 49 distributed unit tests pass. Smoke + four
distributed format fixtures pass in --mode=distributed-simulated.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
…ndexing Address @vanceingalls and @miguel-heygen review findings on #852: 1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome flake on the N=4 call would hard-fail instead of soft-skip. Factor a local runRender() helper and wrap both calls. 2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe( framesFour.length)' passes when both runs produce 0 frames. Pin the absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that identically truncates both renders shows red. 3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture diverges. (gsap is an empty duration-driver in all six fixtures so the version was never load-bearing — but the divergence reads as intentional and isn't.) 4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm"> but outputFormat includes "png-sequence". The isPngSequence ternary short-circuits before png-sequence can reach the indexing site, but TS can't narrow through that. Add an explicit cast at the indexing site (not the lookup definition — over-widening to include "png-sequence": undefined would defeat the existence guarantee). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
19f0f88 to
b4bc224
Compare
The base branch was changed.
…ndexing Address @vanceingalls and @miguel-heygen review findings on #852: 1. Asymmetric soft-skip — only the N=1 plan+render+assemble call was wrapped in the host-Chrome-failure catch; an SwiftShader / cold-Chrome flake on the N=4 call would hard-fail instead of soft-skip. Factor a local runRender() helper and wrap both calls. 2. Vacuously-passing length assertion — 'expect(framesOne.length).toBe( framesFour.length)' passes when both runs produce 0 frames. Pin the absolute count (EXPECTED_FRAME_COUNT = 60) so a regression that identically truncates both renders shows red. 3. CDN version drift — anime-boundary loaded gsap@3.14.2 from jsdelivr while every other boundary fixture loaded 3.12.2 from cdnjs. Unify on cdnjs@3.12.2 so the next reader doesn't have to wonder why one fixture diverges. (gsap is an empty duration-driver in all six fixtures so the version was never load-bearing — but the divergence reads as intentional and isn't.) 4. VIDEO_EXT type narrowing — the lookup is Record<"mp4"|"mov"|"webm"> but outputFormat includes "png-sequence". The isPngSequence ternary short-circuits before png-sequence can reach the indexing site, but TS can't narrow through that. Add an explicit cast at the indexing site (not the lookup definition — over-widening to include "png-sequence": undefined would defeat the existence guarantee). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
b4bc224 to
6b3ad09
Compare
miguel-heygen
left a comment
There was a problem hiding this comment.
Re-stamping after rebase. LGTM.
Merge activity
|

Description
Phase 4 of the distributed rendering plan: test fixtures (see DISTRIBUTED-RENDERING-PLAN.md §11 Phase 4 + §10 test strategy). This is PR 4.7 of the Phase 4 remainder — the final fixture PR.
This PR adds six per-adapter chunk-boundary fixtures under
tests/distributed/{gsap,anime,three,lottie,css,waapi}-boundary/plus a singlebun:testdriver (chunkBoundary.test.ts) that exercises each fixture's seek-determinism contract. Each fixture is a 60-frame composition (2s @ 30fps, 320×180) that drives the named adapter through the HyperFrames runtime's seek hook. The test renders each atchunkSize=60(N=1 chunk, no seams) andchunkSize=15(N=4 chunks, three seams at frames 15/30/45), then asserts every PNG frame is byte-identical across the two runs.Why png-sequence: mp4 bitstreams encode keyframe placement directly. At
chunkSize=60libx264 emits 1 IDR; atchunkSize=15it emits 4 IDRs at frames 0/15/30/45. Those are legitimately different bytes even when the captured pixels are identical. png-sequence's assemble path merges chunk frame directories with no re-encode, so per-frame byte equality is exactly pixel equality — the strongest contract a distributed render can satisfy.Fixture design (each is ~60 lines of HTML):
gsap-boundary— single GSAPtl.to(...)driving translateX + rotation linearly across 2s.anime-boundary— anime.js v4 timeline registered viawindow.__hfAnime.three-boundary— minimal Three.js scene; cube rotation derived fromwindow.__hfThreeTime.lottie-boundary— inline Lottie JSON (rectangle layer animating position+rotation) loaded vialottie-weband registered viawindow.__hfLottie.css-boundary— pure@keyframesanimation; the HyperFrames CSS adapter seeks viaanimation-delay.waapi-boundary—element.animate()with linear keyframes; runtime setscurrentTimeper frame.The fixtures intentionally omit
meta.jsonso the regression-harness discovery skips them with a clearmissing meta.jsonlog (they're driven exclusively bychunkBoundary.test.ts). The test passesrejectOnSystemFonts: falsebecause some adapter bundles (notably anime.js's IIFE) embed CSS-shaped strings inside their JS source —font-family: ui-monospace, monospacefor internal devtools styling — whichvalidateNoSystemFonts's document-wide regex would otherwise false-positive on every adapter fixture that loads such a bundle. The fixtures display no text, so the relaxed font validation doesn't affect the contract under test.The 7th test case is a layout sanity check that asserts every expected
*-boundaryfixture directory exists.Testing
bun test packages/producer/src/services/distributed/chunkBoundary.test.ts— 7 tests pass on host (6 adapters × byte-identical N=1 vs N=4 + the layout check, 41.6s)bun test packages/producer/src/services/distributed/— all 49 distributed unit tests pass (43.6s)bun run --cwd packages/producer docker:test:distributed font-variant-numeric many-cuts gsap-letters-render-compat style-1-prod sub-composition-video mp4-h264-sdr png-sequence mov-prores mp4-h265-sdr -- --sequential— full smoke set + all four prior stacked fixtures pass (9/9)bunx oxlint+bunx oxfmt --checkcleanbunx tsc --noEmit(producer package) cleanAfter this stack lands
The Phase 4 fixture set is complete: PR 4.6 (#844) pins cross-worker idempotency; 4.2/4.3/4.4/4.5 (#845/#851/#847/#848) prove each format produces correct chunked output at the fixture's
minPsnr; 4.3-pre (#850) added the codec knob H.265 needed; and this PR proves each first-party adapter's seek-determinism survives chunk seams. Phase 5 (CLI surface forhyperframes plan/chunk/assemble) and Phase 6 (AWS Lambda turnkey) are unblocked.🤖 Generated with Claude Code