fix(studio): fix capture button silent failures and broken CLI seek#904
Conversation
The Capture button could silently fail with no user feedback due to
several compounding issues:
- The click handler's try-catch only covered the fetch call, leaving
waitForPendingDomEditSaves() and URL construction unprotected. Any
error there became an unhandled promise rejection with zero UI
feedback. Wrap the entire handler body in try-catch.
- No timeout on the fetch or save-queue drain, so a hung server or
stuck save queue caused the button to appear permanently broken.
Add a 30s AbortController timeout on the fetch and a 5s race
timeout on waitForPendingDomEditSaves.
- The CLI server's thumbnail seek used `__timeline` (singular) which
doesn't exist — the runtime registers `__timelines` (plural). Also
used `.seek()` instead of `.pause(t)` and didn't kick the GSAP
ticker. Align with the Vite adapter's working seek logic.
- The CLI server's getThumbnailBrowser and generateThumbnail catch
blocks swallowed all errors silently — Chrome launch failures and
screenshot errors were invisible. Add console.warn logging.
- Parse the JSON error body from the server so the toast shows the
actual message ("Chrome browser may not be available") instead of
just "Capture failed (500)".
Closes #902
jrusso1020
left a comment
There was a problem hiding this comment.
LGTM — defense-in-depth fix that converts silent failures into surfaced errors at every layer.
Audited
Client error surfacing (useFrameCapture.ts):
- Whole click handler now wrapped in try-catch → toast on any error. Previously
waitForPendingDomEditSaves()rejection or URL construction throw was unhandled. ✓ - 30s fetch timeout via
AbortControllerwith properclearTimeoutcleanup in both success + error paths.AbortErrortranslated to user-readable "Capture timed out" toast. ✓ - 5s save-queue timeout via
Promise.race(waitForPendingDomEditSaves, timeoutReject). ✓ - Server JSON error body parsed via inner try/catch — toast shows the actual server message instead of generic "Capture failed (500)". The nested catch correctly falls back to the default message on non-JSON responses. ✓
Server-side seek bug (studioServer.ts) — this is the load-bearing fix:
// OLD (broken):
else if (win.__timeline?.seek) {
win.__timeline.pause();
win.__timeline.seek(t);
}
// NEW (working):
else if (w.__timelines) {
for (const tl of Object.values(w.__timelines)) {
tl?.pause?.(t);
}
w.gsap?.ticker?.tick?.();
}Three corrections in one:
__timeline(singular, doesn't exist) →__timelines(plural, the actual global).seek(t)→.pause(t)— GSAP's standard "seek and stop at time" pattern (matches hf#881's parent-audio-pause-on-seek shape)- Added
gsap?.ticker?.tick?.()to force render — without this, pausing-at-t doesn't actually advance the visual to time t until the next rAF tick
Aligns with the Vite adapter's working implementation per body. The __timelines global is what landed in hf#849+ and we've been working with across the recent studio PRs; the CLI thumbnail path was running stale code targeting the original singular form. ✓
Readiness check tightened — was !!__timelines || !!__playerReady; now requires !!__timelines && Object.keys(__timelines).length > 0. Avoids the race where the __timelines global exists but no timelines have registered yet. ✓
document.fonts?.ready await — added before screenshot to prevent font-flash captures. Standard pattern for web-font-stable screenshots. ✓
Server-side error logging — getThumbnailBrowser and generateThumbnail both moved from silent catch {} to console.warn with the error message. Operators running hyperframes preview will now see browser-launch failures and screenshot errors in their terminal instead of a mysterious null return. ✓
Actionable error message (thumbnail.ts) — generic "Thumbnail generation returned null" → "Thumbnail generation failed — Chrome browser may not be available". Lines up with the Linux-reporter context where the most likely failure mode is "no Chrome installed" rather than "Chrome failed mid-render." ✓
Body claim verification
- "Wrap entire click handler in try-catch" ✓
- "30s fetch timeout (AbortController) and 5s save-queue drain timeout" ✓
- "Fix CLI server seek logic — used
__timeline(singular, doesn't exist) instead of__timelines(plural)" ✓ verified via diff - "Aligned with the Vite adapter's working implementation" —
__timelines+pause(t)matches what we've been reviewing on the studio path - "Add server-side error logging" ✓
- "Parse JSON error body" ✓
- "Could not reproduce the original bug on macOS" — honest framing; the Linux-specific path (likely Chrome-not-installed) gets the actionable error message, and the seek fix is verified-by-inspection independent of the reporter's specific failure mode
Non-blocking notes
-
Save-queue-timeout semantics — the
Promise.race(saves, timeoutReject)rejects on timeout, which flows to the outer catch → toast → capture aborts. The body's test plan reads "capture proceeds after save completes (or after 5s timeout)" — the second clause is ambiguous but the implementation aborts on timeout, not "proceeds without saved state." Worth a body clarification: "capture aborts with a toast after 5s if saves don't complete." -
CLI server's seek implementation now manually kept in sync with the Vite adapter — same shape as the canonical
vite.adapter.ts+studioServer.tsduplicate-implementation problem we've seen on prior PRs. If the Vite adapter's seek pattern changes again (differentpausearg, different ticker call), the CLI server will silently re-drift. A shared helper (e.g.,executeThumbnailSeek(page, time)in core) would prevent the next divergence. Follow-up scope. -
for (...of Object.values(w.__timelines))pauses ALL timelines uniformly — same call to every registered timeline. For multi-composition projects where only the active comp's timeline should be seeked, this could leave non-active timelines in unexpected states. In practice the screenshot only renders what's visible, so the side-effect is invisible. Worth a comment noting "we pause all timelines because GSAP's tick advances all of them; non-target timelines' pause state is irrelevant for the screenshot." -
await page.evaluate("document.fonts?.ready")as a string vs function — works for simple expressions but function form (await page.evaluate(() => document.fonts?.ready)) is type-safer. Cosmetic.
CI
Test, Typecheck, Lint, Build, Format, CodeQL, Test: runtime contract, CLI smoke (required), Smoke: global install, Preview parity, preview-regression, player-perf all green on this commit. Tests/Render on windows-latest + regression-shards in-progress (older runs cancelled when newer commits triggered re-runs). No failures.
— Review by Rames Jusso (pr-review)
vanceingalls
left a comment
There was a problem hiding this comment.
Two real bug-fixes in one PR (capture-button error swallowing + CLI thumbnail seek). Both fixes are correct; flagging contract-drift and a sibling site that still has the seek bug as important follow-ups, not blockers.
Strengths
useFrameCapture.ts:51-69—AbortController+ 30s timeout + JSON-error-body parse is the right shape. The toast now surfaces the server's actual message instead ofCapture failed (500). Clean.useFrameCapture.ts:37-42— boundingwaitForPendingDomEditSaves()withPromise.raceagainst a 5s reject prevents the indefinite-hang failure mode the issue report describes.studio-api/routes/thumbnail.ts:99-105— the 500 body now tells the operator why (Chrome unavailable). Paired withstudioServer.ts:158-163console.warn, on-call has a real signal.
Important (should fix, not blocker — fine to land first and follow up)
-
packages/cli/src/commands/snapshot.ts:240-256has the same bug this PR is fixing instudioServer.ts, and the PR doesn't touch it. The shape there istls[key].pause(); tls[key].seek(t);— same.seek()-instead-of-.pause(t)pattern, against the same__timelinesregistry. Anyone runninghyperframes snapshotagainst a composition without__playerregistered will hit identical broken-seek behavior. PR body says "aligned with the Vite adapter's working implementation" — that's true for the studio server, but the snapshot CLI command is the sibling site that also satisfies the same contract. -
Three-way contract drift.
packages/studio/vite.thumbnail.ts:1-37already exportsseekThumbnailPreview()— a shared, tested helper (vite.thumbnail.test.ts:1-58covers both the__playerand__timelinespaths). The CLI now duplicates that logic inline atstudioServer.ts:327-340rather than importing it. With this PR landed, three independent implementations of the same seek protocol exist:vite.thumbnail.ts,studioServer.ts(new inline copy), andsnapshot.ts. When the player/timeline contract evolves, two of three will silently drift — which is the exact bug class this PR is fixing. Suggest a follow-up that lifts the helper to a shared location (e.g.@hyperframes/core/studio-api) and has all three sites import it. -
No new test coverage. The PR adds error-handling on the most-visible UI button and rewrites a Puppeteer evaluate block — neither has a test.
studioServer.test.tsonly coversloadRuntimeSource.useFrameCapture.tshas no test file. At minimum, a unit test on the click handler asserting the toast fires on!response.okand on AbortError would catch a future regression of this exact issue. The Vite-sideseekThumbnailPreview()has unit tests; the CLI-side copy doesn't.
Nits
studioServer.ts:323uses(window as any).__timelinesin thewaitForFunctionpredicate but the very next evaluate (:328) uses a properly typedWindow &shape. Worth aligning.studioServer.ts:343—await page.evaluate("document.fonts?.ready")(string form) is the deprecated puppeteer API. The function form is preferred and matches the surrounding evaluates.- PR title/body says "broken CLI seek on linux" but nothing in the diff is Linux-gated. The bug reproduces anywhere
__playerisn't registered (every standalone composition page). The Linux framing comes from the bug reporter's environment, not from the fix shape — would clarify the body so future readers don't grep for aprocess.platformgate.
Verdict: APPROVE
Reasoning: The fixes are correct, CI is green on required checks, and the silent-failure path is materially improved. The snapshot-sibling and helper-extraction are real follow-ups but not gating this PR.
Review by Vai
jrusso1020
left a comment
There was a problem hiding this comment.
Self-correction acknowledging Vai's three importants — the snapshot.ts miss is the real one and exactly the Rule-2 pattern I have a memory about.
What I missed
snapshot.ts:240-256 has the same buggy tls[key].pause(); tls[key].seek(t); pattern
This is the textbook "grep every implementation on contract widening" case from my own memory. I even flagged the duplicate-implementation concern in my non-blocking notes:
"CLI server's seek implementation now manually kept in sync with the Vite adapter — same shape as the canonical vite.adapter.ts + studioServer.ts duplicate-implementation problem we've seen on prior PRs."
But I only enumerated TWO sites (vite adapter, studioServer). Vai grep'd the codebase and found snapshot.ts — a third site running the same buggy pattern. The snapshot subcommand will reproduce the same silent failure this PR fixes for the thumbnail path.
The trigger I had was the PR body's own framing: "Aligned with the Vite adapter's working implementation." That's the exact signal that says "this is a contract being mirrored across multiple consumers — find them all." I noted it as a future-proofing concern but didn't actually grep. Vai did.
Important 2 (DRY — three independent seek impls) — convergent with my "shared helper" note but Vai is right that the fix scope should include extraction now, not as a "follow-up." With three sites diverging, the maintenance debt has compounded enough that the next regression bisect becomes painful. A seekThumbnailPreview(page, time) helper in core (shared between vite.thumbnail.ts, studioServer.ts, snapshot.ts) is the right shape and would have prevented this PR from being only a partial fix.
Important 3 (zero new tests) — I didn't flag this
The error-handling rewrite in useFrameCapture.ts is precisely the kind of code that needs unit tests:
- AbortController fires → translates to "Capture timed out" toast
- Save-queue race rejects → outer catch fires toast with the right message
- JSON error body parses → toast uses server message
- Non-JSON error body → fallback message used
The test plan checkboxes are all manual. None of those behaviors are pinned in CI. Worth at least covering the "AbortError + clearTimeout cleanup" path since that's the most subtle.
Nit (Linux framing in body) — I accepted it without questioning
I noted: "Miguel couldn't repro on macOS — the seek fix is verified by code inspection, the error-surfacing improvements help diagnose whatever Linux is actually doing."
Vai's correction: nothing in the diff is process.platform-gated. The bug repros anywhere __player isn't registered — browser tab not focused, iframe not yet loaded, CDP context teardown. The Linux framing was a red herring driven by the reporter's environment, not the actual bug surface. I should have read the diff more critically rather than accepting the body's narrative. Worth a body edit so future bisects don't chase a Linux-specific cause that isn't there.
Calibration
Three real misses, weighted heavily by the snapshot.ts gap. My memory pattern is right, my application of it was incomplete — I noted the duplicate concern but didn't grep. Treating my prior APPROVE as informational alongside Vai's APPROVE-with-important-followups. Happy to re-approve once snapshot.ts is patched and (ideally) the shared helper is extracted.
Lesson reinforced: when a PR body says "Aligned with [other implementation]", that's the exact trigger to git grep for ALL implementations of the contract — not just the two named ones — before approving.
— Self-correction by Rames Jusso (pr-review)
- Fix snapshot.ts seek logic: same __timeline→__timelines + .pause(t) + gsap ticker kick fix as studioServer.ts (caught by Vai's review) - Use typed Window shape in waitForFunction instead of (window as any) - Use function-form page.evaluate for document.fonts?.ready
Root cause: on Linux, acquireBrowser defaults to beginframe mode (--enable-begin-frame-control) which makes page.screenshot() hang indefinitely — beginframe mode expects CDP HeadlessExperimental.beginFrame commands, not Puppeteer's Page.captureScreenshot. Pass forceScreenshot: true and captureMode: "screenshot" so the thumbnail browser always uses screenshot-compatible Chrome flags. Reproduced on Linux devbox: thumbnail endpoint hung >30s with beginframe flags; returns a valid PNG instantly in screenshot mode.
Summary
Fixes #902 — the Studio Capture button silently hung on Linux because the CLI server's thumbnail browser was launched in beginframe mode.
Root cause: On Linux,
acquireBrowser()defaults to beginframe mode (--enable-begin-frame-control) which makespage.screenshot()hang indefinitely — beginframe mode expects CDPHeadlessExperimental.beginFramecommands, not Puppeteer'sPage.captureScreenshot. The thumbnail code usespage.screenshot(), so it deadlocked.Fix: Pass
forceScreenshot: truetoacquireBrowserso the thumbnail browser always uses screenshot-compatible Chrome flags.Reproduced and verified on a Linux devbox (Ubuntu x64, Chrome headless shell 131): thumbnail endpoint hung >30s before the fix, returns a valid 607KB PNG in <1s after.
Additional fixes in this PR
waitForPendingDomEditSaves()or URL construction were unhandled promise rejections with no toast__timeline(singular, doesn't exist) instead of__timelines(plural); also used.seek()instead of.pause(t)and skipped the GSAP ticker kick. Aligned with the Vite adapter's working implementation. Applied to bothstudioServer.tsandsnapshot.ts.getThumbnailBrowserandgenerateThumbnailcatch blocks were silent; Chrome launch failures and screenshot errors are now loggedTest plan
hyperframes snapshotproduces correct frame captures