324 changes: 324 additions & 0 deletions skills/montage/SKILL.md
@@ -0,0 +1,324 @@
---
name: montage
description: >
  Author a beat-synced HyperFrames montage from a folder of photos/videos and a
  music track: cut each segment on the music's beats and duck the music under
  any clip that carries its own audio. Use when the user has (or wants) a music
  track and a set of image/video assets and asks for a "montage", "music video",
  "photo montage", "beat-synced" / "cut to the beat" / "rhythmic" edit, or a
  slideshow driven by the music. Closes the loop from the `hyperframes beats`
  CLI output to a finished composition. Docs-only recipe: no engine/runtime
  changes; it builds entirely on the shipped `beats` CLI and the core
  media/volume contract.
metadata:
  { "tags": "montage, beats, beat-synced, music-video, photo-montage, volume-ducking, crossfade" }
---

# montage — beat-synced montage authoring recipe

A **beat-synced montage** cuts one segment per asset on a steady beat cadence, so picture changes land _on_ the music, and ducks the background music under any clip that brings its own sound. This skill closes the loop that `hyperframes beats` (detection) and Studio beat-snapping (manual) leave open: given a folder of assets and `beats/<audio>.json`, place the cuts on beats and duck the music under each video clip.

**Read `/hyperframes-core` first** for the base composition contract (clips, tracks, `data-*` attributes, the seekable timeline). This recipe only covers what is specific to beat-driven montage: consuming the beats file, segment placement, crossfade exits, and music ducking. It introduces **no new commands** and touches **no engine/runtime code**.

## When to use this skill

Use it when the user has a music track and a set of assets and asks for any of:

- a "montage", "photo montage", "music video", or "highlight reel";
- a "beat-synced", "cut to the beat", or "rhythmic" edit;
- a slideshow/recap whose pacing should follow the music.

Do **not** use it for narrative voiceover pieces (the cadence is the speech, not the music) — that stays in `/general-video` / `/product-launch-video`. If there is no music track at all, there is nothing to sync to; ask first.

## Prerequisites

1. **One music track**, mounted as the music `<audio>`: `data-timeline-role="music"` (the `beats` CLI keys off this — or an id matching `music` / `bgm` / `soundtrack`). This is the track the montage cuts against.
2. **Assets on disk** — photos (`<img>`) and/or video clips. Video clips that carry audible sound get a sibling `<audio>` per the core media contract (see Step 4).

You need the music file present before detecting beats; the assets can be wired in either before or after.

## The loop

### Step 1 — Detect beats, read `beats/<audio>.json`

Run the shipped CLI from the project root. It spawns headless Chrome, so run `npx hyperframes browser ensure` first if no browser is on disk:

```bash
npx hyperframes beats # analyzes the music track in ./index.html
npx hyperframes beats --json # same, but prints { ok, file, count, bpm } as JSON
```

The command finds the music `<audio>` (first element with `data-timeline-role="music"`, else the first id matching `music|bgm|soundtrack`), detects beats, and writes:

```
beats/<audio>.json
```

The file is versioned and self-describing — **one `{ time, strength }` per beat**:

```jsonc
{
  "version": 1,
  "audio": "assets/bgm/track.mp3",
  "beats": [
    { "time": 0.523, "strength": 0.62 },
    { "time": 1.014, "strength": 0.81 },
    { "time": 1.505, "strength": 0.43 },
    // …
  ],
}
```

Read it at build time (the composition must stay deterministic — read the file **offline**, bake the chosen times into the HTML/timeline; never fetch it at render time):

```js
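// build-time helper (runs once when authoring, NOT in the composition)
const fs = require("node:fs");

// if unsure of the path, check the `file` field printed by `hyperframes beats --json`
const { beats } = JSON.parse(fs.readFileSync("beats/assets/bgm/track.mp3.json", "utf8"));
// beats: Array<{ time: number, strength: number }>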
```

> **A 0-beat result is never baked.** The CLI refuses to _write_ a 0-beat file (a silent/ambient track yields nothing), so if `beats/<audio>.json` is missing after a run, re-run detection or pick a different track.
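
Before baking any times, it can help to sanity-check the parsed file. The sketch below is a hypothetical helper (`parseBeats` is not part of the CLI). It assumes the file on disk is plain JSON in the shape shown above and that `version` is `1`. It also assumes beat times are strictly increasing, which is an assumption here rather than a documented guarantee of the format.

```js
// Hypothetical build-time helper: validates the beats document shape shown above.
// Assumptions: version 1 only, and beat times strictly increasing.
function parseBeats(jsonText) {
  const doc = JSON.parse(jsonText);
  if (doc.version !== 1) {
    throw new Error(`unsupported beats version: ${doc.version}`);
  }
  if (!Array.isArray(doc.beats) || doc.beats.length === 0) {
    throw new Error("no beats found; re-run detection or pick another track");
  }
  doc.beats.forEach((beat, i) => {
    if (!Number.isFinite(beat.time) || !Number.isFinite(beat.strength)) {
      throw new Error(`beat ${i} is missing a numeric time or strength`);
    }
    if (i > 0 && beat.time <= doc.beats[i - 1].time) {
      throw new Error(`beat ${i} is not later than beat ${i - 1}`);
    }
  });
  return doc.beats;
}
```

Failing loudly at authoring time keeps a malformed or stale beats file from silently producing zero-length or overlapping segments later.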

### Step 2 — Place one segment per asset on a beat cadence

Pick a cadence — **one segment every Nth beat** — so segment boundaries land exactly on beats. `N` sets the pace:

| Cadence (every Nth beat) | Vibe                | Typical segment length  |
| ------------------------ | ------------------- | ----------------------- |
| every 1 beat             | frenetic / intro    | ~0.5s each at 120 BPM   |
| every 2 beats            | energetic default   | ~1s each at 120 BPM     |
| every 4 beats            | relaxed / cinematic | ~2s each at 120 BPM     |
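
The segment lengths in the table follow from simple tempo arithmetic: one beat lasts `60 / BPM` seconds, so a segment spanning `N` beats lasts `N * 60 / BPM` seconds. A minimal sketch, assuming a steady tempo (`segmentSeconds` is a hypothetical name, not a HyperFrames API):

```js
// Seconds per segment when cutting on every Nth beat of a steady tempo.
function segmentSeconds(bpm, n) {
  if (!(bpm > 0) || !Number.isInteger(n) || n < 1) {
    throw new RangeError("bpm must be positive and n a positive integer");
  }
  return (60 / bpm) * n;
}

console.log(segmentSeconds(120, 2)); // 1
console.log(segmentSeconds(120, 4)); // 2
```

Real tracks drift, so treat this as a quick way to choose `N`; the baked times should still come from the detected beats, not from the formula.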

Then for asset `i`, assign the segment `[beats[i * N].time, beats[(i + 1) * N].time]`. Use `strength` to bias _which_ beats you cut on (cut on the strongest beats), to drop a segment on a long musical rest, or to land a hero asset on the strongest beat in the track.

```js
// build-time: turn beats + assets into concrete clip times
const N = 2; // every 2nd beat
const round3 = (seconds) => Number(seconds.toFixed(3));

const slots = assets
  .map((asset, i) => {
    const startBeat = beats[i * N];
    const endBeat = beats[(i + 1) * N];
    if (!startBeat || !endBeat) return null; // ran out of beats → slot is dropped below
    // round the boundaries first so start + duration agrees with the next slot's start
    const start = round3(startBeat.time);
    const end = round3(endBeat.time);
    return { asset, start, duration: round3(end - start) };
  })
  .filter(Boolean);

// guard the empty case (too few beats, or no assets)
const last = slots[slots.length - 1];
const totalDuration = last ? round3(last.start + last.duration) : 0;
```
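
One way to apply the `strength` bias while keeping a steady cadence is to choose which beat the cadence starts on. The sketch below is a hypothetical helper (`bestPhase` is not from HyperFrames). It tries each offset `0..N-1` and returns the one whose cut beats have the highest mean strength; other weightings are equally reasonable.

```js
// Returns the starting offset (0..n-1) whose every-nth beats are strongest on average.
// Assumes beats is Array<{ time, strength }>; returns 0 for an empty array.
function bestPhase(beats, n) {
  let best = { phase: 0, mean: -Infinity };
  for (let phase = 0; phase < n && phase < beats.length; phase++) {
    let sum = 0;
    let count = 0;
    for (let i = phase; i < beats.length; i += n) {
      sum += beats[i].strength;
      count++;
    }
    const mean = sum / count;
    if (mean > best.mean) best = { phase, mean };
  }
  return best.phase;
}
```

To use it, drop the leading beats before building slots, for example `beats.slice(bestPhase(beats, N))`, and keep in mind that the first segment then starts later than `beats[0]`.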

Each slot becomes a `.clip` with those exact `data-start` / `data-duration`, all inside one composition whose `data-duration` is `totalDuration`. Consecutive `data-start` values are the **clip boundaries** the crossfade rule in Step 3 keys off.
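
Baking a photo slot into markup could look like the sketch below. `renderPhotoClip` and `escapeAttr` are hypothetical build-time helpers. The attribute set mirrors the worked example in this document, and the track-index numbering (`index + 1`) is only an illustrative choice.

```js
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Render one photo slot ({ asset, start, duration }) as a .clip section.
// Assumes slot.asset is a path string relative to the project root.
function renderPhotoClip(slot, index) {
  const id = `seg-${index + 1}`;
  return [
    `<section class="clip" id="${id}" data-start="${slot.start}" data-duration="${slot.duration}" data-track-index="${index + 1}" style="position: absolute; inset: 0">`,
    `  <img id="${id}-img" src="${escapeAttr(slot.asset)}" alt="" style="width: 100%; height: 100%; object-fit: cover" />`,
    `</section>`,
  ].join("\n");
}
```

Generating the markup from the same `slots` array that produced the times keeps `data-start` values and the timeline's boundary positions from drifting apart.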

### Step 3 — Crossfade segments (with the required hard kill)

Between segments, crossfade the outgoing segment's content to `opacity: 0` ending _at_ the next segment's start boundary, then **immediately hard-kill** it with a `tl.set(..., { opacity: 0 }, boundary)` at that same time.

This pair is **required**, not stylistic. The `gsap_exit_missing_hard_kill` lint rule fires whenever a GSAP exit tween (a `.to`/`.fromTo` ending in a hidden state — `opacity: 0`, `autoAlpha: 0`, `visibility: "hidden"`, or `display: "none"`) lands on a clip start boundary without a matching `tl.set`. Non-linear seeking can otherwise stop _after_ the fade and leave stale, half-visible state from the previous segment bleeding through the cut.

```js
// segment A ends at boundary B (the next segment's data-start)
tl.to("#seg-a", { opacity: 0, duration: 0.4, ease: "power2.out" }, B - 0.4);
tl.set("#seg-a", { opacity: 0 }, B); // ← the required hard kill
```

Rules of thumb:

- Animate **`opacity`** (or `autoAlpha`) for crossfades — never `visibility`/`display` on a `.clip` element (that trips `gsap_animates_clip_element`; the framework owns clip visibility).
- The hard kill must use the **same hidden state** the exit tween lands in and sit at the **same time** as the boundary (within 0.05s).
- The **final** segment has no successor boundary, so its closing fade needs no hard kill.
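
The rules above can be applied mechanically to every segment. The following is a sketch, not the framework's own helper. `addExitFades` is a hypothetical name, each segment is assumed to carry a `selector`, `start`, and `duration`, and `tl` is assumed to be a GSAP timeline (only `tl.to` and `tl.set` with a position argument are used).

```js
// Adds a fade-out per segment, plus a hard kill on every boundary that has a successor.
function addExitFades(tl, segments, fade = 0.4) {
  segments.forEach((seg, i) => {
    const end = Number((seg.start + seg.duration).toFixed(3));
    const fadeStart = Number((end - fade).toFixed(3));
    const isLast = i === segments.length - 1;
    tl.to(seg.selector, { opacity: 0, duration: fade, ease: "power2.out" }, fadeStart);
    if (!isLast) {
      // hard kill: same hidden state, same time as the next clip's data-start
      tl.set(seg.selector, { opacity: 0 }, end);
    }
  });
  return tl;
}
```

Because the tween and its `tl.set` are emitted together, a fade cannot be added without its hard kill, which is the pairing the lint rule checks for.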

### Step 4 — Duck `#bgm` under each video clip, lift the clip's own `<audio>`

When a segment is a **video clip with its own sound**, duck the music under it so the clip's audio is audible, then lift the music back when the clip ends. Per the core media/volume contract, automate `volume` **on the timeline** (the runtime probes the timeline's volume keyframes and applies them identically in preview and render); `data-volume` is only the static baseline for elements no tween touches.

```js
// clip spans [clipStart, clipEnd]; music baseline is data-volume="0.9"
tl.to("#bgm", { volume: 0.15, duration: 0.3, ease: "power1.inOut" }, clipStart); // duck
tl.to("#bgm", { volume: 0.9, duration: 0.3, ease: "power1.inOut" }, clipEnd - 0.3); // lift
```

Wire the clip's sound the standard way: a muted `<video>` for the picture plus a sibling `<audio>` (same `src`, same timing) carrying the sound at `data-volume="1"`. Photo segments and silent B-roll duck nothing — leave the music at its baseline.

```html
<!-- video clip WITH sound: muted <video> for picture + sibling <audio> for sound -->
<video
  class="clip"
  id="seg-2-video"
  src="assets/clip-2.mp4"
  muted
  playsinline
  data-start="3"
  data-duration="3"
  data-track-index="2"
></video>
<audio
  id="seg-2-audio"
  src="assets/clip-2.mp4"
  data-start="3"
  data-duration="3"
  data-track-index="3"
  data-volume="1"
></audio>
```
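
With several sound-carrying clips, the duck/lift pair can be generated per clip. This is a minimal sketch under stated assumptions. `addDucking` is a hypothetical helper, `#bgm` is the music element's id as used in this recipe, and the defaults (`0.9`, `0.15`, `0.3`) are the values from the snippet above rather than required settings.

```js
// Adds a duck at each clip's start and a lift that finishes at its end.
// soundClips: Array<{ start, duration }> for video clips that carry their own audio.
function addDucking(tl, soundClips, { baseline = 0.9, ducked = 0.15, ramp = 0.3 } = {}) {
  for (const clip of soundClips) {
    if (clip.duration < ramp * 2) {
      throw new RangeError(`clip at ${clip.start}s is too short for two ${ramp}s ramps`);
    }
    const liftStart = Number((clip.start + clip.duration - ramp).toFixed(3));
    tl.to("#bgm", { volume: ducked, duration: ramp, ease: "power1.inOut" }, clip.start);
    tl.to("#bgm", { volume: baseline, duration: ramp, ease: "power1.inOut" }, liftStart);
  }
  return tl;
}
```

The length check is a design choice: if a clip is shorter than both ramps combined, the duck and lift tweens would overlap, so the sketch refuses rather than guessing a shorter ramp.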

## Worked example

A 9-second, 3-segment montage over a music bed. Beats land every ~1.5s, so every-2nd-beat gives 3s segments at boundaries **3.0s** and **6.0s**. Segment 2 is a video clip with its own audio, so the music ducks under it. This file passes `npx hyperframes lint` with 0 errors once the referenced asset files exist (the `assets/*` paths are placeholders — drop your own media in; lint treats missing local media as errors, not warnings).

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Beat-synced montage</title>
  </head>
  <body style="margin: 0">
    <div
      data-composition-id="montage"
      data-start="0"
      data-duration="9"
      data-width="1920"
      data-height="1080"
      style="position: relative; width: 1920px; height: 1080px; overflow: hidden; background: #0a0a0a"
    >
      <!-- Music bed: a direct child of the root composition (the media contract:
           media is never nested in a sub-comp/wrapper). data-timeline-role="music"
           is what `hyperframes beats` targets; data-volume="0.9" is the static
           baseline the timeline automates. -->
      <audio
        id="bgm"
        src="assets/bgm/track.mp3"
        data-start="0"
        data-duration="9"
        data-track-index="0"
        data-volume="0.9"
        data-timeline-role="music"
      ></audio>

      <!-- Segment 1: photo, beats[0] → beats[2] (0.0s → 3.0s) -->
      <section
        class="clip"
        id="seg-1"
        data-start="0"
        data-duration="3"
        data-track-index="1"
        style="position: absolute; inset: 0"
      >
        <img
          id="seg-1-img"
          src="assets/clip-1.jpg"
          alt=""
          style="width: 100%; height: 100%; object-fit: cover"
        />
        <h2
          id="seg-1-cap"
          style="position: absolute; left: 80px; bottom: 80px; margin: 0; font-family: sans-serif; font-size: 64px; color: #fff"
        >
          Sunrise
        </h2>
      </section>

      <!-- Segment 2: video clip WITH its own audio, beats[2] → beats[4] (3.0s → 6.0s) -->
      <video
        class="clip"
        id="seg-2-video"
        src="assets/clip-2.mp4"
        muted
        playsinline
        data-start="3"
        data-duration="3"
        data-track-index="2"
        style="position: absolute; inset: 0; width: 100%; height: 100%; object-fit: cover"
      ></video>
      <audio
        id="seg-2-audio"
        src="assets/clip-2.mp4"
        data-start="3"
        data-duration="3"
        data-track-index="3"
        data-volume="1"
      ></audio>

      <!-- Segment 3: photo, beats[4] → beats[6] (6.0s → 9.0s) -->
      <section
        class="clip"
        id="seg-3"
        data-start="6"
        data-duration="3"
        data-track-index="4"
        style="position: absolute; inset: 0"
      >
        <img
          id="seg-3-img"
          src="assets/clip-3.jpg"
          alt=""
          style="width: 100%; height: 100%; object-fit: cover"
        />
        <h2
          id="seg-3-cap"
          style="position: absolute; left: 80px; bottom: 80px; margin: 0; font-family: sans-serif; font-size: 64px; color: #fff"
        >
          Nightfall
        </h2>
      </section>
    </div>

    <script src="https://cdn.jsdelivr.net/npm/gsap@3/dist/gsap.min.js"></script>
    <script>
      window.__timelines = window.__timelines || {};
      const tl = gsap.timeline({ paused: true });
      window.__timelines["montage"] = tl;

      // Segment starts land on every 2nd beat: 0.0, 3.0, 6.0. Each outgoing
      // segment crossfades to opacity:0 at the next boundary (3.0, 6.0), so each
      // fade-out gets a matching tl.set "hard kill" at that boundary (the
      // gsap_exit_missing_hard_kill rule). The final segment's closing fade has no
      // successor boundary, so it needs no hard kill.

      // Segment 1 out → boundary 3.0
      tl.to("#seg-1", { opacity: 0, duration: 0.4, ease: "power2.out" }, 2.6);
      tl.set("#seg-1", { opacity: 0 }, 3.0);

      // Segment 2 out → boundary 6.0
      tl.to("#seg-2-video", { opacity: 0, duration: 0.4, ease: "power2.out" }, 5.6);
      tl.set("#seg-2-video", { opacity: 0 }, 6.0);

      // Final segment fades out at the end (no boundary → no hard kill).
      tl.to("#seg-3", { opacity: 0, duration: 0.4, ease: "power2.out" }, 8.6);

      // Duck #bgm under segment 2's video clip (it carries its own audio), then
      // lift back to the data-volume="0.9" baseline. The runtime probes these
      // volume keyframes and applies them identically in preview and render.
      tl.to("#bgm", { volume: 0.15, duration: 0.3, ease: "power1.inOut" }, 3.0);
      tl.to("#bgm", { volume: 0.9, duration: 0.3, ease: "power1.inOut" }, 5.7);
    </script>
  </body>
</html>
```

### What the example demonstrates

- **`data-timeline-role="music"`** on `#bgm` is the hook the `beats` CLI detects against — and `#bgm` is the duck target.
- **Segment boundaries (3.0, 6.0)** are the next clips' `data-start` values; the crossfade fade-outs end on them and each is followed by a `tl.set(..., { opacity: 0 }, boundary)` hard kill.
- **`opacity`** is animated for crossfades — never `visibility`/`display` on a `.clip` (the framework owns clip visibility).
- **`volume` is automated on the timeline**, not by swapping `data-volume`; `data-volume="0.9"` stays the static baseline the lift returns to.
- **The video clip's sound** rides a sibling `<audio>` (muted `<video>` for picture, `<audio>` for sound) — the core media contract.

## Validation

After authoring or editing a montage composition, run:

```bash
npx hyperframes lint # must report 0 error(s) — block on the hard-kill + clip-visibility rules
npx hyperframes validate # headless runtime check — catches missing assets / JS errors
```

Treat both as blockers. The two rules most likely to fire while iterating this recipe:

- `gsap_exit_missing_hard_kill` — a crossfade ends on a clip boundary without the matching `tl.set`. Add the hard kill (Step 3).
- `gsap_animates_clip_element` — you animated `visibility`/`display` on a `.clip`. Move that content into a child `<div>` and target that, or use `opacity`.

When the cut feels off, the fix is almost always the cadence (`N`) or which beats you cut on (favor `strength`) — not the animation. Re-read the beats file and rebake the segment times.
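
When weighing a different cadence, a quick comparison of what each `N` would produce can guide the rebake. The sketch below is a hypothetical authoring aid (`cadenceReport` is not a HyperFrames command). It assumes the same slot rule as Step 2, where slot `i` needs both `beats[i * N]` and `beats[(i + 1) * N]`.

```js
// For each candidate cadence, report how many segments fit and how long they run.
function cadenceReport(beats, assetCount, cadences = [1, 2, 4]) {
  return cadences.map((n) => {
    // slot i needs beats[(i + 1) * n], so at most floor((len - 1) / n) slots fit
    const fit = Math.max(0, Math.floor((beats.length - 1) / n));
    const segments = Math.min(assetCount, fit);
    const totalSeconds = segments > 0 ? beats[segments * n].time - beats[0].time : 0;
    const averageSeconds = segments > 0 ? totalSeconds / segments : 0;
    return { n, segments, totalSeconds, averageSeconds };
  });
}
```

A row whose `segments` is lower than the asset count means the track runs out of beats at that cadence, so some assets would be dropped unless `N` is reduced or the track is longer.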