perf(producer): hdr benchmark harness — --tags filter, peak heap/RSS tracking, bench:hdr script #382
Merged
vanceingalls merged 1 commit into main on Apr 23, 2026
…tracking, bench:hdr script

Makes the existing benchmark harness genuinely useful for HDR perf work before landing image-cache and debug-logging optimizations in the rest of Chunk 8. Three tightly-related changes:

1. **Positive --tags filter** in `benchmark.ts`. Existing harness only had `--exclude-tags` (which defaults to `slow`). Adds `--tags hdr` so HDR runs don't have to wait for unrelated SDR fixtures. Filters compose: a fixture must match `--tags` (if provided) AND must not match `--exclude-tags`.

2. **Peak heap + RSS tracking** in `executeRenderJob`. A 250ms periodic `process.memoryUsage()` sampler runs alongside every render and reports `peakRssMb` / `peakHeapUsedMb` in `RenderPerfSummary`. Wall-clock alone can't catch slow memory regressions like an unbounded image cache; peak RSS does. Sampler is `unref`'d and always cleared in `finally` so it never keeps the event loop alive or leaks across jobs. Both fields are optional on the interface for back-compat with serialized older summaries.

3. **bench:hdr convenience script** plus a perf README at `tests/perf/README.md` documenting the harness, the new flags, and the captured April-2026 HDR baseline (PQ regression: 34.5s / 272 MiB RSS, HLG regression: 11.5s / 227 MiB RSS, both 1080p / 1 worker / 1 run).

The benchmark output table is widened and gains PeakRSS / PeakHeap columns. A new `avgOrNull` helper preserves `null` in the JSON when no run reported memory (avoids silently coercing missing data to 0 in older snapshots).

No behavior change for non-benchmark renders: the sampler runs in every `executeRenderJob`, but its overhead is a single `process.memoryUsage()` call every 250ms, well below noise.

Verification:
- `bunx tsc --noEmit -p packages/producer`: clean
- `bunx oxlint` / `bunx oxfmt --check` on changed files: clean
- `bun test src/services/`: 60/60 pass (frameDirCache, orchestrator, etc.)
- `bunx tsx src/benchmark.ts --tags hdr --runs 1`: both HDR fixtures render successfully, summary table prints PeakRSS/PeakHeap columns, per-run output shows new memory line.
- `bunx tsx src/benchmark.ts --tags nonexistent`: exits 1 with a helpful message naming the active filters.

Refs: plans/hdr-followups.md Chunk 8A.

Summary

Make the existing benchmark harness genuinely useful for HDR perf work: a positive `--tags` filter, peak heap/RSS sampling, a `bench:hdr` script, and a perf README documenting the captured April-2026 baseline. Lands first in the Chunk 8 sub-stack so subsequent perf PRs can be measured against a known starting point.

Why

Chunk 8A of `plans/hdr-followups.md`. Wall-clock timing alone can't catch slow memory regressions like an unbounded image cache; peak RSS does. And the existing harness only had `--exclude-tags`, so HDR runs had to wait for unrelated SDR fixtures.

What changed

1. Positive `--tags` filter in `benchmark.ts`. Adds `--tags hdr` so HDR runs don't have to wait for unrelated fixtures. Filters compose: a fixture must match `--tags` (if provided) AND must not match `--exclude-tags`.

2. Peak heap + RSS tracking in `executeRenderJob`. A 250 ms periodic `process.memoryUsage()` sampler runs alongside every render and reports `peakRssMb` / `peakHeapUsedMb` in `RenderPerfSummary`. Sampler is `unref`'d and always cleared in `finally` so it never keeps the event loop alive or leaks across jobs. Both fields are optional on the interface for back-compat with serialized older summaries.

3. `bench:hdr` convenience script plus a perf README at `tests/perf/README.md` documenting the harness, the new flags, and the captured April-2026 HDR baseline (PQ regression: 34.5 s / 272 MiB RSS, HLG regression: 11.5 s / 227 MiB RSS, both 1080p / 1 worker / 1 run).

The benchmark output table is widened and gains `PeakRSS` / `PeakHeap` columns. A new `avgOrNull` helper preserves `null` in the JSON when no run reported memory (avoids silently coercing missing data to 0 in older snapshots).

No behavior change for non-benchmark renders: the sampler runs in every `executeRenderJob`, but its overhead is a single `process.memoryUsage()` call every 250 ms, well below noise.

Test plan

- `bunx tsc --noEmit -p packages/producer`: clean.
- `bunx oxlint` / `bunx oxfmt --check` on changed files: clean.
- `bun test src/services/`: 60/60 pass (frameDirCache, orchestrator, etc.).
- `bunx tsx src/benchmark.ts --tags hdr --runs 1`: both HDR fixtures render successfully, summary table prints `PeakRSS` / `PeakHeap` columns, per-run output shows new memory line.
- `bunx tsx src/benchmark.ts --tags nonexistent`: exits 1 with a helpful message naming the active filters.

Stack

Chunk 8A of `plans/hdr-followups.md`. First PR in the Chunk 8 perf sub-stack; subsequent PRs (image cache, logger gating) measured against this baseline.
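
For readers who want the filter composition rule spelled out, here is a minimal sketch of "must match `--tags` (if provided) AND must not match `--exclude-tags`". It is not the code from `benchmark.ts`: the `Fixture` and `TagFilters` shapes, the `selectFixtures` name, and the fixture names are all hypothetical, and it assumes "match" means sharing at least one tag.

```typescript
// Hypothetical shapes for illustration; the real harness types may differ.
interface Fixture {
  name: string;
  tags: string[];
}

interface TagFilters {
  tags?: string[]; // from --tags; absent or empty means "no positive filter"
  excludeTags: string[]; // from --exclude-tags
}

// Keep a fixture only if it passes the positive filter (when one is given)
// and carries none of the excluded tags.
// Assumption: a fixture "matches" a filter when it shares at least one tag.
function selectFixtures(fixtures: Fixture[], filters: TagFilters): Fixture[] {
  const wanted = filters.tags ?? [];
  const unwanted = filters.excludeTags;
  return fixtures.filter((fixture) => {
    const included =
      wanted.length === 0 || fixture.tags.some((tag) => wanted.includes(tag));
    const excluded = fixture.tags.some((tag) => unwanted.includes(tag));
    return included && !excluded;
  });
}

// Made-up fixtures, not the ones in the repository.
const sampleFixtures: Fixture[] = [
  { name: "hdr-pq", tags: ["hdr"] },
  { name: "hdr-hlg", tags: ["hdr"] },
  { name: "sdr-basic", tags: ["sdr"] },
  { name: "hdr-long", tags: ["hdr", "slow"] },
];

const selectedNames = selectFixtures(sampleFixtures, {
  tags: ["hdr"],
  excludeTags: ["slow"],
}).map((fixture) => fixture.name);
// selectedNames is ["hdr-pq", "hdr-hlg"]: "hdr-long" matches --tags but is
// dropped by --exclude-tags, and "sdr-basic" never matches --tags.
```

An empty result from a function like this is presumably the case in which the harness exits 1 and names the active filters.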
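
The sampler pattern described under "Peak heap + RSS tracking" can be sketched roughly as follows. This is an illustration under stated assumptions rather than the body of `executeRenderJob`: the `withPeakMemory` wrapper and `PeakMemory` type are hypothetical names, the baseline and final samples are an addition of this sketch, and the byte-to-MiB conversion is assumed. It relies only on Node's `process.memoryUsage()`, `setInterval`, and `Timeout.unref()`.

```typescript
// Hypothetical result shape; the PR stores these as optional fields on
// RenderPerfSummary instead.
interface PeakMemory {
  peakRssMb: number;
  peakHeapUsedMb: number;
}

const BYTES_PER_MIB = 1024 * 1024;

// Run `job` while sampling process memory every `intervalMs` milliseconds,
// and report the highest RSS and heapUsed values seen.
async function withPeakMemory<T>(
  job: () => Promise<T>,
  intervalMs = 250,
): Promise<{ result: T; peak: PeakMemory }> {
  let peakRssBytes = 0;
  let peakHeapUsedBytes = 0;

  const sample = (): void => {
    const usage = process.memoryUsage();
    peakRssBytes = Math.max(peakRssBytes, usage.rss);
    peakHeapUsedBytes = Math.max(peakHeapUsedBytes, usage.heapUsed);
  };

  sample(); // Baseline, so jobs shorter than one interval still report a value.
  const timer = setInterval(sample, intervalMs);
  timer.unref(); // The sampler alone must never keep the event loop alive.

  try {
    const result = await job();
    sample(); // Final reading before the timer is torn down.
    return {
      result,
      peak: {
        peakRssMb: peakRssBytes / BYTES_PER_MIB,
        peakHeapUsedMb: peakHeapUsedBytes / BYTES_PER_MIB,
      },
    };
  } finally {
    clearInterval(timer); // Runs on success and on throw, so nothing leaks across jobs.
  }
}
```

A caller would wrap its render, for example `await withPeakMemory(() => render(job))`, where `render` stands in for whatever performs the work. Because sampling is periodic, a spike that rises and falls between two ticks can be missed; the reported peak is a lower bound on the true peak.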
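
The `null`-preserving average can be illustrated with a short sketch. The description gives only the helper's name and intent, so the signature and the handling of partially missing data below are assumptions, not the merged implementation.

```typescript
// Average the numeric entries, or return null when no run reported a value.
// Assumed signature; missing entries may be undefined (field absent in an
// older summary) or null.
function avgOrNull(values: Array<number | null | undefined>): number | null {
  const present = values.filter(
    (value): value is number =>
      typeof value === "number" && Number.isFinite(value),
  );
  if (present.length === 0) {
    return null; // No data: report "unknown" rather than a misleading 0.
  }
  return present.reduce((sum, value) => sum + value, 0) / present.length;
}

// Older summaries without memory fields serialize as null, not 0.
const legacySnapshot = JSON.stringify({
  peakRssMb: avgOrNull([undefined, undefined]),
});
// legacySnapshot is '{"peakRssMb":null}'
```

Returning `null` keeps "no measurement" distinguishable from "measured zero" when snapshots are later compared, which is the failure mode the description calls out.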