feat(tuning): VRAM-tier-adaptive sizing for the full Intel-Mac AMD GPU range by sunrunnerfire · Pull Request #9 · Cerid-AI/quenchforge

sunrunnerfire · 2026-05-31T17:50:54Z

What

Closes the gap that kept quenchforge's AMD support effectively Vega-II-only: every AMD-discrete profile inherited the 32 GB Vega II bench constants (--ctx-size 8192, embed --ubatch-size 1024) regardless of card, so smaller Intel-Mac AMD GPUs (8 GB RX 5700, 4 GB MacBook Pro dGPUs) could oversubscribe VRAM and forced operators to hand-tune QUENCHFORGE_MAX_CONTEXT / QUENCHFORGE_EMBED_UBATCH_SIZE.

How

internal/tuning/tuning.go::amdSizing(vramGB) derives the context ceiling + embed ubatch from the detected headline VRAM (hardware.Info.GPUVRAMGB, now threaded into KernelParams):

VRAM	`--ctx-size` cap	embed `--ubatch-size`	example cards
≥ 12 GB	none (keeps `MaxContext`)	1024	Vega II/Duo, W6800X, W6900X, Vega 56/64, 5600M
7–11 GB	4096	512	RX 5700 / 5700 XT, W5700X
≤ 6 GB	2048	256	4 GB MBP dGPUs (5300M/5500M), Polaris 560X

Guarantees

Zero regression on the validated path — ≥ 12 GB and any VRAM probe miss (0/unknown) keep the exact Vega-II-benched values, so the canonical Mac Pro config is unchanged.
Caps only lower — buildSlotArgs applies min(cfg.MaxContext, cap); a high-VRAM operator who raised QUENCHFORGE_MAX_CONTEXT is never clamped.
Operator overrides still win — explicit QUENCHFORGE_EMBED_UBATCH_SIZE beats the tier; the context cap is independent.
Family-agnostic — unlisted/future AMD cards fall through classifyProfile → vega-pro and size by VRAM like any other.

Tests

TestAmdSizing_Tiers, TestKernelParams_EmbedLowVRAMScalesDown, TestKernelParams_ContextCapAppliesToAllAMDSlots, TestKernelParams_HighVRAMAndNonAMDHaveNoContextCap, TestKernelParams_UbatchOverrideBeatsTierButCapStands, TestBuildSlotArgs_LowVRAMAMDCapsContextAndUbatch. Existing tuning/cmd suites pass unchanged (they exercise VRAM=0 → high tier). gofmt/vet/build clean.

Docs: CHANGELOG v0.8.0 (final), README env table, CLAUDE.md gotcha #2.

Every AMD-discrete profile previously inherited the Vega-II bench constants (--ctx-size 8192, embed --ubatch-size 1024) regardless of the card, so smaller Intel-Mac AMD GPUs (8 GB RX 5700, 4 GB MacBook Pro dGPUs) could oversubscribe VRAM and forced operators to hand-set QUENCHFORGE_MAX_CONTEXT / QUENCHFORGE_EMBED_UBATCH_SIZE. amdSizing(vramGB) now derives both from the detected headline VRAM, threaded into KernelParams as a new arg: >= 12 GB -> no ctx cap, ubatch 1024 (Vega II/Duo, W6800X, W6900X, Vega 56/64) 7-11 GB -> ctx 4096, ubatch 512 (RX 5700/5700 XT, W5700X) <= 6 GB -> ctx 2048, ubatch 256 (4 GB MBP dGPUs, Polaris 560X) Guarantees: the >= 12 GB tier and any VRAM probe miss (0/unknown) keep the exact Vega-II-validated values (zero regression on the canonical path); buildSlotArgs applies the context cap as min(MaxContext, cap) so it only ever lowers; an explicit QUENCHFORGE_EMBED_UBATCH_SIZE still overrides the tier. Family-agnostic: unlisted/future AMD cards fall through classifyProfile to vega-pro and size by VRAM like any other. Adds SlotTuning.ContextSize, the amdSizing curve, six new tuning/cmd tests, and updates CHANGELOG (v0.8.0 final), README env table, and CLAUDE.md gotcha #2.

sunrunnerfire merged commit 31f9ab9 into main May 31, 2026
6 checks passed

sunrunnerfire deleted the feat/vram-tier-adaptive-sizing branch May 31, 2026 17:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(tuning): VRAM-tier-adaptive sizing for the full Intel-Mac AMD GPU range#9

feat(tuning): VRAM-tier-adaptive sizing for the full Intel-Mac AMD GPU range#9
sunrunnerfire merged 1 commit into
mainfrom
feat/vram-tier-adaptive-sizing

sunrunnerfire commented May 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

sunrunnerfire commented May 31, 2026

What

How

Tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant