feat(runtime): batteries-included compression policies (truncating / extractive / anchored) #1885

Merged
roryford merged 3 commits into main from feat/runtime-default-compression-policies
Jun 15, 2026

@roryford

Owner

Summary

MK already ships the CompressionPolicy / PreTurnCompressionPolicy seams (wired into ConversationRuntime via TurnCompressionCoordinator) but no concrete strategy: the protocol docs show only a hand-wavy compress() stub, so every consumer has had to write their own. This PR adds three batteries-included strategies and one policy wrapper.

Ported and rewritten from Fireside's StoryCompression subsystem, adapted from its internal Message/Result tuples to MK's [ChatMessage] seam. Because generate is passed as a parameter rather than stored, the strategies are clean Sendable structs with no @MainActor or @unchecked Sendable.

Strategies (a ladder)

  • TruncatingCompressionStrategy — zero-inference sliding window (keep system + newest-by-budget tail, drop oldest). The canonical baseline. (new — not in Fireside)
  • ExtractiveCompressionStrategy — zero-inference scored selection (recency / length / keyword-density), verbatim tail, greedy within budget. Adds an optional headBudgetFraction knob to pin establishing context (middle-out / lost-in-the-middle).
  • AnchoredCompressionStrategy — summarize old messages via generate, prepend a .memory("summary") record + keep a verbatim recency tail. Keeps Fireside's summarizerInputWindow input-sizing decoupling, chunk-and-fold for over-window input, summary-floor logic, CancellationError early-return, and fall-back-to-extractive on failure/empty/missing-generate.
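The bottom rung of the ladder can be sketched as follows. This is a minimal illustration of a sliding-window truncation, not the code in this PR: `Role`, `Message`, `estimateTokens`, and `truncate` are hypothetical stand-ins (MK's real `ChatMessage` and tokenizer are not shown here), and the four-characters-per-token estimate is an assumption.

```swift
// Hypothetical stand-ins; MK's real ChatMessage and tokenizer are not shown in this PR.
enum Role { case system, user, assistant }

struct Message: Equatable {
    let role: Role
    let text: String
}

/// Crude token estimate (about 4 characters per token); an assumption for illustration.
func estimateTokens(_ message: Message) -> Int {
    max(1, message.text.count / 4)
}

/// Keeps every system message, then fills the remaining budget from the newest
/// message backwards. The newest non-system message is always kept, even if it
/// alone exceeds the budget.
func truncate(_ history: [Message], budget: Int) -> [Message] {
    let systemCost = history
        .filter { $0.role == .system }
        .map(estimateTokens)
        .reduce(0, +)
    var remaining = budget - systemCost
    var keep = Set<Int>()
    var isNewest = true

    for index in history.indices.reversed() where history[index].role != .system {
        let cost = estimateTokens(history[index])
        if isNewest || cost <= remaining {
            keep.insert(index)
            remaining -= cost
            isNewest = false
        } else {
            break // contiguous tail: stop at the first message that does not fit
        }
    }

    // Rebuild in original order so the conversation still reads top to bottom.
    return history.indices
        .filter { history[$0].role == .system || keep.contains($0) }
        .map { history[$0] }
}
```

The explicit `isNewest` branch mirrors the never-drop-newest invariant discussed in the second commit: without it, the greedy backward fill keeps the newest message only when it happens to fit.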

Policy wrapper

DefaultCompressionPolicy is a Sendable struct conforming to both CompressionPolicy (post-turn) and PreTurnCompressionPolicy (pre-turn). The trigger asymmetry is handled internally — the post-turn path uses contextUtilization; the pre-turn path receives only messageCount/lastPromptTokens, so the policy stores contextSize as config to compute utilization.

runtime.compressionPolicy        = .extractive(threshold: 0.75)
runtime.preTurnCompressionPolicy = .anchored(threshold: 0.85, contextSize: 8192)
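
The trigger asymmetry described above can be sketched roughly like this. All type and field names (`PostTurnContext`, `PreTurnContext`, `TriggerSketch`) are hypothetical, as are the optionality of `lastPromptTokens` and the choice of `>=` for the comparison; the PR text only says which inputs each path receives.

```swift
// Hypothetical shapes; field and type names are assumptions, not MK's API.
struct PostTurnContext {
    let contextUtilization: Double
}

struct PreTurnContext {
    let messageCount: Int
    let lastPromptTokens: Int?
}

struct TriggerSketch {
    let threshold: Double
    let contextSize: Int // stored config; only the pre-turn path needs it

    /// Post-turn: utilization is handed over directly.
    func shouldCompress(_ context: PostTurnContext) -> Bool {
        context.contextUtilization >= threshold
    }

    /// Pre-turn: derive utilization from the last prompt's token count.
    func shouldCompress(_ context: PreTurnContext) -> Bool {
        guard contextSize > 0, let tokens = context.lastPromptTokens else {
            return false // nothing to measure against yet
        }
        return Double(tokens) / Double(contextSize) >= threshold
    }
}
```

With `threshold: 0.85` and `contextSize: 8192`, the pre-turn path in this sketch fires once the last prompt reaches 6964 tokens, since 0.85 × 8192 = 6963.2.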

Tests

Tests/ManifoldRuntimeTests/DefaultCompressionPolicyTests.swift — ported from Fireside's StoryCompressionTests + P0 extractive-edge-case / anchored-fallback / summarizer-starvation suites, rewritten against [ChatMessage], plus new truncating-strategy, headBudgetFraction, and pre-turn-vs-post-turn trigger tests.

Out of scope (deliberately)

  • Recursive / hierarchical (summary-of-summaries) summarization
  • Embedding / semantic-relevance selection (belongs with RAG)
  • KV-cache eviction (token/attention layer, not message history)
  • The graph reconciler postCompress hook (left as the default no-op for consumers to fill)

Notes for review

  • Verify the isPinned resolution against ChatMessage (Fireside's Message had isPinned; check whether MK exposes a pin/kind equivalent or whether v1 treats none-pinned as acceptable with a TODO).
  • Draft pending CI + a review-and-fix pass.

🤖 Generated with Claude Code

roryford and others added 2 commits June 15, 2026 20:15
…extractive / anchored)

MK shipped the CompressionPolicy / PreTurnCompressionPolicy seams but no
concrete strategy — every consumer had to hand-roll compress(). Add three
default strategies plus a single policy wrapper that conforms to both seams.

- TruncatingCompressionStrategy: zero-inference sliding window (baseline)
- ExtractiveCompressionStrategy: zero-inference scored selection (recency /
  length / keyword-density) with verbatim tail + optional headBudgetFraction
  knob (middle-out / lost-in-the-middle preservation)
- AnchoredCompressionStrategy: summarize old via generate, prepend a
  .memory("summary") record + keep a verbatim recency tail; chunk-and-fold for
  over-window input; falls back to extractive on failure/empty/missing-generate
- DefaultCompressionPolicy: Sendable struct, conforms to BOTH CompressionPolicy
  and PreTurnCompressionPolicy; static factories .truncating/.extractive/.anchored

Ported and rewritten from Fireside's StoryCompression suite against [ChatMessage].

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add a truncating-strategy test that guards the never-drop-newest
invariant under load-bearing-overflow — the existing tests passed even
with that line removed (the greedy backward fill already keeps the
newest in the common case), so the invariant was effectively untested.
Sabotage-verified.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@roryford

Owner Author

Review-pass note (from the adversarial review + fix pass, commit 477830ba):

  • Deliberate divergence from the Fireside original: ExtractiveCompressionStrategy drops Fireside's explicit candidateBudget = budget - tailBudget cap and fills candidates to the full history budget. This is intentional — strictly better budget utilization, still bounded, never overflows context. Not a port bug.
  • isPinned: no per-message pin equivalent exists on the [ChatMessage] seam (pins live on ChatSession.pinnedMessageIDs, invisible here). The port preserves .system + .memory records verbatim as 'load-bearing' and leaves a TODO to honor a future pinned-IDs channel.
  • Known coverage gap (not a bug): the anchored chunk-and-fold over-window path (summarizerInputWindow set + input exceeds it) has no direct test.
  • Local gate: scripts/test.sh --profile local → 4932 tests, 0 failures, 15 skipped, clean first run.
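
For readers unfamiliar with the extractive approach, here is a rough sketch of scored greedy selection with a verbatim tail. It is an illustration under assumptions, not the shipped strategy: `roughTokens` and `extractiveSelect` are hypothetical names, the weights are arbitrary, keyword-density scoring is omitted, and charging the tail at its actual cost (rather than reserving a separate tailBudget) is one reading of the divergence note above.

```swift
/// Crude token estimate (about 4 characters per token); an assumption.
func roughTokens(_ text: String) -> Int {
    max(1, text.count / 4)
}

/// Returns the indices of the messages to keep, in original order.
func extractiveSelect(_ texts: [String], budget: Int, tailCount: Int) -> [Int] {
    let tailStart = max(0, texts.count - tailCount)

    // The verbatim tail is kept unconditionally; charge what it actually costs.
    let tailCost = texts[tailStart...].map(roughTokens).reduce(0, +)
    var remaining = budget - tailCost

    // Score older messages: newer and longer score higher (weights are arbitrary).
    let scored = (0..<tailStart).map { i -> (index: Int, score: Double) in
        let recency = Double(i + 1) / Double(max(1, tailStart))
        let length = min(1.0, Double(texts[i].count) / 200.0)
        return (i, 0.6 * recency + 0.4 * length)
    }.sorted { $0.score > $1.score }

    var kept = Array(tailStart..<texts.count)
    for candidate in scored {
        let cost = roughTokens(texts[candidate.index])
        if cost <= remaining { // greedy: take it if it fits, otherwise try the next
            kept.append(candidate.index)
            remaining -= cost
        }
    }
    return kept.sorted()
}
```

Because candidates draw on everything the tail did not spend, the selection stays within `budget` whenever the tail itself fits.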

Budget realism (configurable reservedTokens replacing the bare 512, injectable
tokenizer on the factories, skip-on-tiny-window guard), extractive verbatim-core
overflow clamp, thinking-model robustness (strip <think> before summary parse,
configurable response reserve), multimodal/tool per-part token accounting in
ContextWindowManager, plus the QA test set (asymmetry boundary, chunk-and-fold,
summary-floor, cancellation, tightened weak assertions) and doc fixes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
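
The "strip <think> before summary parse" step mentioned in this commit might look something like the sketch below. `stripThinking` is a hypothetical helper, and treating an unterminated `<think>` as "drop everything after it" is an assumption; the commit message does not say how that case is handled.

```swift
import Foundation

/// Removes every <think>...</think> span from a model response.
/// An unterminated <think> drops the rest of the text (an assumption).
func stripThinking(_ raw: String) -> String {
    var output = ""
    var rest = Substring(raw)

    while let open = rest.range(of: "<think>") {
        // Keep the text before the opening tag.
        output.append(contentsOf: rest[..<open.lowerBound])
        guard let close = rest[open.upperBound...].range(of: "</think>") else {
            rest = "" // no closing tag: discard the remainder
            break
        }
        rest = rest[close.upperBound...]
    }

    output.append(contentsOf: rest)
    return output.trimmingCharacters(in: .whitespacesAndNewlines)
}
```

Running this before parsing keeps a thinking model's scratch reasoning out of the stored summary record.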
@roryford roryford marked this pull request as ready for review June 15, 2026 11:51
@roryford roryford merged commit 2afdd83 into main Jun 15, 2026
11 checks passed
@roryford roryford deleted the feat/runtime-default-compression-policies branch June 15, 2026 11:53