Skip to content

Latest commit

 

History

History
475 lines (370 loc) · 12.4 KB

File metadata and controls

475 lines (370 loc) · 12.4 KB
name Automatic Context Compaction
description Intelligent compression of session context while preserving critical patterns
type reference

Automatic Context Compaction

Problem: Long sessions accumulate tokens. Without compression, context window fills with historical details that are no longer needed for current work.

Solution: Automatic compression of old session data, observations, and tasks while preserving patterns that matter for future sessions.


How It Works

Trigger Points

Session Start
  ↓
Accumulate tokens (input + output)
  ↓
Check: tokens > threshold (100K)?
  ├─ No  → Continue normally
  └─ Yes → COMPACTION TRIGGERED
            ├─ Pause current work
            ├─ Summarize old content
            ├─ Replace with compressed summary
            ├─ Archive original
            └─ Resume with clean context

Three Compaction Layers

1. Session Compaction (Token-based)

When a single session reaches 100K tokens, Claude Code automatically:

  • Pauses before next API call
  • Injects a summary prompt asking for compression
  • Claude generates a <summary> block containing:
    • Tasks completed (ID, outcome)
    • Key decisions made
    • Current state (what's in progress)
    • Next steps
  • Discards detailed tool results, keeps summary
  • Saves original transcript to CONTEXT_ARCHIVE.md
  • Resumes with ~10K tokens left (90% reduction)

Example: Processing 10 tasks might generate:

OLD (50K tokens):
  [Full tool output for task 1]
  [Full tool output for task 2]
  [Full tool output for task 3]
  ... (7 more tasks worth of detail)

NEW (5K tokens):
  <summary>
  - TASK-1: Completed ✓ (fixed React import, 43/46 tests passing)
  - TASK-2: Completed ✓ (added 8 SMS scenarios, all tests pass)
  - TASK-3: In progress (scoring analytics, 40% done)
  - Next: Complete TASK-3, start TASK-4
  </summary>

2. Observation Compaction (Pattern-based)

Weekly scan of OBSERVATIONS.md:

  • Count confirmations for each pattern
  • 0-4 confirmations: Keep as observation
  • 5+ confirmations: ESCALATE to ACTIVE_RULES.md
  • 4+ weeks old: ARCHIVE to OBSERVATION_ARCHIVE.md

Example:

Observation: "User prefers direct, honest feedback"
Confirmations: 8 (across 4 sessions)
Action: Escalate to ACTIVE_RULES as Rule #3
Archive: Move to OBSERVATION_ARCHIVE.md

3. Task Compaction (Activity-based)

Monthly scan of completed tasks:

  • Keep last 10 completed tasks in DONE.md (detailed)
  • Archive older tasks to TASK_ARCHIVE.md (summary only)
  • Extract patterns for ACTIVE_RULES

Example DONE.md → TASK_ARCHIVE.md:

OLD ENTRY (from DONE.md):
  - TASK-42: Add SMS OTP handler
    - Completed: 2026-03-15
    - Files: 3 (handler, stack, tests)
    - Tokens: 3.5K estimated, 2.8K actual
    - Commits: 7b848b0, 34d27b8
    - Issues: One test flaky, fixed with waitFor()

ARCHIVED ENTRY (TASK_ARCHIVE.md):
  - TASK-42 [2026-03-15]: Add SMS OTP handler
    | Files: 3 | Tokens: 2.8K | Pattern: asyncauth ✓ |

Configuration

settings.json

{
  "compaction": {
    "enabled": true,
    "triggers": {
      "sessionTokens": {
        "enabled": true,
        "threshold": 100000,
        "description": "Compress when single session reaches 100K tokens"
      },
      "observationAge": {
        "enabled": true,
        "archiveAfterDays": 28,
        "escalateAfterConfirmations": 5,
        "description": "Archive old observations, escalate confirmed patterns"
      },
      "taskArchive": {
        "enabled": true,
        "keepRecentCount": 10,
        "archiveAfterSessions": 20,
        "description": "Archive completed tasks after 20 sessions"
      }
    },
    "summaryModel": "haiku",
    "summaryPrompt": "Summarize progress focusing on: completed items, key decisions, current state, next steps. Omit raw tool output.",
    "archiveFormat": "markdown-table",
    "preserveTokenMetrics": true,
    "monthlyReview": true
  }
}

Threshold Guidelines

Threshold Use Case Example
50K tokens Tight cleanup Per-task compression within a session
100K tokens Default Session boundary (current setting)
150K tokens Loose cleanup Long exploratory sessions
Never Disabled Tasks requiring full audit trail

Compression Ratios

Real Data: ScamGuard MVP

Session processing 8 tasks (without compaction):

Task results:     28K tokens
Component reads:  22K tokens
Test output:      15K tokens
Git logs:         8K tokens
API responses:    12K tokens
───────────────────────────
Total:            85K tokens

Same session (with compaction at 50K mark):

Initial context:  15K tokens
Task 1-3 work:    35K tokens (reaching 50K)
     ↓
COMPACTION TRIGGERED
     ↓
Summary generated: 4K tokens
Task 4-8 work:    45K tokens
     ↓
Final context:    19K tokens (78% reduction)

What Gets Preserved vs. Discarded

Category Preserved ✅ Discarded ❌
Tasks Task ID, outcome, status Full tool output, intermediate steps
Decisions What changed, why, impact Exploration of alternatives considered
Patterns Rules confirmed (5+ times) Early observations (1-4 times)
Timestamps Completion date, duration Minute-by-minute logs
Code Commit SHAs, file counts Diff details, review comments
Tests Pass/fail, coverage %age Full test output, stdout

Archive Structure

CONTEXT_ARCHIVE.md (Session Transcripts)

# Context Archive - Session Transcripts

Index of compressed sessions with generation date and summary.

## Session [2026-04-04 14:00]
**Duration:** 1.5 hours | **Tokens:** 95K input, 35K output | **Tasks:** 4 completed

<summary>
- TASK-101: Cleanup obsolete files ✓
- TASK-102: Fix React imports ✓  
- TASK-103: ErrorBoundary tests ✓ (already complete)
- TASK-104: SMS scenarios ✓
- Next: TASK-105 (scoring analytics)
</summary>

**Original transcript:** /Users/echetoui/.claude/projects/scamguard-mvp/[session-id].jsonl

OBSERVATION_ARCHIVE.md (Old Patterns)

# Observation Archive - Escalated Patterns

Observations confirmed 5+ times and moved to ACTIVE_RULES.

## Escalated → ACTIVE_RULES
- Pattern: User prefers direct, honest feedback (8 confirmations)
  → Added as Rule #3 on 2026-04-04
  
- Pattern: Read code before modifying (7 confirmations)
  → Added as Rule #4 on 2026-04-03

## Legacy Observations (archived, not escalated)
- Observation: Sometimes prefers exploration over planning
  | Confirmations: 3 | Archived: 2026-04-04 | Reason: Low confidence |

TASK_ARCHIVE.md (Completed Tasks)

# Task Archive - Completed Items

Keep last 10 in DONE.md (detailed). Older tasks summarized here.

## March 2026

| Task | Date | Outcome | Files | Tokens | Pattern |
|------|------|---------|-------|--------|---------|
| TASK-85 | 2026-03-28 | Fix auth flow ✓ | 4 | 3.2K | asyncauth |
| TASK-86 | 2026-03-27 | Add SMS handler ✓ | 3 | 2.8K | backend |
| TASK-87 | 2026-03-26 | E2E tests ✓ | 5 | 4.1K | testing |

Triggers & Automation

Automatic Triggers

  1. Session exceeds 100K tokens

    • When: Before next API call
    • Action: Compress session
    • Duration: 30 seconds
  2. Weekly observation scan (Sunday midnight)

    • When: Configured via CronCreate
    • Action: Archive old observations, escalate patterns
    • Duration: 1-2 seconds
  3. Monthly task archive (1st of month)

    • When: Configured via CronCreate
    • Action: Move old completed tasks to archive
    • Duration: 2-3 seconds

Manual Triggers

# Trigger compaction immediately (if needed)
# User can call: compress-context --session-now
# Or: compress-context --observations --tasks

Workflow Integration

Before Compaction

Session with 8 tasks:
├─ Messages: 40 (detailed conversation)
├─ Tool results: Full output from each task
├─ Context: 95K tokens
└─ Status: Getting close to limit

During Compaction (Automatic)

Claude Code detects: 100K+ tokens
  ↓
Injects: "Summarize your progress so far"
  ↓
Claude generates: <summary>Tasks 1-4 done, 5-8 in progress</summary>
  ↓
System action: Archive detailed transcript, keep summary
  ↓
New context: 15K tokens (fresh start with summary)

After Compaction

Session continues:
├─ Messages: 5 (compressed summary + new work)
├─ Tool results: Only from tasks 5+
├─ Context: 35K tokens (plenty of room)
└─ Status: Can continue for another 8 tasks

Example Compression Prompts

Session Compaction Prompt

You've completed several tasks in this session. Before continuing, 
please generate a brief summary for archival:

1. **Completed Items:** List task IDs and outcomes (e.g., "TASK-102: Fixed React imports, 43/46 tests passing")
2. **Key Decisions:** What architectural choices were made?
3. **Current State:** What's in progress? What's next?
4. **Metrics:** Total tokens used, files modified, commits made

Format as: <summary>...</summary>

This will be archived while we continue with a clean context.

Observation Escalation Prompt

Review OBSERVATIONS.md. For each observation with 5+ confirmations:

1. Move to ACTIVE_RULES.md as a new confirmed rule
2. Update MEMORY.md index
3. Add rule definition and "Why:" section from observation

Format: Add to ACTIVE_RULES.md, then remove from OBSERVATIONS.md.

Monitoring & Metrics

Per-Session Metrics

Session ID: 2026-04-04-14h
─────────────────────────────
Starting tokens:    0
Peak tokens:        95K (at compaction)
Compaction trigger: Token threshold exceeded
Tokens recovered:   80K (→ 15K remaining)
Compression ratio:  85% reduction
Time to compact:    0.3s

Artifacts:
  ✓ Original transcript archived
  ✓ Summary generated (4K)
  ✓ MEMORY.md updated
  ✓ Context cleaned

Monthly Compaction Report

## Monthly Report - April 2026

**Sessions processed:** 12
**Average session tokens:** 65K
**Compactions triggered:** 8 (67%)
**Token recovery:** 640K (80% reduction in archived content)

**Observations escalated:** 2 (→ ACTIVE_RULES)
**Observations archived:** 4 (low confidence)
**Tasks archived:** 6 (moved to TASK_ARCHIVE.md)

**Cost impact:** 
- Without compaction: 900K tokens (estimated)
- With compaction: 180K tokens (actual)
- Savings: 80% (720K tokens, ~$14.40)

Best Practices

✅ When to Enable Compaction

  • Long-running sessions (3+ hours)
  • Batch processing (20+ tasks)
  • Iterative refinement sessions
  • Multi-week projects with daily sessions
  • Token budget constraints ($50/month)

❌ When to Disable Compaction

  • Short focused tasks (<50K tokens)
  • Detailed audit trails required
  • Regulatory compliance contexts
  • Security-sensitive work (keep full logs)
  • Research/exploration (need full context)

Configuration Tuning

{
  "compaction": {
    "triggers": {
      "sessionTokens": {
        "threshold": 80000  // Lower = more aggressive compression
      },
      "observationAge": {
        "archiveAfterDays": 21  // Faster archival = cleaner memory
      },
      "taskArchive": {
        "keepRecentCount": 15  // More recent tasks = longer history
      }
    }
  }
}

Technical Details

Compression Algorithm

  1. Identify compressible content:

    • Tool results (read file, grep results)
    • API responses
    • Test output
    • Build logs
    • Git commands
  2. Generate summary:

    • Extract high-level facts (what changed, why, impact)
    • Preserve decision context (from DECISIONS.md)
    • Keep metrics (token cost, time duration)
  3. Replace and archive:

    • Store original in CONTEXT_ARCHIVE.md
    • Replace in-memory context with summary
    • Update indexes (MEMORY.md, etc.)
  4. Resume seamlessly:

    • Claude Code continues with fresh context
    • Can still reference archived decisions
    • Patterns preserved in ACTIVE_RULES

Summary

Aspect Benefit
Token Savings 70-85% reduction in long sessions
Context Clarity Focus on current work, not history
Pattern Preservation Confirmed rules carry forward
Audit Trail Original transcripts archived
Cost Reduction ~80% monthly token savings
Automatic No manual intervention needed

Status: Ready to implement in settings.json and integrate with caching layer.