Draft — 2026-04-09. To revisit when we tackle coding agent memory.
An application of BELLA outside news epistemics: using the same calculus to solve the memory problem in LLM coding agents (Claude Code, Cursor, etc.).
The claim: current LLM memory fails because context is a flat sequence of tokens. BELLA's belief tree IS the right structure for agent memory — same six rules, same calculus, different domain.
- Flat context window: tokens in a sequence, recency bias
- /compact: lossy summary, destroys structure
- 1M context: bigger flat buffer, same structural problem
- RAG: embedding retrieval, no accumulation
- Vector DB: nearest neighbors, no mass, no structure
- Knowledge graph: structure, but no mass or convergence
- MEMORY.md (manual): beliefs, but human-maintained, no calculus
Every current approach is stateless retrieval. None of them accumulate. None of them learn. The agent has the same blind spots in session 50 as in session 1.
| BELLA rule | What flat context violates | What breaks |
|---|---|---|
| R1 accumulate | No mass — each message weighs the same | Told 5 times = told once |
| R2 structure | Flat — no parent-child, no typed edges | "key decision" ≈ "idle thought" |
| R3 emerge | No fields — one stream for everything | Can't scope to "auth" vs "perf" |
| R4 self-refer | No self-model | Doesn't know what it's forgotten |
| R5 converge | No feedback | Doesn't learn file reputation |
| R6 entangle | No entities as bridges | Can't see file spans features |
Dementia from compaction: /compact produces a summary. The summary loses the "why." The agent re-suggests rejected approaches because the DISPUTES are gone.
Re-suggesting rejected approaches: User says "don't mock the DB in tests" five times across five sessions. Current system: each rejection is a new token in an old context that gets compacted away. BELLA: the rejection accumulates mass — m=0.95 after 5 voices. Cannot be forgotten.
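A minimal sketch of what "accumulates mass" means, assuming a Jaynes-style log-odds update (the real rule lives in grow.py; lr = 0.59 here is chosen only so five voices land at ≈ 0.95):

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1 - p))

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def accumulate(mass: float, lr: float = 0.59) -> float:
    # One R1 step: add this voice's evidence weight in log-odds space.
    return sigmoid(logit(mass) + lr)

m = 0.5                  # prior: undecided
for _ in range(5):       # "don't mock the DB in tests", five sessions
    m = accumulate(m)
print(round(m, 2))       # ≈ 0.95: saturating, and never compacted away
```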
Losing track of relationships: Current system forgets that auth.py depends on session.py after a compaction. BELLA: entities bridge the files through shared beliefs (R6).
No pattern learning: Current system makes the same TypeScript mistake in every session. BELLA: Entity(TypeScript).reputation drops with each DISPUTES. Future TS work is more cautious (R5 feedback).
No personalization: Current system treats every codebase identically. BELLA: the belief tree for THIS codebase accumulates over sessions. File reputations stabilize. Patterns emerge. The agent gets better at this specific project, not just at coding in general.
Every user message and agent action is a claim. Process it through the same pipe we use for news (a runnable sketch follows the diagram):
Conversation message → EW (extract claims)
→ SENSE (land + place)
→ R1 accumulate
→ emergence
→ tree grows
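A runnable sketch of that pipe with deliberately naive stand-ins: the real EW is an LLM extraction pass and the real SENSE lands via pgvector ANN, so every function body here is a placeholder.

```python
from dataclasses import dataclass, field

@dataclass
class Belief:
    text: str
    mass: float = 0.5
    children: list["Belief"] = field(default_factory=list)

def extract_claims(message: str) -> list[str]:
    # EW stand-in: one claim per sentence (the real extractor is an LLM pass).
    return [s.strip() for s in message.split(".") if s.strip()]

def sense(roots: list[Belief], claim: str) -> Belief:
    # SENSE stand-in: exact-match landing (the real one uses pgvector ANN).
    for b in roots:
        if b.text == claim:
            return b
    b = Belief(claim)
    roots.append(b)                              # tree grows
    return b

def ingest(roots: list[Belief], message: str) -> None:
    for claim in extract_claims(message):
        node = sense(roots, claim)
        node.mass = min(0.99, node.mass + 0.1)   # R1 stand-in (Jaynes sketch above)
    # R3 emergence (field promotion) would run here; omitted in this sketch.
```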
Five levels, same as the organism model (a container sketch follows the list):
- Level 0 (Messages): raw user input + agent observations
- Level 1 (Beliefs): accumulated with mass and structure
- Level 2 (Entities): files, functions, patterns, libraries
- Level 3 (Fields): features, modules, concerns
- Level 4 (Self-model): "I make too many TS errors" (Ψ — R4)
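A hypothetical container mirroring the five levels; the attribute names are illustrative, not the schema grow.py uses:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    messages: list[str] = field(default_factory=list)  # Level 0: raw input + observations
    beliefs: list = field(default_factory=list)        # Level 1: mass + structure (R1, R2)
    entities: dict = field(default_factory=dict)       # Level 2: files, functions, libraries
    fields: dict = field(default_factory=dict)         # Level 3: features, modules, concerns
    self_model: dict = field(default_factory=dict)     # Level 4: Ψ, the agent's own gaps (R4)
```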
Extending EW's relation types for the coding domain:
causes "A happens, so B broke" causal debugging
counters "don't do X, do Y" preference rules
elaborates "here's how it works" explanation
evidence "the error was <stack>" grounding
mechanism "because of how X works" technical reason
assessment "this approach is cleaner" judgment
decision "let's use asyncpg" choice made
correction "no, not that way" explicit denial
observation "the tests pass now" state report
Each maps to a BELLA operation (see the sketch after this list):
- causes → CAUSE
- counters / correction → DENY
- decision → ADD (with high lr)
- observation → CONFIRM (if matches existing belief)
- etc.
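The stated mappings as a lookup table; the relations covered by "etc." are deliberately left unmapped here:

```python
RELATION_TO_OP = {
    "causes": "CAUSE",
    "counters": "DENY",
    "correction": "DENY",
    "decision": "ADD",         # applied with high lr
    "observation": "CONFIRM",  # when it matches an existing belief
}

def to_operation(relation: str) -> str | None:
    # None means the mapping is still open (the "etc." above).
    return RELATION_TO_OP.get(relation)
```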
Code objects as entities:
| Code object | Entity |
|---|---|
| File | entity(path/to/file.py) |
| Function | entity(module.func_name) |
| Class | entity(module.ClassName) |
| Library | entity(asyncpg) |
| Pattern | entity(async context manager) |
| Concern | entity(connection pooling) |
| Error type | entity(TypeError in X) |
Each accumulates reputation through R1 (see the sketch after this list):
- File with many confirmed beliefs → well-understood, stable
- File with many DISPUTES → problematic, needs care
- Library with high reputation → trusted, use by default
- Pattern with DISPUTES → anti-pattern in this codebase
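A toy version of entity reputation, assuming a Laplace-smoothed confirm/dispute ratio as a stand-in for the real R1 mass arithmetic:

```python
from dataclasses import dataclass

@dataclass
class Entity:
    name: str          # e.g. "auth.py", "asyncpg", "async context manager"
    confirms: int = 0
    disputes: int = 0

    @property
    def reputation(self) -> float:
        # Laplace-smoothed ratio; the real system accumulates via R1.
        return (self.confirms + 1) / (self.confirms + self.disputes + 2)

ts = Entity("TypeScript")
ts.disputes += 3          # three corrected TS mistakes
print(ts.reputation)      # 0.2: future TS claims should arrive with lower lr
```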
Fields emerge from R3 when belief subtrees converge. Examples:
Field(authentication):
beliefs about auth.py, session.py, login flow, JWT,
password hashing, OAuth. Emerges from clustering.
Field(database):
beliefs about asyncpg, pool config, queries, migrations,
schema. Emerges as database work accumulates.
Field(testing practices):
beliefs about test organization, fixtures, mocking,
integration vs unit. Includes DISPUTES for rejected approaches.
Fields are scope for context loading. Working on auth loads the auth field, not the whole codebase history.
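A sketch of the R3 trigger, assuming "converge" means a subtree's belief embeddings cohere around their centroid (the threshold is illustrative):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def should_emerge(embeddings: list[list[float]], threshold: float = 0.85) -> bool:
    # Promote a subtree to a Field when its beliefs have converged on their
    # centroid. The 0.85 coherence threshold is an assumption.
    n, dim = len(embeddings), len(embeddings[0])
    centroid = [sum(v[i] for v in embeddings) / n for i in range(dim)]
    coherence = sum(cosine(v, centroid) for v in embeddings) / n
    return coherence >= threshold
```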
Replace "load last N messages" with "EXPAND(relevant field)":
1. Detect active context from current request
- Files being edited → entity set
- Topic keywords → field candidates
- Recent actions → recent beliefs
2. Load by mass priority:
HIGH mass (m > 0.8) → always include (core rules, user preferences)
Field-relevant → include if matches active field
Entity-relevant → include if touches active files
DISPUTES → include if relevant (prevents re-suggestion)
LOW mass / old → drop first
3. Budget allocation:
60% high-mass beliefs (rules, decisions)
30% field-relevant beliefs
10% recent low-mass context
Token budget is spent on important context, not recent context. Recency matters only as a tiebreaker.
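A sketch of EXPAND following the three steps above. Token counts are a rough 4-characters-per-token guess, and entity matching is folded into the field check for brevity; all names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class MemBelief:
    text: str
    mass: float
    field: str
    disputes: bool = False   # carries a DISPUTES edge relevant to the work

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)          # rough 4-chars-per-token heuristic

def expand(beliefs: list[MemBelief], active_field: str, budget: int) -> list[MemBelief]:
    high   = [b for b in beliefs if b.mass > 0.8]                      # always-include tier
    scoped = [b for b in beliefs if b not in high
              and (b.field == active_field or b.disputes)]             # field + DISPUTES
    rest   = [b for b in beliefs if b not in high and b not in scoped]

    picked: list[MemBelief] = []
    for pool, share in ((high, 0.6), (scoped, 0.3), (rest, 0.1)):
        cap, used = int(budget * share), 0
        for b in sorted(pool, key=lambda b: b.mass, reverse=True):     # mass, not recency
            cost = approx_tokens(b.text)
            if used + cost <= cap:
                picked.append(b)
                used += cost
    return picked
```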
Agent suggests approach A
→ user corrects: "no, use B"
→ DENY approach A, ADD approach B
→ B.mass rises, A accumulates DISPUTES
→ entity(pattern_B).reputation rises
→ entity(pattern_A).reputation drops
→ future claims involving pattern_A get lower lr
→ agent approaches pattern_A with more caution
→ fewer mistakes of this type
This is R5 convergence. The agent's behavior is shaped by the accumulated belief tree. Corrections compound: saying the same thing five times produces one high-mass belief (m ≈ 0.95, as in the example above) instead of five tokens that compact away, and that mass carries proportionally more influence over future actions. A sketch of how reputation could feed back into lr follows.
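One way the feedback could be wired, assuming lr scales linearly with the reputation of the entities a claim involves (the scaling form is a guess, not BELLA's rule):

```python
def effective_lr(base_lr: float, reputation: float) -> float:
    # Reputation 0.5 is neutral; below it, new claims about the entity
    # count for less, above it, for more. The linear form is an assumption.
    return base_lr * (0.5 + reputation)

print(round(effective_lr(0.4, reputation=0.2), 2))  # 0.28: pattern_A after DISPUTES
print(round(effective_lr(0.4, reputation=0.8), 2))  # 0.52: pattern_B after CONFIRMs
```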
The surprising thing: most of the machinery already exists.
Already built:
✓ belief_embeddings table (PG pgvector ANN)
✓ Neo4j belief graph (structure, edges)
✓ R1 Jaynes accumulation
✓ R2 entropy-driven structure
✓ R3 emergence via centroid convergence
✓ PG ANN landing (O(log n) retrieval)
✓ Seven operations (CONFIRM, AMEND, ADD, DENY, CAUSE, MERGE, MOVE)
✓ Per-root local heal
✓ Entity model with INVOLVES edges
Needs building:
→ EW for conversation messages (extract claims from chat)
→ Code-specific entity extractor (files, functions, libraries)
→ Conversation claim types (decision, correction, observation)
→ Context loader: EXPAND(field, budget) → tokens
→ Integration with LLM context assembly
→ Session boundary handling (when does a conversation end?)
The core is done. What's missing is the thin layer that adapts BELLA's pipe to conversation input instead of news pages.
RAG (retrieval-augmented generation) and BELLA memory are often confused. They are fundamentally different:
| Aspect | RAG | BELLA Memory |
|---|---|---|
| Storage | Flat vector DB | Structured belief tree |
| Retrieval | Nearest neighbor | Field + entity + mass |
| Accumulation | None — each doc stands alone | Jaynes lr — mass compounds |
| Structure | None | Parent-child, causal, DISPUTES |
| Learning | None — same forever | R5 feedback — reputation evolves |
| Conflicts | Not represented | Explicit DISPUTES edges |
| Context budget | Top-k similar | Mass-weighted field expansion |
| Temporal | Timestamp only | Temporal anchors, event_time |
| Personalization | None | Entity reputation per codebase |
RAG answers "what documents are similar to this query?" BELLA answers "what do we know about this, with what confidence, and what did we decide NOT to do?"
BELLA's six rules are not domain-specific. They describe ANY system that accumulates evidence over time:
- News epistemics (what we've built)
- LLM agent memory (this document)
- Personal knowledge management
- Scientific research tracking
- Team decision archives
- Customer support history
- Medical patient records
The same calculus works in all of them because the underlying problem is the same: evidence arrives in a stream and must be structured, contested claims must stay visible, mass must accumulate, and retrieval must go by relevance, not recency.
The LLM memory application is the most urgent because coding agents are hitting the limits of flat context RIGHT NOW. Every heavy user of Claude Code / Cursor / etc. has felt the dementia from /compact. The pain point is acute.
Session boundaries: When does a conversation "end"? Is each conversation a fresh context that inherits from the persistent tree? Or is the tree continuous?
Claim extraction: How aggressive should EW be on chat messages? Every sentence a claim? Or only explicit decisions and rules?
Self-observation: Should the agent's own actions become claims? ("I modified file X" — then if it breaks, CAUSE edge from the modification to the breakage.)
Cross-project memory: Does a developer's preferences transfer between codebases? Is there a personal belief tree above the project trees?
Privacy/trust: User might reject a belief mid-conversation. Does the rejection itself accumulate? Could the system become manipulable through accumulated DENYs?
Pruning: What happens to very old, never-confirmed, low-mass beliefs? Do they decay? Or stay as "maybe"?
Entropy at scale: At 10k beliefs, is per-root heal still enough? Or do we need hierarchical fields (fields-of-fields)?
Focus. News epistemics is the current target. But BELLA was designed to be domain-agnostic from the start — that's what the theory work ensures. When we come back to this, the core doesn't need to change. We add a thin conversation-to-claim adapter and a context loader.
The proof that BELLA is the right model is that the SAME rules apply to both. If we had to modify R1-R6 for the coding memory case, that would be evidence the theory is wrong. But we don't. It just works.
This file exists as a placeholder for that future work — and as a reference when other systems' memory failures remind us we already have the answer.
Related:
- SPEC.md — the full theory (R1-R6)
- grow.py — current implementation for news
- project_bella_pipe — memory file with current pipeline notes
- project_bella_entropy_law — the fundamental entropy discovery