Embedded memory graphs for AI agents. One crate. One SQLite file. A complete graph-memory runtime that makes agents self-aware.
An embedded memory graph engine where everything — identity, knowledge, tool calls, LLM calls, sub-agent work, loop iterations, self-model — lives as a node in the graph. The agent queries its own history the same way it queries any other knowledge. No external database. No network dependency. Just a single .db file.
This is the upstream engine. If you want to build your own agent, fork cede instead. If you want an omnichannel deployment with an HTTP API, see omni-cede.
cortex-embedded <-- you are here (embedded memory graph engine)
|-- cede <-- forkable starter kit for building agents
|-- omni-cede <-- omnichannel variant (HTTP API, identity, sessions)
- Embedded memory graph: 18 node kinds, 6 edge kinds, full provenance tracking. Everything is a node; everything is queryable.
- Graph-native chat sessions: each turn stores a `UserInput` node and builds a fresh HNSW-based briefing (no growing message history)
- Hybrid recall: HNSW ANN search + BFS graph traversal + trust scoring + recency decay + session recency window
- Embeddings: BAAI/bge-small-en-v1.5 via fastembed (384-dim, runs locally)
- Auto-link: background task creates `RelatesTo` and `Contradicts` edges automatically
- Three-tier contradiction detection: cosine similarity -> negation keyword heuristic -> LLM adjudication
- Decay: importance fades over time; Soul/Belief/Goal nodes are immune
- Trust propagation: `Supports` edges boost trust, `Contradicts` edges reduce it
- Context compaction: LLM extracts key facts from long conversations into the graph
- LLM backends: Anthropic Claude, Ollama (local), Mock (testing)
- Tool registry: tools write provenance-tracked results into the graph
- Sub-agents: spawn into the shared graph with scoped identity
- TUI graph explorer: interactive terminal UI with chat panel, node inspection, graph visualization
- CLI: chat, ask, memory search, identity management, consolidation, diagnostics
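
To make the trust propagation feature concrete, here is a minimal sketch of how `Supports` and `Contradicts` edges could adjust a node's trust. It is an illustration under stated assumptions, not the crate's implementation: the function name `propagate_trust`, the fixed `step` size, and the clamping to the range 0.0 to 1.0 are all hypothetical.

```rust
// Hypothetical sketch: the names, the fixed step, and the [0, 1] clamp are assumptions,
// not cortex-embedded's actual trust model.

#[derive(Debug, Clone, Copy, PartialEq)]
enum EdgeKind {
    Supports,
    Contradicts,
    RelatesTo,
}

/// Adjust a node's trust from the kinds of its incoming edges.
/// Each `Supports` edge adds `step`, each `Contradicts` edge subtracts it,
/// and every other edge kind leaves trust unchanged.
fn propagate_trust(base: f32, incoming: &[EdgeKind], step: f32) -> f32 {
    let delta: f32 = incoming
        .iter()
        .map(|edge| match edge {
            EdgeKind::Supports => step,
            EdgeKind::Contradicts => -step,
            _ => 0.0,
        })
        .sum();
    (base + delta).clamp(0.0, 1.0)
}
```

With a step of 0.1, a node at trust 0.5 with two supporting edges and one contradicting edge would end up near 0.6 in this model.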
# Build
cargo build --release
# Initialize database and download embedding model
cortex init
# Interactive chat (requires LLM)
ANTHROPIC_API_KEY=sk-ant-... cortex chat
# or with Ollama
cortex --ollama llama3 chat
# Single query
cortex ask "What do you know about Rust?"
# Interactive graph explorer with chat
ANTHROPIC_API_KEY=sk-ant-... cortex graph explore
# Graph overview
cortex graph overview
# Filter graph by node kind
cortex graph filter soul,belief,fact
# Memory stats
cortex memory stats
# Semantic search
cortex memory search "authentication"
# View identity
cortex soul show
# Check graph health
cortex doctor
# Run trust consolidation
cortex consolidate

+---------------------------------------------+
|               cortex-embedded               |
+----------+----------+----------+------------+
|  recall  | briefing |  tools   |   agent    |
| (hybrid  | (context | (registry| (loop +    |
|  search) |  doc)    | + trust) | sub-agents)|
+----------+----------+----------+------------+
|               graph + memory                |
|         (BFS walk, scoring, decay)          |
+----------+----------------------------------+
|   HNSW   |              SQLite              |
| (2-tier) |   (WAL mode, bundled rusqlite)   |
+----------+----------------------------------+
|                  fastembed                  |
|          (BAAI/bge-small-en-v1.5)           |
+---------------------------------------------+
Unlike traditional chatbots that grow a message array, cortex-embedded stores each user input as a UserInput node in the graph. Every turn:
- Store the user's text as a `UserInput` node (embedded, auto-linked)
- Build a fresh briefing using the input as a semantic query against the full graph
- Include the recency window: always surface the last 7 session nodes regardless of similarity
- Send `[system(briefing), user(input)]` to the LLM, with no growing history
This means the agent's "memory" is the graph itself. Relevant prior exchanges surface through HNSW similarity; meta-conversational messages ("stop using big words") surface through the recency window.
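
The per-turn flow above can be sketched as follows. This is a minimal illustration rather than the crate's implementation: `Role`, `Message`, and `build_turn_messages` are hypothetical names, and the briefing is passed in as a plain string where the real engine would build it from the graph.

```rust
// Hypothetical sketch: these types are illustrative and not part of cortex-embedded's API.

#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

/// Assemble the prompt for a single turn.
/// Earlier turns are never appended; they can only reach the model through `briefing`.
fn build_turn_messages(briefing: &str, input: &str) -> Vec<Message> {
    vec![
        Message { role: Role::System, content: briefing.to_string() },
        Message { role: Role::User, content: input.to_string() },
    ]
}
```

However many turns a session has had, `build_turn_messages` returns exactly two messages, so prompt size is bounded by the briefing rather than by conversation length.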
| Category | Kinds |
|---|---|
| Knowledge | Fact, Entity, Concept, Decision |
| Identity | Soul, Belief, Goal |
| Conversational | UserInput |
| Operational | Session, Turn, LlmCall, ToolCall, LoopIteration |
| Sub-agents | SubAgent, Delegation, Synthesis |
| Self-model | Pattern, Capability, Limitation |
RelatesTo · Contradicts · Supports · DerivesFrom · PartOf · Supersedes
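
One way to picture typed edges between nodes is an adjacency list keyed by node id, with a bounded breadth-first walk over it of the kind the recall pipeline describes. The sketch below assumes `u64` node ids and a `Graph` type alias, both hypothetical; the engine's real storage is SQLite, and its traversal rules (for example, which edge kinds it follows) are not specified here.

```rust
// Hypothetical sketch: in-memory adjacency list standing in for the SQLite-backed graph.
use std::collections::{HashMap, HashSet, VecDeque};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EdgeKind {
    RelatesTo,
    Contradicts,
    Supports,
    DerivesFrom,
    PartOf,
    Supersedes,
}

/// Node id -> outgoing (edge kind, target id) pairs.
type Graph = HashMap<u64, Vec<(EdgeKind, u64)>>;

/// Breadth-first walk from `start`, following every edge kind.
/// Returns (node id, hop distance) pairs for nodes within `max_depth` hops.
fn bfs(graph: &Graph, start: u64, max_depth: usize) -> Vec<(u64, usize)> {
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0usize)]);
    let mut visited = Vec::new();
    while let Some((id, depth)) = queue.pop_front() {
        visited.push((id, depth));
        if depth == max_depth {
            continue; // do not expand past the depth limit
        }
        if let Some(edges) = graph.get(&id) {
            for &(_, next) in edges {
                // `insert` returns false for ids already seen, which prevents revisits
                if seen.insert(next) {
                    queue.push_back((next, depth + 1));
                }
            }
        }
    }
    visited
}
```

The hop distance returned with each node is the kind of signal a proximity bonus could be derived from, though how the engine computes that bonus is not stated in this document.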
Every interaction creates a provenance chain:
UserInput -> Session
Fact -> ToolCall -> LoopIteration -> Session
The agent knows not just what it knows, but how it came to know it, when, via which tool, and how much to trust it.
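
Answering "how did I come to know this?" amounts to following provenance links from a node back to its session. A minimal sketch, assuming a hypothetical `origin` map from each node to the node it came from (the real engine stores these links as graph edges, and the edge kinds it uses for them are not stated here):

```rust
// Hypothetical sketch: a plain map stands in for provenance edges in the graph.
use std::collections::HashMap;

/// Follow provenance links from `start` until a node has no recorded origin.
/// The returned chain begins with `start` itself.
fn provenance_chain<'a>(origin: &HashMap<&'a str, &'a str>, start: &'a str) -> Vec<&'a str> {
    let mut chain = vec![start];
    let mut current = start;
    while let Some(&next) = origin.get(current) {
        if chain.contains(&next) {
            break; // guard against accidental cycles
        }
        chain.push(next);
        current = next;
    }
    chain
}
```

Given links `Fact -> ToolCall`, `ToolCall -> LoopIteration`, and `LoopIteration -> Session`, walking from `Fact` reproduces the second chain shown above.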
Recall pipeline:
- Embed query -> HNSW k-NN search
- BFS graph walk from candidates
- Score: `importance * trust * recency * proximity_bonus`
- Merge recency window (last 7 session nodes)
- Return ranked nodes with contradiction warnings
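
The scoring and merge steps of this pipeline can be sketched as below. The HNSW search and BFS walk are omitted, and the `Candidate` struct, its field ranges, and `rank_with_recency` are hypothetical; only the score formula and the recency window come from the text above.

```rust
// Hypothetical sketch: candidate generation (HNSW + BFS) is assumed to have happened already.

#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    id: u64,
    importance: f32,
    trust: f32,
    recency: f32,         // assumed to lie in (0, 1], with 1.0 meaning "just created"
    proximity_bonus: f32, // assumed to be >= 1.0 for nodes close to a direct hit
}

/// Score formula from the recall pipeline description.
fn score(c: &Candidate) -> f32 {
    c.importance * c.trust * c.recency * c.proximity_bonus
}

/// Rank candidates by descending score, then append any of the last `window`
/// session node ids that are not already in the result.
fn rank_with_recency(mut candidates: Vec<Candidate>, session_ids: &[u64], window: usize) -> Vec<u64> {
    candidates.sort_by(|a, b| score(b).total_cmp(&score(a)));
    let mut ranked: Vec<u64> = candidates.iter().map(|c| c.id).collect();
    let start = session_ids.len().saturating_sub(window);
    for id in &session_ids[start..] {
        if !ranked.contains(id) {
            ranked.push(*id);
        }
    }
    ranked
}
```

Appending the recency window after ranking, rather than scoring it, is what lets a recent message surface even when it has no semantic similarity to the query.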
Background tasks:
- Auto-link: new nodes are compared against the graph; similar nodes get `RelatesTo` edges
- Contradiction detection: a three-tier pipeline (cosine threshold -> negation keywords -> LLM adjudication)
- Decay: every 60s, nodes lose importance proportional to elapsed time (floor: 0.01)
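
Two of these background tasks lend themselves to a short sketch: decay, and the first two tiers of contradiction detection. Everything below is an assumption made for illustration: the linear `rate_per_sec`, the similarity threshold, the keyword list, and the rule that a pair is escalated when exactly one statement contains a negation. Only the 0.01 floor and the immunity of Soul/Belief/Goal nodes come from this document.

```rust
// Hypothetical sketch: names, rates, thresholds, and the keyword list are assumptions.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind { Soul, Belief, Goal, Fact }

/// Importance never drops below this floor.
const IMPORTANCE_FLOOR: f32 = 0.01;

/// Linear decay: lose `rate_per_sec * elapsed_secs` importance, except for immune kinds.
fn decay(kind: Kind, importance: f32, elapsed_secs: f32, rate_per_sec: f32) -> f32 {
    match kind {
        Kind::Soul | Kind::Belief | Kind::Goal => importance,
        Kind::Fact => (importance - rate_per_sec * elapsed_secs).max(IMPORTANCE_FLOOR),
    }
}

#[derive(Debug, PartialEq)]
enum Triage { Unrelated, NoNegationSignal, AskLlm }

/// Cosine similarity; returns 0.0 if either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

/// True if any word, stripped of punctuation and lowercased, is a negation keyword.
fn has_negation(text: &str) -> bool {
    const NEGATIONS: [&str; 4] = ["not", "never", "no", "cannot"];
    text.split_whitespace().any(|word| {
        let cleaned = word.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase();
        NEGATIONS.contains(&cleaned.as_str())
    })
}

/// Tier 1 drops dissimilar pairs; tier 2 escalates to the LLM (tier 3) only when
/// exactly one of two similar statements contains a negation.
fn triage(emb_a: &[f32], text_a: &str, emb_b: &[f32], text_b: &str, threshold: f32) -> Triage {
    if cosine(emb_a, emb_b) < threshold {
        return Triage::Unrelated;
    }
    if has_negation(text_a) != has_negation(text_b) {
        Triage::AskLlm
    } else {
        Triage::NoNegationSignal
    }
}
```

Ordering the tiers from cheapest to most expensive means the LLM is consulted only for pairs that are both semantically close and lexically suspicious.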
Embed the memory graph directly in your application — no server, no network calls:
use cortex_embedded::{CortexEmbedded, types::*};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let cx = CortexEmbedded::open("my_agent.db").await?;

    // Store knowledge
    let node = Node::new(NodeKind::Fact, "Rust is fast")
        .with_body("Rust provides zero-cost abstractions and memory safety.");
    cx.remember(node).await?;

    // Recall
    let results = cx.recall("performance", RecallOptions::default()).await?;
    for r in &results {
        println!("[{}] {} - score: {:.3}", r.node.kind, r.node.title, r.score);
    }

    // Build briefing for LLM
    let briefing = cx.briefing("system design", 12).await?;
    println!("{}", briefing.context_doc);

    Ok(())
}

| Crate | Purpose |
|---|---|
| `rusqlite` (bundled) | SQLite with WAL mode |
| `instant-distance` | HNSW approximate nearest neighbor search |
| `fastembed` | Local text embeddings (ONNX runtime) |
| `tokio` | Async runtime |
| `reqwest` | HTTP client for Anthropic API |
| `clap` | CLI argument parsing |
| `ratatui` + `crossterm` | TUI graph explorer |
| `async-channel` | Background task communication |
# Run all 28 tests
cargo test -- --test-threads=1
# Unit tests only (HNSW)
cargo test --lib hnsw
# Integration tests only
cargo test --test integration -- --test-threads=1

License: MIT