Memory for your fleet of AI coding agents.
memto is the unified memory layer for the multiple AI coding CLIs you use.
Wake up any past session across Claude Code, Codex, Hermes, and OpenClaw
and ask it a question, or just read its transcript.
No extraction. No daemon. No cloud.
Like the movie: every past AI session is a polaroid you can hold up and ask questions of.
Because your agents remember, and you don't.
If you run one AI coding tab at a time, you don't need this.
If you run five (résumé in one, startup in another, debugging a customer issue in a third, deep research in a fourth, taxes in a fifth), then today the answer to "where is the LaTeX file for my résumé?" lives in exactly one agent's head.
The other four have no idea. You, the human, are the only thing connecting them, and your short-term memory is the bottleneck.
memto is built for that scenario: multiple AI coding agents running in parallel across unrelated projects. Not for enterprise teams locked to a single tool. Not for deep single-codebase work. For the super-individual with five tabs open.
Product decisions come from these three principles. None of them are negotiable.
- The memory IS the session. No extraction, no embeddings, no "facts in a vector DB". The raw transcript file your agent CLI already wrote: that's the memory. We just make it queryable.
- Never mutate the past. Every `ask` forks a non-destructive copy. Your original session files are never touched. Rolling back is always a no-op because nothing changed.
- Agent-native, zero ops. One bundled CLI, `--json` on every command, and a bundled skill that teaches your agents when to call it. No daemon. No database. No cloud. `npx memto-cli` and go.
| Feature | Details |
|---|---|
| Cross-runtime, no extraction | One unified interface for Claude Code, Codex, Hermes, and OpenClaw. Every adapter reads native files directly; no conversion step, no ingestion pipeline. This is the only thing in the market that does this. |
| Fork-safe by design | Every ask copies the session, asks on the copy, deletes the copy. Original files untouched. You can safely query a 3-month-old session without fear of polluting it. |
| Two-tier access | `memto messages` reads the transcript directly. `memto ask` forks and revives the original agent. Pick the one that fits the question; agents learn to read first, synthesize second. |
| Agent-native output | `--json` everywhere. Ships a markdown skill so any modern agent CLI picks up the usage pattern automatically. No MCP server needed. |
| No DB, no daemon, no cloud | Contrast with Mem0 / Letta / Zep / chum-mem, which all require ingestion pipelines and external stores. memto ships a single 60 KB JS file. |
| Auto-scaled timeouts | 120s floor + 1s per MB of transcript. Large sessions (60 MB+) don't silently die from premature kills. |
| Prompt wrapper filtering | Runtime-specific noise (`<environment_context>`, `Sender (untrusted metadata):`, slash-command blobs, skill-injection headers) gets stripped so `first_user_prompt` is what the human actually typed. |
| 61 tests, 4 runtimes verified | Every adapter has synthetic-fixture tests. All four runtimes end-to-end verified against real local stores. |
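The timeout rule is simple enough to show inline. A minimal sketch of the 120s-floor-plus-1s-per-MB scaling, assuming the transcript size is available in bytes (the helper name is illustrative, not part of memto's API):

```ts
// Auto-scaled ask timeout: 120s floor, plus 1s for every MB of transcript.
// Illustrative sketch only; not the actual memto implementation.
function askTimeoutMs(transcriptBytes: number): number {
  const floorMs = 120_000; // 120s minimum for any session
  const perMbMs = 1_000;   // +1s per MB of transcript
  const megabytes = transcriptBytes / (1024 * 1024);
  return floorMs + Math.ceil(megabytes) * perMbMs;
}

// A 60 MB session gets 120s + 60s = 180s before it is considered dead.
console.log(askTimeoutMs(60 * 1024 * 1024)); // 180000
```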
```bash
# one-shot, no install
npx memto-cli list

# global install
npm i -g memto-cli && memto --help
```

Teach your agents to call memto automatically by dropping the bundled skill into your agent's skills directory:

```bash
curl -fsSL https://raw.githubusercontent.com/shizhigu/memto/main/skills/memto.md \
  > ~/.claude/skills/memto.md   # adjust path for your agent
```

Once dropped in, your agent automatically learns when to use `memto messages` vs `memto ask`.
```bash
memto list --limit 10
```

```
[claude-code] 2026-04-10  refactor-billing-service
    cwd:   ~/Projects/billing
    first: migrate Stripe webhooks to async handlers, preserve idempotency…
    model: claude-opus-4-6

[codex      ] 2026-04-09  fix-memory-leak-in-parser
    cwd:   ~/Projects/lsp-server
    first: investigate heap growth during long document parses

[hermes     ] 2026-04-08  onboarding-email-sequence
    first: draft a 5-email welcome series for new B2B signups

[openclaw   ] 2026-04-05  deploy-staging
    first: verify the CD pipeline is green before Tuesday's release cut
```
Every runtime, one merged view. Pipe to jq for filtering:
```bash
memto list --json --limit 30 | jq '.[] | select(.cwd | test("billing"))'
```

```bash
memto grep "retry.*policy" -i --role user --json
memto grep "stripe.*webhook" --runtime claude-code --since 2026-03-01 --json
```

Scans every session's transcript in parallel (default: all four runtimes, most-recent-first, up to 200 per runtime). Returns hits grouped by session, each with role + timestamp + snippet. Typically 2-20 seconds for 170+ sessions.
This is the right first command for any "find the thing" question; usually you don't know up front which session holds the answer.
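The same scan is possible from code through the library API documented below. A rough sketch using `listAllSessions` and `getMessages` (sequential for clarity; concurrency, error handling, and any message fields beyond `runtime`, `id`, `role`, and `text` are assumptions):

```ts
import { listAllSessions, getMessages } from '@memto/session-core';

// Scan recent sessions across all four runtimes for a pattern.
const pattern = /retry.*policy/i;
const sessions = await listAllSessions({ limitPerRuntime: 200 });

for (const session of sessions) {
  const messages = await getMessages(session.runtime, session.id);
  const hits = messages.filter(m => m.role === 'user' && pattern.test(m.text));
  if (hits.length > 0) {
    console.log(`[${session.runtime}] ${session.id}: ${hits.length} hit(s)`);
  }
}
```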
```bash
memto messages --id <session_id> --last 10 --json
memto messages --id <session_id> --grep "retry" --role user --json
```

Sub-second, zero tokens. Use this for content lookup: file paths, error messages, decisions stated verbatim. 80% of memory queries can be answered here without ever forking.
```bash
memto ask --id <session_id> --question "what did we decide about retry logic?"
```

```
─── [claude-code] refactor-billing-service ───
We settled on exponential backoff keyed by (customer_id, event_type),
capped at 24h, with idempotency keys persisted to Redis for 7 days.
```

Use when raw content isn't enough: when you need the original agent's synthesis, not just its transcript. The fork is non-destructive; originals are never touched.
```bash
# what did past-me think during messages 20..40?
memto reconstruct --id <session_id> --from-msg 20 --upto-msg 40 \
  --question "what was my position on the retry debate?"

# what did I believe before I learned X?
memto reconstruct --id <session_id> --upto 2026-03-15T10:30:00Z \
  --question "what's the leading approach?"
```

Forks the session, truncates the copy to the window [from, upto], then asks. The agent answers from that slice only: no hindsight from later messages, no noise from unrelated earlier episodes. This is the closest thing memto has to cognitive science's "reconstructive episodic memory": you're not replaying the whole session, you're reconstructing what the agent could have known at that moment.
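Conceptually, the window is just a filter applied to the forked copy's message list before the agent is revived. A sketch of that truncation step, assuming each message carries an index and an ISO-8601 timestamp (a hypothetical shape, not memto's internal types):

```ts
// Hypothetical message shape, for illustration only.
interface WindowedMessage {
  index: number;      // position in the transcript
  timestamp: string;  // ISO-8601, e.g. "2026-03-15T10:30:00Z"
  text: string;
}

// Keep only what the agent could have known inside the requested window.
function truncateWindow(
  messages: WindowedMessage[],
  opts: { fromMsg?: number; uptoMsg?: number; upto?: string },
): WindowedMessage[] {
  return messages.filter(m => {
    if (opts.fromMsg !== undefined && m.index < opts.fromMsg) return false;
    if (opts.uptoMsg !== undefined && m.index > opts.uptoMsg) return false;
    // ISO-8601 timestamps in the same zone compare correctly as strings.
    if (opts.upto !== undefined && m.timestamp > opts.upto) return false;
    return true;
  });
}
```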
```
you / your agent
        │
        │  memto list · messages · ask
        ▼
┌─────────────────────────────────────────┐
│  memto - one CLI, npx-able              │
└───────────────┬─────────────────────────┘
                │  NormalizedSession / NormalizedMessage
                ▼
┌─────────────────────────────────────────┐
│  @memto/session-core                    │
│  claude-code · codex · hermes · openclaw│
└───┬──────────┬───────────┬──────────┬───┘
    ▼          ▼           ▼          ▼
~/.claude  ~/.codex    ~/.hermes ~/.openclaw
```

Four native stores, one normalized shape. Each adapter reads its runtime's files directly; no ingestion, no duplicate store. SQLite access for hermes uses `bun:sqlite` under Bun and `better-sqlite3` under Node (picked at runtime).
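A minimal sketch of how that runtime pick might look, assuming both drivers accept a `(path, { readonly })` constructor (the actual `sqlite.ts` shim may differ):

```ts
// Sketch of a sqlite.ts-style shim: bun:sqlite under Bun, better-sqlite3 under Node.
// Illustrative only; the real shim in @memto/session-core may differ.
export async function openSessionDb(path: string) {
  if ('Bun' in globalThis) {
    const { Database } = await import('bun:sqlite');
    return new Database(path, { readonly: true });
  }
  const { default: Database } = await import('better-sqlite3');
  return new Database(path, { readonly: true });
}
```

The higher-level library surface looks like this: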
```ts
import { listAllSessions, getMessages, ask } from '@memto/session-core';

// 1. enumerate
const sessions = await listAllSessions({
  limitPerRuntime: 20,
  sampling: { strategy: 'head-and-tail', head: 2, tail: 2 },
});

// 2. read transcript directly
const resumeSession = sessions.find(s => /résumé/i.test(s.title ?? ''));
if (resumeSession) {
  const msgs = await getMessages(resumeSession.runtime, resumeSession.id);
  const hit = msgs.find(m => /\.tex/.test(m.text));
  if (hit) console.log(hit.text);
}

// 3. synthesize: wake up the original agent
if (resumeSession) {
  const { answer, timed_out } = await ask(resumeSession, 'where is the LaTeX file?');
  if (!timed_out) console.log(answer);
}
```

Think of memory not as a database but as a fleet of dormant coworkers.
Each past session is one coworker. They kept detailed notes while they were working β the full transcript, every file they touched, every decision they made. They went home at the end of the day.
When you want to know something, you don't try to rebuild their knowledge from scratch yourself. Either:

- You read their notes directly: that's `memto messages`. Fast, free, but you have to scan.
- You tap one on the shoulder: that's `memto ask`. "Hey, quick question." They wake up, answer from the full context already in their head, then go back to sleep.

The "tap on the shoulder" is called fork-resume: we clone their session state just enough to run the question, get the answer, and discard the clone. The original session file is never modified.
|  | memto | Mem0 / Zep | Letta |
|---|---|---|---|
| Unit of memory | whole past session, queryable live | extracted facts in a vector DB | hierarchical summary tiers in one agent |
| Cross-runtime | ✅ 4 runtimes, 1 interface | ❌ app-specific | ❌ per-agent |
| Non-destructive read | ✅ fork-safe | n/a | ❌ internal only |
| External dependencies | 0 (just Node) | ChromaDB etc. | Postgres / SQLite |
| First-time cost | none β indexes what your CLIs already wrote | re-ETL every conversation | re-architect your agent |
| Best for | the super-individual running 5+ AI tabs | single-app long-term memory | single-agent role-played memory |
The fundamental divide: everything on the right takes your agent conversations, extracts structured claims from them, and stores those claims elsewhere. memto doesn't extract. The raw session IS the memory; you just wake it up and ask.
```
memto/
├── packages/
│   ├── cli/                 - the `memto` binary
│   └── session-core/        - universal adapter + fork/ask orchestration
│       └── src/
│           ├── types.ts
│           ├── jsonl.ts     - streaming JSONL reader
│           ├── sqlite.ts    - bun:sqlite / better-sqlite3 shim
│           ├── derive.ts    - title / prompt / sampling helpers
│           ├── resume.ts    - ask() orchestrator per runtime
│           └── adapters/
│               ├── claude-code.ts
│               ├── codex.ts
│               ├── hermes.ts
│               └── openclaw.ts
├── skills/
│   └── memto.md             - standard-format skill; drop into your agent's skills/
├── examples/
└── assets/
```
- v0.4: Cursor / Windsurf / Zed adapters · live file-watch indexing · richer summary hooks
- v0.5: cross-device encrypted sync · per-session privacy tags
- v0.6: team-shared memory (opt-in sharing of specific sessions between people) · simple web dashboard
File an issue if one of these matters to you, or open a PR.
See CONTRIBUTING.md. TL;DR: each adapter is ~200 lines, tests use synthetic fixtures, PRs welcome.
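To give a feel for what "~200 lines per adapter" covers, here is a guess at the shape an adapter normalizes into, reconstructed from the NormalizedSession / NormalizedMessage names in the architecture diagram (field and method names are illustrative, not the real types in `types.ts`):

```ts
// Illustrative only; the real definitions live in packages/session-core/src/types.ts.
type Runtime = 'claude-code' | 'codex' | 'hermes' | 'openclaw';

interface NormalizedSession {
  runtime: Runtime;
  id: string;
  title?: string;
  cwd?: string;
  updatedAt?: string;
}

interface NormalizedMessage {
  role: 'user' | 'assistant' | 'tool';
  timestamp?: string;
  text: string;
}

// Each adapter maps one runtime's native store onto this normalized shape.
interface RuntimeAdapter {
  runtime: Runtime;
  listSessions(limit?: number): Promise<NormalizedSession[]>;
  getMessages(sessionId: string): Promise<NormalizedMessage[]>;
}
```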
Built for the super-individual running five AI tabs at once.
Every session becomes a polaroid you can ask questions of.