User Guide

This guide is for operators and end users of mem0-server: people who want to deploy it, connect clients to it, and use it day to day. If you want to work on the code itself, see the Developer Guide.

What it is
How memory works
Prerequisites
Configuration reference
Choosing a deployment method
Deploying with Docker Compose
Deploying to CapRover
Connecting clients
Prompting agents to use memory
REST API reference
Backups and restore
Health and monitoring
Troubleshooting

What it is

mem0-server is a self-hosted mem0 memory store that you run as a single service. It gives AI agents and scripts a shared, persistent long-term memory, reachable two ways from one process:

REST API under /api/v1/memories… — for scripts, n8n, curl, and any HTTP client.
Streamable HTTP MCP under /mcp — for Claude Code, Claude Desktop, Claude.ai web, and Cowork.

Both protocols read and write the same memory store, so a fact you save from Claude Code is searchable from a curl script and vice versa.

It is single-user by design: there is exactly one user, set by MEM0_DEFAULT_USER_ID. There is no multi-tenant separation — anyone holding the API token has full access to the one memory store.

How memory works

When you add a memory, mem0 uses an LLM (Anthropic Claude by default) to extract durable facts from your text, then stores each fact as a vector embedding (OpenAI by default) in Qdrant. Searches are semantic: you ask in natural language and get back the most similar stored facts, not keyword matches.

Re-adding the same content is free. Before that LLM extraction runs, the server fingerprints the content (normalized: lowercased and whitespace-collapsed, so differences in case or spacing still match); if it matches something already stored, the add is skipped (no LLM call) and the response is {"results": [], "deduplicated": true, "memory_id": "…"}. This makes re-runs of imports and webhook/n8n retries cheap and idempotent. Pass "dedup": false on a REST add to force re-extraction. (This is distinct from mem0's semantic dedup, which still applies when similar-but-not-identical content reaches the LLM.)
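
As a rough illustration of the fingerprinting step, the sketch below normalizes text (lowercase, collapse whitespace) and hashes the result. The helper name `content_fingerprint` and the use of SHA-256 are assumptions made for this example; the guide does not say which hash the server uses.

```python
import hashlib
import re


def content_fingerprint(text: str) -> str:
    """Hypothetical helper: stable hash of the normalized content."""
    # Lowercase first, then collapse every run of whitespace into one space.
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    # SHA-256 is an assumption; any stable digest would illustrate the idea.
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Case and spacing differences collapse to the same fingerprint.
a = content_fingerprint("We host  on CapRover")
b = content_fingerprint("we host on\ncaprover")
```

Because the comparison is on the normalized form, a webhook retry that differs only in capitalization or trailing whitespace would still count as the same content under this scheme.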

Memories can optionally be tagged with:

agent_id — a provenance tag for which agent/tool wrote it (e.g. n8n-flow, claude-code). Over MCP it is write-only: the search/list tools always span the whole store, so every connected agent (Claude Code, Codex, Claude.ai web, …) shares one memory. The REST API can still filter reads by agent_id for scripts that explicitly want a slice.
run_id — a session or workflow run identifier.
metadata — arbitrary JSON you attach to a memory.

user_id always defaults to MEM0_DEFAULT_USER_ID; you rarely need to set it.

Provenance and review metadata (convention)

For "governed" agent memory — distinguishing trusted from untrusted, fresh from stale — there's a small convention of reserved metadata keys. They're just ordinary metadata (nothing enforces them), but the REST read endpoints can filter on them, and agents can reason about them:

Key	Recommended values	Meaning
`source`	free-form, e.g. `user`, `agent`, `import:chatgpt`, `capture:telegram`, `tool:n8n`	Where the memory came from. The import scripts and capture bot already set this.
`confidence`	`high`, `medium`, `low`, `unknown`	How much to trust the content.
`review_status`	`unreviewed`, `approved`, `rejected`, `stale`	Whether a human/agent has vetted it.
`reviewed_by` / `reviewed_at`	free-form / ISO 8601	Who vetted it and when (optional).
`expires_at`	ISO 8601 timestamp	When the fact should stop being trusted.

confidence and review_status are independent — a memory can be high-confidence but unreviewed, or approved but deliberately low-confidence. expires_at addresses the common agent-memory failure mode where old memory stays trusted after the world changed: set it on facts that age out, then pass exclude_expired=true on reads (below) to drop them. This is a flat convention layered on metadata; the existing top-level agent_id remains the writer tag.

Set them on any write, e.g.:

curl -X POST https://mem0.your-domain.com/api/v1/memories \
  -H "Authorization: Bearer $MEM0_API_KEY" -H "Content-Type: application/json" \
  -d '{"content": "Q3 OKRs are finalized", "agent_id": "claude-code",
       "metadata": {"source": "user", "confidence": "high",
                    "review_status": "approved", "expires_at": "2026-10-01T00:00:00Z"}}'

Filter reads by them with the query params on GET /api/v1/memories (and the same fields on search) — see the REST API reference. Per the shared-store design, the MCP read tools never filter by these (they span the whole store); metadata filtering is a REST-only affordance for scripts.
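
If a script already holds a list of memories, the same convention can be applied client-side. This is a minimal sketch that assumes each item is a dict with a `metadata` dict and that a timestamp without an offset means UTC; `is_expired` and `trusted` are hypothetical helper names, not part of the server.

```python
from datetime import datetime, timezone


def is_expired(metadata: dict, now: datetime) -> bool:
    """True when metadata carries an expires_at that is not in the future."""
    raw = metadata.get("expires_at")
    if not raw:
        return False  # no expiry set, so the memory never ages out
    # Older Python versions reject a trailing "Z", so normalize it first.
    expires = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return expires <= now


def trusted(memories: list[dict], now: datetime) -> list[dict]:
    """Keep memories that are approved and not expired."""
    kept = []
    for memory in memories:
        meta = memory.get("metadata") or {}
        if meta.get("review_status") == "approved" and not is_expired(meta, now):
            kept.append(memory)
    return kept
```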

Prerequisites

Before deploying you need:

Requirement	Why
Docker + Docker Compose, or a CapRover instance	Runs the app. See Choosing a deployment method.
A reachable Qdrant instance (with API key)	Vector backend that stores the memories. The Docker Compose method provides this for you.
An Anthropic API key	Default LLM for fact extraction.
An OpenAI API key	Default embedding model.
An S3 bucket + AWS credentials	Only for the nightly backup app (CapRover).
A domain/subdomain (e.g. `mem0.your-domain.com`)	Public HTTPS URL for clients and OAuth. Optional for a local Docker Compose run.

You can swap the LLM/embedder providers (see Configuration reference), but the defaults above are the supported path.

Critical: MEM0_EMBED_DIMS must match the embedding model's real output dimension (text-embedding-3-small = 1536, text-embedding-3-large = 3072). A mismatch produces silently empty search results, not an error. Changing the embedding model later requires dropping and recreating the Qdrant collection.
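
Since a mismatch fails silently, a small preflight check before deploying can help. The sketch below only knows the two dimensions stated above; `check_embed_dims` is a hypothetical helper, not something the server ships.

```python
# Output dimensions stated in this guide; other models are not covered here.
KNOWN_EMBED_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_embed_dims(model: str, configured_dims: int) -> str:
    """Return "ok", "mismatch", or "unknown" for a model/dimension pair."""
    expected = KNOWN_EMBED_DIMS.get(model)
    if expected is None:
        return "unknown"  # look up the model's dimension in the provider docs
    return "ok" if expected == configured_dims else "mismatch"
```

For example, pairing `text-embedding-3-large` with the default `1536` would be reported as a mismatch.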

Configuration reference

All configuration is via environment variables, validated at startup by app/config.py. The service refuses to start if a required variable is missing. Copy .env.example to .env for local runs, or set these in the CapRover app's App Configs panel for production.

Variable	Required	Default	Notes
`QDRANT_HOST`	yes	—	Qdrant hostname, e.g. `qdrant.your-domain.com`.
`QDRANT_PORT`	no	`443`	Qdrant port.
`QDRANT_HTTPS`	no	`true`	Use HTTPS to reach Qdrant.
`QDRANT_API_KEY`	yes	—	Qdrant API key.
`MEM0_COLLECTION`	no	`memories`	Qdrant collection name.
`MEM0_DEFAULT_USER_ID`	yes	—	The single user, e.g. `default-user`.
`MEM0_LLM_PROVIDER`	no	`anthropic`	LLM provider for fact extraction.
`MEM0_LLM_MODEL`	no	`claude-haiku-4-5-20251001`	LLM model.
`ANTHROPIC_API_KEY`	if provider=anthropic	—	Required when the LLM provider is Anthropic.
`MEM0_EMBED_PROVIDER`	no	`openai`	Embedding provider.
`MEM0_EMBED_MODEL`	no	`text-embedding-3-small`	Embedding model.
`MEM0_EMBED_DIMS`	no	`1536`	Must match the embedder's real dimension.
`OPENAI_API_KEY`	if provider=openai	—	Required when the embed provider is OpenAI.
`MEM0_API_KEY`	yes	—	Static bearer token protecting REST + MCP. Generate with `openssl rand -hex 32`.
`PUBLIC_BASE_URL`	yes	—	Public URL, e.g. `https://mem0.your-domain.com`. Used in OAuth metadata.
`OAUTH_SIGNING_KEY`	no	empty	PEM RSA private key. Setting this enables Phase 2 OAuth. Leave blank for Phase 1.
`OAUTH_ALLOWED_REDIRECT_URIS`	no	claude.ai + cowork + chatgpt callbacks	Comma-separated allowlist for OAuth redirect URIs. An entry ending in `*` is a path-prefix match locked to an exact scheme + host; it must be a full `scheme://host/path/` prefix (e.g. `https://chatgpt.com/connector/oauth/*`). Host-only or bare wildcards (`https://chatgpt.com*`, `https://*`, `*`) are ignored, so a misconfigured entry can't match lookalike hosts like `chatgpt.com.evil.com`.
`TRUST_FORWARDED_FOR`	no	`true`	Use the first `X-Forwarded-For` hop as the client IP for rate limiting. Correct behind CapRover's nginx; set to `false` if the app is exposed directly (no reverse proxy), where the header would be attacker-controlled.
`RATE_LIMIT_AUTH_FAILURES`	no	`10`	Failed bearer-token attempts (REST + MCP, per surface) allowed per IP per window before 429s. `0` disables.
`RATE_LIMIT_AUTH_WINDOW_SECONDS`	no	`60`	Window for the above.
`RATE_LIMIT_CONSENT_FAILURES`	no	`5`	Failed OAuth consent (wrong API key) attempts per IP per window. `0` disables.
`RATE_LIMIT_CONSENT_WINDOW_SECONDS`	no	`300`	Window for the above.
`RATE_LIMIT_TOKEN_FAILURES`	no	`10`	Failed `/oauth/token` exchanges per IP per window. `0` disables.
`RATE_LIMIT_TOKEN_WINDOW_SECONDS`	no	`60`	Window for the above.
`LOG_LEVEL`	no	`INFO`	Log level.

Rate limiting

Failed authentication attempts are rate-limited per client IP to slow down brute-force guessing of MEM0_API_KEY (and OAuth codes). Only failures count — normal authenticated traffic is never throttled — but once an IP crosses the limit, all its requests to that surface (even with the correct token) get HTTP 429 with a Retry-After header until the window expires. The four surfaces (REST /api/v1/..., MCP /mcp, OAuth consent, OAuth token) are limited independently. /healthz and /metrics are never limited. Limits are per uvicorn worker (the default image runs 2 workers), so the effective ceiling is about twice the configured value. If you lock yourself out during testing, wait out the window or restart the app.
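
To make the behavior concrete, here is a minimal sketch of a failure-only limiter. It assumes a sliding window and that an IP is blocked once it has reached the configured number of failures. The server's actual algorithm and exact threshold semantics are not documented here, and `FailureLimiter` is a hypothetical name.

```python
from collections import defaultdict, deque


class FailureLimiter:
    """Hypothetical per-IP limiter that counts only failed attempts."""

    def __init__(self, max_failures: int, window_seconds: float) -> None:
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # ip -> timestamps of failures

    def _prune(self, ip: str, now: float) -> deque:
        # Drop failures that have aged out of the window.
        q = self._failures[ip]
        while q and now - q[0] >= self.window_seconds:
            q.popleft()
        return q

    def record_failure(self, ip: str, now: float) -> None:
        self._prune(ip, now).append(now)

    def retry_after(self, ip: str, now: float) -> int:
        """Seconds to send in Retry-After, or 0 if the IP is not blocked."""
        if self.max_failures == 0:
            return 0  # a limit of 0 disables the check
        q = self._prune(ip, now)
        if len(q) < self.max_failures:
            return 0
        # Blocked until enough failures age out to drop below the limit.
        pivot = q[len(q) - self.max_failures]
        return max(1, int(pivot + self.window_seconds - now))
```

A request handler would call `retry_after` first and answer 429 when it is non-zero, even for a correct token, and call `record_failure` only when authentication fails. With two workers each holding its own instance, an IP can accumulate roughly twice the configured failures before every worker blocks it.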

Phases

Phase 1 (MVP): static bearer token only. Leave OAUTH_SIGNING_KEY blank. Works with Claude Code, Claude Desktop, curl, n8n, and anything else that can send an Authorization: Bearer header.
Phase 2 (OAuth): set OAUTH_SIGNING_KEY to a PEM RSA private key. This turns on OAuth 2.1 + PKCE + Dynamic Client Registration endpoints so Claude.ai web, Cowork, and ChatGPT can connect. The static bearer token keeps working alongside OAuth.

Generate an OAuth signing key with:

openssl genrsa 2048

When pasting a multi-line PEM into a single env var, replace newlines with \n — the app converts \n back to real newlines at load time.
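
The newline handling can be sketched as a pair of string replacements. Both helper names are hypothetical; the second mirrors what the guide says the app does at load time, without claiming to be its actual code.

```python
def pem_to_env_value(pem: str) -> str:
    """Collapse a multi-line PEM into one line using literal \\n sequences."""
    return pem.strip().replace("\n", "\\n")


def env_value_to_pem(value: str) -> str:
    """Reverse step: turn literal \\n sequences back into real newlines."""
    return value.replace("\\n", "\n")


# Placeholder PEM for illustration only; not a real key.
example_pem = "-----BEGIN PRIVATE KEY-----\n<base64 body>\n-----END PRIVATE KEY-----\n"
one_line = pem_to_env_value(example_pem)
```

Paste the value of `one_line` into OAUTH_SIGNING_KEY; it contains no real line breaks, so it fits in a single-line env var field.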

Choosing a deployment method

There are two supported ways to run mem0-server:

Docker Compose — the simplest path if you don't already run CapRover. One docker compose up brings up both Qdrant and the app on a single host, with persistent volumes for each. You manage your own HTTPS (typically via a reverse proxy) and your own backups. Best for a single VM, a homelab, or local use.
CapRover — best if you already operate a CapRover instance and want push-to-main auto-deploy plus the companion nightly S3 backup app. This method connects to an existing, external Qdrant.

The application is identical in both cases; only the surrounding infrastructure differs. The sections below cover each.

Deploying with Docker Compose

The repository ships a docker-compose.yml that runs Qdrant and the app together. You do not need an external Qdrant for this method.

Copy the example environment file and fill in the secrets:
```
cp .env.example .env
```
At minimum set: MEM0_API_KEY (generate with openssl rand -hex 32), QDRANT_API_KEY (any strong secret — the bundled Qdrant is configured to require it), ANTHROPIC_API_KEY, OPENAI_API_KEY, and MEM0_DEFAULT_USER_ID.

You can leave QDRANT_HOST, QDRANT_PORT, and QDRANT_HTTPS at their .env.example values — the compose file overrides them to point at the in-stack Qdrant service (qdrant:6333, no TLS on the internal network).
Bring up the stack:
```
docker compose up -d
```
This builds the app image from the root Dockerfile, starts Qdrant with a persistent qdrant_data volume, and starts the app on http://localhost:8000. The app's /healthz endpoint round-trips to Qdrant; once it returns {"ok": true, ...} the stack is ready.
Verify:
```
curl http://localhost:8000/healthz
```

HTTPS and public access. The compose stack serves plain HTTP on port 8000. MCP clients and OAuth require HTTPS, so for anything beyond local use put the app behind a reverse proxy (Caddy, nginx, Traefik) that terminates TLS, and set PUBLIC_BASE_URL in .env to the public HTTPS URL (e.g. https://mem0.your-domain.com). For Phase 2 OAuth, also set OAUTH_SIGNING_KEY (see Phases).

Backups. The nightly S3 backup app is part of the CapRover setup. With Docker Compose you can take Qdrant snapshots yourself against the bundled instance — see Backups and restore for the snapshot/restore API; the qdrant_data volume also holds the on-disk data.

Updating. Pull the latest code and rebuild:

git pull
docker compose up -d --build

Deploying to CapRover

This method connects to an existing, external Qdrant (it does not start one for you). Deployment is triggered by a CapRover webhook on pushes to main: merging to main rebuilds and redeploys the app automatically, regardless of CI status.

1. Deploy the main app (`mem0-server`)

In CapRover, create a new app named mem0-server. Enable Has Persistent Data and map a volume to /app/data (used by the Phase 2 OAuth SQLite store; harmless in Phase 1).
Open App Configs and set every required variable from the Configuration reference. At minimum: QDRANT_HOST, QDRANT_API_KEY, MEM0_DEFAULT_USER_ID, ANTHROPIC_API_KEY, OPENAI_API_KEY, MEM0_API_KEY, PUBLIC_BASE_URL.
Set Container HTTP Port to 8000.
Under Deployment → Method 3 (Deploy from GitHub/Bitbucket/GitLab), point at this repository and the main branch. CapRover gives you a webhook URL — add it as a GitHub push webhook on the repo so merges to main auto-deploy.
Under HTTP Settings, enable HTTPS and Force HTTPS, and attach your domain (e.g. mem0.your-domain.com). This domain must match PUBLIC_BASE_URL.

The repository root captain-definition and Dockerfile build the image. The container runs uvicorn app.main:app --workers 2 and exposes a /healthz healthcheck.

2. Deploy the backup app (`mem0-backup`)

The nightly Qdrant→S3 backup is a separate CapRover app built from the backup/ directory in this same repository.

Create a second CapRover app named mem0-backup. It needs no exposed ports.
Set its Captain Definition Relative Path to ./backup/captain-definition.
Point its deployment at this repo / main (same webhook pattern, or deploy manually).
Set its env vars: QDRANT_URL (e.g. https://qdrant.your-domain.com), QDRANT_API_KEY, MEM0_COLLECTION, S3_BUCKET, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally S3_PREFIX (default mem0-backups), AWS_DEFAULT_REGION (default us-east-1), and RETENTION_DAYS (default 14).

The backup container runs crond and executes backup/backup.sh nightly at 03:00 UTC. Each run creates a Qdrant snapshot, downloads it, uploads it to S3, deletes the Qdrant-side snapshot, keeps the 3 most recent local files, and prunes S3 objects older than RETENTION_DAYS.
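
The two retention rules can be pictured as follows. This is only an illustration in Python of the selection logic described above (the real job is the shell script backup/backup.sh); the function names and the idea of passing in name-to-timestamp mappings are assumptions.

```python
from datetime import datetime, timedelta, timezone


def s3_keys_to_prune(objects: dict, now: datetime, retention_days: int = 14) -> list:
    """Keys whose last-modified time is older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(key for key, modified in objects.items() if modified < cutoff)


def local_files_to_delete(files: dict, keep: int = 3) -> list:
    """Everything except the `keep` most recently modified local files."""
    newest_first = sorted(files, key=files.get, reverse=True)
    return newest_first[keep:]
```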

3. Deploy the digest app (optional, `mem0-digest`)

The optional digest app posts a periodic summary of recently added memories to a Slack or Discord channel — a lightweight way to resurface what you've been capturing. Like the backup app it's a separate, port-less CapRover app built from the digest/ directory, running crond.

Create a CapRover app named mem0-digest. It needs no exposed ports.
Set its Captain Definition Relative Path to ./digest/captain-definition.
Point its deployment at this repo / main (same webhook pattern, or deploy manually).
Set its env vars:

Variable	Required	Default	Notes
`MEM0_URL`	yes	—	Base URL of the memory server, e.g. `https://mem0.your-domain.com`.
`MEM0_API_KEY`	yes	—	The same bearer token the server uses.
`DIGEST_WEBHOOK_URL`	recommended	—	Slack or Discord incoming webhook URL. If unset, the digest is written to the container log instead of being sent.
`DIGEST_WEBHOOK_FORMAT`	no	auto	`slack` or `discord`; auto-detected from the URL. Set explicitly if detection is wrong.
`DIGEST_CRON`	no	`0 8 * * *`	Cron schedule in UTC. Default is daily at 08:00; use e.g. `0 8 * * 1` for Mondays only.
`DIGEST_WINDOW_DAYS`	no	`1`	Look-back window. Match it to the schedule (e.g. `7` for a weekly digest).
`ANTHROPIC_API_KEY`	no	—	If set, the digest is summarized by Claude; otherwise it's a plain bulleted list.
`MEM0_LLM_MODEL`	no	`claude-haiku-4-5-20251001`	Model used for summarization.
`DIGEST_MAX_MEMORIES`	no	`100`	Cap on memories fetched per run. The server's list endpoint allows at most 100, so larger values are clamped to 100.
`DIGEST_TITLE`	no	`🧠 Memory digest`	Heading on the posted message.
`DIGEST_SEND_WHEN_EMPTY`	no	`false`	`true` to post even when nothing new was found.
`DIGEST_RUN_ON_START`	no	`false`	`true` to run once immediately on container start — handy to verify config.

Each run fetches recent memories over the REST API, optionally summarizes them with Claude, and posts the result to your webhook. To verify right after deploy, set DIGEST_RUN_ON_START=true and check caprover logs mem0-digest.

Cost note: with ANTHROPIC_API_KEY set, each run makes one Claude call to summarize. Without it, no LLM is used and you get a plain bulleted list.

Non-CapRover hosts can run the same image directly:

docker build -t mem0-digest ./digest
docker run -d --name mem0-digest \
  -e MEM0_URL=https://mem0.your-domain.com -e MEM0_API_KEY=... \
  -e DIGEST_WEBHOOK_URL=https://hooks.slack.com/services/... \
  -e DIGEST_CRON="0 8 * * *" -e DIGEST_WINDOW_DAYS=1 \
  mem0-digest

4. Deploy the capture bot (optional, `mem0-capture`)

The optional capture bot lets you save a thought into memory by sending a Telegram message — frictionless capture from your phone. It's a separate, port-less CapRover app built from the capture/ directory. It long-polls the Telegram Bot API (no inbound webhook or public port needed) and stores each message via POST /api/v1/memories, tagged agent_id=capture:telegram.

Because the memory store is single-user and high-trust, the bot only saves messages from an allowlist of Telegram chat IDs — anyone else is refused.

Create a Telegram bot: message @BotFather, send /newbot, and copy the bot token it gives you.
Create a CapRover app named mem0-capture. It needs no exposed ports.
Set its Captain Definition Relative Path to ./capture/captain-definition.
Set its env vars (see the table below), leaving TELEGRAM_ALLOWED_CHAT_IDS blank for now, and deploy.
Message your bot anything. It replies with your chat id (it stores nothing yet — "discovery mode"). Put that id in TELEGRAM_ALLOWED_CHAT_IDS and redeploy.
Message it again — it now replies "Saved ✓" and the note is in your memory.

Variable	Required	Default	Notes
`MEM0_URL`	yes	—	Base URL of the memory server.
`MEM0_API_KEY`	yes	—	The same bearer token the server uses.
`TELEGRAM_BOT_TOKEN`	yes	—	Bot token from @BotFather.
`TELEGRAM_ALLOWED_CHAT_IDS`	recommended	—	Comma-separated chat IDs allowed to save. Blank = discovery mode (replies with your chat id, stores nothing).
`CAPTURE_AGENT_ID`	no	`capture:telegram`	Provenance tag stored as `agent_id`.
`TELEGRAM_POLL_TIMEOUT`	no	`30`	Long-poll seconds per Telegram request.

Send plain text to save it as-is, or use /note <text>. /start and /help show usage. Other chat platforms (Slack slash commands, Discord bots) can be added the same way — parse the inbound message and call the same POST /api/v1/memories endpoint.

Security: keep TELEGRAM_ALLOWED_CHAT_IDS set in production. Anyone who finds your bot can message it, but only allowlisted chats can write to your memory; everyone else is refused.
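
The allowlist decision can be sketched in a few lines. The helper names below are hypothetical and the bot's real parsing may differ; the three outcomes follow the behavior described above.

```python
def parse_allowed_chat_ids(raw: str) -> set:
    """Parse a comma-separated TELEGRAM_ALLOWED_CHAT_IDS value into integers."""
    return {int(part) for part in raw.split(",") if part.strip()}


def decide(chat_id: int, allowed: set) -> str:
    """Return "discovery", "save", or "refuse" for an incoming message."""
    if not allowed:
        return "discovery"  # blank allowlist: reply with the chat id, store nothing
    return "save" if chat_id in allowed else "refuse"
```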

Non-CapRover hosts can run the same image directly:

docker build -t mem0-capture ./capture
docker run -d --name mem0-capture \
  -e MEM0_URL=https://mem0.your-domain.com -e MEM0_API_KEY=... \
  -e TELEGRAM_BOT_TOKEN=123456:ABC... -e TELEGRAM_ALLOWED_CHAT_IDS=123456789 \
  mem0-capture

Connecting clients

All clients authenticate with the same MEM0_API_KEY bearer token (Phase 1), except Claude.ai web, Cowork, and ChatGPT, which use OAuth (Phase 2).

Claude Code

claude mcp add --scope user --transport http mem0-remote \
  https://mem0.your-domain.com/mcp \
  --header "Authorization: Bearer $MEM0_API_KEY"

After adding, the six memory tools (add/search/list/get/update/delete) become available in Claude Code.

Claude Desktop

Add an entry under the MCP servers section of Claude Desktop's config, pointing at https://mem0.your-domain.com/mcp with an Authorization: Bearer <token> header (Streamable HTTP transport). Restart Claude Desktop to pick it up. Both /mcp and /mcp/ work; /mcp is the canonical form.

Claude.ai web / Cowork (OAuth)

This requires Phase 2 (OAUTH_SIGNING_KEY set). In the client's connector settings:

Add a custom connector pointing at https://mem0.your-domain.com/mcp.
Leave the client ID and secret blank — the server supports Dynamic Client Registration, so the client registers itself automatically.
On the consent screen, enter your MEM0_API_KEY in the API key field and click Authorize, then let the redirect complete.

Why the API key prompt matters (security): this server is single-user and the consent step authenticates you as the owner. Because the OAuth endpoints are public, anyone who knows the URL could otherwise reach the consent screen; requiring MEM0_API_KEY at authorization ensures only the holder of that key can mint an access token to your memories. Treat MEM0_API_KEY as the master credential — anyone with it has full access via either the bearer header or the OAuth flow.

The server also only allows redirect URIs listed in OAUTH_ALLOWED_REDIRECT_URIS, which defaults to the official claude.ai, Cowork, and ChatGPT callbacks.

ChatGPT (OAuth, Developer Mode)

Also Phase 2. In ChatGPT, enable Developer Mode, add a custom connector pointing at https://mem0.your-domain.com/mcp, and choose OAuth. On the consent screen enter your MEM0_API_KEY and authorize.

ChatGPT's OAuth callback is a per-connector URL of the form https://chatgpt.com/connector/oauth/<connector-id> — the <connector-id> is unique to each connector you create. The default allowlist already covers these via the prefix entry https://chatgpt.com/connector/oauth/*, so you don't need to add the exact URL. If you've customized OAUTH_ALLOWED_REDIRECT_URIS, include that wildcard entry.

A trailing * is a path-prefix match, not a free-form glob: it is locked to the exact scheme and host of the entry and only extends the path, so write the full scheme://host/path/ prefix (keep the trailing /). An entry without a concrete host and path — https://chatgpt.com*, https://*, or a bare * — is ignored rather than honored, so a typo can't accidentally allow a lookalike host such as chatgpt.com.evil.com.
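
The matching rule can be sketched with `urllib.parse`. This is an illustration of the rule as described, not the server's implementation; `redirect_allowed` is a hypothetical name, and edge cases the guide does not spell out (such as an entry whose path is only `/`) are handled by assumption.

```python
from urllib.parse import urlsplit


def redirect_allowed(redirect_uri: str, allowlist: list) -> bool:
    """True if redirect_uri matches an exact entry or a valid prefix entry."""
    target = urlsplit(redirect_uri)
    for entry in allowlist:
        if not entry.endswith("*"):
            if redirect_uri == entry:  # plain entries must match in full
                return True
            continue
        prefix = urlsplit(entry[:-1])
        # Ignore wildcards lacking a concrete scheme, host, and "/"-terminated path.
        if not (prefix.scheme and prefix.netloc and prefix.path.endswith("/")):
            continue
        same_origin = (target.scheme, target.netloc) == (prefix.scheme, prefix.netloc)
        if same_origin and target.path.startswith(prefix.path):
            return True
    return False


allow = [
    "https://example.com/callback",           # hypothetical exact entry
    "https://chatgpt.com/connector/oauth/*",  # valid prefix entry
    "https://chatgpt.com*",                   # ignored: no path
    "https://*",                              # ignored: no host
    "*",                                      # ignored: bare wildcard
]
```

Comparing the parsed scheme and host, rather than doing a string prefix test on the whole URL, is what keeps `chatgpt.com.evil.com` from matching.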

REST / curl / n8n

Send the bearer token as an Authorization header. See the REST API reference below.
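
For scripts that prefer Python over curl, a minimal standard-library sketch follows. The function names are hypothetical; the URL path, header, and body fields are the ones documented in this guide.

```python
import json
import urllib.request


def build_add_request(base_url: str, api_key: str, content: str, **fields):
    """Build (but do not send) a POST /api/v1/memories request."""
    body = json.dumps({"content": content, **fields}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/memories",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def add_memory(base_url: str, api_key: str, content: str, **fields) -> dict:
    """Send the request and return the decoded JSON response."""
    req = build_add_request(base_url, api_key, content, **fields)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

A call such as `add_memory("https://mem0.your-domain.com", key, "I prefer dark mode", agent_id="n8n-flow")` would then return the same JSON the curl examples produce.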

Prompting agents to use memory

Connecting a client only makes the memory tools available — it does not make the agent use them. Models won't reliably search or save memory on their own; you have to tell them to. The most durable way is to put a short instruction block in whatever file the agent reads at the start of every session (CLAUDE.md, ChatGPT custom instructions, AGENTS.md, a system prompt, etc.).

A good memory instruction covers four behaviors:

Recall first — search memory at the start of a task, before answering, so past context is used.
Save durable facts — persist preferences, decisions, project conventions, and recurring context as they come up (not transient chatter).
Update, don't duplicate — when something changes, update the existing memory instead of adding a near-duplicate.
Don't store secrets — never save passwords, API keys, or sensitive personal data.

The server exposes six tools: search_memories, add_memory, list_memories, get_memory, update_memory, delete_memory. Adjust the tool/connector names below to match how your client surfaces them (for example, Claude Code namespaces them like mcp__mem0-remote__search_memories).

Claude (CLAUDE.md)

For Claude Code, add this to the project's CLAUDE.md (or your user-level ~/.claude/CLAUDE.md to apply it everywhere). For Claude Desktop, paste the same text into a Project's custom instructions.

## Long-term memory (mem0)

You have a persistent memory store available through the mem0 MCP server. Use it in every session:

- **At the start of a task**, call `search_memories` with a query about the topic to recall any
  relevant preferences, decisions, or context before you respond.
- **When the user shares** a durable preference, decision, project convention, or fact they'll
  likely want recalled later, call `add_memory` to save it. Keep each memory a single clear fact.
- **When something changes**, find the existing memory (`search_memories` / `list_memories`) and
  `update_memory` it instead of adding a duplicate.
- Do **not** store secrets, credentials, or sensitive personal data.
- You don't need to announce routine memory operations; just use them naturally.

ChatGPT (custom instructions)

In ChatGPT, open Settings → Personalization → Custom instructions (or a Project's instructions) and add the following to the "How would you like ChatGPT to respond?" box. This assumes you've connected the mem0 connector in Developer Mode (see ChatGPT (OAuth, Developer Mode)).

I have a personal long-term memory store connected via the mem0 MCP connector. Use it every session:
- Before answering a substantive question, use the connector's search_memories tool to recall any
  relevant saved preferences, decisions, or context.
- When I share a durable preference, decision, or fact worth remembering, use add_memory to save it
  as a single clear statement.
- If something changes, update the existing memory rather than creating a duplicate.
- Never store passwords, API keys, or sensitive personal data.

Other agents (AGENTS.md and similar)

Many coding agents and frameworks read an AGENTS.md (or an equivalent system-prompt/rules file) at session start. Drop in a tool-agnostic version:

## Memory

A shared long-term memory store is available via the mem0 MCP server. Behavior:

1. Recall: at the start of a task, search memory for context relevant to the request before acting.
2. Persist: save durable facts, preferences, decisions, and conventions as they arise.
3. Reconcile: update an existing memory when it changes; avoid near-duplicates.
4. Safety: never store secrets, credentials, or sensitive personal data.

Tools: search_memories, add_memory, list_memories, get_memory, update_memory, delete_memory.

If your agent has no instruction file but does take a system prompt, the same four numbered rules work verbatim there.

Companion prompt packs

Beyond the baseline rules above, docs/prompts/ collects reusable, copy-paste prompt packs for specific recurring tasks — auto-capturing a session summary, research synthesis, and meeting synthesis. They're documentation only (no server changes) and drive the same six tools.

REST API reference

All endpoints live under /api/v1 and require Authorization: Bearer <MEM0_API_KEY>. Request and response bodies are JSON. user_id defaults to MEM0_DEFAULT_USER_ID if omitted. Response schemas are published in the interactive docs at /docs (OpenAPI).

Error responses

Failures return a stable JSON shape:

{"detail": "human-readable summary", "error": "machine_code", "request_id": "abc123def456"}

Status	`error`	Meaning
`401`	—	Missing/invalid bearer token (plain `detail` only).
`404`	—	Memory ID does not exist (`GET`/`PUT`/`DELETE` by ID).
`422`	—	Request validation failed (FastAPI's standard shape).
`502`	`upstream_provider_error`	The LLM or embedding provider failed — check provider keys/status.
`503`	`backend_unavailable`	Qdrant is unreachable or erroring — same condition `/healthz` reports.
`500`	`internal_error`	Unexpected failure. The body never contains internals; quote the `request_id` (also settable via an `X-Request-Id` request header) when digging through server logs.
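
A client can fold this table into one small helper. Treating 502 and 503 as worth retrying is this sketch's own assumption, not a documented server recommendation, and `describe_error` is a hypothetical name.

```python
# Machine codes from the table above that usually indicate a transient fault.
RETRYABLE_CODES = {"upstream_provider_error", "backend_unavailable"}


def describe_error(status: int, body: dict) -> dict:
    """Summarize an error response for logging and retry decisions."""
    code = body.get("error")  # absent on 401/404/422
    return {
        "status": status,
        "code": code,
        "message": body.get("detail"),
        "request_id": body.get("request_id"),
        "retry": code in RETRYABLE_CODES,
    }
```

Logging the `request_id` alongside the status makes it easier to find the matching entry in the server logs.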

Add a memory — `POST /api/v1/memories`

Provide either content (a string) or messages (a chat transcript). Optional: agent_id, run_id, metadata, user_id, and dedup (default true).

By default, submitting content that matches something already stored — compared on a normalized fingerprint (case-insensitive, whitespace-collapsed), not raw bytes — is skipped before the LLM runs and returns {"results": [], "deduplicated": true, "memory_id": "…"} (see How memory works). Set "dedup": false to force re-extraction.

curl -X POST https://mem0.your-domain.com/api/v1/memories \
  -H "Authorization: Bearer $MEM0_API_KEY" -H "Content-Type: application/json" \
  -d '{"content": "We host services on CapRover on DigitalOcean", "agent_id": "n8n-flow"}'

With a transcript instead of a plain string:

curl -X POST https://mem0.your-domain.com/api/v1/memories \
  -H "Authorization: Bearer $MEM0_API_KEY" -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "I prefer dark mode"}]}'

Search memories — `POST /api/v1/memories/search`

Semantic search. Optional agent_id, run_id, user_id, and limit (1–100, default 10).

curl -X POST https://mem0.your-domain.com/api/v1/memories/search \
  -H "Authorization: Bearer $MEM0_API_KEY" -H "Content-Type: application/json" \
  -d '{"query": "where do we host things?"}'

Recency boost (optional). By default results are ordered purely by semantic similarity. When you care more about what's latest than what's the closest topical match, add recency_weight (0.0–1.0): 0 keeps the default order, 1 orders almost entirely by how recently each memory was created or updated. The half-life of the decay (default 30 days) is tunable via recency_half_life_days.

curl -X POST https://mem0.your-domain.com/api/v1/memories/search \
  -H "Authorization: Bearer $MEM0_API_KEY" -H "Content-Type: application/json" \
  -d '{"query": "current deploy target", "recency_weight": 0.4}'

When recency_weight > 0, each returned result carries a rerank_score showing the blended similarity-plus-recency value it was sorted by. The MCP search_memories tool accepts the same recency_weight argument.
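
One plausible way to picture the blend is a linear mix of similarity and an exponential half-life decay. The exact formula the server uses is not given in this guide, so treat the function below (including its name) as an assumption for intuition only.

```python
def blended_score(
    similarity: float,
    age_days: float,
    recency_weight: float,
    half_life_days: float = 30.0,
) -> float:
    """Assumed blend: (1 - w) * similarity + w * 0.5 ** (age / half_life)."""
    recency = 0.5 ** (age_days / half_life_days)  # 1.0 when new, 0.5 at one half-life
    return (1.0 - recency_weight) * similarity + recency_weight * recency


# A close but old match versus a weaker match saved today, at weight 0.4.
old_close = blended_score(0.9, age_days=120, recency_weight=0.4)
new_weaker = blended_score(0.7, age_days=0, recency_weight=0.4)
```

Under this assumed formula the fresher memory outranks the older, closer one at weight 0.4, while a weight of 0 leaves the similarity order untouched.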

Keyword search (optional). Semantic search ranks by meaning, which can miss an exact term — a name, identifier, URL, or rare token. Pass "mode": "keyword" to instead do a case-insensitive substring match over memory text, returning the most recent matches first:

curl -X POST https://mem0.your-domain.com/api/v1/memories/search \
  -H "Authorization: Bearer $MEM0_API_KEY" -H "Content-Type: application/json" \
  -d '{"query": "Philips Hue", "mode": "keyword"}'

The default is "mode": "semantic". The MCP search_memories tool accepts the same mode argument. Keyword mode spans the whole user store; it's a literal-match fallback, not a replacement for semantic retrieval. (recency_weight applies to semantic mode only.)

Under the hood, keyword matching is pushed down to Qdrant via a full-text payload index that the server creates automatically on first use (no setup or maintenance needed). Whole-word queries are answered from the index; queries that only match inside a word (e.g. hil matching Philips) transparently fall back to a scan of up to a few thousand memories. Either path returns the same kind of result: exact, case-insensitive substring matches, most recent first. If your Qdrant version can't create the index, everything still works via the scan path.
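
The scan path's semantics are simple enough to sketch in memory. The field names `memory` and `created_at` are assumptions about the item shape, and `keyword_search` is a hypothetical helper rather than the server's code.

```python
def keyword_search(memories: list, query: str, limit: int = 10) -> list:
    """Case-insensitive substring match, most recent first."""
    needle = query.lower()
    hits = [m for m in memories if needle in m["memory"].lower()]
    # ISO 8601 timestamps in one consistent format sort correctly as strings.
    hits.sort(key=lambda m: m["created_at"], reverse=True)
    return hits[:limit]
```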

List memories — `GET /api/v1/memories`

Query params: agent_id, run_id, user_id, limit (1–100, default 50), offset (0–10000, default 0), plus the provenance/review filters source, confidence, review_status (exact match), and exclude_expired (drop memories whose expires_at is in the past). See Provenance and review metadata.

The response carries a pagination object — {"limit": …, "offset": …, "has_more": …} — so the full store can be enumerated by advancing offset by limit while has_more is true. Ordering is stable (by internal ID) but not chronological. With exclude_expired=true, expired items are dropped after the page is cut, so a page may contain fewer than limit items while has_more is still true.

curl "https://mem0.your-domain.com/api/v1/memories?limit=20" \
  -H "Authorization: Bearer $MEM0_API_KEY"

# Next page:
curl "https://mem0.your-domain.com/api/v1/memories?limit=20&offset=20" \
  -H "Authorization: Bearer $MEM0_API_KEY"

# Only approved, non-expired memories imported from ChatGPT:
curl "https://mem0.your-domain.com/api/v1/memories?source=import:chatgpt&review_status=approved&exclude_expired=true" \
  -H "Authorization: Bearer $MEM0_API_KEY"

The same source / confidence / review_status / exclude_expired fields are accepted on POST /api/v1/memories/search (in both semantic and keyword modes).
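
Enumerating the whole store with the pagination object might look like the sketch below. It takes a `fetch_page(limit, offset)` callable so the HTTP client stays up to you. The `results` key for the item list is an assumption (check /docs for the real schema), and `iter_memories` is a hypothetical name.

```python
from typing import Callable, Iterator


def iter_memories(fetch_page: Callable[[int, int], dict], limit: int = 100) -> Iterator[dict]:
    """Yield every memory by advancing offset by limit while has_more is true."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        # A page may be shorter than limit (e.g. with exclude_expired=true),
        # so rely on has_more rather than on the page length.
        yield from page["results"]
        if not page["pagination"]["has_more"]:
            break
        offset += limit
```

Because offset is capped at 10000, stores larger than that cannot be walked fully this way without narrowing the query with filters.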

Get one — `GET /api/v1/memories/{memory_id}`

Returns 404 if the memory does not exist.

Update — `PUT /api/v1/memories/{memory_id}`

Body: {"content": "new text"}. Returns 404 if the memory does not exist.

Delete — `DELETE /api/v1/memories/{memory_id}`

Returns {"deleted": true, "memory_id": "…"}, or 404 if the memory does not exist.

Bulk delete — `POST /api/v1/memories/delete_bulk`

Deletes every memory matching exact-match filters: agent_id, run_id, source, confidence, review_status (plus optional user_id). At least one filter besides user_id is required — wiping the whole store through this endpoint is deliberately impossible.

It is a dry run by default: with "confirm": false (or omitted) nothing is deleted; the response reports the match count and a sample of up to 10 items so you can verify the blast radius first. Re-post the same body with "confirm": true to actually delete. matched and deleted are per call, capped at 1000; if has_more is true, more memories match than this call covered — repeat the call until it's false. If a deletion fails partway, the response carries "error": "delete_failed_partway" with the partial deleted count; deletes are idempotent, so just re-post. Deletions go through mem0 (not a raw vector-store filter delete), so each memory's history stays consistent.

Typical use — undo a bad import run:

# 1. Dry run: how much would this delete?
curl -X POST https://mem0.your-domain.com/api/v1/memories/delete_bulk \
  -H "Authorization: Bearer $MEM0_API_KEY" -H "Content-Type: application/json" \
  -d '{"source": "import:chatgpt"}'
# -> {"matched": 412, "deleted": 0, "dry_run": true, "has_more": false, "sample": [...]}

# 2. Looks right - confirm:
curl -X POST https://mem0.your-domain.com/api/v1/memories/delete_bulk \
  -H "Authorization: Bearer $MEM0_API_KEY" -H "Content-Type: application/json" \
  -d '{"source": "import:chatgpt", "confirm": true}'
# -> {"matched": 412, "deleted": 412, "dry_run": false, "has_more": false, ...}

There is intentionally no MCP equivalent — a destructive filter-delete is an operator/script action, not something a connected agent should be able to reach for.
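For a script that has to clear more than 1000 matches, the confirmed call can be repeated until has_more is false. The following is a minimal sketch, not a shipped tool: http_post_json and delete_matching are hypothetical names, and it assumes the delete_failed_partway error arrives in a JSON body the client can read (the HTTP status used for that case is not specified here).

```python
import json
import urllib.request


def http_post_json(url, api_key, body):
    """POST a JSON body with the bearer token and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def delete_matching(base_url, api_key, filters, post=http_post_json, max_calls=50):
    """Run confirmed bulk deletes until nothing is left; return the total deleted."""
    url = f"{base_url}/api/v1/memories/delete_bulk"
    body = {**filters, "confirm": True}
    total = 0
    for _ in range(max_calls):
        result = post(url, api_key, body)
        total += result.get("deleted", 0)
        if result.get("error") == "delete_failed_partway":
            continue  # deletes are idempotent, so re-post the same body
        if not result.get("has_more"):
            return total
    raise RuntimeError(f"gave up after {max_calls} calls; {total} deleted so far")
```

Run the dry run first (the same filters without confirm) and check matched and sample before calling anything like delete_matching. The max_calls cap is a safety net added for the sketch so that a persistent failure cannot loop forever.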

History — `GET /api/v1/memories/{memory_id}/history`

Returns the change history for a memory.

A ready-made smoke test against a live server is in scripts/smoke.sh; an MCP-level smoke test is in scripts/smoke_mcp.py.

Importing existing data

A new memory store starts empty. To seed it from data you already have, the repo ships standalone importer scripts under scripts/ that read common export formats and POST them to the REST API. They're plain REST clients — run them from a checkout against any reachable server.

Source	Script	What it sends
ChatGPT export (`conversations.json`)	`scripts/import_chatgpt.py`	One `messages` payload per conversation
Obsidian vault (folder of `.md`)	`scripts/import_obsidian.py`	One memory per note (frontmatter stripped)
Readwise highlights (CSV export)	`scripts/import_readwise.py`	One memory per highlight (+ its note)

All three take the same options: a path to the export, --base-url/--api-key (default to $MEM0_URL/$MEM0_API_KEY), --source (provenance tag), --limit (stop after N — good for a trial), and --dry-run (parse and report without sending). Each imported memory is tagged agent_id=import:<source> and carries a source (plus title/path/book/author where available) in its metadata, so you can later tell imported memories apart from ones written during a session.

# 1. Preview without sending anything
python scripts/import_chatgpt.py ~/Downloads/conversations.json --dry-run

# 2. Trial run: import only the first 5
export MEM0_URL=https://mem0.your-domain.com
export MEM0_API_KEY=...
python scripts/import_obsidian.py ~/my-vault --limit 5

# 3. Full import
python scripts/import_readwise.py ~/Downloads/readwise.csv

Cost note. Every new imported memory goes through the normal add path, which invokes the fact-extraction LLM (see the Configuration reference). A large ChatGPT or Obsidian import can mean thousands of LLM calls — use --dry-run and --limit first to gauge volume. Re-running an import is cheap and idempotent: content already stored — matched on a normalized fingerprint (case-insensitive, whitespace-collapsed) — is skipped before the LLM runs (see How memory works), so a second pass over the same export adds nothing and costs nothing.
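The normalized fingerprint that makes re-runs free can be pictured with the sketch below. Only the normalization (lowercasing and collapsing whitespace) comes from this guide; the function name fingerprint, the use of SHA-256, and the trimming of leading and trailing whitespace are assumptions for illustration, and the server's actual hashing may differ.

```python
import hashlib
import re


def fingerprint(content: str) -> str:
    """Hash content after lowercasing and collapsing runs of whitespace."""
    # Assumption: leading/trailing whitespace is trimmed as part of normalizing.
    normalized = re.sub(r"\s+", " ", content).strip().lower()
    # Assumption: SHA-256; any stable digest would serve the same purpose.
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


first = fingerprint("Prefers  Dark Roast\ncoffee")
second = fingerprint("prefers dark roast coffee")
print(first == second)  # True: case and spacing differences do not matter
```

Because the comparison happens on the fingerprint before any LLM call, two exports that differ only in case or spacing are treated as the same content, while any change in wording yields a different fingerprint and goes through extraction again.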

Requirements: Python 3.12 and the project's dependencies installed (pip install -r requirements.txt); the scripts add the repo root to sys.path, so no packaging step is needed.

Backups and restore

The mem0-backup app handles nightly snapshots automatically (see deploy step 2). To restore from a snapshot:

# 1. Download a snapshot from S3
aws s3 cp s3://<bucket>/mem0-backups/2026-05-20T03-00-00Z.snapshot ./

# 2. Upload it to Qdrant
curl -X POST -H "api-key: $QDRANT_API_KEY" \
  -F "snapshot=@2026-05-20T03-00-00Z.snapshot" \
  "https://qdrant.your-domain.com/collections/memories/snapshots/upload"

# 3. Verify the collection is back
curl -H "api-key: $QDRANT_API_KEY" \
  "https://qdrant.your-domain.com/collections/memories"

Run a restore drill periodically so you know the snapshots are usable before you need them.

Health and monitoring

GET /healthz — does a real 2-second-timeout round-trip to Qdrant. Returns {"ok": true, "version": "…", "qdrant": "reachable"} on success, or HTTP 503 with {"ok": false, "qdrant": "unreachable"} if Qdrant can't be reached. CapRover uses this for its container healthcheck. No auth required.
GET /metrics — Prometheus metrics: http_requests_total (labelled by method, matched route template, and status) and http_request_duration_seconds (labelled by method and matched route template). No auth required.

Every request is logged as structured JSON (via structlog) with a request_id, method, path, status, and latency. The Authorization header is never logged.
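An external uptime probe can treat /healthz as a simple pass/fail signal. The sketch below is one way to do that with the standard library; check_health and its opener parameter are hypothetical names, and the 5-second client timeout is an arbitrary choice that leaves room for the server's own 2-second Qdrant check.

```python
import json
import urllib.error
import urllib.request


def check_health(base_url, opener=urllib.request.urlopen, timeout=5):
    """Return True only if /healthz answers 200 with {"ok": true}."""
    try:
        with opener(f"{base_url}/healthz", timeout=timeout) as response:
            body = json.load(response)
            return response.status == 200 and body.get("ok") is True
    except (urllib.error.URLError, TimeoutError, ValueError):
        # URLError covers HTTPError (e.g. the 503 sent when Qdrant is
        # unreachable); ValueError covers a body that is not valid JSON.
        return False
```

No Authorization header is sent because the endpoint requires none. A cron job or monitoring agent could call check_health("https://mem0.your-domain.com") and alert when it returns False.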

Troubleshooting

Symptom	Likely cause / fix
Search returns empty, no error	`MEM0_EMBED_DIMS` doesn't match the Qdrant collection's vector size. Recreate the collection with the correct dimension.
401 on REST or MCP	Missing or wrong `Authorization: Bearer` token. Confirm it equals `MEM0_API_KEY`.
`Task group is not initialized` on first MCP request	FastMCP lifespan not wired into FastAPI — a code/deploy regression. See `app/main.py`.
503 from `/healthz`	Qdrant is unreachable. Check `QDRANT_HOST`/`QDRANT_PORT`/`QDRANT_HTTPS`/`QDRANT_API_KEY`.
Server won't start	A required env var is missing or a provider key is absent. Check the startup logs; `app/config.py` names the missing variable.
Claude.ai web / Cowork can't connect	OAuth not enabled (`OAUTH_SIGNING_KEY` blank), or the client's redirect URI isn't in `OAUTH_ALLOWED_REDIRECT_URIS`.
"Couldn't reach the MCP server" on Claude.ai web / Cowork (but Claude Code/Desktop work)	OAuth discovery failure. Confirm `OAUTH_SIGNING_KEY` is set and `PUBLIC_BASE_URL` exactly matches the public HTTPS URL; the server must advertise the protected-resource metadata in the `/mcp/` 401 `WWW-Authenticate` header.
Connector fails right after consent; logs show `POST /oauth/register → 400`	The client's callback isn't in `OAUTH_ALLOWED_REDIRECT_URIS`. The server logs a `dcr_redirect_uri_rejected` warning with the exact `requested` URI and the active `allowed` list — add the requested URI to `OAUTH_ALLOWED_REDIRECT_URIS` and redeploy. Claude.ai web/desktop/mobile/Cowork use `https://claude.ai/api/mcp/auth_callback`.
Backup job not running	Check the backup container: `caprover logs mem0-backup`.
Digest not arriving	Check `caprover logs mem0-digest`. Common causes: `DIGEST_WEBHOOK_URL` unset (digest is only logged), nothing within `DIGEST_WINDOW_DAYS`, or a wrong webhook URL. Set `DIGEST_RUN_ON_START=true` to trigger a run immediately.
Capture bot replies "not authorized"	Your Telegram chat id isn't in `TELEGRAM_ALLOWED_CHAT_IDS`. Clear that var to enter discovery mode (the bot replies with your id), add the id, and redeploy.
Capture bot saves nothing / only echoes your chat id	It's in discovery mode because `TELEGRAM_ALLOWED_CHAT_IDS` is blank. Set it to your chat id and redeploy.
