Flux Memory

Self-organizing retrieval fabric for AI memory.

Flux Memory is an AI memory system that persists knowledge as a self-modifying weighted graph. It learns which memories matter through feedback signals - reinforcing useful grains, decaying stale ones, and automatically clustering related knowledge.

Features

Graph-based memory - grains connected by weighted conduits, propagated via signal attenuation
Self-organizing - lazy decay, Louvain clustering, automatic promotion/demotion, shortcut reinforcement
Three access paths - MCP server (for AI agents), REST API (HTTP), Python SDK
Booth architecture - concurrent read workers, serial write queue, async feedback queue
Per-caller rate limiting - 500 grains/min default, configurable per instance
Admin authentication - argon2 password hashing, TOTP 2FA (RFC 6238), session tokens
Two operating modes - flux_extracts (local Ollama LLM) or caller_extracts (AI provides features)

Quick Start

Install

pip install flux-memory

Windows fallback if flux is not on PATH:

python -m flux --help
python -m flux init --name my-memory

For CLI-first installs, pipx install flux-memory is recommended because it manages command shims and PATH setup.

Initialize an instance

flux init --name my-memory

This prompts for:

Operating mode (caller_extracts or flux_extracts)
Admin password (argon2-hashed)
Optional TOTP two-factor authentication, with terminal QR and first-code verification

Initialization also writes MCP client snippets under:

~/.flux/<name>/integrations/

Start services

flux start --name my-memory

Starts:

REST API health endpoint at http://localhost:7465/health
Dashboard at http://localhost:7462

To view the dashboard from a phone on the same local network, start with:

flux start --name my-memory --broadcast

This binds the dashboard to 0.0.0.0, prints LAN URLs such as http://192.168.x.x:7462, and serves a device-frame preview at /mobile-preview. The REST API remains local-only by default.

For private access from outside the local network, use a tailnet/VPN such as Tailscale instead of router port forwarding. Keep Flux running locally, sign in to Tailscale on this machine and the remote device, then publish only the dashboard to your private tailnet:

flux start --name my-memory
tailscale serve --http=7462 http://127.0.0.1:7462
tailscale serve status

Then open http://<tailscale-device-name>:7462 from another signed-in Tailscale device. This keeps the dashboard private to your tailnet and does not expose the REST API or MCP transport to the public internet.

If you do not want an overlay app, the private alternative is your own VPN endpoint, usually on your router/firewall. Connect to that VPN from the office using an OS-supported VPN profile, then open the dashboard over the home LAN address printed by flux start --name my-memory --broadcast. Do not forward port 7462 directly from the router to the internet.

flux start does not make the stdio MCP server discoverable by itself. MCP clients launch stdio servers directly. Use the generated snippet or run:

flux mcp --name my-memory

from your MCP client configuration.

In other words, flux start starts only the REST API and dashboard. It does not start a background network MCP server that Codex, Claude, Cursor, or other clients can auto-detect. Each MCP client must have its own config entry.

Stop services

flux stop --name my-memory

Check status

flux status --name my-memory

MCP Integration

Connect Flux Memory to any MCP-compatible AI agent. Flux uses stdio MCP by default, so the client must launch Flux.

Generate or refresh client snippets:

flux mcp-config --name my-memory

Codex example:

[mcp_servers."flux-my-memory"]
command = "python"
args = ["-m", "flux.cli", "mcp", "--name", "my-memory"]

On first connection, call flux_onboard to receive integration instructions:

flux_onboard() -> returns workflow instructions + operating mode

Standard workflow per conversation turn:

flux_retrieve(query) - fetch relevant memories before responding
flux_store(content, provenance) - save new facts after responding
flux_feedback(trace_id, grain_id, useful) - rate each retrieved grain

For clients such as Codex, save the flux_onboard instructions into an always-loaded instruction surface, such as a project or user AGENTS.md. Saving the workflow only as a memory note is not enough, because the agent must already remember to use Flux before it can retrieve that note.

Available MCP tools:

Tool	Description
`flux_store`	Store a memory grain
`flux_retrieve`	Retrieve relevant memories
`flux_feedback`	Rate a retrieved grain (learning signal)
`flux_health`	Current health and signal statistics
`flux_list_grains`	List grains by status (active/dormant/quarantined/archived)
`flux_onboard`	Get integration instructions for this instance

REST API

POST /store          {"content": "...", "provenance": "user_stated"}
POST /store/batch    {"items": [{"content": "..."}]}
POST /retrieve       {"query": "..."}
POST /feedback       {"trace_id": "...", "grain_id": "...", "useful": true}
GET  /health
GET  /grains?status=active&limit=50

Pass X-Caller-Id: agent-name header for per-caller rate limiting and tracking.

Python SDK

from flux.storage import FluxStore
from flux.service import FluxService
from flux.config import Config

store = FluxStore("~/.flux/my-memory/flux.db")
svc = FluxService(store, cfg=Config())
svc.start()

grain_id = svc.store("Paris is the capital of France", provenance="user_stated")
result = svc.retrieve("French capital")
svc.feedback(result.trace_id, result.grains[0]["id"], useful=True)

svc.stop()
store.close()

Configuration

Instance config lives at ~/.flux/<name>/config.yaml. Key parameters:

Parameter	Default	Description
`OPERATING_MODE`	`flux_extracts`	LLM extraction mode
`MCP_HOST`	`127.0.0.1`	Reserved for network MCP transports
`MCP_PORT`	`7464`	Reserved MCP port
`REST_HOST`	`127.0.0.1`	REST bind host
`REST_PORT`	`7465`	REST API port
`DASHBOARD_HOST`	`127.0.0.1`	Dashboard bind host
`DASHBOARD_PORT`	`7462`	Dashboard port
`READ_WORKERS`	`3`	Concurrent read workers
`MAX_GRAINS_PER_CALL`	`100`	Batch ingestion cap
`MAX_GRAINS_PER_MINUTE`	`500`	Per-caller rate limit
`MAX_WRITE_QUEUE_DEPTH`	`1000`	Write queue backpressure cap
`LLM_MODEL`	`llama3.1:8b`	Ollama model (flux_extracts mode)

Admin

flux admin --name my-memory

Password + TOTP gated interactive menu: search/purge/restore grains, view audit log, change password, open dashboard.

Requirements

Python 3.10+
SQLite 3.35+ (WAL mode)
For flux_extracts mode: Ollama with any configured local model

Development

git clone https://github.com/harsh5i/flux-memory
cd flux-memory
pip install -e ".[test]"
pytest tests/

License

MIT - see LICENSE

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
config		config
design/mobile-preview		design/mobile-preview
docs		docs
src/flux		src/flux
tests		tests
.gitignore		.gitignore
BUILD_SUMMARY_v0.6.md		BUILD_SUMMARY_v0.6.md
CLAUDE.md		CLAUDE.md
DASHBOARD_DESIGN_BRIEF.md		DASHBOARD_DESIGN_BRIEF.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Flux Memory

Features

Quick Start

Install

Initialize an instance

Start services

Stop services

Check status

MCP Integration

REST API

Python SDK

Configuration

Admin

Requirements

Development

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Flux Memory

Features

Quick Start

Install

Initialize an instance

Start services

Stop services

Check status

MCP Integration

REST API

Python SDK

Configuration

Admin

Requirements

Development

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages