This project is not just a chat-memory feature.
It is an attempt to define a general-purpose memory infrastructure for enterprise applications, workflows, and agents.
The core idea is simple:
- preserve raw context as the source of truth
- index memory through entities
- build summaries, tags, facts, and long-term memories as derived layers
- keep the whole system testable, traceable, and rebuildable
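The four principles above can be read as a small data model: raw sessions are immutable facts, and every derived item carries pointers back to its sources. A minimal TypeScript sketch (all type and field names here are illustrative, not the project's actual schema):

```typescript
// Raw layer: immutable source of truth; never edited after ingestion.
interface RawSession {
  id: string;
  source: string;    // e.g. "crm", "support", "agent"
  content: string;   // verbatim context, preserved as-is
  createdAt: string; // ISO timestamp
}

// Derived layer: always traceable back to the raw sessions it came from.
interface DerivedMemory {
  kind: "summary" | "tag" | "fact" | "long_term_memory";
  value: string;
  derivedFrom: string[]; // RawSession ids, which is what makes the layer rebuildable
}

// The rebuildability check: a derived item is safe to drop and regenerate
// only if every raw session it points to still exists.
function isRebuildable(m: DerivedMemory, sessions: Map<string, RawSession>): boolean {
  return m.derivedFrom.length > 0 && m.derivedFrom.every((id) => sessions.has(id));
}
```

The `derivedFrom` pointer is the whole design in miniature: deleting every derived row loses nothing, because each one can be regenerated from the sessions it references.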
This project tries to solve five classes of problems.
The first is scattered context. In real companies, relevant context is usually spread across many places:
- CRM notes
- product events
- customer service conversations
- team sync updates
- agent execution traces
All of them may describe the same customer, team, task, or workflow, but they are rarely organized into one memory system.
The second is that many systems need memory, yet are not LLM-based.
Some systems only need to:
- write raw context
- query a timeline
- search by tags, time, or source
So this project is designed to support ordinary business systems, not only LLM agents.
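Those three operations need no model at all; they are plain structured queries. A sketch of what such a query layer could look like, with hypothetical record shapes:

```typescript
interface MemoryRecord {
  entityId: string;
  source: string;    // e.g. "crm", "support"
  tags: string[];
  content: string;
  timestamp: number; // epoch millis
}

// Timeline: all records for one entity, oldest first.
function timeline(records: MemoryRecord[], entityId: string): MemoryRecord[] {
  return records
    .filter((r) => r.entityId === entityId)
    .sort((a, b) => a.timestamp - b.timestamp);
}

// Structured search: filter by any combination of tag, source, and time range.
function search(
  records: MemoryRecord[],
  q: { tag?: string; source?: string; after?: number; before?: number }
): MemoryRecord[] {
  return records.filter(
    (r) =>
      (q.tag === undefined || r.tags.includes(q.tag)) &&
      (q.source === undefined || r.source === q.source) &&
      (q.after === undefined || r.timestamp >= q.after) &&
      (q.before === undefined || r.timestamp <= q.before)
  );
}
```

An ordinary business system can stop at exactly this level: write, timeline, search. Nothing here requires a model.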
The third is summary-only storage. If a system stores only summaries, several problems appear quickly:
- you cannot verify whether the summary was wrong
- you cannot rebuild when the model changes
- you lose auditability
- migration becomes fragile
This project therefore treats raw sessions as the factual base layer.
The fourth is that enterprise memory is rarely about only one user.
A single memory item may involve:
- a customer
- a sales rep
- a team
- a company
- an agent
- a task
That is why the system is designed around Entity + Session + Link, not around single-user conversation history.
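A minimal sketch of that Link layer, with hypothetical shapes: one session fans out to several entities, and each entity can be traced back to all of its sessions:

```typescript
// One session can be linked to many entities, and one entity to many sessions.
interface Link {
  sessionId: string;
  entityId: string;
  role: string; // how the entity relates to the session, e.g. "customer", "agent"
}

// All entities that took part in a session.
function entitiesOf(links: Link[], sessionId: string): string[] {
  return links.filter((l) => l.sessionId === sessionId).map((l) => l.entityId);
}

// All sessions an entity appears in -- the basis for entity-centric retrieval.
function sessionsOf(links: Link[], entityId: string): string[] {
  return links.filter((l) => l.entityId === entityId).map((l) => l.sessionId);
}
```

The many-to-many link is what single-user chat history cannot express: the same sales call belongs to the customer's timeline, the rep's timeline, and the agent's timeline at once.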
The fifth is portability. If memory cannot be exported, imported, restored, or rebuilt, it is hard to treat it as infrastructure.
This project therefore includes:
- export/import
- checksum validation
- rebuildable derived layers
- model-replaceable processing
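Checksum validation on export/import can be as simple as hashing the serialized payload and refusing to ingest a bundle whose hash no longer matches. A sketch (the envelope shape is an assumption, not the repo's actual format):

```typescript
import { createHash } from "node:crypto";

// Hypothetical export envelope: payload plus a checksum over its bytes.
interface ExportBundle {
  payload: string;  // e.g. JSON-serialized sessions
  checksum: string; // sha256 hex of payload
}

function exportBundle(payload: string): ExportBundle {
  const checksum = createHash("sha256").update(payload, "utf8").digest("hex");
  return { payload, checksum };
}

// Import refuses silently corrupted data instead of ingesting it.
function importBundle(bundle: ExportBundle): string {
  const actual = createHash("sha256").update(bundle.payload, "utf8").digest("hex");
  if (actual !== bundle.checksum) {
    throw new Error("checksum mismatch: bundle was modified or corrupted");
  }
  return bundle.payload;
}
```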
In one sentence, this system is an attempt to build:
a general-purpose memory infrastructure that uses raw sessions as the factual base, entities as the primary query entry point, and derived memory as an acceleration layer.
A session is not only a conversation.
It can also represent:
- a customer interaction
- a user behavior bundle
- an agent run
- a team update
- a process context
What matters is that it is preserved as raw context.
Most useful queries are not:
- "show me one message"
They are:
- "what happened around this customer?"
- "what does this team already know?"
- "what happened in the last agent run?"
So entities are the main entry point for memory retrieval.
The system derives:
- summary
- tag
- fact
- long_term_memory
These exist to improve retrieval and consumption, not to replace the source data.
The current design uses three layers:
- Raw layer: original sessions
- Derived layer: summaries, tags, facts
- Long-term layer: distilled memory worth keeping beyond one interaction
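Because the derived layer is a function of the raw layer, changing the model or the rules just means re-running the derivation over the original sessions. A toy illustration where a rule-based tagger stands in for a model (all names are illustrative):

```typescript
interface Session { id: string; content: string; }
interface DerivedTag { sessionId: string; tag: string; }

// A "tagger" is any function from raw content to tags -- a rule today,
// a model tomorrow. The raw layer does not care which one produced the tags.
type Tagger = (content: string) => string[];

const keywordTagger: Tagger = (content) =>
  ["refund", "renewal", "bug"].filter((k) => content.toLowerCase().includes(k));

// Rebuild the entire derived tag layer from scratch.
function rebuildTags(sessions: Session[], tagger: Tagger): DerivedTag[] {
  return sessions.flatMap((s) =>
    tagger(s.content).map((tag) => ({ sessionId: s.id, tag }))
  );
}
```

Swapping `keywordTagger` for a model-backed tagger and calling `rebuildTags` again is the whole migration story: no derived state has to be preserved, only the raw sessions.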
This separation helps users understand what is source truth and what is model-generated interpretation.
The project can use an OpenAI-compatible chat-completions endpoint, such as DeepSeek.
But the model is only used for:
- answering questions
- extracting long-term memory
It is not required for the base system to function.
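Any OpenAI-compatible provider exposes the same call: POST a messages array to `/chat/completions` with a bearer token. A sketch that only builds the request, so it can be pointed at DeepSeek or any other compatible endpoint (the function and field names here are assumptions, not the repo's API):

```typescript
interface ChatMessage { role: "system" | "user" | "assistant"; content: string; }

// Build the HTTP request for an OpenAI-compatible chat-completions call.
// Only the base URL and model name are provider-specific.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[]
): { url: string; method: string; headers: Record<string, string>; body: string } {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}
```

Keeping the model behind a thin adapter like this is what makes it optional: the base system never imports it, and the derived layers only see its output.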
The repository already demonstrates:
- entity and session ingestion
- timeline lookup by entity
- structured memory search
- derived summary/tag/fact generation
- long-term memory extraction
- basic permission separation
- export and import
- OpenClaw-style agent integration entry points
- a test UI to validate whether memory is used correctly
- visible trace of how memory is consumed by the model
So the repository already answers one important question:
Can the minimum write -> retrieve -> answer -> distill loop be made runnable?
The answer is yes.
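That minimum loop fits in a few lines. Everything below is an in-memory stand-in: the answer step is a placeholder where a model call would go, and all names are illustrative:

```typescript
type Store = { sessions: { entityId: string; content: string }[]; longTerm: string[] };

// 1. write: append raw context, never mutate.
function write(store: Store, entityId: string, content: string): void {
  store.sessions.push({ entityId, content });
}

// 2. retrieve: everything known about an entity.
function retrieve(store: Store, entityId: string): string[] {
  return store.sessions.filter((s) => s.entityId === entityId).map((s) => s.content);
}

// 3. answer: where a model would consume the retrieved context.
function answer(context: string[], question: string): string {
  return `Based on ${context.length} session(s): ${question}`;
}

// 4. distill: promote something durable into long-term memory, once.
function distill(store: Store, fact: string): void {
  if (!store.longTerm.includes(fact)) store.longTerm.push(fact);
}
```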
This is still an early-stage architecture prototype, not a finished production platform.
It does not yet fully solve:
- large-scale storage and indexing
- strict multi-tenant governance
- advanced audit and compliance
- conflict resolution for evolving facts
- stable SDKs for wide adoption
- low-friction production integration standards
- enterprise-grade operational reliability
So if the question is:
"Is this already a mature enterprise memory product?"
The honest answer is:
No.
Even if the current implementation does not yet justify a full product, the project still has value.
Many teams say they want a "memory system", but they do not clearly define:
- what is raw truth
- what is derived interpretation
- what is queried by whom
- what should be rebuildable
This project helps break that ambiguity into a concrete architecture.
Instead of treating memory as a chat feature, this project treats memory as:
- context infrastructure
- organizational knowledge retention
- cross-system shared state
- traceable support layer for agents
That shift in perspective is probably the most valuable output of the project.
This project makes sense if you want to explore:
- whether enterprise memory should be infrastructure
- whether ordinary systems and agents can share one memory base
- whether raw-first memory design is worth the complexity
- whether long-term memory should remain rebuildable from source sessions
It makes less sense if your real need is only:
- chat history
- prompt stuffing
- simple conversation persistence
Repository layout:
- src: service implementation
- public: interactive test UI
- docs: architecture and API notes
- examples: integration examples
- test: automated tests
Run locally:

npm test
npm start

If you want real model-backed answers, configure:

- LLM_PROVIDER_NAME
- LLM_API_URL
- LLM_MODEL
- LLM_API_KEY
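For example, with placeholder values (the URL and model name below are only illustrative defaults for an OpenAI-compatible provider):

```shell
# Placeholder values -- point these at any OpenAI-compatible provider.
export LLM_PROVIDER_NAME=deepseek
export LLM_API_URL=https://api.deepseek.com/v1
export LLM_MODEL=deepseek-chat
export LLM_API_KEY=sk-...
npm start
```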
The most important conclusion from this project is not a demo page, not an API, and not a model integration.
It is this:
Enterprise memory is not mainly about making a model remember more. It is about turning context into something that can be preserved, queried, migrated, and rebuilt as infrastructure.