feat(memory): port memory manager and extraction to Python#2740
feat(memory): port memory manager and extraction to Python#2740JackYPCOnline wants to merge 9 commits into
Conversation
Port the strands-ts memory module to strands-py: a MemoryManager plugin with search_memory/add_memory tools, pluggable MemoryStore backends, and background extraction (ExtractionCoordinator, ModelExtractor, invocation/interval triggers). Adapts the TS promise model to asyncio: gather(return_exceptions=True) for Promise.allSettled, per-store asyncio.Task chains for serialized saves, and a tracked background-task set. Adds an opt-in flush_on_invocation_end so extraction persists under the synchronous Agent(...) entry point, whose per-invocation event loop would otherwise cancel in-flight saves. Adds AggregateMemoryError to types/exceptions for Python 3.10-safe multi-store failure aggregation (ExceptionGroup is 3.11+).
|
Question — shipped store backend: This PR adds the Is a concrete store (and an integ test) following in a separate PR? If so, a one-line note in the description would set expectations. If this is meant to be usable on its own, shipping at least an in-memory store would make the feature self-contained and give the suite an end-to-end anchor beyond the unit-level fakes. |
|
Process note — API review label: This PR introduces a substantial new public API surface — a new top-level The PR description does an excellent job documenting use cases, signatures, and the asyncio deviation — that's exactly the proposer prep the doc asks for. Could you add the |
|
Assessment: Comment A clean, well-documented port. Tests pass (106), Review themes
Nicely structured, well-tested work — the whole-object equality assertions and case-for-case TS parity make the suite easy to trust. |
Move the _flatten_reasons helper into memory_manager.py (where the TS source defines it) and delete memory/_errors.py, which had no TypeScript counterpart. AggregateMemoryError remains in strands.types.exceptions.
Add flush_on_invocation_end to MemoryManagerConfig so the doc/typing aid matches the constructor, and document the intentional empty-stores divergence (mirrored from TS) between the tools and the programmatic search/add APIs. Addresses PR strands-agents#2740 review.
|
Assessment: Comment (approve-pending-API-review — unchanged) Reviewed the latest commit One thing I checked carefully: Re-verified on Position is unchanged from my prior review — all code feedback is resolved and the port is solid. The only remaining gate is the |
Make MemoryStore extend MemoryStoreConfig (a Protocol) so the declarative fields live in one place, drop the unused @runtime_checkable on MemoryStore and Extractor (no isinstance checks are performed), and simplify the per-store save chain to await the prior save directly - it always settles internally, so the extra exception guard (which also swallowed CancelledError) is unnecessary.
|
Assessment: Comment (approve-pending-API-review — unchanged) Reviewed
Re-verified on No new code issues. Position unchanged — the only remaining gate is the |
| ``asyncio.Task`` chain to serialize a single store's saves, and | ||
| ``asyncio.gather(..., return_exceptions=True)`` to run concurrent writes so one | ||
| failure does not cancel the rest. | ||
| """ |
There was a problem hiding this comment.
I think it would be good to note somewhere that this mechanism might be worth revisiting. If we want "fire-and-forget" it might be better here to do it similar to otel which involves asynchronously running export on a separate process. Maybe something we can revisit when we work on background tool tasks. But again, I think we should record this somewhere. Maybe on the background tool task ticket.
As it is implemented here and in TS, I would say it is not true "fire-and-forget" because saves block other work when running due to the single threaded-ness of both languages.
|
|
||
| task.add_done_callback(_done) | ||
|
|
||
| async def process(self, store: MemoryStore) -> None: |
There was a problem hiding this comment.
Nit: I would say process doesn't need to be async since enqueue anyway offloads to create_task. With that said, you could collapse a bit here and have process return the created task instead of itself being wrapped in a task above in schedule. This would be a minor memory optimization since of course it is less task scheduling on the event loop.
| AggregateMemoryError: If any concurrent ``add`` write fails. | ||
| """ | ||
| extraction = store.extraction | ||
| assert extraction is not None # noqa: S101 - extraction stores always configure this. |
| """ | ||
| if "text" in block: | ||
| return "text" | ||
| return next(iter(block.keys()), "") |
There was a problem hiding this comment.
Nit: Seems like you don't need to special case text.
{"text": ...}
{"toolUse": ...}
^ It is the same structure. One key per content block.
| DEFAULT_SYSTEM_PROMPT = ( | ||
| "You extract durable facts worth remembering across future conversations from a transcript.\n" | ||
| "\n" | ||
| 'Return ONLY a JSON array of objects, each: {"content": string}. Each object is one discrete, ' |
There was a problem hiding this comment.
We should utilize structured output here to make it more reliable.
| assert extraction is not None # noqa: S101 - extraction stores always configure this. | ||
|
|
||
| if extraction.extractor is not None: | ||
| entries = await extraction.extractor.extract(messages, ExtractorContext(default_model=self._default_model)) |
There was a problem hiding this comment.
What if a fact only really defines itself across multiple messages? We could miss that in the extraction right because of the trimming we do over turns? Is under-extraction a problem you think?
Turn 1 user: I have a friend.
Turn 1 assistant: Cool, what is their name?
Turn 2 user: Bob
If turn 1 is trimmed before turn 2 extraction, we maybe lose the fact that Bob specifically is a friend.
Not something that needs to be addressed for this PR but still curious on what has been discussed around this.
| ``ExceptionGroup`` so the SDK supports Python 3.10 (``ExceptionGroup`` is only | ||
| available on 3.11+). |
There was a problem hiding this comment.
Nit: Another example of an implementation decision that I would say doesn't belong in the docstring. Not saying you have to pull it out now but we should find a better place because this is useful context for our code translation workflow.
| in :meth:`MemoryManager.init_agent` for any store configured with an | ||
| ``ExtractionConfig``. | ||
|
|
||
| All public field and method names use ``snake_case``. |
There was a problem hiding this comment.
Nit: Not necessary to state in a docstring in my opinion.
|
|
||
| # Synchronous Agent(...) entry point: enable flush_on_invocation_end so | ||
| # extraction writes persist across each invocation's event loop. | ||
| memory_manager = MemoryManager(stores=[my_store], flush_on_invocation_end=True) |
There was a problem hiding this comment.
Should we enable this automatically on the users behalf when a sync invoke is called?
| search_tool_config: MemoryToolConfig | bool = True, | ||
| add_tool_config: MemoryAddToolConfig | bool = False, |
There was a problem hiding this comment.
Nit: Why not MemorySearchToolConfig for symmetry with MemoryAddToolConfig?
There was a problem hiding this comment.
Should we have a dedicated memory_manager param on Agent init? This would match the pattern of conversation_manager and session_manager.
There was a problem hiding this comment.
yes, should be after Owen's PR
- process() returns the queued save task instead of being wrapped in an extra task by schedule(), removing a layer of event-loop scheduling per save. - Replace the extraction-config asserts with an early guard and pass the extractor into _write, dropping the noqa. - Simplify _block_kind: every content block is a single-key mapping, so no need to special-case text. - Trim implementation rationale from the AggregateMemoryError docstring and drop a redundant naming note from the memory_manager module docstring. Addresses PR strands-agents#2740 review nits.
|
Assessment: Comment (approve — code is solid; design follow-ups are maintainer/author calls) Reviewed
Re-verified on Still open — but these are maintainer/author decisions, not code blockersThe substantive design questions from the review are tracked in-thread and don't block the port:
From my side the code is in good shape and all actionable review nits on this commit are addressed; the remaining items are design conversations for the team. |
Collapse over-explained docstrings to concise summaries, drop Args/Returns blocks for self-evident params, and shorten multi-line rationale comments to one line. No code changes.
|
Assessment: Comment (approve — pure documentation commit, no behavioral change) Reviewed I verified this is genuinely docs-only: I stripped all docstrings via AST from both The trims condense some genuinely verbose prose (e.g. the multi-paragraph "how it works in three pieces" module docstring) into tighter summaries, while preserving the key concepts a reader needs — the per-store high-water mark, the per-store task chain serialization, backoff/probe behavior, and the never-raises guarantee. This is a net readability win. Re-verified on No code concerns. Position unchanged — the code is in good shape; the remaining open items are the design conversations in @pgrayy's review thread. |
Description
Ports the
memorymodule from the TypeScript SDK (strands-ts/src/memory) to the Python SDK (strands-py). The module gives agents cross-session recall and persistence through aMemoryManagerplugin that manages pluggableMemoryStorebackends, exposessearch_memory/add_memorytools, and runs automatic background extraction that distills conversation turns into durable memory.This is a behavior-preserving port: the TypeScript test suite is the reference, and every
it(...)case has a Python counterpart. The notable design work is adapting the TS promise/event-loop model to Pythonasyncio.Public API — new package
strands.memory:Also adds
AggregateMemoryErrortostrands.types.exceptions— a Python 3.10-safe stand-in for JSAggregateError(ExceptionGroupis 3.11+) used to surface multi-store write failures with each underlying reason.Asyncio adaptation (key deviation from TS). TS relies on a long-lived event loop, so fire-and-forget background saves survive between turns. Python's synchronous
Agent(...)entry point runs each invocation in a fresh loop (asyncio.run), which cancels in-flight tasks on return. The port usesasyncio.gather(return_exceptions=True)forPromise.allSettled, per-storeasyncio.Taskchains for serialized saves, and a tracked background-task set. The opt-inflush_on_invocation_end(defaultFalse) registers anAfterInvocationEventhook that awaits pending writes before the per-invocation loop tears down; async callers owning a persistent loop can leave it off and callflush()at a shutdown boundary. This flag has no TS equivalent and exists solely to bridge the event-loop lifecycle difference.Scope. This PR ports
strands-ts/src/memory/only. Concrete store backends (e.g.BedrockKnowledgeBaseStore, which lives in the separatestrands-ts/src/vended-memory-stores/module) are intended as a follow-up PR; this change ships theMemoryManagermachinery and theMemoryStoreprotocol that those backends implement.Related Issues
N/A
Documentation PR
No new docs required; public classes are documented via docstrings.
Type of Change
New feature
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce new warnings.
Ported the three TS test files (
memory-manager,extraction,model-extractor) as example-basedpytestsuites so coverage mirrors the TS suite case-for-case; runs green undertests/strands/memory. Adjacentplugins/toolssuites were run to confirm no regressions in plugin/tool discovery.hatch run prepareChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.