feat(memory): port memory manager and extraction to Python #2740

Open
JackYPCOnline wants to merge 9 commits into strands-agents:main from JackYPCOnline:feat/memory-manager-port

@JackYPCOnline JackYPCOnline commented Jun 11, 2026

Contributor

Description

Ports the memory module from the TypeScript SDK (strands-ts/src/memory) to the Python SDK (strands-py). The module gives agents cross-session recall and persistence through a MemoryManager plugin that manages pluggable MemoryStore backends, exposes search_memory / add_memory tools, and runs automatic background extraction that distills conversation turns into durable memory.

This is a behavior-preserving port: the TypeScript test suite is the reference, and every it(...) case has a Python counterpart. The notable design work is adapting the TS promise/event-loop model to Python asyncio.

Public API — new package strands.memory:

from strands import Agent
from strands.memory import MemoryManager

# A store implements `search` and (optionally) `add` / `add_messages` / `get_tools`,
# and may declare an extraction config to auto-distill turns into memory.
memory_manager = MemoryManager(
    stores=[my_store],
    add_tool_config=True,            # expose the add_memory tool (opt-in)
    flush_on_invocation_end=True,    # await pending extraction writes per invocation
)
agent = Agent(model=model, plugins=[memory_manager])
agent("Remember I prefer dark mode")

# Programmatic access (coroutines):
results = await memory_manager.search("user preferences")
await memory_manager.flush()

Also adds AggregateMemoryError to strands.types.exceptions — a Python 3.10-safe stand-in for JS AggregateError (ExceptionGroup is 3.11+) used to surface multi-store write failures with each underlying reason.
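
To illustrate the idea only: a 3.10-compatible aggregate exception can carry the underlying failures in a plain list. The class name `AggregateWriteError`, the `reasons` attribute, and the `raise_if_any_failed` helper below are hypothetical and are not the PR's `AggregateMemoryError` implementation.

```python
class AggregateWriteError(Exception):
    """Hypothetical stand-in: one exception carrying several underlying failures."""

    def __init__(self, message: str, reasons: list[BaseException]) -> None:
        super().__init__(message)
        # Keep every underlying failure so callers can inspect each one.
        self.reasons = list(reasons)

    def __str__(self) -> str:
        details = "; ".join(f"{type(r).__name__}: {r}" for r in self.reasons)
        return f"{self.args[0]} ({details})"


def raise_if_any_failed(results: list[object]) -> None:
    """Raise a single aggregate error if any result is an exception."""
    failures = [r for r in results if isinstance(r, BaseException)]
    if failures:
        raise AggregateWriteError(f"{len(failures)} store write(s) failed", failures)
```

A caller can then catch one exception type and still iterate over `reasons` to see which stores failed and why.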

Asyncio adaptation (key deviation from TS). TS relies on a long-lived event loop, so fire-and-forget background saves survive between turns. Python's synchronous Agent(...) entry point runs each invocation in a fresh loop (asyncio.run), which cancels in-flight tasks on return. The port uses asyncio.gather(return_exceptions=True) for Promise.allSettled, per-store asyncio.Task chains for serialized saves, and a tracked background-task set. The opt-in flush_on_invocation_end (default False) registers an AfterInvocationEvent hook that awaits pending writes before the per-invocation loop tears down; async callers owning a persistent loop can leave it off and call flush() at a shutdown boundary. This flag has no TS equivalent and exists solely to bridge the event-loop lifecycle difference.
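
The lifecycle difference described above can be reproduced in a few lines. This is a minimal sketch under stated assumptions: `BackgroundWriter`, `save_in_background`, and `invocation` are hypothetical names, and the sleep stands in for a store write. Only `asyncio.run`, `asyncio.create_task`, and `asyncio.gather(..., return_exceptions=True)` are real library calls.

```python
import asyncio


class BackgroundWriter:
    """Hypothetical helper: tracks fire-and-forget saves so they can be flushed."""

    def __init__(self) -> None:
        self.saved: list[str] = []
        self._pending: set[asyncio.Task] = set()

    async def _save(self, item: str) -> None:
        await asyncio.sleep(0.01)  # stands in for a slow store write
        self.saved.append(item)

    def save_in_background(self, item: str) -> None:
        task = asyncio.create_task(self._save(item))
        # Hold a strong reference until the task finishes.
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)

    async def flush(self) -> None:
        # return_exceptions=True waits for every task even if some fail,
        # which is the closest analogue to Promise.allSettled.
        await asyncio.gather(*self._pending, return_exceptions=True)


async def invocation(writer: BackgroundWriter, flush: bool) -> None:
    writer.save_in_background("user prefers dark mode")
    if flush:
        await writer.flush()


unflushed = BackgroundWriter()
asyncio.run(invocation(unflushed, flush=False))  # loop tears down; the pending save is cancelled

flushed = BackgroundWriter()
asyncio.run(invocation(flushed, flush=True))  # the save completes before teardown
```

In this sketch `unflushed.saved` stays empty while `flushed.saved` holds the entry, which is the gap an end-of-invocation flush is meant to close.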

Scope. This PR ports strands-ts/src/memory/ only. Concrete store backends (e.g. BedrockKnowledgeBaseStore, which lives in the separate strands-ts/src/vended-memory-stores/ module) are intended as a follow-up PR; this change ships the MemoryManager machinery and the MemoryStore protocol that those backends implement.

Related Issues

N/A

Documentation PR

No new docs required; public classes are documented via docstrings.

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce new warnings.

Ported the three TS test files (memory-manager, extraction, model-extractor) as example-based pytest suites so coverage mirrors the TS suite case-for-case; runs green under tests/strands/memory. Adjacent plugins/tools suites were run to confirm no regressions in plugin/tool discovery.

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Port the strands-ts memory module to strands-py: a MemoryManager plugin with search_memory/add_memory tools, pluggable MemoryStore backends, and background extraction (ExtractionCoordinator, ModelExtractor, invocation/interval triggers).

Adapts the TS promise model to asyncio: gather(return_exceptions=True) for Promise.allSettled, per-store asyncio.Task chains for serialized saves, and a tracked background-task set. Adds an opt-in flush_on_invocation_end so extraction persists under the synchronous Agent(...) entry point, whose per-invocation event loop would otherwise cancel in-flight saves.

Adds AggregateMemoryError to types/exceptions for Python 3.10-safe multi-store failure aggregation (ExceptionGroup is 3.11+).
@github-actions (bot) added the enhancement, area-persistence, area-async, area-hooks, python, and strands-running labels Jun 11, 2026
Comment thread strands-py/src/strands/memory/types.py
Comment thread strands-py/src/strands/memory/memory_manager.py
Comment thread strands-py/src/strands/memory/memory_manager.py
@github-actions

Contributor

Question — shipped store backend: This PR adds the MemoryManager machinery and the MemoryStore protocol, but no concrete store implementation. The TS side ships BedrockKnowledgeBaseStore (under strands-ts/src/vended-memory-stores/), and the design doc references an InMemoryMemoryStore for prototyping. As it stands, a user can't actually use this feature without writing their own store first, and there's no integration test exercising a real end-to-end path.

Is a concrete store (and an integ test) following in a separate PR? If so, a one-line note in the description would set expectations. If this is meant to be usable on its own, shipping at least an in-memory store would make the feature self-contained and give the suite an end-to-end anchor beyond the unit-level fakes.
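
For a sense of scale, a prototyping store could be very small. The sketch below is hypothetical: the `MemoryStore` protocol's real signatures are not shown in this thread, so `ToyInMemoryStore`, its `add(content)` and `search(query, limit)` signatures, and the keyword matching are all assumptions rather than the SDK's contract or the design doc's `InMemoryMemoryStore`.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyInMemoryStore:
    """Hypothetical store for prototyping; not the SDK's MemoryStore contract."""

    name: str = "toy"
    _entries: list[str] = field(default_factory=list)

    async def add(self, content: str) -> None:
        self._entries.append(content)

    async def search(self, query: str, limit: int = 5) -> list[str]:
        # Naive case-insensitive keyword match; a real backend would rank semantically.
        words = query.lower().split()
        hits = [e for e in self._entries if any(w in e.lower() for w in words)]
        return hits[:limit]


async def demo() -> list[str]:
    store = ToyInMemoryStore()
    await store.add("User prefers dark mode")
    await store.add("User's favorite language is Python")
    return await store.search("dark theme")


results = asyncio.run(demo())
```

Something of this size would also give an integration test a real write-then-search path to exercise.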

@github-actions

Contributor

Process note — API review label: This PR introduces a substantial new public API surface — a new top-level strands.memory package, the MemoryManager primitive, the MemoryStore protocol that customers implement, and ~10 new exported types/dataclasses. Per team/API_BAR_RAISING.md, a new customer-facing primitive/abstraction falls under "substantial changes," which calls for an explicitly designated API reviewer (or team consensus) rather than standard PR review alone.

The PR description does an excellent job documenting use cases, signatures, and the asyncio deviation — that's exactly the proposer prep the doc asks for. Could you add the needs-api-review label so this gets the appropriate sign-off before merge? Also worth confirming the divergence from designs/0011-memory-manager.md is intentional (the design doc describes store_memory/injection/EvictionTrigger; this port mirrors the current TS surface — add_memory, no injection — which seems correct, but the design doc status is still "Proposed").

Comment thread strands-py/src/strands/memory/memory_manager.py
@github-actions

Contributor

Assessment: Comment

A clean, well-documented port. Tests pass (106), ruff and mypy are clean, the implementation faithfully mirrors the current TS surface, and the asyncio adaptation is thoughtfully reasoned in both the code and the PR description. My feedback is concentrated on the event-loop lifecycle defaults and process, not the porting work itself.

Review themes
  • API review process: Substantial new public surface (new package + customer-facing primitive/protocol) — should carry needs-api-review per team/API_BAR_RAISING.md.
  • Asyncio lifecycle defaults: flush_on_invocation_end=False makes the most common (sync) path silently drop extraction writes; flush() also doesn't drain add-tool fire-and-forget tasks. These are the highest-impact items — the obvious path should be the happy path.
  • Public-type fidelity: MemoryManagerConfig claims to mirror the constructor but omits flush_on_invocation_end.
  • Usability/scope: No concrete MemoryStore ships, so the feature isn't self-contained; worth confirming a backend + integ test is following.
  • Minor: divergent stores=[] semantics between the tool and programmatic API.

Nicely structured, well-tested work — the whole-object equality assertions and case-for-case TS parity make the suite easy to trust.

Move the _flatten_reasons helper into memory_manager.py (where the TS source defines it) and delete memory/_errors.py, which had no TypeScript counterpart. AggregateMemoryError remains in strands.types.exceptions.
Add flush_on_invocation_end to MemoryManagerConfig so the doc/typing aid matches the constructor, and document the intentional empty-stores divergence (mirrored from TS) between the tools and the programmatic search/add APIs. Addresses PR strands-agents#2740 review.
@github-actions

Contributor

Assessment: Comment (approve-pending-API-review — unchanged)

Reviewed the latest commit 2d81d694 ("docs: make docstrings and comments standalone"). It's a docstring/comment cleanup only — stripping the "Python port of TypeScript…" internal framing so the docs read for SDK consumers rather than porters. Good call; that context belonged in the PR description, not the shipped docstrings.

One thing I checked carefully: extraction/__init__.py dropped its re-exports and __all__. Verified this is safe — strands.memory.__init__ imports straight from the submodules (.extraction.model_extractor, etc.), not via those re-exports, and nothing in src/ or tests/ imports from strands.memory.extraction directly. The canonical public surface is unchanged (23 exports, all key symbols still importable at runtime).

Re-verified on 2d81d694: 108 memory tests pass, ruff and mypy clean. No behavior or API change.

Position is unchanged from my prior review — all code feedback is resolved and the port is solid. The only remaining gate is the needs-api-review label (maintainer action), with the optional InMemoryMemoryStore + integ test as a team call.

Make MemoryStore extend MemoryStoreConfig (a Protocol) so the declarative fields live in one place, drop the unused @runtime_checkable on MemoryStore and Extractor (no isinstance checks are performed), and simplify the per-store save chain to await the prior save directly - it always settles internally, so the extra exception guard (which also swallowed CancelledError) is unnecessary.
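
A rough sketch of the save-chain idea in this commit message, assuming a hypothetical `SaveChain` class (the PR's `_run_chain` is not shown here): each link awaits the previous task directly and swallows only `Exception`, so store failures settle internally while `CancelledError` still propagates.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class SaveChain:
    """Hypothetical per-store chain: each save starts only after the previous one settles."""

    def __init__(self) -> None:
        self._tail: Optional[asyncio.Task] = None

    def enqueue(self, save: Callable[[], Awaitable[None]]) -> asyncio.Task:
        previous = self._tail

        async def run() -> None:
            if previous is not None:
                # Safe to await directly: the previous link swallows its own
                # errors below, so this never re-raises a store failure.
                await previous
            try:
                await save()
            except Exception:
                # Settle internally (a real implementation would log here).
                # CancelledError is a BaseException, so it is deliberately
                # NOT caught and still propagates.
                pass

        self._tail = asyncio.create_task(run())
        return self._tail


async def demo() -> list[str]:
    order: list[str] = []
    chain = SaveChain()

    async def slow() -> None:
        await asyncio.sleep(0.02)
        order.append("slow")

    async def failing() -> None:
        raise RuntimeError("store unavailable")

    async def fast() -> None:
        order.append("fast")

    chain.enqueue(slow)
    chain.enqueue(failing)
    await chain.enqueue(fast)
    return order


order = asyncio.run(demo())
```

Here `fast` runs only after `slow` finishes, and the failing link in the middle does not break the chain.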
@github-actions

Contributor

Assessment: Comment (approve-pending-API-review — unchanged)

Reviewed 2dfc2699 ("refactor: align store contract and simplify save chaining"). Unlike the previous docs commit, this one carries real code changes — three of them, all sound:

  1. _run_chain dropped the try/except BaseException around await previous — good change. The old except BaseException: pass swallowed CancelledError too, which is an asyncio anti-pattern; letting cancellation propagate is the correct idiom, and no chain task is ever .cancel()'d here anyway. The "previous always completes normally" claim holds because _extract swallows write errors internally. (Minor: the pre-try work in _extract — mark read + message filtering — is technically unprotected, but those are in-memory ops with effectively no raise path, so not worth guarding.)

  2. Removed @runtime_checkable from Extractor and MemoryStore — confirmed there are zero isinstance/issubclass checks against either protocol anywhere in src/ or tests/, so the decorator was unused (and misleading on protocols carrying data members). Clean removal; isinstance now correctly raises TypeError.

  3. MemoryStoreConfig dataclass → Protocol, with MemoryStore(MemoryStoreConfig, Protocol) — verified nothing ever constructed MemoryStoreConfig(...), so nothing breaks. The identity/behavior fields were previously duplicated across the dataclass and the MemoryStore protocol; folding them into one structural base that MemoryStore extends is a nice DRY-up. Note this changes the shape of a public exported symbol, so it falls under the already-pending API review rather than being a separate concern.

Re-verified on 2dfc2699: 108 memory tests pass, ruff and mypy clean, public surface intact (23 exports).

No new code issues. Position unchanged — the only remaining gate is the needs-api-review label (maintainer action), with the optional InMemoryMemoryStore + integ test as a team call.

``asyncio.Task`` chain to serialize a single store's saves, and
``asyncio.gather(..., return_exceptions=True)`` to run concurrent writes so one
failure does not cancel the rest.
"""

Member

I think it would be good to note somewhere that this mechanism might be worth revisiting. If we want "fire-and-forget", it might be better to do it the way otel does, which involves asynchronously running export on a separate process. Maybe something we can revisit when we work on background tool tasks. But again, I think we should record this somewhere, maybe on the background tool task ticket.

As it is implemented here and in TS, I would say it is not true "fire-and-forget", because saves block other work while running due to the single-threadedness of both languages.

Comment thread strands-py/src/strands/memory/extraction/coordinator.py Outdated

task.add_done_callback(_done)

async def process(self, store: MemoryStore) -> None:

Member

Nit: I would say process doesn't need to be async, since enqueue offloads to create_task anyway. That said, you could collapse this a bit and have process return the created task instead of being wrapped in a task itself in schedule above. This would be a minor memory optimization, since it means less task scheduling on the event loop.
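
The suggested shape might look roughly like this. It is a sketch with a hypothetical `Coordinator` whose `process` and `_save` bodies are placeholders; only the pattern of a plain method returning an `asyncio.Task` (or `None`) reflects the nit.

```python
import asyncio
from typing import Optional


class Coordinator:
    """Hypothetical coordinator; names mirror the thread, bodies are illustrative."""

    def __init__(self) -> None:
        self.saved: list[str] = []

    async def _save(self, item: str) -> None:
        await asyncio.sleep(0)
        self.saved.append(item)

    def process(self, item: Optional[str]) -> Optional[asyncio.Task]:
        # Plain (non-async) method: it creates the single task itself and
        # hands it back, so callers do not wrap it in a second task.
        if item is None:
            return None
        return asyncio.create_task(self._save(item))


async def demo() -> list[str]:
    coordinator = Coordinator()
    skipped = coordinator.process(None)
    task = coordinator.process("fact")
    assert skipped is None and task is not None
    await task
    return coordinator.saved


saved = asyncio.run(demo())
```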

AggregateMemoryError: If any concurrent ``add`` write fails.
"""
extraction = store.extraction
assert extraction is not None # noqa: S101 - extraction stores always configure this.

Member

Nit: See comment above.

"""
if "text" in block:
    return "text"
return next(iter(block.keys()), "")

Member

Nit: Seems like you don't need to special case text.

{"text": ...}
{"toolUse": ...}

^ It is the same structure. One key per content block.
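
With one key per block, the generic lookup alone is enough. A minimal sketch follows (the function name `block_kind` is illustrative; the PR's helper is `_block_kind`):

```python
from typing import Any, Mapping


def block_kind(block: Mapping[str, Any]) -> str:
    """Return the block's single key, or "" for an empty block."""
    return next(iter(block), "")


kinds = [block_kind(b) for b in ({"text": "hi"}, {"toolUse": {"name": "search_memory"}}, {})]
```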

DEFAULT_SYSTEM_PROMPT = (
    "You extract durable facts worth remembering across future conversations from a transcript.\n"
    "\n"
    'Return ONLY a JSON array of objects, each: {"content": string}. Each object is one discrete, '

Member

We should utilize structured output here to make it more reliable.
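
To illustrate the reliability concern: a free-form reply asked to be "ONLY a JSON array" still has to be parsed defensively. The `parse_entries` helper below is a hypothetical sketch of that parsing, not the PR's `ModelExtractor` code. Structured output would move these checks into the model call itself.

```python
import json


def parse_entries(raw: str) -> list[str]:
    """Parse a reply expected to be a JSON array of {"content": string} objects."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []  # e.g. the model wrapped the array in prose
    if not isinstance(data, list):
        return []  # e.g. a single object instead of an array
    entries = []
    for item in data:
        content = item.get("content") if isinstance(item, dict) else None
        # Skip items that are not objects or whose content is not a non-empty string.
        if isinstance(content, str) and content.strip():
            entries.append(content.strip())
    return entries
```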

assert extraction is not None # noqa: S101 - extraction stores always configure this.

if extraction.extractor is not None:
    entries = await extraction.extractor.extract(messages, ExtractorContext(default_model=self._default_model))

Member

What if a fact only really defines itself across multiple messages? We could miss that in extraction because of the trimming we do over turns, right? Do you think under-extraction is a problem?

Turn 1  user: I have a friend.
Turn 1  assistant: Cool, what is their name?
Turn 2  user: Bob

If turn 1 is trimmed before turn 2 extraction, we may lose the fact that Bob specifically is a friend.

Not something that needs to be addressed for this PR but still curious on what has been discussed around this.

Comment on lines +137 to +138
``ExceptionGroup`` so the SDK supports Python 3.10 (``ExceptionGroup`` is only
available on 3.11+).

Member

Nit: Another example of an implementation decision that I would say doesn't belong in the docstring. Not saying you have to pull it out now but we should find a better place because this is useful context for our code translation workflow.

in :meth:`MemoryManager.init_agent` for any store configured with an
``ExtractionConfig``.

All public field and method names use ``snake_case``.

Member

Nit: Not necessary to state in a docstring in my opinion.


# Synchronous Agent(...) entry point: enable flush_on_invocation_end so
# extraction writes persist across each invocation's event loop.
memory_manager = MemoryManager(stores=[my_store], flush_on_invocation_end=True)

Member

Should we enable this automatically on the user's behalf when a sync invoke is called?

Comment on lines +131 to +132
search_tool_config: MemoryToolConfig | bool = True,
add_tool_config: MemoryAddToolConfig | bool = False,

Member

Nit: Why not MemorySearchToolConfig for symmetry with MemoryAddToolConfig?

Member

Should we have a dedicated memory_manager param on Agent init? This would match the pattern of conversation_manager and session_manager.

Member

Do we have integ tests planned?

Contributor Author

yes, should be after Owen's PR

- process() returns the queued save task instead of being wrapped in an extra task by schedule(), removing a layer of event-loop scheduling per save.
- Replace the extraction-config asserts with an early guard and pass the extractor into _write, dropping the noqa.
- Simplify _block_kind: every content block is a single-key mapping, so no need to special-case text.
- Trim implementation rationale from the AggregateMemoryError docstring and drop a redundant naming note from the memory_manager module docstring.

Addresses PR strands-agents#2740 review nits.
@github-actions

Contributor

Assessment: Comment (approve — code is solid; design follow-ups are maintainer/author calls)

Reviewed 20a5a5e6 ("address review nits in coordinator and docstrings"), which responds to @pgrayy's review. The changes are clean and the nits are well-handled:

  • process() → sync returning asyncio.Task | None (per the memory-optimization nit) — verified there are only two callers, schedule() and the test suite, and both were updated. schedule() no longer double-wraps in create_task, and the tests now route through a small _drive() helper that awaits the returned task. No stale await process(...) remains.
  • Dropped the assert extraction is not None in favor of a graceful if extraction is None: return. Good call: I confirmed the coordinator is only ever constructed from _extraction_stores, which is pre-filtered on extraction is not None (memory_manager.py:195), so the branch is genuinely unreachable in production. Tightening the field type (the alternative suggested) isn't viable since MemoryStore.extraction must stay | None for non-extraction stores — the early return is the right resolution.
  • _write now takes extractor as an explicit param rather than re-deriving it and re-asserting — removes the second assert and reads more directly.
  • _block_kind no longer special-cases "text" — correct, every content block is a single-key mapping, so the generic path already covers it.
  • Docstring trims (the snake_case note, the ExceptionGroup/3.10 rationale) — agreed these were implementation/translation context that didn't belong in shipped docstrings.

Re-verified on 20a5a5e6: 108 memory tests pass, ruff and mypy clean.

Still open — but these are maintainer/author decisions, not code blockers

The substantive design questions from the review are tracked in-thread and don't block the port:

  • Structured output for the extractor prompt (reliability)
  • Possible under-extraction when a fact spans turns trimmed before extraction
  • Whether to add a dedicated memory_manager param on Agent.__init__ (symmetry with conversation_manager/session_manager)
  • Auto-enabling flush_on_invocation_end on sync invoke
  • MemorySearchToolConfig naming symmetry
  • Recording the "not true fire-and-forget" caveat on the background-tool-task ticket
  • Integ tests — author noted these land after Owen's PR

From my side the code is in good shape and all actionable review nits on this commit are addressed; the remaining items are design conversations for the team.

@opieter-aws opieter-aws enabled auto-merge (squash) June 12, 2026 21:14
@opieter-aws opieter-aws disabled auto-merge June 12, 2026 21:14
Collapse over-explained docstrings to concise summaries, drop Args/Returns blocks for self-evident params, and shorten multi-line rationale comments to one line. No code changes.
@github-actions

Contributor

Assessment: Comment (approve — pure documentation commit, no behavioral change)

Reviewed 97b70025 ("docs(memory): tighten verbose docstrings and comments") — 7 files, +208/−606.

I verified this is genuinely docs-only: I stripped all docstrings via AST from both 20a5a5e6 and 97b70025 and diffed the resulting executable code. All 7 files are byte-for-byte identical with docstrings removed (coordinator.py, model_extractor.py, triggers.py, extraction/types.py, memory_manager.py, types.py, exceptions.py). No logic, signatures, or control flow changed.
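
A check of this kind could be sketched as below. This is an assumption about the approach, not the script actually used: it compares `ast.dump` output after dropping docstring nodes, so comments and formatting are ignored as well. The helper name `strip_docstrings` and the sample sources are hypothetical.

```python
import ast


def strip_docstrings(source: str) -> str:
    """Return a dump of the module's AST with all docstrings removed."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        body = node.body
        if (
            body
            and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)
        ):
            # Drop the docstring; keep the body non-empty so the AST stays valid.
            node.body = body[1:] or [ast.Pass()]
    return ast.dump(tree)


before = 'def f(x):\n    """Long explanation.\n\n    More detail.\n    """\n    return x + 1\n'
after = 'def f(x):\n    """Short."""\n    return x + 1\n'
changed = 'def f(x):\n    """Short."""\n    return x + 2\n'

same_code = strip_docstrings(before) == strip_docstrings(after)
logic_changed = strip_docstrings(before) != strip_docstrings(changed)
```

Equal dumps indicate that only docstrings (or comments and whitespace) differ; any change to logic or signatures shows up as a mismatch.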

The trims condense some genuinely verbose prose (e.g. the multi-paragraph "how it works in three pieces" module docstring) into tighter summaries, while preserving the key concepts a reader needs — the per-store high-water mark, the per-store task chain serialization, backoff/probe behavior, and the never-raises guarantee. This is a net readability win.

Re-verified on 97b70025: 108 memory tests pass, ruff and mypy clean.

No code concerns. Position unchanged — the code is in good shape; the remaining open items are the design conversations in @pgrayy's review thread.

Labels

area-async, area-hooks, area-persistence, enhancement, python, size/xl
