feat: add async support to MemorySessionManager#478

Open
nborges-aws wants to merge 1 commit into main from fireAndForget

@nborges-aws
Contributor

Issue #, if available:
#452

Description of changes:

  • New async_mode: bool = False on AgentCoreMemoryConfig. Opt-in; the default preserves existing behavior.
  • When True, register_hooks installs async callbacks that wrap the existing sync methods in asyncio.to_thread().
  • register_hooks emits a logger.warning when async_mode=True, pointing users to stream_async / invoke_async, because sync invocation will raise RuntimeError from Strands.
  • 7 new tests covering the newly added logic.
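
For readers unfamiliar with the pattern, here is a minimal sketch of an async callback that wraps a blocking sync method in asyncio.to_thread(). The names save_message and on_message_added are hypothetical stand-ins, not the PR's methods or the Strands hook API.

```python
import asyncio
import time


def save_message(message: str) -> str:
    """Hypothetical blocking call standing in for a sync memory write."""
    time.sleep(0.05)  # simulate network I/O
    return f"saved:{message}"


async def on_message_added(message: str) -> str:
    # asyncio.to_thread runs the blocking function in a worker thread,
    # so the event loop stays free to schedule other tasks meanwhile.
    return await asyncio.to_thread(save_message, message)


async def main() -> list[str]:
    # Both callbacks overlap instead of running back to back.
    return await asyncio.gather(on_message_added("a"), on_message_added("b"))


results = asyncio.run(main())
```

The sync method itself is unchanged; only the callback that invokes it becomes a coroutine, which is why the default async_mode=False path can stay exactly as it was.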

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions
Contributor

github-actions Bot commented May 12, 2026

✅ No Breaking Changes Detected

No public API breaking changes found in this PR.

Contributor

@Hweinstock Hweinstock left a comment

changes LGTM! I think an integ test for this flag would be awesome (perhaps as a follow-up), and perhaps worth getting @jariy17 to take a look.

"Sync invocation will raise RuntimeError from Strands' hook registry."
)

registry.add_callback(AgentInitializedEvent, lambda event: self.initialize(event.agent))

note (for my own understanding): because this path doesn't call RepositorySessionManager.register_hooks, we must manually register the initialize hook: https://github.com/strands-agents/sdk-python/blob/main/src/strands/session/session_manager.py#L43.

In other words, we pick this synchronous hook from that implementation and leave out the rest to overwrite them with our own async hooks.

if self.config.batch_size > 1:
    registry.add_callback(AfterInvocationEvent, lambda event: self._flush_messages())

async def _on_after_invocation_flush(event: AfterInvocationEvent) -> None:

nit: could we add a small helper to reduce boilerplate here?
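
One possible shape for such a helper, as a sketch only: a small factory that turns any sync callback into an async one. The name as_async_callback is hypothetical and does not appear in the PR; the registration call in the trailing comment mirrors the diff context above.

```python
import asyncio
from typing import Any, Awaitable, Callable


def as_async_callback(fn: Callable[[Any], Any]) -> Callable[[Any], Awaitable[None]]:
    """Wrap a sync hook callback so it runs in a worker thread (hypothetical helper)."""

    async def _callback(event: Any) -> None:
        # Delegate the blocking work to a thread; the return value is discarded.
        await asyncio.to_thread(fn, event)

    # Keep a recognizable name for logs and debugging.
    _callback.__name__ = f"_async_{getattr(fn, '__name__', 'callback')}"
    return _callback


# Assumed usage inside register_hooks, replacing one hand-written async def per event:
# registry.add_callback(
#     AfterInvocationEvent,
#     as_async_callback(lambda event: self._flush_messages()),
# )
```

This would collapse each per-event async def into a single registration line, at the cost of less specific type annotations on the event parameter.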

Contributor

@jariy17 jariy17 left a comment

Everything is good but look into the _flush_agent_state bug and missing bidi hooks.

RepositorySessionManager.register_hooks(self, registry, **kwargs)
registry.add_callback(MessageAddedEvent, lambda event: self.retrieve_customer_context(event))
if not self.config.async_mode:
    RepositorySessionManager.register_hooks(self, registry, **kwargs)

note: We're manually registering hooks in the async path instead of going through RepositorySessionManager.register_hooks(), so if strands adds new hooks upstream, we won't pick them up here.

"""
RepositorySessionManager.register_hooks(self, registry, **kwargs)
registry.add_callback(MessageAddedEvent, lambda event: self.retrieve_customer_context(event))
if not self.config.async_mode:

Nit: I prefer if self.config.async_mode: — easier to read when the primary condition isn't negated.


registry.add_callback(MultiAgentInitializedEvent, _on_multi_agent_initialized)
registry.add_callback(AfterNodeCallEvent, _on_after_node_call)
registry.add_callback(AfterMultiAgentInvocationEvent, _on_after_multi_agent_invocation)

nit: Do we support BidiAgent? The sync path picks up BidiAgent hooks through the parent, but async mode doesn't register them. Should we add them here or explicitly document that BidiAgent + async_mode is unsupported?

logger.info("Flushed %d message events to AgentCore Memory", len(results))
return results

def _flush_agent_states_only(self) -> list[dict[str, Any]]:

This function copies the buffer under one lock acquisition (L1142-1143), does network I/O, then clears it under a separate lock acquisition (L1175-1176). Anything appended between the copy and the clear gets destroyed without ever being sent.

With async mode, multiple to_thread workers can be calling create_agent concurrently, which makes the window between copy and clear much easier to hit. Let's copy the same locking mechanism in _flush_messages_only() where it sends and clears under the same lock.


great find, agree we should copy and clear within the same lock acquire.
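
To make the proposed fix concrete, here is a minimal sketch of a buffer whose flush copies, sends, and clears inside a single lock acquisition. StateBuffer and its methods are hypothetical and are not the PR's code; the real _flush_agent_states_only may differ in how it sends and reports results.

```python
import threading
from typing import Any, Callable


class StateBuffer:
    """Hypothetical buffer illustrating the single-lock flush; not the PR's class."""

    def __init__(self, send: Callable[[Any], None]) -> None:
        self._lock = threading.Lock()
        self._items: list[Any] = []
        self._send = send

    def append(self, item: Any) -> None:
        with self._lock:
            self._items.append(item)

    def flush(self) -> list[Any]:
        # Copy, send, and clear inside ONE lock acquisition. Appends from
        # other threads block until the flush finishes, so nothing can land
        # between the copy and the clear and be wiped without being sent.
        with self._lock:
            pending = list(self._items)
            for item in pending:
                self._send(item)  # if this raises, the buffer is left intact
            self._items.clear()
            return pending


sent: list[Any] = []
buffer = StateBuffer(sent.append)
buffer.append({"agent": "a1"})
buffer.flush()
```

The trade-off in this sketch is that appenders wait for the network I/O to finish, and the send callable must not call append on the same thread because threading.Lock is not reentrant.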

3 participants