[Enhancement] AgentCoreMemorySessionManager: add async_mode setting to prevent event loop blocking #452

@jariy17

Problem

AgentCoreMemorySessionManager makes synchronous boto3 calls on the asyncio event loop hot path when used with Agent.stream_async() inside an async WebSocket server (e.g. AgentCore Runtime). This adds 200–800ms of blocked event loop time per agent turn, directly degrading time-to-first-token (TTFT) for streaming responses.

The four blocking call sites are:

  1. initialize() → read_session(), read_agent(), list_messages() (sync gmdp client calls)
  2. append_message() → create_event() (sync, once per message when batch_size=1)
  3. retrieve_customer_context() → uses ThreadPoolExecutor + as_completed(), but as_completed() is a blocking iterator, so the calling coroutine (and the event loop) still waits until the futures finish
  4. sync_agent() → update_agent() → create_event() (sync, per turn)

Because all four run inside the Strands hook lifecycle (BeforeInvocationEvent, MessageAddedEvent, AfterInvocationEvent), callers cannot wrap them without subclassing or forking the SDK.
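
The effect described above can be reproduced without AWS. The following is a minimal sketch, not code from the SDK: fake_boto3_call is a hypothetical stand-in that sleeps for 200 ms, and heartbeat stands in for any other work the event loop should be doing (for example, streaming tokens). It measures the longest pause between heartbeats while the call runs inline versus through asyncio.to_thread().

```python
import asyncio
import time


def fake_boto3_call(delay: float = 0.2) -> str:
    """Hypothetical stand-in for a synchronous boto3 request."""
    time.sleep(delay)
    return "ok"


async def heartbeat(ticks: list, interval: float = 0.02) -> None:
    """Record a timestamp each time the event loop lets this task run."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def max_gap(offload: bool) -> float:
    """Return the longest pause between heartbeats around one call."""
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0.05)  # let the heartbeat start ticking
    if offload:
        await asyncio.to_thread(fake_boto3_call)  # loop keeps running
    else:
        fake_boto3_call()  # runs on the loop thread and stalls it
    await asyncio.sleep(0.05)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print(f"inline call:    {asyncio.run(max_gap(offload=False)):.3f}s max gap")
    print(f"to_thread call: {asyncio.run(max_gap(offload=True)):.3f}s max gap")
```

With the inline call, the largest gap is at least as long as the call itself; with to_thread, the heartbeat should keep its normal cadence.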

Proposed Solution

Add an async_mode configuration setting to AgentCoreMemorySessionManager (or its config class):

class MemorySessionManagerConfig:
    async_mode: bool = False  # default: sync (backwards-compatible)

When async_mode=True, the session manager wraps all 4 blocking call sites with asyncio.to_thread() to offload boto3 calls to a thread pool, keeping the event loop unblocked:

# Pseudocode
if self.config.async_mode:
    await asyncio.to_thread(self._blocking_call, ...)
else:
    self._blocking_call(...)
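
A slightly fuller sketch of the same idea, runnable on its own. SessionManagerSketch and its _call helper are hypothetical names used only for illustration; they are not part of the SDK. The sketch also assumes the call sites can be reached from a coroutine. If the Strands hook callbacks are invoked synchronously, a different bridge would be needed.

```python
import asyncio
import threading
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class MemorySessionManagerConfig:
    async_mode: bool = False  # default: sync (backwards-compatible)


class SessionManagerSketch:
    """Hypothetical illustration of the async_mode dispatch."""

    def __init__(self, config: MemorySessionManagerConfig) -> None:
        self.config = config

    async def _call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.config.async_mode:
            # Run the blocking function in the default thread pool.
            return await asyncio.to_thread(fn, *args, **kwargs)
        # Existing behavior: run inline on the calling thread.
        return fn(*args, **kwargs)


def current_thread_name() -> str:
    """Stand-in for a blocking call; reports which thread ran it."""
    return threading.current_thread().name


if __name__ == "__main__":
    for mode in (False, True):
        manager = SessionManagerSketch(MemorySessionManagerConfig(async_mode=mode))
        print(mode, asyncio.run(manager._call(current_thread_name)))
```

Routing every call site through one helper keeps the mode check in a single place, so the sync path stays byte-for-byte the same call as before.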

This is a non-breaking change: existing users keep today's synchronous behavior by default, and async users opt in by setting async_mode=True.

Impact

  • Without fix: 200–800ms per turn of event loop blocking → degraded TTFT for streaming agents
  • With fix: boto3 calls are offloaded to a thread pool → event loop stays free for I/O

Affected File

src/bedrock_agentcore/memory/integrations/strands/session_manager.py

Acceptance Criteria

  • async_mode: bool = False added to config (backwards-compatible default)
  • All 4 blocking call sites wrapped with asyncio.to_thread when async_mode=True
  • retrieve_customer_context uses asyncio.gather instead of blocking as_completed when async_mode=True
  • Unit tests cover both async_mode=False (existing behavior) and async_mode=True
  • Docs updated with usage example for async WebSocket agents
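
For the retrieve_customer_context criterion, one possible shape is sketched below. retrieve_namespace is a hypothetical stand-in for a single blocking retrieval call, and retrieve_all is an illustrative name, not the SDK's method. Note one behavioral difference: asyncio.gather returns results in input order, whereas as_completed yields them in completion order.

```python
import asyncio
import time


def retrieve_namespace(namespace: str) -> list[str]:
    """Hypothetical stand-in for one blocking memory retrieval call."""
    time.sleep(0.1)
    return [f"{namespace}:memory"]


async def retrieve_all(namespaces: list[str]) -> list[str]:
    """Fan out the blocking calls to threads and await them together."""
    results = await asyncio.gather(
        *(asyncio.to_thread(retrieve_namespace, ns) for ns in namespaces),
        return_exceptions=True,  # one failed namespace does not cancel the rest
    )
    memories: list[str] = []
    for result in results:
        if isinstance(result, Exception):
            continue  # assumption: failed namespaces are skipped, not raised
        memories.extend(result)
    return memories


if __name__ == "__main__":
    print(asyncio.run(retrieve_all(["facts", "preferences", "summaries"])))
```

Whether failures should be skipped, logged, or re-raised is a design decision for the actual implementation; the sketch skips them only to stay short.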
