From 8b820f2c28b59449a0ca66159316c7f6f48849b1 Mon Sep 17 00:00:00 2001
From: Nitin Kanukolanu <nitinkanukolanu@gmail.com>
Date: Thu, 16 Apr 2026 10:05:22 -0400
Subject: [PATCH 1/4] docs: update AWS Bedrock documentation with comprehensive
 coverage

- Add Quick Start section to aws-bedrock.md for running a full Bedrock instance
- Document bedrock/ prefix requirement for embedding models vs no prefix for generation
- Add installation instructions for the [aws] extra (boto3, botocore)
- Fix Docker Compose instructions to use --profile aws instead of DOCKER_TARGET
- Fix region env var references (REGION_NAME + AWS_REGION_NAME)
- Add latest Anthropic Claude models (Opus 4.6, Sonnet 4.6, Opus 4.1, Sonnet 4, etc.)
- Add additional embedding models (Titan multimodal, Cohere Embed v4)
- Fix default GENERATION_MODEL in llm-providers.md to match config.py
- Add hybrid config examples (Bedrock embeddings + OpenAI generation and vice versa)
- Add troubleshooting entries for missing AWS dependencies
- Cross-reference between aws-bedrock.md and llm-providers.md
---
 docs/aws-bedrock.md   | 297 +++++++++++++++++++++++++++---------------
 docs/llm-providers.md | 140 +++++++++++---------
 2 files changed, 273 insertions(+), 164 deletions(-)

diff --git a/docs/aws-bedrock.md b/docs/aws-bedrock.md
index 6bc501e8..a778c561 100644
--- a/docs/aws-bedrock.md
+++ b/docs/aws-bedrock.md
@@ -1,60 +1,143 @@
 # AWS Bedrock Models
 
-> **Note:** This documentation has been consolidated into [LLM Providers](llm-providers.md#aws-bedrock).
-> This page is kept for reference but the LLM Providers guide is the authoritative source.
-
 The Redis Agent Memory Server supports [Amazon Bedrock](https://aws.amazon.com/bedrock/) for both **embedding models** and **LLM generation models**. This allows you to use AWS-native AI models while keeping your data within the AWS ecosystem.
 
-## Quick Reference
+> **See also:** [LLM Providers](llm-providers.md#aws-bedrock) for a broader overview of all supported providers, and [Embedding Providers](embedding-providers.md) for embedding-specific configuration.
+
+## Quick Start — Run a Full Bedrock-Backed Instance
+
+Follow these steps to get the memory server running entirely on AWS Bedrock in under five minutes.
+
+### 1. Install the `[aws]` extra
+
+The server's core install does **not** include AWS SDK libraries. You must install the `[aws]` extra so that `boto3` (AWS SDK for Python) and `botocore` (low-level AWS client library) are available at runtime. Without these packages, any Bedrock operation will fail with an import error.
+
+```bash
+# With pip
+pip install agent-memory-server[aws]
+
+# With uv (used by this project)
+uv sync --extra aws
+```
+
+### 2. Export environment variables
+
+```bash
+# ── AWS credentials ──────────────────────────────────────────────
+# (or use an IAM role / AWS CLI profile instead — see "AWS Credentials" below)
+export AWS_ACCESS_KEY_ID=your-access-key-id
+export AWS_SECRET_ACCESS_KEY=your-secret-access-key
+export REGION_NAME=us-east-1          # Used by the server's own model-validation client
+export AWS_REGION_NAME=us-east-1      # Used by LiteLLM for Bedrock API calls
+
+# ── Bedrock embedding model (bedrock/ prefix REQUIRED) ──────────
+export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
+export REDISVL_VECTOR_DIMENSIONS=1024   # Must match the embedding model
+
+# ── Bedrock generation models (NO prefix needed) ────────────────
+export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
+export FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0
+
+# ── Redis ────────────────────────────────────────────────────────
+export REDIS_URL=redis://localhost:6379
+```
+
+### 3. Start Redis and the server
+
+```bash
+# Start Redis (requires Docker)
+docker-compose up -d redis
+
+# Start the memory server
+uv run agent-memory api
+```
+
+The REST API is now available at `http://localhost:8000` (docs at `/docs`).
+
+> **Tip:** To also run background tasks (memory extraction, compaction, etc.), start a worker in a second terminal:
+> ```bash
+> uv run agent-memory task-worker
+> ```
+
+---
 
-For complete AWS Bedrock configuration, see [LLM Providers - AWS Bedrock](llm-providers.md#aws-bedrock).
+## Why Two Region Variables?
 
-**Key points:**
-- All LLM operations use [LiteLLM](https://docs.litellm.ai/) internally
-- Bedrock embedding models require the `bedrock/` prefix (e.g., `bedrock/amazon.titan-embed-text-v2:0`)
-- Bedrock generation models do not need a prefix (e.g., `anthropic.claude-sonnet-4-5-20250929-v1:0`)
-- The `[aws]` extra installs `boto3` and `botocore` for AWS authentication
+The server reads the AWS region in **two** places:
 
-## Overview
+| Variable | Read by | Purpose |
+|----------|---------|---------|
+| `REGION_NAME` | The server's Settings (pydantic-settings) | Creating `boto3` sessions for model-existence validation |
+| `AWS_REGION_NAME` | LiteLLM | Making the actual Bedrock inference API calls |
 
-Amazon Bedrock provides access to a wide variety of foundation models from leading AI providers. The Redis Agent Memory Server supports using Bedrock for:
+Set **both** to the same value so that model validation and inference target the same region. If you rely solely on an IAM role or AWS CLI profile, `boto3` and LiteLLM may auto-detect the region from the instance metadata or `~/.aws/config`, but setting both variables explicitly is recommended.
 
-1. **Embedding Models** - For semantic search and memory retrieval
-2. **LLM Generation Models** - For memory extraction, summarization, and topic modeling
+---
 
-### Supported Embedding Models
+## Understanding the `bedrock/` Prefix
 
-> **Important:** Use the `bedrock/` prefix for embedding models.
+All LLM operations in the server go through [LiteLLM](https://docs.litellm.ai/), which uses a provider prefix to route requests to the correct backend.
+
+| Model type | Prefix required? | Example |
+|------------|------------------|---------|
+| **Embedding** | **Yes** — must include `bedrock/` | `bedrock/amazon.titan-embed-text-v2:0` |
+| **Generation (chat)** | **No** — Bedrock model IDs are recognized automatically | `anthropic.claude-sonnet-4-5-20250929-v1:0` |
+
+**Why the difference?** LiteLLM can infer the provider for generation models from the Bedrock-style model ID (e.g., `anthropic.claude-*`). For embedding models, however, the `bedrock/` prefix is the only way LiteLLM distinguishes a Bedrock embedding call from other providers. If you omit the prefix on an embedding model, the server currently adds it for you and emits a **deprecation warning**, but this fallback will be removed in a future release.
+
+---
+
+## Supported Models
+
+### Embedding Models
+
+> **Important:** Always use the `bedrock/` prefix for embedding models.
 
 | Model ID | Provider | Dimensions | Description |
 |----------|----------|------------|-------------|
-| `bedrock/amazon.titan-embed-text-v2:0` | Amazon | 1024 | Latest Titan embedding model |
-| `bedrock/amazon.titan-embed-text-v1` | Amazon | 1536 | Original Titan embedding model |
+| `bedrock/amazon.titan-embed-text-v2:0` | Amazon | 1024 | Latest Titan text embedding (recommended) |
+| `bedrock/amazon.titan-embed-text-v1` | Amazon | 1536 | Original Titan text embedding |
+| `bedrock/amazon.titan-embed-image-v1` | Amazon | 1024 | Titan multimodal (text + image) embedding |
 | `bedrock/cohere.embed-english-v3` | Cohere | 1024 | English-focused embeddings |
 | `bedrock/cohere.embed-multilingual-v3` | Cohere | 1024 | Multilingual embeddings |
+| `bedrock/cohere.embed-v4:0` | Cohere | 1024 | Cohere Embed v4 — text + image embedding |
 
-### Pre-configured LLM Generation Models
+### Generation Models (Anthropic Claude on Bedrock)
 
-The following models are pre-configured in the codebase:
+The following Anthropic Claude models are available on Bedrock. Models with a ✓ in the **Pre-configured** column have entries in `MODEL_CONFIGS` (in `config.py`) with validated token limits. The others can still be used by setting `GENERATION_MODEL`, `FAST_MODEL`, or `SLOW_MODEL` to the model ID, because LiteLLM routes the request based on the Bedrock model ID.
 
-| Model ID | Provider | Max Tokens | Description |
-|----------|----------|------------|-------------|
-| `anthropic.claude-sonnet-4-5-20250929-v1:0` | Anthropic | 200,000 | Claude 4.5 Sonnet |
-| `anthropic.claude-haiku-4-5-20251001-v1:0` | Anthropic | 200,000 | Claude 4.5 Haiku |
-| `anthropic.claude-opus-4-5-20251101-v1:0` | Anthropic | 200,000 | Claude 4.5 Opus |
+| Model ID | Description | Pre-configured |
+|----------|-------------|:--------------:|
+| `anthropic.claude-opus-4-6-v1` | Claude Opus 4.6 — latest and most capable | |
+| `anthropic.claude-sonnet-4-6` | Claude Sonnet 4.6 | |
+| `anthropic.claude-opus-4-5-20251101-v1:0` | Claude Opus 4.5 | ✓ |
+| `anthropic.claude-sonnet-4-5-20250929-v1:0` | Claude Sonnet 4.5 | ✓ |
+| `anthropic.claude-haiku-4-5-20251001-v1:0` | Claude Haiku 4.5 — fast & cost-effective | ✓ |
+| `anthropic.claude-opus-4-1-20250805-v1:0` | Claude Opus 4.1 | |
+| `anthropic.claude-sonnet-4-20250514-v1:0` | Claude Sonnet 4 | |
+| `anthropic.claude-3-5-haiku-20241022-v1:0` | Claude 3.5 Haiku | |
+| `anthropic.claude-3-haiku-20240307-v1:0` | Claude 3 Haiku | |
+
+> **Tip:** For the recommended quick-start configuration, use `anthropic.claude-sonnet-4-5-20250929-v1:0` as `GENERATION_MODEL` and `anthropic.claude-haiku-4-5-20251001-v1:0` as `FAST_MODEL`.
 
 ## Installation
 
-AWS Bedrock support requires additional dependencies. Install them with:
+AWS Bedrock support requires additional dependencies. Install the `[aws]` extra:
 
 ```bash
+# With pip
 pip install agent-memory-server[aws]
+
+# With uv (recommended — used by this project)
+uv sync --extra aws
 ```
 
 This installs:
 
-- `boto3` - AWS SDK for Python
-- `botocore` - Low-level AWS client library
+- **`boto3`** (`>=1.42.1,<2.0.0`) — AWS SDK for Python
+- **`botocore`** (`>=1.42.1,<2.0.0`) — Low-level AWS client library
+
+> **Without these packages**, any attempt to use Bedrock models will fail at import time. The standard install (`pip install agent-memory-server`) does **not** include them.
 
 ## Configuration
 
@@ -64,10 +147,12 @@ Configure the following environment variables to use Bedrock models:
 
 ```bash
 # Required: AWS region where Bedrock is available
-AWS_REGION_NAME=us-east-1
+REGION_NAME=us-east-1            # For the server's own boto3 sessions
+AWS_REGION_NAME=us-east-1        # For LiteLLM's Bedrock calls
 
-# For Bedrock Embedding Models (note: bedrock/ prefix required)
+# For Bedrock Embedding Models (bedrock/ prefix REQUIRED)
 EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
+REDISVL_VECTOR_DIMENSIONS=1024   # Must match the embedding model's output dimensions
 
 # For Bedrock LLM Generation Models (no prefix needed)
 GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
@@ -100,7 +185,7 @@ aws_secret_access_key = your-secret-access-key
 
 #### Option 3: IAM Role (Recommended for AWS deployments)
 
-When running on AWS infrastructure (EC2, ECS, Lambda, etc.), use IAM roles for automatic credential management. No explicit credentials are needed.
+When running on AWS infrastructure (EC2, ECS, Lambda, etc.), use IAM roles for automatic credential management. No explicit credentials are needed: `boto3` and LiteLLM discover them through the standard AWS credential provider chain (for example, the EC2 instance metadata service or the ECS task role endpoint).
 
 #### Option 4: AWS SSO / AWS CLI Profile
 
@@ -116,50 +201,70 @@ export AWS_PROFILE=your-profile
 
 ### Docker Configuration
 
-The Docker image supports two build targets:
+The Dockerfile provides two build targets (multi-stage):
 
-- **`standard`** (default): OpenAI/Anthropic support only
-- **`aws`**: Includes AWS Bedrock embedding models support
+- **`standard`** (default) — OpenAI / Anthropic support only
+- **`aws`** — Includes `boto3` and `botocore` for AWS Bedrock support
 
 #### Building the AWS-enabled Image
 
 ```bash
 # Build directly with Docker
 docker build --target aws -t agent-memory-server:aws .
-
-# Or use Docker Compose with the DOCKER_TARGET variable
-DOCKER_TARGET=aws docker-compose up --build
 ```
 
-#### Docker Compose Configuration
+#### Docker Compose
 
-When using Docker Compose, set the `DOCKER_TARGET` environment variable to `aws`:
+The `docker-compose.yml` ships with a dedicated **`aws` profile** that uses pre-built AWS images (`redislabs/agent-memory-server-aws`). Activate it with `--profile aws`:
 
 ```bash
-# Start with AWS Bedrock support
-DOCKER_TARGET=aws docker-compose up --build
+# Start the full AWS stack (API + MCP + task worker + Redis)
+docker-compose --profile aws up
 
-# Or for the production-like setup
-DOCKER_TARGET=aws docker-compose -f docker-compose-task-workers.yml up --build
+# Or start only the API and Redis
+docker-compose --profile aws up api-aws redis
 ```
 
-Create a `.env` file with your credentials and configuration:
-
-```bash
-# Docker build target
-DOCKER_TARGET=aws
+> **Note:** The `aws` profile services (`api-aws`, `mcp-aws`, `task-worker-aws`) are separate from the standard services. Do **not** mix profiles — run either `docker-compose up` (standard) or `docker-compose --profile aws up`.
 
-# Embedding model (note: bedrock/ prefix required)
-EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
+Create a `.env` file with your credentials and model configuration. The Docker Compose AWS services read this file automatically:
 
+```bash
 # AWS credentials
+REGION_NAME=us-east-1
 AWS_REGION_NAME=us-east-1
 AWS_ACCESS_KEY_ID=your-access-key-id
 AWS_SECRET_ACCESS_KEY=your-secret-access-key
-AWS_SESSION_TOKEN=your-session-token  # Optional
+AWS_SESSION_TOKEN=your-session-token  # Optional, for temporary credentials
+
+# Embedding model (bedrock/ prefix required)
+EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
+REDISVL_VECTOR_DIMENSIONS=1024
+
+# Generation models (override the defaults in docker-compose.yml if desired)
+GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
+FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0
+```
+
+You can also pass AWS credentials at runtime with `docker run`:
+
+```bash
+docker run -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \
+  -e REGION_NAME=us-east-1 -e AWS_REGION_NAME=us-east-1 \
+  -e EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 \
+  -e REDISVL_VECTOR_DIMENSIONS=1024 \
+  -e GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 \
+  -p 8000:8000 redislabs/agent-memory-server-aws:latest
 ```
 
-The Docker Compose files already include the AWS environment variables, so you only need to set them in your `.env` file or environment.
+Or mount your AWS credentials directory:
+
+```bash
+docker run -v ~/.aws:/root/.aws:ro \
+  -e AWS_PROFILE=my-profile \
+  -e REGION_NAME=us-east-1 -e AWS_REGION_NAME=us-east-1 \
+  -p 8000:8000 redislabs/agent-memory-server-aws:latest
+```
 
 ## Required IAM Permissions
 
@@ -224,17 +329,16 @@ REDISVL_VECTOR_DIMENSIONS=1536
 ### Example 1: Bedrock Embeddings with OpenAI Generation
 
 ```bash
-# Embedding model (Bedrock - note: bedrock/ prefix required)
+# Embedding model (Bedrock — bedrock/ prefix required)
 EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
+REDISVL_VECTOR_DIMENSIONS=1024
 
 # AWS Configuration
+REGION_NAME=us-east-1
 AWS_REGION_NAME=us-east-1
 AWS_ACCESS_KEY_ID=your-access-key-id
 AWS_SECRET_ACCESS_KEY=your-secret-access-key
 
-# Embedding dimensions (must match embedding model)
-REDISVL_VECTOR_DIMENSIONS=1024
-
 # Generation model (OpenAI)
 GENERATION_MODEL=gpt-4o
 OPENAI_API_KEY=your-openai-key
@@ -247,35 +351,51 @@ REDIS_URL=redis://localhost:6379
 
 ```bash
 # AWS Configuration
+REGION_NAME=us-east-1
 AWS_REGION_NAME=us-east-1
 AWS_ACCESS_KEY_ID=your-access-key-id
 AWS_SECRET_ACCESS_KEY=your-secret-access-key
 
-# Embedding model (Bedrock Titan - note: bedrock/ prefix required)
+# Embedding model (Bedrock Titan — bedrock/ prefix required)
 EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
 REDISVL_VECTOR_DIMENSIONS=1024
 
-# Generation models (Bedrock Claude - no prefix needed)
+# Generation models (Bedrock Claude — no prefix needed)
 GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
 FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0
 SLOW_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
-TOPIC_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0
 
 # Other settings
 REDIS_URL=redis://localhost:6379
 ```
 
+### Example 3: OpenAI Embeddings with Bedrock Generation
+
+```bash
+# Embeddings via OpenAI
+EMBEDDING_MODEL=text-embedding-3-small
+OPENAI_API_KEY=your-openai-key
+
+# Generation via Bedrock
+REGION_NAME=us-east-1
+AWS_REGION_NAME=us-east-1
+AWS_ACCESS_KEY_ID=your-access-key-id
+AWS_SECRET_ACCESS_KEY=your-secret-access-key
+GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
+
+REDIS_URL=redis://localhost:6379
+```
+
 ### YAML Configuration
 
 ```yaml
 # config.yaml - Full Bedrock Stack
 region_name: us-east-1
-embedding_model: amazon.titan-embed-text-v2:0
+embedding_model: bedrock/amazon.titan-embed-text-v2:0   # bedrock/ prefix required
 redisvl_vector_dimensions: 1024
 generation_model: anthropic.claude-sonnet-4-5-20250929-v1:0
 fast_model: anthropic.claude-haiku-4-5-20251001-v1:0
 slow_model: anthropic.claude-sonnet-4-5-20250929-v1:0
-topic_model: anthropic.claude-haiku-4-5-20251001-v1:0
 redis_url: redis://localhost:6379
 ```
 
@@ -305,68 +425,33 @@ Before using a Bedrock model, you must enable it in the AWS Console:
 
 ## Mixing Providers
 
-You can mix and match providers for different use cases:
-
-### Bedrock Embeddings with OpenAI Generation
-
-```bash
-# Embeddings via Bedrock (note: bedrock/ prefix required)
-EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
-AWS_REGION_NAME=us-east-1
-
-# Generation via OpenAI
-GENERATION_MODEL=gpt-4o
-OPENAI_API_KEY=your-openai-key
-```
-
-### Full Bedrock Stack (Embeddings + Generation)
-
-```bash
-# All AWS - keep everything within your AWS environment
-AWS_REGION_NAME=us-east-1
-
-# Embeddings via Bedrock (note: bedrock/ prefix required)
-EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
-REDISVL_VECTOR_DIMENSIONS=1024
-
-# Generation via Bedrock Claude (no prefix needed)
-GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
-FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0
-```
-
-### OpenAI Embeddings with Bedrock Generation
-
-```bash
-# Embeddings via OpenAI
-EMBEDDING_MODEL=text-embedding-3-small
-OPENAI_API_KEY=your-openai-key
-
-# Generation via Bedrock
-AWS_REGION_NAME=us-east-1
-GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
-```
+You can mix and match providers for embeddings and generation independently. See the [Complete Configuration Examples](#complete-configuration-examples) section above for full `.env` snippets covering each combination.
 
 This flexibility allows you to:
+
 - Keep all data within AWS for compliance requirements
-- Use the best model for each task
+- Use the best model for each task (e.g., OpenAI embeddings + Bedrock generation)
 - Optimize costs by choosing appropriate models for different operations
 
 ## Troubleshooting
 
 ### "AWS-related dependencies might be missing"
 
-Install the AWS extras:
+You need to install the `[aws]` extra. The standard install does not include `boto3`:
 
 ```bash
 pip install agent-memory-server[aws]
+# or with uv:
+uv sync --extra aws
 ```
 
-### "Missing environment variable 'AWS_REGION_NAME'"
+### "Missing environment variable 'REGION_NAME'"
 
-Set the AWS region:
+The server's Settings class reads the region from `REGION_NAME`. Set it:
 
 ```bash
-export AWS_REGION_NAME=us-east-1
+export REGION_NAME=us-east-1
+export AWS_REGION_NAME=us-east-1   # Also set this for LiteLLM
 ```
 
 ### "Bedrock embedding model not found"
diff --git a/docs/llm-providers.md b/docs/llm-providers.md
index 911a579c..f4510aa9 100644
--- a/docs/llm-providers.md
+++ b/docs/llm-providers.md
@@ -16,7 +16,7 @@ All LLM operations go through a single `LLMClient` abstraction:
 │           └──────────┬───────────────┘                   │
 │                      ▼                                   │
 │               ┌──────────────┐                           │
-│               │   LiteLLM    │                           │
+│               │   LLM Proxy  │                           │
 │               └──────┬───────┘                           │
 └──────────────────────┼───────────────────────────────────┘
                        ▼
@@ -41,19 +41,24 @@ export OPENAI_API_KEY=sk-...
 export GENERATION_MODEL=gpt-4o
 export EMBEDDING_MODEL=text-embedding-3-small
 
-# Anthropic
+# Anthropic (requires a separate embedding provider — Anthropic has no embedding models)
 export ANTHROPIC_API_KEY=sk-ant-...
 export GENERATION_MODEL=claude-3-5-sonnet-20241022
-export EMBEDDING_MODEL=text-embedding-3-small  # Use OpenAI for embeddings
+export OPENAI_API_KEY=sk-...                    # Needed for embeddings
+export EMBEDDING_MODEL=text-embedding-3-small   # Use OpenAI for embeddings
 
-# AWS Bedrock
+# AWS Bedrock (full stack — see "AWS Bedrock" section for details)
 export AWS_ACCESS_KEY_ID=...
 export AWS_SECRET_ACCESS_KEY=...
-export AWS_REGION_NAME=us-east-1
-export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
-export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
+export REGION_NAME=us-east-1                                      # Server's own boto3 sessions
+export AWS_REGION_NAME=us-east-1                                  # LiteLLM Bedrock calls
+export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 # No prefix for generation
+export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0       # bedrock/ prefix REQUIRED
+export REDISVL_VECTOR_DIMENSIONS=1024                             # Must match embedding model
 ```
 
+> **Bedrock users:** You must also install the `[aws]` extra (`pip install agent-memory-server[aws]` or `uv sync --extra aws`) to get `boto3` and `botocore`. See [AWS Bedrock](aws-bedrock.md) for a full walkthrough.
+
 ## Supported Providers
 
 ### Generation Models (Chat Completions)
@@ -125,15 +130,19 @@ export EMBEDDING_MODEL=text-embedding-3-small
 
 AWS Bedrock provides access to foundation models from multiple providers (Anthropic Claude, Amazon Titan, Cohere, etc.) through AWS infrastructure.
 
+> **Full guide:** See [AWS Bedrock Models](aws-bedrock.md) for a complete walkthrough including a Quick Start demo, Docker Compose instructions, IAM policies, and troubleshooting.
+
 #### Installation
 
-AWS Bedrock support requires additional dependencies:
+AWS Bedrock support requires the `[aws]` extra, which installs `boto3` (`>=1.42.1`) and `botocore` (`>=1.42.1`). Without these packages, any Bedrock operation will fail at import time.
 
 ```bash
+# With pip
 pip install agent-memory-server[aws]
-```
 
-This installs `boto3` and `botocore` for AWS authentication.
+# With uv (recommended — used by this project)
+uv sync --extra aws
+```
 
 #### Authentication
 
@@ -143,14 +152,17 @@ Bedrock uses standard AWS credentials. Configure using any of these methods:
 # Option 1: Environment variables (recommended for development)
 export AWS_ACCESS_KEY_ID=AKIA...
 export AWS_SECRET_ACCESS_KEY=...
-export AWS_REGION_NAME=us-east-1
+export REGION_NAME=us-east-1        # Server's own boto3 sessions (model validation)
+export AWS_REGION_NAME=us-east-1    # LiteLLM's Bedrock API calls
 
 # Option 2: AWS CLI profile
 export AWS_PROFILE=my-profile
+export REGION_NAME=us-east-1
 export AWS_REGION_NAME=us-east-1
 
 # Option 3: IAM role (recommended for production on AWS)
-# No credentials needed - uses instance/container role
+# No credentials needed — uses instance/container role
+export REGION_NAME=us-east-1
 export AWS_REGION_NAME=us-east-1
 
 # Option 4: AWS SSO
@@ -158,46 +170,59 @@ aws sso login --profile your-profile
 export AWS_PROFILE=your-profile
 ```
 
+> **Why two region variables?** The server reads `REGION_NAME` (via pydantic-settings) for its own `boto3` model-validation client, while LiteLLM reads `AWS_REGION_NAME` for the actual Bedrock inference calls. Set both to the same value.
+
 #### Generation Models
 
+Generation models use Bedrock-native model IDs **without** a prefix; LiteLLM recognizes them automatically.
+
 ```bash
-# Claude models on Bedrock (no prefix needed for generation)
+# Claude models on Bedrock (no prefix needed)
 export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
-export FAST_MODEL=anthropic.claude-3-5-haiku-20241022-v1:0
-
-# Amazon Titan
-export GENERATION_MODEL=amazon.titan-text-premier-v1:0
+export FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0
 ```
 
-**Supported Bedrock generation models:**
-- `anthropic.claude-sonnet-4-5-20250929-v1:0` (recommended)
-- `anthropic.claude-3-5-sonnet-20241022-v2:0`
-- `anthropic.claude-3-5-haiku-20241022-v1:0`
-- `anthropic.claude-3-opus-20240229-v1:0`
-- `amazon.titan-text-premier-v1:0`
-- `amazon.titan-text-express-v1`
+**Anthropic Claude models on Bedrock** (models marked ✓ are pre-configured in `MODEL_CONFIGS`):
+
+| Model ID | Description | Pre-configured |
+|----------|-------------|:--------------:|
+| `anthropic.claude-opus-4-6-v1` | Claude Opus 4.6 — latest | |
+| `anthropic.claude-sonnet-4-6` | Claude Sonnet 4.6 | |
+| `anthropic.claude-opus-4-5-20251101-v1:0` | Claude Opus 4.5 | ✓ |
+| `anthropic.claude-sonnet-4-5-20250929-v1:0` | Claude Sonnet 4.5 | ✓ |
+| `anthropic.claude-haiku-4-5-20251001-v1:0` | Claude Haiku 4.5 — fast & cost-effective | ✓ |
+| `anthropic.claude-opus-4-1-20250805-v1:0` | Claude Opus 4.1 | |
+| `anthropic.claude-sonnet-4-20250514-v1:0` | Claude Sonnet 4 | |
+| `anthropic.claude-3-5-haiku-20241022-v1:0` | Claude 3.5 Haiku | |
+| `anthropic.claude-3-haiku-20240307-v1:0` | Claude 3 Haiku | |
+
+Models without a ✓ can still be used by setting `GENERATION_MODEL`, `FAST_MODEL`, or `SLOW_MODEL` to the Bedrock model ID; LiteLLM routes the request based on the model ID convention.
 
 #### Embedding Models
 
-> **Important:** Bedrock embedding models require the `bedrock/` prefix.
+> **Important:** Bedrock embedding models **require** the `bedrock/` prefix.
+
+LiteLLM needs the `bedrock/` prefix to distinguish Bedrock embeddings from other providers. If you omit it, the server auto-adds the prefix and emits a **deprecation warning** — this fallback will be removed in a future release.
 
 ```bash
-# Correct - use bedrock/ prefix
+# Correct — use bedrock/ prefix
 export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
-REDISVL_VECTOR_DIMENSIONS=1024  # Must match embedding model
+export REDISVL_VECTOR_DIMENSIONS=1024  # Must match embedding model dimensions
 
-# Deprecated - unprefixed names emit a warning
-export EMBEDDING_MODEL=amazon.titan-embed-text-v2:0  # Works but shows deprecation warning
+# Deprecated — works but emits a warning
+export EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
 ```
 
 **Supported Bedrock embedding models:**
 
 | Model ID | Dimensions | Description |
 |----------|------------|-------------|
-| `bedrock/amazon.titan-embed-text-v2:0` | 1024 | Latest Titan (recommended) |
-| `bedrock/amazon.titan-embed-text-v1` | 1536 | Original Titan |
+| `bedrock/amazon.titan-embed-text-v2:0` | 1024 | Latest Titan text embedding (recommended) |
+| `bedrock/amazon.titan-embed-text-v1` | 1536 | Original Titan text embedding |
+| `bedrock/amazon.titan-embed-image-v1` | 1024 | Titan multimodal (text + image) embedding |
 | `bedrock/cohere.embed-english-v3` | 1024 | English-focused |
 | `bedrock/cohere.embed-multilingual-v3` | 1024 | Multilingual |
+| `bedrock/cohere.embed-v4:0` | 1024 | Cohere Embed v4 — text + image |
 
 #### Enabling Bedrock Models
 
@@ -233,38 +258,27 @@ Your IAM role/user needs these permissions:
 }
 ```
 
-#### Docker Configuration
+#### Docker
 
-The Docker image supports two build targets:
-
-- **`standard`** (default): OpenAI/Anthropic support only
-- **`aws`**: Includes AWS Bedrock support
+The Dockerfile has a dedicated `aws` build target, and `docker-compose.yml` provides an `aws` profile with pre-built AWS images:
 
 ```bash
-# Build AWS-enabled image
+# Build the AWS-enabled image directly
 docker build --target aws -t agent-memory-server:aws .
 
-# Or with Docker Compose
-DOCKER_TARGET=aws docker-compose up --build
+# Or use Docker Compose with the aws profile
+docker-compose --profile aws up
 ```
 
-When running, pass AWS credentials:
+Pass AWS credentials at runtime:
 
 ```bash
-docker run -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY -e AWS_REGION_NAME \
+docker run -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \
+  -e REGION_NAME=us-east-1 -e AWS_REGION_NAME=us-east-1 \
   -e GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 \
   -e EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 \
   -e REDISVL_VECTOR_DIMENSIONS=1024 \
-  agent-memory-server:aws
-```
-
-Or mount credentials:
-
-```bash
-docker run -v ~/.aws:/root/.aws:ro \
-  -e AWS_PROFILE=my-profile \
-  -e AWS_REGION_NAME=us-east-1 \
-  agent-memory-server:aws
+  -p 8000:8000 redislabs/agent-memory-server-aws:latest
 ```
 
 #### Complete Example
@@ -273,17 +287,22 @@ Full Bedrock stack (keep all AI operations within AWS):
 
 ```bash
 # AWS credentials
+export REGION_NAME=us-east-1
 export AWS_REGION_NAME=us-east-1
 export AWS_ACCESS_KEY_ID=...
 export AWS_SECRET_ACCESS_KEY=...
 
-# Embeddings (bedrock/ prefix required)
+# Embeddings (bedrock/ prefix REQUIRED)
 export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
 export REDISVL_VECTOR_DIMENSIONS=1024
 
 # Generation (no prefix needed)
 export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
-export FAST_MODEL=anthropic.claude-3-5-haiku-20241022-v1:0
+export FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0
+
+# Start Redis and the server
+docker-compose up -d redis
+uv run agent-memory api
 ```
 
 ### Ollama (Local Models)
@@ -343,11 +362,11 @@ export EMBEDDING_MODEL=text-embedding-3-small  # OpenAI
 
 | Variable | Description | Default |
 |----------|-------------|---------|
-| `GENERATION_MODEL` | Primary model for AI tasks | `gpt-4o-mini` |
-| `FAST_MODEL` | Fast model for topic extraction, etc. | Same as `GENERATION_MODEL` |
-| `QUERY_OPTIMIZATION_MODEL` | Model for query optimization | Same as `GENERATION_MODEL` |
+| `GENERATION_MODEL` | Primary model for AI tasks | `gpt-5` |
+| `FAST_MODEL` | Fast model for topic extraction, etc. | `gpt-5-mini` |
+| `SLOW_MODEL` | Slower, more capable model for complex tasks | `gpt-5` |
 | `EMBEDDING_MODEL` | Model for vector embeddings | `text-embedding-3-small` |
-| `REDISVL_VECTOR_DIMENSIONS` | Override embedding dimensions | Auto-detected |
+| `REDISVL_VECTOR_DIMENSIONS` | Override embedding dimensions | `1536` (auto-detected for known models) |
 
 ### Model Validation
 
@@ -381,7 +400,11 @@ export REDISVL_VECTOR_DIMENSIONS=1024
 **Bedrock "Access Denied"**
 - Verify IAM permissions include `bedrock:InvokeModel`
 - Check model is enabled in your AWS region
-- Ensure correct `AWS_REGION_NAME`
+- Ensure both `REGION_NAME` and `AWS_REGION_NAME` are set correctly
+- See [AWS Bedrock Troubleshooting](aws-bedrock.md#troubleshooting) for more details
+
+**Bedrock "AWS-related dependencies might be missing"**
+- Install the `[aws]` extra: `pip install agent-memory-server[aws]` or `uv sync --extra aws`
 
 ### Debug Logging
 
@@ -417,6 +440,7 @@ The following are no longer required:
 
 ## See Also
 
+- [AWS Bedrock Models](aws-bedrock.md) - Complete Bedrock guide with Quick Start, IAM policies, and Docker setup
 - [Embedding Providers](embedding-providers.md) - Detailed embedding configuration
 - [Configuration](configuration.md) - All environment variables
 - [Query Optimization](query-optimization.md) - Model selection for query optimization

From f1dded2f7a0417d8996847ce4dd2f8beb3717839 Mon Sep 17 00:00:00 2001
From: Nitin Kanukolanu <nitinkanukolanu@gmail.com>
Date: Thu, 16 Apr 2026 10:10:42 -0400
Subject: [PATCH 2/4] fix: update stale docstring in langchain integration to
 use create_agent

The module docstring still used create_tool_calling_agent and AgentExecutor,
the legacy agent API that LangChain 1.0 moved out of langchain.agents.
Updated it to use create_agent from langchain.agents, the current API.
---
 .../integrations/langchain.py                  | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/agent-memory-client/agent_memory_client/integrations/langchain.py b/agent-memory-client/agent_memory_client/integrations/langchain.py
index be9ec46b..2675617b 100644
--- a/agent-memory-client/agent_memory_client/integrations/langchain.py
+++ b/agent-memory-client/agent_memory_client/integrations/langchain.py
@@ -8,9 +8,8 @@
     ```python
     from agent_memory_client import create_memory_client
     from agent_memory_client.integrations.langchain import get_memory_tools
-    from langchain.agents import create_tool_calling_agent, AgentExecutor
+    from langchain.agents import create_agent
     from langchain_openai import ChatOpenAI
-    from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
 
     # Initialize memory client
     memory_client = await create_memory_client("http://localhost:8000")
@@ -24,16 +23,15 @@
 
     # Use with LangChain agent
     llm = ChatOpenAI(model="gpt-4o")
-    prompt = ChatPromptTemplate.from_messages([
-        ("system", "You are a helpful assistant with memory."),
-        ("human", "{input}"),
-        MessagesPlaceholder("agent_scratchpad"),
-    ])
-    agent = create_tool_calling_agent(llm, tools, prompt)
-    executor = AgentExecutor(agent=agent, tools=tools)
+    agent = create_agent(
+        llm, tools,
+        system_prompt="You are a helpful assistant with memory."
+    )
 
     # Run the agent
-    result = await executor.ainvoke({"input": "Remember that I love pizza"})
+    result = await agent.ainvoke(
+        {"messages": [("human", "Remember that I love pizza")]}
+    )
     ```
 """
 

From 7cd01fb30a08e4748f34f5a60d0e0366af987644 Mon Sep 17 00:00:00 2001
From: Nitin Kanukolanu <nitinkanukolanu@gmail.com>
Date: Thu, 16 Apr 2026 10:28:11 -0400
Subject: [PATCH 3/4] docs: address PR review feedback from Copilot

- Fix REGION_NAME vs AWS_REGION_NAME: clarify AWS_REGION_NAME is required
  (LiteLLM), REGION_NAME is optional (server-side boto3 model-existence checks)
- Fix IAM role section: note that server boto3 utilities currently require
  explicit credentials, but LiteLLM handles IAM roles natively
- Fix REDISVL_VECTOR_DIMENSIONS description: it is a fallback/override, not
  auto-detected
- Add asterisk notes on embedding models not in MODEL_CONFIGS (titan-embed-image,
  cohere.embed-v4) requiring explicit REDISVL_VECTOR_DIMENSIONS
---
 docs/aws-bedrock.md   | 24 ++++++++++++++----------
 docs/llm-providers.md | 22 +++++++++++-----------
 2 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/docs/aws-bedrock.md b/docs/aws-bedrock.md
index a778c561..4e64cb84 100644
--- a/docs/aws-bedrock.md
+++ b/docs/aws-bedrock.md
@@ -27,8 +27,8 @@ uv sync --extra aws
 # (or use an IAM role / AWS CLI profile instead — see "AWS Credentials" below)
 export AWS_ACCESS_KEY_ID=your-access-key-id
 export AWS_SECRET_ACCESS_KEY=your-secret-access-key
-export REGION_NAME=us-east-1          # Used by the server's own model-validation client
-export AWS_REGION_NAME=us-east-1      # Used by LiteLLM for Bedrock API calls
+export AWS_REGION_NAME=us-east-1      # Required: used by LiteLLM for Bedrock API calls
+export REGION_NAME=us-east-1          # Optional: used by server-side boto3 utilities (model-existence checks)
 
 # ── Bedrock embedding model (bedrock/ prefix REQUIRED) ──────────
 export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
@@ -67,10 +67,10 @@ The server reads the AWS region in **two** places:
 
 | Variable | Read by | Purpose |
 |----------|---------|---------|
-| `REGION_NAME` | The server's Settings (pydantic-settings) | Creating `boto3` sessions for model-existence validation |
-| `AWS_REGION_NAME` | LiteLLM | Making the actual Bedrock inference API calls |
+| `AWS_REGION_NAME` | LiteLLM | **Required.** Making the actual Bedrock inference and embedding API calls |
+| `REGION_NAME` | The server's Settings (pydantic-settings) | **Optional.** Used only by server-side `boto3` utilities that check whether a Bedrock model exists (`_aws/utils.py`) |
 
-Set **both** to the same value to avoid surprises. If you rely solely on an IAM role or AWS CLI profile, `boto3` and LiteLLM may auto-detect the region from the instance metadata or `~/.aws/config`, but explicitly setting the variables is recommended.
+If you only use LiteLLM for Bedrock (the common case), `AWS_REGION_NAME` is sufficient. Set `REGION_NAME` as well if you want the server's optional model-existence checks to work. When both are used, set them to the same value.
 
 ---
 
@@ -97,10 +97,12 @@ All LLM operations in the server go through [LiteLLM](https://docs.litellm.ai/),
 |----------|----------|------------|-------------|
 | `bedrock/amazon.titan-embed-text-v2:0` | Amazon | 1024 | Latest Titan text embedding (recommended) |
 | `bedrock/amazon.titan-embed-text-v1` | Amazon | 1536 | Original Titan text embedding |
-| `bedrock/amazon.titan-embed-image-v1` | Amazon | 1024 | Titan multimodal (text + image) embedding |
+| `bedrock/amazon.titan-embed-image-v1` | Amazon | 1024 | Titan multimodal (text + image) embedding * |
 | `bedrock/cohere.embed-english-v3` | Cohere | 1024 | English-focused embeddings |
 | `bedrock/cohere.embed-multilingual-v3` | Cohere | 1024 | Multilingual embeddings |
-| `bedrock/cohere.embed-v4:0` | Cohere | 1024 | Cohere Embed v4 — text + image embedding |
+| `bedrock/cohere.embed-v4:0` | Cohere | 1024 | Cohere Embed v4 — text + image embedding * |
+
+> \* Models marked with **\*** are not in `MODEL_CONFIGS` and their dimensions cannot be auto-resolved. You **must** set `REDISVL_VECTOR_DIMENSIONS=1024` explicitly when using them, or you will get vector-size mismatch errors at runtime.
 
 ### Generation Models (Anthropic Claude on Bedrock)
 
@@ -147,8 +149,8 @@ Configure the following environment variables to use Bedrock models:
 
 ```bash
 # Required: AWS region where Bedrock is available
-REGION_NAME=us-east-1            # For the server's own boto3 sessions
-AWS_REGION_NAME=us-east-1        # For LiteLLM's Bedrock calls
+AWS_REGION_NAME=us-east-1        # Required: for LiteLLM's Bedrock calls
+REGION_NAME=us-east-1            # Optional: for server-side boto3 model-existence checks
 
 # For Bedrock Embedding Models (bedrock/ prefix REQUIRED)
 EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
@@ -185,7 +187,9 @@ aws_secret_access_key = your-secret-access-key
 
 #### Option 3: IAM Role (Recommended for AWS deployments)
 
-When running on AWS infrastructure (EC2, ECS, Lambda, etc.), use IAM roles for automatic credential management. No explicit credentials are needed — `boto3` and LiteLLM will discover credentials from the instance metadata service.
+When running on AWS infrastructure (EC2, ECS, Lambda, etc.), IAM roles provide automatic credential management. LiteLLM will discover credentials from the instance metadata service automatically.
+
+> **Note:** The server's built-in `boto3` model-existence utilities (`_aws/utils.py`) currently require explicit `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. If you rely solely on an IAM role, LiteLLM Bedrock calls will work, but the optional model-existence checks will be skipped. This limitation may be removed in a future release.
 
 #### Option 4: AWS SSO / AWS CLI Profile
 
diff --git a/docs/llm-providers.md b/docs/llm-providers.md
index f4510aa9..91278d75 100644
--- a/docs/llm-providers.md
+++ b/docs/llm-providers.md
@@ -50,8 +50,8 @@ export EMBEDDING_MODEL=text-embedding-3-small   # Use OpenAI for embeddings
 # AWS Bedrock (full stack — see "AWS Bedrock" section for details)
 export AWS_ACCESS_KEY_ID=...
 export AWS_SECRET_ACCESS_KEY=...
-export REGION_NAME=us-east-1                                      # Server's own boto3 sessions
-export AWS_REGION_NAME=us-east-1                                  # LiteLLM Bedrock calls
+export AWS_REGION_NAME=us-east-1                                  # Required: LiteLLM Bedrock calls
+export REGION_NAME=us-east-1                                      # Optional: server-side boto3 utilities
 export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 # No prefix for generation
 export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0       # bedrock/ prefix REQUIRED
 export REDISVL_VECTOR_DIMENSIONS=1024                             # Must match embedding model
@@ -152,17 +152,15 @@ Bedrock uses standard AWS credentials. Configure using any of these methods:
 # Option 1: Environment variables (recommended for development)
 export AWS_ACCESS_KEY_ID=AKIA...
 export AWS_SECRET_ACCESS_KEY=...
-export REGION_NAME=us-east-1        # Server's own boto3 sessions (model validation)
-export AWS_REGION_NAME=us-east-1    # LiteLLM's Bedrock API calls
+export AWS_REGION_NAME=us-east-1    # Required: LiteLLM's Bedrock API calls
+export REGION_NAME=us-east-1        # Optional: server-side boto3 model-existence checks
 
 # Option 2: AWS CLI profile
 export AWS_PROFILE=my-profile
-export REGION_NAME=us-east-1
 export AWS_REGION_NAME=us-east-1
 
 # Option 3: IAM role (recommended for production on AWS)
-# No credentials needed — uses instance/container role
-export REGION_NAME=us-east-1
+# No explicit credentials needed — LiteLLM discovers them from instance metadata
 export AWS_REGION_NAME=us-east-1
 
 # Option 4: AWS SSO
@@ -170,7 +168,7 @@ aws sso login --profile your-profile
 export AWS_PROFILE=your-profile
 ```
 
-> **Why two region variables?** The server reads `REGION_NAME` (via pydantic-settings) for its own `boto3` model-validation client, while LiteLLM reads `AWS_REGION_NAME` for the actual Bedrock inference calls. Set both to the same value.
+> **Why two region variables?** `AWS_REGION_NAME` is required for LiteLLM's Bedrock inference calls. `REGION_NAME` is only needed if you use the server's optional `boto3`-based model-existence checks (`_aws/utils.py`). When both are set, use the same value.
 
 #### Generation Models
 
@@ -219,10 +217,12 @@ export EMBEDDING_MODEL=amazon.titan-embed-text-v2:0
 |----------|------------|-------------|
 | `bedrock/amazon.titan-embed-text-v2:0` | 1024 | Latest Titan text embedding (recommended) |
 | `bedrock/amazon.titan-embed-text-v1` | 1536 | Original Titan text embedding |
-| `bedrock/amazon.titan-embed-image-v1` | 1024 | Titan multimodal (text + image) embedding |
+| `bedrock/amazon.titan-embed-image-v1` | 1024 | Titan multimodal (text + image) embedding * |
 | `bedrock/cohere.embed-english-v3` | 1024 | English-focused |
 | `bedrock/cohere.embed-multilingual-v3` | 1024 | Multilingual |
-| `bedrock/cohere.embed-v4:0` | 1024 | Cohere Embed v4 — text + image |
+| `bedrock/cohere.embed-v4:0` | 1024 | Cohere Embed v4 — text + image * |
+
+> \* Models marked with **\*** are not in `MODEL_CONFIGS`, so their dimensions cannot be auto-resolved. Set `REDISVL_VECTOR_DIMENSIONS=1024` explicitly when using them.
 
 #### Enabling Bedrock Models
 
@@ -366,7 +366,7 @@ export EMBEDDING_MODEL=text-embedding-3-small  # OpenAI
 | `FAST_MODEL` | Fast model for topic extraction, etc. | `gpt-5-mini` |
 | `SLOW_MODEL` | Slower, more capable model for complex tasks | `gpt-5` |
 | `EMBEDDING_MODEL` | Model for vector embeddings | `text-embedding-3-small` |
-| `REDISVL_VECTOR_DIMENSIONS` | Override embedding dimensions | `1536` (auto-detected for known models) |
+| `REDISVL_VECTOR_DIMENSIONS` | Fallback/override for embedding dimensions when they cannot be resolved from `MODEL_CONFIGS` | `1536` |
 
 ### Model Validation
 

From 51b0264624febe4b8c41bb40f61153c8fdc70f84 Mon Sep 17 00:00:00 2001
From: Nitin Kanukolanu <nitinkanukolanu@gmail.com>
Date: Thu, 16 Apr 2026 10:35:02 -0400
Subject: [PATCH 4/4] docs: address reviewer feedback

- Clarify bedrock/ prefix is optional (not prohibited) for generation models
- Remove boto3/botocore version ranges from docs
- Rename LLM Proxy to LLM Proxy (LiteLLM) in architecture diagram
- Update wording from 'no prefix needed' to 'prefix optional'
---
 docs/aws-bedrock.md   | 10 +++++-----
 docs/llm-providers.md |  8 ++++----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/aws-bedrock.md b/docs/aws-bedrock.md
index 4e64cb84..283d4cfd 100644
--- a/docs/aws-bedrock.md
+++ b/docs/aws-bedrock.md
@@ -81,9 +81,9 @@ All LLM operations in the server go through [LiteLLM](https://docs.litellm.ai/),
 | Model type | Prefix required? | Example |
 |------------|------------------|---------|
 | **Embedding** | **Yes** — must include `bedrock/` | `bedrock/amazon.titan-embed-text-v2:0` |
-| **Generation (chat)** | **No** — Bedrock model IDs are recognized automatically | `anthropic.claude-sonnet-4-5-20250929-v1:0` |
+| **Generation (chat)** | **Optional** — not required, but allowed | `anthropic.claude-sonnet-4-5-20250929-v1:0` |
 
-**Why the difference?** LiteLLM can infer the provider for generation models from the Bedrock-style model ID (e.g., `anthropic.claude-*`). For embedding models, however, the `bedrock/` prefix is the only way LiteLLM distinguishes a Bedrock embedding call from other providers. If you omit the prefix on an embedding model, the server will auto-add it and emit a **deprecation warning** — but this behaviour will be removed in a future release.
+**Why the difference?** LiteLLM can infer the provider for generation models from the Bedrock-style model ID (e.g., `anthropic.claude-*`), so the `bedrock/` prefix is optional for generation. Adding it (e.g., `bedrock/anthropic.claude-sonnet-4-5-20250929-v1:0`) also works. For embedding models, the `bedrock/` prefix is the only way LiteLLM distinguishes a Bedrock embedding call from other providers, so it is **required**. If you omit the prefix on an embedding model, the server will auto-add it and emit a **deprecation warning**, but this fallback will be removed in a future release.
 
 ---
 
@@ -136,8 +136,8 @@ uv sync --extra aws
 
 This installs:
 
-- **`boto3`** (`>=1.42.1,<2.0.0`) — AWS SDK for Python
-- **`botocore`** (`>=1.42.1,<2.0.0`) — Low-level AWS client library
+- **`boto3`** — AWS SDK for Python
+- **`botocore`** — Low-level AWS client library
 
 > **Without these packages**, any attempt to use Bedrock models will fail at import time. The standard install (`pip install agent-memory-server`) does **not** include them.
 
@@ -156,7 +156,7 @@ REGION_NAME=us-east-1            # Optional: for server-side boto3 model-existen
 EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0
 REDISVL_VECTOR_DIMENSIONS=1024   # Must match the embedding model's output dimensions
 
-# For Bedrock LLM Generation Models (no prefix needed)
+# For Bedrock LLM Generation Models (bedrock/ prefix optional)
 GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0
 FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0
 
diff --git a/docs/llm-providers.md b/docs/llm-providers.md
index 91278d75..f0021c77 100644
--- a/docs/llm-providers.md
+++ b/docs/llm-providers.md
@@ -16,7 +16,7 @@ All LLM operations go through a single `LLMClient` abstraction:
 │           └──────────┬───────────────┘                   │
 │                      ▼                                   │
 │               ┌──────────────┐                           │
-│               │   LLM Proxy  │                           │
+│               │LLM Proxy (LiteLLM)│                        │
 │               └──────┬───────┘                           │
 └──────────────────────┼───────────────────────────────────┘
                        ▼
@@ -52,7 +52,7 @@ export AWS_ACCESS_KEY_ID=...
 export AWS_SECRET_ACCESS_KEY=...
 export AWS_REGION_NAME=us-east-1                                  # Required: LiteLLM Bedrock calls
 export REGION_NAME=us-east-1                                      # Optional: server-side boto3 utilities
-export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 # No prefix for generation
+export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 # bedrock/ prefix optional for generation
 export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0       # bedrock/ prefix REQUIRED
 export REDISVL_VECTOR_DIMENSIONS=1024                             # Must match embedding model
 ```
@@ -134,7 +134,7 @@ AWS Bedrock provides access to foundation models from multiple providers (Anthro
 
 #### Installation
 
-AWS Bedrock support requires the `[aws]` extra, which installs `boto3` (`>=1.42.1`) and `botocore` (`>=1.42.1`). Without these packages, any Bedrock operation will fail at import time.
+AWS Bedrock support requires the `[aws]` extra, which installs `boto3` and `botocore`. Without these packages, any Bedrock operation will fail at import time.
 
 ```bash
 # With pip
@@ -172,7 +172,7 @@ export AWS_PROFILE=your-profile
 
 #### Generation Models
 
-Generation models use Bedrock-native model IDs **without** a prefix — LiteLLM recognises them automatically.
+Generation models use Bedrock-native model IDs. The `bedrock/` prefix is **optional** for generation: LiteLLM recognizes Bedrock model IDs automatically, and the prefixed form works as well.
 
 ```bash
 # Claude models on Bedrock (no prefix needed)