From 8b820f2c28b59449a0ca66159316c7f6f48849b1 Mon Sep 17 00:00:00 2001 From: Nitin Kanukolanu Date: Thu, 16 Apr 2026 10:05:22 -0400 Subject: [PATCH 1/4] docs: update AWS Bedrock documentation with comprehensive coverage - Add Quick Start section to aws-bedrock.md for running a full Bedrock instance - Document bedrock/ prefix requirement for embedding models vs no prefix for generation - Add installation instructions for the [aws] extra (boto3, botocore) - Fix Docker Compose instructions to use --profile aws instead of DOCKER_TARGET - Fix region env var references (REGION_NAME + AWS_REGION_NAME) - Add latest Anthropic Claude models (Opus 4.6, Sonnet 4.6, Opus 4.1, Sonnet 4, etc.) - Add additional embedding models (Titan multimodal, Cohere Embed v4) - Fix default GENERATION_MODEL in llm-providers.md to match config.py - Add hybrid config examples (Bedrock embeddings + OpenAI generation and vice versa) - Add troubleshooting entries for missing AWS dependencies - Cross-reference between aws-bedrock.md and llm-providers.md --- docs/aws-bedrock.md | 297 +++++++++++++++++++++++++++--------------- docs/llm-providers.md | 140 +++++++++++--------- 2 files changed, 273 insertions(+), 164 deletions(-) diff --git a/docs/aws-bedrock.md b/docs/aws-bedrock.md index 6bc501e8..a778c561 100644 --- a/docs/aws-bedrock.md +++ b/docs/aws-bedrock.md @@ -1,60 +1,143 @@ # AWS Bedrock Models -> **Note:** This documentation has been consolidated into [LLM Providers](llm-providers.md#aws-bedrock). -> This page is kept for reference but the LLM Providers guide is the authoritative source. - The Redis Agent Memory Server supports [Amazon Bedrock](https://aws.amazon.com/bedrock/) for both **embedding models** and **LLM generation models**. This allows you to use AWS-native AI models while keeping your data within the AWS ecosystem. 
-## Quick Reference +> **See also:** [LLM Providers](llm-providers.md#aws-bedrock) for a broader overview of all supported providers, and [Embedding Providers](embedding-providers.md) for embedding-specific configuration. + +## Quick Start — Run a Full Bedrock-Backed Instance + +Follow these steps to get the memory server running entirely on AWS Bedrock in under five minutes. + +### 1. Install the `[aws]` extra + +The server's core install does **not** include AWS SDK libraries. You must install the `[aws]` extra so that `boto3` (AWS SDK for Python) and `botocore` (low-level AWS client library) are available at runtime. Without these packages, any Bedrock operation will fail with an import error. + +```bash +# With pip +pip install agent-memory-server[aws] + +# With uv (used by this project) +uv sync --extra aws +``` + +### 2. Export environment variables + +```bash +# ── AWS credentials ────────────────────────────────────────────── +# (or use an IAM role / AWS CLI profile instead — see "AWS Credentials" below) +export AWS_ACCESS_KEY_ID=your-access-key-id +export AWS_SECRET_ACCESS_KEY=your-secret-access-key +export REGION_NAME=us-east-1 # Used by the server's own model-validation client +export AWS_REGION_NAME=us-east-1 # Used by LiteLLM for Bedrock API calls + +# ── Bedrock embedding model (bedrock/ prefix REQUIRED) ────────── +export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 +export REDISVL_VECTOR_DIMENSIONS=1024 # Must match the embedding model + +# ── Bedrock generation models (NO prefix needed) ──────────────── +export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 +export FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 + +# ── Redis ──────────────────────────────────────────────────────── +export REDIS_URL=redis://localhost:6379 +``` + +### 3. 
Start Redis and the server + +```bash +# Start Redis (requires Docker) +docker-compose up redis -d + +# Start the memory server +uv run agent-memory api +``` + +The REST API is now available at `http://localhost:8000` (docs at `/docs`). + +> **Tip:** To also run background tasks (memory extraction, compaction, etc.), start a worker in a second terminal: +> ```bash +> uv run agent-memory task-worker +> ``` + +--- -For complete AWS Bedrock configuration, see [LLM Providers - AWS Bedrock](llm-providers.md#aws-bedrock). +## Why Two Region Variables? -**Key points:** -- All LLM operations use [LiteLLM](https://docs.litellm.ai/) internally -- Bedrock embedding models require the `bedrock/` prefix (e.g., `bedrock/amazon.titan-embed-text-v2:0`) -- Bedrock generation models do not need a prefix (e.g., `anthropic.claude-sonnet-4-5-20250929-v1:0`) -- The `[aws]` extra installs `boto3` and `botocore` for AWS authentication +The server reads the AWS region in **two** places: -## Overview +| Variable | Read by | Purpose | +|----------|---------|---------| +| `REGION_NAME` | The server's Settings (pydantic-settings) | Creating `boto3` sessions for model-existence validation | +| `AWS_REGION_NAME` | LiteLLM | Making the actual Bedrock inference API calls | -Amazon Bedrock provides access to a wide variety of foundation models from leading AI providers. The Redis Agent Memory Server supports using Bedrock for: +Set **both** to the same value to avoid surprises. If you rely solely on an IAM role or AWS CLI profile, `boto3` and LiteLLM may auto-detect the region from the instance metadata or `~/.aws/config`, but explicitly setting the variables is recommended. -1. **Embedding Models** - For semantic search and memory retrieval -2. **LLM Generation Models** - For memory extraction, summarization, and topic modeling +--- -### Supported Embedding Models +## Understanding the `bedrock/` Prefix -> **Important:** Use the `bedrock/` prefix for embedding models. 
+All LLM operations in the server go through [LiteLLM](https://docs.litellm.ai/), which uses a provider prefix to route requests to the correct backend. + +| Model type | Prefix required? | Example | +|------------|------------------|---------| +| **Embedding** | **Yes** — must include `bedrock/` | `bedrock/amazon.titan-embed-text-v2:0` | +| **Generation (chat)** | **No** — Bedrock model IDs are recognized automatically | `anthropic.claude-sonnet-4-5-20250929-v1:0` | + +**Why the difference?** LiteLLM can infer the provider for generation models from the Bedrock-style model ID (e.g., `anthropic.claude-*`). For embedding models, however, the `bedrock/` prefix is the only way LiteLLM distinguishes a Bedrock embedding call from other providers. If you omit the prefix on an embedding model, the server will auto-add it and emit a **deprecation warning** — but this behaviour will be removed in a future release. + +--- + +## Supported Models + +### Embedding Models + +> **Important:** Always use the `bedrock/` prefix for embedding models. 
| Model ID | Provider | Dimensions | Description | |----------|----------|------------|-------------| -| `bedrock/amazon.titan-embed-text-v2:0` | Amazon | 1024 | Latest Titan embedding model | -| `bedrock/amazon.titan-embed-text-v1` | Amazon | 1536 | Original Titan embedding model | +| `bedrock/amazon.titan-embed-text-v2:0` | Amazon | 1024 | Latest Titan text embedding (recommended) | +| `bedrock/amazon.titan-embed-text-v1` | Amazon | 1536 | Original Titan text embedding | +| `bedrock/amazon.titan-embed-image-v1` | Amazon | 1024 | Titan multimodal (text + image) embedding | | `bedrock/cohere.embed-english-v3` | Cohere | 1024 | English-focused embeddings | | `bedrock/cohere.embed-multilingual-v3` | Cohere | 1024 | Multilingual embeddings | +| `bedrock/cohere.embed-v4:0` | Cohere | 1024 | Cohere Embed v4 — text + image embedding | -### Pre-configured LLM Generation Models +### Generation Models (Anthropic Claude on Bedrock) -The following models are pre-configured in the codebase: +The following Anthropic Claude models are available on Bedrock. Models marked **pre-configured** have entries in `MODEL_CONFIGS` (in `config.py`) with validated token limits; the others are fully usable by setting the corresponding environment variable — LiteLLM routes the request based on the Bedrock model ID. 
-| Model ID | Provider | Max Tokens | Description | -|----------|----------|------------|-------------| -| `anthropic.claude-sonnet-4-5-20250929-v1:0` | Anthropic | 200,000 | Claude 4.5 Sonnet | -| `anthropic.claude-haiku-4-5-20251001-v1:0` | Anthropic | 200,000 | Claude 4.5 Haiku | -| `anthropic.claude-opus-4-5-20251101-v1:0` | Anthropic | 200,000 | Claude 4.5 Opus | +| Model ID | Description | Pre-configured | +|----------|-------------|:--------------:| +| `anthropic.claude-opus-4-6-v1` | Claude Opus 4.6 — latest and most capable | | +| `anthropic.claude-sonnet-4-6` | Claude Sonnet 4.6 | | +| `anthropic.claude-opus-4-5-20251101-v1:0` | Claude Opus 4.5 | ✓ | +| `anthropic.claude-sonnet-4-5-20250929-v1:0` | Claude Sonnet 4.5 | ✓ | +| `anthropic.claude-haiku-4-5-20251001-v1:0` | Claude Haiku 4.5 — fast & cost-effective | ✓ | +| `anthropic.claude-opus-4-1-20250805-v1:0` | Claude Opus 4.1 | | +| `anthropic.claude-sonnet-4-20250514-v1:0` | Claude Sonnet 4 | | +| `anthropic.claude-3-5-haiku-20241022-v1:0` | Claude 3.5 Haiku | | +| `anthropic.claude-3-haiku-20240307-v1:0` | Claude 3 Haiku | | + +> **Tip:** For the recommended quick-start configuration, use `anthropic.claude-sonnet-4-5-20250929-v1:0` as `GENERATION_MODEL` and `anthropic.claude-haiku-4-5-20251001-v1:0` as `FAST_MODEL`. ## Installation -AWS Bedrock support requires additional dependencies. Install them with: +AWS Bedrock support requires additional dependencies. Install the `[aws]` extra: ```bash +# With pip pip install agent-memory-server[aws] + +# With uv (recommended — used by this project) +uv sync --extra aws ``` This installs: -- `boto3` - AWS SDK for Python -- `botocore` - Low-level AWS client library +- **`boto3`** (`>=1.42.1,<2.0.0`) — AWS SDK for Python +- **`botocore`** (`>=1.42.1,<2.0.0`) — Low-level AWS client library + +> **Without these packages**, any attempt to use Bedrock models will fail at import time. The standard install (`pip install agent-memory-server`) does **not** include them. 
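
A quick way to confirm that the `[aws]` extra is present before starting the server is to probe for the two packages it installs. The sketch below is illustrative only: the helper name `missing_aws_packages` is hypothetical and not part of the server, and it relies solely on the standard library's `importlib.util.find_spec`.

```python
import importlib.util

# Packages installed by the [aws] extra, per the list above.
AWS_PACKAGES = ("boto3", "botocore")


def missing_aws_packages(packages=AWS_PACKAGES):
    """Return the names of packages that cannot be found on the import path.

    Hypothetical helper for illustration; not part of agent-memory-server.
    """
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_aws_packages()
    if missing:
        print("Missing: " + ", ".join(missing) + ". Install the [aws] extra.")
    else:
        print("boto3 and botocore are importable.")
```

Running a check like this in a container entrypoint or CI step can surface a missing extra earlier than the first Bedrock call would.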
## Configuration @@ -64,10 +147,12 @@ Configure the following environment variables to use Bedrock models: ```bash # Required: AWS region where Bedrock is available -AWS_REGION_NAME=us-east-1 +REGION_NAME=us-east-1 # For the server's own boto3 sessions +AWS_REGION_NAME=us-east-1 # For LiteLLM's Bedrock calls -# For Bedrock Embedding Models (note: bedrock/ prefix required) +# For Bedrock Embedding Models (bedrock/ prefix REQUIRED) EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 +REDISVL_VECTOR_DIMENSIONS=1024 # Must match the embedding model's output dimensions # For Bedrock LLM Generation Models (no prefix needed) GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 @@ -100,7 +185,7 @@ aws_secret_access_key = your-secret-access-key #### Option 3: IAM Role (Recommended for AWS deployments) -When running on AWS infrastructure (EC2, ECS, Lambda, etc.), use IAM roles for automatic credential management. No explicit credentials are needed. +When running on AWS infrastructure (EC2, ECS, Lambda, etc.), use IAM roles for automatic credential management. No explicit credentials are needed — `boto3` and LiteLLM will discover credentials from the instance metadata service. #### Option 4: AWS SSO / AWS CLI Profile @@ -116,50 +201,70 @@ export AWS_PROFILE=your-profile ### Docker Configuration -The Docker image supports two build targets: +The Dockerfile provides two build targets (multi-stage): -- **`standard`** (default): OpenAI/Anthropic support only -- **`aws`**: Includes AWS Bedrock embedding models support +- **`standard`** (default) — OpenAI / Anthropic support only +- **`aws`** — Includes `boto3` and `botocore` for AWS Bedrock support #### Building the AWS-enabled Image ```bash # Build directly with Docker docker build --target aws -t agent-memory-server:aws . 
- -# Or use Docker Compose with the DOCKER_TARGET variable -DOCKER_TARGET=aws docker-compose up --build ``` -#### Docker Compose Configuration +#### Docker Compose -When using Docker Compose, set the `DOCKER_TARGET` environment variable to `aws`: +The `docker-compose.yml` ships with a dedicated **`aws` profile** that uses pre-built AWS images (`redislabs/agent-memory-server-aws`). Activate it with `--profile aws`: ```bash -# Start with AWS Bedrock support -DOCKER_TARGET=aws docker-compose up --build +# Start the full AWS stack (API + MCP + task worker + Redis) +docker-compose --profile aws up -# Or for the production-like setup -DOCKER_TARGET=aws docker-compose -f docker-compose-task-workers.yml up --build +# Or start only the API and Redis +docker-compose --profile aws up api-aws redis ``` -Create a `.env` file with your credentials and configuration: - -```bash -# Docker build target -DOCKER_TARGET=aws +> **Note:** The `aws` profile services (`api-aws`, `mcp-aws`, `task-worker-aws`) are separate from the standard services. Do **not** mix profiles — run either `docker-compose up` (standard) or `docker-compose --profile aws up`. -# Embedding model (note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 +Create a `.env` file with your credentials and model configuration. 
The Docker Compose AWS services read this file automatically: +```bash # AWS credentials +REGION_NAME=us-east-1 AWS_REGION_NAME=us-east-1 AWS_ACCESS_KEY_ID=your-access-key-id AWS_SECRET_ACCESS_KEY=your-secret-access-key -AWS_SESSION_TOKEN=your-session-token # Optional +AWS_SESSION_TOKEN=your-session-token # Optional, for temporary credentials + +# Embedding model (bedrock/ prefix required) +EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 +REDISVL_VECTOR_DIMENSIONS=1024 + +# Generation models (override the defaults in docker-compose.yml if desired) +GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 +FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 +``` + +You can also pass AWS credentials at runtime with `docker run`: + +```bash +docker run -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \ + -e REGION_NAME=us-east-1 -e AWS_REGION_NAME=us-east-1 \ + -e EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 \ + -e REDISVL_VECTOR_DIMENSIONS=1024 \ + -e GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 \ + -p 8000:8000 redislabs/agent-memory-server-aws:latest ``` -The Docker Compose files already include the AWS environment variables, so you only need to set them in your `.env` file or environment. 
+Or mount your AWS credentials directory: + +```bash +docker run -v ~/.aws:/root/.aws:ro \ + -e AWS_PROFILE=my-profile \ + -e REGION_NAME=us-east-1 -e AWS_REGION_NAME=us-east-1 \ + -p 8000:8000 redislabs/agent-memory-server-aws:latest +``` ## Required IAM Permissions @@ -224,17 +329,16 @@ REDISVL_VECTOR_DIMENSIONS=1536 ### Example 1: Bedrock Embeddings with OpenAI Generation ```bash -# Embedding model (Bedrock - note: bedrock/ prefix required) +# Embedding model (Bedrock — bedrock/ prefix required) EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 +REDISVL_VECTOR_DIMENSIONS=1024 # AWS Configuration +REGION_NAME=us-east-1 AWS_REGION_NAME=us-east-1 AWS_ACCESS_KEY_ID=your-access-key-id AWS_SECRET_ACCESS_KEY=your-secret-access-key -# Embedding dimensions (must match embedding model) -REDISVL_VECTOR_DIMENSIONS=1024 - # Generation model (OpenAI) GENERATION_MODEL=gpt-4o OPENAI_API_KEY=your-openai-key @@ -247,35 +351,51 @@ REDIS_URL=redis://localhost:6379 ```bash # AWS Configuration +REGION_NAME=us-east-1 AWS_REGION_NAME=us-east-1 AWS_ACCESS_KEY_ID=your-access-key-id AWS_SECRET_ACCESS_KEY=your-secret-access-key -# Embedding model (Bedrock Titan - note: bedrock/ prefix required) +# Embedding model (Bedrock Titan — bedrock/ prefix required) EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 REDISVL_VECTOR_DIMENSIONS=1024 -# Generation models (Bedrock Claude - no prefix needed) +# Generation models (Bedrock Claude — no prefix needed) GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 SLOW_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -TOPIC_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 # Other settings REDIS_URL=redis://localhost:6379 ``` +### Example 3: OpenAI Embeddings with Bedrock Generation + +```bash +# Embeddings via OpenAI +EMBEDDING_MODEL=text-embedding-3-small +OPENAI_API_KEY=your-openai-key + +# Generation via Bedrock +REGION_NAME=us-east-1 +AWS_REGION_NAME=us-east-1 
+AWS_ACCESS_KEY_ID=your-access-key-id +AWS_SECRET_ACCESS_KEY=your-secret-access-key +GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 + +REDIS_URL=redis://localhost:6379 +``` + ### YAML Configuration ```yaml # config.yaml - Full Bedrock Stack region_name: us-east-1 -embedding_model: amazon.titan-embed-text-v2:0 +embedding_model: bedrock/amazon.titan-embed-text-v2:0 # bedrock/ prefix required redisvl_vector_dimensions: 1024 generation_model: anthropic.claude-sonnet-4-5-20250929-v1:0 fast_model: anthropic.claude-haiku-4-5-20251001-v1:0 slow_model: anthropic.claude-sonnet-4-5-20250929-v1:0 -topic_model: anthropic.claude-haiku-4-5-20251001-v1:0 redis_url: redis://localhost:6379 ``` @@ -305,68 +425,33 @@ Before using a Bedrock model, you must enable it in the AWS Console: ## Mixing Providers -You can mix and match providers for different use cases: - -### Bedrock Embeddings with OpenAI Generation - -```bash -# Embeddings via Bedrock (note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 -AWS_REGION_NAME=us-east-1 - -# Generation via OpenAI -GENERATION_MODEL=gpt-4o -OPENAI_API_KEY=your-openai-key -``` - -### Full Bedrock Stack (Embeddings + Generation) - -```bash -# All AWS - keep everything within your AWS environment -AWS_REGION_NAME=us-east-1 - -# Embeddings via Bedrock (note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 -REDISVL_VECTOR_DIMENSIONS=1024 - -# Generation via Bedrock Claude (no prefix needed) -GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 -``` - -### OpenAI Embeddings with Bedrock Generation - -```bash -# Embeddings via OpenAI -EMBEDDING_MODEL=text-embedding-3-small -OPENAI_API_KEY=your-openai-key - -# Generation via Bedrock -AWS_REGION_NAME=us-east-1 -GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -``` +You can mix and match providers for embeddings and generation independently. 
See the [Complete Configuration Examples](#complete-configuration-examples) section above for full `.env` snippets covering each combination. This flexibility allows you to: + - Keep all data within AWS for compliance requirements -- Use the best model for each task +- Use the best model for each task (e.g., OpenAI embeddings + Bedrock generation) - Optimize costs by choosing appropriate models for different operations ## Troubleshooting ### "AWS-related dependencies might be missing" -Install the AWS extras: +You need to install the `[aws]` extra. The standard install does not include `boto3`: ```bash pip install agent-memory-server[aws] +# or with uv: +uv sync --extra aws ``` -### "Missing environment variable 'AWS_REGION_NAME'" +### "Missing environment variable 'REGION_NAME'" -Set the AWS region: +The server's Settings class reads the region from `REGION_NAME`. Set it: ```bash -export AWS_REGION_NAME=us-east-1 +export REGION_NAME=us-east-1 +export AWS_REGION_NAME=us-east-1 # Also set this for LiteLLM ``` ### "Bedrock embedding model not found" diff --git a/docs/llm-providers.md b/docs/llm-providers.md index 911a579c..f4510aa9 100644 --- a/docs/llm-providers.md +++ b/docs/llm-providers.md @@ -16,7 +16,7 @@ All LLM operations go through a single `LLMClient` abstraction: │ └──────────┬───────────────┘ │ │ ▼ │ │ ┌──────────────┐ │ -│ │ LiteLLM │ │ +│ │ LLM Proxy │ │ │ └──────┬───────┘ │ └──────────────────────┼───────────────────────────────────┘ ▼ @@ -41,19 +41,24 @@ export OPENAI_API_KEY=sk-... export GENERATION_MODEL=gpt-4o export EMBEDDING_MODEL=text-embedding-3-small -# Anthropic +# Anthropic (requires a separate embedding provider — Anthropic has no embedding models) export ANTHROPIC_API_KEY=sk-ant-... export GENERATION_MODEL=claude-3-5-sonnet-20241022 -export EMBEDDING_MODEL=text-embedding-3-small # Use OpenAI for embeddings +export OPENAI_API_KEY=sk-... 
# Needed for embeddings +export EMBEDDING_MODEL=text-embedding-3-small # Use OpenAI for embeddings -# AWS Bedrock +# AWS Bedrock (full stack — see "AWS Bedrock" section for details) export AWS_ACCESS_KEY_ID=... export AWS_SECRET_ACCESS_KEY=... -export AWS_REGION_NAME=us-east-1 -export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 +export REGION_NAME=us-east-1 # Server's own boto3 sessions +export AWS_REGION_NAME=us-east-1 # LiteLLM Bedrock calls +export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 # No prefix for generation +export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 # bedrock/ prefix REQUIRED +export REDISVL_VECTOR_DIMENSIONS=1024 # Must match embedding model ``` +> **Bedrock users:** You must also install the `[aws]` extra (`pip install agent-memory-server[aws]` or `uv sync --extra aws`) to get `boto3` and `botocore`. See [AWS Bedrock](aws-bedrock.md) for a full walkthrough. + ## Supported Providers ### Generation Models (Chat Completions) @@ -125,15 +130,19 @@ export EMBEDDING_MODEL=text-embedding-3-small AWS Bedrock provides access to foundation models from multiple providers (Anthropic Claude, Amazon Titan, Cohere, etc.) through AWS infrastructure. +> **Full guide:** See [AWS Bedrock Models](aws-bedrock.md) for a complete walkthrough including a Quick Start demo, Docker Compose instructions, IAM policies, and troubleshooting. + #### Installation -AWS Bedrock support requires additional dependencies: +AWS Bedrock support requires the `[aws]` extra, which installs `boto3` (`>=1.42.1`) and `botocore` (`>=1.42.1`). Without these packages, any Bedrock operation will fail at import time. ```bash +# With pip pip install agent-memory-server[aws] -``` -This installs `boto3` and `botocore` for AWS authentication. +# With uv (recommended — used by this project) +uv sync --extra aws +``` #### Authentication @@ -143,14 +152,17 @@ Bedrock uses standard AWS credentials. 
Configure using any of these methods: # Option 1: Environment variables (recommended for development) export AWS_ACCESS_KEY_ID=AKIA... export AWS_SECRET_ACCESS_KEY=... -export AWS_REGION_NAME=us-east-1 +export REGION_NAME=us-east-1 # Server's own boto3 sessions (model validation) +export AWS_REGION_NAME=us-east-1 # LiteLLM's Bedrock API calls # Option 2: AWS CLI profile export AWS_PROFILE=my-profile +export REGION_NAME=us-east-1 export AWS_REGION_NAME=us-east-1 # Option 3: IAM role (recommended for production on AWS) -# No credentials needed - uses instance/container role +# No credentials needed — uses instance/container role +export REGION_NAME=us-east-1 export AWS_REGION_NAME=us-east-1 # Option 4: AWS SSO @@ -158,46 +170,59 @@ aws sso login --profile your-profile export AWS_PROFILE=your-profile ``` +> **Why two region variables?** The server reads `REGION_NAME` (via pydantic-settings) for its own `boto3` model-validation client, while LiteLLM reads `AWS_REGION_NAME` for the actual Bedrock inference calls. Set both to the same value. + #### Generation Models +Generation models use Bedrock-native model IDs **without** a prefix — LiteLLM recognises them automatically. 
+ ```bash -# Claude models on Bedrock (no prefix needed for generation) +# Claude models on Bedrock (no prefix needed) export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -export FAST_MODEL=anthropic.claude-3-5-haiku-20241022-v1:0 - -# Amazon Titan -export GENERATION_MODEL=amazon.titan-text-premier-v1:0 +export FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 ``` -**Supported Bedrock generation models:** -- `anthropic.claude-sonnet-4-5-20250929-v1:0` (recommended) -- `anthropic.claude-3-5-sonnet-20241022-v2:0` -- `anthropic.claude-3-5-haiku-20241022-v1:0` -- `anthropic.claude-3-opus-20240229-v1:0` -- `amazon.titan-text-premier-v1:0` -- `amazon.titan-text-express-v1` +**Anthropic Claude models on Bedrock** (models marked ✓ are pre-configured in `MODEL_CONFIGS`): + +| Model ID | Description | Pre-configured | +|----------|-------------|:--------------:| +| `anthropic.claude-opus-4-6-v1` | Claude Opus 4.6 — latest | | +| `anthropic.claude-sonnet-4-6` | Claude Sonnet 4.6 | | +| `anthropic.claude-opus-4-5-20251101-v1:0` | Claude Opus 4.5 | ✓ | +| `anthropic.claude-sonnet-4-5-20250929-v1:0` | Claude Sonnet 4.5 | ✓ | +| `anthropic.claude-haiku-4-5-20251001-v1:0` | Claude Haiku 4.5 — fast & cost-effective | ✓ | +| `anthropic.claude-opus-4-1-20250805-v1:0` | Claude Opus 4.1 | | +| `anthropic.claude-sonnet-4-20250514-v1:0` | Claude Sonnet 4 | | +| `anthropic.claude-3-5-haiku-20241022-v1:0` | Claude 3.5 Haiku | | +| `anthropic.claude-3-haiku-20240307-v1:0` | Claude 3 Haiku | | + +Any Bedrock model can be used by setting the environment variable — LiteLLM routes based on the model ID convention. #### Embedding Models -> **Important:** Bedrock embedding models require the `bedrock/` prefix. +> **Important:** Bedrock embedding models **require** the `bedrock/` prefix. + +LiteLLM needs the `bedrock/` prefix to distinguish Bedrock embeddings from other providers. 
If you omit it, the server auto-adds the prefix and emits a **deprecation warning** — this fallback will be removed in a future release. ```bash -# Correct - use bedrock/ prefix +# Correct — use bedrock/ prefix export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 -REDISVL_VECTOR_DIMENSIONS=1024 # Must match embedding model +export REDISVL_VECTOR_DIMENSIONS=1024 # Must match embedding model dimensions -# Deprecated - unprefixed names emit a warning -export EMBEDDING_MODEL=amazon.titan-embed-text-v2:0 # Works but shows deprecation warning +# Deprecated — works but emits a warning +export EMBEDDING_MODEL=amazon.titan-embed-text-v2:0 ``` **Supported Bedrock embedding models:** | Model ID | Dimensions | Description | |----------|------------|-------------| -| `bedrock/amazon.titan-embed-text-v2:0` | 1024 | Latest Titan (recommended) | -| `bedrock/amazon.titan-embed-text-v1` | 1536 | Original Titan | +| `bedrock/amazon.titan-embed-text-v2:0` | 1024 | Latest Titan text embedding (recommended) | +| `bedrock/amazon.titan-embed-text-v1` | 1536 | Original Titan text embedding | +| `bedrock/amazon.titan-embed-image-v1` | 1024 | Titan multimodal (text + image) embedding | | `bedrock/cohere.embed-english-v3` | 1024 | English-focused | | `bedrock/cohere.embed-multilingual-v3` | 1024 | Multilingual | +| `bedrock/cohere.embed-v4:0` | 1024 | Cohere Embed v4 — text + image | #### Enabling Bedrock Models @@ -233,38 +258,27 @@ Your IAM role/user needs these permissions: } ``` -#### Docker Configuration +#### Docker -The Docker image supports two build targets: - -- **`standard`** (default): OpenAI/Anthropic support only -- **`aws`**: Includes AWS Bedrock support +The Dockerfile has a dedicated `aws` build target, and `docker-compose.yml` provides an `aws` profile with pre-built AWS images: ```bash -# Build AWS-enabled image +# Build the AWS-enabled image directly docker build --target aws -t agent-memory-server:aws . 
-# Or with Docker Compose -DOCKER_TARGET=aws docker-compose up --build +# Or use Docker Compose with the aws profile +docker-compose --profile aws up ``` -When running, pass AWS credentials: +Pass AWS credentials at runtime: ```bash -docker run -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY -e AWS_REGION_NAME \ +docker run -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \ + -e REGION_NAME=us-east-1 -e AWS_REGION_NAME=us-east-1 \ -e GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 \ -e EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 \ -e REDISVL_VECTOR_DIMENSIONS=1024 \ - agent-memory-server:aws -``` - -Or mount credentials: - -```bash -docker run -v ~/.aws:/root/.aws:ro \ - -e AWS_PROFILE=my-profile \ - -e AWS_REGION_NAME=us-east-1 \ - agent-memory-server:aws + -p 8000:8000 redislabs/agent-memory-server-aws:latest ``` #### Complete Example @@ -273,17 +287,22 @@ Full Bedrock stack (keep all AI operations within AWS): ```bash # AWS credentials +export REGION_NAME=us-east-1 export AWS_REGION_NAME=us-east-1 export AWS_ACCESS_KEY_ID=... export AWS_SECRET_ACCESS_KEY=... -# Embeddings (bedrock/ prefix required) +# Embeddings (bedrock/ prefix REQUIRED) export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 export REDISVL_VECTOR_DIMENSIONS=1024 # Generation (no prefix needed) export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -export FAST_MODEL=anthropic.claude-3-5-haiku-20241022-v1:0 +export FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 + +# Start Redis and the server +docker-compose up redis -d +uv run agent-memory api ``` ### Ollama (Local Models) @@ -343,11 +362,11 @@ export EMBEDDING_MODEL=text-embedding-3-small # OpenAI | Variable | Description | Default | |----------|-------------|---------| -| `GENERATION_MODEL` | Primary model for AI tasks | `gpt-4o-mini` | -| `FAST_MODEL` | Fast model for topic extraction, etc. 
| Same as `GENERATION_MODEL` | -| `QUERY_OPTIMIZATION_MODEL` | Model for query optimization | Same as `GENERATION_MODEL` | +| `GENERATION_MODEL` | Primary model for AI tasks | `gpt-5` | +| `FAST_MODEL` | Fast model for topic extraction, etc. | `gpt-5-mini` | +| `SLOW_MODEL` | Slower, more capable model for complex tasks | `gpt-5` | | `EMBEDDING_MODEL` | Model for vector embeddings | `text-embedding-3-small` | -| `REDISVL_VECTOR_DIMENSIONS` | Override embedding dimensions | Auto-detected | +| `REDISVL_VECTOR_DIMENSIONS` | Override embedding dimensions | `1536` (auto-detected for known models) | ### Model Validation @@ -381,7 +400,11 @@ export REDISVL_VECTOR_DIMENSIONS=1024 **Bedrock "Access Denied"** - Verify IAM permissions include `bedrock:InvokeModel` - Check model is enabled in your AWS region -- Ensure correct `AWS_REGION_NAME` +- Ensure both `REGION_NAME` and `AWS_REGION_NAME` are set correctly +- See [AWS Bedrock Troubleshooting](aws-bedrock.md#troubleshooting) for more details + +**Bedrock "AWS-related dependencies might be missing"** +- Install the `[aws]` extra: `pip install agent-memory-server[aws]` or `uv sync --extra aws` ### Debug Logging @@ -417,6 +440,7 @@ The following are no longer required: ## See Also +- [AWS Bedrock Models](aws-bedrock.md) - Complete Bedrock guide with Quick Start, IAM policies, and Docker setup - [Embedding Providers](embedding-providers.md) - Detailed embedding configuration - [Configuration](configuration.md) - All environment variables - [Query Optimization](query-optimization.md) - Model selection for query optimization From f1dded2f7a0417d8996847ce4dd2f8beb3717839 Mon Sep 17 00:00:00 2001 From: Nitin Kanukolanu Date: Thu, 16 Apr 2026 10:10:42 -0400 Subject: [PATCH 2/4] fix: update stale docstring in langchain integration to use create_agent The module docstring was using the removed create_tool_calling_agent and AgentExecutor API from LangChain < v0.2. 
Updated to use create_agent from langchain.agents, which is the current API. --- .../integrations/langchain.py | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/agent-memory-client/agent_memory_client/integrations/langchain.py b/agent-memory-client/agent_memory_client/integrations/langchain.py index be9ec46b..2675617b 100644 --- a/agent-memory-client/agent_memory_client/integrations/langchain.py +++ b/agent-memory-client/agent_memory_client/integrations/langchain.py @@ -8,9 +8,8 @@ ```python from agent_memory_client import create_memory_client from agent_memory_client.integrations.langchain import get_memory_tools - from langchain.agents import create_tool_calling_agent, AgentExecutor + from langchain.agents import create_agent from langchain_openai import ChatOpenAI - from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder # Initialize memory client memory_client = await create_memory_client("http://localhost:8000") @@ -24,16 +23,15 @@ # Use with LangChain agent llm = ChatOpenAI(model="gpt-4o") - prompt = ChatPromptTemplate.from_messages([ - ("system", "You are a helpful assistant with memory."), - ("human", "{input}"), - MessagesPlaceholder("agent_scratchpad"), - ]) - agent = create_tool_calling_agent(llm, tools, prompt) - executor = AgentExecutor(agent=agent, tools=tools) + agent = create_agent( + llm, tools, + system_prompt="You are a helpful assistant with memory." 
+ ) # Run the agent - result = await executor.ainvoke({"input": "Remember that I love pizza"}) + result = await agent.ainvoke( + {"messages": [("human", "Remember that I love pizza")]} + ) ``` """ From 7cd01fb30a08e4748f34f5a60d0e0366af987644 Mon Sep 17 00:00:00 2001 From: Nitin Kanukolanu Date: Thu, 16 Apr 2026 10:28:11 -0400 Subject: [PATCH 3/4] docs: address PR review feedback from Copilot - Fix REGION_NAME vs AWS_REGION_NAME: clarify AWS_REGION_NAME is required (LiteLLM), REGION_NAME is optional (server-side boto3 model-existence checks) - Fix IAM role section: note that server boto3 utilities currently require explicit credentials, but LiteLLM handles IAM roles natively - Fix REDISVL_VECTOR_DIMENSIONS description: it is a fallback/override, not auto-detected - Add asterisk notes on embedding models not in MODEL_CONFIGS (titan-embed-image, cohere.embed-v4) requiring explicit REDISVL_VECTOR_DIMENSIONS --- docs/aws-bedrock.md | 24 ++++++++++++++---------- docs/llm-providers.md | 22 +++++++++++----------- 2 files changed, 25 insertions(+), 21 deletions(-) diff --git a/docs/aws-bedrock.md b/docs/aws-bedrock.md index a778c561..4e64cb84 100644 --- a/docs/aws-bedrock.md +++ b/docs/aws-bedrock.md @@ -27,8 +27,8 @@ uv sync --extra aws # (or use an IAM role / AWS CLI profile instead — see "AWS Credentials" below) export AWS_ACCESS_KEY_ID=your-access-key-id export AWS_SECRET_ACCESS_KEY=your-secret-access-key -export REGION_NAME=us-east-1 # Used by the server's own model-validation client -export AWS_REGION_NAME=us-east-1 # Used by LiteLLM for Bedrock API calls +export AWS_REGION_NAME=us-east-1 # Required: used by LiteLLM for Bedrock API calls +export REGION_NAME=us-east-1 # Optional: used by server-side boto3 utilities (model-existence checks) # ── Bedrock embedding model (bedrock/ prefix REQUIRED) ────────── export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 @@ -67,10 +67,10 @@ The server reads the AWS region in **two** places: | Variable | Read by | Purpose | 
|----------|---------|---------| -| `REGION_NAME` | The server's Settings (pydantic-settings) | Creating `boto3` sessions for model-existence validation | -| `AWS_REGION_NAME` | LiteLLM | Making the actual Bedrock inference API calls | +| `AWS_REGION_NAME` | LiteLLM | **Required.** Making the actual Bedrock inference and embedding API calls | +| `REGION_NAME` | The server's Settings (pydantic-settings) | **Optional.** Used only by server-side `boto3` utilities that check whether a Bedrock model exists (`_aws/utils.py`) | -Set **both** to the same value to avoid surprises. If you rely solely on an IAM role or AWS CLI profile, `boto3` and LiteLLM may auto-detect the region from the instance metadata or `~/.aws/config`, but explicitly setting the variables is recommended. +If you only use LiteLLM for Bedrock (the common case), `AWS_REGION_NAME` is sufficient. Set `REGION_NAME` as well if you want the server's optional model-existence checks to work. When both are used, set them to the same value. 
--- @@ -97,10 +97,12 @@ All LLM operations in the server go through [LiteLLM](https://docs.litellm.ai/), |----------|----------|------------|-------------| | `bedrock/amazon.titan-embed-text-v2:0` | Amazon | 1024 | Latest Titan text embedding (recommended) | | `bedrock/amazon.titan-embed-text-v1` | Amazon | 1536 | Original Titan text embedding | -| `bedrock/amazon.titan-embed-image-v1` | Amazon | 1024 | Titan multimodal (text + image) embedding | +| `bedrock/amazon.titan-embed-image-v1` | Amazon | 1024 | Titan multimodal (text + image) embedding * | | `bedrock/cohere.embed-english-v3` | Cohere | 1024 | English-focused embeddings | | `bedrock/cohere.embed-multilingual-v3` | Cohere | 1024 | Multilingual embeddings | -| `bedrock/cohere.embed-v4:0` | Cohere | 1024 | Cohere Embed v4 — text + image embedding | +| `bedrock/cohere.embed-v4:0` | Cohere | 1024 | Cohere Embed v4 — text + image embedding * | + +> \* Models marked with **\*** are not in `MODEL_CONFIGS` and their dimensions cannot be auto-resolved. You **must** set `REDISVL_VECTOR_DIMENSIONS=1024` explicitly when using them, or you will get vector-size mismatch errors at runtime. ### Generation Models (Anthropic Claude on Bedrock) @@ -147,8 +149,8 @@ Configure the following environment variables to use Bedrock models: ```bash # Required: AWS region where Bedrock is available -REGION_NAME=us-east-1 # For the server's own boto3 sessions -AWS_REGION_NAME=us-east-1 # For LiteLLM's Bedrock calls +AWS_REGION_NAME=us-east-1 # Required: for LiteLLM's Bedrock calls +REGION_NAME=us-east-1 # Optional: for server-side boto3 model-existence checks # For Bedrock Embedding Models (bedrock/ prefix REQUIRED) EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 @@ -185,7 +187,9 @@ aws_secret_access_key = your-secret-access-key #### Option 3: IAM Role (Recommended for AWS deployments) -When running on AWS infrastructure (EC2, ECS, Lambda, etc.), use IAM roles for automatic credential management. 
No explicit credentials are needed — `boto3` and LiteLLM will discover credentials from the instance metadata service. +When running on AWS infrastructure (EC2, ECS, Lambda, etc.), IAM roles provide automatic credential management. LiteLLM will discover credentials from the instance metadata service automatically. + +> **Note:** The server's built-in `boto3` model-existence utilities (`_aws/utils.py`) currently require explicit `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. If you rely solely on an IAM role, LiteLLM Bedrock calls will work, but the optional model-existence checks will be skipped. This limitation may be removed in a future release. #### Option 4: AWS SSO / AWS CLI Profile diff --git a/docs/llm-providers.md b/docs/llm-providers.md index f4510aa9..91278d75 100644 --- a/docs/llm-providers.md +++ b/docs/llm-providers.md @@ -50,8 +50,8 @@ export EMBEDDING_MODEL=text-embedding-3-small # Use OpenAI for embeddings # AWS Bedrock (full stack — see "AWS Bedrock" section for details) export AWS_ACCESS_KEY_ID=... export AWS_SECRET_ACCESS_KEY=... -export REGION_NAME=us-east-1 # Server's own boto3 sessions -export AWS_REGION_NAME=us-east-1 # LiteLLM Bedrock calls +export AWS_REGION_NAME=us-east-1 # Required: LiteLLM Bedrock calls +export REGION_NAME=us-east-1 # Optional: server-side boto3 utilities export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 # No prefix for generation export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 # bedrock/ prefix REQUIRED export REDISVL_VECTOR_DIMENSIONS=1024 # Must match embedding model @@ -152,17 +152,15 @@ Bedrock uses standard AWS credentials. Configure using any of these methods: # Option 1: Environment variables (recommended for development) export AWS_ACCESS_KEY_ID=AKIA... export AWS_SECRET_ACCESS_KEY=... 
-export REGION_NAME=us-east-1 # Server's own boto3 sessions (model validation) -export AWS_REGION_NAME=us-east-1 # LiteLLM's Bedrock API calls +export AWS_REGION_NAME=us-east-1 # Required: LiteLLM's Bedrock API calls +export REGION_NAME=us-east-1 # Optional: server-side boto3 model-existence checks # Option 2: AWS CLI profile export AWS_PROFILE=my-profile -export REGION_NAME=us-east-1 export AWS_REGION_NAME=us-east-1 # Option 3: IAM role (recommended for production on AWS) -# No credentials needed — uses instance/container role -export REGION_NAME=us-east-1 +# No explicit credentials needed — LiteLLM discovers them from instance metadata export AWS_REGION_NAME=us-east-1 # Option 4: AWS SSO @@ -170,7 +168,7 @@ aws sso login --profile your-profile export AWS_PROFILE=your-profile ``` -> **Why two region variables?** The server reads `REGION_NAME` (via pydantic-settings) for its own `boto3` model-validation client, while LiteLLM reads `AWS_REGION_NAME` for the actual Bedrock inference calls. Set both to the same value. +> **Why two region variables?** `AWS_REGION_NAME` is required for LiteLLM's Bedrock inference calls. `REGION_NAME` is only needed if you use the server's optional `boto3`-based model-existence checks (`_aws/utils.py`). When both are set, use the same value. 
#### Generation Models @@ -219,10 +217,12 @@ export EMBEDDING_MODEL=amazon.titan-embed-text-v2:0 |----------|------------|-------------| | `bedrock/amazon.titan-embed-text-v2:0` | 1024 | Latest Titan text embedding (recommended) | | `bedrock/amazon.titan-embed-text-v1` | 1536 | Original Titan text embedding | -| `bedrock/amazon.titan-embed-image-v1` | 1024 | Titan multimodal (text + image) embedding | +| `bedrock/amazon.titan-embed-image-v1` | 1024 | Titan multimodal (text + image) embedding * | | `bedrock/cohere.embed-english-v3` | 1024 | English-focused | | `bedrock/cohere.embed-multilingual-v3` | 1024 | Multilingual | -| `bedrock/cohere.embed-v4:0` | 1024 | Cohere Embed v4 — text + image | +| `bedrock/cohere.embed-v4:0` | 1024 | Cohere Embed v4 — text + image * | + +> \* Models marked with **\*** are not in `MODEL_CONFIGS`, so their dimensions cannot be auto-resolved. Set `REDISVL_VECTOR_DIMENSIONS=1024` explicitly when using them. #### Enabling Bedrock Models @@ -366,7 +366,7 @@ export EMBEDDING_MODEL=text-embedding-3-small # OpenAI | `FAST_MODEL` | Fast model for topic extraction, etc. 
| `gpt-5-mini` | | `SLOW_MODEL` | Slower, more capable model for complex tasks | `gpt-5` | | `EMBEDDING_MODEL` | Model for vector embeddings | `text-embedding-3-small` | -| `REDISVL_VECTOR_DIMENSIONS` | Override embedding dimensions | `1536` (auto-detected for known models) | +| `REDISVL_VECTOR_DIMENSIONS` | Fallback/override for embedding dimensions when they cannot be resolved from `MODEL_CONFIGS` | `1536` | ### Model Validation From 51b0264624febe4b8c41bb40f61153c8fdc70f84 Mon Sep 17 00:00:00 2001 From: Nitin Kanukolanu Date: Thu, 16 Apr 2026 10:35:02 -0400 Subject: [PATCH 4/4] docs: address reviewer feedback - Clarify bedrock/ prefix is optional (not prohibited) for generation models - Remove boto3/botocore version ranges from docs - Rename LLM Proxy to LLM Proxy (LiteLLM) in architecture diagram - Update wording from 'no prefix needed' to 'prefix optional' --- docs/aws-bedrock.md | 10 +++++----- docs/llm-providers.md | 8 ++++---- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/aws-bedrock.md b/docs/aws-bedrock.md index 4e64cb84..283d4cfd 100644 --- a/docs/aws-bedrock.md +++ b/docs/aws-bedrock.md @@ -81,9 +81,9 @@ All LLM operations in the server go through [LiteLLM](https://docs.litellm.ai/), | Model type | Prefix required? | Example | |------------|------------------|---------| | **Embedding** | **Yes** — must include `bedrock/` | `bedrock/amazon.titan-embed-text-v2:0` | -| **Generation (chat)** | **No** — Bedrock model IDs are recognized automatically | `anthropic.claude-sonnet-4-5-20250929-v1:0` | +| **Generation (chat)** | **Optional** — not required, but allowed | `anthropic.claude-sonnet-4-5-20250929-v1:0` | -**Why the difference?** LiteLLM can infer the provider for generation models from the Bedrock-style model ID (e.g., `anthropic.claude-*`). For embedding models, however, the `bedrock/` prefix is the only way LiteLLM distinguishes a Bedrock embedding call from other providers. 
If you omit the prefix on an embedding model, the server will auto-add it and emit a **deprecation warning** — but this behaviour will be removed in a future release. +**Why the difference?** LiteLLM can infer the provider for generation models from the Bedrock-style model ID (e.g., `anthropic.claude-*`), so the `bedrock/` prefix is optional for generation. Adding it (e.g., `bedrock/anthropic.claude-sonnet-4-5-20250929-v1:0`) also works. For embedding models, the `bedrock/` prefix is the only way LiteLLM distinguishes a Bedrock embedding call from other providers, so it is **required**. If you omit the prefix on an embedding model, the server will auto-add it and emit a **deprecation warning**, but this fallback will be removed in a future release. --- @@ -136,8 +136,8 @@ uv sync --extra aws This installs: -- **`boto3`** (`>=1.42.1,<2.0.0`) — AWS SDK for Python -- **`botocore`** (`>=1.42.1,<2.0.0`) — Low-level AWS client library +- **`boto3`** — AWS SDK for Python +- **`botocore`** — Low-level AWS client library > **Without these packages**, any attempt to use Bedrock models will fail at import time. The standard install (`pip install agent-memory-server`) does **not** include them. 
@@ -156,7 +156,7 @@ REGION_NAME=us-east-1 # Optional: for server-side boto3 model-existen EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 REDISVL_VECTOR_DIMENSIONS=1024 # Must match the embedding model's output dimensions -# For Bedrock LLM Generation Models (no prefix needed) +# For Bedrock LLM Generation Models (bedrock/ prefix optional) GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 diff --git a/docs/llm-providers.md b/docs/llm-providers.md index 91278d75..f0021c77 100644 --- a/docs/llm-providers.md +++ b/docs/llm-providers.md @@ -16,7 +16,7 @@ All LLM operations go through a single `LLMClient` abstraction: │ └──────────┬───────────────┘ │ │ ▼ │ │ ┌──────────────┐ │ -│ │ LLM Proxy │ │ +│ │LLM Proxy (LiteLLM)│ │ │ └──────┬───────┘ │ └──────────────────────┼───────────────────────────────────┘ ▼ @@ -52,7 +52,7 @@ export AWS_ACCESS_KEY_ID=... export AWS_SECRET_ACCESS_KEY=... export AWS_REGION_NAME=us-east-1 # Required: LiteLLM Bedrock calls export REGION_NAME=us-east-1 # Optional: server-side boto3 utilities -export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 # No prefix for generation +export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 # bedrock/ prefix optional for generation export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 # bedrock/ prefix REQUIRED export REDISVL_VECTOR_DIMENSIONS=1024 # Must match embedding model ``` @@ -134,7 +134,7 @@ AWS Bedrock provides access to foundation models from multiple providers (Anthro #### Installation -AWS Bedrock support requires the `[aws]` extra, which installs `boto3` (`>=1.42.1`) and `botocore` (`>=1.42.1`). Without these packages, any Bedrock operation will fail at import time. +AWS Bedrock support requires the `[aws]` extra, which installs `boto3` and `botocore`. Without these packages, any Bedrock operation will fail at import time. 
```bash # With pip @@ -172,7 +172,7 @@ export AWS_PROFILE=your-profile #### Generation Models -Generation models use Bedrock-native model IDs **without** a prefix — LiteLLM recognises them automatically. +Generation models use Bedrock-native model IDs. The `bedrock/` prefix is **optional** for generation (LiteLLM recognizes Bedrock model IDs automatically), but adding it also works. ```bash # Claude models on Bedrock (no prefix needed)