From 7d9f683ab6a484cb6e4583f654563b8ba26b8a07 Mon Sep 17 00:00:00 2001 From: Nitin Kanukolanu Date: Tue, 9 Jun 2026 19:20:09 -0400 Subject: [PATCH 1/2] docs: add MCP user-guide notebook for search, upsert, and ADK agent Add docs/user_guide/15_mcp.ipynb, a hands-on guide that creates and loads a Redis index, writes and validates an MCP config, starts the RedisVL MCP server over Streamable HTTP, exercises the search-records and upsert-records tools from an MCP client, and wires the same server to a Google ADK agent. Register the notebook in the how-to guides index (card, quick reference, and toctree). --- docs/user_guide/15_mcp.ipynb | 657 +++++++++++++++++++++++++ docs/user_guide/how_to_guides/index.md | 3 + 2 files changed, 660 insertions(+) create mode 100644 docs/user_guide/15_mcp.ipynb diff --git a/docs/user_guide/15_mcp.ipynb b/docs/user_guide/15_mcp.ipynb new file mode 100644 index 00000000..9f390039 --- /dev/null +++ b/docs/user_guide/15_mcp.ipynb @@ -0,0 +1,657 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Serve an Index over MCP\n", + "\n", + "The [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) is an open standard that lets AI agents discover and call external tools through one uniform interface. RedisVL ships an MCP server, the `rvl mcp` command, that exposes a single existing Redis index to any MCP client through two tools:\n", + "\n", + "- **`search-records`** : semantic, full-text, or hybrid retrieval over the index.\n", + "- **`upsert-records`** : add or overwrite records in the index.\n", + "\n", + "The server owns the embedding model, so clients only ever send **text**. No raw vectors cross the client boundary, retrieval behavior lives entirely in one config file, and the same index can be shared with ADK, Claude Desktop, Cursor, or any other MCP client with zero custom code.\n", + "\n", + "This guide walks the full loop end to end:\n", + "\n", + "1. Create and load a Redis index.\n", + "2. 
Write the MCP config that binds the server to that index.\n", + "3. Start the RedisVL MCP server over Streamable HTTP.\n", + "4. Call `search-records` and `upsert-records` from a plain MCP client.\n", + "5. Wire the same server to a [Google ADK](https://google.github.io/adk-docs/) agent so a model can retrieve and write knowledge through MCP.\n", + "\n", + "## Prerequisites\n", + "\n", + "Before you begin, ensure you have:\n", + "- A running Redis instance ([Redis 8+](https://redis.io/downloads/) or [Redis Cloud](https://redis.io/cloud)) with the Search capability.\n", + "- [`uv`](https://docs.astral.sh/uv/) on your `PATH` (the server is launched with `uvx`/`rvl`).\n", + "\n", + "For the complete config schema and tool contracts, see the [Run RedisVL MCP](how_to_guides/mcp.md) how-to guide.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Install Packages\n", + "\n", + "The MCP server lives behind the `mcp` extra. We also install the `sentence-transformers` extra so query and record embedding can run locally with no API key.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -q \"redisvl[mcp,sentence-transformers]>=0.20.0\" nest_asyncio pandas" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Connect to Redis\n", + "\n", + "The MCP server reads from a normal Redis URL. 
We use the same URL here to create the index it will serve.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import warnings\n", + "\n", + "import nest_asyncio\n", + "import pandas as pd\n", + "\n", + "warnings.filterwarnings(\"ignore\")\n", + "# Notebook event loops are already running; nest_asyncio lets the MCP client\n", + "# and the ADK runner use top-level await cleanly.\n", + "nest_asyncio.apply()\n", + "\n", + "REDIS_URL = os.environ.get(\"REDIS_URL\", \"redis://localhost:6379\")\n", + "\n", + "from redis import Redis\n", + "\n", + "Redis.from_url(REDIS_URL).ping()\n", + "print(\"Connected to\", REDIS_URL)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Create and Load an Index\n", + "\n", + "The MCP server only ever **binds to an index that already exists**; it never creates one. So the first step is the ordinary RedisVL flow: define a schema, embed some records, and load them.\n", + "\n", + "We use a tiny Redis knowledge corpus and embed each record's `text` field locally with `HFTextVectorizer` (`all-MiniLM-L6-v2`, 384 dimensions). The `doc_id` tag gives every record a stable identifier so upserts can update in place.\n", + "\n", + "> **Reserved field names:** RedisVL MCP uses `id`, `score`, and a few others in its response envelope. 
Name your identifier field something else (here, `doc_id`) so it does not collide.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from redisvl.index import SearchIndex\n", + "from redisvl.utils.vectorize import HFTextVectorizer\n", + "\n", + "INDEX_NAME = \"redisvl_mcp_guide\"\n", + "INDEX_PREFIX = \"mcp_guide\"\n", + "EMBEDDING_MODEL = \"sentence-transformers/all-MiniLM-L6-v2\"\n", + "\n", + "documents = [\n", + " {\"doc_id\": \"redisvl-intro\", \"title\": \"What is RedisVL\",\n", + " \"text\": \"RedisVL is a Python client for building AI applications on Redis, \"\n", + " \"with vector search, semantic caching, and semantic routing.\"},\n", + " {\"doc_id\": \"vector-search\", \"title\": \"Vector search\",\n", + " \"text\": \"Redis stores embeddings and runs k-nearest-neighbor vector similarity \"\n", + " \"search using HNSW or FLAT indexes.\"},\n", + " {\"doc_id\": \"semantic-cache\", \"title\": \"Semantic caching\",\n", + " \"text\": \"SemanticCache stores past LLM responses and returns a cached answer when \"\n", + " \"a new prompt is semantically similar to an earlier one.\"},\n", + " {\"doc_id\": \"mcp-server\", \"title\": \"RedisVL MCP server\",\n", + " \"text\": \"The rvl mcp command serves an existing Redis index to MCP clients through \"\n", + " \"the search-records and upsert-records tools.\"},\n", + "]\n", + "\n", + "vectorizer = HFTextVectorizer(model=EMBEDDING_MODEL)\n", + "\n", + "schema = {\n", + " \"index\": {\"name\": INDEX_NAME, \"prefix\": INDEX_PREFIX, \"storage_type\": \"hash\"},\n", + " \"fields\": [\n", + " {\"name\": \"doc_id\", \"type\": \"tag\"},\n", + " {\"name\": \"title\", \"type\": \"text\"},\n", + " {\"name\": \"text\", \"type\": \"text\"},\n", + " {\n", + " \"name\": \"embedding\",\n", + " \"type\": \"vector\",\n", + " \"attrs\": {\n", + " \"algorithm\": \"hnsw\",\n", + " \"dims\": vectorizer.dims,\n", + " \"distance_metric\": \"cosine\",\n", + " \"datatype\": 
\"float32\",\n", + " },\n", + " },\n", + " ],\n", + "}\n", + "\n", + "index = SearchIndex.from_dict(schema, redis_url=REDIS_URL)\n", + "index.create(overwrite=True, drop=True)\n", + "\n", + "records = []\n", + "for doc in documents:\n", + " record = dict(doc)\n", + " record[\"embedding\"] = vectorizer.embed(doc[\"text\"], as_buffer=True)\n", + " records.append(record)\n", + "\n", + "keys = index.load(records, id_field=\"doc_id\")\n", + "print(f\"Loaded {len(keys)} records into index '{INDEX_NAME}'\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Write the MCP Config\n", + "\n", + "The config binds **one logical MCP server to one existing index**. It has three parts:\n", + "\n", + "- **`server`** : how to reach Redis (`redis_url`).\n", + "- **`indexes..search`** : how the `search-records` tool queries. `type: vector` embeds the query text and runs vector similarity. (`fulltext` and `hybrid` are also supported, hybrid needs native Redis support.)\n", + "- **`indexes..runtime`** : field mappings and guardrails.\n", + " - `vector_field_name` / `text_field_name` : which fields to search.\n", + " - `default_embed_text_field` : the field the server embeds, both for incoming queries and for new records on upsert. This is what makes the server **embed text itself** so clients never send vectors.\n", + " - `default_limit` / `max_limit` / `max_result_window` : cap result sizes.\n", + "\n", + "We point the same `HFTextVectorizer` model at the server so its query embeddings match the vectors we stored. 
Because we do **not** pass `--read-only` when launching, both `search-records` and `upsert-records` are exposed.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from pathlib import Path\n", + "\n", + "import yaml\n", + "\n", + "from redisvl.mcp import load_mcp_config\n", + "\n", + "mcp_config_path = (Path.cwd() / \"redisvl_mcp_guide.yaml\").resolve()\n", + "\n", + "mcp_config = {\n", + " \"server\": {\"redis_url\": REDIS_URL},\n", + " \"indexes\": {\n", + " INDEX_NAME: {\n", + " \"redis_name\": INDEX_NAME,\n", + " \"vectorizer\": {\"class\": \"HFTextVectorizer\", \"model\": EMBEDDING_MODEL},\n", + " \"search\": {\"type\": \"vector\"},\n", + " \"runtime\": {\n", + " \"text_field_name\": \"text\",\n", + " \"vector_field_name\": \"embedding\",\n", + " \"default_embed_text_field\": \"text\",\n", + " \"default_limit\": 3,\n", + " \"max_limit\": 10,\n", + " \"max_result_window\": 100,\n", + " # The first call loads the embedding model into memory, which can\n", + " # take 30+ seconds. Give startup and requests plenty of room.\n", + " \"request_timeout_seconds\": 120,\n", + " \"startup_timeout_seconds\": 120,\n", + " },\n", + " }\n", + " },\n", + "}\n", + "\n", + "mcp_config_path.write_text(yaml.safe_dump(mcp_config, sort_keys=False), encoding=\"utf-8\")\n", + "\n", + "# load_mcp_config validates the file the way the server will at startup.\n", + "validated = load_mcp_config(str(mcp_config_path))\n", + "print(\"Wrote and validated\", mcp_config_path)\n", + "print({\n", + " \"binding_id\": validated.binding_id,\n", + " \"redis_name\": validated.redis_name,\n", + " \"search_type\": validated.binding.search.type,\n", + "})" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Start the RedisVL MCP Server\n", + "\n", + "The server must be running before any client can connect. 
We use **Streamable HTTP** transport, which works reliably in notebooks (stdio transport breaks under Jupyter/Colab because they wrap `stdout`/`stderr`).\n", + "\n", + "**Option A : from a terminal (recommended for local development).** Open a separate terminal and run:\n", + "\n", + "```bash\n", + "uvx --from \"redisvl[mcp,sentence-transformers]\" rvl mcp \\\n", + " --config redisvl_mcp_guide.yaml \\\n", + " --transport streamable-http --host 127.0.0.1 --port 8000\n", + "```\n", + "\n", + "Then skip the next code cell and set `MCP_URL = \"http://127.0.0.1:8000/mcp\"`.\n", + "\n", + "**Option B : from the notebook (required for Colab).** The next cell launches the server as a background subprocess. It first refuses to continue if the port is already taken, then waits until the server answers a real MCP handshake (confirming it is our server and the embedding model has finished loading), not merely until the socket opens.\n", + "\n", + "```{warning}\n", + "Streamable HTTP is **unauthenticated by default**. Only bind to public interfaces (`--host 0.0.0.0`) on trusted networks or behind an authenticating proxy. Without `--read-only`, the `upsert-records` write tool is exposed to any client that can reach the server.\n", + "```\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "import socket\n", + "import subprocess\n", + "import time\n", + "\n", + "from fastmcp import Client\n", + "\n", + "MCP_HOST, MCP_PORT = \"127.0.0.1\", 8000\n", + "MCP_URL = f\"http://{MCP_HOST}:{MCP_PORT}/mcp\"\n", + "\n", + "\n", + "def port_in_use(host: str, port: int) -> bool:\n", + " \"\"\"Return True if something is already listening on host:port.\"\"\"\n", + " with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:\n", + " sock.settimeout(0.5)\n", + " return sock.connect_ex((host, port)) == 0\n", + "\n", + "\n", + "# Fail loudly on a port clash. 
A bare TCP probe would otherwise treat an\n", + "# unrelated server already on this port as \"ready\" and surface a confusing\n", + "# \"Session terminated\" error on the first tool call.\n", + "if port_in_use(MCP_HOST, MCP_PORT):\n", + " raise RuntimeError(\n", + " f\"Port {MCP_PORT} is already in use. Stop whatever is bound to it, or pick a free \"\n", + " f\"port by setting MCP_PORT and MCP_URL above before launching the server.\"\n", + " )\n", + "\n", + "# A clearer search-tool description steers the model toward real field names.\n", + "# RedisVL still appends schema-derived filter and return_fields hints.\n", + "os.environ[\"REDISVL_MCP_TOOL_SEARCH_DESCRIPTION\"] = (\n", + " \"Search the Redis knowledge base. Fields: doc_id (tag), title (text), text (text). \"\n", + " \"Only use these field names in return_fields and filters.\"\n", + ")\n", + "\n", + "mcp_process = subprocess.Popen(\n", + " [\n", + " \"rvl\", \"mcp\",\n", + " \"--config\", str(mcp_config_path),\n", + " \"--transport\", \"streamable-http\",\n", + " \"--host\", MCP_HOST,\n", + " \"--port\", str(MCP_PORT),\n", + " ],\n", + " stdin=subprocess.PIPE,\n", + " stdout=subprocess.DEVNULL,\n", + " stderr=subprocess.DEVNULL,\n", + " env=os.environ.copy(),\n", + " start_new_session=True, # own process group, so we can stop the whole tree\n", + ")\n", + "\n", + "\n", + "async def wait_until_ready(url: str, timeout: float = 120.0) -> set[str]:\n", + " \"\"\"Wait for a real MCP handshake, not just an open socket.\n", + "\n", + " Connecting with an MCP client and confirming ``search-records`` is exposed\n", + " proves the process is actually our RedisVL server (not some other listener\n", + " on this port) and that it has finished loading the embedding model, so the\n", + " first real call will not cold-start.\n", + " \"\"\"\n", + " deadline = time.time() + timeout\n", + " last_error = None\n", + " while time.time() < deadline:\n", + " if mcp_process.poll() is not None:\n", + " raise RuntimeError(f\"MCP 
server exited early (code {mcp_process.returncode})\")\n", + " try:\n", + " async with Client(url) as client:\n", + " tool_names = {t.name for t in await client.list_tools()}\n", + " if \"search-records\" in tool_names:\n", + " return tool_names\n", + " except Exception as exc: # not ready yet, keep polling\n", + " last_error = exc\n", + " time.sleep(1.0)\n", + " raise RuntimeError(f\"MCP server not ready after {timeout:.0f}s (last error: {last_error})\")\n", + "\n", + "\n", + "tool_names = await wait_until_ready(MCP_URL)\n", + "print(f\"MCP server ready (PID {mcp_process.pid}); tools exposed: {sorted(tool_names)}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Call the Tools from an MCP Client\n", + "\n", + "Any MCP client can now connect. We use [`fastmcp`](https://gofastmcp.com/)'s `Client` (installed with the `mcp` extra) to show exactly what crosses the wire. Notice the request is **plain text**: the server embeds it before searching.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "from fastmcp import Client\n", + "\n", + "async with Client(MCP_URL) as client:\n", + " tools = await client.list_tools()\n", + " print(\"Tools exposed:\", [t.name for t in tools])\n", + "\n", + " result = await client.call_tool(\n", + " \"search-records\",\n", + " {\"query\": \"How do I cache LLM responses?\", \"limit\": 3},\n", + " )\n", + "\n", + "# result.data is the structured JSON response from the server.\n", + "search_payload = result.data\n", + "\n", + "# Verify the tool actually returned grounded results.\n", + "assert search_payload[\"results\"], \"search-records returned no results\"\n", + "print(f\"search_type={search_payload['search_type']} | {len(search_payload['results'])} results returned\")\n", + "\n", + "rows = [\n", + " {**hit[\"record\"], \"score\": round(hit[\"score\"], 3), \"score_type\": hit[\"score_type\"]}\n", + " for hit in 
search_payload[\"results\"]\n", + "]\n", + "pd.DataFrame(rows)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Upsert a New Record\n", + "\n", + "`upsert-records` takes a list of records, each carrying the `default_embed_text_field` (`text`) the server will embed. Passing `id_field=\"doc_id\"` makes writes idempotent: the same `doc_id` overwrites in place rather than creating a duplicate. We send no vector; the server generates it.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "async with Client(MCP_URL) as client:\n", + " upsert_result = await client.call_tool(\n", + " \"upsert-records\",\n", + " {\n", + " \"records\": [\n", + " {\n", + " \"doc_id\": \"semantic-router\",\n", + " \"title\": \"Semantic routing\",\n", + " \"text\": \"SemanticRouter classifies an incoming query and routes it to the \"\n", + " \"best matching route using vector similarity over route references.\",\n", + " }\n", + " ],\n", + " \"id_field\": \"doc_id\",\n", + " },\n", + " )\n", + " print(\"Upsert:\", upsert_result.data)\n", + "\n", + " # The new record is immediately searchable.\n", + " verify = await client.call_tool(\n", + " \"search-records\",\n", + " {\"query\": \"route queries to the right topic\", \"limit\": 2},\n", + " )\n", + "\n", + "# Verify the write landed and is retrievable.\n", + "assert upsert_result.data[\"keys_upserted\"] == 1, upsert_result.data\n", + "assert any(\n", + " hit[\"record\"][\"doc_id\"] == \"semantic-router\" for hit in verify.data[\"results\"]\n", + "), \"the upserted record was not found by a follow-up search\"\n", + "print(\"Verified: upserted\", upsert_result.data[\"keys\"], \"and found it via search\")\n", + "\n", + "pd.DataFrame(\n", + " {**h[\"record\"], \"score\": round(h[\"score\"], 3)} for h in verify.data[\"results\"]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. 
Use the Index from a Google ADK Agent\n", + "\n", + "The same server drops straight into an agent framework. This is the pattern from the [`redisvl-mcp-rag-agent`](https://github.com/redis-applied-ai/redisvl-mcp-rag-agent) demo, reduced to its core: a [Google ADK](https://google.github.io/adk-docs/) `LlmAgent` whose only tools are the RedisVL MCP `search-records` and `upsert-records`, reached over the same Streamable HTTP endpoint.\n", + "\n", + "The agent orchestrates with a chat model (here, OpenAI's `gpt-4o` via [LiteLLM](https://docs.litellm.ai/)), but **retrieval and writes go through MCP**, so the model only ever sends text.\n", + "\n", + "This section needs `google-adk`, `litellm`, and an `OPENAI_API_KEY`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "%pip install -q \"google-adk>=1.0.0\" litellm" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "from getpass import getpass\n", + "\n", + "if not os.environ.get(\"OPENAI_API_KEY\"):\n", + " os.environ[\"OPENAI_API_KEY\"] = getpass(\"OPENAI_API_KEY: \")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Build the Agent\n", + "\n", + "`McpToolset` connects ADK to the running server. The `tool_filter` lists exactly which MCP tools the agent may call: here, both `search-records` and `upsert-records`. 
The `instruction` tells the model to ground every answer in retrieved records and to write only when explicitly asked.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "import uuid\n", + "\n", + "from google.adk.agents import LlmAgent\n", + "from google.adk.models.lite_llm import LiteLlm\n", + "from google.adk.runners import Runner\n", + "from google.adk.sessions import InMemorySessionService\n", + "from google.adk.tools.mcp_tool import McpToolset\n", + "from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPConnectionParams\n", + "from google.genai import types as genai_types\n", + "\n", + "APP_NAME, USER_ID = \"redisvl_mcp_guide\", \"notebook_user\"\n", + "\n", + "INSTRUCTION = (\n", + " \"You are a Redis knowledge assistant backed by one RedisVL index. \"\n", + " \"Both your tools take plain TEXT and the server embeds it, so never construct vectors.\\n\"\n", + " \"- To answer, call search-records first, then ground your answer in the results and \"\n", + " \"name the titles you used. If nothing relevant comes back, say so; do not invent an answer.\\n\"\n", + " \"- Only call upsert-records when the user clearly asks to save or correct knowledge. 
Put the \"\n", + " \"content in the `text` field, pass id_field='doc_id' with a `doc_id`, and confirm what you wrote.\"\n", + ")\n", + "\n", + "toolset = McpToolset(\n", + " connection_params=StreamableHTTPConnectionParams(url=MCP_URL),\n", + " tool_filter=[\"search-records\", \"upsert-records\"],\n", + ")\n", + "\n", + "agent = LlmAgent(\n", + " name=\"redis_mcp_agent\",\n", + " model=LiteLlm(model=\"openai/gpt-4o\"),\n", + " instruction=INSTRUCTION,\n", + " tools=[toolset],\n", + ")\n", + "\n", + "session_service = InMemorySessionService()\n", + "runner = Runner(app_name=APP_NAME, agent=agent, session_service=session_service)\n", + "\n", + "\n", + "async def ask_agent(query: str) -> dict:\n", + " \"\"\"Run one turn; return the final answer plus the MCP tool calls it made.\"\"\"\n", + " session_id = f\"session-{uuid.uuid4().hex[:8]}\"\n", + " await session_service.create_session(\n", + " app_name=APP_NAME, user_id=USER_ID, session_id=session_id, state={}\n", + " )\n", + " message = genai_types.Content(role=\"user\", parts=[genai_types.Part(text=query)])\n", + "\n", + " answer, tool_calls = \"\", []\n", + " async for event in runner.run_async(\n", + " user_id=USER_ID, session_id=session_id, new_message=message\n", + " ):\n", + " for call in event.get_function_calls() or []:\n", + " tool_calls.append({\"name\": call.name, \"args\": dict(call.args or {})})\n", + " if event.is_final_response() and event.content and event.content.parts:\n", + " answer = \"\".join(part.text or \"\" for part in event.content.parts)\n", + " return {\"answer\": answer, \"tool_calls\": tool_calls}\n", + "\n", + "\n", + "print(\"Agent ready, wired to\", MCP_URL)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Ask a Grounded Question\n", + "\n", + "The agent calls `search-records` and answers from what Redis returns.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "result = 
await ask_agent(\"What does RedisVL offer for semantic caching?\")\n", + "print(result[\"answer\"])\n", + "print(\"\\nTool calls:\", result[\"tool_calls\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Ask the Agent to Save Knowledge\n", + "\n", + "Because `upsert-records` is exposed, the agent can add to the corpus mid-conversation, then retrieve it.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# NBVAL_SKIP\n", + "save = await ask_agent(\n", + " \"Remember this: RedisVL EmbeddingsCache stores embeddings keyed by text so you do not \"\n", + " \"re-embed the same input twice. Save it with doc_id 'embeddings-cache'.\"\n", + ")\n", + "print(save[\"answer\"])\n", + "print(\"\\nTool calls:\", save[\"tool_calls\"])\n", + "\n", + "check = await ask_agent(\"How can I avoid re-embedding the same text?\")\n", + "print(\"\\n---\\n\", check[\"answer\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Cleanup\n", + "\n", + "Close the toolset, stop the background server (if you started it from a terminal, stop it there with Ctrl-C instead), drop the index, and remove the config file.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import signal\n", + "\n", + "try:\n", + " await toolset.close()\n", + "except NameError:\n", + " pass # agent section was skipped\n", + "\n", + "if \"mcp_process\" in dir() and mcp_process.poll() is None:\n", + " os.killpg(os.getpgid(mcp_process.pid), signal.SIGTERM)\n", + " mcp_process.wait(timeout=10)\n", + " print(\"MCP server stopped.\")\n", + "\n", + "index.delete(drop=True)\n", + "mcp_config_path.unlink(missing_ok=True)\n", + "print(\"Index dropped and config removed.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Recap\n", + "\n", + "You served a single Redis index over MCP in three steps:\n", + "\n", + "1. 
**Index + config** : created a RedisVL index, then bound it to the `rvl mcp` server with a small YAML config. `default_embed_text_field` makes the server embed text itself, so clients send only text.\n", + "2. **Direct client** : connected with `fastmcp` and called `search-records` and `upsert-records` over Streamable HTTP.\n", + "3. **ADK agent** : pointed a Google ADK `LlmAgent` at the same endpoint with `McpToolset`, so the model retrieves and writes knowledge through MCP with no Redis-specific code.\n", + "\n", + "Any MCP-compatible client (ADK, Claude Desktop, Cursor) can reuse this exact server and index. For the full config schema, tool contracts, transports, and read-only mode, see the [Run RedisVL MCP](how_to_guides/mcp.md) how-to guide.\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.6" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/docs/user_guide/how_to_guides/index.md b/docs/user_guide/how_to_guides/index.md index e9b62da4..90c08d98 100644 --- a/docs/user_guide/how_to_guides/index.md +++ b/docs/user_guide/how_to_guides/index.md @@ -43,6 +43,7 @@ How-to guides are **task-oriented** recipes that help you accomplish specific go - [Manage Indices with the CLI](../cli.ipynb): create, inspect, and delete indices from your terminal - [Run RedisVL MCP](mcp.md): expose an existing Redis index to MCP clients +- [Serve an Index over MCP](../15_mcp.ipynb): hands-on notebook for the search and upsert tools and a Google ADK agent ::: :::: @@ -65,6 +66,7 @@ How-to guides are **task-oriented** recipes that help you accomplish specific go | Decide on storage format | [Choose a Storage Type](../05_hash_vs_json.ipynb) | | Manage 
indices from terminal | [Manage Indices with the CLI](../cli.ipynb) | | Expose an index through MCP | [Run RedisVL MCP](mcp.md) | +| Run a hands-on MCP notebook with search, upsert, and an ADK agent | [Serve an Index over MCP](../15_mcp.ipynb) | | Plan and run a supported index migration | [Migrate an Index](migrate-indexes.md) | | Quantize vectors with resume, rollback, and the wizard | [Migrate an Index: Quantization, Resume, Backup, Wizard](../14_index_migration.ipynb) | @@ -84,6 +86,7 @@ Cache Embeddings <../10_embeddings_cache> Use Advanced Query Types <../11_advanced_queries> Write SQL Queries for Redis <../12_sql_to_redis_queries> Run RedisVL MCP +Serve an Index over MCP <../15_mcp> Migrate an Index Migrate an Index: Quantization, Resume, Backup, Wizard <../14_index_migration> ``` From e0e9e8f5af8b4693a862b5e1c9673b978192dbe7 Mon Sep 17 00:00:00 2001 From: Nitin Kanukolanu Date: Tue, 9 Jun 2026 19:22:21 -0400 Subject: [PATCH 2/2] update uvfile --- uv.lock | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/uv.lock b/uv.lock index f0dee76b..5bfeca3e 100644 --- a/uv.lock +++ b/uv.lock @@ -4869,7 +4869,7 @@ wheels = [ [[package]] name = "redisvl" -version = "0.19.0" +version = "0.20.0" source = { editable = "." } dependencies = [ { name = "jsonpath-ng" },