GML-2103 Release 2.0.0 - Agentic Engine #46
- Float pinned dependencies to minimum-version constraints and upgrade the core stack to current releases. - Migrate all imports (top-level and lazy in-function) to the LangChain 1.x module layout. - Drop the unused top-level langchain package. - Remove a dead, unreferenced chat-agent module. - Update the model-evaluation tests to use replacements for the removed LangChain evaluators. Refs: GML-2137
- Add a structured chunker that preserves table and section layout, plus an auto chunker that picks a strategy from the document type. - Improve extracted-PDF text quality by repairing mojibake, vertical CJK runs, and broken table rows. - Bundle third-party license attribution for the structured-document parser. Refs: GML-2121, GML-2081
- Add an agentic chat mode that plans multi-step retrieval and combines structured and unstructured lookups, selectable per graph. - Default to the agentic engine and fall back to the classic engine automatically when the chat model can't do tool-calling. - Add per-graph external MCP server configuration with an admin-only setup page. - Surface the agent's plan and per-step execution in the trace log view. Refs: GML-2107, GML-2109, GML-2111, GML-2112, GML-2102, GML-1983, GML-1987
- Show the underlying exception detail to superadmin users when an answer fails, while other users still see the generic message. - Preserve the steps completed before a failure so the trace log shows how far the agent got. Refs: GML-2136
- Let users pick the chat engine per message: Agent (Auto, Planned, or Reactive) or Classic (Auto or a specific retriever), defaulting to Agent. - Accept the selection on the chat and query endpoints; fall back to Classic with a notice when the model can't run Agent mode. - Show the engine/style below Agent answers instead of a retriever method. - Default the configured agent style to planned.
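The per-message selection and fallback described above could be resolved roughly as follows. This is a minimal sketch, not the project's code: `EngineChoice`, `resolve_engine`, the lowercase style strings, and the `"hybrid"` retriever name used in the usage note are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical identifiers for the three Agent styles named in the PR.
AGENT_STYLES = {"auto", "planned", "reactive"}


@dataclass
class EngineChoice:
    engine: str                   # "agent" or "classic"
    style: str                    # agent style, or classic retriever / "auto"
    notice: Optional[str] = None  # set when a fallback happened


def resolve_engine(engine, style, supports_tool_calling):
    """Pick the engine for one message, defaulting to Agent."""
    style = (style or "auto").lower()
    if engine is None:
        # Assumption: a bare value that is not an agent style is a
        # classic retriever selection.
        engine = "agent" if style in AGENT_STYLES else "classic"
    engine = engine.lower()
    if engine == "agent" and not supports_tool_calling:
        return EngineChoice(
            "classic", "auto",
            notice="Model cannot run Agent mode; answered with Classic.",
        )
    if engine == "agent" and style not in AGENT_STYLES:
        style = "auto"
    return EngineChoice(engine, style)
```

For example, `resolve_engine(None, None, True)` selects the Agent engine in Auto style, while `resolve_engine("agent", "planned", False)` falls back to Classic and carries a notice for the UI.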
- Bump the release version for 2.0.0 (supersedes the 1.4.2 value carried in by the merge).
- Load the latest 10 conversations on open, with a "more…" control to load older ones in batches. - Fetch a conversation's messages only when it is opened, instead of fetching every conversation up front.
- Keep each section's heading context inside its chunk - Roll small sections up into their parent up to the size budget - Keep tables intact, including tables nested inside lists - Default to the auto-detecting chunker; route markdown and HTML through the structured chunker
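To illustrate the heading-context and roll-up rules, here is a small sketch under simplifying assumptions: markdown only, size measured in characters, no table handling, and oversized sections are not split further. The function name `chunk_markdown` and the `" > "` breadcrumb separator are hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text, budget=1000):
    """Split on headings, prefix each chunk with its heading path, and
    roll small child sections into the preceding ancestor chunk."""
    sections = []  # list of (heading path tuple, body lines)
    path, current = [], None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            path = path[: level - 1] + [match.group(2).strip()]
            current = (tuple(path), [])
            sections.append(current)
        else:
            if current is None:  # text before the first heading
                current = ((), [])
                sections.append(current)
            current[1].append(line)

    chunks = []
    for sec_path, lines in sections:
        body = "\n".join(lines).strip()
        if not body and not sec_path:
            continue
        prev = chunks[-1] if chunks else None
        piece = f"{sec_path[-1]}\n{body}".strip() if sec_path else body
        is_child = (
            prev is not None
            and len(prev["path"]) > 0
            and len(sec_path) > len(prev["path"])
            and sec_path[: len(prev["path"])] == prev["path"]
        )
        if is_child and len(prev["text"]) + 2 + len(piece) <= budget:
            prev["text"] += "\n\n" + piece  # roll up into the parent chunk
        else:
            header = " > ".join(sec_path)   # heading context kept in chunk
            chunks.append({"path": sec_path, "text": f"{header}\n{body}".strip()})
    return chunks
```

A real chunker would also need to keep fenced code and tables atomic; this sketch would misread a `#` comment inside a code fence as a heading.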
…user portion - Keep prompt rules in a fixed system prompt; expose only an additional-instructions section for customization - Ignore legacy full-prompt overrides until they are re-saved as a user portion - Let the system prompt own format instructions consistently across prompts
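One way to picture the fixed-rules plus user-portion split is the sketch below. The storage shape is an assumption: here a saved customization is a dict with a hypothetical `user_portion` key, and a legacy full-prompt override is a plain string that gets ignored until it is re-saved.

```python
def compose_prompt(system_rules, default_user_portion, override=None):
    """Join the locked system rules with the editable user portion."""
    user = default_user_portion
    if isinstance(override, dict):
        saved = override.get("user_portion")
        if isinstance(saved, str) and saved.strip():
            user = saved.strip()
    # Any other override type (e.g. a legacy full-prompt string) is ignored.
    return f"{system_rules}\n\nAdditional instructions:\n{user}"
```

Because the system rules are always emitted first and never taken from the override, format instructions stay under the system prompt's control regardless of what a user saves.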
…embedding - Embed each chunk with a compact summary of its topic, section, and entities for contextual retrieval - Retry embeddings at shorter lengths on provider overflow and skip vertices that still do not fit - Skip vertices without an embedding during similarity search - Size upserts to the pending work and report distinct vertex and edge counts
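The retry-at-shorter-lengths behavior can be sketched as below. The `ContextOverflow` exception, the character-based length ladder, and the `embed` callable are hypothetical stand-ins for whatever the embedding provider actually raises and accepts.

```python
class ContextOverflow(Exception):
    """Hypothetical error raised when the provider rejects oversized input."""


def embed_with_retry(text, embed, lengths=(8000, 4000, 2000, 1000)):
    """Try progressively shorter prefixes; return None if none fit."""
    for limit in lengths:
        try:
            return embed(text[:limit])
        except ContextOverflow:
            continue  # retry at the next shorter length
    return None  # caller skips this vertex
```

A caller would store `None` for vertices that never fit and filter them out before similarity search, for example `[v for v in vertices if v["embedding"] is not None]`.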
- Install only missing queries through a non-blocking REST request with status polling - Fall back to GSQL create/install when the REST path is unavailable - Await the schema-version lookup on asynchronous request paths
- Run all files, then retry connection-failed loads once the database is reachable again - Bound the wait so a persistent outage fails out instead of hanging - Report files that still fail so re-running ingest reloads only those
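The run-all, wait, retry-once flow might look like the following sketch. All names are hypothetical; `load` and `is_reachable` are injected callables, and connection failures are assumed to surface as `ConnectionError`.

```python
import time


def load_with_retry(files, load, is_reachable, max_wait=300.0, poll=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Load every file, then retry connection failures once the database
    is reachable again. Returns the files that still failed."""
    failed = []
    for name in files:
        try:
            load(name)
        except ConnectionError:
            failed.append(name)
    if not failed:
        return []

    deadline = clock() + max_wait
    while not is_reachable():
        if clock() >= deadline:
            return failed  # persistent outage: fail out instead of hanging
        sleep(poll)

    still_failed = []
    for name in failed:
        try:
            load(name)
        except ConnectionError:
            still_failed.append(name)
    return still_failed
```

Returning the list of remaining failures lets a later ingest run reload only those files.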
…non-ASCII context - Return the answer only by default; accept an option to include sources and trace - Serialize context before the token-budget check so non-ASCII content is sized correctly
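Why serializing first matters: if the context is sent as JSON with ASCII escaping (the default for Python's `json.dumps`), each non-ASCII character in the Basic Multilingual Plane expands to a six-character `\uXXXX` escape, so sizing the raw string undercounts. The sketch below assumes that serialization and a rough four-characters-per-token heuristic; both are illustrative assumptions, and `fits_budget` is a hypothetical name.

```python
import json
import math


def fits_budget(context, max_tokens, chars_per_token=4):
    """Size the context as it will actually be sent, not as raw text."""
    serialized = json.dumps(context)  # ensure_ascii=True by default
    estimated_tokens = math.ceil(len(serialized) / chars_per_token)
    return estimated_tokens <= max_tokens


raw = "日本語" * 10                   # 30 characters
context = {"text": raw}
serialized_len = len(json.dumps(context))  # 192 characters after escaping
```

Here the raw text is 30 characters while the serialized payload is 192, which is the gap a pre-serialization check would miss.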
- Let superusers register external MCP servers and install the libraries they require - Ensure a graph's configured MCP servers are installed before the agent runs - Restrict MCP server configuration to superusers
- Clarify the chat engine and style picker - Allow bulk-clearing older conversations - Rename the graph Compatibility Check to Migration Assistant - Fix clipped descenders in text inputs
…s prompt customizable - Always include a vector search unless a question is confidently pure structured data (planner and reactive paths) - Expose the agentic agent prompt for customization via the fixed-rules + editable-user-portion split - Apply the same user customization whether the agent runs planned or reactive
- Move soft style, length, granularity, threshold, and example guidance from the locked rules into the editable default for the customizable prompts - Keep output format, inputs, schema/attribute rules, and faithfulness guards locked
- Re-rank chunks reached by graph expansion by similarity to the question and keep the top results, bounding the context sent to the model - Default and floor the cap at twice Top K; honor a larger explicit or configured value - Expose Max Results on the GraphRAG Configuration page beside Top K and Number of Hops
- Rank the chunks pulled in via a community's entities by similarity to the question and keep the top results, bounding the context - Always keep the community summaries; cap only the related-chunk text - Reuse the same result cap (default and floor twice Top K) and config knob as hybrid search
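The shared result cap and re-ranking step for hybrid and community search can be sketched as follows. Function names and the chunk dict shape (`id`, `vec`) are hypothetical; cosine similarity is an assumption, since the PR text only says "similarity to the question".

```python
import math


def resolve_max_results(top_k, configured=None):
    """Default and floor the cap at twice Top K; honor a larger value."""
    return max(2 * top_k, configured or 0)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(question_vec, chunks, top_k, max_results=None):
    """Order expanded chunks by similarity and keep the capped top results."""
    cap = resolve_max_results(top_k, max_results)
    ranked = sorted(chunks, key=lambda c: cosine(question_vec, c["vec"]),
                    reverse=True)
    return ranked[:cap]
```

With Top K set to 5, a configured Max Results of 4 is raised to 10, while 25 is honored as given. In the community case, the summaries would bypass this cap and only the related-chunk text would pass through `rerank`.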
- Give the planner its own customizable system prompt (analyze the question up front; decide structured and/or unstructured retrieval, how many, in what order; then consolidate and answer) - Reframe the react agent prompt to reason-act-observe: first action from analysis, each next action based on whether the gathered context can yet answer - Expose both as separate entries on the Customize Prompts page; prefer (not mandate) vector search
- Move the retrieval strategy (which methods, when, how many, what order) out of the fixed rules into the editable default for both planner and react prompts - Keep the role, act model, and output mechanics fixed in the system prompt - Pre-fill the Customize Prompts strategy with the default so it can be tuned, not authored from scratch
- React agent returns a structured answer with citations, like the planned and classic engines - Record retrieved vs. selected sources for the admin trace - Recover the answer from malformed model JSON instead of returning raw context - Keep numbers and units verbatim in their original form
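Recovering an answer from malformed model JSON could be approached as in this sketch. It assumes the structured output is a JSON object with an `answer` string field, which is a hypothetical schema; `recover_answer` is likewise an illustrative name.

```python
import json
import re

ANSWER_FIELD = re.compile(r'"answer"\s*:\s*"((?:[^"\\]|\\.)*)')


def recover_answer(raw):
    """Return the answer text from model output, or None if unrecoverable."""
    data = None
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # The model may wrap the object in prose or a code fence.
        start, end = raw.find("{"), raw.rfind("}")
        if start != -1 and end > start:
            try:
                data = json.loads(raw[start:end + 1])
            except json.JSONDecodeError:
                data = None
    if isinstance(data, dict) and isinstance(data.get("answer"), str):
        return data["answer"]

    # Last resort: pull the field out of truncated or broken JSON.
    match = ANSWER_FIELD.search(raw)
    if match:
        try:
            return json.loads('"' + match.group(1) + '"')
        except json.JSONDecodeError:
            return match.group(1)
    return None
```

Returning `None` rather than the raw text leaves it to the caller to decide what to show, so retrieved context is never passed through as if it were the answer.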
- Answer greetings and questions about the assistant directly, without a knowledge-base lookup - Send informational questions to the agent to retrieve or use a tool - Expose the routing policy as an editable prompt; the output contract stays fixed
- Read the graph schema only when a question needs structured or document retrieval; greetings and tool-only questions skip it - Drive react retrieval from each observation rather than a fixed plan
- Call external MCP tools from the React engine (sanitize tool names for the model, resolve them back on dispatch) - Give the planner each tool's typed, required, described parameters so external-tool calls are formed correctly
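Sanitizing tool names for the model and resolving them back on dispatch can be sketched with a small two-way map. The allowed character set and 64-character limit below are assumptions modeled on common provider constraints for function names; `ToolNameMap` is a hypothetical helper, not the PR's implementation.

```python
import re


class ToolNameMap:
    """Map external tool names to model-safe names and back."""

    MAX_LEN = 64  # assumed provider limit

    def __init__(self):
        self._to_original = {}

    def sanitize(self, name):
        safe = re.sub(r"[^A-Za-z0-9_-]", "_", name)[: self.MAX_LEN] or "tool"
        candidate, n = safe, 1
        # Disambiguate when two different originals sanitize to the same name.
        while self._to_original.get(candidate, name) != name:
            n += 1
            suffix = f"_{n}"
            candidate = safe[: self.MAX_LEN - len(suffix)] + suffix
        self._to_original[candidate] = name
        return candidate

    def resolve(self, safe_name):
        """Return the original tool name for dispatch."""
        return self._to_original[safe_name]
```

The model only ever sees the sanitized name; the dispatcher calls `resolve` on the name in the tool call before contacting the MCP server.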
- Turn the chat send button into a stop control while a response streams, so the next question can be asked without waiting - Show the underlying cause in the admin error detail instead of a generic wrapper
- List the agentic planner, react agent, and agent-routing prompts on the prompt-customization reference
- Manual, run-against-a-live-graph parity script; not part of the automated test suite
- Base the final answer on the full retrieved context - Keep retrieving until the question is answered completely, with guidance for tabular and numeric questions
- Demonstrate the agentic chat engine in the question-answering example - Correct hybrid search to the vertex types that carry vector indexes - Update the demo notebook to current search method names and graph setup Refs: GML-2103
# Conflicts: # common/requirements.txt # ecc/app/graphrag/graph_rag.py # graphrag/Dockerfile
- Omit the temperature setting for models that reject it, so o-series models can be selected without failing the connection test
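A sketch of omitting the temperature parameter for models that reject it. How the PR detects such models is not stated; the name-prefix check and the prefix list here are purely illustrative assumptions, as is the `build_model_kwargs` helper.

```python
# Hypothetical prefixes for model families assumed to reject `temperature`.
REJECTS_TEMPERATURE = ("o1", "o3", "o4")


def build_model_kwargs(model_name, temperature=0.0):
    """Build chat-model keyword arguments, leaving temperature out entirely
    for models assumed not to accept it."""
    kwargs = {"model": model_name}
    if not model_name.lower().startswith(REJECTS_TEMPERATURE):
        kwargs["temperature"] = temperature
    return kwargs
```

The key point is that the parameter is omitted rather than set to a default value, since sending any temperature would still trigger the rejection during the connection test.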
- Gate the Agentic chat options on a tool-calling-capable model: warn on the config page and at startup, and fall back to the classic engine otherwise - Route a classic retriever selection correctly when no engine is specified, instead of misapplying it as an agentic style - Give admins a reference id on chat errors instead of raw exception text, keeping the detail in the protected server logs - Make the Migration Assistant check fast and deterministic, and repair only the queries it reports, installing them by name rather than reinstalling all - Surface a clear, compressed reason when a repair fails and show the first few errors Refs: GML-2103, GML-2109, GML-2161
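The reference-id approach to chat errors can be sketched as below: the exception detail goes to the server log under a short id, and only the id is returned to admins. The response shape, message wording, and logger name are hypothetical.

```python
import logging
import uuid

logger = logging.getLogger("chat")


def error_response(exc, is_admin):
    """Log the full exception under a reference id; return a safe message."""
    ref = uuid.uuid4().hex[:12]
    # The traceback stays in the protected server logs.
    logger.error("chat failure ref=%s", ref, exc_info=exc)
    message = "Something went wrong while answering."
    if is_admin:
        message += f" Reference id: {ref}"
    return {"message": message, "reference_id": ref if is_admin else None}
```

An admin can then search the server logs for the reference id to find the underlying cause, while no raw exception text reaches the browser.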
- Describe the agentic (planned/reactive) and classic chat engines and how to configure and customize them - Update the login and RAG configuration screenshots Refs: GML-2103
PR Type
Enhancement, Bug fix, Tests, Documentation
Description
Add agentic GraphRAG engine
Enable MCP tool execution
Introduce structured document chunking
Harden prompt customization and parsing
File Walkthrough
16 files
- Split prompts and add agentic LLM helpers
- Add structure-aware markdown and HTML chunker
- Default ingestion to auto chunker
- Add document-type aware chunker dispatcher
- Detect model support for tool calling
- Improve extracted document text quality
- Adapt extraction to split prompt contract
- Preserve structured chunk metadata in embeddings
- Add agentic chat orchestration entrypoint
- Implement reactive tool-calling agent loop
- Implement planned multi-step agent execution
- Execute planned agent tool steps
- Wire graph execution for agentic mode
- Synthesize grounded agent final answers
- Expose GraphRAG retrieval as tools
- Register available agent and MCP tools

1 file
- Sanitize split-prompt user customization content

3 files
- Bump package version to 2.0.0
- Add shared MCP server configuration support
- Add release configuration for agentic features

83 files