
GML-2103 Release 2.0.0 - Agentic Engine #46

Merged: chengbiao-jin merged 34 commits into main from release_2.0.0 on Jul 2, 2026

@chengbiao-jin chengbiao-jin commented Jul 1, 2026

Collaborator

PR Type

Enhancement, Bug fix, Tests, Documentation


Description

  • Add agentic GraphRAG engine

  • Enable MCP tool execution

  • Introduce structured document chunking

  • Harden prompt customization and parsing


Diagram Walkthrough

flowchart LR
  A["User chat request"]
  B["Agentic triage and planning"]
  C["GraphRAG and MCP tools"]
  D["Structured retrieval context"]
  E["Synthesized answer and trace"]

  A -- "routes to" --> B
  B -- "executes" --> C
  C -- "retrieves" --> D
  D -- "grounds" --> E

File Walkthrough

Relevant files

Enhancement (16 files)
base_llm.py (+807/-153): Split prompts and add agentic LLM helpers
structured.py (+1119/-0): Add structure-aware markdown and HTML chunker
ecc_util.py (+16/-10): Default ingestion to auto chunker
auto.py (+118/-0): Add document-type aware chunker dispatcher
capabilities.py (+149/-0): Detect model support for tool calling
text_extractors.py (+190/-22): Improve extracted document text quality
LLMEntityRelationshipExtractor.py (+32/-14): Adapt extraction to split prompt contract
tigergraph_embedding_store.py (+89/-10): Preserve structured chunk metadata in embeddings
agentic_agent.py (+279/-0): Add agentic chat orchestration entrypoint
agentic_react.py (+212/-0): Implement reactive tool-calling agent loop
agentic_planner.py (+142/-0): Implement planned multi-step agent execution
agentic_executor.py (+188/-0): Execute planned agent tool steps
agentic_graph.py (+126/-0): Wire graph execution for agentic mode
agentic_synthesizer.py (+88/-0): Synthesize grounded agent final answers
graphrag_tools.py (+324/-0): Expose GraphRAG retrieval as tools
tool_registry.py (+339/-0): Register available agent and MCP tools

Security (1 file)
prompt_validation.py (+109/-11): Sanitize split-prompt user customization content

Configuration changes (3 files)
VERSION (+1/-1): Bump package version to 2.0.0
mcp_config.py (+229/-0): Add shared MCP server configuration support
config.py (+40/-2): Add release configuration for agentic features

Additional files
83 files
CHANGELOG.md +30/-0   
README.md +61/-2   
__init__.py +3/-1     
html_chunker.py +1/-1     
markdown_chunker.py +1/-1     
recursive_chunker.py +1/-1     
connections.py +41/-0   
migrate.py +12/-8   
schema_extraction.py +1/-1     
schema_utils.py +2/-2     
embedding_services.py +1/-1     
Content_Similarity_Vector_Search.gsql +3/-1     
GraphRAG_Community_Vector_Search.gsql +20/-1   
GraphRAG_Hybrid_Vector_Search.gsql +19/-1   
aws_sagemaker_endpoint.py +1/-1     
schemas.py +48/-1   
tool_io_schemas.py +44/-0   
requirements.txt +193/-178
community_summarizer.py +3/-1     
graph_rag.py +31/-22 
util.py +55/-53 
workers.py +53/-23 
supportai_init.py +1/-1     
README_chunkers.md +165/-0 
test_chunkers.py +357/-0 
test_chunkers_demo.py +198/-0 
test_chunkers_simple.py +317/-0 
ActionProvider.tsx +50/-3   
Bot.tsx +117/-19
Contexts.tsx +6/-1     
CustomChatMessage.tsx +22/-2   
SideMenu.tsx +135/-70
input.tsx +21/-7   
index.css +23/-5   
main.tsx +5/-0     
TraceLogs.tsx +82/-1   
CustomizePrompts.tsx +36/-6   
GraphRAGConfig.tsx +50/-0   
KGAdmin.tsx +8/-8     
McpServersConfig.tsx +780/-0 
SetupLayout.tsx +8/-0     
Dockerfile +1/-1     
agent.py +0/-124 
agent.py +58/-5   
agent_generation.py +19/-7   
agent_graph.py +1/-1     
agent_hallucination_check.py +1/-1     
agent_rewrite.py +1/-1     
agent_router.py +1/-1     
agent_usefulness_check.py +1/-1     
method_selector.py +1/-1     
main.py +27/-0   
__init__.py +33/-0   
client_manager.py +222/-0 
registry_adapter.py +102/-0 
result_normalize.py +87/-0   
runtime.py +101/-0 
__init__.py +1/-0     
inquiryai.py +50/-22 
mcp_servers.py +354/-0 
ui.py +260/-72
BaseRetriever.py +5/-2     
CommunityRetriever.py +13/-3   
HybridRetriever.py +18/-4   
supportai.py +109/-57
supportai_ingest.py +2/-2     
find_existing_query.py +289/-0 
generate_cypher.py +3/-3     
generate_function.py +4/-4     
generate_gsql.py +3/-3     
map_question_to_schema.py +4/-4     
tg_mcp_tools.py +167/-0 
tool_guards.py +84/-0   
test_agent_mode.py +32/-0   
test_connections.py +2/-3     
test_e2e_prompt_customization.py +73/-276
test_e2e_schema_aware_ingest.py +10/-17 
test_invoke_with_parser.py +56/-0   
test_prompt_split.py +96/-0   
test_prompt_validation.py +111/-116
test_service.py +38/-11 
README.md +94/-0   
docling-MIT +21/-0   

- Float pinned dependencies to minimum-version constraints and upgrade the core stack to current releases.
- Migrate all imports (top-level and lazy in-function) to the LangChain 1.x module layout.
- Drop the unused top-level langchain package.
- Remove a dead, unreferenced chat-agent module.
- Update the model-evaluation tests to use replacements for the removed LangChain evaluators.

Refs: GML-2137
- Add a structured chunker that preserves table and section layout, plus an auto chunker that picks a strategy from the document type.
- Improve extracted-PDF text quality by repairing mojibake, vertical CJK runs, and broken table rows.
- Bundle third-party license attribution for the structured-document parser.

Refs: GML-2121, GML-2081
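
The mojibake repair mentioned in the bullets above can be illustrated with a minimal sketch. It assumes the most common case, where UTF-8 bytes were mis-decoded as Latin-1; the helper name `repair_mojibake` is hypothetical and is not taken from `text_extractors.py`, which may use a different approach.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 text that was mis-decoded as Latin-1 (hypothetical helper)."""
    try:
        # Recover the original bytes, then decode them with the intended codec.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text is not mojibake of this kind (or is already clean); keep it.
        return text


print(repair_mojibake("cafÃ©"))
```

Returning the input unchanged on a codec error keeps the repair safe to apply to every extracted string, since clean non-ASCII text fails the round trip instead of being corrupted.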
- Add an agentic chat mode that plans multi-step retrieval and combines structured and unstructured lookups, selectable per graph.
- Default to the agentic engine and fall back to the classic engine automatically when the chat model can't do tool-calling.
- Add per-graph external MCP server configuration with an admin-only setup page.
- Surface the agent's plan and per-step execution in the trace log view.

Refs: GML-2107, GML-2109, GML-2111, GML-2112, GML-2102, GML-1983, GML-1987
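
The default-and-fallback rule above (agentic by default, classic when the chat model cannot do tool calling) might look roughly like the sketch below. The function name, the engine strings, and the notice text are assumptions made for illustration; the PR's `capabilities.py` and `agentic_agent.py` are not shown here and may differ.

```python
from typing import Optional, Tuple


def resolve_engine(requested: Optional[str],
                   supports_tool_calling: bool) -> Tuple[str, Optional[str]]:
    """Return (engine, notice). Engine names are illustrative assumptions."""
    engine = requested or "agent"  # agentic engine is the default
    if engine == "agent" and not supports_tool_calling:
        # The agent loop needs tool calling; degrade instead of failing.
        return "classic", "Model does not support tool calling; using the classic engine."
    return engine, None


print(resolve_engine(None, supports_tool_calling=False))
```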
- Show the underlying exception detail to superadmin users when an answer fails, while other users still see the generic message.
- Preserve the steps completed before a failure so the trace log shows how far the agent got.

Refs: GML-2136
- Let users pick the chat engine per message: Agent (Auto, Planned, or Reactive) or Classic (Auto or a specific retriever), defaulting to Agent.
- Accept the selection on the chat and query endpoints; fall back to Classic with a notice when the model can't run Agent mode.
- Show the engine/style below Agent answers instead of a retriever method.
- Default the configured agent style to planned.
- Bump the release version for 2.0.0 (supersedes the 1.4.2 value carried in by the merge).
- Load the latest 10 conversations on open, with a "more…" control to load older ones in batches.
- Fetch a conversation's messages only when it is opened, instead of fetching every conversation up front.
- Keep each section's heading context inside its chunk
- Roll small sections up into their parent up to the size budget
- Keep tables intact, including tables nested inside lists
- Default to the auto-detecting chunker; route markdown and HTML through the structured chunker
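
The first chunker rule in the list above, keeping each section's heading context inside its chunk, can be sketched as follows. This is a simplified illustration for markdown only; it does not implement the roll-up of small sections or the table handling, and the function name is hypothetical rather than taken from `structured.py`.

```python
import re
from typing import List, Tuple

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_with_heading_context(markdown: str) -> List[str]:
    """Split on headings and prefix each chunk with its heading path."""
    chunks: List[str] = []
    path: List[Tuple[int, str]] = []  # (level, title) of enclosing headings
    body: List[str] = []

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            breadcrumb = " > ".join(title for _, title in path)
            chunks.append(f"{breadcrumb}\n{text}" if breadcrumb else text)
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section under its own heading path
            level = len(match.group(1))
            while path and path[-1][0] >= level:
                path.pop()  # leave sibling and deeper sections
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks


doc = "# Guide\nIntro\n## Setup\nInstall it\n## Usage\nRun it"
for chunk in chunk_with_heading_context(doc):
    print(repr(chunk))
```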
…user portion

- Keep prompt rules in a fixed system prompt; expose only an additional-instructions section for customization
- Ignore legacy full-prompt overrides until they are re-saved as a user portion
- Let the system prompt own format instructions consistently across prompts
…embedding

- Embed each chunk with a compact summary of its topic, section, and entities for contextual retrieval
- Retry embeddings at shorter lengths on provider overflow and skip vertices that still do not fit
- Skip vertices without an embedding during similarity search
- Size upserts to the pending work and report distinct vertex and edge counts
- Install only missing queries through a non-blocking REST request with status polling
- Fall back to GSQL create/install when the REST path is unavailable
- Await the schema-version lookup on asynchronous request paths
- Run all files, then retry connection-failed loads once the database is reachable again
- Bound the wait so a persistent outage fails out instead of hanging
- Report files that still fail so re-running ingest reloads only those
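
The embedding retry described above (retry at shorter lengths on provider overflow, then skip what still does not fit) could be sketched as below. The exception class, the halving factor, and the attempt limit are assumptions chosen for the example; real providers signal overflow in their own ways, and `tigergraph_embedding_store.py` may shrink the input differently.

```python
from typing import Callable, List, Optional


class EmbeddingOverflow(Exception):
    """Stand-in for a provider's 'input too long' error (hypothetical)."""


def embed_with_shrink(text: str,
                      embed: Callable[[str], List[float]],
                      max_attempts: int = 3,
                      shrink: float = 0.5) -> Optional[List[float]]:
    """Try to embed text, truncating on overflow; None means 'skip this vertex'."""
    candidate = text
    for _ in range(max_attempts):
        try:
            return embed(candidate)
        except EmbeddingOverflow:
            # Keep the leading part of the text and try again.
            candidate = candidate[: max(1, int(len(candidate) * shrink))]
    return None


def fake_embed(text: str) -> List[float]:
    # Toy provider that rejects inputs longer than 10 characters.
    if len(text) > 10:
        raise EmbeddingOverflow()
    return [float(len(text))]


print(embed_with_shrink("x" * 40, fake_embed))
```

Returning `None` rather than raising lets the caller skip the vertex and continue the batch, which matches the "skip vertices that still do not fit" behavior in the list.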
…non-ASCII context

- Return the answer only by default; accept an option to include sources and trace
- Serialize context before the token-budget check so non-ASCII content is sized correctly
- Let superusers register external MCP servers and install the libraries they require
- Ensure a graph's configured MCP servers are installed before the agent runs
- Restrict MCP server configuration to superusers
- Clarify the chat engine and style picker
- Allow bulk-clearing older conversations
- Rename the graph Compatibility Check to Migration Assistant
- Fix clipped descenders in text inputs
…s prompt customizable

- Always include a vector search unless a question is confidently pure structured data (planner and reactive paths)
- Expose the agentic agent prompt for customization via the fixed-rules + editable-user-portion split
- Apply the same user customization whether the agent runs planned or reactive
- Move soft style, length, granularity, threshold, and example guidance from the locked rules into the editable default for the customizable prompts
- Keep output format, inputs, schema/attribute rules, and faithfulness guards locked
- Re-rank chunks reached by graph expansion by similarity to the question and keep the top results, bounding the context sent to the model
- Default and floor the cap at twice Top K; honor a larger explicit or configured value
- Expose Max Results on the GraphRAG Configuration page beside Top K and Number of Hops
- Rank the chunks pulled in via a community's entities by similarity to the question and keep the top results, bounding the context
- Always keep the community summaries; cap only the related-chunk text
- Reuse the same result cap (default and floor twice Top K) and config knob as hybrid search
- Give the planner its own customizable system prompt (analyze the question up front; decide structural and/or unstructured retrieval, how many, in what order; then consolidate and answer)
- Reframe the react agent prompt to reason-act-observe: first action from analysis, each next action based on whether the gathered context can yet answer
- Expose both as separate entries on the Customize Prompts page; prefer (not mandate) vector search
- Move the retrieval strategy (which methods, when, how many, what order) out of the fixed rules into the editable default for both planner and react prompts
- Keep the role, act model, and output mechanics fixed in the system prompt
- Pre-fill the Customize Prompts strategy with the default so it can be tuned, not authored from scratch
- React agent returns a structured answer with citations, like the
  planned and classic engines
- Record retrieved vs. selected sources for the admin trace
- Recover the answer from malformed model JSON instead of returning
  raw context
- Keep numbers and units verbatim in their original form
- Answer greetings and questions about the assistant directly, without
  a knowledge-base lookup
- Send informational questions to the agent to retrieve or use a tool
- Expose the routing policy as an editable prompt; the output contract
  stays fixed
- Read the graph schema only when a question needs structured or
  document retrieval; greetings and tool-only questions skip it
- Drive react retrieval from each observation rather than a fixed plan
- Call external MCP tools from the React engine (sanitize tool names
  for the model, resolve them back on dispatch)
- Give the planner each tool's typed, required, described parameters so
  external-tool calls are formed correctly
- Turn the chat send button into a stop control while a response
  streams, so the next question can be asked without waiting
- Show the underlying cause in the admin error detail instead of a
  generic wrapper
- List the agentic planner, react agent, and agent-routing prompts on
  the prompt-customization reference
- Add a manual parity script that runs against a live graph; it is not
  part of the automated test suite
- Base the final answer on the full retrieved context
- Keep retrieving until the question is answered completely, with
  guidance for tabular and numeric questions
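
The React-engine item above about sanitizing MCP tool names for the model and resolving them back on dispatch can be sketched like this. The sketch assumes the model API accepts only letters, digits, underscores, and hyphens in tool names, up to 64 characters (a constraint some chat-completion APIs place on function names); the function name and the collision suffix scheme are hypothetical.

```python
import re
from typing import Dict, Iterable

_INVALID = re.compile(r"[^a-zA-Z0-9_-]")
_MAX_LEN = 64  # assumed limit on tool-name length


def build_tool_name_map(original_names: Iterable[str]) -> Dict[str, str]:
    """Map model-safe tool names back to the original MCP tool names."""
    safe_to_original: Dict[str, str] = {}
    for name in original_names:
        base = _INVALID.sub("_", name)[:_MAX_LEN]
        safe, n = base, 1
        while safe in safe_to_original:
            # Two originals sanitized to the same name; add a numeric suffix.
            n += 1
            suffix = f"_{n}"
            safe = base[: _MAX_LEN - len(suffix)] + suffix
        safe_to_original[safe] = name
    return safe_to_original


names = build_tool_name_map(["github.search/issues", "github search issues"])
# On dispatch, the name the model called is resolved back to the original.
print(names["github_search_issues_2"])
```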

tg-pr-agent Bot commented Jul 1, 2026


PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 5 🔵🔵🔵🔵🔵
🧪 PR contains tests
🔒 Security concerns

Sensitive information exposure:
_agent_error_text includes raw exception messages in user-facing responses for superadmins. Even privileged chat responses can be persisted, copied, or exposed through history/trace workflows, and upstream LLM/provider exceptions may contain internal URLs, configuration details, request payload fragments, or credentials. Prefer returning a trace/request ID and storing sanitized details in protected logs.

⚡ Recommended focus areas for review

Performance Risk

The migration status endpoint now performs LLM-based prompt conflict reviews for every split prompt override. If this endpoint is polled by the UI, it can become slow, costly, quota-sensitive, and nondeterministic. Consider making the LLM review explicit, cached, asynchronous, or limited to local deterministic checks in status calls.

review_svc = get_llm_service(get_chat_config(graphname))
graph_prompt_dir = os.path.join(
    "configs", "graph_configs", graphname, "prompts"
)
for fname in LLM_Model._SPLIT_PROMPT_SPEC:
    p = os.path.join(graph_prompt_dir, fname)
    if not os.path.exists(p):
        continue
    try:
        raw = open(p, encoding="utf-8").read()
    except Exception:
        continue
    sys_attr, _ = LLM_Model._SPLIT_PROMPT_SPEC[fname]
    if review_svc._is_legacy_full_prompt(raw, getattr(review_svc, sys_attr)):
        prompt_issues[fname] = {"legacy_full_prompt": True}
        continue
    placeholders = find_placeholders(raw)
    review = review_svc.review_user_portion_llm(fname, raw)
    if placeholders or review.get("has_conflict"):
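
One way to act on the caching suggestion above is to key the LLM review on the prompt file's content, so a polled status call repeats the model call only when the file changes. This is a sketch under assumptions: `CachedReviewer` is a hypothetical wrapper, and it only presumes that the wrapped function takes `(fname, raw)` and returns a dict, as `review_user_portion_llm` appears to in the excerpt.

```python
import hashlib
from typing import Callable, Dict, Tuple


class CachedReviewer:
    """Memoize prompt reviews by (file name, content hash). Hypothetical."""

    def __init__(self, review_fn: Callable[[str, str], dict]) -> None:
        self._review_fn = review_fn
        self._cache: Dict[Tuple[str, str], dict] = {}

    def review(self, fname: str, raw: str) -> dict:
        key = (fname, hashlib.sha256(raw.encode("utf-8")).hexdigest())
        if key not in self._cache:
            # Only unseen content reaches the (slow, costly) LLM review.
            self._cache[key] = self._review_fn(fname, raw)
        return self._cache[key]


calls = []


def fake_review(fname: str, raw: str) -> dict:
    calls.append(fname)
    return {"has_conflict": False}


reviewer = CachedReviewer(fake_review)
reviewer.review("a.txt", "same")
reviewer.review("a.txt", "same")     # served from the cache
reviewer.review("a.txt", "changed")  # content changed, reviewed again
print(len(calls))
```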
Compatibility Risk

_chat_agent passes the same menu value as both supportai_retriever and agent_style. Existing clients that send classic retriever values via rag_pattern without mode=classic may now route through the agentic default and provide invalid agent styles such as hybrid or contextual. Validate/normalize the value after resolving the engine mode.

value = value or "auto"
return make_agent(
    graphname, conn, use_cypher, ws=ws, mode=mode,
    supportai_retriever=value, agent_style=value,
)
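
The normalization the reviewer asks for could take roughly the following shape: resolve the engine first, then decide whether the menu value is an agent style or a classic retriever. The style names and the helper are assumptions for illustration; the actual values accepted by `make_agent` are not shown in the excerpt.

```python
from typing import Optional, Tuple

AGENT_STYLES = {"auto", "planned", "reactive"}  # assumed style identifiers


def split_menu_value(mode: Optional[str],
                     value: Optional[str]) -> Tuple[str, str, str]:
    """Return (mode, agent_style, supportai_retriever) from one menu value."""
    value = value or "auto"
    if mode is None:
        # No engine given: a value that is not an agent style must be a
        # classic retriever, so route it to the classic engine.
        mode = "agent" if value in AGENT_STYLES else "classic"
    if mode == "agent":
        style = value if value in AGENT_STYLES else "auto"
        return mode, style, "auto"
    return mode, "auto", value


print(split_menu_value(None, "hybrid"))
```

Keeping the two parameters separate after this step means a classic retriever name such as `hybrid` never reaches the agent as a style.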
Sensitive Details

Raw exception text is returned in the chat response for superadmins. Provider and backend exceptions may include sensitive configuration, URLs, request fragments, or credentials and may also be persisted in chat history. Consider sanitizing or using a trace ID with details only in protected logs.

if is_superadmin and error_msg:
    return f"{generic}\n\n(Admin detail: {error_msg})"
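
A sketch of the alternative the reviewer suggests is shown below: log the exception server-side under a generated reference ID and return only that ID to superadmins. The function name, logger name, and message wording are assumptions; only the standard-library `logging` and `uuid` modules are used.

```python
import logging
import uuid

logger = logging.getLogger("chat.errors")  # hypothetical logger name


def agent_error_text(generic: str, exc: BaseException, is_superadmin: bool) -> str:
    """Return a user-facing error message that never contains exception text."""
    ref = uuid.uuid4().hex[:12]
    # The full exception and traceback go to the protected server log only.
    logger.error("chat failure ref=%s", ref, exc_info=exc)
    if is_superadmin:
        return f"{generic}\n\n(Reference ID: {ref})"
    return generic


msg = agent_error_text("Something went wrong.", ValueError("token=abc123"), True)
print(msg)
```

An admin can then look up the reference ID in the server logs, while persisted chat history contains no provider or configuration details.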

- Demonstrate the agentic chat engine in the question-answering example
- Correct hybrid search to the vertex types that carry vector indexes
- Update the demo notebook to current search method names and graph setup

Refs: GML-2103
# Conflicts:
#	common/requirements.txt
#	ecc/app/graphrag/graph_rag.py
#	graphrag/Dockerfile
- Omit the temperature setting for models that reject it, so o-series
  models can be selected without failing the connection test
- Gate the Agentic chat options on a tool-calling-capable model: warn on the
  config page and at startup, and fall back to the classic engine otherwise
- Route a classic retriever selection correctly when no engine is specified,
  instead of misapplying it as an agentic style
- Give admins a reference id on chat errors instead of raw exception text,
  keeping the detail in the protected server logs
- Make the Migration Assistant check fast and deterministic, and repair only
  the queries it reports, installing them by name rather than reinstalling all
- Surface a clear, compressed reason when a repair fails and show the first
  few errors

Refs: GML-2103, GML-2109, GML-2161
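
The temperature fix in the list above (omit the setting for models that reject it) might be implemented along these lines. The name-prefix check is an assumption made for this sketch; the PR may detect such models differently, for example through its capability detection.

```python
import re
from typing import Any, Dict, Optional

# Assumed heuristic: o-series model names start with "o" followed by a digit.
_NO_TEMPERATURE = re.compile(r"^o\d")


def build_model_kwargs(model_name: str,
                       temperature: Optional[float] = None) -> Dict[str, Any]:
    """Build chat-model arguments, dropping temperature where it is rejected."""
    kwargs: Dict[str, Any] = {"model": model_name}
    if temperature is not None and not _NO_TEMPERATURE.match(model_name):
        kwargs["temperature"] = temperature
    return kwargs


print(build_model_kwargs("o3-mini", 0.0))
print(build_model_kwargs("gpt-4o", 0.0))
```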
- Describe the agentic (planned/reactive) and classic chat engines and how
  to configure and customize them
- Update the login and RAG configuration screenshots

Refs: GML-2103
@chengbiao-jin chengbiao-jin merged commit a1d54b4 into main Jul 2, 2026
1 check failed
@chengbiao-jin chengbiao-jin deleted the release_2.0.0 branch July 2, 2026 16:17

2 participants