Add read-only Forge session MCP server #87

Open
eshulman2 wants to merge 1 commit into forge-sdlc:main from eshulman2:feature/session-mcp-data-layer

@eshulman2 eshulman2 commented Jun 18, 2026

Collaborator

Summary

  • add a read-only Forge session summary service backed by checkpoint state
  • expose production HTTP endpoint: GET /api/v1/sessions/{ticket_key}/summary
  • expose a user-facing stdio MCP server for session and observability inspection
  • keep forge-session-mcp out of the checked-in mcp-servers.json so Forge agents do not load session-inspection tools themselves
  • add GRAFANA_BASE_URL config so summaries can include dashboard links
  • add user docs for the HTTP endpoint and optional Claude/MCP setup
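
As a rough illustration of how a client might address the summary endpoint listed above, the sketch below builds the request URL with the standard library. Only the path comes from this PR; the host `http://localhost:8000` and the ticket key `FORGE-123` are placeholders chosen for the example.

```python
from urllib.parse import quote, urljoin


def session_summary_url(base_url: str, ticket_key: str) -> str:
    """Build the URL for GET /api/v1/sessions/{ticket_key}/summary."""
    key = ticket_key.strip()
    if not key:
        raise ValueError("ticket_key must not be empty")
    # Percent-encode the key so it is safe to place in a single path segment.
    path = f"api/v1/sessions/{quote(key, safe='')}/summary"
    # Normalize the base so urljoin appends to it instead of replacing its last segment.
    return urljoin(base_url.rstrip("/") + "/", path)


# Hypothetical host and ticket key, for illustration only.
print(session_summary_url("http://localhost:8000", "FORGE-123"))
```

The resulting URL can be passed to any HTTP client; no Redis, Langfuse, Grafana, or ClickHouse credentials are involved on the caller's side.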

Observability Data Layer

  • add Langfuse API-backed observability endpoints under /api/v1/observability/*
  • expose safe aggregate tools for ticket cost/tokens/latency, model usage, workflow funnel, and metadata coverage
  • expose explicit full-trace access through:
    • GET /api/v1/observability/tickets/{ticket_key}/traces
    • GET /api/v1/observability/traces/{trace_id}
    • MCP tools get_session_traces and get_trace
  • avoid direct ClickHouse access from Forge API/MCP; Grafana can still use ClickHouse, but the user data layer goes through Langfuse APIs

Safety

  • users can query Forge API or their own assistant MCP client instead of needing Redis, Langfuse, Grafana, or ClickHouse credentials
  • session summaries and aggregate observability responses intentionally omit raw prompts, model messages, generated artifacts, tool inputs, and raw trace payloads
  • full trace tools are explicit and return full_trace_data_exposed: true
  • HTTP session summaries default to checkpoint-derived summary only; Redis logs are opt-in on the HTTP endpoint and bounded by logs_limit <= 50
  • the MCP server is configured by users in their assistant client, not by Forge's agent MCP configuration
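
The two safety rules above (a bounded `logs_limit` and curated responses that omit raw data) could look roughly like the following. This is a sketch under assumptions, not the code in this PR: the key names in `SENSITIVE_KEYS` are hypothetical stand-ins for the omitted fields, and whether Forge rejects or clamps an oversized `logs_limit` is not stated here, so the sketch simply rejects it.

```python
from typing import Any

MAX_LOGS_LIMIT = 50  # upper bound stated in the PR description

# Hypothetical key names standing in for the raw fields the PR says are omitted.
SENSITIVE_KEYS = {"prompts", "messages", "artifacts", "tool_inputs", "raw_trace"}


def validate_logs_limit(logs_limit: int) -> int:
    """Reject values outside the range 0..MAX_LOGS_LIMIT."""
    if not 0 <= logs_limit <= MAX_LOGS_LIMIT:
        raise ValueError(f"logs_limit must be between 0 and {MAX_LOGS_LIMIT}")
    return logs_limit


def curated_response(state: dict[str, Any]) -> dict[str, Any]:
    """Drop sensitive keys and mark the response as curated."""
    safe = {key: value for key, value in state.items() if key not in SENSITIVE_KEYS}
    safe["raw_trace_data_exposed"] = False
    return safe
```

An allow-list of known-safe keys would be stricter than the deny-list used here; the deny-list only keeps the example short.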

Docs

  • document /api/v1/sessions/{ticket_key}/summary in the API reference
  • document observability aggregate and full-trace endpoints in the API reference
  • add docs/reference/session-inspection.md with HTTP and Claude MCP setup instructions
  • link session inspection from README, docs index, and configuration reference

Tests

  • jq empty mcp-servers.json
  • UV_CACHE_DIR=/tmp/uv-cache uv run ruff check src/forge/observability/access.py src/forge/api/routes/observability.py src/forge/mcp/session.py src/forge/api/routes/__init__.py src/forge/main.py tests/unit/observability/test_access.py tests/unit/api/routes/test_observability.py tests/unit/mcp/test_session_server.py
  • UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/unit/observability/test_access.py tests/unit/api/routes/test_observability.py tests/unit/mcp/test_session_server.py tests/unit/api/routes/test_sessions.py tests/unit/sessions/test_summary.py
  • live gateway smoke checks for health, session summary, observability health, ticket observability, ticket traces, model usage, and workflow funnel

@eshulman2 eshulman2 requested a review from danchild June 18, 2026 14:07
@eshulman2 eshulman2 force-pushed the feature/session-mcp-data-layer branch 2 times, most recently from b3de7fb to f8c4ab3 Compare June 18, 2026 15:17
@eshulman2 eshulman2 force-pushed the feature/session-mcp-data-layer branch from f8c4ab3 to 68dc785 Compare June 21, 2026 05:57
Comment thread docs/reference/config.md
| `GRAFANA_ADMIN_PASSWORD` | Grafana admin password (default: `grafana`) |
| `LANGFUSE_DOCKER_NETWORK` | External Docker/Podman network for self-hosted Langfuse when using `devtools/grafana/compose.langfuse-network.yml` (default: `langfuse_default`) |
- | `CLICKHOUSE_HOST` | Langfuse ClickHouse host reachable from the Grafana container |
+ | `CLICKHOUSE_HOST` | Langfuse ClickHouse host reachable from Grafana |

Contributor

We should drop this line change. It only changes the description of CLICKHOUSE_HOST, and it's misleading.


| Endpoint | Purpose |
|----------|---------|
| `/api/v1/observability/tickets/{ticket_key}` | Ticket cost, tokens, latency, workflow steps, and recent observation metadata |

Contributor

I think there is value in both the /api/v1/sessions/{session_id}/summary and the /api/v1/observability/tickets endpoints: each provides useful information and context. However, there is a semantic clash between the sessions and tickets path segments. Langfuse sessions and Jira tickets are effectively the same thing in Forge, and having different segments creates ambiguity. We need to rethink the routing here.

raise ValueError("trace_id must not be empty")

client = get_langfuse_client()
trace = await _call_langfuse("trace.get", client.async_api.trace.get, normalized)

@danchild danchild Jul 2, 2026

Contributor

We should pass trace_id as a keyword argument in the call to _call_langfuse, to be consistent with get_session_traces().

Comment thread src/forge/mcp/session.py
return {"error": str(exc), "raw_trace_data_exposed": False}

@mcp.tool(
name="get_session_traces",

Contributor

Depending on the size of the payload, get_session_traces and get_trace result in an error on the local MCP client (tested with Claude Code). The error seems to stem either from a ~30MB payload ceiling on Google Vertex AI or from exceeding the context window, which causes the MCP client to drop its connection to the Forge MCP server. We need to verify whether this is the true cause before settling on a solution. More importantly, we need to rethink how we expose these traces to MCP clients, since the payloads are large even for single traces.
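
One possible direction, sketched under assumptions because the root cause is not yet confirmed, is to measure the serialized size of each trace and stop adding traces once a byte budget is reached. `MAX_RESPONSE_BYTES` is a hypothetical budget, not a measured client limit, and the response envelope fields (`returned`, `total`, `truncated`) are illustrative names.

```python
import json
from typing import Any

# Hypothetical budget; the real limit of any given MCP client is not established here.
MAX_RESPONSE_BYTES = 1_000_000


def payload_size(payload: Any) -> int:
    """Return the UTF-8 size in bytes of the JSON-serialized payload."""
    return len(json.dumps(payload, default=str).encode("utf-8"))


def bounded_traces(
    traces: list[dict[str, Any]],
    max_bytes: int = MAX_RESPONSE_BYTES,
) -> dict[str, Any]:
    """Include traces in order until the byte budget would be exceeded."""
    included: list[dict[str, Any]] = []
    used = 0
    for trace in traces:
        size = payload_size(trace)
        if used + size > max_bytes:
            break
        included.append(trace)
        used += size
    return {
        "traces": included,
        "returned": len(included),
        "total": len(traces),
        "truncated": len(included) < len(traces),
    }
```

The budget counts only the traces themselves, so the final response is slightly larger than `max_bytes` once the envelope is added. A single oversized trace would still be dropped entirely, so `get_trace` would need a different approach, such as returning selected fields.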

`cwd` to the local Forge repository path:

```json
{
```

Contributor

The documentation was enough for me to figure out how to define an MCP server in OpenCode. This is how it works with OpenCode:

"forge-session": { "type": "local", "command": ["uv", "run", "forge-session-mcp"], "cwd": "/path/to/forge", "enabled": true }

I haven't used MCP clients other than Claude and OpenCode, so I don't know whether using the local type or providing args in the command list is more typical. I'm providing this example for information only, in case it helps clarify the documentation.

metadata = _metadata(trace)
step = str(metadata.get("workflow_step") or "unknown")
cost = _number(_get_attr(trace, "total_cost", "totalCost"))
input_tokens = _number(_get_attr(trace, "input_tokens", "inputTokens"))

@danchild danchild Jul 2, 2026

Contributor

input_tokens, output_tokens, and total_tokens are not being set properly. Langfuse doesn't provide these figures in the traces API. However, they can be queried using the observations or metrics API. See get_model_usage() for an example.
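
As a sketch of the aggregation this implies, the helper below sums token counts across observation records instead of reading them from the trace. The record shape is an assumption: each observation is taken to be a mapping with an optional `usage` mapping holding `input`, `output`, and `total` counts. The actual field names should be checked against what get_model_usage() reads from the Langfuse response.

```python
from typing import Any, Iterable, Mapping


def sum_token_usage(observations: Iterable[Mapping[str, Any]]) -> dict[str, int]:
    """Sum token counts across observations, treating missing values as zero."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for obs in observations:
        # Hypothetical shape: usage may be absent or None on some observations.
        usage = obs.get("usage") or {}
        totals["input_tokens"] += int(usage.get("input") or 0)
        totals["output_tokens"] += int(usage.get("output") or 0)
        totals["total_tokens"] += int(usage.get("total") or 0)
    return totals
```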

limit=limit,
order_by="timestamp.desc",
fields="core,metrics,io",
)

@danchild danchild Jul 2, 2026

Contributor

fields="core,metrics,io" creates an error on the Langfuse backend here because we are not passing in a session_id.

Example error:

"detail": "Langfuse trace.list failed: ...

| `get_session_traces` | Tool that returns Langfuse traces for one Jira ticket session; full trace data by default |
| `get_trace` | Tool that returns one full Langfuse trace by trace id |
| `get_model_usage` | Tool that returns aggregate model calls, cost, tokens, and latency |
| `get_workflow_funnel` | Tool that returns workflow-step issue, trace, cost, and latency aggregates |

Contributor

get_workflow_funnel is currently implemented without filtering by session. It therefore returns a list of steps for all sessions, which doesn't sound useful on the surface. Can you clarify the intent of this endpoint?
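
For discussion, a per-session variant could look like the sketch below. It is not the existing implementation: traces are modeled as plain mappings with hypothetical `session_id`, `metadata`, and `total_cost` keys, and the `workflow_step` lookup with its `"unknown"` fallback mirrors the excerpt reviewed earlier in this PR. Leaving `session_id` as `None` keeps the current all-sessions behavior.

```python
from collections import defaultdict
from typing import Any, Iterable, Mapping, Optional


def workflow_funnel(
    traces: Iterable[Mapping[str, Any]],
    session_id: Optional[str] = None,
) -> dict[str, dict[str, float]]:
    """Aggregate trace count and cost per workflow step, optionally for one session."""
    funnel: dict[str, dict[str, float]] = defaultdict(lambda: {"traces": 0, "cost": 0.0})
    for trace in traces:
        # Skip traces from other sessions when a filter is given.
        if session_id is not None and trace.get("session_id") != session_id:
            continue
        metadata = trace.get("metadata") or {}
        step = str(metadata.get("workflow_step") or "unknown")
        funnel[step]["traces"] += 1
        funnel[step]["cost"] += float(trace.get("total_cost") or 0.0)
    return dict(funnel)
```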


The MCP responses include `raw_state_exposed: false` or
`raw_trace_data_exposed: false` for curated responses. Full trace responses use
`full_trace_data_exposed: true`.

Contributor

Based on our research and findings on large trace payloads, we should reconsider how we expose full trace responses (full_trace_data_exposed: true) to make sure they don't create large-payload issues as well.

danchild commented Jul 2, 2026

Contributor

Hi @eshulman2, great work here. This is jam-packed with great features; we just need to refine them. My main concerns involve intent and semantics more than the implementation, namely the routing schema and reorganizing the features and functionality in a more coherent way (see the other comments for details).

In addition to your work, I think it would be useful to add a subsetting mechanism to /api/v1/observability/model-usage, querying it by session. What do you think?

Finally, do we plan to add authz for externally hosted MCP servers in subsequent PRs?

2 participants