arcade-agent

Framework-agnostic tool library for software architecture analysis.

Provides composable tools for parsing source code, recovering architecture, detecting architectural smells, computing quality metrics, and comparing versions. Works standalone or plugs into MCP, LangChain, or Claude SDK.

Install

pip install -e ".[dev]"

# With MCP server support (for AI agent integration)
pip install -e ".[mcp,dev]"

Quick Start

from arcade_agent.tools.ingest import ingest
from arcade_agent.tools.parse import parse
from arcade_agent.tools.recover import recover
from arcade_agent.tools.detect_smells import detect_smells
from arcade_agent.tools.compute_metrics import compute_metrics
from arcade_agent.tools.visualize import visualize

# 1. Ingest a project
repo = ingest("/path/to/java/project")

# 2. Parse dependencies
graph = parse(repo.path, language="java")

# 3. Recover architecture
arch = recover(graph, algorithm="pkg")

# 4. Detect smells
smells = detect_smells(arch, graph)

# 5. Compute metrics
metrics = compute_metrics(arch, graph)

# 6. Generate report
visualize(repo.name, repo.version, graph, arch, smells, output="report.html")

Tools

Tool	Description
`ingest`	Clone/load source code, detect versions, discover files
`parse`	Parse source → DependencyGraph via tree-sitter
`recover`	Recover architecture (PKG, WCA, ACDC, ARC, LIMBO)
`detect_smells`	Find dependency cycles, concern overload, scattered functionality, link overload (heuristic or LLM-powered)
`compute_metrics`	Calculate RCI, TurboMQ, connectivity metrics
`compare`	A2A architecture comparison across versions
`visualize`	Generate HTML reports, DOT, Mermaid, JSON, RSF
`query`	Explore recovered architecture interactively
`summarize`	Codebase overview with package tree, hotspots, entry points; drill-down via `focus`
`explain_component`	Component detail: API surface, dependencies, cohesion
`find_relevant`	Find entities relevant to a natural-language query

Balanced Architecture Score

arcade-agent keeps the original ARCADE-style quality metrics (RCI, TurboMQ, BasicMQ, IntraConnectivity, InterConnectivity, TwoWayPairRatio) and adds an explainable derived score for reporting:

BalancedArchitectureScore =
  0.50 * cohesion_family
+ 0.35 * PrincipleAlignmentScore
+ 0.15 * SmellDiscipline

This score is bounded to [0, 1] and higher is better. It is intended as a readable summary for PR comments and self-analysis reports, not as a replacement for the raw metrics. The original metrics are still emitted and shown so teams can inspect the underlying cohesion and coupling values.

The balanced score exists because raw cohesion/coupling metrics can be hard to interpret in isolation. A change can improve RCI while still creating an unbalanced component, a dependency hub, or new architecture smells. The derived score combines three views:

Cohesion family — existing RCI, TurboMQ, and BasicMQ signals.
Principle alignment — acyclic dependencies, layering health, responsibility focus, interface segregation, component balance, hub balance, boundary clarity, and dependency distribution.
Smell discipline — architectural smell burden weighted by severity and affected-component scope.

Reports also include score_drivers, which identify the strongest and weakest signals behind the score. This makes the result actionable: reviewers can see whether a score moved because of dependency concentration, component imbalance, smell burden, or another architectural pressure.

Supported Languages

Java (full support)
Python (full support)
C/C++ (full support)
TypeScript/JavaScript (stub — contributions welcome)

Example: ARCADE Core

ARCADE Core is a Java-based architecture recovery workbench from USC's Software Architecture Research Group. Running arcade-agent against it:

git clone https://github.com/usc-softarch/arcade_core.git
python examples/basic_analysis.py arcade_core --language java

Results (v1.2.0): 170 entities, 470 edges, 13 components recovered, 7 architectural smells detected (including a 7-component dependency cycle and concern overload in the Clustering module).

See examples/arcade_core_report.html for the full interactive report.

Algorithm Comparison

Compare PKG, ACDC, ARC, and LIMBO recovery algorithms side-by-side on the same project:

python examples/compare_algorithms.py arcade_core --language java --use-llm

See examples/comparison_report.html for the full comparison report.

MCP Server (AI Agent Integration)

arcade-agent exposes all tools via the Model Context Protocol so AI agents (Claude Code, Cursor, etc.) can analyze codebases with minimal token usage.

Start the server

arcade-mcp

Configure in Claude Code

Add to your Claude Code MCP settings:

{
  "mcpServers": {
    "arcade-agent": {
      "command": "arcade-mcp"
    }
  }
}

How it works

Session store — Tools like parse and recover return compact summaries with a session_id. Pass session IDs to downstream tools instead of full data objects.
Token budget — Every tool accepts an optional max_tokens parameter. Outputs are progressively truncated (entity details → edge summaries → component counts) to fit.
Parse caching — Parsed dependency graphs are cached to .arcade-cache/ keyed by file modification times. Repeated analysis of the same codebase skips re-parsing.
On-demand detail — Call get_full_result(session_id) to retrieve complete data when the summary isn't enough.

Example agent workflow

Agent: call parse(source_path="/path/to/project")
       → {session_id: "a1b2c3", num_entities: 170, num_edges: 470, ...}

Agent: call recover(dep_graph="a1b2c3", algorithm="pkg")
       → {session_id: "d4e5f6", num_components: 13, components: [...], ...}

Agent: call detect_smells(architecture="d4e5f6", dep_graph="a1b2c3")
       → {num_smells: 7, smells: [...]}

Agent: call get_full_result(session_id="a1b2c3", max_tokens=2000)
       → full graph data, truncated to fit token budget

LLM-Powered Analysis

Pass --use-llm to enable Claude-powered concern detection. Requires the claude CLI installed and authenticated.

python examples/basic_analysis.py arcade_core --language java --use-llm

This replaces heuristic smell detection (entity count thresholds, suffix matching) with semantic analysis that identifies what concerns each component handles and why they are problematic. Set ARCADE_MOCK=1 to skip LLM calls, or ARCADE_MODEL=haiku to use a faster model.

CI/CD Integration

arcade-agent ships with a GitHub Action that detects architecture drift on every PR — like SonarQube for architecture.

How It Works

On pull request: parses the codebase, recovers architecture (PKG), compares against a stored baseline, and posts a PR comment with a drift report.
On push to main: updates the baseline (.arcade/baseline.json) so future PRs compare against the latest merged state.

Setup

Copy .github/workflows/arch-drift.yml into your repository. The workflow auto-detects the language, or you can set it explicitly via the language workflow input.

# .github/workflows/arch-drift.yml is included in the repo — just enable Actions.

The baseline is stored in .arcade/baseline.json and committed to the repo automatically when changes are pushed to main.

Local Usage

Run the drift detection script locally:

# Analyze without a baseline (first run)
python scripts/arch_diff.py --source /path/to/project --language java

# Update the baseline
python scripts/arch_diff.py --source /path/to/project --language java --update-baseline

# Compare against existing baseline
python scripts/arch_diff.py --source /path/to/project --language java

PR Comment Format

The action posts a comment with:

Drift table — component count, similarity score, balanced score, principle alignment, RCI, TurboMQ, and supporting metric deltas
Changes — added/removed components, entity movements, splits/merges
Smells — dependency cycles, concern overload, scattered functionality

The comment is updated on each push to the PR (not duplicated).

Roadmap

arcade-agent ports and extends the capabilities of the original ARCADE Java workbench, and is evolving into a token-efficient codebase understanding layer for AI agents. See ROADMAP.md for the full AI agent integration roadmap.

Feature	Status	Details
5 recovery algorithms (PKG, WCA, ACDC, ARC, LIMBO)	Done	Package-based, weighted clustering, pattern-based, LLM concern-based, information-theoretic
4 smell types (BDC, BCO, SPF, BUO)	Done	Heuristic + LLM-powered detection
6 quality metrics	Done	RCI, TurboMQ, BasicMQ, IntraConnectivity, InterConnectivity, TwoWayPairRatio
Balanced architecture score	Done	Derived reporting score combining core metrics, principle signals, and smell burden
A2A architecture comparison	Done	Hungarian algorithm on Jaccard similarity
Multi-language parsing	Done	Java, Python, C/C++ (full), TypeScript (stub)
5 export formats	Done	HTML, DOT, JSON, RSF, Mermaid
LLM concern extraction	Done	Claude CLI for semantic BCO/SPF detection
MCP server	Done	Expose tools to AI agents via Model Context Protocol with session store
Token-budget truncation	Done	Progressive output reduction to fit agent context windows
Parse result caching	Done	Mtime-based cache avoids re-parsing unchanged codebases
Codebase summarization	Done	Token-efficient overview with package tree, hotspots, hierarchical drill-down
Component explanation	Done	API surface, dependencies, cohesion metrics for recovered components
Relevance search	Done	Keyword-based entity search with architecture-aware boosting
Multi-version evolution pipeline	Planned	Batch version history analysis, A2A cost trends, CVG over time
Flexible stopping criteria	Planned	`no_orphans`, `size_fraction` strategies for WCA/ARC/LIMBO
Additional similarity measures	Planned	UEMNM (normalized UEM) and InfoLoss
Architectural Stability metric	Planned	Fan-in/fan-out ratio
MCFP-based comparison	Planned	Minimum Cost Flow for accurate entity movement cost
Design decision recovery (RecovAr)	Planned	Link issue trackers to architectural changes

Comparison with ARCADE Core

arcade-agent is a Python successor to the original ARCADE Core Java workbench. The table below compares capabilities across both projects.

High Value

Feature	ARCADE Core (Java)	arcade-agent (Python)	Notes
LIMBO algorithm	Full	Done (LLM-powered)	Uses Claude CLI concern vectors + size-weighted JS divergence
ARC algorithm	Full (concern-based)	Done (LLM-powered)	Uses Claude CLI concern vectors + JS divergence instead of MALLET topics
Topic modeling (MALLET)	Full (50 topics, 250 iterations)	LLM-based	arcade-agent uses Claude CLI instead of MALLET for semantic concern analysis
Evolution metrics (A2A cost, CVG)	MCFP-based movement cost, coverage	Basic Jaccard comparison	Core computes actual entity movement costs and bidirectional coverage
Multi-version batch analysis	VersionMap, VersionTree, batch processing	Single-pair compare	Core can process entire version histories and track trends
Stopping/serialization criteria	3 stopping + 4 serialization strategies	Hardcoded target cluster count	Flexible termination (no-orphans, size-fraction) would improve clustering
Similarity measures	11 (UEMNM, InfoLoss, WeightedJS, ARC variants)	3 (JS, UEM, SCM)	More measures = better tuning per project type

Medium Value

Feature	ARCADE Core (Java)	arcade-agent (Python)	Notes
Architectural Stability metric	Fan-in/fan-out ratio	Missing	Simple addition to existing 6 metrics
Concern-based smell detection	Topic distributions for BCO and SPF	LLM-powered (Claude CLI)	Heuristic fallback also available
Cluster matching (MCFP)	Minimum Cost Flow for movement cost	Hungarian algorithm on Jaccard	MCFP gives more accurate evolution cost
ODEM input format	XML-based dependency parsing	Missing	Academic interchange format, limited real-world use
SmellToIssuesCorrelation	Correlates smells with issue tracker data	Missing	Requires issue tracker integration

Lower Priority

Feature	ARCADE Core (Java)	arcade-agent (Python)	Notes
RecovAr (Design Decision Recovery)	Full engine (GitLab issues/commits)	Missing	Large scope research feature
Issue tracker integration	JIRA + GitLab REST clients	Missing	Needed for RecovAr or SmellToIssues
Swing GUI	Full desktop visualization	HTML reports + CLI	HTML/Mermaid is more modern
Classycle bytecode analysis	Java bytecode dependency extraction	tree-sitter source parsing	tree-sitter is arguably better (no compilation needed)
Make dependency / Understand CSV	C-specific input formats	Missing	Niche; tree-sitter C parser covers the core need

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
.arcade		.arcade
.github		.github
examples		examples
scripts		scripts
src/arcade_agent		src/arcade_agent
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CLAUDE.md		CLAUDE.md
Makefile		Makefile
README.md		README.md
ROADMAP.md		ROADMAP.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

arcade-agent

Install

Quick Start

Tools

Balanced Architecture Score

Supported Languages

Example: ARCADE Core

Algorithm Comparison

MCP Server (AI Agent Integration)

Start the server

Configure in Claude Code

How it works

Example agent workflow

LLM-Powered Analysis

CI/CD Integration

How It Works

Setup

Local Usage

PR Comment Format

Roadmap

Comparison with ARCADE Core

High Value

Medium Value

Lower Priority

About

Uh oh!

Releases 1

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

arcade-agent

Install

Quick Start

Tools

Balanced Architecture Score

Supported Languages

Example: ARCADE Core

Algorithm Comparison

MCP Server (AI Agent Integration)

Start the server

Configure in Claude Code

How it works

Example agent workflow

LLM-Powered Analysis

CI/CD Integration

How It Works

Setup

Local Usage

PR Comment Format

Roadmap

Comparison with ARCADE Core

High Value

Medium Value

Lower Priority

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages