Framework-agnostic tool library for software architecture analysis.
Provides composable tools for parsing source code, recovering architecture, detecting architectural smells, computing quality metrics, and comparing versions. Works standalone or plugs into MCP, LangChain, or Claude SDK.
```bash
pip install -e ".[dev]"

# With MCP server support (for AI agent integration)
pip install -e ".[mcp,dev]"
```

```python
from arcade_agent.tools.ingest import ingest
from arcade_agent.tools.parse import parse
from arcade_agent.tools.recover import recover
from arcade_agent.tools.detect_smells import detect_smells
from arcade_agent.tools.compute_metrics import compute_metrics
from arcade_agent.tools.visualize import visualize

# 1. Ingest a project
repo = ingest("/path/to/java/project")

# 2. Parse dependencies
graph = parse(repo.path, language="java")

# 3. Recover architecture
arch = recover(graph, algorithm="pkg")

# 4. Detect smells
smells = detect_smells(arch, graph)

# 5. Compute metrics
metrics = compute_metrics(arch, graph)

# 6. Generate report
visualize(repo.name, repo.version, graph, arch, smells, output="report.html")
```

| Tool | Description |
|---|---|
| `ingest` | Clone/load source code, detect versions, discover files |
| `parse` | Parse source → DependencyGraph via tree-sitter |
| `recover` | Recover architecture (PKG, WCA, ACDC, ARC, LIMBO) |
| `detect_smells` | Find dependency cycles, concern overload, scattered functionality, link overload (heuristic or LLM-powered) |
| `compute_metrics` | Calculate RCI, TurboMQ, connectivity metrics |
| `compare` | A2A architecture comparison across versions |
| `visualize` | Generate HTML reports, DOT, Mermaid, JSON, RSF |
| `query` | Explore recovered architecture interactively |
| `summarize` | Codebase overview with package tree, hotspots, entry points; drill-down via focus |
| `explain_component` | Component detail: API surface, dependencies, cohesion |
| `find_relevant` | Find entities relevant to a natural-language query |
arcade-agent keeps the original ARCADE-style quality metrics (RCI, TurboMQ,
BasicMQ, IntraConnectivity, InterConnectivity, TwoWayPairRatio) and adds
an explainable derived score for reporting:
```text
BalancedArchitectureScore =
    0.50 * cohesion_family
  + 0.35 * PrincipleAlignmentScore
  + 0.15 * SmellDiscipline
```
This score is bounded to [0, 1] and higher is better. It is intended as a
readable summary for PR comments and self-analysis reports, not as a replacement
for the raw metrics. The original metrics are still emitted and shown so teams
can inspect the underlying cohesion and coupling values.
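
As a minimal sketch of the weighted sum above, assuming each of the three inputs has already been normalized to [0, 1]: the function name `balanced_architecture_score` and the clamping step are illustrative assumptions, not arcade-agent's actual API, and how `cohesion_family` is aggregated from the raw metrics is not shown.

```python
def balanced_architecture_score(cohesion_family: float,
                                principle_alignment: float,
                                smell_discipline: float) -> float:
    """Combine three [0, 1] signals using the documented weights."""

    def clamp(x: float) -> float:
        # Assumption: out-of-range inputs are clipped rather than rejected.
        return max(0.0, min(1.0, x))

    score = (0.50 * clamp(cohesion_family)
             + 0.35 * clamp(principle_alignment)
             + 0.15 * clamp(smell_discipline))
    return clamp(score)


# Hypothetical inputs, chosen only to illustrate the arithmetic:
# 0.50 * 0.8 + 0.35 * 0.6 + 0.15 * 0.9 = 0.745
print(round(balanced_architecture_score(0.8, 0.6, 0.9), 3))
```

Because the weights sum to 1.0 and every input lies in [0, 1], the result stays in [0, 1], which matches the bound stated above.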
The balanced score exists because raw cohesion/coupling metrics can be hard to
interpret in isolation. A change can improve RCI while still creating an
unbalanced component, a dependency hub, or new architecture smells. The derived
score combines three views:
- Cohesion family: existing `RCI`, `TurboMQ`, and `BasicMQ` signals.
- Principle alignment: acyclic dependencies, layering health, responsibility focus, interface segregation, component balance, hub balance, boundary clarity, and dependency distribution.
- Smell discipline: architectural smell burden weighted by severity and affected-component scope.
Reports also include score_drivers, which identify the strongest and weakest
signals behind the score. This makes the result actionable: reviewers can see
whether a score moved because of dependency concentration, component imbalance,
smell burden, or another architectural pressure.
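
One way to picture `score_drivers` is the sketch below. It assumes each signal is a named value in [0, 1] where higher is better; the signal names, the `pick_score_drivers` helper, and the returned shape are hypothetical and may not match the report's real schema.

```python
def pick_score_drivers(signals: dict[str, float], top_n: int = 2) -> dict[str, list[str]]:
    """Return the names of the highest- and lowest-valued signals."""
    ranked = sorted(signals, key=signals.get, reverse=True)
    return {
        "strongest": ranked[:top_n],
        # Reverse so the weakest signal is listed first.
        "weakest": ranked[-top_n:][::-1],
    }


# Made-up signal values for illustration only.
signals = {
    "acyclic_dependencies": 0.95,
    "component_balance": 0.40,
    "hub_balance": 0.55,
    "smell_discipline": 0.80,
}
drivers = pick_score_drivers(signals)
print(drivers)
```

With these inputs a reviewer would see component balance and hub balance as the weakest signals, which is the kind of pointer the paragraph above describes.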
- Java (full support)
- Python (full support)
- C/C++ (full support)
- TypeScript/JavaScript (stub — contributions welcome)
ARCADE Core is a Java-based architecture recovery workbench from USC's Software Architecture Research Group. Running arcade-agent against it:

```bash
git clone https://github.com/usc-softarch/arcade_core.git
python examples/basic_analysis.py arcade_core --language java
```

Results (v1.2.0): 170 entities, 470 edges, 13 components recovered, 7 architectural smells detected (including a 7-component dependency cycle and concern overload in the Clustering module).

See `examples/arcade_core_report.html` for the full interactive report.

Compare PKG, ACDC, ARC, and LIMBO recovery algorithms side-by-side on the same project:

```bash
python examples/compare_algorithms.py arcade_core --language java --use-llm
```

See `examples/comparison_report.html` for the full comparison report.
arcade-agent exposes all tools via the Model Context Protocol so AI agents (Claude Code, Cursor, etc.) can analyze codebases with minimal token usage.
```bash
arcade-mcp
```

Add to your Claude Code MCP settings:

```json
{
  "mcpServers": {
    "arcade-agent": {
      "command": "arcade-mcp"
    }
  }
}
```

- Session store: Tools like `parse` and `recover` return compact summaries with a `session_id`. Pass session IDs to downstream tools instead of full data objects.
- Token budget: Every tool accepts an optional `max_tokens` parameter. Outputs are progressively truncated (entity details → edge summaries → component counts) to fit.
- Parse caching: Parsed dependency graphs are cached to `.arcade-cache/`, keyed by file modification times. Repeated analysis of the same codebase skips re-parsing.
- On-demand detail: Call `get_full_result(session_id)` to retrieve complete data when the summary isn't enough.
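
The parse-caching idea can be sketched as follows. This is a minimal illustration that assumes the cache key is a hash over each source file's relative path and modification time; the helper name `cache_key`, the suffix filter, and the choice of SHA-256 are assumptions, and arcade-agent's actual `.arcade-cache/` layout may differ.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def cache_key(root: str, suffixes: tuple[str, ...] = (".java", ".py")) -> str:
    """Hash the relative paths and mtimes of matching files under root."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root).as_posix()
            digest.update(f"{rel}:{path.stat().st_mtime_ns}\n".encode())
    return digest.hexdigest()


# Demo on a throwaway directory: the key changes only when an mtime changes.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "Main.java"
    src.write_text("class Main {}")
    before = cache_key(tmp)
    unchanged = cache_key(tmp)
    stat = src.stat()
    # Simulate an edit by moving the modification time forward one second.
    os.utime(src, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    after = cache_key(tmp)
    print(before == unchanged, before == after)
```

A key that matches a stored entry means the cached graph can be reused; any touched file produces a new key and forces a re-parse.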
```text
Agent: call parse(source_path="/path/to/project")
  → {session_id: "a1b2c3", num_entities: 170, num_edges: 470, ...}

Agent: call recover(dep_graph="a1b2c3", algorithm="pkg")
  → {session_id: "d4e5f6", num_components: 13, components: [...], ...}

Agent: call detect_smells(architecture="d4e5f6", dep_graph="a1b2c3")
  → {num_smells: 7, smells: [...]}

Agent: call get_full_result(session_id="a1b2c3", max_tokens=2000)
  → full graph data, truncated to fit token budget
```
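
The session-store and token-budget pattern behind this workflow might look roughly like the sketch below. Everything in it is illustrative: `SESSIONS`, `store`, `fit_to_budget`, the four-characters-per-token estimate, and the field names are assumptions, not the server's real implementation.

```python
import json
import uuid

SESSIONS: dict[str, dict] = {}  # hypothetical in-memory session store

# Fields reduced first when the output is too large (most detailed first).
DROP_ORDER = ["entities", "edges", "components"]


def store(result: dict) -> str:
    """Keep the full result server-side and hand back a short ID."""
    session_id = uuid.uuid4().hex[:6]
    SESSIONS[session_id] = result
    return session_id


def estimate_tokens(payload: dict) -> int:
    """Crude estimate: roughly four characters per token."""
    return len(json.dumps(payload)) // 4


def fit_to_budget(result: dict, max_tokens: int) -> dict:
    """Replace detail fields with placeholders, one at a time, until it fits."""
    trimmed = dict(result)  # shallow copy; the stored result is untouched
    for field in DROP_ORDER:
        if estimate_tokens(trimmed) <= max_tokens:
            break
        if field in trimmed:
            trimmed[field] = f"<{len(result[field])} omitted>"
            trimmed["truncated"] = True
    return trimmed


full = {
    "num_entities": 3,
    "entities": [{"name": f"com.example.Class{i}", "kind": "class"} for i in range(3)],
    "edges": [["com.example.Class0", "com.example.Class1"]],
}
sid = store(full)
summary = fit_to_budget(SESSIONS[sid], max_tokens=40)
print(sid, summary)
```

The agent only ever carries the short ID and the trimmed summary in its context, while the full object stays available for a later on-demand lookup.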
Pass `--use-llm` to enable Claude-powered concern detection. Requires the `claude` CLI installed and authenticated.

```bash
python examples/basic_analysis.py arcade_core --language java --use-llm
```

This replaces heuristic smell detection (entity count thresholds, suffix matching) with semantic analysis that identifies what concerns each component handles and why they are problematic. Set `ARCADE_MOCK=1` to skip LLM calls, or `ARCADE_MODEL=haiku` to use a faster model.
arcade-agent ships with a GitHub Action that detects architecture drift on every PR — like SonarQube for architecture.
- On pull request: parses the codebase, recovers architecture (PKG), compares against a stored baseline, and posts a PR comment with a drift report.
- On push to main: updates the baseline (`.arcade/baseline.json`) so future PRs compare against the latest merged state.
Copy .github/workflows/arch-drift.yml into your repository. The workflow auto-detects the language, or you can set it explicitly via the language workflow input.
```yaml
# .github/workflows/arch-drift.yml is included in the repo; just enable Actions.
```

The baseline is stored in `.arcade/baseline.json` and committed to the repo automatically when changes are pushed to main.
Run the drift detection script locally:
```bash
# Analyze without a baseline (first run)
python scripts/arch_diff.py --source /path/to/project --language java

# Update the baseline
python scripts/arch_diff.py --source /path/to/project --language java --update-baseline

# Compare against existing baseline
python scripts/arch_diff.py --source /path/to/project --language java
```

The action posts a comment with:
- Drift table — component count, similarity score, balanced score, principle alignment, RCI, TurboMQ, and supporting metric deltas
- Changes — added/removed components, entity movements, splits/merges
- Smells — dependency cycles, concern overload, scattered functionality
The comment is updated on each push to the PR (not duplicated).
arcade-agent ports and extends the capabilities of the original ARCADE Java workbench, and is evolving into a token-efficient codebase understanding layer for AI agents. See ROADMAP.md for the full AI agent integration roadmap.
| Feature | Status | Details |
|---|---|---|
| 5 recovery algorithms (PKG, WCA, ACDC, ARC, LIMBO) | Done | Package-based, weighted clustering, pattern-based, LLM concern-based, information-theoretic |
| 4 smell types (BDC, BCO, SPF, BUO) | Done | Heuristic + LLM-powered detection |
| 6 quality metrics | Done | RCI, TurboMQ, BasicMQ, IntraConnectivity, InterConnectivity, TwoWayPairRatio |
| Balanced architecture score | Done | Derived reporting score combining core metrics, principle signals, and smell burden |
| A2A architecture comparison | Done | Hungarian algorithm on Jaccard similarity |
| Multi-language parsing | Done | Java, Python, C/C++ (full), TypeScript (stub) |
| 5 export formats | Done | HTML, DOT, JSON, RSF, Mermaid |
| LLM concern extraction | Done | Claude CLI for semantic BCO/SPF detection |
| MCP server | Done | Expose tools to AI agents via Model Context Protocol with session store |
| Token-budget truncation | Done | Progressive output reduction to fit agent context windows |
| Parse result caching | Done | Mtime-based cache avoids re-parsing unchanged codebases |
| Codebase summarization | Done | Token-efficient overview with package tree, hotspots, hierarchical drill-down |
| Component explanation | Done | API surface, dependencies, cohesion metrics for recovered components |
| Relevance search | Done | Keyword-based entity search with architecture-aware boosting |
| Multi-version evolution pipeline | Planned | Batch version history analysis, A2A cost trends, CVG over time |
| Flexible stopping criteria | Planned | no_orphans, size_fraction strategies for WCA/ARC/LIMBO |
| Additional similarity measures | Planned | UEMNM (normalized UEM) and InfoLoss |
| Architectural Stability metric | Planned | Fan-in/fan-out ratio |
| MCFP-based comparison | Planned | Minimum Cost Flow for accurate entity movement cost |
| Design decision recovery (RecovAr) | Planned | Link issue trackers to architectural changes |
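
The "Hungarian algorithm on Jaccard similarity" row can be illustrated with a small sketch. It assumes each component is a set of entity names and, to stay dependency-free, finds the best one-to-one matching by brute force over permutations; the Hungarian algorithm produces the same optimal assignment in polynomial time. The function names and the averaging of matched similarities are assumptions for illustration, not arcade-agent's actual A2A formula.

```python
from itertools import permutations


def jaccard(a: set[str], b: set[str]) -> float:
    """Intersection size over union size; 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def best_matching(old: list[set[str]],
                  new: list[set[str]]) -> tuple[list[tuple[int, int]], float]:
    """Pair old components with new ones to maximize total Jaccard similarity.

    Brute force is only practical for a handful of components.
    Assumes len(old) <= len(new) for simplicity.
    """
    best_pairs: list[tuple[int, int]] = []
    best_total = -1.0
    for perm in permutations(range(len(new)), len(old)):
        total = sum(jaccard(old[i], new[j]) for i, j in enumerate(perm))
        if total > best_total:
            best_pairs, best_total = list(enumerate(perm)), total
    # Unmatched components count as zero similarity in the average.
    return best_pairs, best_total / max(len(old), len(new), 1)


# Two made-up versions: one component unchanged, one gained an entity.
v1 = [{"Parser", "Lexer"}, {"Report", "Chart"}]
v2 = [{"Report", "Chart", "Export"}, {"Parser", "Lexer"}]
pairs, similarity = best_matching(v1, v2)
print(pairs, round(similarity, 3))
```

Here the unchanged component matches with similarity 1.0 and the grown one with 2/3, so the averaged score is about 0.833.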
arcade-agent is a Python successor to the original ARCADE Core Java workbench. The table below compares capabilities across both projects.
| Feature | ARCADE Core (Java) | arcade-agent (Python) | Notes |
|---|---|---|---|
| LIMBO algorithm | Full | Done (LLM-powered) | Uses Claude CLI concern vectors + size-weighted JS divergence |
| ARC algorithm | Full (concern-based) | Done (LLM-powered) | Uses Claude CLI concern vectors + JS divergence instead of MALLET topics |
| Topic modeling (MALLET) | Full (50 topics, 250 iterations) | LLM-based | arcade-agent uses Claude CLI instead of MALLET for semantic concern analysis |
| Evolution metrics (A2A cost, CVG) | MCFP-based movement cost, coverage | Basic Jaccard comparison | Core computes actual entity movement costs and bidirectional coverage |
| Multi-version batch analysis | VersionMap, VersionTree, batch processing | Single-pair compare | Core can process entire version histories and track trends |
| Stopping/serialization criteria | 3 stopping + 4 serialization strategies | Hardcoded target cluster count | Flexible termination (no-orphans, size-fraction) would improve clustering |
| Similarity measures | 11 (UEMNM, InfoLoss, WeightedJS, ARC variants) | 3 (JS, UEM, SCM) | More measures = better tuning per project type |
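
Several rows above mention Jensen-Shannon (JS) divergence between concern vectors. As a reference point, here is a minimal sketch of the standard JS divergence between two discrete distributions, using base-2 logarithms so the result lies in [0, 1]. The example concern vectors are made up, and the size-weighted variant noted for LIMBO is not shown.

```python
import math


def kl_divergence(p: list[float], q: list[float]) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: list[float], q: list[float]) -> float:
    """Jensen-Shannon divergence: mean KL divergence to the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical concern vectors over three topics for two components.
component_a = [0.7, 0.2, 0.1]
component_b = [0.1, 0.3, 0.6]
print(js_divergence(component_a, component_b))
```

Identical vectors give 0 and vectors with no overlapping concerns give 1, so a lower value suggests two components are better candidates for merging during clustering.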
| Feature | ARCADE Core (Java) | arcade-agent (Python) | Notes |
|---|---|---|---|
| Architectural Stability metric | Fan-in/fan-out ratio | Missing | Simple addition to existing 6 metrics |
| Concern-based smell detection | Topic distributions for BCO and SPF | LLM-powered (Claude CLI) | Heuristic fallback also available |
| Cluster matching (MCFP) | Minimum Cost Flow for movement cost | Hungarian algorithm on Jaccard | MCFP gives more accurate evolution cost |
| ODEM input format | XML-based dependency parsing | Missing | Academic interchange format, limited real-world use |
| SmellToIssuesCorrelation | Correlates smells with issue tracker data | Missing | Requires issue tracker integration |
| Feature | ARCADE Core (Java) | arcade-agent (Python) | Notes |
|---|---|---|---|
| RecovAr (Design Decision Recovery) | Full engine (GitLab issues/commits) | Missing | Large scope research feature |
| Issue tracker integration | JIRA + GitLab REST clients | Missing | Needed for RecovAr or SmellToIssues |
| Swing GUI | Full desktop visualization | HTML reports + CLI | HTML/Mermaid is more modern |
| Classycle bytecode analysis | Java bytecode dependency extraction | tree-sitter source parsing | tree-sitter is arguably better (no compilation needed) |
| Make dependency / Understand CSV | C-specific input formats | Missing | Niche; tree-sitter C parser covers the core need |