speechlabinc/code-reviewr

code-reviewr

A dual-LLM architecture review library for Node/TypeScript codebases.

Two independent AI reviewers debate their findings, then an arbiter synthesises a canonical report — covering system architecture, data model, AI/ML pipeline, and product roadmap alignment.

Powered by the Claude Agent SDK and Azure OpenAI.


How it works

  1. Phase 1: Independent reviews (parallel). Reviewer A (Claude) actively browses your codebase using the Read, Grep, and Glob tools. Reviewer B (Azure GPT-4o) analyses the codebase from a provided summary. Both produce structured reviews covering architecture, data model, AI pipeline, and roadmap alignment.

  2. Phase 2: Debate. The two reviewers challenge and refine each other's findings for up to N rounds (set with --max-debate-rounds, default 5). The debate terminates early if both reviewers explicitly converge.

  3. Phase 3: Arbiter synthesis. A third model, the arbiter (Claude by default), synthesises the debate into a canonical markdown report and a structured JSON document.
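
The control flow of the three phases can be sketched as follows. This is a minimal illustration, not the library's implementation: `review_a`, `review_b`, `debate_round`, and `arbitrate` are hypothetical stand-ins for the LLM calls, and representing "explicit convergence" as a `converged` flag on each turn is an assumption.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Turn:
    """One reviewer's contribution to a debate round (hypothetical shape)."""
    text: str
    converged: bool


async def review_a(codebase: str) -> str:
    # Stand-in for Reviewer A's independent review.
    return f"A: review of {codebase}"


async def review_b(summary: str) -> str:
    # Stand-in for Reviewer B's independent review.
    return f"B: review of {summary}"


async def debate_round(round_no: int) -> tuple[Turn, Turn]:
    # Stand-in: both reviewers agree from round 2 onwards.
    agreed = round_no >= 2
    return Turn(f"A round {round_no}", agreed), Turn(f"B round {round_no}", agreed)


async def arbitrate(reviews: list[str], transcript: list[Turn]) -> str:
    # Stand-in for the arbiter's synthesis step.
    return f"report from {len(reviews)} reviews and {len(transcript)} turns"


async def run_pipeline(max_rounds: int = 5) -> tuple[str, int]:
    # Phase 1: both independent reviews run concurrently.
    reviews = list(await asyncio.gather(review_a("./src"), review_b("summary")))

    # Phase 2: debate for at most max_rounds, stopping early on convergence.
    transcript: list[Turn] = []
    rounds_used = 0
    for round_no in range(1, max_rounds + 1):
        turn_a, turn_b = await debate_round(round_no)
        transcript += [turn_a, turn_b]
        rounds_used = round_no
        if turn_a.converged and turn_b.converged:
            break

    # Phase 3: the arbiter sees the reviews plus the full transcript.
    return await arbitrate(reviews, transcript), rounds_used


if __name__ == "__main__":
    print(asyncio.run(run_pipeline()))
```

Requiring both reviewers to signal convergence in the same round keeps one side from ending the debate unilaterally. Whether the library applies exactly this rule is not stated above.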


Output

All files are written to ./llm_reviews/ by default (configurable with --output-dir):

llm_reviews/
├── reviewer_a/
│   └── initial_review.md              ← Claude's analysis
├── reviewer_b/
│   └── initial_review.md              ← GPT-4o's analysis
├── debate_logs/
│   └── debate_20260304T120000Z.md     ← Full debate transcript
└── final/
    ├── architecture_and_pipeline_review.md    ← Human-readable report
    └── architecture_and_pipeline_review.json  ← Machine-readable JSON
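
For scripts that post-process a run, the layout above can be expressed as a small path helper. This is a sketch under assumptions: `output_layout` is a hypothetical function, and the timestamp format `%Y%m%dT%H%M%SZ` (UTC) is inferred from the example filename rather than confirmed by the library.

```python
from datetime import datetime, timezone
from pathlib import Path


def output_layout(output_dir: str, started_at: datetime) -> dict[str, Path]:
    """Return the expected output paths for one review run (illustrative only)."""
    root = Path(output_dir)
    # Assumed timestamp format, inferred from "debate_20260304T120000Z.md".
    stamp = started_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    final = root / "final"
    return {
        "reviewer_a": root / "reviewer_a" / "initial_review.md",
        "reviewer_b": root / "reviewer_b" / "initial_review.md",
        "debate_log": root / "debate_logs" / f"debate_{stamp}.md",
        "final_markdown": final / "architecture_and_pipeline_review.md",
        "final_json": final / "architecture_and_pipeline_review.json",
    }


if __name__ == "__main__":
    start = datetime(2026, 3, 4, 12, 0, tzinfo=timezone.utc)
    for name, path in output_layout("./llm_reviews", start).items():
        print(f"{name}: {path}")
```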

Requirements

  • Python 3.10+
  • claude-agent-sdk (for Claude / Anthropic calls)
  • openai (for Azure OpenAI calls)

pip install claude-agent-sdk openai
pip install pytest pytest-asyncio  # for running tests

Credentials

Set these environment variables — never hard-code keys:

# Required for Reviewer A and Arbiter (Claude)
export ANTHROPIC_API_KEY=sk-ant-...

# Required for Reviewer B (Azure OpenAI)
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
export AZURE_OPENAI_API_KEY=your-azure-key
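
A pre-flight check can fail fast when a variable is missing. The sketch below is illustrative only: `missing_credentials` is a hypothetical helper, and it assumes that only `gpt-*` Reviewer B models need the Azure variables. It also checks the environment alone, so values passed via --azure-endpoint or --azure-api-key are not considered.

```python
import os
from typing import Mapping


def missing_credentials(
    reviewer_b_model: str, env: Mapping[str, str] = os.environ
) -> list[str]:
    """List required environment variables that are unset or empty."""
    required = ["ANTHROPIC_API_KEY"]  # Reviewer A and the arbiter use Claude
    # Assumption: only "gpt-*" models are routed to Azure OpenAI.
    if reviewer_b_model.startswith("gpt"):
        required += ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY"]
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials("gpt-4o")
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All credentials present.")
```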

CLI usage

Both reviewers on Claude (Anthropic only):

ANTHROPIC_API_KEY=sk-ant-... \
python -m dual_review \
  --code-paths ./src \
  --roadmap ./docs/product_roadmap.md \
  --reviewer-b-model claude-opus-4-6

Reviewer A = Claude, Reviewer B = Azure GPT-4o:

ANTHROPIC_API_KEY=sk-ant-... \
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/ \
AZURE_OPENAI_API_KEY=your-azure-key \
python -m dual_review \
  --code-paths ./src \
  --roadmap ./docs/product_roadmap.md \
  --reviewer-b-model gpt-4o \
  --azure-deployment gpt-4o=your-deployment-name \
  --output-dir ./my_reviews

Note: --azure-deployment gpt-4o=your-deployment-name maps the model identifier to the deployment name you gave it in Azure OpenAI Studio. If your deployment name matches the model name exactly, you can omit this flag.
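
The MODEL=NAME mapping and its fallback can be sketched with the standard library. `parse_deployments` and `deployment_for` are hypothetical names used for illustration. The fallback to the model name mirrors the note above, but the library's actual parsing code may differ.

```python
import argparse


def parse_deployments(pairs: list[str]) -> dict[str, str]:
    """Turn ["gpt-4o=my-deploy", ...] into {"gpt-4o": "my-deploy", ...}."""
    mapping: dict[str, str] = {}
    for pair in pairs:
        model, sep, name = pair.partition("=")
        if not sep or not model or not name:
            raise ValueError(f"expected MODEL=NAME, got {pair!r}")
        mapping[model] = name
    return mapping


def deployment_for(model: str, mapping: dict[str, str]) -> str:
    # Fall back to the model name when no explicit mapping exists,
    # which is the case where the flag can be omitted.
    return mapping.get(model, model)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # action="append" makes the flag repeatable, one MODEL=NAME per use.
    parser.add_argument(
        "--azure-deployment", action="append", default=[], metavar="MODEL=NAME"
    )
    args = parser.parse_args(["--azure-deployment", "gpt-4o=your-deployment-name"])
    mapping = parse_deployments(args.azure_deployment)
    print(deployment_for("gpt-4o", mapping))
```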

Mock backend (no API keys needed — for testing):

python -m dual_review --code-paths ./src --mock

All CLI options:

--code-paths PATH [PATH ...]   Paths to review (default: .)
--roadmap FILE                 Product roadmap document
--output-dir DIR               Output directory (default: ./llm_reviews)
--reviewer-a-model MODEL       Reviewer A model (default: claude-opus-4-6)
--reviewer-b-model MODEL       Reviewer B model (default: gpt-4o)
--arbiter-model MODEL          Arbiter model (default: claude-opus-4-6)
--max-debate-rounds N          Max debate rounds (default: 5)
--no-config-files              Exclude .json/.yml/.yaml from scan
--azure-endpoint URL           Azure OpenAI endpoint
--azure-api-key KEY            Azure OpenAI API key
--azure-api-version VER        Azure API version (default: 2024-12-01-preview)
--azure-deployment MODEL=NAME  Map model → Azure deployment name (repeatable)
--mock                         Use mock backend, no real LLM calls

Programmatic usage

import asyncio
from dual_review import (
    ReviewConfig,
    AgentSDKBackend,
    AzureOpenAIBackend,
    CompositeBackend,
    run_review,
)

# Composite backend: Claude for "claude-*" models, Azure for "gpt-*" models
backend = CompositeBackend(
    routes={
        "claude": AgentSDKBackend(),
        "gpt": AzureOpenAIBackend(
            deployment_map={"gpt-4o": "your-deployment-name"},
        ),
    },
    default=AgentSDKBackend(),
)

result = asyncio.run(run_review(
    ReviewConfig(
        code_paths=["./src"],
        roadmap_path="./docs/product_roadmap.md",
        output_dir="./llm_reviews",
        reviewer_a_model="claude-opus-4-6",
        reviewer_b_model="gpt-4o",
        arbiter_model="claude-opus-4-6",
        max_debate_rounds=3,
    ),
    backend=backend,
))

print(result.summary())
# Open result.final_markdown_path for the full report
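
The prefix-based routing described in the comment above ("claude-*" to one backend, "gpt-*" to another) can be illustrated with a small standalone class. This is not the library's CompositeBackend: `PrefixRouter`, `EchoBackend`, and the `complete` method are hypothetical names, and first-match-wins on dictionary order is an assumption.

```python
from typing import Protocol


class Backend(Protocol):
    def complete(self, model: str, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend that labels its output so routing is visible."""

    def __init__(self, label: str) -> None:
        self.label = label

    def complete(self, model: str, prompt: str) -> str:
        return f"[{self.label}:{model}] {prompt}"


class PrefixRouter:
    """Dispatch to the first backend whose key is a prefix of the model name."""

    def __init__(self, routes: dict[str, Backend], default: Backend) -> None:
        self.routes = routes
        self.default = default

    def pick(self, model: str) -> Backend:
        for prefix, backend in self.routes.items():
            if model.startswith(prefix):
                return backend
        # No prefix matched: use the default backend.
        return self.default

    def complete(self, model: str, prompt: str) -> str:
        return self.pick(model).complete(model, prompt)


if __name__ == "__main__":
    router = PrefixRouter(
        routes={"claude": EchoBackend("anthropic"), "gpt": EchoBackend("azure")},
        default=EchoBackend("fallback"),
    )
    print(router.complete("gpt-4o", "hello"))
```

Routing on the model name lets each reviewer and the arbiter be assigned to a provider purely through configuration, without separate wiring for each role.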

Running tests

All 93 tests are deterministic and require no API keys:

python -m pytest tests/ -v

Examples

python examples/basic_usage.py           # Mock backend, converging debate
python examples/custom_convergence.py    # Mock backend, max-rounds termination

# Real LLM examples (require credentials):
python examples/agent_sdk_usage.py       # Claude only
python examples/azure_claude_review.py   # Claude + Azure GPT-4o

Architecture

See ARCHITECTURE.md for detailed design documentation and SYSTEM_DESIGN.md for module responsibilities.
