phaseloop

Develop a PLAN.md / roadmap / TODO file to completion with Claude Code — across context-window resets. Each phase runs in fresh subagent contexts and hands off to the next via a durable PHASE-N.md continuation file, so no single agent ever has to hold the whole project in its head.

Two variants, one engine:

Skill	For	Validation gate
`phaseloop-py`	Python projects	`pytest` + `ruff` + `mypy`
`phaseloop-web`	JavaScript / TypeScript / React / Node / Next	`tsc --noEmit` + `eslint` + `vitest`/`jest` + framework build

Why it exists

A long build (a multi-slice feature, a roadmap with 9 milestones) doesn't fit in one context window. The usual failure mode: the agent's context fills with diffs and test logs, reasoning degrades, and quality falls off a cliff around the halfway mark.

phaseloop fixes that by treating PHASE-N.md as a continuation token, not documentation. The orchestrator stays deliberately thin — it only ever reads the PLAN, the latest PHASE-N.md, and short JSON summaries. All the heavy work (reading code, running tests, diffing) happens in subagents whose context is disposable.
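The bootstrap step relies on finding "the latest PHASE-N.md". A minimal sketch of that lookup follows; it assumes the continuation files sit directly in one directory, and the helper name `latest_phase_file` is hypothetical, not part of the skill.

```python
import re
from pathlib import Path
from typing import Optional

# Matches continuation files such as PHASE-1.md, PHASE-12.md.
PHASE_RE = re.compile(r"^PHASE-(\d+)\.md$")


def latest_phase_file(root: Path) -> Optional[Path]:
    """Return the PHASE-N.md with the highest N in `root`, or None if absent.

    Assumption: continuation files live directly in `root` (not nested).
    """
    best_n = -1
    best_path: Optional[Path] = None
    for path in root.iterdir():
        match = PHASE_RE.match(path.name)
        if match and path.is_file():
            n = int(match.group(1))
            # Compare numerically so PHASE-10.md beats PHASE-2.md.
            if n > best_n:
                best_n, best_path = n, path
    return best_path
```

Comparing the phase number as an integer matters: a plain lexical sort would rank PHASE-2.md after PHASE-10.md.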

The loop

-1. Partition       [recursive autonomy only] PLAN → dependency DAG → tracks + conventions contract
0. Bootstrap        read PLAN + latest PHASE-N.md (where did we stop?)
1+2. Explore        [subagent] find next PLAN item + survey the code
3. Plan & scope     THE GATE — gated: 🔴 human approves scope
                               autonomous: ⚔️ adversarial SCOPE PANEL → SCOPE-N.md
4. Execute          [subagent] implement the approved plan + tests
                    autonomous: + deterministic guardrails (allowlist, dep & diff caps)
5. Validate         [NO agent] full log → .phaseloop/validate-N.log, exit code = gate — hard stop on red
6. Adversarial      [2-3 subagents, parallel] told to REFUTE the work
7. Write PHASE-N.md  commit point — only on green + sign-off (autonomous: one phase = one git commit)
8. Loop             next PLAN item (autonomous: HALT on any ESCALATE)

Three design rules that make it work

Cheap-build / expensive-verify. The implementer runs on a cheaper model (Sonnet) because its scope is locked at the scope gate (step 3) and its output is adversarially checked by a stronger model (step 6). Reviewers run on Opus; never downgrade them, or the gate becomes theater. Models are assigned per subagent and can be overridden per run.
One gate, at scoping — and you choose who holds it. Scoping is the cheapest place to catch drift. In gated mode (default) you approve what gets built and where the boundary is; everything after runs unattended. In autonomous mode (say "autonomous" / "full auto" / "unattended") an adversarial scope panel holds the gate instead — see below. Security-sensitive phases (auth, payments, fail-closed logic) bump the implementer back to Opus and require unanimous reviewer sign-off in both modes.
Validation is agentless, with full logs on disk. Running the test/build toolchain is reading an exit code, not judgment — so it runs as a deterministic command, not a paid subagent. The complete output is teed to a per-phase log file (.phaseloop/validate-N.log); only the exit code + summary tail reach the orchestrator's context. On red, the retry agent reads the log file — full tracebacks, nothing lost — in its own disposable context. Validation runs before the Opus reviewers, so a red suite costs $0 in review tokens.
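The agentless validation step can be sketched as a plain subprocess call. This is an illustration under assumptions, not the skill's actual code: the function name `run_validation` and the `tail_lines` default are hypothetical; only the log path pattern `.phaseloop/validate-N.log` comes from the description above.

```python
import subprocess
from pathlib import Path


def run_validation(
    command: list[str],
    phase: int,
    log_dir: Path = Path(".phaseloop"),
    tail_lines: int = 20,  # hypothetical default for the summary tail
) -> tuple[int, str]:
    """Run the toolchain, write full output to disk, return (exit code, tail)."""
    log_dir.mkdir(parents=True, exist_ok=True)
    log_path = log_dir / f"validate-{phase}.log"

    # stdout and stderr both go to the log file, so nothing is lost
    # and nothing large ever passes through the caller.
    with log_path.open("wb") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)

    lines = log_path.read_text(encoding="utf-8", errors="replace").splitlines()
    return result.returncode, "\n".join(lines[-tail_lines:])
```

A caller would treat `code == 0` as green, for example `code, tail = run_validation(["pytest"], phase=3)`, and hand only `tail` to the orchestrator while a retry agent reads the full log file.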

Autonomous mode — the scope panel replaces the human gate

For long roadmaps you don't want to babysit. The human gate becomes an adversarial panel of Opus reviewers with different lenses and different evidence — a persona is not a lens; the distinct question + distinct inputs is what de-correlates the verdicts:

Role	Sits when the scope touches	Asks
Architect	always	Is this exactly the PLAN item? What coupling will a later phase regret?
Pentester	auth, secrets, input parsing, network, new deps	What attack surface does this CHANGE? What fails open?
Systems engineer	config, persistence, concurrency, build/deploy	What breaks half-deployed, retried, concurrent?

Each returns a strict APPROVE | REVISE | ESCALATE verdict and routing is deterministic: unanimous APPROVE proceeds; a REVISE gets one scope revision then re-panels; any ESCALATE or split vote halts the loop and pings you — a split vote is the signal a human is needed.
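Because routing is deterministic, it fits in a small pure function. The sketch below is one possible reading, and the names `Verdict`, `Action`, and `route` are hypothetical. It assumes that a panel containing any REVISE (and no ESCALATE) earns the single revision on the first pass, and that a panel still short of unanimous APPROVE after that revision counts as a split vote; the text above does not spell out that edge case.

```python
from enum import Enum


class Verdict(Enum):
    APPROVE = "APPROVE"
    REVISE = "REVISE"
    ESCALATE = "ESCALATE"


class Action(Enum):
    PROCEED = "proceed"
    REVISE_AND_REPANEL = "revise_and_repanel"
    HALT = "halt"


def route(verdicts: list[Verdict], revisions_used: int = 0) -> Action:
    """Map panel verdicts to the next action, with no model in the loop."""
    # An empty panel or any ESCALATE stops the loop and pings the human.
    if not verdicts or Verdict.ESCALATE in verdicts:
        return Action.HALT
    # Unanimous APPROVE is the only path that proceeds.
    if all(v is Verdict.APPROVE for v in verdicts):
        return Action.PROCEED
    # At least one REVISE: allow exactly one scope revision, then re-panel.
    if revisions_used < 1:
        return Action.REVISE_AND_REPANEL
    # Assumption: still not unanimous after the revision is a split vote.
    return Action.HALT
```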

The trust anchor moves into the artifacts:

PLAN.md must be contract-grade — acceptance criteria, explicit non-goals. The plan is the only oracle scope review has; an ambiguous item escalates instead of getting paneled.
SCOPE-N.md is written before any code — the audit trail you review whenever you get back.
Deterministic guardrails, no agent: the implementer's changed files are checked against the approved allowlist, new deps must be declared in scope, oversized diffs escalate. A cap is free; agents debating agents is theater.
One phase = one commit (never pushed). Your override while watching is Esc-Esc; your override while away is git revert.
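The deterministic guardrails amount to three cheap checks. A minimal sketch, assuming the approved scope is available as glob patterns, a set of declared dependencies, and a line cap; the `Scope` fields, the `check_guardrails` name, and the default cap of 400 lines are all hypothetical.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Scope:
    allowed_globs: list[str]                      # approved file allowlist
    declared_deps: set[str] = field(default_factory=set)
    max_changed_lines: int = 400                  # hypothetical diff cap


def check_guardrails(
    scope: Scope,
    changed_files: list[str],
    new_deps: list[str],
    changed_lines: int,
) -> list[str]:
    """Return a list of violations; an empty list means the phase may continue."""
    violations: list[str] = []
    for path in changed_files:
        # fnmatchcase is case-sensitive on every platform, and its "*"
        # also matches "/" so "src/auth/*" covers nested files.
        if not any(fnmatchcase(path, glob) for glob in scope.allowed_globs):
            violations.append(f"file outside allowlist: {path}")
    for dep in sorted(set(new_deps) - scope.declared_deps):
        violations.append(f"undeclared dependency: {dep}")
    if changed_lines > scope.max_changed_lines:
        violations.append(
            f"diff too large: {changed_lines} > {scope.max_changed_lines}"
        )
    return violations
```

Any non-empty result would map to an ESCALATE in autonomous mode, with no agent asked for an opinion.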

Recursive autonomy (v3) — several loops, one plan

Autonomous mode, multiplied. A Partitioner (Opus) reads the whole PLAN and builds the dependency DAG: items with disjoint file footprints and no edges between them become tracks. Each track gets its own git worktree, its own branch, its own continuation chain (TRACK-A/PHASE-N.md) — and runs the full autonomous loop (scope panel, guardrails, one-phase-one-commit) in parallel with the others.

Three honesty rules keep it from being a token bonfire:

Width is discovered, not declared. Items sharing files share a track. A plan whose DAG is a straight line degrades to plain autonomous mode — the skill says so instead of pretending. Speedup is bounded by DAG width and integration cost (Amdahl), not by how many orchestrators you spawn.
A conventions contract rides in every track prompt. The fleet's killer failure isn't merge conflicts — it's two tracks making divergent-but-individually-plausible decisions. Seam interfaces are written down before any track starts.
Integration is a new serial gate. Green-in-isolation ≠ green-together: track branches merge into an integration branch, the FULL validation suite re-runs, and adversarial integration reviewers hunt seam mismatches and contract divergence. Unresolvable conflicts escalate to you.
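"Width is discovered, not declared" can be illustrated with a union-find pass over the plan items. This is a sketch of the idea only, not the Partitioner's actual logic: it assumes each item already has an estimated file footprint and a set of dependencies, and the function name `partition_tracks` is hypothetical.

```python
def partition_tracks(
    footprints: dict[str, set[str]],
    depends_on: dict[str, set[str]],
) -> list[list[str]]:
    """Group plan items into tracks: items that share a file or a
    dependency edge land in the same track (connected components)."""
    parent = {item: item for item in footprints}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Items touching the same file must share a track.
    first_owner: dict[str, str] = {}
    for item, files in footprints.items():
        for file in files:
            if file in first_owner:
                union(first_owner[file], item)
            else:
                first_owner[file] = item

    # Items joined by a dependency edge must share a track too.
    for item, deps in depends_on.items():
        for dep in deps:
            if item in parent and dep in parent:  # skip unknown items
                union(item, dep)

    groups: dict[str, list[str]] = {}
    for item in footprints:
        groups.setdefault(find(item), []).append(item)
    return sorted(sorted(group) for group in groups.values())
```

A result with a single track is the straight-line case: the plan has no usable width, so it would run as plain autonomous mode.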

Despite the name, the topology is flat: one meta-orchestrator multiplexes the same loop across tracks. Orchestrators never spawn orchestrators — "recursive" is the loop, not the process tree. (Yes, the name stays. It sounds cool.)

Install

./install.sh            # copies both skills into ~/.claude/skills/
./install.sh --web      # web variant only
./install.sh --python   # python variant only

Or manually — copy the skill folder(s) into ~/.claude/skills/:

cp -r skills/phaseloop-py   ~/.claude/skills/
cp -r skills/phaseloop-web  ~/.claude/skills/

Skills are picked up in the next Claude Code session. An existing session should not need a restart, only a new turn.

Usage

Point it at a plan and ask it to work:

/phaseloop-py   work ./ROADMAP.md
/phaseloop-web  continue the build in docs/PLAN.md

It explores, then stops at the scope checkpoint for your approval before touching code. Approve, and it executes → validates → adversarially reviews → writes PHASE-N.md, then loops to the next item.

For unattended runs, ask for it explicitly:

/phaseloop-py   work ./ROADMAP.md, fully autonomous
/phaseloop-web  run docs/PLAN.md unattended, ping me on escalations
/phaseloop-py   develop ./PLAN.md with recursive autonomy
/phaseloop-web  fleet mode on ROADMAP.md, max 2 tracks

Per-run overrides are just plain English: "implementer on opus this phase", "skip the build step, it's a library", "three reviewers, unanimous", "always seat the pentester".

Layout

skills/
  phaseloop-py/    SKILL.md + PHASE_TEMPLATE.md + SCOPE_TEMPLATE.md + TRACKS_TEMPLATE.md   (Python)
  phaseloop-web/   SKILL.md + PHASE_TEMPLATE.md + SCOPE_TEMPLATE.md + TRACKS_TEMPLATE.md   (JS/TS/React)
install.sh
README.md

Requirements

Claude Code
A project with a runnable test/validation toolchain (the gate is only as good as the tests behind it).
