Why
src/forge_loop/cli.py is 1705 LOC, 29 commands. The quality manifesto soft-cap for Python modules is 500 LOC. That is 3.4× the cap.
Direct customer pain:
- Workers dispatched on a CLI ticket have to skim 1700 lines to find the relevant subcommand. Slows every iteration on the CLI surface.
- New contributors trying to read the codebase end-to-end hit cli.py as the second-largest blob (after the runner) and bounce.
- The 29 commands span unrelated concerns (run / status / events / config / brainstorm / manifesto / dashboard / mcp / replay / repos / roles / brief / record-session / pipeline / retry / pause / resume / stop / init / doctor / sweep / ...) — there's no single mental model.
This is the canonical modernization-gated case: split so that future work on axis-1-to-5 features moves faster.
What
Split cli.py into cli/ (package) with one file per concern:
src/forge_loop/cli/
├── __init__.py # the Typer app — wires sub-typers, keeps `main` entrypoint
├── run.py # run / pause / resume / stop / retry / record-session
├── observe.py # status / events / doctor / config
├── brainstorm.py # brainstorm / manifesto subcommands
├── repos.py # repos list/disable/enable (multirepo)
├── roles.py # roles list
├── replay.py # replay / brief
├── pipeline.py # pipeline show
├── dashboard.py # dashboard subcommands
├── mcp.py # mcp serve
└── init.py # init / sweep
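A minimal sketch of how the facade could wire the modules together, assuming each module exposes a register(app) function that attaches its commands to the root Typer app. The names register_run, register_repos and build_app are hypothetical stand-ins for cli/run.py, cli/repos.py and cli/__init__.py, and the command bodies are placeholders, not the real implementations.

```python
import typer


def register_run(app: typer.Typer) -> None:
    """Stand-in for cli/run.py: attach top-level commands to the root app."""

    @app.command()
    def pause() -> None:
        """Pause the active run (placeholder body)."""
        typer.echo("paused")

    @app.command()
    def resume() -> None:
        """Resume a paused run (placeholder body)."""
        typer.echo("resumed")


def register_repos(app: typer.Typer) -> None:
    """Stand-in for cli/repos.py: attach a sub-typer for `repos ...`."""
    repos = typer.Typer(help="Manage repos (multirepo).")

    @repos.command("list")
    def list_repos() -> None:
        """List configured repos (placeholder body)."""
        typer.echo("no repos configured")

    app.add_typer(repos, name="repos")


def build_app() -> typer.Typer:
    """Stand-in for cli/__init__.py: build the root app in one place."""
    app = typer.Typer(help="forge-loop command line.")
    # One explicit tuple keeps the registration order visible and reviewable.
    for register in (register_run, register_repos):
        register(app)
    return app


app = build_app()


def main() -> None:
    """Entrypoint kept on the package, as the layout above requires."""
    app()


if __name__ == "__main__":
    main()
```

Keeping the registration list in a single function means a missing or misspelled module shows up as one visible diff in the facade, and the help snapshots remain the check that nothing else changed.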
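Acceptance
- forge-loop --help and every forge-loop <cmd> --help produce byte-identical output before/after. CliRunner snapshot tests per command lock this.
- cli/__init__.py builds the root Typer app; each module exposes a register(app) function.
- tests/test_cli*.py continue to pass without rewrite.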
Axis citation
modernization-gated. Unblocks faster iteration on:
axis:scm-depth (new CLI subcommands for SCM admin operations land cleanly in scm.py instead of bloating cli.py further)
axis:rbac-multitenancy (RBAC admin CLI lands in a new admin.py)
axis:ux-quality (operator tour subcommands land in tour.py)
Test plan
- CliRunner snapshot per command (--help output, exit codes, common flag combos). Diff against pre-split snapshots = empty.
- Regression: every existing tests/test_cli*.py runs green untouched.
- Quality-manifesto pin: tests/test_cli_module_size_caps.py asserts every cli/ module ≤ 350 LOC.
- Adversarial: a deliberate typo in a sub-typer registration ⇒ CI fails the snapshot diff (catches regressions).
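The snapshot comparison in the plan above can be sketched with the standard library alone. This is an illustration under assumptions: the helper name assert_matches_snapshot, the one-file-per-command layout and the update flag are all hypothetical, and the text passed in would come from something like CliRunner().invoke(app, [cmd, "--help"]).output in the real harness.

```python
from pathlib import Path


def assert_matches_snapshot(
    name: str, actual: str, snapshot_dir: Path, update: bool = False
) -> None:
    """Compare `actual` byte-for-byte with the stored snapshot for `name`.

    With update=True the snapshot is (re)written instead of checked; the
    intent is to record snapshots once, on the pre-split commit.
    """
    path = snapshot_dir / f"{name}.txt"
    encoded = actual.encode("utf-8")
    if update:
        snapshot_dir.mkdir(parents=True, exist_ok=True)
        path.write_bytes(encoded)
        return
    # A missing snapshot is a failure, never a silent auto-record.
    assert path.exists(), f"missing snapshot for {name!r}"
    assert path.read_bytes() == encoded, f"output for {name!r} changed"
```

Failing on a missing snapshot, rather than recording one on the fly, is what lets the adversarial case fail CI: a command that vanishes or is renamed by a registration typo no longer matches any recorded file.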
File pointers
src/forge_loop/cli.py (delete — replaced by package)
src/forge_loop/cli/__init__.py (new — facade)
src/forge_loop/cli/{run,observe,brainstorm,repos,roles,replay,pipeline,dashboard,mcp,init}.py (new)
tests/test_cli_split_snapshots.py (new — the snapshot harness)
tests/test_cli_module_size_caps.py (new — manifesto rule pin)