epic: Quality + Testing manifestos — customer-tunable, drive worker workflow, feedback-loop on bugs #130

@hadamrd

Why

Three repeated failure shapes during the dogfood week:

  1. State machine edges go untested. The iteration probe has ~12 states and ~30 legal edges; bugs #97 (fix(runner): quarantine undeletable worktree dirs (#96)), #120 (fix(iteration): bail on CLOSED PR + fetch before ahead-count), and #128 (hot-fix(iteration): probe misclassification + escalate label-removal) all hit edges that had zero edge-test coverage.
  2. Cosmetic tickets ship value-free. The PO has no opinion about what should exist (fixed by the brainstormer epic #121, "epic: customer-tunable Product Brainstormer / PO Master (vision + axes drive every ticket)").
  3. Code quality drifts. Refactors land that "look fine in PR" but introduce shared mutable state, untyped boundaries, or missing observability.

Insight: each bug is valuable feedback. The manifestos turn that feedback into a permanent gate that future workers must follow.

The proposal: per-project, customer-tunable, version-controlled manifestos for quality and testing.

Customer story

A platform lead drops three files into their repo: .forge/product-vision.md, .forge/quality-manifesto.md, .forge/testing-manifesto.md. The PO reads vision to file the right tickets. The worker reads quality to write the feature the right way. The worker reads testing to write tests that catch the right bugs. Every shipped PR has explicit citations from all three. When a bug ships and gets fixed, forge-loop manifesto suggest --from-pr <N> proposes manifesto deltas so the bug can't recur.

Customer-owned files

.forge/quality-manifesto.md

Free-form + structured rules. Examples:

  • "No shared mutable module-level state. Use a Container or per-instance State."
  • "Every external I/O boundary lives behind a typed Protocol with a Fake for tests."
  • "Boot order: settings → log → adapters → runtime. No exceptions."
  • "No subprocess.run for SDK-able calls. Use the typed client."
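
To make the "typed Protocol with a Fake" and "per-instance State" rules concrete, here is a minimal sketch. `IssueTracker`, `FakeIssueTracker`, and `escalate` are hypothetical names chosen for illustration; they are not taken from the forge-loop codebase.

```python
from dataclasses import dataclass, field
from typing import Protocol


class IssueTracker(Protocol):
    """Hypothetical I/O boundary: anything that can label an issue."""

    def add_label(self, issue: int, label: str) -> None: ...


@dataclass
class FakeIssueTracker:
    """In-memory stand-in for tests; state lives on the instance, not the module."""

    labels: dict[int, set[str]] = field(default_factory=dict)

    def add_label(self, issue: int, label: str) -> None:
        self.labels.setdefault(issue, set()).add(label)


def escalate(tracker: IssueTracker, issue: int) -> None:
    """Depends only on the Protocol, so a test can hand it the Fake."""
    tracker.add_label(issue, "escalated")


tracker = FakeIssueTracker()
escalate(tracker, 130)
print(tracker.labels)  # {130: {'escalated'}}
```

Because `labels` is created per instance via `default_factory`, two fakes never share state, which is the property the first rule asks for.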

.forge/testing-manifesto.md

The "how to write tests that catch real bugs" playbook. Examples (informed by THIS week's bugs):

  • "State machine ⇒ ONE TEST PER EDGE + ONE adversarial fallthrough test per default-branch."
  • "External-dep assumption (origin/branch exists, file is readable, network up) ⇒ ONE adversarial test for the false case."
  • "subprocess.run returncode handling ⇒ test BOTH ==0 and !=0 branches."
  • "Every Protocol Fake ⇒ a regression test that asserts the Real impl returns the same shape as the Fake on representative inputs."
  • "Property-based tests on any function with >4 branches OR any function consuming user input (issue body, brief content, env values)."
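
As an illustration of the "one test per edge + one adversarial fallthrough test" rule, the sketch below uses a deliberately tiny, hypothetical state machine (4 states, 5 edges). It is not the real iteration probe. Under pytest, each expected edge would more naturally be a parametrized case; plain functions are used here so the example runs on its own.

```python
from enum import Enum


class State(Enum):
    OPEN = "open"
    REVIEW = "review"
    MERGED = "merged"
    CLOSED = "closed"


# Hypothetical legal edges: (current state, event) -> next state.
EDGES = {
    (State.OPEN, "request_review"): State.REVIEW,
    (State.REVIEW, "approve"): State.MERGED,
    (State.REVIEW, "reject"): State.OPEN,
    (State.OPEN, "close"): State.CLOSED,
    (State.REVIEW, "close"): State.CLOSED,
}


class IllegalTransition(Exception):
    """Raised instead of silently falling through to a default branch."""


def step(state: State, event: str) -> State:
    try:
        return EDGES[(state, event)]
    except KeyError:
        raise IllegalTransition(f"{state.name} has no edge for {event!r}") from None


# Expectations are written out by hand, independently of EDGES.
EXPECTED = [
    (State.OPEN, "request_review", State.REVIEW),
    (State.REVIEW, "approve", State.MERGED),
    (State.REVIEW, "reject", State.OPEN),
    (State.OPEN, "close", State.CLOSED),
    (State.REVIEW, "close", State.CLOSED),
]


def test_one_case_per_edge() -> None:
    for src, event, dst in EXPECTED:
        assert step(src, event) is dst, (src, event)
    # Coverage guard: adding an edge without adding a case fails here.
    assert {(src, event) for src, event, _ in EXPECTED} == set(EDGES)


def test_fallthrough_is_rejected() -> None:
    events = {event for _, event in EDGES} | {"unknown"}
    for state in State:
        for event in events:
            if (state, event) in EDGES:
                continue
            try:
                step(state, event)
            except IllegalTransition:
                continue
            raise AssertionError(f"{state.name} silently accepted {event!r}")


test_one_case_per_edge()
test_fallthrough_is_rejected()
```

The coverage guard is the part that turns the rule into a gate: a new edge cannot land without a matching test case.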

The feedback loop

  • forge-loop manifesto suggest --from-pr <N> — reads the merged fix PR, analyzes the failure shape, proposes manifesto deltas (new quality rule? new testing pattern?).
  • Manifesto deltas land via PR like any other change, with the bug-PR cited.
  • The next worker run reads the updated manifesto → can't repeat the failure shape.

Acceptance

  • Discovery + schema for .forge/quality-manifesto.md + .forge/testing-manifesto.md (mirror the pattern from #122, feat(brainstormer): .forge/product-vision.md + axes.yaml discovery + schema)
  • Worker brief loader injects active manifestos into the worker prompt
  • Critic prompt includes manifesto-compliance check (refuses PRs that visibly violate a manifesto rule)
  • forge-loop manifesto suggest CLI command — bug PR in → manifesto delta out
  • Per-axis testing manifesto override (axis:scm-depth might require integration tests against a Bitbucket mock that axis:ui-quality doesn't)
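
One possible shape for discovery, per-axis override, and prompt injection is sketched below. Everything beyond the two manifesto file names is an assumption: the `.forge/axes/<axis>/` override layout, the `discover` and `render` helpers, and the choice that an override replaces the default wholesale (rather than extending it) are hypothetical and not specified by this epic.

```python
import tempfile
from pathlib import Path
from typing import Optional

# File names come from this epic; the mapping keys are illustrative.
MANIFESTOS = {
    "quality": "quality-manifesto.md",
    "testing": "testing-manifesto.md",
}


def discover(repo: Path, axis: Optional[str] = None) -> dict[str, str]:
    """Return manifesto text by kind, preferring a per-axis override if present."""
    forge = repo / ".forge"
    found: dict[str, str] = {}
    for kind, name in MANIFESTOS.items():
        candidates = [forge / name]
        if axis is not None:
            # Hypothetical override location: .forge/axes/<axis>/<name>
            candidates.insert(0, forge / "axes" / axis / name)
        for path in candidates:
            if path.is_file():
                found[kind] = path.read_text(encoding="utf-8")
                break
    return found


def render(manifestos: dict[str, str]) -> str:
    """Render the active manifestos as one block for the worker prompt."""
    return "\n\n".join(
        f"## {kind} manifesto\n{text.strip()}"
        for kind, text in sorted(manifestos.items())
    )


# Demo against a throwaway repo layout.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    override_dir = repo / ".forge" / "axes" / "scm-depth"
    override_dir.mkdir(parents=True)
    (repo / ".forge" / "quality-manifesto.md").write_text(
        "No shared mutable module-level state.\n", encoding="utf-8"
    )
    (repo / ".forge" / "testing-manifesto.md").write_text(
        "One test per edge.\n", encoding="utf-8"
    )
    (override_dir / "testing-manifesto.md").write_text(
        "Integration tests against a Bitbucket mock.\n", encoding="utf-8"
    )
    default = discover(repo)
    scm = discover(repo, axis="scm-depth")
    prompt = render(scm)

print(prompt)
```

With this layout, the `scm-depth` axis picks up its own testing manifesto while still inheriting the repo-wide quality manifesto. A missing file simply yields no entry, so the loader can decide separately whether an absent manifesto is an error.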

Sub-tickets (this epic's children) — filed in this PR

  1. Quality + Testing manifesto discovery + schema (mirror #122, feat(brainstormer): .forge/product-vision.md + axes.yaml discovery + schema)
  2. Worker brief integration — manifesto content rendered into the system prompt
  3. Critic compliance check — manifestos in the critic prompt; sev1 if violated
  4. forge-loop manifesto suggest --from-pr feedback-loop CLI
  5. Seed manifestos for forge-loop itself — informed by THIS week's bugs (state-machine edges, subprocess fallthrough, refcount checks)

Connection to today's bugs

Each iteration-probe bug would have been caught by these manifesto rules:
