fix(iteration): bail on CLOSED PR + fetch before ahead-count #120

Merged
hadamrd merged 1 commit into trunk from fix/iteration-probe-closed-pr on May 28, 2026

hadamrd commented on May 28, 2026

Bug

Caught dogfooding the loop on Titan #1104:

  • attempt 2 → dirty_no_commit → commit brief
  • attempt 3 → committed_not_pushed → push brief
  • worker_iterations_exhausted: PR #1107 had been CLOSED externally, but the loop kept pushing

Root cause

  1. `probe_worker_state` checked `state == "MERGED"` but fell through to the no-PR path for `state == "CLOSED"`, locking the loop into a push-forever cycle.
  2. Ahead-count measured `origin/branch..HEAD` against a stale tracking ref. Successful pushes from prior sessions still read as 'commits ahead'.
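
The fall-through in root cause 1 can be illustrated with a minimal sketch. The function names `classify_buggy` and `classify_fixed`, the `MERGED` and `PR_OPEN` enum members, and the `pr_state` argument are hypothetical, chosen for illustration; this is not the actual `probe_worker_state` code. Only the PR state strings, `COMMITTED_NOT_PUSHED`, and `CLOSED_PR_ABANDONED` come from this PR.

```python
from enum import Enum
from typing import Optional


class WorkerState(Enum):
    # Reduced enum for illustration; MERGED and PR_OPEN are assumed names.
    MERGED = "merged"
    PR_OPEN = "pr_open"
    COMMITTED_NOT_PUSHED = "committed_not_pushed"
    CLOSED_PR_ABANDONED = "closed_pr_abandoned"


def classify_buggy(pr_state: Optional[str]) -> WorkerState:
    """Pre-fix shape: CLOSED is never checked, so it reaches the no-PR path."""
    if pr_state == "MERGED":
        return WorkerState.MERGED
    if pr_state == "OPEN":
        return WorkerState.PR_OPEN
    # No-PR path: a CLOSED PR lands here and looks like unpushed work.
    return WorkerState.COMMITTED_NOT_PUSHED


def classify_fixed(pr_state: Optional[str]) -> WorkerState:
    """Post-fix shape: CLOSED gets its own terminal state."""
    if pr_state == "MERGED":
        return WorkerState.MERGED
    if pr_state == "CLOSED":
        return WorkerState.CLOSED_PR_ABANDONED
    if pr_state == "OPEN":
        return WorkerState.PR_OPEN
    return WorkerState.COMMITTED_NOT_PUSHED
```

In this sketch a closed PR classified by `classify_buggy` is indistinguishable from a branch that has never had a PR, which is what would keep a push brief being issued.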

Fix

  • New `WorkerState.CLOSED_PR_ABANDONED`, added to `TERMINAL_STATES`.
  • Probe returns it on CLOSED-but-not-merged PRs.
  • `run_iteration_loop` calls `escalate_to_human` (label `loop:needs-human` + diagnostic comment) so the dispatcher stops picking the issue up.
  • `git fetch --quiet origin <branch>` runs before the ahead-count. Subprocess failure is suppressed.
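
A rough sketch of the fetch-before-count ordering, assuming the probe shells out to git through an injectable runner. The helper name `commits_ahead`, the `Runner` type, the use of `--count`, and the choice to return `None` when `rev-list` fails are all assumptions made for this example rather than the code in this PR.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Hypothetical seam so tests can swap in a fake instead of running git.
Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def _run(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    return subprocess.run(list(cmd), capture_output=True, text=True, check=False)


def commits_ahead(branch: str, run: Runner = _run) -> Optional[int]:
    """Refresh origin/<branch>, then count commits on HEAD that it lacks."""
    try:
        # Best effort: a failed fetch (non-zero exit or exception) is ignored
        # and the count falls back to whatever tracking ref is already there.
        run(["git", "fetch", "--quiet", "origin", branch])
    except (OSError, subprocess.SubprocessError):
        pass
    result = run(["git", "rev-list", "--count", f"origin/{branch}..HEAD"])
    if result.returncode != 0:
        return None  # assumed policy: report "unknown" rather than guess
    return int(result.stdout.strip())
```

Passing the runner as a parameter is one way to make the ordering observable in a test: a fake can record each command and the test can assert that `fetch` was seen before `rev-list`.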

Test plan

  • 8 new tests in `tests/test_iteration_closed_pr.py` — CLOSED returns terminal, MERGED still wins over CLOSED, OPEN unchanged, fetch-before-rev-list ordering pin, fetch-failure non-fatal, iteration-loop short-circuits + escalates
  • Full suite: 766 passed (was 758), 0 regressions

…-caught)

Bug discovered dogfooding the loop on Titan #1104 (audit retention):
- Iteration loop attempt 2 → dirty_no_commit (commit brief)
- Iteration loop attempt 3 → committed_not_pushed (push brief)
- worker_iterations_exhausted with final_state=committed_not_pushed
- PR was real and OPEN at one point, then CLOSED externally; the
  iteration loop never noticed and kept trying to push to a branch
  whose work had been abandoned.

Two root causes:

1. ``probe_worker_state`` checked ``state == "MERGED"`` but fell
   through to the "no PR" path for ``state == "CLOSED"``. That fed
   the COMMITTED_NOT_PUSHED → push brief loop forever.

2. The ahead-count (``git rev-list origin/branch..HEAD``) was
   measured against a stale tracking ref. A successful push from a
   prior session still read as 'commits ahead' because nothing in
   the probe path refreshed ``origin/<branch>``.

Fixes:
- New WorkerState.CLOSED_PR_ABANDONED. Added to TERMINAL_STATES so
  the iteration loop bails on it.
- Probe returns CLOSED_PR_ABANDONED when the PR is in CLOSED state
  (not MERGED, not OPEN).
- run_iteration_loop main body calls escalate_to_human (label
  loop:needs-human + diagnostic comment) when it sees the abandoned
  state, so the dispatcher doesn't keep picking up the issue every
  tick.
- ``git fetch --quiet origin <branch>`` runs before the ahead-count.
  Subprocess failure is suppressed — degrades to legacy behaviour
  rather than crashing the probe.
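
The short-circuit described in the fixes above might look roughly like this. The function signature, the callable parameters, the `DONE` member, and `max_attempts` are hypothetical stand-ins; the real `run_iteration_loop` is not shown in this PR conversation.

```python
from enum import Enum
from typing import Callable


class WorkerState(Enum):
    COMMITTED_NOT_PUSHED = "committed_not_pushed"
    CLOSED_PR_ABANDONED = "closed_pr_abandoned"
    DONE = "done"  # hypothetical stand-in for the other terminal states


TERMINAL_STATES = frozenset({WorkerState.DONE, WorkerState.CLOSED_PR_ABANDONED})


def run_iteration_loop(
    probe: Callable[[], WorkerState],
    dispatch_worker: Callable[[WorkerState], None],
    escalate_to_human: Callable[[str], None],
    max_attempts: int = 3,
) -> WorkerState:
    """Dispatch follow-up work until a terminal state or the attempt cap."""
    state = probe()
    attempts = 0
    while state not in TERMINAL_STATES and attempts < max_attempts:
        dispatch_worker(state)
        attempts += 1
        state = probe()
    if state is WorkerState.CLOSED_PR_ABANDONED:
        # Hand off instead of retrying: the branch's work was abandoned.
        escalate_to_human("PR was closed without being merged")
    return state
```

With a probe that reports `CLOSED_PR_ABANDONED` on the first call, this version never calls `dispatch_worker` and calls `escalate_to_human` exactly once.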

Tests: tests/test_iteration_closed_pr.py — 8 new:
- CLOSED PR returns CLOSED_PR_ABANDONED (the headline regression pin)
- CLOSED_PR_ABANDONED is terminal
- brief_kind_for(CLOSED_PR_ABANDONED) is None (no follow-up brief)
- MERGED still takes precedence over the new CLOSED branch
- OPEN classification unchanged
- Probe calls git-fetch BEFORE git-rev-list (ordering pin)
- Fetch failure non-fatal (network down doesn't crash probe)
- run_iteration_loop short-circuits + escalates on CLOSED_PR_ABANDONED
  (no dispatch_worker call, escalate_to_human fires)
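
Two of the pins listed above (terminal membership and no follow-up brief) could be written along these lines. This is a sketch, not the contents of tests/test_iteration_closed_pr.py: the reduced enum, the one-element `TERMINAL_STATES`, and the `"commit"` / `"push"` return values of `brief_kind_for` are assumptions inferred from the wording of this commit message.

```python
from enum import Enum
from typing import Optional


class WorkerState(Enum):
    DIRTY_NO_COMMIT = "dirty_no_commit"
    COMMITTED_NOT_PUSHED = "committed_not_pushed"
    CLOSED_PR_ABANDONED = "closed_pr_abandoned"


# Reduced set for illustration; the real TERMINAL_STATES has other members.
TERMINAL_STATES = frozenset({WorkerState.CLOSED_PR_ABANDONED})

# Assumed brief names, taken from the "commit brief" / "push brief" wording.
_BRIEFS = {
    WorkerState.DIRTY_NO_COMMIT: "commit",
    WorkerState.COMMITTED_NOT_PUSHED: "push",
}


def brief_kind_for(state: WorkerState) -> Optional[str]:
    """Return the follow-up brief for a state, or None if there is none."""
    return _BRIEFS.get(state)


def test_closed_pr_abandoned_is_terminal() -> None:
    assert WorkerState.CLOSED_PR_ABANDONED in TERMINAL_STATES


def test_terminal_states_have_no_follow_up_brief() -> None:
    # Iterating the set means a newly added terminal state is covered too.
    for state in TERMINAL_STATES:
        assert brief_kind_for(state) is None
```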

Also updated tests/test_iteration_probe.py::test_terminal_set_...
to include CLOSED_PR_ABANDONED in its membership assertions.

Full suite: 766 passed (was 758), 0 regressions vs trunk baseline.
hadamrd merged commit a548818 into trunk on May 28, 2026
2 checks passed
hadamrd deleted the fix/iteration-probe-closed-pr branch on May 28, 2026 13:55
hadamrd added a commit that referenced this pull request May 28, 2026
… (#139)

Dogfood the manifestos system on forge-loop itself by writing the seed
quality and testing manifestos that every future forge-loop change is
gated against.

quality-manifesto.md codifies five rules drawn from this week's
persistent-worker work: no shared module-level state (#100), typed
Protocol+Fake at every I/O boundary (#104), single Settings source of
truth (#98), typed events instead of untyped **fields (#99), and no
subprocess.run for SDK-able services (#103, #105). Each rule names the
concrete issue it came from so future contributors know the *why*.

testing-manifesto.md codifies six rules drawn from this week's
iteration-probe bugs: one test per state-machine edge plus a fallthrough
adversarial (would have caught #97/#120/#128), an adversarial test for
the false case of every external-dep assumption, both ==0 and !=0
branches for every subprocess.returncode (specifically #128), a
contract test pinning every Fake to its Real, hypothesis property
tests on >4-branch / user-input functions (#102), and an adversarial
test that every infinite-loop guard actually fires.

tests/test_manifestos_discovery.py is the meta-validation gate: it
discovers and parses both files, asserts each rule has a rationale,
asserts the spec-mandated issue references are present, and includes
adversarial tests that stubs and missing files are detectable. 22
tests, all pass.
hadamrd added a commit that referenced this pull request May 28, 2026
…loop) (#149)

Closes the feedback loop the CTO described: every bug we fix becomes a
permanent gate. Today's PR #147 (critic SDK event-capture mismatch)
exposed a 4-PR train of bugs with the same shape — #97, #120, #128,
#147 — all driven by string-literal discriminators that didn't match
across module boundaries.

The critic (PR #141) reads the quality manifesto + flags sev1
violations. This rule + the critic infrastructure together mean the
next worker that writes ``event["type"] == "result"`` (or similar
cross-module string-comparison) gets the PR auto-blocked with the
manifesto rationale.
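
One way to avoid the stringly-typed comparison described above is to parse the discriminator into an enum at the module boundary, so an unexpected value fails loudly instead of silently not matching. The `EventType` enum, its `ERROR` member, and the helper names below are hypothetical; this is not the manifesto rule's text or the critic's implementation.

```python
from enum import Enum
from typing import Any, Mapping


class EventType(str, Enum):
    RESULT = "result"
    ERROR = "error"  # hypothetical second member, for illustration only


def parse_event_type(event: Mapping[str, Any]) -> EventType:
    """Convert the raw discriminator once; unknown values raise ValueError."""
    return EventType(event["type"])


def is_result(event: Mapping[str, Any]) -> bool:
    # Compare enum members, not string literals repeated across modules.
    return parse_event_type(event) is EventType.RESULT
```

With this shape, a producer that emits a differently spelled type string causes a `ValueError` at the parse step rather than a comparison that is quietly always false.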
hadamrd added a commit that referenced this pull request May 28, 2026
Adds the customer-facing documentation for the manifestos + brainstormer
feature that closed the cosmetic-tickets gap. Real customers consuming
this OSS need to know:

1. The four files they own (.forge/product-vision.md, axes.yaml,
   quality-manifesto.md, testing-manifesto.md).
2. The brainstormer dry-run + --apply workflow.
3. The feedback loop (`forge-loop manifesto suggest --from-pr <N>`)
   where every bug becomes a permanent gate.
4. What the worker + critic see (manifestos injected into briefs;
   sev1 violations block auto-merge).

README: new section "Manifestos & the brainstormer (axis-aligned
tickets)" between Briefs and CLI reference. CLI reference table gains
`brainstorm`, `brainstorm --apply`, `manifesto suggest --from-pr`.

GUIDE: new section 4 "Manifestos: drive what gets built (not just how)"
between "discipline matters" and "the brief is your contract" — with
the real Titan brainstormer output as the worked example. Sections
5-10 renumbered accordingly.

Both docs cite PR #147 as the canonical feedback-loop example: a
stringly-typed event-boundary bug that surfaced after #97/#120/#128
all had the same shape; the fix landed the manifesto rule that the
critic now enforces.