Auto-refine: re-fire stalled rounds when agent is idle + head SHA unchanged by dgershman · Pull Request #507 · radiusmethod/crow

dgershman · 2026-06-14T21:06:51Z

Closes #505

Summary

When Crow tracks a PR sitting in CHANGES_REQUESTED and the agent goes idle after one auto-refine round (busy at dispatch, transient error, agent disagreed with a finding but didn't reply, ran out of context), Crow never re-prompts. The CROW-456 dedup key (sessionID, changesRequested, latestReviewID) was deliberately keyed this way so the agent's own response push doesn't loop the prompt — but it overshoots, also wedging genuine stalls.

This PR adds a bounded re-fire pass that re-emits the same review's transition when all conditions hold:

PR still in CHANGES_REQUESTED and still open (isOpen)
Managed terminal at readiness .agentLaunched and hook-driven AgentActivityState == .idle
Head SHA unchanged since dispatch (the anti-loop guarantee)
10-minute quiet window elapsed since the last emission
At most maxStalledRefires = 1 re-fire per emission — caps the re-prompt count so an agent that legitimately replies instead of pushing doesn't drive infinite re-prompts (or duplicate reply comments)

Plus flips AutoRespondSettings.respondToChangesRequested default to true for fresh configs (see "Default-flip reach" below).

Quiet-window rationale

10 min = 10 poll cycles (IssueTracker.pollInterval = 60s). Claude Code agents can spend that long thinking on a hard finding before responding — short windows would interrupt productive work. Constant: IssueTracker.stalledRefireQuietWindow, easy to tune.

Anti-loop invariants (correctness properties)

EmittedTransitionMeta.headShaAtEmit records the head SHA at dispatch. When the agent pushes a real fix, previousPRStatus[sid].headSha advances; the equality gate in shouldReFireStalledChangesRequested fails; the entry stays in the map; no re-fire. Verified by doesNotReFireWhenAgentPushedNewCommit.
EmittedTransitionMeta.reFireCount (capped at 1) prevents unbounded re-prompts when the agent posts a reply on the PR rather than pushing — a legitimate terminal state encouraged by the auto-respond prompt itself ("If a comment is unclear or you disagree, leave a reply…"). Verified by doesNotReFireOncePerEmissionCapReached + secondReFireOnSameMetaIsSuppressedByCap.
Synthetic re-fires carry PRStatusTransition.isReFire = true. AppDelegate.onPRStatusTransitions skips notifyPRTransition for these so we don't post a fresh macOS notification for the same review every quiet window; AutoRespondCoordinator.handle still dispatches the re-prompt (the useful part).
PRStatus.isOpen (new field, default true for backward decode, set from pr.state == "OPEN") blocks re-fire on merged or closed-unmerged PRs. GitHub doesn't reset reviewDecision on close, so without this gate a dead PR would re-prompt the agent to push to nothing. Verified by doesNotReFireOnMergedPR + doesNotReFireOnClosedUnmergedPR.
Review sessions (session.kind == .review) are skipped, matching the existing AutoRespondCoordinator policy — Crow never commits on behalf of a reviewer.
The toggle gate (respondToChangesRequestedProvider) suppresses re-fire entirely when the user has opted out, so no synthetic transitions reach the notification fan-out for opted-out users.

Settings default flip — limited reach (honest framing)

AutoRespondSettings.respondToChangesRequested was false (a footgun: users assumed Crow refined but the toggle was silently off). Now defaults to true. However, the reach is genuinely narrow: AutoRespondSettings has no custom encode(to:), so any user with a previously-saved config has respondToChangesRequested: false written explicitly, and the decodeIfPresent ... ?? true path returns the persisted value. So the new true default only lands on:

Brand-new configs (no store.json yet)
Configs predating the autoRespond key entirely (very old upgrades)

Existing active users who never touched the toggle keep false. Flipping their value blindly would also clobber real opt-outs, which the decoder can't distinguish from "untouched old default" — so we don't. A one-time opt-in migration prompt is a reasonable follow-up but is out of scope for this PR.

Covered by AutoRespondSettingsDefaultsTests.

Persistence migration

PersistedIssueTrackerState gained an optional emittedTransitionMeta: [String: EmittedTransitionMeta]? field. EmittedTransitionMeta itself now has reFireCount: Int (default 0, backward-decode ?? 0). Pre-CROW-505 stores hydrate from the legacy emittedTransitionKeys array: each key gets meta with emittedAt = now (fresh quiet-window clock — no re-fire flood at upgrade), headShaAtEmit sourced from previousPRStatus, and reFireCount = 0. Pre-cap stores that already have the map but lack reFireCount decode it as 0. New writes populate the map and leave emittedTransitionKeys empty so older builds reading the array gracefully fall back to "no keys" (a one-time downgrade prompt burst, documented).

Tests (all listed in the ticket acceptance, plus expansions for each review round)

Ticket / Review	Test	Result
1. Bug repro	`reFiresWhenAllConditionsMet`	✅
2. Anti-loop guarantee (head SHA)	`doesNotReFireWhenAgentPushedNewCommit`	✅
3. Round-2 still works	`roundTwoFromReviewerProducesDistinctDedupKey` + existing `PRStatusTransitionTests`	✅
4. Agent busy = no re-fire	`doesNotReFireWhileAgentWorking` / `Waiting` / `Done`	✅
5. Quiet window respected	`doesNotReFireBeforeQuietWindowElapsed`	✅
Review #1 (toggle gate)	`IssueTrackerStalledRefireWiringTests.toggleOffSuppresses…`, `toggleOnPermits…`, review-session skip, agent-activity, readiness, missing-terminal	✅
Review #2 (open/closed gate)	`doesNotReFireOnMergedPR`, `doesNotReFireOnClosedUnmergedPR`	✅
Review #3 (re-fire cap)	`doesNotReFireOncePerEmissionCapReached`, `reFiresWhileUnderCap`, `secondReFireOnSameMetaIsSuppressedByCap`, `toggleOnPermits…` asserts `isReFire == true` + `reFireCount == 1`	✅
Persistence round-trip	`issueTrackerStateRoundTripsThroughDisk` (extended with `reFireCount`) + `emittedTransitionMetaDecodesLegacyEntryWithoutReFireCount`	✅
Settings default flip	4 `AutoRespondSettingsDefaultsTests`	✅

Full suite (236 tests, 26 suites) green via arch -arm64 swift test --arch arm64.

Session-card "Resume refine" button — deferred

The ticket's alternative path. The auto re-fire (now bounded + idempotent) directly answers the user's complaint ("if crow is tracking a pr, it should refine when it needs to") and dispatchManual(.addressChanges, ...) already exists for any future button. Adding a SwiftUI surface now means another set of trigger conditions to keep in sync with the auto path. Worth a follow-up if telemetry shows users wanting a faster manual kick.

Test plan

Unit tests for all five ticket acceptance scenarios pass
Persistence round-trip survives legacy and new shapes (issueTrackerStateRoundTripsThroughDisk)
Existing PRStatusTransitionTests, IssueTrackerDedupTests, IssueTrackerCompletionTests, etc. still pass — no regressions
swift build --arch arm64 clean
All three review rounds' findings addressed: toggle gate (Review Native macOS IDE with Corveil branding and GitHub integration #1), open/closed gate (Review Integrate Claude Code hooks and event system for session activity tracking #2), re-fire cap + notification-noise gate (Review ride-workspace: detect existing PRs and use their branch instead of creating new ones #3)
Manual smoke: attach to a CHANGES_REQUESTED PR; kill the agent before it pushes; wait ~10 min; confirm Console shows auto-refine re-fired (stalled review) ... count=1/1 and the prompt arrives in the terminal once
Manual cap check: leave the session idle for another 10 min; confirm a second re-fire does NOT occur (Console silent)
Manual anti-loop check: with the agent actively working, watch a poll cycle pass without re-firing

🤖 Generated with Claude Code

When Crow tracks a PR sitting in CHANGES_REQUESTED and the agent goes idle after one auto-refine round, Crow never re-prompts — the dedup key keeps the round suppressed until the reviewer submits a new formal review. Now Crow re-fires the same transition when the managed terminal's agent is idle, the head SHA hasn't advanced since dispatch (anti-loop guarantee from CROW-456 preserved), and a 10-minute quiet window has elapsed. Storage: `IssueTracker.emittedTransitionKeys: Set<String>` becomes `emittedTransitionMeta: [String: EmittedTransitionMeta]`, where the meta carries `emittedAt` (quiet-window clock) and `headShaAtEmit` (SHA-equality anti-loop gate). `PersistedIssueTrackerState` gains an optional `emittedTransitionMeta` field; pre-CROW-505 stores are migrated by synthesizing meta from the legacy key array, sourcing `headShaAtEmit` from the also-persisted `previousPRStatus` and starting the quiet-window clock at now (no immediate re-fire flood at upgrade). Also flips `AutoRespondSettings.respondToChangesRequested` default to true for new configs. Existing users' explicit choices stay sticky: `decodeIfPresent` returns the written value, so users who explicitly toggled the setting off keep it off. 🐦‍⬛ Generated with Claude Code, orchestrated by Crow Co-Authored-By: Claude <noreply@anthropic.com> Crow-Session: C69B361D-F6EE-49FC-9F5B-E7A3928F5F1E

dhilgaertner

Code & Security Review

Solid, well-scoped change with a strong test story around the pure re-fire predicate. The anti-loop invariant (head-SHA gate) is correct and well-covered, the persistence migration is careful (fresh emittedAt at upgrade, empty legacy array for graceful downgrade), and the settings-default flip preserves explicit user choices. One behavioral regression keeps this from a clean approve.

Critical Issues

None.

Security Review

Strengths:

No new network/API scopes — re-fire reuses the existing polled PRStatus/hook state.
Anti-loop guarantee preserved: shouldReFireStalledChangesRequested requires meta.headShaAtEmit == currentStatus.headSha, so an agent push (which advances the head SHA) kills the re-fire. Verified by doesNotReFireWhenAgentPushedNewCommit. A nil baseline conservatively refuses to re-fire (doesNotReFireWhenBaselineShaNil).
Review sessions are skipped (session.kind != .review), matching AutoRespondCoordinator's policy — Crow won't commit on behalf of a reviewer.

Concerns:
None.

Code Quality

Yellow — re-fire pass ignores the respondToChangesRequested toggle, spamming notifications for opted-out users

reFireStalledChangesRequested (Sources/Crow/App/IssueTracker.swift) is appended unconditionally to the transition list and is never gated on the auto-respond setting:

emittedTransitionMeta is populated for every .changesRequested transition in applyPRStatuses regardless of the toggle (emission happens in the tracker; the toggle is only consulted downstream in AutoRespondCoordinator.handle). So opted-out users still accumulate meta entries.
The PR deliberately preserves the sticky "off" choice (decodeIfPresent ... ?? true, decoderPreservesExplicitFalseChoice). But for such a user, a PR sitting in CHANGES_REQUESTED with an idle managed terminal at .agentLaunched and an unchanged head SHA will re-emit a synthetic transition every quiet window (~10 min).
Each synthetic transition flows through onPRStatusTransitions (AppDelegate.swift:778), which calls notifyPRTransition for every transition before AutoRespondCoordinator.handle. postSystemNotification builds a unique identifier with a fresh timestamp (NotificationManager.swift:324), so nothing coalesces — a brand-new "Changes Requested" macOS notification fires every ~10 minutes, indefinitely, until the SHA changes or the PR leaves the bucket.
Meanwhile AutoRespondCoordinator.handle skips the actual dispatch (toggle off), so the re-fire produces zero useful action — it's pure notification noise that defeats the user's opt-out.

Fix: gate the re-fire on the auto-respond setting (e.g. wire a respondToChangesRequestedProvider: () -> Bool closure into IssueTracker, mirroring autoMergeWatcherEnabledProvider et al., and early-return from reFireStalledChangesRequested when it's off). That also keeps the re-fire pass from doing work for users who will never be auto-prompted. A focused test asserting "no synthetic transition when the toggle is off" would lock it in.

Code Quality (Green — consider)

The full reFireStalledChangesRequested wiring (toggle interaction, notification fan-out, emittedAt bump on re-fire) is exercised only at the pure-predicate level. An integration-style test over the map-walking method would catch the gating regression above and future wiring drift.
now.timeIntervalSince(meta.emittedAt) is computed twice (predicate + NSLog elapsed); negligible, fine to leave.

Test / Build Notes

CrowCore (275 tests) and CrowPersistence (30 tests) suites pass locally via swift test --arch arm64.
The new IssueTrackerStalledRefireTests live in the main Crow target, which cannot be built in this review environment — it requires the vendored Frameworks/GhosttyKit.xcframework, which is absent from the checkout (error: local binary target 'GhosttyKit' ... does not contain a binary artifact). The predicate logic was verified by reading; CI should confirm the run.

Summary Table

Color	Meaning	Verdict effect
Red	Must fix	Request changes
Yellow	Should fix	Request changes
Green	Consider	Approve allowed

Recommendation: Request Changes — driven by [0 Red, 1 Yellow, 2 Green] findings. The re-fire pass should respect the respondToChangesRequested toggle so it doesn't re-notify (with no corresponding action) users who explicitly opted out.

🐦‍⬛ Reviewed by Crow via Claude Code

Without this gate a user who explicitly opted out of auto-respond would still see a fresh "Changes Requested" macOS notification every quiet window — `onPRStatusTransitions` calls `notifyPRTransition` unconditionally and only the actual prompt dispatch is gated downstream in `AutoRespondCoordinator.handle`, so the synthetic re-fire produced pure notification noise for zero useful action. Adds `respondToChangesRequestedProvider: () -> Bool` to `IssueTracker` mirroring `autoMergeWatcherEnabledProvider` et al., early-returns from `reFireStalledChangesRequested` when the toggle is off, and wires the provider from `AppDelegate`. Also adds an integration test suite ("IssueTracker stalled re-fire wiring") that drives the map-walking method directly so future wiring drift is caught: toggle-off suppresses + does not bump emittedAt, toggle- on emits + bumps, review sessions are skipped, agent activity gate, and unlaunched/missing-terminal gates. The pure-predicate suite remains as the unit-level coverage. Addresses review feedback on PR #507. 🐦‍⬛ Generated with Claude Code, orchestrated by Crow Co-Authored-By: Claude <noreply@anthropic.com> Crow-Session: C69B361D-F6EE-49FC-9F5B-E7A3928F5F1E

dhilgaertner

Code & Security Review

Solid, well-documented PR. The anti-loop head-SHA invariant is the right core design and is well tested, the persistence migration is careful, and the notification-noise gate (respondToChangesRequestedProvider) is a thoughtful catch. One correctness gap blocks approval.

Security Review

Strengths:

No new network/API scopes, secrets, or shell-injection surface. Re-fire reuses the existing AutoRespondPrompts text and TerminalRouter.send path.
The review-session gate (session.kind != .review, IssueTracker.swift:1679) correctly mirrors AutoRespondCoordinator.shouldSkipReviewSession, so Crow still never commits on behalf of a reviewer.
The opt-out gate at IssueTracker.swift:1668 prevents notification spam for users who disabled auto-respond. Good defensive call.

Code Quality

Yellow — re-fire can fire on merged/closed PRs (no open-state guard). shouldReFireStalledChangesRequested (Sources/Crow/App/IssueTracker.swift:1740-1752) gates only on reviewStatus == .changesRequested. It never checks that the PR is still open. The standing-state re-fire reads previousPRStatus[sid] directly, which the existing CROW-456 emit path never did — the old path only fired on transition edges, so a PR that merged/closed while in CHANGES_REQUESTED produced no edge and no fire. The new path reintroduces that risk:

allKnownPRs includes stalePRs (merged/closed states fetched for tracked sessions, IssueTracker.swift:470-472), so buildPRStatus runs for them. A merged or closed PR commonly retains reviewDecision == CHANGES_REQUESTED (or the latestReviewStates fallback at IssueTracker.swift:2133-2139 keeps it), so newStatus.reviewStatus == .changesRequested.
The per-session re-arm cleanup (IssueTracker.swift:1534) only drops the meta entry when reviewStatus transitions away from .changesRequested — which never happens here — so the entry persists and previousPRStatus[sid] stays in the bucket.
Merging/closing does not change the PR head SHA, so the anti-loop SHA guard (condition 3) still passes.

Result: for a PR that was in CHANGES_REQUESTED and then merged or (more commonly) closed-unmerged, with an idle agent and a still-launched managed terminal, the re-fire emits a synthetic transition every quiet window indefinitely, prompting the agent to "address review feedback, commit, and push" to a dead PR. PRStatus exposes isMerged (mergeable == .merged) but has no closed-unmerged signal, so a complete fix likely means either (a) dropping the changesRequested meta when newStatus.isMerged and gating re-fire on the PR still being in the live open-viewer set, or (b) adding an open/closed flag to PRStatus. There is no test covering a merged or closed PR in IssueTrackerStalledRefireTests — doesNotReFireWhenPRLeftChangesRequestedBucket only covers .approved/.reviewRequired/.unknown.

Consider (Green — non-blocking)

Default-flip reach (AppConfig.swift:255,264). AutoRespondSettings encodes both Bools unconditionally (no custom encode(to:)), so any user with a previously-saved config has respondToChangesRequested: false written explicitly and keeps it off via decodeIfPresent. The flip therefore only reaches genuinely fresh configs — existing users who never touched the toggle (the "footgun" cohort the PR describes) still won't get auto-refine until they enable it. Confirm that matches intent.
Downgrade re-fire flood. persistTrackerState writes emittedTransitionKeys: [] always (IssueTracker.swift:1638). On a downgrade to a pre-CROW-505 build the old reader sees zero dedup keys and re-fires every currently-tracked transition once. Documented as "graceful downgrade"; just flagging the one-time prompt burst as a known cost.
Dictionary-mutation-during-iteration in reFireStalledChangesRequested (IssueTracker.swift:1671 + 1709) is safe here thanks to copy-on-write value semantics (iteration proceeds over the pre-mutation snapshot), but it's subtle — a brief comment would help future readers.

Summary Table

Color	Meaning	Verdict effect
Red	Must fix	Request changes
Yellow	Should fix	Request changes
Green	Consider	Approve allowed

Recommendation: Request Changes — driven by [0 Red, 1 Yellow, 3 Green] findings. The merged/closed-PR re-fire is the one blocker; the rest are confirmations and minor notes.

Note: I reviewed by static analysis and reasoning; I did not independently run the Swift suite (the PR reports 225 tests green).

🐦‍⬛ Reviewed by Crow via Claude Code

Reviewer caught a correctness gap on PR #507: the standing-state re-fire pass reads `previousPRStatus[sid]` directly rather than firing on transition edges, and a PR that merged or closed-unmerged while in CHANGES_REQUESTED keeps `reviewDecision` unchanged. With the agent idle, head SHA unchanged (closing doesn't move HEAD), and the quiet window elapsed, re-fire would prompt the agent to "address review feedback" on a dead PR every quiet window indefinitely. Adds `PRStatus.isOpen: Bool` (default `true` for backward decode), populated from `pr.state == "OPEN"` in `buildPRStatus`. Gates `shouldReFireStalledChangesRequested` on `currentStatus.isOpen`. Also extends the `applyPRStatuses` re-arm cleanup to drop changesRequested meta entries on the open→closed edge — defense in depth so the map can't accumulate dead entries that the predicate has to repeatedly reject. Adds two new predicate tests for merged and closed-unmerged PRs covering both `mergeable == .merged` and `mergeable == .unknown`. Also addresses the non-blocking notes from the same review: - Adds a docstring on `reFireStalledChangesRequested` explaining that the in-loop mutation of `emittedTransitionMeta` is safe thanks to Swift dictionary copy-on-write iteration semantics. The other two Green notes are confirmations and need no code change: the default-flip's reach (encoder writes the field unconditionally, so the new `true` default only lands on genuinely fresh configs — matches CROW-505's stated intent), and the documented one-time downgrade prompt burst. 🐦‍⬛ Generated with Claude Code, orchestrated by Crow Co-Authored-By: Claude <noreply@anthropic.com> Crow-Session: C69B361D-F6EE-49FC-9F5B-E7A3928F5F1E

dhilgaertner

Code & Security Review

Solid, well-documented change. The anti-loop head-SHA invariant is correctly preserved (previousPRStatus[sid] is refreshed every poll at IssueTracker.swift:1571 before reFireStalledChangesRequested reads it), the same-poll double-fire is avoided (fresh meta gets emittedAt = now, elapsed 0 < quietWindow), the COW iteration-during-mutation reasoning is sound, and the persistence migration + isOpen decode default are backward-safe. Model-level tests (AutoRespondSettingsDefaultsTests, CrowPersistence round-trip) pass locally. The full app suite couldn't run in the review env (missing Frameworks/GhosttyKit.xcframework binary artifact) — not a PR issue.

Two concerns below force request-changes; both are about behavior/intent, not crashes.

Security Review

Strengths:

No new network/API scopes; re-fire reuses the existing prompt-injection path.
Review sessions are correctly excluded from re-fire (reFireStalledChangesRequested checks session.kind != .review), mirroring AutoRespondCoordinator.shouldSkipReviewSession — Crow never commits on behalf of a reviewer.
Opt-out gate (respondToChangesRequestedProvider) prevents notification noise for users who disabled the toggle.

Concerns: none security-specific.

Code Quality

🟡 Unbounded re-fire when the agent legitimately replies instead of pushing (Sources/Crow/App/IssueTracker.swift — reFireStalledChangesRequested / shouldReFireStalledChangesRequested)

The anti-loop guarantee keys entirely on head-SHA advancement. But the auto-respond prompt itself explicitly instructs the agent to not always push: "If a comment is unclear or you disagree, leave a reply explaining your reasoning instead of changing the code" (AutoRespondCoordinator.swift:156). When the agent does exactly that — posts a reply, doesn't push — the head SHA is unchanged, the agent goes idle, and there is no re-fire cap. So every quiet window (10 min) the identical prompt is re-injected and a fresh notifyPRTransition macOS notification fires, indefinitely, until a human/reviewer rotates the PR out of CHANGES_REQUESTED.

Worst case is external-facing: the agent re-runs every 10 min and may post duplicate reply comments onto the human reviewer's PR — repeated noise on someone else's PR, plus repeated Claude invocations. The PR description lists "agent disagreed with a finding but didn't reply" as a stall to recover, but the head-SHA gate can't distinguish that from "agent disagreed and did reply," which is a legitimate terminal state.

Suggest bounding it: a per-dedup-key re-fire counter (stop after N re-fires with no SHA change), and/or suppressing the macOS notification on re-fires (the prompt injection is the useful part; a notification every 10 min is just noise).

🟡 The default-flip is largely a no-op for the exact users it targets (Packages/CrowCore/Sources/CrowCore/Models/AppConfig.swift:255,264)

AutoRespondSettings.respondToChangesRequested now defaults true, framed as fixing the footgun where "users assumed Crow refined but the toggle was silently off." But that footgun population won't be helped:

AppConfig.init(from:) decodes autoRespond via decodeIfPresent(...) ?? AutoRespondSettings() (AppConfig.swift:127), and AutoRespondSettings has no custom encode(to:), so any config saved by a recent build has respondToChangesRequested: false written explicitly.
AutoRespondSettings.init(from:) then reads that persisted false → stays off.

So the new true default only reaches brand-new configs and configs predating the autoRespond key entirely. Existing active users who ran a recent version and never touched the toggle — precisely the people who "assumed Crow refined" — keep it off. The decoder genuinely can't distinguish "explicit opt-out" from "untouched old default," so a blind migration would also clobber real opt-outs — but the description should at least be corrected to reflect the limited reach, or an intentional one-time migration considered. The code is safe; the gap is between stated intent and effect.

🟢 (consider) Stale previousPRStatus for de-monitored sessions — re-fire reads previousPRStatus[sid] for every meta entry, but that map is only refreshed for sessions returned by this poll's viewer query. If a PR drops out of the monitored set while a meta entry lingers, re-fire could evaluate against stale status. Low probability; consider scoping re-fire to sessions actually refreshed this cycle.

🟢 (consider) The NSLog re-fire line is nicely greppable; consider also logging the re-fire count once a cap exists, for tuning the quiet window.

Summary Table

Color	Meaning	Verdict effect
Red	Must fix	Request changes
Yellow	Should fix	Request changes
Green	Consider	Approve allowed

Recommendation: Request Changes — driven by [0 Red, 2 Yellow, 2 Green] findings. Both Yellows are bounded behavioral fixes; the architecture and correctness invariants are otherwise sound.

🐦‍⬛ Reviewed by Crow via Claude Code

When the agent legitimately replies on the PR instead of pushing — a behavior the auto-respond prompt explicitly invites ("If a comment is unclear or you disagree, leave a reply…") — the head SHA stays unchanged and the agent goes idle. The previous re-fire pass had no upper bound, so the identical prompt would re-inject every quiet window forever, potentially driving duplicate reply comments on the reviewer's PR. Adds `EmittedTransitionMeta.reFireCount: Int` (backward-decode `?? 0`) and `IssueTracker.maxStalledRefires: Int = 1`. The pure predicate's new condition (5) refuses re-fire once `reFireCount >= maxRefires`; each successful re-fire bumps the count. Edge events (review id rotates, author pushes, bucket exits, PR closes) clear the entry and naturally reset the counter via the existing re-arm path. Also tags synthetic transitions with `PRStatusTransition.isReFire = true`. `AppDelegate.onPRStatusTransitions` skips `notifyPRTransition` for re-fires, so a stalled review doesn't generate a fresh macOS notification every quiet window — the re-prompt itself (still routed through `AutoRespondCoordinator.handle`) is the useful part; the notification was pure noise. The default-flip Yellow from the same review is a description-only concern (encoder writes the field unconditionally → existing users who never touched the toggle keep `false`; the new `true` default only lands on fresh configs). PR description updated to reflect the limited reach honestly. A one-time opt-in migration prompt is a reasonable follow-up but is out of scope. New tests: - `doesNotReFireOncePerEmissionCapReached` + `reFiresWhileUnderCap` (pure predicate, cap semantics) - `secondReFireOnSameMetaIsSuppressedByCap` (integration over the map-walking pass — exercises the bump + re-evaluation) - `toggleOnPermitsSyntheticTransitionAndBumpsEmittedAt` extended to assert `isReFire == true` and `reFireCount == 1` - `emittedTransitionMetaDecodesLegacyEntryWithoutReFireCount` (backward decode default) - `issueTrackerStateRoundTripsThroughDisk` extended with `reFireCount` 🐦‍⬛ Generated with Claude Code, orchestrated by Crow Co-Authored-By: Claude <noreply@anthropic.com> Crow-Session: C69B361D-F6EE-49FC-9F5B-E7A3928F5F1E

dhilgaertner

Code & Security Review

Thorough re-fire design with strong backward-compat discipline and genuinely comprehensive tests. I verified the core invariants by reading the wiring end-to-end and running the buildable package suites.

Verified correctness

Anti-loop (head SHA) holds: shouldReFireStalledChangesRequested requires meta.headShaAtEmit == s.headSha; a real agent push advances previousPRStatus[sid].headSha and kills the re-fire. previousPRStatus[session.id] is updated (IssueTracker.swift:1582) before the re-fire pass runs (1592), so the gate reads fresh state.
isOpen gate is actually effective. The viewer query is OPEN-only, but a merged/closed CHANGES_REQUESTED PR is picked up by the stale-PR follow-up (collectStalePRURLs → fetchStalePRStates) and flows through applyPRStatuses, so buildPRStatus sets isOpen=false, the open→closed edge drops the meta entry (1556), and previousPRStatus reflects the close. The predicate's s.isOpen check is correct defense-in-depth on top of that.
Cross-provider state: pr.state == "OPEN" is correct for GitLab too — GitLabCodeBackend.normalizeState maps opened→OPEN on both the monitored and stale paths.
Synthetic transition routing: AppDelegate skips notifyPRTransition on isReFire while AutoRespondCoordinator.handle still dispatches the re-prompt; respondToChangesRequested is gated consistently at both re-fire generation and dispatch.
Cap, quiet window, review-session skip, terminal-readiness/agent-idle gates all read correctly.
Backward-compatible decoding throughout: isOpen ?? true, reFireCount ?? 0, optional emittedTransitionMeta, legacy-key migration with fresh emittedAt. Dictionary mutation-during-iteration is safe (COW snapshot).

Security Review

Strengths:

Re-fire is gated on the user toggle, so an opted-out user gets neither the dispatch nor notification noise.
Review sessions are skipped — Crow never commits on behalf of a reviewer.
No new API scopes, secrets, or injection surface; prompt text is the same proven single-line pattern.

Concerns: none blocking.

Code Quality (Green — considerations only, no changes required to merge)

Default flip to true (respondToChangesRequested) changes the consent posture for fresh configs — Crow now auto-types into a managed terminal by default. This is the explicit purpose of CROW-505, reach is narrow (new configs only; explicit opt-outs stay sticky; respondToFailedChecks stays off), and it's well documented. Flagging only for visibility.
Cap=1 + reply path: if the agent legitimately replied on the PR instead of pushing, the one allowed re-fire can produce a duplicate reply comment. Already acknowledged as a deliberate tradeoff in the PR body.
Narrow stale edges (bounded to one re-fire by the cap): (a) a degraded stale-fetch cycle (staleFetch.complete == false) the same poll a PR closes leaves previousPRStatus.isOpen stale; (b) collectStalePRURLs only refreshes .active/.paused/.inReview sessions, while the re-fire pass walks the meta map without a session.status gate — a session moving to completed/archived exactly as its PR closes could re-fire once on stale state. Both are consistent with the existing emission path and bounded; consider mirroring the session.status filter in the re-fire walk as a future tidy-up.

Testing

CrowCore (275 tests) and CrowPersistence (31 tests) pass locally, including the new AutoRespondSettings defaults suite and the emittedTransitionMeta round-trip / legacy-decode tests.
Tests/CrowTests (the IssueTracker stalled-refire suite) couldn't be built in this review environment — Frameworks/GhosttyKit.xcframework contains no binary artifact here. This is an environment limitation, not a PR issue; the test code reads correctly against the APIs and the author reports the full 236-test suite green.

Summary Table

Color	Meaning	Verdict effect
Red	Must fix	Request changes
Yellow	Should fix	Request changes
Green	Consider	Approve allowed

Recommendation: Approve — driven by [0 Red, 0 Yellow, 3 Green] findings.

🐦‍⬛ Reviewed by Crow via Claude Code

…CROW-508) (#509) Closes #508 ## Summary Replaces the edge-triggered "needs refine" machinery shipped in #507 (1-day-old) with a stateless rule that derives the answer from the PR snapshot on every poll. Motivated by [shell-crm#202](radiusmethod/shell-crm#202): even with #507 deployed, that PR sat in `CHANGES_REQUESTED` for 30+ minutes without Crow re-prompting after a restart — the bookkeeping that #507 added carried its own failure modes (poll loop starvation, missing persistence, dedup map drift after restart). **The rule:** emit `.changesRequested` when ALL of: 1. PR is OPEN and `reviewDecision == CHANGES_REQUESTED` 2. `lastChangesRequestedAt` (max `submittedAt` across CHANGES_REQUESTED reviews) > `lastSubstantiveCommitAt` (max `committedDate` across commits with `parents.totalCount < 2` AND whose subject doesn't start with `Merge branch|remote-tracking|pull request`) 3. Managed terminal is idle (readiness == `.agentLaunched`, agent activity == `.idle`) 4. `AutoRespondSettings.respondToChangesRequested` is on 5. PR has been observed at least once this Crow process (first-observation skip) 6. 7 minutes have elapsed since the last dispatch for this PR (cooldown) Anti-loop is **automatic**: any real (non-merge, non-rebase) commit advances `lastSubstantiveCommitAt` past `lastChangesRequestedAt` and the rule stops firing on the next poll. No `headShaAtEmit` snapshot needed. The GitHub "Update branch" button produces a merge commit that's filtered out upstream, so it can't trick the rule. Restart-safe by construction: a fresh Crow process re-derives the truth on the next successful poll. The 7-min cooldown is in-memory only; worst case after restart is one duplicate prompt before the cooldown re-arms. ## Decisions made - **Same-PR deletion, not follow-up.** The diff is easier to reason about as one atomic swap, and reverting is a single-SHA revert. Listed all deleted symbols below. - **Cooldown of 7 min** (middle of the ticket's 5–10 min range). 7 min sits comfortably between "long enough that an agent thinking through a hard finding isn't interrupted" and "short enough that a true stall surfaces within ~3 poll cycles." - **Tree-equals-parents check skipped** (the ticket called it optional). The merge-message + parent-count filter catches every shape I observed in the wild; the tree check would add an extra API call per PR for marginal coverage. - **GraphQL query bumped `latestReviews(first: 5)` → `first: 20`** and added `commits(last: 30)` with `oid`, `messageHeadline`, `committedDate`, `parents.totalCount`. Per-PR cost is small; the stateless rule needs both timestamps every poll. - **No transitions() emission for `.changesRequested`.** The pure `PRStatus.transitions(…)` function now only emits `.checksFailing`. `.changesRequested` lives entirely in `IssueTracker.applyPRStatuses` (the stateless dispatch path). ## Symbols deleted - `PersistedIssueTrackerState` (CrowPersistence) - `StoreData.issueTrackerState` field (decode-tolerant — existing stores ignore the key, no migration needed) - `EmittedTransitionMeta` (CrowCore) - `PRStatusTransition.dedupeKey`, `.latestReviewID`, `.isReFire` fields - `PRStatus.latestReviewID` field - `PRRecord.latestReviewID` field (replaced with `lastChangesRequestedAt`, `lastSubstantiveCommitAt`) - `IssueTracker.emittedTransitionMeta` map - `IssueTracker.hydratePersistedState()`, `persistTrackerState()` - `IssueTracker.reFireStalledChangesRequested(now:)` - `IssueTracker.shouldReFireStalledChangesRequested(meta:…)` predicate - `IssueTracker.stalledRefireQuietWindow`, `maxStalledRefires` constants - `IssueTracker.parseSessionID(fromDedupKey:)`, `parseKind(fromDedupKey:)` helpers - `AppDelegate` re-fire notification skip block ## New code - `PRStatus.needsRefine(status:terminalIdle:)` — pure rule in CrowCore. ~12 lines. - `IssueTracker.seenPRs: Set<String>` (PR URL keyed; first-observation skip) - `IssueTracker.lastRefineDispatchAt: [String: Date]` (PR URL keyed; cooldown) - `IssueTracker.needsRefineCooldown` constant (7 min) - `IssueTracker.isManagedTerminalIdle(sessionID:)` helper - `GitHubCodeBackend.isMergeCommitMessage(_:)` helper (merge-prefix detection, shared with tests) - `PRRecord.lastChangesRequestedAt`, `.lastSubstantiveCommitAt` fields - `PRStatus.lastChangesRequestedAt`, `.lastSubstantiveCommitAt` fields ## Acceptance tests All four ticket scenarios are automated: | # | Scenario | Test | |---|----------|------| | 1 | Round-N stall fires on idle poll (after first-observation skip) | `IssueTrackerNeedsRefineTests.roundNStallFiresOnSecondPoll` | | 2 | Merge-from-main does NOT reset the rule | `IssueTrackerNeedsRefineTests.mergeFromMainDoesNotFlipTheRule` + `BackendsTests.testParseMonitoredPRsLastSubstantiveCommitExcludesMergesAndRebases` | | 3 | Real fix flips the rule off | `IssueTrackerNeedsRefineTests.realFixStopsTheRule` + `PRStatusNeedsRefineTests.doesNotFireWhenCommitIsNewerThanReview` | | 4 | Restart-safe: skip → fire → cooldown blocks → cooldown elapses → fire again | `IssueTrackerNeedsRefineTests.restartFreshTrackerSkipsThenFiresThenCoolsDown` | Plus defense-in-depth tests for the busy-agent, pre-launch-terminal, opt-out, missing-timestamp, closed-PR, and merge-message-shape gates. Total new/rewritten: 23 tests across 3 suites. ## Deliberate-break verifications Each guard is pinned by a test that flips loudly when the guard is removed: - **Remove cooldown** → `restartFreshTrackerSkipsThenFiresThenCoolsDown` poll 3 count would go from 1 to 2. - **Remove first-observation-skip** → `roundNStallFiresOnSecondPoll` poll 1 count would go from 0 to 1. - **Remove merge-commit filter** → `testParseMonitoredPRsLastSubstantiveCommitExcludesMergesAndRebases` would observe `lastSubstantiveCommitAt == 2026-06-07` instead of `2026-06-01`. I read each assertion to confirm the inversion would fail loudly rather than mechanically running them (the assertions themselves are the forcing function). ## Test plan - [x] `swift test --package-path Packages/CrowCore` — 273 tests pass (includes new `PRStatus.needsRefine (CROW-508)` suite) - [x] `swift test --package-path Packages/CrowPersistence` — 30 tests pass (includes new "decoder ignores legacy issueTrackerState blob" coverage) - [x] `swift test --package-path Packages/CrowProvider` — 49 tests pass (includes new commit-parsing + merge-filter tests) - [x] `swift test` (root) — 219 tests pass (includes new `IssueTracker stateless needsRefine (CROW-508)` suite) - [x] `swift build --arch arm64` clean - [ ] Manual smoke: attach a `CHANGES_REQUESTED` PR with no agent push since the review → Crow re-prompts on the second poll, then waits 7 min before re-prompting. - [ ] Manual cap check: rebase via "Update branch" → `lastSubstantiveCommitAt` does NOT advance, Crow keeps prompting on the next poll. - [ ] Manual anti-loop check: agent pushes a real fix → `lastSubstantiveCommitAt` advances, Crow stops prompting on the next poll. --------- Co-authored-by: Claude <noreply@anthropic.com>

dgershman requested a review from dhilgaertner as a code owner June 14, 2026 21:06

dgershman added the crow:merge Crow auto-merge on green label Jun 14, 2026

dhilgaertner requested changes Jun 14, 2026

View reviewed changes

dgershman requested a review from dhilgaertner June 14, 2026 21:17

dhilgaertner requested changes Jun 14, 2026

View reviewed changes

dgershman requested a review from dhilgaertner June 14, 2026 22:21

dhilgaertner requested changes Jun 14, 2026

View reviewed changes

dgershman requested a review from dhilgaertner June 14, 2026 22:39

dhilgaertner approved these changes Jun 14, 2026

View reviewed changes

dgershman merged commit 7331c7f into main Jun 14, 2026
2 checks passed

dgershman deleted the feature/crow-505-auto-refine-stalled branch June 14, 2026 22:46

This was referenced Jun 15, 2026

Replace edge-triggered changesRequested detection with a stateless 'last PR event was a CHANGES_REQUESTED review and no substantive commit since' check #508

Closed

Stateless needsRefine: derive CHANGES_REQUESTED state from PR alone (CROW-508) #509

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Auto-refine: re-fire stalled rounds when agent is idle + head SHA unchanged#507

Auto-refine: re-fire stalled rounds when agent is idle + head SHA unchanged#507
dgershman merged 4 commits into
mainfrom
feature/crow-505-auto-refine-stalled

dgershman commented Jun 14, 2026 •

edited

Loading

Uh oh!

dhilgaertner left a comment

Uh oh!

dhilgaertner left a comment

Uh oh!

dhilgaertner left a comment

Uh oh!

dhilgaertner left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dgershman commented Jun 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Quiet-window rationale

Anti-loop invariants (correctness properties)

Settings default flip — limited reach (honest framing)

Persistence migration

Tests (all listed in the ticket acceptance, plus expansions for each review round)

Session-card "Resume refine" button — deferred

Test plan

Uh oh!

dhilgaertner left a comment

Choose a reason for hiding this comment

Code & Security Review

Critical Issues

Security Review

Code Quality

Code Quality (Green — consider)

Test / Build Notes

Summary Table

Uh oh!

dhilgaertner left a comment

Choose a reason for hiding this comment

Code & Security Review

Security Review

Code Quality

Consider (Green — non-blocking)

Summary Table

Uh oh!

dhilgaertner left a comment

Choose a reason for hiding this comment

Code & Security Review

Security Review

Code Quality

Summary Table

Uh oh!

dhilgaertner left a comment

Choose a reason for hiding this comment

Code & Security Review

Verified correctness

Security Review

Code Quality (Green — considerations only, no changes required to merge)

Testing

Summary Table

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

dgershman commented Jun 14, 2026 •

edited

Loading