Auto-refine: re-fire stalled rounds when agent is idle + head SHA unchanged#507
Conversation
When Crow tracks a PR sitting in CHANGES_REQUESTED and the agent goes idle after one auto-refine round, Crow never re-prompts — the dedup key keeps the round suppressed until the reviewer submits a new formal review. Now Crow re-fires the same transition when the managed terminal's agent is idle, the head SHA hasn't advanced since dispatch (anti-loop guarantee from CROW-456 preserved), and a 10-minute quiet window has elapsed. Storage: `IssueTracker.emittedTransitionKeys: Set<String>` becomes `emittedTransitionMeta: [String: EmittedTransitionMeta]`, where the meta carries `emittedAt` (quiet-window clock) and `headShaAtEmit` (SHA-equality anti-loop gate). `PersistedIssueTrackerState` gains an optional `emittedTransitionMeta` field; pre-CROW-505 stores are migrated by synthesizing meta from the legacy key array, sourcing `headShaAtEmit` from the also-persisted `previousPRStatus` and starting the quiet-window clock at now (no immediate re-fire flood at upgrade). Also flips `AutoRespondSettings.respondToChangesRequested` default to true for new configs. Existing users' explicit choices stay sticky: `decodeIfPresent` returns the written value, so users who explicitly toggled the setting off keep it off. 🐦⬛ Generated with Claude Code, orchestrated by Crow Co-Authored-By: Claude <noreply@anthropic.com> Crow-Session: C69B361D-F6EE-49FC-9F5B-E7A3928F5F1E
dhilgaertner
left a comment
There was a problem hiding this comment.
Code & Security Review
Solid, well-scoped change with a strong test story around the pure re-fire predicate. The anti-loop invariant (head-SHA gate) is correct and well-covered, the persistence migration is careful (fresh emittedAt at upgrade, empty legacy array for graceful downgrade), and the settings-default flip preserves explicit user choices. One behavioral regression keeps this from a clean approve.
Critical Issues
None.
Security Review
Strengths:
- No new network/API scopes — re-fire reuses the existing polled
PRStatus/hook state. - Anti-loop guarantee preserved:
shouldReFireStalledChangesRequestedrequiresmeta.headShaAtEmit == currentStatus.headSha, so an agent push (which advances the head SHA) kills the re-fire. Verified bydoesNotReFireWhenAgentPushedNewCommit. Anilbaseline conservatively refuses to re-fire (doesNotReFireWhenBaselineShaNil). - Review sessions are skipped (
session.kind != .review), matchingAutoRespondCoordinator's policy — Crow won't commit on behalf of a reviewer.
Concerns:
None.
Code Quality
Yellow — re-fire pass ignores the respondToChangesRequested toggle, spamming notifications for opted-out users
reFireStalledChangesRequested (Sources/Crow/App/IssueTracker.swift) is appended unconditionally to the transition list and is never gated on the auto-respond setting:
emittedTransitionMetais populated for every.changesRequestedtransition inapplyPRStatusesregardless of the toggle (emission happens in the tracker; the toggle is only consulted downstream inAutoRespondCoordinator.handle). So opted-out users still accumulate meta entries.- The PR deliberately preserves the sticky "off" choice (
decodeIfPresent ... ?? true,decoderPreservesExplicitFalseChoice). But for such a user, a PR sitting inCHANGES_REQUESTEDwith an idle managed terminal at.agentLaunchedand an unchanged head SHA will re-emit a synthetic transition every quiet window (~10 min). - Each synthetic transition flows through
onPRStatusTransitions(AppDelegate.swift:778), which callsnotifyPRTransitionfor every transition beforeAutoRespondCoordinator.handle.postSystemNotificationbuilds a unique identifier with a fresh timestamp (NotificationManager.swift:324), so nothing coalesces — a brand-new "Changes Requested" macOS notification fires every ~10 minutes, indefinitely, until the SHA changes or the PR leaves the bucket. - Meanwhile
AutoRespondCoordinator.handleskips the actual dispatch (toggle off), so the re-fire produces zero useful action — it's pure notification noise that defeats the user's opt-out.
Fix: gate the re-fire on the auto-respond setting (e.g. wire a respondToChangesRequestedProvider: () -> Bool closure into IssueTracker, mirroring autoMergeWatcherEnabledProvider et al., and early-return from reFireStalledChangesRequested when it's off). That also keeps the re-fire pass from doing work for users who will never be auto-prompted. A focused test asserting "no synthetic transition when the toggle is off" would lock it in.
Code Quality (Green — consider)
- The full
reFireStalledChangesRequestedwiring (toggle interaction, notification fan-out,emittedAtbump on re-fire) is exercised only at the pure-predicate level. An integration-style test over the map-walking method would catch the gating regression above and future wiring drift. now.timeIntervalSince(meta.emittedAt)is computed twice (predicate + NSLogelapsed); negligible, fine to leave.
Test / Build Notes
CrowCore(275 tests) andCrowPersistence(30 tests) suites pass locally viaswift test --arch arm64.- The new
IssueTrackerStalledRefireTestslive in the mainCrowtarget, which cannot be built in this review environment — it requires the vendoredFrameworks/GhosttyKit.xcframework, which is absent from the checkout (error: local binary target 'GhosttyKit' ... does not contain a binary artifact). The predicate logic was verified by reading; CI should confirm the run.
Summary Table
| Color | Meaning | Verdict effect |
|---|---|---|
| Red | Must fix | Request changes |
| Yellow | Should fix | Request changes |
| Green | Consider | Approve allowed |
Recommendation: Request Changes — driven by [0 Red, 1 Yellow, 2 Green] findings. The re-fire pass should respect the respondToChangesRequested toggle so it doesn't re-notify (with no corresponding action) users who explicitly opted out.
Without this gate a user who explicitly opted out of auto-respond would
still see a fresh "Changes Requested" macOS notification every quiet
window — `onPRStatusTransitions` calls `notifyPRTransition`
unconditionally and only the actual prompt dispatch is gated downstream
in `AutoRespondCoordinator.handle`, so the synthetic re-fire produced
pure notification noise for zero useful action.
Adds `respondToChangesRequestedProvider: () -> Bool` to `IssueTracker`
mirroring `autoMergeWatcherEnabledProvider` et al., early-returns from
`reFireStalledChangesRequested` when the toggle is off, and wires the
provider from `AppDelegate`.
Also adds an integration test suite ("IssueTracker stalled re-fire
wiring") that drives the map-walking method directly so future wiring
drift is caught: toggle-off suppresses + does not bump emittedAt, toggle-
on emits + bumps, review sessions are skipped, agent activity gate, and
unlaunched/missing-terminal gates. The pure-predicate suite remains as
the unit-level coverage.
Addresses review feedback on PR #507.
🐦⬛ Generated with Claude Code, orchestrated by Crow
Co-Authored-By: Claude <noreply@anthropic.com>
Crow-Session: C69B361D-F6EE-49FC-9F5B-E7A3928F5F1E
dhilgaertner
left a comment
There was a problem hiding this comment.
Code & Security Review
Solid, well-documented PR. The anti-loop head-SHA invariant is the right core design and is well tested, the persistence migration is careful, and the notification-noise gate (respondToChangesRequestedProvider) is a thoughtful catch. One correctness gap blocks approval.
Security Review
Strengths:
- No new network/API scopes, secrets, or shell-injection surface. Re-fire reuses the existing
AutoRespondPromptstext andTerminalRouter.sendpath. - The review-session gate (
session.kind != .review, IssueTracker.swift:1679) correctly mirrorsAutoRespondCoordinator.shouldSkipReviewSession, so Crow still never commits on behalf of a reviewer. - The opt-out gate at
IssueTracker.swift:1668prevents notification spam for users who disabled auto-respond. Good defensive call.
Code Quality
Yellow — re-fire can fire on merged/closed PRs (no open-state guard). shouldReFireStalledChangesRequested (Sources/Crow/App/IssueTracker.swift:1740-1752) gates only on reviewStatus == .changesRequested. It never checks that the PR is still open. The standing-state re-fire reads previousPRStatus[sid] directly, which the existing CROW-456 emit path never did — the old path only fired on transition edges, so a PR that merged/closed while in CHANGES_REQUESTED produced no edge and no fire. The new path reintroduces that risk:
allKnownPRsincludesstalePRs(merged/closed states fetched for tracked sessions,IssueTracker.swift:470-472), sobuildPRStatusruns for them. A merged or closed PR commonly retainsreviewDecision == CHANGES_REQUESTED(or thelatestReviewStatesfallback atIssueTracker.swift:2133-2139keeps it), sonewStatus.reviewStatus == .changesRequested.- The per-session re-arm cleanup (
IssueTracker.swift:1534) only drops the meta entry whenreviewStatustransitions away from.changesRequested— which never happens here — so the entry persists andpreviousPRStatus[sid]stays in the bucket. - Merging/closing does not change the PR head SHA, so the anti-loop SHA guard (condition 3) still passes.
Result: for a PR that was in CHANGES_REQUESTED and then merged or (more commonly) closed-unmerged, with an idle agent and a still-launched managed terminal, the re-fire emits a synthetic transition every quiet window indefinitely, prompting the agent to "address review feedback, commit, and push" to a dead PR. PRStatus exposes isMerged (mergeable == .merged) but has no closed-unmerged signal, so a complete fix likely means either (a) dropping the changesRequested meta when newStatus.isMerged and gating re-fire on the PR still being in the live open-viewer set, or (b) adding an open/closed flag to PRStatus. There is no test covering a merged or closed PR in IssueTrackerStalledRefireTests — doesNotReFireWhenPRLeftChangesRequestedBucket only covers .approved/.reviewRequired/.unknown.
Consider (Green — non-blocking)
- Default-flip reach (
AppConfig.swift:255,264).AutoRespondSettingsencodes bothBools unconditionally (no customencode(to:)), so any user with a previously-saved config hasrespondToChangesRequested: falsewritten explicitly and keeps it off viadecodeIfPresent. The flip therefore only reaches genuinely fresh configs — existing users who never touched the toggle (the "footgun" cohort the PR describes) still won't get auto-refine until they enable it. Confirm that matches intent. - Downgrade re-fire flood.
persistTrackerStatewritesemittedTransitionKeys: []always (IssueTracker.swift:1638). On a downgrade to a pre-CROW-505 build the old reader sees zero dedup keys and re-fires every currently-tracked transition once. Documented as "graceful downgrade"; just flagging the one-time prompt burst as a known cost. - Dictionary-mutation-during-iteration in
reFireStalledChangesRequested(IssueTracker.swift:1671+1709) is safe here thanks to copy-on-write value semantics (iteration proceeds over the pre-mutation snapshot), but it's subtle — a brief comment would help future readers.
Summary Table
| Color | Meaning | Verdict effect |
|---|---|---|
| Red | Must fix | Request changes |
| Yellow | Should fix | Request changes |
| Green | Consider | Approve allowed |
Recommendation: Request Changes — driven by [0 Red, 1 Yellow, 3 Green] findings. The merged/closed-PR re-fire is the one blocker; the rest are confirmations and minor notes.
Note: I reviewed by static analysis and reasoning; I did not independently run the Swift suite (the PR reports 225 tests green).
Reviewer caught a correctness gap on PR #507: the standing-state re-fire pass reads `previousPRStatus[sid]` directly rather than firing on transition edges, and a PR that merged or closed-unmerged while in CHANGES_REQUESTED keeps `reviewDecision` unchanged. With the agent idle, head SHA unchanged (closing doesn't move HEAD), and the quiet window elapsed, re-fire would prompt the agent to "address review feedback" on a dead PR every quiet window indefinitely. Adds `PRStatus.isOpen: Bool` (default `true` for backward decode), populated from `pr.state == "OPEN"` in `buildPRStatus`. Gates `shouldReFireStalledChangesRequested` on `currentStatus.isOpen`. Also extends the `applyPRStatuses` re-arm cleanup to drop changesRequested meta entries on the open→closed edge — defense in depth so the map can't accumulate dead entries that the predicate has to repeatedly reject. Adds two new predicate tests for merged and closed-unmerged PRs covering both `mergeable == .merged` and `mergeable == .unknown`. Also addresses the non-blocking notes from the same review: - Adds a docstring on `reFireStalledChangesRequested` explaining that the in-loop mutation of `emittedTransitionMeta` is safe thanks to Swift dictionary copy-on-write iteration semantics. The other two Green notes are confirmations and need no code change: the default-flip's reach (encoder writes the field unconditionally, so the new `true` default only lands on genuinely fresh configs — matches CROW-505's stated intent), and the documented one-time downgrade prompt burst. 🐦⬛ Generated with Claude Code, orchestrated by Crow Co-Authored-By: Claude <noreply@anthropic.com> Crow-Session: C69B361D-F6EE-49FC-9F5B-E7A3928F5F1E
dhilgaertner
left a comment
There was a problem hiding this comment.
Code & Security Review
Solid, well-documented change. The anti-loop head-SHA invariant is correctly preserved (previousPRStatus[sid] is refreshed every poll at IssueTracker.swift:1571 before reFireStalledChangesRequested reads it), the same-poll double-fire is avoided (fresh meta gets emittedAt = now, elapsed 0 < quietWindow), the COW iteration-during-mutation reasoning is sound, and the persistence migration + isOpen decode default are backward-safe. Model-level tests (AutoRespondSettingsDefaultsTests, CrowPersistence round-trip) pass locally. The full app suite couldn't run in the review env (missing Frameworks/GhosttyKit.xcframework binary artifact) — not a PR issue.
Two concerns below force request-changes; both are about behavior/intent, not crashes.
Security Review
Strengths:
- No new network/API scopes; re-fire reuses the existing prompt-injection path.
- Review sessions are correctly excluded from re-fire (
reFireStalledChangesRequestedcheckssession.kind != .review), mirroringAutoRespondCoordinator.shouldSkipReviewSession— Crow never commits on behalf of a reviewer. - Opt-out gate (
respondToChangesRequestedProvider) prevents notification noise for users who disabled the toggle.
Concerns: none security-specific.
Code Quality
🟡 Unbounded re-fire when the agent legitimately replies instead of pushing (Sources/Crow/App/IssueTracker.swift — reFireStalledChangesRequested / shouldReFireStalledChangesRequested)
The anti-loop guarantee keys entirely on head-SHA advancement. But the auto-respond prompt itself explicitly instructs the agent to not always push: "If a comment is unclear or you disagree, leave a reply explaining your reasoning instead of changing the code" (AutoRespondCoordinator.swift:156). When the agent does exactly that — posts a reply, doesn't push — the head SHA is unchanged, the agent goes idle, and there is no re-fire cap. So every quiet window (10 min) the identical prompt is re-injected and a fresh notifyPRTransition macOS notification fires, indefinitely, until a human/reviewer rotates the PR out of CHANGES_REQUESTED.
Worst case is external-facing: the agent re-runs every 10 min and may post duplicate reply comments onto the human reviewer's PR — repeated noise on someone else's PR, plus repeated Claude invocations. The PR description lists "agent disagreed with a finding but didn't reply" as a stall to recover, but the head-SHA gate can't distinguish that from "agent disagreed and did reply," which is a legitimate terminal state.
Suggest bounding it: a per-dedup-key re-fire counter (stop after N re-fires with no SHA change), and/or suppressing the macOS notification on re-fires (the prompt injection is the useful part; a notification every 10 min is just noise).
🟡 The default-flip is largely a no-op for the exact users it targets (Packages/CrowCore/Sources/CrowCore/Models/AppConfig.swift:255,264)
AutoRespondSettings.respondToChangesRequested now defaults true, framed as fixing the footgun where "users assumed Crow refined but the toggle was silently off." But that footgun population won't be helped:
AppConfig.init(from:)decodesautoRespondviadecodeIfPresent(...) ?? AutoRespondSettings()(AppConfig.swift:127), andAutoRespondSettingshas no customencode(to:), so any config saved by a recent build hasrespondToChangesRequested: falsewritten explicitly.AutoRespondSettings.init(from:)then reads that persistedfalse→ stays off.
So the new true default only reaches brand-new configs and configs predating the autoRespond key entirely. Existing active users who ran a recent version and never touched the toggle — precisely the people who "assumed Crow refined" — keep it off. The decoder genuinely can't distinguish "explicit opt-out" from "untouched old default," so a blind migration would also clobber real opt-outs — but the description should at least be corrected to reflect the limited reach, or an intentional one-time migration considered. The code is safe; the gap is between stated intent and effect.
🟢 (consider) Stale previousPRStatus for de-monitored sessions — re-fire reads previousPRStatus[sid] for every meta entry, but that map is only refreshed for sessions returned by this poll's viewer query. If a PR drops out of the monitored set while a meta entry lingers, re-fire could evaluate against stale status. Low probability; consider scoping re-fire to sessions actually refreshed this cycle.
🟢 (consider) The NSLog re-fire line is nicely greppable; consider also logging the re-fire count once a cap exists, for tuning the quiet window.
Summary Table
| Color | Meaning | Verdict effect |
|---|---|---|
| Red | Must fix | Request changes |
| Yellow | Should fix | Request changes |
| Green | Consider | Approve allowed |
Recommendation: Request Changes — driven by [0 Red, 2 Yellow, 2 Green] findings. Both Yellows are bounded behavioral fixes; the architecture and correctness invariants are otherwise sound.
When the agent legitimately replies on the PR instead of pushing — a
behavior the auto-respond prompt explicitly invites ("If a comment is
unclear or you disagree, leave a reply…") — the head SHA stays
unchanged and the agent goes idle. The previous re-fire pass had no
upper bound, so the identical prompt would re-inject every quiet
window forever, potentially driving duplicate reply comments on the
reviewer's PR.
Adds `EmittedTransitionMeta.reFireCount: Int` (backward-decode `?? 0`)
and `IssueTracker.maxStalledRefires: Int = 1`. The pure predicate's
new condition (5) refuses re-fire once `reFireCount >= maxRefires`;
each successful re-fire bumps the count. Edge events (review id
rotates, author pushes, bucket exits, PR closes) clear the entry and
naturally reset the counter via the existing re-arm path.
Also tags synthetic transitions with `PRStatusTransition.isReFire =
true`. `AppDelegate.onPRStatusTransitions` skips `notifyPRTransition`
for re-fires, so a stalled review doesn't generate a fresh macOS
notification every quiet window — the re-prompt itself (still routed
through `AutoRespondCoordinator.handle`) is the useful part; the
notification was pure noise.
The default-flip Yellow from the same review is a description-only
concern (encoder writes the field unconditionally → existing users
who never touched the toggle keep `false`; the new `true` default
only lands on fresh configs). PR description updated to reflect
the limited reach honestly. A one-time opt-in migration prompt is a
reasonable follow-up but is out of scope.
New tests:
- `doesNotReFireOncePerEmissionCapReached` + `reFiresWhileUnderCap`
(pure predicate, cap semantics)
- `secondReFireOnSameMetaIsSuppressedByCap` (integration over the
map-walking pass — exercises the bump + re-evaluation)
- `toggleOnPermitsSyntheticTransitionAndBumpsEmittedAt` extended to
assert `isReFire == true` and `reFireCount == 1`
- `emittedTransitionMetaDecodesLegacyEntryWithoutReFireCount`
(backward decode default)
- `issueTrackerStateRoundTripsThroughDisk` extended with `reFireCount`
🐦⬛ Generated with Claude Code, orchestrated by Crow
Co-Authored-By: Claude <noreply@anthropic.com>
Crow-Session: C69B361D-F6EE-49FC-9F5B-E7A3928F5F1E
dhilgaertner
left a comment
There was a problem hiding this comment.
Code & Security Review
Thorough re-fire design with strong backward-compat discipline and genuinely comprehensive tests. I verified the core invariants by reading the wiring end-to-end and running the buildable package suites.
Verified correctness
- Anti-loop (head SHA) holds:
shouldReFireStalledChangesRequestedrequiresmeta.headShaAtEmit == s.headSha; a real agent push advancespreviousPRStatus[sid].headShaand kills the re-fire.previousPRStatus[session.id]is updated (IssueTracker.swift:1582) before the re-fire pass runs (1592), so the gate reads fresh state. isOpengate is actually effective. The viewer query is OPEN-only, but a merged/closed CHANGES_REQUESTED PR is picked up by the stale-PR follow-up (collectStalePRURLs→fetchStalePRStates) and flows throughapplyPRStatuses, sobuildPRStatussetsisOpen=false, the open→closed edge drops the meta entry (1556), andpreviousPRStatusreflects the close. The predicate'ss.isOpencheck is correct defense-in-depth on top of that.- Cross-provider
state:pr.state == "OPEN"is correct for GitLab too —GitLabCodeBackend.normalizeStatemapsopened→OPENon both the monitored and stale paths. - Synthetic transition routing:
AppDelegateskipsnotifyPRTransitiononisReFirewhileAutoRespondCoordinator.handlestill dispatches the re-prompt;respondToChangesRequestedis gated consistently at both re-fire generation and dispatch. - Cap, quiet window, review-session skip, terminal-readiness/agent-idle gates all read correctly.
- Backward-compatible decoding throughout:
isOpen ?? true,reFireCount ?? 0, optionalemittedTransitionMeta, legacy-key migration with freshemittedAt. Dictionary mutation-during-iteration is safe (COW snapshot).
Security Review
Strengths:
- Re-fire is gated on the user toggle, so an opted-out user gets neither the dispatch nor notification noise.
- Review sessions are skipped — Crow never commits on behalf of a reviewer.
- No new API scopes, secrets, or injection surface; prompt text is the same proven single-line pattern.
Concerns: none blocking.
Code Quality (Green — considerations only, no changes required to merge)
- Default flip to
true(respondToChangesRequested) changes the consent posture for fresh configs — Crow now auto-types into a managed terminal by default. This is the explicit purpose of CROW-505, reach is narrow (new configs only; explicit opt-outs stay sticky;respondToFailedChecksstays off), and it's well documented. Flagging only for visibility. - Cap=1 + reply path: if the agent legitimately replied on the PR instead of pushing, the one allowed re-fire can produce a duplicate reply comment. Already acknowledged as a deliberate tradeoff in the PR body.
- Narrow stale edges (bounded to one re-fire by the cap): (a) a degraded stale-fetch cycle (
staleFetch.complete == false) the same poll a PR closes leavespreviousPRStatus.isOpenstale; (b)collectStalePRURLsonly refreshes.active/.paused/.inReviewsessions, while the re-fire pass walks the meta map without asession.statusgate — a session moving to completed/archived exactly as its PR closes could re-fire once on stale state. Both are consistent with the existing emission path and bounded; consider mirroring thesession.statusfilter in the re-fire walk as a future tidy-up.
Testing
- CrowCore (275 tests) and CrowPersistence (31 tests) pass locally, including the new
AutoRespondSettings defaultssuite and theemittedTransitionMetaround-trip / legacy-decode tests. Tests/CrowTests(the IssueTracker stalled-refire suite) couldn't be built in this review environment —Frameworks/GhosttyKit.xcframeworkcontains no binary artifact here. This is an environment limitation, not a PR issue; the test code reads correctly against the APIs and the author reports the full 236-test suite green.
Summary Table
| Color | Meaning | Verdict effect |
|---|---|---|
| Red | Must fix | Request changes |
| Yellow | Should fix | Request changes |
| Green | Consider | Approve allowed |
Recommendation: Approve — driven by [0 Red, 0 Yellow, 3 Green] findings.
…CROW-508) (#509) Closes #508 ## Summary Replaces the edge-triggered "needs refine" machinery shipped in #507 (1-day-old) with a stateless rule that derives the answer from the PR snapshot on every poll. Motivated by [shell-crm#202](radiusmethod/shell-crm#202): even with #507 deployed, that PR sat in `CHANGES_REQUESTED` for 30+ minutes without Crow re-prompting after a restart — the bookkeeping that #507 added carried its own failure modes (poll loop starvation, missing persistence, dedup map drift after restart). **The rule:** emit `.changesRequested` when ALL of: 1. PR is OPEN and `reviewDecision == CHANGES_REQUESTED` 2. `lastChangesRequestedAt` (max `submittedAt` across CHANGES_REQUESTED reviews) > `lastSubstantiveCommitAt` (max `committedDate` across commits with `parents.totalCount < 2` AND whose subject doesn't start with `Merge branch|remote-tracking|pull request`) 3. Managed terminal is idle (readiness == `.agentLaunched`, agent activity == `.idle`) 4. `AutoRespondSettings.respondToChangesRequested` is on 5. PR has been observed at least once this Crow process (first-observation skip) 6. 7 minutes have elapsed since the last dispatch for this PR (cooldown) Anti-loop is **automatic**: any real (non-merge, non-rebase) commit advances `lastSubstantiveCommitAt` past `lastChangesRequestedAt` and the rule stops firing on the next poll. No `headShaAtEmit` snapshot needed. The GitHub "Update branch" button produces a merge commit that's filtered out upstream, so it can't trick the rule. Restart-safe by construction: a fresh Crow process re-derives the truth on the next successful poll. The 7-min cooldown is in-memory only; worst case after restart is one duplicate prompt before the cooldown re-arms. ## Decisions made - **Same-PR deletion, not follow-up.** The diff is easier to reason about as one atomic swap, and reverting is a single-SHA revert. Listed all deleted symbols below. - **Cooldown of 7 min** (middle of the ticket's 5–10 min range). 7 min sits comfortably between "long enough that an agent thinking through a hard finding isn't interrupted" and "short enough that a true stall surfaces within ~3 poll cycles." - **Tree-equals-parents check skipped** (the ticket called it optional). The merge-message + parent-count filter catches every shape I observed in the wild; the tree check would add an extra API call per PR for marginal coverage. - **GraphQL query bumped `latestReviews(first: 5)` → `first: 20`** and added `commits(last: 30)` with `oid`, `messageHeadline`, `committedDate`, `parents.totalCount`. Per-PR cost is small; the stateless rule needs both timestamps every poll. - **No transitions() emission for `.changesRequested`.** The pure `PRStatus.transitions(…)` function now only emits `.checksFailing`. `.changesRequested` lives entirely in `IssueTracker.applyPRStatuses` (the stateless dispatch path). ## Symbols deleted - `PersistedIssueTrackerState` (CrowPersistence) - `StoreData.issueTrackerState` field (decode-tolerant — existing stores ignore the key, no migration needed) - `EmittedTransitionMeta` (CrowCore) - `PRStatusTransition.dedupeKey`, `.latestReviewID`, `.isReFire` fields - `PRStatus.latestReviewID` field - `PRRecord.latestReviewID` field (replaced with `lastChangesRequestedAt`, `lastSubstantiveCommitAt`) - `IssueTracker.emittedTransitionMeta` map - `IssueTracker.hydratePersistedState()`, `persistTrackerState()` - `IssueTracker.reFireStalledChangesRequested(now:)` - `IssueTracker.shouldReFireStalledChangesRequested(meta:…)` predicate - `IssueTracker.stalledRefireQuietWindow`, `maxStalledRefires` constants - `IssueTracker.parseSessionID(fromDedupKey:)`, `parseKind(fromDedupKey:)` helpers - `AppDelegate` re-fire notification skip block ## New code - `PRStatus.needsRefine(status:terminalIdle:)` — pure rule in CrowCore. ~12 lines. - `IssueTracker.seenPRs: Set<String>` (PR URL keyed; first-observation skip) - `IssueTracker.lastRefineDispatchAt: [String: Date]` (PR URL keyed; cooldown) - `IssueTracker.needsRefineCooldown` constant (7 min) - `IssueTracker.isManagedTerminalIdle(sessionID:)` helper - `GitHubCodeBackend.isMergeCommitMessage(_:)` helper (merge-prefix detection, shared with tests) - `PRRecord.lastChangesRequestedAt`, `.lastSubstantiveCommitAt` fields - `PRStatus.lastChangesRequestedAt`, `.lastSubstantiveCommitAt` fields ## Acceptance tests All four ticket scenarios are automated: | # | Scenario | Test | |---|----------|------| | 1 | Round-N stall fires on idle poll (after first-observation skip) | `IssueTrackerNeedsRefineTests.roundNStallFiresOnSecondPoll` | | 2 | Merge-from-main does NOT reset the rule | `IssueTrackerNeedsRefineTests.mergeFromMainDoesNotFlipTheRule` + `BackendsTests.testParseMonitoredPRsLastSubstantiveCommitExcludesMergesAndRebases` | | 3 | Real fix flips the rule off | `IssueTrackerNeedsRefineTests.realFixStopsTheRule` + `PRStatusNeedsRefineTests.doesNotFireWhenCommitIsNewerThanReview` | | 4 | Restart-safe: skip → fire → cooldown blocks → cooldown elapses → fire again | `IssueTrackerNeedsRefineTests.restartFreshTrackerSkipsThenFiresThenCoolsDown` | Plus defense-in-depth tests for the busy-agent, pre-launch-terminal, opt-out, missing-timestamp, closed-PR, and merge-message-shape gates. Total new/rewritten: 23 tests across 3 suites. ## Deliberate-break verifications Each guard is pinned by a test that flips loudly when the guard is removed: - **Remove cooldown** → `restartFreshTrackerSkipsThenFiresThenCoolsDown` poll 3 count would go from 1 to 2. - **Remove first-observation-skip** → `roundNStallFiresOnSecondPoll` poll 1 count would go from 0 to 1. - **Remove merge-commit filter** → `testParseMonitoredPRsLastSubstantiveCommitExcludesMergesAndRebases` would observe `lastSubstantiveCommitAt == 2026-06-07` instead of `2026-06-01`. I read each assertion to confirm the inversion would fail loudly rather than mechanically running them (the assertions themselves are the forcing function). ## Test plan - [x] `swift test --package-path Packages/CrowCore` — 273 tests pass (includes new `PRStatus.needsRefine (CROW-508)` suite) - [x] `swift test --package-path Packages/CrowPersistence` — 30 tests pass (includes new "decoder ignores legacy issueTrackerState blob" coverage) - [x] `swift test --package-path Packages/CrowProvider` — 49 tests pass (includes new commit-parsing + merge-filter tests) - [x] `swift test` (root) — 219 tests pass (includes new `IssueTracker stateless needsRefine (CROW-508)` suite) - [x] `swift build --arch arm64` clean - [ ] Manual smoke: attach a `CHANGES_REQUESTED` PR with no agent push since the review → Crow re-prompts on the second poll, then waits 7 min before re-prompting. - [ ] Manual cap check: rebase via "Update branch" → `lastSubstantiveCommitAt` does NOT advance, Crow keeps prompting on the next poll. - [ ] Manual anti-loop check: agent pushes a real fix → `lastSubstantiveCommitAt` advances, Crow stops prompting on the next poll. --------- Co-authored-by: Claude <noreply@anthropic.com>
Closes #505
Summary
When Crow tracks a PR sitting in
CHANGES_REQUESTEDand the agent goes idle after one auto-refine round (busy at dispatch, transient error, agent disagreed with a finding but didn't reply, ran out of context), Crow never re-prompts. The CROW-456 dedup key(sessionID, changesRequested, latestReviewID)was deliberately keyed this way so the agent's own response push doesn't loop the prompt — but it overshoots, also wedging genuine stalls.This PR adds a bounded re-fire pass that re-emits the same review's transition when all conditions hold:
CHANGES_REQUESTEDand still open (isOpen).agentLaunchedand hook-drivenAgentActivityState == .idlemaxStalledRefires = 1re-fire per emission — caps the re-prompt count so an agent that legitimately replies instead of pushing doesn't drive infinite re-prompts (or duplicate reply comments)Plus flips
AutoRespondSettings.respondToChangesRequesteddefault totruefor fresh configs (see "Default-flip reach" below).Quiet-window rationale
10 min = 10 poll cycles (
IssueTracker.pollInterval = 60s). Claude Code agents can spend that long thinking on a hard finding before responding — short windows would interrupt productive work. Constant:IssueTracker.stalledRefireQuietWindow, easy to tune.Anti-loop invariants (correctness properties)
EmittedTransitionMeta.headShaAtEmitrecords the head SHA at dispatch. When the agent pushes a real fix,previousPRStatus[sid].headShaadvances; the equality gate inshouldReFireStalledChangesRequestedfails; the entry stays in the map; no re-fire. Verified bydoesNotReFireWhenAgentPushedNewCommit.EmittedTransitionMeta.reFireCount(capped at 1) prevents unbounded re-prompts when the agent posts a reply on the PR rather than pushing — a legitimate terminal state encouraged by the auto-respond prompt itself ("If a comment is unclear or you disagree, leave a reply…"). Verified bydoesNotReFireOncePerEmissionCapReached+secondReFireOnSameMetaIsSuppressedByCap.PRStatusTransition.isReFire = true.AppDelegate.onPRStatusTransitionsskipsnotifyPRTransitionfor these so we don't post a fresh macOS notification for the same review every quiet window;AutoRespondCoordinator.handlestill dispatches the re-prompt (the useful part).PRStatus.isOpen(new field, defaulttruefor backward decode, set frompr.state == "OPEN") blocks re-fire on merged or closed-unmerged PRs. GitHub doesn't resetreviewDecisionon close, so without this gate a dead PR would re-prompt the agent to push to nothing. Verified bydoesNotReFireOnMergedPR+doesNotReFireOnClosedUnmergedPR.session.kind == .review) are skipped, matching the existingAutoRespondCoordinatorpolicy — Crow never commits on behalf of a reviewer.respondToChangesRequestedProvider) suppresses re-fire entirely when the user has opted out, so no synthetic transitions reach the notification fan-out for opted-out users.Settings default flip — limited reach (honest framing)
AutoRespondSettings.respondToChangesRequestedwasfalse(a footgun: users assumed Crow refined but the toggle was silently off). Now defaults totrue. However, the reach is genuinely narrow:AutoRespondSettingshas no customencode(to:), so any user with a previously-saved config hasrespondToChangesRequested: falsewritten explicitly, and thedecodeIfPresent ... ?? truepath returns the persisted value. So the newtruedefault only lands on:store.jsonyet)autoRespondkey entirely (very old upgrades)Existing active users who never touched the toggle keep
false. Flipping their value blindly would also clobber real opt-outs, which the decoder can't distinguish from "untouched old default" — so we don't. A one-time opt-in migration prompt is a reasonable follow-up but is out of scope for this PR.Covered by
AutoRespondSettingsDefaultsTests.Persistence migration
PersistedIssueTrackerStategained an optionalemittedTransitionMeta: [String: EmittedTransitionMeta]?field.EmittedTransitionMetaitself now hasreFireCount: Int(default 0, backward-decode?? 0). Pre-CROW-505 stores hydrate from the legacyemittedTransitionKeysarray: each key gets meta withemittedAt = now(fresh quiet-window clock — no re-fire flood at upgrade),headShaAtEmitsourced frompreviousPRStatus, andreFireCount = 0. Pre-cap stores that already have the map but lackreFireCountdecode it as 0. New writes populate the map and leaveemittedTransitionKeysempty so older builds reading the array gracefully fall back to "no keys" (a one-time downgrade prompt burst, documented).Tests (all listed in the ticket acceptance, plus expansions for each review round)
reFiresWhenAllConditionsMetdoesNotReFireWhenAgentPushedNewCommitroundTwoFromReviewerProducesDistinctDedupKey+ existingPRStatusTransitionTestsdoesNotReFireWhileAgentWorking/Waiting/DonedoesNotReFireBeforeQuietWindowElapsedIssueTrackerStalledRefireWiringTests.toggleOffSuppresses…,toggleOnPermits…, review-session skip, agent-activity, readiness, missing-terminaldoesNotReFireOnMergedPR,doesNotReFireOnClosedUnmergedPRdoesNotReFireOncePerEmissionCapReached,reFiresWhileUnderCap,secondReFireOnSameMetaIsSuppressedByCap,toggleOnPermits…assertsisReFire == true+reFireCount == 1issueTrackerStateRoundTripsThroughDisk(extended withreFireCount) +emittedTransitionMetaDecodesLegacyEntryWithoutReFireCountAutoRespondSettingsDefaultsTestsFull suite (236 tests, 26 suites) green via
arch -arm64 swift test --arch arm64.Session-card "Resume refine" button — deferred
The ticket's alternative path. The auto re-fire (now bounded + idempotent) directly answers the user's complaint ("if crow is tracking a pr, it should refine when it needs to") and
dispatchManual(.addressChanges, ...)already exists for any future button. Adding a SwiftUI surface now means another set of trigger conditions to keep in sync with the auto path. Worth a follow-up if telemetry shows users wanting a faster manual kick.Test plan
issueTrackerStateRoundTripsThroughDisk)PRStatusTransitionTests,IssueTrackerDedupTests,IssueTrackerCompletionTests, etc. still pass — no regressionsswift build --arch arm64cleanCHANGES_REQUESTEDPR; kill the agent before it pushes; wait ~10 min; confirm Console showsauto-refine re-fired (stalled review) ... count=1/1and the prompt arrives in the terminal once🤖 Generated with Claude Code