fix: widen /codex:review wait threshold to ~5 files / 200 lines#315
fix: widen /codex:review wait threshold to ~5 files / 200 lines#315robertbpugh wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 47d000052b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| - Treat untracked files or directories as reviewable work even when `git diff --shortstat` is empty. | ||
| - Only conclude there is nothing to review when the relevant working-tree status is empty or the explicit branch diff is empty. | ||
| - Recommend waiting only when the review is clearly tiny, roughly 1-2 files total and no sign of a broader directory-sized change. | ||
| - Recommend waiting when the review is small, roughly up to 5 files or 200 lines of diff, with no sign of a broader directory-sized change. Codex finishes within Claude's prompt cache window (5 minutes) for diffs of this size, so wait in the foreground and the result feeds the next step. |
There was a problem hiding this comment.
Gate foreground recommendation on both size thresholds
Using or here makes large diffs qualify as “wait” whenever just one dimension is small (for example, a 1-file/1500-line refactor or a 9-file/80-line sweep), which contradicts the stated “small review” intent and can push long reviews into foreground mode where the user is blocked and cache-window assumptions no longer hold. The recommendation should require both limits (or an equivalent combined smallness check), not either one independently.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Updates the /codex:review command’s execution-mode guidance so that small reviews more often run in the foreground (to avoid unnecessary wait/background prompting) and larger/unclear reviews continue to be recommended for background execution.
Changes:
- Widened the “recommend wait” threshold from ~1–2 files to ~5 files or ~200 lines.
- Added rationale tying the foreground recommendation to staying within the prompt-cache window for small diffs.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@copilot apply changes based on the comments in this thread |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8b358db1c1
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| - Treat untracked files or directories as reviewable work even when `git diff --shortstat` is empty. | ||
| - Only conclude there is nothing to review when the relevant working-tree status is empty or the explicit branch diff is empty. | ||
| - Recommend waiting only when the review is clearly tiny, roughly 1-2 files total and no sign of a broader directory-sized change. | ||
| - Recommend waiting when the review is small, roughly up to 5 files or about 200 insertions+deletions as reported by `git diff --shortstat`, with no sign of a broader directory-sized change. Codex finishes within Claude's prompt cache window (5 minutes) for changes of this size, so wait in the foreground and the result feeds the next step. |
There was a problem hiding this comment.
Require both thresholds before recommending foreground wait
Change this condition to require both the file-count and line-count limits, not either one. As written (up to 5 files or about 200 insertions+deletions), very large reviews still get a foreground recommendation whenever one dimension is small (for example, a 1-file/1500-line rewrite or an 8-file/80-line sweep), which contradicts the “small review” gate and can block users on long-running reviews.
Useful? React with 👍 / 👎.
…ai#303, openai#315) - Drop non-JSONL garbage on app-server stdout (openai#311) - Fix broker shutdown hang on unresponsive socket (openai#300) - Treat jobs as inactive when their PID is dead (openai#297) - Fix stale shared broker auth after account switch (openai#303) - Widen /codex:review wait threshold to ~5 files / 200 lines (openai#315) - Update test fixtures to include pid for PID-alive checks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…penai#255, openai#165, openai#253, openai#209, openai#239, openai#307, openai#315) - Raise adversarial review inline diff limits to 50 files / 1MB (openai#327) - Add wall-clock timeouts to JSON-RPC requests (openai#302) - Use codex-namespaced clientInfo.name (openai#200, openai#255) - Handle EAGAIN in hook scripts for non-blocking stdin (openai#165) - Replace Node.js SessionStart hook with faster bash script (openai#253) - Add --fast flag for service_tier support (openai#209) - Set timeout: 600000 on foreground review/adversarial-review flows (openai#239) - Rename gpt-5-4-prompting skill to codex-prompting (openai#307) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Widen the wait/background threshold in
/codex:reviewfrom "1-2 files" to "up to 5 files or 200 lines of diff."Motivation
The current threshold trips on most real PRs. Even a 3-file fix asks "wait or background?", and the answer is "wait." Codex finishes inside the prompt-cache window.
Concrete case: a 3-file / 43-line wp-calypso review today hit the prompt. Codex finished in about 2 minutes. The 5-minute cache window was nowhere near expiring. The dispatcher prompt was pure round-trip cost.
5 files / 200 lines covers the small-PR case without losing the background recommendation for refactors, where the cache miss is already priced in.
Test plan
/codex:reviewon a 3-file / 50-line diff recommends "Wait for results"/codex:reviewon a 10-file / 500-line diff recommends "Run in background"--wait/--backgroundflags still bypass the dispatcher