Problem
We have a single reviewer pipeline: a manifesto-aware critic that enforces our internal style and architecture rules. It catches what we've codified — but it misses bug classes we haven't yet written rules for: generic correctness slips, common security anti-patterns, and language-idiom drift.
Concrete example: dogfood week shipped 4 iteration-probe bugs and a critic↔SDK field mismatch that the manifesto critic green-lit because no rule covered those shapes. A second reviewer specialized in correctness (CodeRabbit) would likely have flagged at least the SDK field mismatch as a type/contract issue.
CTO asked specifically about CodeRabbit. It's free for OSS, so the trial cost is config + 4 weeks of attention.
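Acceptance criteria
- CodeRabbit is installed on hadamrd/forge-loop under the free OSS plan; the install is visible in repo Settings → Integrations.
- .coderabbit.yaml exists at repo root and: (a) scopes review path filters to src/forge_loop/**, (b) excludes tests/** from review, (c) disables style/lint nitpicks that overlap with ruff (reviews.tools.ruff left on, generic style comments off), (d) sets reviews.profile: assertive or equivalent so correctness/security pattern checks are prioritized. Note that CodeRabbit documents assertive as the more feedback-heavy profile, so check (d) against (c) during the noise smoke test.
- docs/CONTRIBUTING.md, or CONTRIBUTING.md if that's where it lives (investigate), gains a "Reviewers" section naming the manifesto-aware critic as primary stylistic/architectural reviewer and CodeRabbit as second-opinion correctness/security reviewer, with one sentence on how to resolve conflicts (manifesto wins on style; CodeRabbit wins on correctness unless the critic has a documented rule).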
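To make the intended review scope concrete before the config is drafted, here is a minimal Python sketch of the scoping rule: files under src/forge_loop/ are reviewed, files under tests/ are not. PATH_FILTERS and expected_to_review are hypothetical names for illustration. The leading "!" for exclusions follows the glob style shown in CodeRabbit's configuration docs, but the matching here uses Python's fnmatch, not CodeRabbit's own matcher, so treat it as a planning aid for the smoke checks rather than a statement of how CodeRabbit evaluates filters.

```python
from fnmatch import fnmatchcase

# Hypothetical filter list mirroring the intent of .coderabbit.yaml.
# Assumption: plain patterns include paths, "!"-prefixed patterns exclude them.
PATH_FILTERS = ["src/forge_loop/**", "!tests/**"]


def expected_to_review(path: str, filters: list[str] = PATH_FILTERS) -> bool:
    """Return True if a changed file falls inside the intended review scope."""
    includes = [p for p in filters if not p.startswith("!")]
    excludes = [p[1:] for p in filters if p.startswith("!")]
    # Exclusions take priority over inclusions in this sketch.
    if any(fnmatchcase(path, pattern) for pattern in excludes):
        return False
    return any(fnmatchcase(path, pattern) for pattern in includes)


if __name__ == "__main__":
    for changed in ["src/forge_loop/cli.py", "tests/test_cli.py", "README.md"]:
        print(changed, expected_to_review(changed))
```

Running it against the file list of a smoke PR gives the expectation to compare with what CodeRabbit actually comments on.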
Test matrix
- Smoke (manual): Open a throwaway PR touching one file in src/forge_loop/ with an obvious correctness bug (e.g. == vs is, or an unhandled None); confirm CodeRabbit posts a review comment within ~5 minutes. Close the PR.
- Smoke (config sanity): Open a PR touching only a file under tests/; confirm CodeRabbit does NOT post review comments (path filter works).
- Adversarial / sad path: Open a PR with a pure-style change CodeRabbit would normally nitpick (e.g. variable rename, import reorder); confirm noise is suppressed by the config, OR if it isn't, document the gap in the trial verdict.
- Docs check: Grep CONTRIBUTING.md for "CodeRabbit" and confirm the reviewer-roles section is present.
- No unit/integration tests required — this is repo configuration, not code. The "tests" are PR-level smoke checks listed above.
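For the manual smoke PR, a throwaway file along these lines would carry both kinds of "obvious correctness bug" named above. This is only a sketch: the function names are hypothetical and do not exist in forge_loop, and the fixed variants are included purely to show what a reviewer comment should point toward. The file belongs in the throwaway PR only and should never be merged.

```python
from typing import Optional


def probes_match_buggy(seen: list[str], expected: list[str]) -> bool:
    # Deliberate bug: `is` tests object identity, so two equal lists that
    # were built separately never match.
    return seen is expected


def probes_match_fixed(seen: list[str], expected: list[str]) -> bool:
    # Value comparison is what the caller actually wants.
    return seen == expected


def first_probe_name_buggy(probes: Optional[list[str]]) -> str:
    # Deliberate bug: no guard for None, so indexing raises TypeError.
    return probes[0].upper()


def first_probe_name_fixed(probes: Optional[list[str]]) -> str:
    # Handle both None and an empty list before indexing.
    if not probes:
        return ""
    return probes[0].upper()
```

If CodeRabbit stays silent on both buggy functions, that is worth recording for the trial verdict alongside the timing check.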
Out of scope
- Do NOT install CodeRabbit on any other Criteo or personal repo. This trial is hadamrd/forge-loop only.
- Do NOT change or remove the existing manifesto-aware critic. Both run in parallel for the trial.
- Do NOT migrate to a paid CodeRabbit plan or evaluate paid features.
- Do NOT auto-resolve CodeRabbit comments or wire them into the loop runner's auto-fix path. Human reads them this trial.
- Do NOT write the 4-week verdict in this PR — that's a follow-up issue.
- Do NOT add CodeRabbit-specific rules to the manifesto yet. Harvest happens at verdict time.
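File pointers
- .coderabbit.yaml: NEW. Root of repo. See https://docs.coderabbit.ai/guides/configure-coderabbit for schema.
- docs/CONTRIBUTING.md or CONTRIBUTING.md: (investigate) which path exists; add "Reviewers" subsection.
- README.md: optional one-line mention under a "Quality" or "Reviewers" badge area (investigate).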
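Because it is not yet known which CONTRIBUTING.md path exists, the docs check from the test matrix can be scripted to try both. The sketch below is a minimal version under two assumptions: the section is a Markdown heading whose text starts with "Reviewers", and the helper names (has_reviewers_section, check_repo) are hypothetical.

```python
import re
from pathlib import Path

# Both candidate locations; which one exists is still to be investigated.
CANDIDATES = ("docs/CONTRIBUTING.md", "CONTRIBUTING.md")


def has_reviewers_section(text: str) -> bool:
    """True if the text mentions CodeRabbit and has a 'Reviewers' heading."""
    mentions_coderabbit = "CodeRabbit" in text
    # Assumption: the section is a Markdown ATX heading such as "## Reviewers".
    has_heading = re.search(r"^#+\s*Reviewers\b", text, flags=re.MULTILINE) is not None
    return mentions_coderabbit and has_heading


def check_repo(root: str = ".") -> bool:
    """Check whichever candidate file exists under the repo root."""
    for relative in CANDIDATES:
        candidate = Path(root) / relative
        if candidate.is_file() and has_reviewers_section(
            candidate.read_text(encoding="utf-8")
        ):
            return True
    return False


if __name__ == "__main__":
    print("reviewers section present:", check_repo())
```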
Worker note
AC is moderately wide (config + docs + external GitHub App install + measurement setup). The GitHub App install step requires browser/UI action you cannot do from CLI — when you hit it, STOP and leave a clear handoff comment on the PR telling the operator "install CodeRabbit at https://github.com/apps/coderabbitai then re-trigger". Apply commit discipline: wip-commit after .coderabbit.yaml is drafted, again after CONTRIBUTING.md edit, then push even if the App install handoff is pending. Run the EXIT CHECKLIST before declaring done.
Original report
Why
CTO asked about CodeRabbit as a complementary reviewer. We already have a manifesto-aware critic, but a second-opinion reviewer catches a different class of bugs (generic correctness, common security patterns, language idiom drift). Free for OSS — low risk to trial.
Compounding context: today's dogfood-week catch list (4 iteration-probe bugs, critic-SDK field mismatch, cli.py bloat) is heavy on subtle correctness issues. CodeRabbit is specifically known for catching these.
What
Enable CodeRabbit on the hadamrd/forge-loop public repo as a 4-week trial. Measure signal-to-noise. Decide retain/drop.
Acceptance (original)
- CodeRabbit installed on hadamrd/forge-loop via the GitHub App (free OSS plan)
- .coderabbit.yaml in the repo configures it to focus on src/forge_loop/ (skip tests for now) and to defer to our manifesto-aware critic on style; CodeRabbit handles correctness + security patterns + idiom drift
- After 4 weeks, count: (a) issues CodeRabbit flagged that our critic missed, (b) noise (style nitpicks duplicated with ruff), (c) bug-class overlap with the manifesto
- Decision: retain (and adopt the bug classes it catches into the manifesto) or drop (it duplicates our critic too much)
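The three counts above are easy to lose track of over four weeks, so one option is to label each CodeRabbit comment as it is read and tally at the end. The following is a minimal sketch of that bookkeeping; the label strings, the TrialTally class, and the signal_ratio definition (share of comments that were catches the critic missed) are illustrative assumptions, not an agreed metric, and no retain/drop threshold is implied.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical labels a human applies to each CodeRabbit comment.
MISSED = "missed_by_critic"    # (a) flagged by CodeRabbit, missed by our critic
NOISE = "style_noise"          # (b) style nitpick duplicated with ruff
OVERLAP = "manifesto_overlap"  # (c) bug class the manifesto already covers


@dataclass(frozen=True)
class TrialTally:
    missed_by_critic: int
    style_noise: int
    manifesto_overlap: int

    @property
    def total(self) -> int:
        return self.missed_by_critic + self.style_noise + self.manifesto_overlap

    @property
    def signal_ratio(self) -> float:
        """Share of all labelled comments that were catches the critic missed."""
        return self.missed_by_critic / self.total if self.total else 0.0


def tally(labels: list[str]) -> TrialTally:
    """Count labelled comments, rejecting labels outside the three buckets."""
    counts = Counter(labels)
    unknown = set(counts) - {MISSED, NOISE, OVERLAP}
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return TrialTally(counts[MISSED], counts[NOISE], counts[OVERLAP])
```

The retain/drop decision stays a human call; the tally only keeps the inputs to it consistent.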
File pointers (original)
- .coderabbit.yaml (new; config)
- docs/CONTRIBUTING.md: note that CodeRabbit is the second reviewer
- A follow-up ticket after the trial documents the verdict
Axis citation
axis:modernization-gated. Unblocks faster bug detection on every axis by adding a second reviewer that catches what our manifesto-driven critic doesn't yet codify.