Skip to content

fix(watch,review): role guard for start_watch_role + 406 diff-too-large fallback#216

Closed
ProtocolWarden wants to merge 3 commits into
mainfrom
oc-watchdog/20260601-2155-fix-watch-role-guard
Closed

fix(watch,review): role guard for start_watch_role + 406 diff-too-large fallback#216
ProtocolWarden wants to merge 3 commits into
mainfrom
oc-watchdog/20260601-2155-fix-watch-role-guard

Conversation

@ProtocolWarden
Copy link
Copy Markdown
Owner

Summary

Three fixes from watchdog cycle 9+ (2026-06-01/02):

  • edf5cb5 fix(watch): reject unknown roles in start_watch_role to prevent crash loops — adds validation guard; unknown roles now emit diagnostic and return 1 instead of restarting every 30s (root cause: watch-all invoked with role="all", ran crash loop for 5+ hours)
  • 4c74e1a fix(observer): resolve custodian audit violations from PR Export validation failure metrics for alerting #213 merge (C1/C36/C41/C43/T2/D6/D10/RUFF in observer module)
  • 44fefee fix(review): handle HTTP 406 diff-too-large in get_pr_diff — fall back to file-list summary so review watcher can still process oversized PRs

Note

This branch diverged from main before PR #214 merged; observer fixes in 4c74e1a overlap with PR #214's custodian fix. Net-new contributions vs main: role guard (scripts/operations-center.sh) and 406 fallback (github_pr.py + pr_review_watcher).

Test plan

  • tests/unit/er000_phase0_golden/: 15/15 passed (confirmed pre-push)
  • tests/unit/observer/: 181/181 passed (confirmed pre-push)
  • Watch-all status: no role="all" crash loop

🤖 Generated with Claude Code

ProtocolWarden and others added 3 commits June 1, 2026 21:57
… loops

`start_watch_role` had no validation for the role argument; any unknown
role (e.g. "all") fell through to the board_worker.main else-branch with
--role <unknown>, which argparse rejected with exit code 2. The bash
wrapper then restarted the process every 30s indefinitely, producing a
silent crash loop with no operational watcher. Added an explicit guard
that emits a diagnostic and returns 1 immediately. Fixed a live crash
loop (PID 590871, role="all", 333+ cycles, 5+ hours) that prior cycles
misclassified as historical.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix all custodian violations in the observer/alert module introduced
when PR #213 merged without a pre-commit audit pass:
- alert_channels: remove async from notify() (no await inside); fix
  G004 f-string logging; update message to include condition name
- alert_validation/exporters/structured_logging: fix G004 f-string
  logging; add encoding= to open() calls; add ensure_ascii=False to
  json.dump(s) calls; fix DTZ007 naive strptime → strptime with +0000%z
- metrics: add MetricUnit.COUNT usage so D6 (never-constructed) resolves
- docs/README.md: link two orphan coverage-gating docs (DC7)
- tests: convert async def notify tests to sync (matching non-async impl);
  fix test_health_check_degraded_error_rate threshold (3% → not CRITICAL)
- tests/test_alert_validation/test_validation_metrics_exporter: add
  missing assert statements for T2 findings

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…file list

Root cause: GitHub returns 406 when PR diff exceeds the API size limit.
get_pr_diff() silently swallowed HTTPStatusError and returned "", causing
the review watcher to log "empty diff" and skip the PR permanently.

Fix: Detect 406 explicitly in get_pr_diff() and fall back to the files API
(_pr_diff_too_large_summary), returning a file-list summary prefixed with
[DIFF_TOO_LARGE]. The review watcher now logs a distinct warning and
proceeds with the partial diff rather than skipping the PR entirely.

Incidental: ruff-format alignment fixes inherited in both touched files.

Affected repo: OperationsCenter
Gate/check fixed: PR review watcher now processes large-diff PRs (#214)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ProtocolWarden
Copy link
Copy Markdown
Owner Author

Closing in favor of PR #217 — clean cherry-pick of net-new changes only (role guard + 406 fallback) onto current main, without the overlapping observer fixes from commit 4c74e1a which were already applied by PR #214.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant