rules-test-coverage-trim-2026-05: trim rill-*.md (1308→943) + wire 7 + add 3 skill tests #33

Open
tarr1124 wants to merge 12 commits into main from feature/rules-test-coverage-trim-2026-05

@tarr1124
Contributor

Summary

  • Rules trim: .claude/rules/rill-*.md 10 files 1308→943 lines (28% reduction). No semantic changes — all technical invariants preserved (lane definitions, 5-case branching, two-channel write, Tier 3 denylist, ADR-046 D46-7 modes, PUBLIC repo guards, etc.). File-by-file commits enable bisect.
  • Test coverage: test/run-all.sh now wires 11 skill tests (distill + briefing + clip-tweet + close + focus + newsletter + page + solve + new inspect + repair + eval). Existing test-briefing/test-solve test bugs fixed. New smoke tests for /inspect, /repair, /eval.
  • Codex review fixes: valid /eval queries.yaml fixture format + set-e-safe run-all.sh failure counter (((TOTAL_FAIL++)) → TOTAL_FAIL=$((TOTAL_FAIL + 1))).

Verification

  • Pre-trim baseline run-all.sh: 10 suite PASS / 1 FAIL (/page transient bash quirk), 134/144 assertion PASS
  • Post-trim run-all.sh: 11/11 suite PASS confirmed after retry of 3 Cloudflare-522 transients. /page improved FAIL→PASS post-trim
  • Spot check: prose quality of /focus, /close, /solve, /newsletter outputs reviewed — no regression
  • Codex review v2: 1 [P1] + 1 [P2] remaining, deferred to follow-up task (tasks/fix-eval-test-comprehensiveness-2026-05/_task.md, draft) — issues are about /eval smoke test depth (stratified-sampling + EV-02 strictness), out of trim scope

Test plan

  • CI green
  • bash test/run-all.sh exit 0 in a clean clone
  • (post-merge subjective) JP-response readability check over 1-2 days

🤖 Generated with Claude Code

tarr1124 added 12 commits May 14, 2026 14:03
Add test scaffolding so test/run-all.sh exercises 11 OSS skills
(briefing, clip-tweet, close, distill, eval, focus, inspect, newsletter,
page, repair, solve) instead of /distill only. Three new tests are added
(test-eval.sh, test-inspect.sh, test-repair.sh, all smoke tests modelled
on test-distill.sh).

Test-only fixes to the existing eight tests:
- Optional-ize the `cp $REPO_DIR/{taxonomy,CLAUDE}.md` overlays.
  OSS rill does not ship taxonomy.md or CLAUDE.md at the repo root, so
  the unconditional cp aborted every test on a clean clone. The fixture
  copies under test/fixtures/ are already in place, so optional overlays
  preserve the vault-overlay intent without breaking OSS runs.
- test-briefing.sh: scalarize SC-02..SC-05 grep counts via
  `{ ... ; } | tr -d '[:space:]'` so assert_gt receives a single integer
  (the multi-line stdout otherwise triggered an `((...))` syntax error).
- test-solve.sh: raise `--max-turns` from 50 to 200, and accept `done`
  in the P4-01 task-status assertion alongside `waiting` and `open`
  (the new /solve may legitimately reach `done` within a single run).
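
The scalarization described in the test-briefing.sh bullet can be sketched as follows. This is a minimal illustration, not the code from test-briefing.sh: the fixture content, the grep patterns, and the body of `assert_gt` are all assumptions (only the helper's name appears in the commit message), and the `|| echo 0` fallback is just one common way a count ends up spanning two lines.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the assert_gt helper named in the commit message;
# the real helper's implementation is not shown in this PR.
assert_gt() {
  local label=$1 actual=$2 threshold=$3
  if (( actual > threshold )); then
    echo "PASS $label"
  else
    echo "FAIL $label"
    return 1
  fi
}

# Throwaway fixture (assumed content, for illustration only).
fixture=$(mktemp)
printf '%s\n' '## Summary' '- item one' '- item two' > "$fixture"

# Fragile: on no match grep -c prints "0" AND exits 1, so the fallback echo
# adds a second line and the captured value spans two lines.
multi=$(grep -c 'absent' "$fixture" || echo 0)

# Scalarized: group the command, neutralise its exit status, strip whitespace,
# so the caller always receives a single integer.
count=$({ grep -c '^- ' "$fixture" || true; } | tr -d '[:space:]')
zero=$({ grep -c 'absent' "$fixture" || true; } | tr -d '[:space:]')

result=$(assert_gt "bullet count" "$count" 0)
echo "$result"
rm -f "$fixture"
```

Passing the two-line `multi` value into an arithmetic `((...))` comparison is what produces the syntax error; the grouped form keeps the value numeric whether or not grep matches.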

Also add test/results/ to .gitignore (timestamped run artifacts).
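
The optional-overlay change from the first bullet of the test-only fixes can be sketched like this. The helper name `copy_if_present` and the temporary directories are hypothetical stand-ins; the actual tests may guard the `cp` differently.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical helper: copy an overlay file only when it exists, so a missing
# file no longer aborts the test under `set -e`.
copy_if_present() {
  local src=$1 dest_dir=$2
  if [[ -f "$src" ]]; then
    cp "$src" "$dest_dir/"
  fi
}

REPO_DIR=$(mktemp -d)   # stand-in for the repo root
WORK_DIR=$(mktemp -d)   # stand-in for the per-test working directory

# Simulate a checkout that ships taxonomy.md but not CLAUDE.md.
echo '# taxonomy' > "$REPO_DIR/taxonomy.md"

for name in taxonomy CLAUDE; do
  copy_if_present "$REPO_DIR/$name.md" "$WORK_DIR"
done

ls "$WORK_DIR"
```

Files that exist at the repo root still overlay the fixture copies, while a clean clone without them simply falls through to the fixtures.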
Compress operational notes and rationale paragraphs while preserving all technical invariants (lane definitions, 5-case branching, two-channel write invariant, Tier 3 denylist, PUBLIC repo guard). No semantic changes.
Compress prose in Substance section and Anti-patterns; shorten Good Example. Removed redundant scaffold paragraphs. No semantic changes.
Compress section descriptions and remove redundant scaffold paragraphs.
Compress notes/entity principles; merge redundant projects/ deprecation paragraphs.
Compress section descriptions while preserving status transition rules and File-First principle.
Compress reports/ and pages/ subsection descriptions while preserving recipe pair convention.
Compress subsection bullets and cross-cutting rules.
Compress structure/.processed/subdirectory bullets.
Compress bullets in Tag Management, Reference Rules, Entity References.
…178→154)

Compress Critical Invariants and detailed rules index in rill-core.md; replace rill-tasks.md Good Example block with abridged prose form.
…e counter

[P1] test-eval.sh: replace 1-entry 'queries: [{id, text}]' fixture with 7-entry top-level sequence matching real eval/queries.yaml format (id/query/type/scope), spanning the 4 supported types for Phase 2 stratified sampling.

[P2] run-all.sh: replace '((TOTAL_FAIL++))' with 'TOTAL_FAIL=$((TOTAL_FAIL + 1))'. Post-increment evaluates to the old value, so when TOTAL_FAIL=0 the '((...))' command returns exit status 1, which under set -e would abort run-all.sh on the first failing suite.
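
The failure mode behind the [P2] fix can be reproduced in isolation. The sketch below is not taken from run-all.sh; it runs each form in a child `bash -c` process (an illustrative choice, so the abort does not kill the demo itself) and assumes bash 4.1 or later, where `((...))` is subject to `set -e`.

```shell
#!/usr/bin/env bash
# Unsafe form: ((TOTAL_FAIL++)) evaluates to the old value 0, so the command
# exits with status 1 and `set -e` aborts the child before the echo runs.
unsafe_out=$(bash -c 'set -e; TOTAL_FAIL=0; ((TOTAL_FAIL++)); echo "TOTAL_FAIL=$TOTAL_FAIL"')
unsafe_status=$?

# Safe form: a plain assignment always exits 0, whatever the value is.
safe_out=$(bash -c 'set -e; TOTAL_FAIL=0; TOTAL_FAIL=$((TOTAL_FAIL + 1)); echo "TOTAL_FAIL=$TOTAL_FAIL"')
safe_status=$?

echo "unsafe: status=$unsafe_status output='$unsafe_out'"
echo "safe:   status=$safe_status output='$safe_out'"
```

The unsafe form only misbehaves on the first increment from 0, which is why the bug surfaces exactly when the first suite fails and stays hidden on all-green runs.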