feat(plan-check): add plan validation workflows#117
feat(plan-check): add plan validation workflows#117ZenusZhang wants to merge 1 commit intoPolyArch:devfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new plan-check validation/reporting workflow (plus Codex skill wiring) and integrates gen-plan check mode controls/config to support draft + plan validation/repair flows.
Changes:
- Introduces deterministic plan-check backend: schema validation helpers, IO validation, findings/report assembly, and a Codex skill entrypoint.
- Extends gen-plan with
--check/--no-checkflags +gen_plan_checkconfig resolution and updates docs/tests accordingly. - Improves hook/install robustness: symlink-tolerant goal-tracker path matching, methodology-analysis completion gating, and Codex hooks installer dry-run behavior.
Reviewed changes
Copilot reviewed 48 out of 49 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test-unified-codex-config.sh | Ensures tests run with isolated XDG config home for defaults validation. |
| tests/test-plan-check.sh | Adds fixture-style tests for schema validation, report assembly, rewrite/recheck behavior, and regression cases. |
| tests/test-methodology-analysis.sh | Adds tests for readiness vs completion behavior in methodology analysis flow. |
| tests/test-gen-plan.sh | Adds assertions for --check/--no-check + gen_plan_check documentation and validator behavior. |
| tests/test-gen-idea.sh | Adds tests for raw-argument parsing behavior in gen-idea IO validation. |
| tests/test-config-merge.sh | Adds coverage for plan_check_recheck + gen_plan_check merged-config behavior. |
| tests/test-codex-hook-install.sh | Adds coverage for hooks installer existence, large codex features list, dry-run behavior, and plan-check skill installation. |
| tests/test-bitlesson-select-routing.sh | Adds regression test for empty codex_model fallback under nounset. |
| tests/test-ask-codex.sh | Verifies nested codex exec disables codex_hooks when supported. |
| tests/test-allowlist-validators.sh | Adds coverage for blocking stale same-round loop contract edits. |
| tests/run-all-tests.sh | Wires new plan-check and gen-plan-check suites into the overall test runner. |
| tests/robustness/test-hook-system-robustness.sh | Adds symlink-root coverage for goal-tracker read/write/edit validators and backtick-literal regression assertion. |
| tests/fixtures/gen-plan-check/default-template.md | Adds fixture template for gen-plan check-mode tests. |
| tests/fixtures/gen-plan-check/default-expected.md | Adds expected output fixture for gen-plan check-mode tests. |
| tests/fixtures/gen-plan-check/default-draft.md | Adds draft fixture for gen-plan check-mode tests. |
| skills/humanize-plan-check/SKILL.md | Adds Codex flow entrypoint documentation for plan-check. |
| skills/humanize-gen-plan/SKILL.md | Documents check-mode flags and config priority in the gen-plan skill. |
| scripts/validate-plan-check-io.sh | Adds IO validator for plan-check (plan existence, output dir resolution/creation, permissions, flags). |
| scripts/validate-gen-plan-io.sh | Adds --check/--no-check flags to the gen-plan IO validator and usage output. |
| scripts/validate-gen-idea-io.sh | Adds --raw-arguments parsing mode to preserve/normalize inline idea text and flags. |
| scripts/plan-check.sh | Adds deterministic report assembler: validates findings array schema, writes findings.json + report.md. |
| scripts/lib/plan-check-common.sh | Adds shared plan-check library: schema validators, report helpers, rewrite helpers, config resolution, ambiguity ID postprocessing. |
| scripts/lib/gen-plan-check-mode.sh | Adds resolver implementing --no-check > --check > config > default priority. |
| scripts/install-skill.sh | Ensures humanize-plan-check is installed and improves DRY-RUN behavior ordering. |
| scripts/install-codex-hooks.sh | Avoids probing Codex features during dry-run; improves feature-list probing robustness. |
| scripts/bitlesson-select.sh | Sources loop-common earlier to ensure default model is initialized under set -u. |
| scripts/ask-codex.sh | Disables Codex hooks for nested codex exec runs when supported by installed CLI. |
| hooks/loop-write-validator.sh | Uses new symlink-tolerant loop_paths_match for goal-tracker path checks. |
| hooks/loop-read-validator.sh | Uses new symlink-tolerant loop_paths_match for goal-tracker path checks. |
| hooks/loop-edit-validator.sh | Uses loop_paths_match for goal-tracker + adds “wrong directory path” validation for loop edits. |
| hooks/loop-codex-stop-hook.sh | Separates methodology readiness gate from completion mutation; ensures completion is finalized before exit. |
| hooks/lib/methodology-analysis.sh | Splits readiness check from completion logic to avoid side effects during gating. |
| hooks/lib/loop-common.sh | Adds loop_paths_match helper; adjusts fallback text escaping for backticks. |
| docs/usage.md | Documents new plan-check command and gen-plan check-mode behavior/config. |
| docs/install-for-kimi.md | Documents installing the new humanize-plan-check skill. |
| docs/install-for-codex.md | Documents installing the new humanize-plan-check skill. |
| config/default_config.json | Adds defaults for plan_check_recheck and gen_plan_check. |
| commands/plan-check.md | Adds Claude command spec for plan-check workflow and sub-agent contract. |
| commands/gen-plan.md | Adds check-mode spec (draft check + plan check/repair + gating) and new allowed tools. |
| commands/gen-idea.md | Switches gen-idea validation to --raw-arguments mode. |
| agents/plan-consistency-checker.md | Adds agent spec for plan contradiction detection output schema. |
| agents/plan-ambiguity-checker.md | Adds agent spec for plan ambiguity detection with stable, hash-based IDs. |
| agents/draft-plan-drift-checker.md | Adds drift checker spec used only to explain existing primary findings from sources. |
| agents/draft-consistency-checker.md | Adds draft contradiction checker spec. |
| agents/draft-ambiguity-checker.md | Adds draft ambiguity checker spec with stable IDs. |
| README.md | Bumps version to 1.17.0. |
| .claude-plugin/plugin.json | Bumps plugin version to 1.17.0. |
| .claude-plugin/marketplace.json | Bumps marketplace version to 1.17.0. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "$val_g + $val_a + $val_b" | ||
| fi | ||
|
|
||
| # ========================================n# Test 8: gen_plan_check default is false |
There was a problem hiding this comment.
There’s a stray n embedded in the section divider comment, which will show up in test output and makes the header inconsistent with the other test blocks. Remove the extra character so the divider and the # Test 8 header are on separate lines.
| # 3 - Output directory does not exist | ||
| # 4 - Output file already exists | ||
| # 5 - No write permission to output directory |
There was a problem hiding this comment.
The exit-code documentation at the top doesn’t match the current checks: this script creates the report directory when missing (so it’s not simply “does not exist”), and exit 4 is triggered when the report path exists as a non-directory (not when an output file exists). Update these comments so callers/debuggers don’t get misled about what exit codes 3 and 4 mean.
| tmpfile="$(mktemp "$target_dir/.plan-check-write.XXXXXX")" | ||
| printf '%s\n' "$content" > "$tmpfile" | ||
| mv "$tmpfile" "$target_path" |
There was a problem hiding this comment.
plan_check_atomic_write replaces the target file with a newly-created temp file, which will typically reset file permissions to the mktemp default (often 0600) instead of preserving the original mode. This can unexpectedly make previously-readable plan files private after a rewrite. Consider copying the existing file’s mode/ownership to the temp file before mv (e.g., chmod --reference/chown --reference when available, or stat + chmod fallback), or use a write-in-place approach that preserves permissions.
| tmpfile="$(mktemp "$target_dir/.plan-check-write.XXXXXX")" | |
| printf '%s\n' "$content" > "$tmpfile" | |
| mv "$tmpfile" "$target_path" | |
| tmpfile="$(mktemp "$target_dir/.plan-check-write.XXXXXX")" | |
| trap 'rm -f "$tmpfile"' RETURN | |
| printf '%s\n' "$content" > "$tmpfile" | |
| if [[ -e "$target_path" ]]; then | |
| if chmod --reference="$target_path" "$tmpfile" 2>/dev/null; then | |
| : | |
| else | |
| local target_mode="" | |
| target_mode="$(stat -c '%a' "$target_path" 2>/dev/null || stat -f '%Lp' "$target_path" 2>/dev/null || true)" | |
| if [[ -n "$target_mode" ]]; then | |
| chmod "$target_mode" "$tmpfile" | |
| fi | |
| fi | |
| if chown --reference="$target_path" "$tmpfile" 2>/dev/null; then | |
| : | |
| fi | |
| fi | |
| mv "$tmpfile" "$target_path" | |
| trap - RETURN |
| try: | ||
| findings = json.load(sys.stdin) | ||
| except Exception: | ||
| sys.stdout.write(sys.stdin.read()) |
There was a problem hiding this comment.
In plan_check_postprocess_ambiguity_ids, the fallback path on JSON parse failure attempts to re-read sys.stdin after json.load(sys.stdin) has already consumed some/all bytes, so the "pass-through" output can be truncated or empty. To reliably pass through the original content, read stdin into a string first, then json.loads(raw) in the try block, and on failure print raw unchanged.
| try: | |
| findings = json.load(sys.stdin) | |
| except Exception: | |
| sys.stdout.write(sys.stdin.read()) | |
| raw = sys.stdin.read() | |
| try: | |
| findings = json.loads(raw) | |
| except Exception: | |
| sys.stdout.write(raw) |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4c89cf94fd
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if [[ "$in_task_section" -eq 1 && "$line" =~ ^##[[:space:]] && ! "$line" =~ ^### ]]; then | ||
| break | ||
| fi | ||
| if [[ "$in_task_section" -eq 1 && "$line" =~ ^\|[\-\|]+\|$ ]]; then |
There was a problem hiding this comment.
Accept spaced Task Breakdown table separators
The parser only enters table mode when a separator matches ^\|[\-\|]+\|$, which rejects common markdown separators like | --- | --- | (and aligned variants with :). When that happens, task rows are never parsed, so task-level schema checks are skipped; invalid tags, nonexistent target ACs, and bad dependencies can all pass without blocker findings. This pattern is reused across the task validators, so the gap affects multiple checks.
Useful? React with 👍 / 👎.
| local normalized_acs | ||
| normalized_acs="$(_plan_check_normalize_target_acs "$target_ac")" | ||
|
|
||
| while IFS= read -r single_ac; do | ||
| [[ -z "$single_ac" ]] && continue |
There was a problem hiding this comment.
Emit finding when Target AC cell has no parsable AC ID
This validator only reports errors for AC IDs that were successfully parsed and then found missing; if parsing yields no IDs (for example - or malformed text), the loop is skipped and no finding is emitted. That means tasks with an empty/invalid Target AC cell are treated as valid even though they do not map to any acceptance criterion, which weakens the schema gate and can leave tasks without a verification target.
Useful? React with 👍 / 👎.
Add the standalone plan-check command, Codex/Kimi skill wiring, shared validation helpers, and integrated gen-plan check mode with draft and plan repair flows. Squashed from the local RLCR iteration history on top of the latest origin/dev; excludes the ask-gemini command/test follow-up from this plan-check PR.
Summary
Tests