Skip to content

feat(plan-check): add plan validation workflows#117

Open
ZenusZhang wants to merge 1 commit intoPolyArch:devfrom
ZenusZhang:plan-check
Open

feat(plan-check): add plan validation workflows#117
ZenusZhang wants to merge 1 commit intoPolyArch:devfrom
ZenusZhang:plan-check

Conversation

@ZenusZhang
Copy link
Copy Markdown
Contributor

Summary

  • add standalone plan-check command, skill wiring, validators, and review agents
  • integrate gen-plan check mode with draft and plan repair flows
  • keep ask-gemini command/test changes out of this PR

Tests

  • bash tests/test-plan-check.sh
  • bash tests/test-gen-plan-check.sh
  • bash tests/test-codex-hook-install.sh
  • bash tests/test-unified-codex-config.sh

Copilot AI review requested due to automatic review settings April 27, 2026 03:33
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new plan-check validation/reporting workflow (plus Codex skill wiring) and integrates gen-plan check mode controls/config to support draft + plan validation/repair flows.

Changes:

  • Introduces deterministic plan-check backend: schema validation helpers, IO validation, findings/report assembly, and a Codex skill entrypoint.
  • Extends gen-plan with --check/--no-check flags + gen_plan_check config resolution and updates docs/tests accordingly.
  • Improves hook/install robustness: symlink-tolerant goal-tracker path matching, methodology-analysis completion gating, and Codex hooks installer dry-run behavior.

Reviewed changes

Copilot reviewed 48 out of 49 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/test-unified-codex-config.sh Ensures tests run with isolated XDG config home for defaults validation.
tests/test-plan-check.sh Adds fixture-style tests for schema validation, report assembly, rewrite/recheck behavior, and regression cases.
tests/test-methodology-analysis.sh Adds tests for readiness vs completion behavior in methodology analysis flow.
tests/test-gen-plan.sh Adds assertions for --check/--no-check + gen_plan_check documentation and validator behavior.
tests/test-gen-idea.sh Adds tests for raw-argument parsing behavior in gen-idea IO validation.
tests/test-config-merge.sh Adds coverage for plan_check_recheck + gen_plan_check merged-config behavior.
tests/test-codex-hook-install.sh Adds coverage for hooks installer existence, large codex features list, dry-run behavior, and plan-check skill installation.
tests/test-bitlesson-select-routing.sh Adds regression test for empty codex_model fallback under nounset.
tests/test-ask-codex.sh Verifies nested codex exec disables codex_hooks when supported.
tests/test-allowlist-validators.sh Adds coverage for blocking stale same-round loop contract edits.
tests/run-all-tests.sh Wires new plan-check and gen-plan-check suites into the overall test runner.
tests/robustness/test-hook-system-robustness.sh Adds symlink-root coverage for goal-tracker read/write/edit validators and backtick-literal regression assertion.
tests/fixtures/gen-plan-check/default-template.md Adds fixture template for gen-plan check-mode tests.
tests/fixtures/gen-plan-check/default-expected.md Adds expected output fixture for gen-plan check-mode tests.
tests/fixtures/gen-plan-check/default-draft.md Adds draft fixture for gen-plan check-mode tests.
skills/humanize-plan-check/SKILL.md Adds Codex flow entrypoint documentation for plan-check.
skills/humanize-gen-plan/SKILL.md Documents check-mode flags and config priority in the gen-plan skill.
scripts/validate-plan-check-io.sh Adds IO validator for plan-check (plan existence, output dir resolution/creation, permissions, flags).
scripts/validate-gen-plan-io.sh Adds --check/--no-check flags to the gen-plan IO validator and usage output.
scripts/validate-gen-idea-io.sh Adds --raw-arguments parsing mode to preserve/normalize inline idea text and flags.
scripts/plan-check.sh Adds deterministic report assembler: validates findings array schema, writes findings.json + report.md.
scripts/lib/plan-check-common.sh Adds shared plan-check library: schema validators, report helpers, rewrite helpers, config resolution, ambiguity ID postprocessing.
scripts/lib/gen-plan-check-mode.sh Adds resolver implementing --no-check > --check > config > default priority.
scripts/install-skill.sh Ensures humanize-plan-check is installed and improves DRY-RUN behavior ordering.
scripts/install-codex-hooks.sh Avoids probing Codex features during dry-run; improves feature-list probing robustness.
scripts/bitlesson-select.sh Sources loop-common earlier to ensure default model is initialized under set -u.
scripts/ask-codex.sh Disables Codex hooks for nested codex exec runs when supported by installed CLI.
hooks/loop-write-validator.sh Uses new symlink-tolerant loop_paths_match for goal-tracker path checks.
hooks/loop-read-validator.sh Uses new symlink-tolerant loop_paths_match for goal-tracker path checks.
hooks/loop-edit-validator.sh Uses loop_paths_match for goal-tracker + adds “wrong directory path” validation for loop edits.
hooks/loop-codex-stop-hook.sh Separates methodology readiness gate from completion mutation; ensures completion is finalized before exit.
hooks/lib/methodology-analysis.sh Splits readiness check from completion logic to avoid side effects during gating.
hooks/lib/loop-common.sh Adds loop_paths_match helper; adjusts fallback text escaping for backticks.
docs/usage.md Documents new plan-check command and gen-plan check-mode behavior/config.
docs/install-for-kimi.md Documents installing the new humanize-plan-check skill.
docs/install-for-codex.md Documents installing the new humanize-plan-check skill.
config/default_config.json Adds defaults for plan_check_recheck and gen_plan_check.
commands/plan-check.md Adds Claude command spec for plan-check workflow and sub-agent contract.
commands/gen-plan.md Adds check-mode spec (draft check + plan check/repair + gating) and new allowed tools.
commands/gen-idea.md Switches gen-idea validation to --raw-arguments mode.
agents/plan-consistency-checker.md Adds agent spec for plan contradiction detection output schema.
agents/plan-ambiguity-checker.md Adds agent spec for plan ambiguity detection with stable, hash-based IDs.
agents/draft-plan-drift-checker.md Adds drift checker spec used only to explain existing primary findings from sources.
agents/draft-consistency-checker.md Adds draft contradiction checker spec.
agents/draft-ambiguity-checker.md Adds draft ambiguity checker spec with stable IDs.
README.md Bumps version to 1.17.0.
.claude-plugin/plugin.json Bumps plugin version to 1.17.0.
.claude-plugin/marketplace.json Bumps marketplace version to 1.17.0.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/test-config-merge.sh Outdated
"$val_g + $val_a + $val_b"
fi

# ========================================n# Test 8: gen_plan_check default is false
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s a stray n embedded in the section divider comment, which will show up in test output and makes the header inconsistent with the other test blocks. Remove the extra character so the divider and the # Test 8 header are on separate lines.

Copilot uses AI. Check for mistakes.
Comment thread scripts/validate-plan-check-io.sh Outdated
Comment on lines +8 to +10
# 3 - Output directory does not exist
# 4 - Output file already exists
# 5 - No write permission to output directory
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exit-code documentation at the top doesn’t match the current checks: this script creates the report directory when missing (so it’s not simply “does not exist”), and exit 4 is triggered when the report path exists as a non-directory (not when an output file exists). Update these comments so callers/debuggers don’t get misled about what exit codes 3 and 4 mean.

Copilot uses AI. Check for mistakes.
Comment thread scripts/lib/plan-check-common.sh Outdated
Comment on lines +399 to +401
tmpfile="$(mktemp "$target_dir/.plan-check-write.XXXXXX")"
printf '%s\n' "$content" > "$tmpfile"
mv "$tmpfile" "$target_path"
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plan_check_atomic_write replaces the target file with a newly-created temp file, which will typically reset file permissions to the mktemp default (often 0600) instead of preserving the original mode. This can unexpectedly make previously-readable plan files private after a rewrite. Consider copying the existing file’s mode/ownership to the temp file before mv (e.g., chmod --reference/chown --reference when available, or stat + chmod fallback), or use a write-in-place approach that preserves permissions.

Suggested change
tmpfile="$(mktemp "$target_dir/.plan-check-write.XXXXXX")"
printf '%s\n' "$content" > "$tmpfile"
mv "$tmpfile" "$target_path"
tmpfile="$(mktemp "$target_dir/.plan-check-write.XXXXXX")"
trap 'rm -f "$tmpfile"' RETURN
printf '%s\n' "$content" > "$tmpfile"
if [[ -e "$target_path" ]]; then
if chmod --reference="$target_path" "$tmpfile" 2>/dev/null; then
:
else
local target_mode=""
target_mode="$(stat -c '%a' "$target_path" 2>/dev/null || stat -f '%Lp' "$target_path" 2>/dev/null || true)"
if [[ -n "$target_mode" ]]; then
chmod "$target_mode" "$tmpfile"
fi
fi
if chown --reference="$target_path" "$tmpfile" 2>/dev/null; then
:
fi
fi
mv "$tmpfile" "$target_path"
trap - RETURN

Copilot uses AI. Check for mistakes.
Comment thread scripts/lib/plan-check-common.sh Outdated
Comment on lines +489 to +492
try:
findings = json.load(sys.stdin)
except Exception:
sys.stdout.write(sys.stdin.read())
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In plan_check_postprocess_ambiguity_ids, the fallback path on JSON parse failure attempts to re-read sys.stdin after json.load(sys.stdin) has already consumed some/all bytes, so the "pass-through" output can be truncated or empty. To reliably pass through the original content, read stdin into a string first, then json.loads(raw) in the try block, and on failure print raw unchanged.

Suggested change
try:
findings = json.load(sys.stdin)
except Exception:
sys.stdout.write(sys.stdin.read())
raw = sys.stdin.read()
try:
findings = json.loads(raw)
except Exception:
sys.stdout.write(raw)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4c89cf94fd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread scripts/lib/plan-check-common.sh Outdated
if [[ "$in_task_section" -eq 1 && "$line" =~ ^##[[:space:]] && ! "$line" =~ ^### ]]; then
break
fi
if [[ "$in_task_section" -eq 1 && "$line" =~ ^\|[\-\|]+\|$ ]]; then
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Accept spaced Task Breakdown table separators

The parser only enters table mode when a separator matches ^\|[\-\|]+\|$, which rejects common markdown separators like | --- | --- | (and aligned variants with :). When that happens, task rows are never parsed, so task-level schema checks are skipped; invalid tags, nonexistent target ACs, and bad dependencies can all pass without blocker findings. This pattern is reused across the task validators, so the gap affects multiple checks.

Useful? React with 👍 / 👎.

Comment on lines +699 to +703
local normalized_acs
normalized_acs="$(_plan_check_normalize_target_acs "$target_ac")"

while IFS= read -r single_ac; do
[[ -z "$single_ac" ]] && continue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Emit finding when Target AC cell has no parsable AC ID

This validator only reports errors for AC IDs that were successfully parsed and then found missing; if parsing yields no IDs (for example - or malformed text), the loop is skipped and no finding is emitted. That means tasks with an empty/invalid Target AC cell are treated as valid even though they do not map to any acceptance criterion, which weakens the schema gate and can leave tasks without a verification target.

Useful? React with 👍 / 👎.

Add the standalone plan-check command, Codex/Kimi skill wiring, shared validation helpers, and integrated gen-plan check mode with draft and plan repair flows.

Squashed from the local RLCR iteration history on top of the latest origin/dev; excludes the ask-gemini command/test follow-up from this plan-check PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants