ci: trigger eval-skills on every PR so required checks always report #79
Merged
Branch protection requires "Unit tests", "Aggregate scores", and "Evaluate gate" — all of which come from this workflow. With the `paths:` filter, PRs that don't touch `skills/**`, `evals/**`, or this file never trigger the workflow, so the three required checks stay forever in "Expected — Waiting for status to be reported" and block merge. (#74 hit this.) Remove the trigger-level `paths:` filter so the workflow always runs on PRs. The existing diff job and per-job `if:` conditions already short-circuit the real work when no skills changed, and GitHub treats skipped required checks as passing.
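As a rough illustration of the trigger change described above, the sketch below shows the `on:` block before and after. The real contents of `.github/workflows/eval-skills.yml` are not reproduced in this PR text, so the layout is an assumption built only from the paths named above.

```yaml
# Sketch only: assumed shape of the trigger in .github/workflows/eval-skills.yml.

# Before: the pull_request trigger carried a paths filter, so PRs outside
# these paths never started the workflow and the required checks never reported.
#
# on:
#   pull_request:
#     paths:
#       - "skills/**"
#       - "evals/**"
#       - ".github/workflows/eval-skills.yml"

# After: no trigger-level filter, so every PR starts the workflow.
# (The schedule and workflow_dispatch triggers are unchanged and omitted here.)
on:
  pull_request:
```

With the filter gone, the decision to skip the expensive work moves from the trigger into the workflow itself, where a skipped job still reports a status.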
Skill eval results
Only suites whose source actually changed since their last recorded score were re-run. Soft-failing while we stabilise the baseline.
jfong-ld approved these changes Jun 15, 2026
tiffanylphan added a commit that referenced this pull request Jun 15, 2026
Summary
The required checks come from a path-filtered workflow, so PRs that don't touch `skills/**` or `evals/**` never trigger it and get stuck "waiting for status to report." This PR removes the path filter so the workflow fires on every PR. Nothing is weakened: the `diff` job inside the workflow already gates the actual eval work on whether skills changed, and skipped required checks count as passing in branch protection. Net cost: ~40s of cheap CI per PR; no extra API tokens.

Branch protection on `main` requires three status checks (Unit tests, Aggregate scores, and Evaluate gate), all of which are produced by `.github/workflows/eval-skills.yml`. The workflow is currently path-filtered to `skills/**`, `evals/**`, and the workflow file itself. For any PR that doesn't touch one of those paths, the workflow never triggers, so the three required checks stay forever in `Expected — Waiting for status to be reported` and the PR is `BLOCKED` from merging even with every other check green and reviews approved.

#74 hit this: it only changed `.claude-plugin/marketplace.json` and `README.md`, so the workflow never fired and the merge button has been stuck for days.

Fix
Remove the trigger-level `paths:` filter so the workflow runs on every PR. The cost is small:

- The `diff` job is preserved; it still computes whether any skills actually changed.
- `evaluate` and `aggregate` already have job-level `if: needs.diff.outputs.has_changes == 'true'` guards, so they skip cleanly on PRs that don't touch skills.
- The only jobs that do real work on every PR are the `unit-test` job (~30s) and the `diff` job (~10s).

The cron-schedule and `workflow_dispatch` triggers are unchanged.

Test plan

- Checks report on this PR (it touches `.github/workflows/eval-skills.yml`, so the workflow self-triggers under the old paths filter; required checks will report on this PR via the existing rule).
- #74 unblocks once this lands: push an empty/no-op commit there so the new trigger fires, and verify all three required checks report SUCCESS or SKIPPED.
- Open a PR that touches no skill paths (e.g. only `README.md`) and confirm the three required checks report there too.

🤖 Generated with Claude Code
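
For reference, the job-level guards described under Fix might look roughly like the sketch below. Only the job names and the `has_changes` condition come from this PR; the step id `detect`, the runner label, the `needs` wiring, and the placeholder `run:` commands are hypothetical stand-ins for whatever the real workflow does.

```yaml
# Sketch only: assumed job layout, not the actual eval-skills.yml.
jobs:
  diff:
    runs-on: ubuntu-latest
    outputs:
      # Exposed to downstream jobs as needs.diff.outputs.has_changes.
      has_changes: ${{ steps.detect.outputs.has_changes }}
    steps:
      - id: detect
        # Hypothetical placeholder: the real job computes this from the
        # files changed in the PR.
        run: echo "has_changes=false" >> "$GITHUB_OUTPUT"

  unit-test:
    # No guard: runs on every PR.
    runs-on: ubuntu-latest
    steps:
      - run: echo "unit tests run here"

  evaluate:
    needs: diff
    # Skipped when no skills changed.
    if: needs.diff.outputs.has_changes == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "skill evals run here"

  aggregate:
    needs: [diff, evaluate]
    # Same guard, so it is skipped together with evaluate.
    if: needs.diff.outputs.has_changes == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "score aggregation runs here"
```

Because the skip happens at the job level rather than at the trigger, each guarded job still reports a status to the PR, which is what lets branch protection resolve instead of waiting indefinitely.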