feat(ci): run liquid tests in parallel, grouped per firm and template type#27
feat(ci): run liquid tests in parallel, grouped per firm and template type#27Benjvandam wants to merge 5 commits into
Conversation
Replace the per-template sequential test loop with a two-phase approach: group changed reconciliations by their resolved firm, then run each firm's handles through a single `run-test --status` call that executes them concurrently (CLI Promise.all). Batches are capped at MAX_PARALLEL_TESTS (default 5) and run sequentially so the total number of in-flight tests against the live platform stays bounded, since the CLI has no client-side rate limiting. Also gate the reconciliation_texts check before firm resolution so shared-part changes (whose ".id" is a plain string) are skipped cleanly instead of aborting the step on a jq error, and drop the per-iteration CONFIG_JSON re-write (a no-op; token refresh is handled elsewhere).
In CI, consola prefixes every line with a '[log] ' tag and the test spinner interleaves ANSI/braille frames onto the same line, so the per-handle status is not at the start of a line. Normalise carriage returns and ANSI escapes and match '<handle>: PASSED/FAILED' on a left word boundary instead of anchoring to '^', which previously caused every handle to be reported as 'status could not be determined'.
WalkthroughThe GitHub Actions workflow for running liquid tests refactors from per-handle sequential execution to firm-grouped batch execution. Template detection becomes JSON-based for space safety, a ChangesFirm-grouped batch test execution
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/run_tests.yml (1)
209-218: 💤 Low valueHandle names with regex metacharacters may cause incorrect status matching.
The
HANDLEvariable is interpolated directly into the grep regex pattern without escaping. If a handle contains regex metacharacters (e.g.,.,+,*), the pattern could match unintended strings or fail to match correctly.For typical alphanumeric handles this is unlikely to be an issue, but for robustness you could escape the handle:
♻️ Optional: escape handle for regex
for HANDLE in "${BATCH[@]}"; do + ESCAPED_HANDLE=$(printf '%s' "${HANDLE}" | sed 's/[.[\*^$()+?{|\\]/\\&/g') - if printf '%s\n' "${CLEAN_OUTPUT}" | grep -qE "(^|[^[:alnum:]_])${HANDLE}: PASSED"; then + if printf '%s\n' "${CLEAN_OUTPUT}" | grep -qE "(^|[^[:alnum:]_])${ESCAPED_HANDLE}: PASSED"; then echo "${HANDLE}: passed" - elif printf '%s\n' "${CLEAN_OUTPUT}" | grep -qE "(^|[^[:alnum:]_])${HANDLE}: FAILED"; then + elif printf '%s\n' "${CLEAN_OUTPUT}" | grep -qE "(^|[^[:alnum:]_])${ESCAPED_HANDLE}: FAILED"; then🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/run_tests.yml around lines 209 - 218, The HANDLE variable is being used directly in the grep regex patterns without escaping special regex metacharacters, which could cause incorrect pattern matching if the handle contains characters like dot, plus, asterisk, or other regex special characters. Before using HANDLE in the grep patterns on the lines that check for PASSED and FAILED status, escape the variable by replacing regex metacharacters with their escaped equivalents. You can do this by creating an escaped version of HANDLE using sed or a similar method to replace characters like dot, asterisk, plus, question mark, brackets, pipes, and parentheses with their backslash-escaped versions, then use this escaped variable in the grep regex patterns instead of the raw HANDLE variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/run_tests.yml:
- Line 107: The unquoted `${{ env.CHANGED_TEMPLATES }}` variable expansion in
the for loop is vulnerable to command injection if directory names contain shell
metacharacters. Quote the variable expansion by wrapping it as `"${{
env.CHANGED_TEMPLATES }}"` in the for loop statement to prevent unintended shell
interpretation, or alternatively refactor to use a safer iteration method such
as passing the templates as a JSON array and parsing them with jq to avoid
direct bash expansion of potentially malicious directory names.
---
Nitpick comments:
In @.github/workflows/run_tests.yml:
- Around line 209-218: The HANDLE variable is being used directly in the grep
regex patterns without escaping special regex metacharacters, which could cause
incorrect pattern matching if the handle contains characters like dot, plus,
asterisk, or other regex special characters. Before using HANDLE in the grep
patterns on the lines that check for PASSED and FAILED status, escape the
variable by replacing regex metacharacters with their escaped equivalents. You
can do this by creating an escaped version of HANDLE using sed or a similar
method to replace characters like dot, asterisk, plus, question mark, brackets,
pipes, and parentheses with their backslash-escaped versions, then use this
escaped variable in the grep regex patterns instead of the raw HANDLE variable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e87b6978-d9be-4112-9bf8-bae3ac6a0967
📒 Files selected for processing (1)
.github/workflows/run_tests.yml
The CLI runs liquid tests for both reconciliation_texts (--handle) and account_templates (--account-template). Group changed templates by '<firm>|<type>' so each bucket maps to one run-test call (the CLI takes a single firm and a single type per call) and pick the flag per bucket. Account template folder names contain spaces and characters like & ( ) ', so make the whole path space-safe: emit the changed-files list as JSON, hand it between steps/jobs as a JSON array, quote the config reads, and parse the run-test output by exact template name instead of a regex. Shared parts (and anything without liquid tests) are still skipped.
With json: true the changed-files action defaults to escape_json: true, which double-escapes the quotes ([\"a\"] instead of ["a"]). jq then fails to parse it, no templates are detected, and the test job is skipped. escape_json: false emits plain JSON that jq can read.
|
@coderabbitai full review |
✅ Action performedFull review finished. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/run_tests.yml:
- Line 231: The grep pattern on this line uses a strict `$` anchor immediately
after `(PASSED|FAILED)` which fails to match lines with trailing whitespace or
invisible characters. Modify the pattern to allow optional trailing whitespace
by inserting `[[:space:]]*` between the `(PASSED|FAILED)` group and the `$`
anchor. This will match test result lines regardless of whether they have
trailing spaces or other whitespace characters.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 37dcc4f2-491e-4166-9276-09aa873dc355
📒 Files selected for processing (1)
.github/workflows/run_tests.yml
| NAME="${LINE#\[log\] }" | ||
| NAME="${NAME%: *}" | ||
| RESULTS["${NAME}"]="${STATUS}" | ||
| done < <(printf '%s\n' "${CLEAN_OUTPUT}" | grep -oE '\[log\] [^[:space:]].*: (PASSED|FAILED)$') |
There was a problem hiding this comment.
Grep pattern may fail if CLI output has trailing whitespace.
The $ anchor requires the line to end exactly with PASSED or FAILED. If the CLI output has trailing whitespace or other invisible characters (which survive the ANSI stripping), the pattern won't match and all templates will show as "status could not be determined".
🛠️ Suggested fix to allow trailing whitespace
- done < <(printf '%s\n' "${CLEAN_OUTPUT}" | grep -oE '\[log\] [^[:space:]].*: (PASSED|FAILED)$')
+ done < <(printf '%s\n' "${CLEAN_OUTPUT}" | grep -oE '\[log\] [^[:space:]].*: (PASSED|FAILED)[[:space:]]*$')📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| done < <(printf '%s\n' "${CLEAN_OUTPUT}" | grep -oE '\[log\] [^[:space:]].*: (PASSED|FAILED)$') | |
| done < <(printf '%s\n' "${CLEAN_OUTPUT}" | grep -oE '\[log\] [^[:space:]].*: (PASSED|FAILED)[[:space:]]*$') |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/run_tests.yml at line 231, The grep pattern on this line
uses a strict `$` anchor immediately after `(PASSED|FAILED)` which fails to
match lines with trailing whitespace or invisible characters. Modify the pattern
to allow optional trailing whitespace by inserting `[[:space:]]*` between the
`(PASSED|FAILED)` group and the `$` anchor. This will match test result lines
regardless of whether they have trailing spaces or other whitespace characters.
| json: true | ||
| escape_json: false |
There was a problem hiding this comment.
🟠 Major — every test in a PR is silently skipped (green CI) when any changed template path contains &, (, or ).
Confirmed live on the validation run be_market #2949 / run 27955521933: 13 templates changed (incl. account_templates/Voorraad), yet check-changed-templates printed No templates changed, emitted [], and test-templates was skipped — nothing was tested.
Root cause: with safe_output: true (tj-actions/changed-files default), the JSON values are backslash-escaped for shell metacharacters, so account_templates/Leningen & Leasings (42)/main.liquid is emitted as …Leningen \& Leasings \(42\)…. \& / \( are invalid JSON escapes, so jq -r '.[]' aborts with a parse error:
$ printf '%s' "$ALL_CHANGED_FILES" | jq -r '.[]'
jq: parse error: Invalid escape at line 1, column 139
Because that jq runs inside the mapfile < <(… | jq … | grep …) process substitution, its non-zero exit doesn't trip set -e — jq just outputs nothing, the template list comes out empty, and the job skips while staying green. escape_json is not the relevant knob here; the backslashes come from safe_output.
Fix — disable safe_output so the array is valid JSON (the list is already passed via env: and parsed with jq, never eval'd, so this is safe):
json: true
escape_json: false
safe_output: false
files: |Worth a follow-up so this can't fail silently again: have the filter step fail loudly if jq errors, e.g. jq -er '.[]' with an explicit non-zero check instead of swallowing it in the process substitution.
| # Store outputs | ||
| # all_changed_files is a JSON array (json: true). Extract the unique template | ||
| # directories (reconciliation_texts / account_templates / shared_parts), space-safely. | ||
| pattern='(reconciliation_texts|account_templates|shared_parts)/[^/]+' |
There was a problem hiding this comment.
💡 Suggestion — shared_parts is extracted here only to be skipped later, which spins up the whole test-templates job for nothing on a shared-parts-only PR.
This pattern includes shared_parts, so a PR touching only shared parts yields a non-empty changed_templates (e.g. ["shared_parts/foo"]), the if: … != '[]' guard passes, and test-templates checks out + npm installs the CLI just to hit the "no liquid tests for this template type" skip and exit green. Since the "run reconciliations that use changed shared parts" feature is currently commented out, you could drop shared_parts from the pattern so such PRs short-circuit to [] and skip the job entirely:
pattern='(reconciliation_texts|account_templates)/[^/]+'(If you keep it for a future re-enable of the shared-parts test feature, no change needed — just flagging the wasted run.)
michieldegezelle
left a comment
There was a problem hiding this comment.
🟡 Minor / question — the per-template re-fetch of CONFIG_JSON from secrets was dropped; was that intentional?
The old test loop re-ran echo '${{ secrets.CONFIG_JSON }}' > $HOME/.silverfin/config.json before each reconciliation, with the comment "fetch the newest version of the tokens from the secrets, in case they were updated by the initiation of a concurrent workflow." This PR removes that and relies on the single load in the Load Silverfin config file from secrets step at job start.
Benign in the common case (the CLI refreshes the access token locally during the run). The only scenario it changes: if a concurrent workflow rotates the OAuth refresh token mid-run, this job can no longer pick up the new value and its remaining batches would fail to authenticate. Given the move to batched parallel runs the old per-template re-read wouldn't have helped mid-batch anyway, so this is likely fine — just flagging the dropped safeguard for a conscious sign-off, since it isn't mentioned in the PR description.
| TEMPLATE_BUCKETS["${FIRM_ID}|${TEMPLATE_TYPE}"]+="${IDENTIFIER}"$'\n' | ||
| echo "Queued ${IDENTIFIER} (${TEMPLATE_TYPE}) for firm ${FIRM_ID}" |
There was a problem hiding this comment.
🟠 Major — templates with an empty test YAML are reported as FAILED, not skipped.
5 of be_market's 20 account templates currently have a placeholder-only test file (Verworpen uitgaven, Huurinkomsten (detail), Opsplitsing kosten wagenpark, Overlopende rekeningen, Standaardtoelichting). For these the CLI prints:
[info] <name>: there are no tests stored in the YAML file
[error] Error running tests for <name>
[log] <name>: FAILED
runTestsStatusOnly returns FAILED whenever buildTestParams bails on an empty YAML (testContent.trim().split("\n").length <= 1), so this job will go red for templates that simply have no tests yet — which shouldn't be a CI failure.
The [log] <name>: FAILED line is identical to a real failure, so it can't be told apart at the case step. Cleanest fix is a pre-flight skip before queuing — replicate the CLI's own empty check so we never queue a test-less template:
# Skip templates whose test YAML is empty (CLI would report these as FAILED).
TEST_REL=$(cat "${CURRENT_DIR}/config.json" | jq -r '.test // empty')
TEST_PATH="${CURRENT_DIR}/${TEST_REL}"
if [[ -z "${TEST_REL}" || ! -s "${TEST_PATH}" ]] \
|| [[ "$(grep -cvE '^[[:space:]]*(#|$)' "${TEST_PATH}")" -eq 0 ]]; then
echo "Skipping ${IDENTIFIER} (no liquid tests defined)"
break
fi(~6 lines, right before the TEMPLATE_BUCKETS[…]+=… queue line.) This is in scope for this PR and easy. The more correct long-term fix is CLI-side: have runTestsStatusOnly emit a distinct SKIPPED/NO_TESTS status instead of FAILED for the empty-YAML case, so every consumer benefits — worth a silverfin-cli follow-up.
| @@ -31,39 +31,33 @@ jobs: | |||
| since_last_remote_commit: false | |||
| dir_names: false | |||
| base_sha: 'main' | |||
There was a problem hiding this comment.
🟠 Major — a template whose folder name has a non-ASCII character is silently dropped by changed-files (untested, green CI).
be_market#2949 modifies account_templates/Op te stellen … creditnota’s /main.liquid, but that path never appears in all_changed_files (run 27957849400) — so the template was never queued or tested, while the run stayed green. Every other changed path that run was pure ASCII and came through fine.
Cause: the folder name contains a curly apostrophe ’ (U+2019). With git's default core.quotepath=true, git emits the path double-quoted and octal-escaped:
"account_templates/…creditnota\342\200\231s /main.liquid"
changed-files doesn't reconcile that quoted form against the **/**.{liquid,…} glob, so it's filtered out.
Fix — disable quotepath before the changed-files step so non-ASCII paths emit raw:
- run: git config --global core.quotepath falseDefense-in-depth: that folder also ends in a trailing space (…creditnota’s before the /), which is fragile across the whole pipeline — worth renaming to drop the trailing space (and ideally the curly apostrophe) regardless.
| # and type run together in one parallel batch. Newline-separated keeps identifiers | ||
| # that contain spaces (account template names) intact. | ||
| TEMPLATE_BUCKETS["${FIRM_ID}|${TEMPLATE_TYPE}"]+="${IDENTIFIER}"$'\n' | ||
| echo "Queued ${IDENTIFIER} (${TEMPLATE_TYPE}) for firm ${FIRM_ID}" |
There was a problem hiding this comment.
💡 Suggestion — tighten per-template logging and group each parallel batch (with a final summary).
Each template currently emits 3 lines during queueing (Checking … / Using first available firm ID: … / Queued …), so ~13 templates ≈ 40 lines before a single test runs — and the two things that matter (identifier + firm) get buried, then the spinner frames pile on top.
Proposed shape — collapsible group per parallel batch, plus a flat summary at the end:
Queue phase — drop Checking …, keep only the firm-fallback Warning: lines (the real diagnostics), collapse to one entry each:
+ Deelnemingen → firm 6056
+ expenses → firm 1355
…
Run phase — wrap each batch in an Actions log group so it folds up and reads as one parallel run:
echo "::group::firm ${FIRM_ID} · ${TEMPLATE_TYPE} · ${#BATCH[@]} in parallel"
# … run + per-template PASSED/FAILED …
echo "::endgroup::"renders as:
▸ firm 6056 · accountTemplate · 1 in parallel
Deelnemingen: PASSED
▸ firm 542 · reconciliationText · 4 in parallel
2018_204_3_writedowns_provisions: PASSED
2018_274_APT_8: PASSED
liquidation_reserve: PASSED
2018_275_A_liquidationreserve: FAILED
End — one aligned summary keyed on the two important columns:
─ Results ──────────────────────────────────
PASS Deelnemingen firm 6056
PASS vol_10 firm 6056
FAIL vkt_5 firm 6056
FAIL 2018_275_A_liquidationreserve firm 542
─────────────────────────────────────────────
2 failed, 11 passed
The queue loop (firm resolution) and the run loop (results) are already separate, so this is mostly swapping echo lines, wrapping the batch loop in ::group::/::endgroup::, and adding a final loop over the collected results. (Spinner clutter is orthogonal — CLI-side, see the follow-up note.)
michieldegezelle
left a comment
There was a problem hiding this comment.
Potential improvements — silverfin-cli follow-ups (separate repo, not this PR)
Two CLI changes would let this workflow drop its output-scrubbing and stop mis-reporting:
- Spinner in non-interactive CI.
lib/cli/spinner.jsspin()starts unconditionally; in CI it animates viareadline.clearLine/cursor escapes on a 120 ms interval — the interleaved noise this workflow strips withtr '\r' '\n'+ an ANSIsed. One-line guard makes it a no-op off-TTY:spin(text = "…") { if (this.running || !process.stdout.isTTY) return; /* … */ }
- Distinct status for test-less templates.
runTestsStatusOnlyreturnsFAILEDwhenbuildTestParamsbails on an empty YAML (there are no tests stored in the YAML file). EmittingSKIPPED/NO_TESTSinstead would let every consumer treat "no tests yet" as not-a-failure (see the inline empty-YAML note).
What this does
Two things:
The silverfin-cli can already run several templates in a single
run-test --statuscall (it runs them at the same time viaPromise.all). The workflow wasn't using that - it called the CLI once per template in a bash loop. Now we group the changed templates and run each group in one call.Account templates (new)
Before this PR only
reconciliation_textswere tested. The CLI can also testaccount_templates(--account-template), and a repo likebe_markethas 20 of them that were never run in CI.This was confirmed on the current
@mainworkflow: see silverfin/be_market#2950, where a PR that changes only an account template skips the test job entirely - the template isn't even detected (the old filter only matchesreconciliation_texts|shared_parts, and the old test step only ranreconciliation_texts).This PR tests them the same way as reconciliations. The CLI takes one firm and one template type per call, so templates are grouped by
<firm>|<type>and each group is one call. Account template folder names contain spaces and characters like& ( ) ', so the whole path is made space-safe: the changed-files list is passed around as JSON (json: true,escape_json: false), config reads are quoted, and the output is matched by exact template name instead of a regex.Bug we found and fixed
The old workflow resolved a firm for every changed template, including shared parts. Shared parts have
.idas a plain string (not a firm -> id map), so a shared-parts-only PR hit ajqerror and aborted the whole test step (it runs underset -eo pipefail). This PR decides the template type first and skips anything without liquid tests before resolving a firm, so that can't happen anymore.Details
test_firm_id->SF_TEST_FIRM_ID-> first key of.id).MAX_PARALLEL_TESTS(default 10), batches run one after another, so no more than that many tests run at once (the CLI has no rate limiting).run-test --statusexits 0 even when a test fails, so we read the<name>: PASSED/FAILEDlines from stdout and fail the job if any failed. In CI consola prefixes those lines with[log]and the spinner interleaves frames, so we normalise the output before reading it.How to test (for reviewers)
The reusable workflow can't run on its own - a template repo has to call it. To test this branch end to end:
be_market), point the ref at this branch in.github/workflows/run_tests.yml:uses: silverfin/bso_github_actions/.github/workflows/run_tests.yml@run_tests_in_parallel{% comment %} test {% endcomment %}line in theirmain.liquid. Touch both reconciliations and account templates to cover both types.no-test-requiredlabel so the unrelatedcheck_testscheck stays green.Running batch for firm 542 / reconciliationText (3 in parallel): ...and one
<name>: passedper template.Before / after (live runs on be_market)
Before - current
@main: silverfin/be_market#2950 - changes only an account template. Thetest-templatesjob is skipped; account templates are not detected or tested today.After - this branch: silverfin/be_market#2949 - last green run tested both types at once:
Deelnemingen(firm 6056),Maaltijdcheques(firm 1355) - passedFailure reporting was also confirmed during validation: an account template with a space in its name (
Rekening Courant) was correctly detected, grouped, and reported asfailed.Follow-ups (not in this PR)
Promise.all, and isolating a single hard error instead ofprocess.exit(1)killing the whole batch.