Skip to content

ci: Phase 1 low-risk CI wall-time optimizations#941

Merged
IEvangelist merged 2 commits into
mainfrom
dapine/ci-optimization
May 13, 2026
Merged

ci: Phase 1 low-risk CI wall-time optimizations#941
IEvangelist merged 2 commits into
mainfrom
dapine/ci-optimization

Conversation

@IEvangelist
Copy link
Copy Markdown
Member

Summary

Phase 1 of CI wall-time optimizations — all changes are low-risk with no behavioural trade-offs. Each item is independently revertable. Background research and tradeoffs are documented in the commit message.

What's in this PR

Frontend / Vitest

  • Consolidate test:unit — was 10 sequential vitest run calls, each paying full Vite + TS-transform cold-start cost. Now a single invocation that picks up the same files via vitest.config.ts's include pattern. Individual test:unit:* scripts kept for local granular runs.
  • Vitest pool: 'threads'worker_threads has lower spawn overhead than the default forks pool on Linux. Tests are environment: 'node' with no native addons or thread-unsafe APIs.
  • Cache node_modules/.astro keyed on content + astro.config.mjs so Astro's content layer doesn't re-hydrate from scratch when MDX hasn't changed.
  • pnpm install --prefer-offline — skips registry HEAD probes when the setup-node pnpm-store cache hits. Falls back to network on misses.
  • compression-level: 0 on Playwright artifact uploads — reports contain PNGs, webm videos, and traces that are already compressed.

.NET

  • DOTNET_NOLOGO=1 and DOTNET_CLI_TELEMETRY_OPTOUT=1 on both apphost-build.yml and tools-tests.yml.
  • tools-tests.yml explicit build + --no-builddotnet test was implicitly rebuilding each project. Now we restore once, build once, and run dotnet test --no-build --no-restore.

GitHub Actions infra

  • concurrency group on ci.yml with cancel-in-progress: ${{ github.event_name == 'pull_request' }}. Stacked PR runs cancel after rebases; pushes to main/release/* are never cancelled.
  • changes job uses fetch-depth: 1 plus a targeted git fetch --depth=1 of the PR base and head SHAs. frontend-build.yml keeps fetch-depth: 0 because Lunaria still requires full history.

Verified locally

  • pnpm test:unit runs all 159 unit tests across 17 files in 5.5 s with the new consolidated invocation + threads pool.
  • YAML for all four workflows parses cleanly.

Explicitly NOT recommended (researched and verified)

Listed here so reviewers don't ask:

  • NODE_OPTIONS=--max-old-space-size=* — Node 22+ auto-sizes heap to ~50% of RAM; manual flags are cargo cult on 16 GB runners.
  • pnpm --reporter=silent — already silent in non-TTY CI.
  • DOTNET_SKIP_FIRST_TIME_EXPERIENCE — deprecated since .NET Core 3.0.
  • build:skip-search — Pagefind IS exercised by site-search.spec.ts and ts-api-search.spec.ts.
  • Vitest isolate: falseapi-markdown.vitest.test.ts uses vi.mock; state-bleed risk isn't worth the marginal win.
  • Caching ~/.cache/ms-playwright — Playwright's own docs advise against it.

Follow-ups (not in this PR)

Two higher-reward changes that trade off discipline or runner minutes; I'll open separate PRs once we agree:

  • NuGet caching with packages.lock.json — adds dependency-bump discipline.
  • Shard Playwright across viewport projects (desktop-chromium, tablet-chromium, mobile-chromium) — 3× runner minutes for the E2E phase in exchange for ~60% wall-time reduction.

Copilot AI review requested due to automatic review settings May 13, 2026 20:21
Reduces CI time on every PR with no behavioural trade-offs. Each change is
independent and can be reverted in isolation.

Frontend / Vitest:
- Consolidate test:unit into a single `vitest run` call (was 10 sequential
  invocations, each paying full Vite + TS-transform cold-start cost).
- Switch Vitest to `pool: 'threads'` (Linux + `environment: 'node'`;
  no native addons, no thread-unsafe APIs in tests).
- Cache `node_modules/.astro` (Astro content-layer) keyed on content +
  `astro.config.mjs`.
- Add `--prefer-offline` to `pnpm install` to skip registry HEAD
  probes when the setup-node pnpm-store cache hits.
- Set `compression-level: 0` on Playwright artifact uploads (HTML
  report, raw results) since the payloads are already compressed
  (png/webm/trace).

.NET:
- Set `DOTNET_NOLOGO=1` and `DOTNET_CLI_TELEMETRY_OPTOUT=1` on both
  `apphost-build.yml` and `tools-tests.yml`.
- `tools-tests.yml`: add explicit `dotnet build --no-restore` then
  run `dotnet test --no-build --no-restore` so MSBuild does not
  implicitly rebuild during the test step.

GitHub Actions infra:
- Add `concurrency` group to `ci.yml` so superseded PR commits cancel
  in-flight runs while main/release pushes keep running.
- `changes` job now uses `fetch-depth: 1` plus a targeted
  `git fetch --depth=1` for the PR base + head SHAs rather than
  a full-history clone. `frontend-build.yml` keeps `fetch-depth: 0`
  because Lunaria still requires the full git history.

Verified locally:
- `pnpm test:unit` passes all 159 unit tests across 17 files in 5.5s
  with the new consolidated invocation + threads pool.
- YAML for all four workflows parses cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@IEvangelist IEvangelist force-pushed the dapine/ci-optimization branch from d0c124e to 6762e04 Compare May 13, 2026 20:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes CI wall time by reducing redundant test startup/build work and adding workflow-level caching and runner settings without changing product behavior.

Changes:

  • Consolidates frontend unit tests into one Vitest invocation and switches Vitest to the threads pool.
  • Adds frontend dependency/cache/artifact upload optimizations.
  • Updates .NET workflows and CI orchestration to reduce repeated work and cancel superseded PR runs.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/frontend/vitest.config.ts Sets Vitest to use the threads worker pool.
src/frontend/package.json Consolidates test:unit into one Vitest run.
.github/workflows/tools-tests.yml Adds .NET CLI env flags and separates build from test with --no-build.
.github/workflows/frontend-build.yml Adds pnpm --prefer-offline, Astro content cache, and uncompressed artifact uploads.
.github/workflows/ci.yml Adds workflow concurrency and shallow PR diff fetches.
.github/workflows/apphost-build.yml Adds .NET CLI env flags for AppHost build.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/ci.yml Outdated
Address PR review feedback: the previous `${{ github.workflow }}-${{ github.ref }}`
group was shared across consecutive pushes to the same branch. GitHub Actions
concurrency cancels older *pending* runs in a group even when
`cancel-in-progress: false`, so rapid pushes to `main`/`release/*` could
silently drop a required CI run.

New group key:
  ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}

- pull_request: group keyed on PR number — new commits cancel in-flight
  runs for the same PR (intended behaviour).
- push: group includes `github.run_id` so every push gets a unique group;
  pushes never cancel or queue against each other.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@IEvangelist IEvangelist enabled auto-merge (squash) May 13, 2026 20:34
Comment thread .github/workflows/ci.yml
# `cancel-in-progress: false`, which would silently drop required CI on
# `main`.)
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, why isn't this the default behavior? I don't understand this change.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Two things going on:

Why GHA doesn't do this by default: concurrency control isn't universally safe — release pipelines, audit jobs, and workflows with external side-effects need every commit to run to completion. So GHA's default is "run everything in parallel" and concurrency: is the opt-in switch.

What this specific block does:

  • PRs — key includes github.event.pull_request.number + cancel-in-progress: true. New commits to a PR cancel the older in-flight run for that same PR. Saves runner-minutes on rapid fixup / rebase pushes.
  • Pushes (main / release/*) — key falls through to github.run_id, which is unique per run, so each push effectively gets its own group. Combined with cancel-in-progress: false, push runs neither cancel each other nor queue against each other. Every commit to main always gets its own CI run.

The reason this is split rather than just using a single shared group with cancel-in-progress: false is the bug the bot caught above: when runs share a group, GHA cancels older pending runs regardless of cancel-in-progress. Giving push events a unique group sidesteps that entirely.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New commits to a PR cancel the older in-flight run for that same PR. Saves runner-minutes on rapid fixup / rebase pushes.

I thought that is the normal "default" behavior...

Comment thread .github/workflows/frontend-build.yml
@IEvangelist IEvangelist merged commit 98b8fda into main May 13, 2026
6 checks passed
@IEvangelist IEvangelist deleted the dapine/ci-optimization branch May 13, 2026 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants