Performance: stamp CodeVitals points with commit time and dedup re-tested commits#50046
Open
LiamSarsfield wants to merge 10 commits into
Open
Performance: stamp CodeVitals points with commit time and dedup re-tested commits#50046LiamSarsfield wants to merge 10 commits into
LiamSarsfield wants to merge 10 commits into
Conversation
Contributor
|
Thank you for your PR! When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:
This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖 Follow this PR Review Process:
If you have questions about anything, reach out in #jetpack-developers for guidance! |
dee9d21 to
bdf80f6
Compare
…sted commits Two FORMS-705 loop-integrity items not covered by the FORMS-713 PR (#49999): - Commit-time timestamp: the poster stamped Date.now(), but CodeVitals orders a trend by the posted timestamp and the Scheduler reads the latest point to decide 'last tested'. run-performance-tests.js now captures the plugin HEAD commit time (git show -s --format=%ct), measure-lcp.js carries it in results.git.timestamp, and post-to-codevitals.js stamps it (resolvePostTimestamp), falling back to build time with a warning only when no commit time is available. - Cross-commit dedup: before a live post, the poster queries the same gitaudit evolution endpoint the Scheduler reads (metric 58) and skips if the hash already has a point, so a re-run / retryBuild / double-trigger can't append a duplicate to the append-only store. Fails open (a flaky read never blocks a post) and is gated after the dry-run return, so the token-free CI smoke test still makes no network call. Configurable via CODEVITALS_DEDUP_URL / CODEVITALS_REPO / CODEVITALS_DEDUP_METRIC_ID; disable with --no-dedup / CODEVITALS_SKIP_DEDUP. Adds 10 unit tests (52 total). The remaining FORMS-705 items are TeamCity pipeline config, not repo code; docs/teamcity-codevitals-runbook.md gives step-by-step UI instructions for them, including the GATE-1 read/write host reconciliation that the dedup read depends on.
…rification
Follow-ups from a four-agent verification pass of the timestamp/dedup change
(no correctness bugs were found; these close low-severity gaps):
- Make hashAlreadyPosted fail-open total: build the evolution URL inside the try,
so even a pathological branch name (encodeURIComponent throw) fails open instead
of escaping as an exit-1 transport error.
- Add 4 tests (56 total): dedup endpoint identity (metric 58 / repo / branch / limit
in the URL — catches a wrong metric silently no-op'ing), unknown-hash bypass,
fail-open on an unexpected non-{data:[]} body, and a dry run making no dedup read
even with dedup fully configured (pins the dry-run-before-dedup ordering).
- Runbook: state outright that retryBuild can double-post until GATE-1 (dedup is
inert until the read/write hosts are reconciled); note CALIBRATION_PASSES defaults
to 9 in the script vs 5 inline (set it explicitly) and mention exposing ITERATIONS;
warn that the inline PLAYWRIGHT_BROWSERS_PATH export overrides the caching param.
…-GATE-1 Keep the 15s abort timer armed across response.json() so a dedup endpoint that sends headers then stalls the body aborts and fails open at 15s, rather than falling through to undici's ~300s default and delaying the post. Add a fail-open test for a rejecting body read; tidy the catch to error?.message. Flip dedup from opt-out to opt-in in main() (--dedup / CODEVITALS_ENABLE_DEDUP, explicit --no-dedup / CODEVITALS_SKIP_DEDUP still wins). Until GATE-1 proves the gitaudit read series (metric 58) is the codevitals.run write series, a default-on dedup could wrongly skip a real post on the append-only, no-rollback store; off by default removes that go-live risk. Function-level guard unchanged.
Replace the synchronous body-reject test (which would stay green if the body-read fix were reverted) with a real signal-driven abort test: json() hangs until the AbortController fires, dedupTimeoutMs is now injectable, and a per-test timeout turns a reverted fix into a failure instead of a hang. Extract the dedup-enable decision into a pure resolveDedupEnabled(argv, env) helper and cover the argv/env truth table (default off, opt-in, opt-out always wins) — the GATE-1 default-off safety was previously untested through main(). Correct the hashAlreadyPosted docstring: "never wrongly skip" holds only while dedup is off; if enabled before GATE-1, metric 58 is a different populated series and a coincidental hash match could wrongly skip. Warn on the unexpected-200-shape fail-open path so schema drift is visible at go-live.
Cover the headline feature's producer side: parameterize/export getGitInfo and add deterministic temp-git-repo tests for the %ct read (epoch seconds × 1000), Upstream-Ref-wins-over-mirror-hash selection, and the non-git → unknown/null fallback. This is the load-bearing commit-time mechanism, previously exercised only through resolvePostTimestamp with synthetic inputs. Treat GIT_COMMIT + GIT_COMMIT_TIMESTAMP_MS as a paired override: don't clobber a caller-supplied timestamp with HEAD's %ct. Harden the round-2 additions: widen the body-read abort guard's timeout budget (~400x the abort) so a cold `node --test` warmup can't false-fail correct code while a genuine hang still trips it; String()-coerce resolveDedupEnabled's env read now that the helper is exported; tighten the unexpected-shape warn comment.
…o a plausible epoch-ms window Round-4 review convergence: two cross-vendor findings on the commit-timestamp path, both cheap and both serving the append-only/no-rollback integrity spine. - run-performance-tests.js: extract resolveCommitTimestampEnv(gitInfo, env) (pure, exported), called before GIT_COMMIT is overwritten so it sees the caller's original override intent. GIT_COMMIT + GIT_COMMIT_TIMESTAMP_MS are a paired override; a lone GIT_COMMIT_TIMESTAMP_MS is now dropped in favour of HEAD time (it can no longer stamp HEAD's hash with an unrelated time), and an unpaired hash override warns about the provenance split. The winning source is logged, so the choice is no longer silent. Fixes the round-3 don't-clobber regression (lone timestamp silently trusted). - post-to-codevitals.js: resolvePostTimestamp now bounds the timestamp to a wide plausible epoch-ms window (~2001-2100). A 10-digit epoch-seconds value (would post as 1970) and micro/nanosecond magnitudes fall back to build time with a warning instead of permanently backdating the append-only trend. - Tests (+2, 64/64): a 7-case resolveCommitTimestampEnv truth-table (paired wins, lone dropped, empty-string absent, unpaired hash warns, plain HEAD, null fallbacks) and a resolvePostTimestamp unit-error case (epoch seconds, micro, window edge). ESLint clean. Deferred (diminishing returns): dedup-masks-invalid-URL (unreachable while dedup ships OFF; fold into GATE-1 enablement), POST-path unbounded body read (pre-existing, build-hang only), multi-metric dedup guard (future-facing).
…entrypoint Round-6 review: the R4/R5 runner fix established 'never trust a lone GIT_COMMIT_TIMESTAMP_MS', but only on the runner path. The poster's own documented entrypoints (pnpm report / report:dry) read process.env.GIT_COMMIT_TIMESTAMP_MS straight into config, so a direct/manual run could stamp a results-file hash with a stale inherited timestamp and backdate the append-only trend. - post-to-codevitals.js: extract a pure, exported pairedCommitTimestampMs(env) (mirrors resolveDedupEnabled). The env commit timestamp is honored only when paired with a GIT_COMMIT hash override; a lone value is dropped so resolvePostTimestamp falls back to results.git.timestamp (the runner-produced, provenance-matched value) or build time. The runner always sets GIT_COMMIT before spawning the poster child, so its env handoff is unaffected — this only closes the direct pnpm report path. - Tests (+1, 65/65): paired → value carried, lone → dropped, neither → undefined. ESLint clean. Deferred (diminishing returns): POST-path unbounded body read (pre-existing, build-hang only, bounded by CI timeout); dedup-masks-invalid-URL (unreachable while dedup ships OFF; fold into GATE-1 enablement); Upstream-Ref hash/timestamp provenance split (un-fixable in code → mirror %ct landing-order go-live precondition).
…annel Round-7 review (both vendors): the R4-R6 fixes paired the commit timestamp in the runner-env and poster-config channels, but left the dominant one unpaired. measure-lcp.js wrote results.git.timestamp from an unpaired GIT_COMMIT_TIMESTAMP_MS, and resolvePostTimestamp PREFERS results.git.timestamp over the poster's guarded config fallback — so 'GIT_COMMIT_TIMESTAMP_MS=<stale> pnpm measure && pnpm report' backdated the append-only trend, bypassing pairedCommitTimestampMs entirely (empirically proven). The R6 comment calling that value 'provenance-matched' was false. - measure-lcp.js: extract an exported pure resolveResultsGit(env) that routes the timestamp through the already-tested pairedCommitTimestampMs, so the pairing rule is single-sourced across all three channels. A lone GIT_COMMIT_TIMESTAMP_MS is dropped before it can reach results.git.timestamp. Guard main() with isDirectInvocation (the pattern post-to-codevitals.js and run-performance-tests.js already use) so the module is importable for tests — measure-lcp.js's first unit coverage. - post-to-codevitals.js: correct the pairedCommitTimestampMs comment — results.git.timestamp is provenance-matched because it is itself paired at the measure-lcp write site. - Tests (+1, 66/66): resolveResultsGit truth table (paired -> both written, lone -> time dropped, hash-only -> time omitted, non-numeric -> dropped). Revert-proven: restoring the unpaired read fails the test. ESLint clean. The paired-timestamp invariant now holds end-to-end (measure -> runner -> poster). Deferred (diminishing returns): unified payload provenance resolver (residual is a contrived hand-crafted results-file case); dedup-masks-invalid-URL (unreachable while dedup ships OFF; GATE-1 enablement); live POST body-read bound (pre-existing, build-hang only); Upstream-Ref hash/timestamp split (mirror %ct landing-order go-live precondition).
…timeout The live POST cleared its 30s abort timer immediately after fetch() resolved, leaving response.text() (non-OK) and response.json() (OK) body reads unbounded. A server that sent headers then stalled the body could hang the poster past the intended timeout (undici's ~300s default at best) — the exact hang the dedup read already guards against, and whose comment names this live-POST path as the still-unbounded one. Keep one abort timer armed across the whole response and clear it in a single finally. An aborted json() is re-thrown as-is so the outer catch reports a clean timeout rather than a misleading "invalid JSON". Timeout is configurable (postTimeoutMs, default 30s) so the new body-stall regression tests can use a short bound; both (OK json + non-OK text) are revert-proven to hang without the fix.
…the CodeVitals poster The dedup code referred to the read/write backend-coupling precondition as "GATE-1", a codename from an internal runbook with no referent for a reader of this repo. Keep the safety rationale (dedup reads gitaudit/metric 58 while posts write to codevitals.run, so it must stay off until those are the same store) and restate it in plain terms in the canonical note, its cross-references, the CLI dry-run output, and one test comment. No behaviour change.
ea5331d to
601c17e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Tracked in Linear: FORMS-705 / FORMS-696. Builds on #49999 (FORMS-713), now merged to trunk.
Proposed changes
Two fixes to the internal
tools/performanceCodeVitals poster, both keeping the append-only metrics trend correct. No shipped Jetpack code changes.Stamp each point with the commit's time. CodeVitals orders a trend by the posted timestamp, so using
Date.now()scrambles the order whenever a commit is measured later than it landed. The runner reads the plugin's HEAD commit time (git show -s --format=%ct) and carries it through to the poster, which stamps that. It falls back to build time (with a warning) only when no commit time is available, and rejects implausible values.Skip the post when the commit was already measured. Re-runs, retried builds, and double-triggers would otherwise append duplicate points to a store that has no rollback. The poster checks whether the commit's hash already has a point and skips the post if it does. It fails open: a flaky check never blocks a legitimate post. The check runs only on the live path, so the dry-run CI check still makes no network calls. Dedup ships off by default, opt-in per environment.
The duplicate check reads from a different backend than the poster writes to, so until those point at the same store it finds no matches. That makes it safe but inert today: it can miss a duplicate, but it never skips a real commit's post. Enable it once the read and write backends are reconciled.
Adds unit tests for the timestamp resolver and every dedup path (skip, proceed, fail-open, disabled, dry-run): 68 in total, all runnable with no Docker, token, or network.
Related product discussion/links
Does this pull request change what data or activity we track or use?
No. It changes only the internal
tools/performanceCodeVitals tooling: it corrects the timestamp posted with each metric and avoids duplicate points. No new data is collected.Testing instructions
From
tools/performance, runpnpm test:unit. 68 tests pass with no Docker, token, or network, and cover the timestamp resolver and every dedup path.Timestamp, without Docker.
pnpm report:dryreads a measurement file thatpnpm measure/pnpm testproduces under Docker. A fresh checkout has none, so run it (manually) at a small fixture withRESULTS_PATH:The payload carries
"timestamp": 1735660800000(the commit time), not the current time. Remove thetimestampfield and the dry run warns and falls back to build time.