Skip to content

test-stability: eliminate timeout flakiness in the web test suite (zero-flake requirement) #1596

Description

@cliffhall

Background

During the Wave-1 smoke/audit sweep, multiple independent runs of clients/web npm run test:coverage intermittently failed — anywhere from 2 to 12 failures per run — in a recurring set of unrelated files:

  • ServerConfigModal.test.tsx
  • InspectorView.test.tsx (localStorage persistence / History)
  • PromptArgumentsForm.test.tsx
  • ServerImportJsonModal.test.tsx / ServerImportConfigModal.test.tsx
  • useServers.test.tsx
  • ResourcesScreen.test.tsx
  • inspectorClient integration message-matching test

Every failure clustered at the default 5s waitFor/userEvent timeout under v8-instrumented load; re-running unloaded, in isolation, or with --testTimeout=30000 passes clean. These are environmental timing flakes, not logic bugs — but they suppress the coverage table (forcing scoped reruns), erode the CI signal, and violate the project's zero-flake bar.

Two adjacent tooling papercuts observed in the same sweep (fold in if cheap):

  • Passing any --coverage.* CLI override crashes the Storybook addon-vitest plugin (Cannot create property … on boolean 'true'); only the bare --coverage boolean works.
  • npx vitest --coverage pulls in the Storybook project and errors; the project's own scoped bin is required.

Requirement

Zero test-flakiness — the web suite (test, test:coverage, test:integration) must pass deterministically and repeatedly, including under concurrent load and v8 instrumentation. Do not merely raise a global timeout to hide the races.

Scope

  • Root-cause each flaky test and make it deterministic: use fake timers with explicit advancement where timers drive the assertion; give userEvent an advanceTimers/{delay:null} setup where it interacts with fake timers; replace arbitrary sleeps and wall-clock dependence with awaited conditions; scope justified per-test timeouts only where a legitimately slow async settle needs it.
  • Eliminate any reliance on real-timer debounce windows in the modals/forms listed above.
  • Investigate the inspectorClient integration message-matching race and stabilize it.
  • Address the two Storybook/vitest --coverage tooling papercuts if within reach.

Acceptance

  • npm run validate (web), npm run test:coverage (web), and npm run test:integration (web) each pass green on N consecutive full-suite runs (target N ≥ 5), including at least one run under simulated load, with zero intermittent failures.
  • No test was disabled/skipped to achieve this; fixes are deterministic, not timeout-inflation band-aids.
  • Per-file coverage gate still holds (≥90 on all four dimensions) for any files touched.

Refs: tracking #1579.

Metadata

Metadata

Assignees

Labels

v2Issues and PRs for v2

Type

No type

Fields

No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions