fix: stream replay suite progress#795

Open
thymikee wants to merge 2 commits into main from fix/replay-suite-progress-cancel

@thymikee

Member

Summary

Stream bounded replay-suite progress for agent-device test --maestro: suite start, per-file start/pass/fail/skip, attempts/retries, session names, and artifact paths now appear while the suite is still running. Ctrl-C/client disconnect now cancels parent replay work and best-effort aborts active iOS runner sessions so simulator-driving work does not continue detached.

Also fixes two iOS Maestro runtime issues found while validating the React Navigation suite: custom-scheme simulator deep links no longer pre-launch stale app state before simctl openurl, and horizontal percentage swipes avoid the iOS interactive-back edge. Scope: 17 files. Closes #794.
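
The streaming and cancellation behavior described in this summary can be pictured with a minimal sketch. Everything here is hypothetical and written only for illustration: the event names, the `runSuite` function, and the `maxAttempts` parameter are assumptions, not the PR's actual types or code. The idea is that each file emits start/pass/fail events as it runs, and the cancel flag is checked before every attempt so retries do not continue after the parent request is gone.

```typescript
// Hypothetical event shape; the PR's real progress types are not shown in this thread.
type SuiteProgressEvent =
  | { type: "suite-start"; total: number }
  | { type: "file-start"; file: string; attempt: number }
  | { type: "file-pass"; file: string; attempt: number }
  | { type: "file-fail"; file: string; attempt: number }
  | { type: "suite-canceled"; completed: number };

// Minimal cancel handle; a standard AbortSignal satisfies this shape structurally.
interface CancelState {
  readonly aborted: boolean;
}

// Runs one file for one attempt and reports whether it passed (assumed synchronous here).
type RunFile = (file: string, attempt: number) => boolean;

function runSuite(
  files: string[],
  runFile: RunFile,
  emit: (event: SuiteProgressEvent) => void,
  cancel: CancelState,
  maxAttempts = 2,
): void {
  emit({ type: "suite-start", total: files.length });
  let completed = 0;
  for (const file of files) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      // Checked before every attempt, so a retry never starts after cancellation.
      if (cancel.aborted) {
        emit({ type: "suite-canceled", completed });
        return;
      }
      emit({ type: "file-start", file, attempt });
      const passed = runFile(file, attempt);
      emit(
        passed
          ? { type: "file-pass", file, attempt }
          : { type: "file-fail", file, attempt },
      );
      if (passed) break;
    }
    completed++;
  }
}
```

In a CLI, the `cancel` argument could be the `signal` of an `AbortController` that a SIGINT handler aborts; that wiring is left out to keep the sketch small.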

Validation

Focused tests passed: progress/client/session-test/cancellation/Maestro runtime/iOS open tests, 9 files and 230 tests. pnpm format, pnpm check:quick, pnpm build, and node --test test/integration/smoke-*.test.ts passed.

Live iOS simulator validation on iPhone 17 Pro (C25DBB5B-9254-4293-A8D5-2785C78DE03A): the full React Navigation Maestro suite finished with 38 passed, 0 failed in 532.0s, with streaming progress and artifact/session paths. Ctrl-C verification exited immediately with code 130, and a follow-up one-file Maestro run on the same simulator passed in 14.9s, showing no active simulator-driving work was left blocking the device.

Local pnpm check:unit wrapper could not run because pnpm 11.1.2 signature verification/fetch is blocked in this environment; the direct underlying unit command also has sandbox/environment failures unrelated to this change (localhost listen EPERM, Swift typecheck, Android helper/upload/materialization suites).

@github-actions

github-actions Bot commented Jun 12, 2026

Size Report

| Metric | Base | Current | Diff |
| --- | --- | --- | --- |
| JS raw | 1.3 MB | 1.3 MB | +2.8 kB |
| JS gzip | 410.0 kB | 411.0 kB | +969 B |
| npm tarball | 543.1 kB | 544.0 kB | +953 B |
| npm unpacked | 1.8 MB | 1.8 MB | +2.8 kB |

Startup median (7 runs, lower is better):

| Scenario | Base | Current | Diff |
| --- | --- | --- | --- |
| CLI --version | 27.8 ms | 29.5 ms | +1.6 ms |
| CLI --help | 52.8 ms | 55.0 ms | +2.2 ms |

Top changed chunks:

| Chunk | Raw diff | Gzip diff |
| --- | --- | --- |
| dist/src/session.js | +1.5 kB | +467 B |
| dist/src/9542.js | +750 B | +290 B |
| dist/src/2415.js | +645 B | +226 B |
| dist/src/apps.js | +29 B | +15 B |

@thymikee

Member Author

Android validation update:

  • Booted AVD Pixel_9_Pro_XL_API_37 as emulator-5554 and installed React Navigation example with ./gradlew installDebug -PreactNativeDevServerPort=8081.
  • Configured Metro reverse for tcp:8081 and tcp:8082.
  • Smoke Android Maestro subset passed with streaming progress: bottom-tabs.yml + stack-basic.yml, 2 passed in 64.6s.
  • Full Android Maestro suite streamed bounded progress throughout and completed 36 passed, 2 failed in 1185.1s.
  • The two failures reproduce in isolation, so they appear Android-specific/pre-existing rather than caused by this PR's progress/cancellation changes:
    • drawer-master-detail.yml: launch gate fails after retries with uiautomator dump did not return XML while waiting for Pages.
    • screen-layout.yml: Suspend remains visible after tapping it; final assertNotVisible Suspend fails.

Artifacts:

  • Full run: /private/tmp/agent-device-794-android-full/1b9ab21f1d75300e
  • Isolated failure rerun: /private/tmp/agent-device-794-android-failures/0a0aee53a77a6fe9

@thymikee

Member Author

CI update: all checks passed except Fallow Code Quality. Please address before merge readiness.

Fallow findings:

  • src/daemon/handlers/session-test.ts: runReplayTestSuite at line 31 is now 94 LOC, cyclomatic 24, cognitive 9.
  • src/cli-test-progress.ts: formatReplayTestCaseProgressEvent at line 23 is cyclomatic 10, cognitive 13, CRAP 31.6.

Keep the live iOS/Android validation evidence in the PR. Fix should be targeted: extract focused helpers where it improves locality, keep progress output bounded/agent-friendly, preserve JSON behavior, and rerun pnpm check:fallow --base origin/main plus the focused tests already used. Also make sure the cancellation path remains covered and does not regress the Ctrl-C live behavior.

@thymikee

Member Author

Addressed the review blockers in dd3f6698b.

What changed:

  • Fallow findings fixed by splitting runReplayTestSuite setup into focused helpers and reducing formatReplayTestCaseProgressEvent branching.
  • HTTP disconnect cancellation now arms only after auth succeeds.
  • The iOS runner abort loop now follows request cancellation state, not res.writableFinished, so streaming disconnects stop aborting as soon as handler cleanup clears the request.
  • Disconnect-triggered iOS runner abort is scoped to explicit iOS requests or replay/test requests, so Android-only RPC disconnects do not trigger global iOS abort.
  • Unknown future progress event types are ignored again instead of failing older clients.
  • Replay test retry loop now stops when the parent request is canceled.
  • Removed the no-op announceReplayTestRun path and renamed the misleading HTTP NDJSON progress test.
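
Two of the items above (less branching in the progress formatter, and ignoring unknown future event types) fit a table-driven pattern. The sketch below is a hypothetical illustration, not the code in src/cli-test-progress.ts: the event names, the NDJSON line shape, and `formatProgressLine` are all assumptions. A lookup table replaces a chain of conditionals, and any event type missing from the table yields `undefined` instead of an error, so an older client keeps working when a newer daemon adds event types.

```typescript
// Hypothetical wire format: one JSON object per line, carrying a string "type" field.
type RawEvent = Record<string, unknown>;
type Formatter = (event: RawEvent) => string;

// One entry per known event type. A Map avoids accidental hits on Object.prototype keys.
const formatters = new Map<string, Formatter>([
  ["file-start", (e) => `START ${String(e.file)} (attempt ${String(e.attempt)})`],
  ["file-pass", (e) => `PASS ${String(e.file)}`],
  ["file-fail", (e) => `FAIL ${String(e.file)}`],
]);

// Returns a printable line, or undefined when the line should be skipped silently.
function formatProgressLine(line: string): string | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return undefined; // malformed line: skip rather than crash the stream
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const event = parsed as RawEvent;
  if (typeof event.type !== "string") return undefined;
  const format = formatters.get(event.type);
  // Unknown (possibly future) event types are ignored instead of treated as failures.
  return format ? format(event) : undefined;
}
```

Adding a new event type then means adding one table entry, which keeps the formatter's cyclomatic complexity flat as the protocol grows.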

Regression timing/root cause:

  • The streaming-progress silence/cancel behavior regressed when replay suites moved to long-running daemon-side Maestro execution without per-test progress and without complete parent-request cancellation propagation. This PR fixes that path.
  • The Android helper slowdown observed while validating was separate: persistent Android snapshot-helper sessions were introduced in d087a62d1 (PR #612, "fix: improve Maestro Android reliability and snapshot speed") on 2026-05-29 with a session socket budget of timeoutMs + 2s. Busy RN accessibility captures could exceed that budget, causing a premature session timeout and a slow one-shot fallback. That is handled separately in PR #796 ("fix: relax Android snapshot helper session timeout").
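
The timeout arithmetic behind the Android slowdown can be made concrete with a small sketch. The function names and the example durations are hypothetical; only the "timeoutMs + 2s" budget comes from the description above. A capture that takes longer than the budget would time out the session and fall back to the slower one-shot path.

```typescript
// The "+ 2s" grace from the description above; the constant name is illustrative.
const SESSION_SOCKET_GRACE_MS = 2_000;

// Total time the session socket waits before giving up on a capture.
function sessionSocketBudgetMs(timeoutMs: number): number {
  return timeoutMs + SESSION_SOCKET_GRACE_MS;
}

// Classifies a capture by whether it fits inside the session budget.
function captureOutcome(
  captureMs: number,
  timeoutMs: number,
): "session" | "one-shot-fallback" {
  return captureMs <= sessionSocketBudgetMs(timeoutMs)
    ? "session"
    : "one-shot-fallback";
}
```

For example, with an assumed timeoutMs of 8000 the budget is 10000 ms, so a 10500 ms capture would land on the fallback path.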

Validation:

  • pnpm exec vitest run src/__tests__/cli-test-progress.test.ts src/daemon/handlers/__tests__/session-test-suite.test.ts src/utils/__tests__/daemon-client.test.ts passed: 43 tests.
  • pnpm exec vitest run test/integration/provider-scenarios/daemon-http-server.test.ts passed: 7 tests.
  • pnpm check:quick passed.
  • pnpm check:fallow --base origin/main passed.
  • node --test test/integration/smoke-*.test.ts passed: 8 tests.

Note: direct ./node_modules/.bin/vitest run --project unit is not a clean signal in this sandbox; unrelated tests fail on loopback bind EPERM / Swift typecheck / daemon entrypoint environment. The impacted focused tests above pass.

Development

Successfully merging this pull request may close these issues.

fix(test): stream replay suite progress and cancel active runs