fix: improve daemon diagnostics and remove compact snapshots #786
Review: one P1 (docs governance), two P2s; behavior verified good live
Reviewed both halves, with live validation of the …
What I verified works
Findings
Verdict: good to merge once the ADR is restored-as-amended; 2 and 3 can ride along or follow up.
Re-review of the update: P1 resolved well; one scope flag on the new budget commit
Resolved:
New scope flag
Ask: keep it if you add one sentence of rationale to the body and rename the commit (e.g. …).
With the ADR restored, nothing else blocks merge.
Summary
Improves daemon startup failure evidence for isolated replay/test and normal daemon startup paths.
Before: a daemon that exited before publishing metadata collapsed into a generic metadata wait failure.
After: startup failures include the state dir, daemon metadata paths, daemon log path, child process pid/exit evidence, cleanup results, and a bounded daemon log tail when available.
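As a rough illustration of the "bounded daemon log tail" idea above, the sketch below reads at most a fixed number of bytes from the end of a log file and folds it into a failure message alongside the paths and exit evidence. The `StartupEvidence` shape, the function names, and the 4096-byte default are hypothetical and chosen for this example only; they are not taken from the PR's code, and the sketch covers only a subset of the evidence listed (no cleanup results).

```typescript
import * as fs from "node:fs";

// Hypothetical evidence shape; field names are illustrative assumptions.
interface StartupEvidence {
  stateDir: string;
  metadataPath: string;
  logPath: string;
  pid?: number;
  exitCode?: number | null;
  signal?: string | null;
}

// Read at most maxBytes from the end of the file.
// Returns undefined when the log cannot be opened or read.
function readLogTail(logPath: string, maxBytes = 4096): string | undefined {
  let fd: number | undefined;
  try {
    fd = fs.openSync(logPath, "r");
    const size = fs.fstatSync(fd).size;
    const length = Math.min(size, maxBytes);
    if (length === 0) return "";
    const buffer = Buffer.alloc(length);
    const bytesRead = fs.readSync(fd, buffer, 0, length, size - length);
    return buffer.subarray(0, bytesRead).toString("utf8");
  } catch {
    return undefined;
  } finally {
    if (fd !== undefined) fs.closeSync(fd);
  }
}

// Build a multi-line failure message; the log tail is appended only when available.
function formatStartupFailure(evidence: StartupEvidence, maxTailBytes = 4096): string {
  const lines = [
    "daemon exited before publishing metadata",
    `  state dir: ${evidence.stateDir}`,
    `  metadata:  ${evidence.metadataPath}`,
    `  log:       ${evidence.logPath}`,
    `  pid:       ${evidence.pid ?? "unknown"}`,
    `  exit:      code=${evidence.exitCode ?? "n/a"} signal=${evidence.signal ?? "n/a"}`,
  ];
  const tail = readLogTail(evidence.logPath, maxTailBytes);
  if (tail) lines.push("  log tail:", tail.trimEnd());
  return lines.join("\n");
}
```

Capping the read by bytes rather than lines keeps the error size predictable even when the daemon writes very long lines; a line-based cap would be a reasonable alternative.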
Also removes compact snapshot mode end to end across CLI/client contracts, daemon flags, iOS runner contracts/plans, replay serialization, and Android snapshot filtering. The legacy `-c` token is still accepted only for `snapshot`/`diff` as a no-op passthrough so old commands do not fail, but no snapshot request forwards compact behavior. ADR 0004 is restored as an amended record for the surviving iOS snapshot backend strategy.
The iOS query-sweep recovery tier budget is relaxed from 1s to 3s so degraded-but-enumerable simulator trees can complete in the regular visible capture plan instead of prematurely falling through to lower-fidelity recovery.
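The legacy `-c` handling described in the summary could look roughly like the sketch below: the flag is dropped for `snapshot` and `diff` so nothing downstream sees it, and rejected for any other command. The helper name, the error text, and the decision to throw for other commands are assumptions made for illustration; the PR's actual parser may behave differently, and combined short flags such as `-ic` are not handled here.

```typescript
// Commands that still tolerate the legacy flag, per the PR summary.
const LEGACY_COMPACT_COMMANDS = new Set(["snapshot", "diff"]);

// Hypothetical helper: returns args without a bare `-c` for snapshot/diff,
// and throws if `-c` is passed to any other command.
function stripLegacyCompactFlag(command: string, args: readonly string[]): string[] {
  if (!args.includes("-c")) return [...args];
  if (!LEGACY_COMPACT_COMMANDS.has(command)) {
    throw new Error(`unknown option -c for command "${command}"`);
  }
  // No-op passthrough: the flag is accepted but never forwarded.
  return args.filter((arg) => arg !== "-c");
}
```

With this shape, `snapshot -i -c` and `snapshot -i` produce the same forwarded arguments, which matches the equivalence reported in the validation notes.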
Closes #687
Touched files: 75
Validation
After rebasing onto latest `origin/main`, passed direct `oxfmt --write`, focused Vitest coverage for parser/replay/capture/selector/snapshot paths (14 files, 394 tests), and `pnpm check:quick`. After reverting unrelated fixture-app copy/formatting noise, `src/platforms/ios/__tests__/runner-client.test.ts` passed in isolation and `pnpm check:quick` passed again; the full focused rerun had only runner-client cache/timing failures that passed on immediate isolated rerun.
Before the rebase, also passed `pnpm build`, `pnpm build:xcuitest`, and live iOS simulator checks on `2AF9D7DF-516A-47F5-8D4B-4EA0600E611E`: `prepare ios-runner`, Settings `snapshot -i` and legacy `snapshot -i -c` returned equivalent healthy tree-backed output, Settings replay passed, and sampled Settings click/back interactions completed successfully.
Test-app replay evidence: `gesture-lab.ad` passed; `checkout-form.ad` still failed at the existing iOS `keyboard dismiss` unsupported-operation step after successful app launch and fills, so it remains residual fixture/runtime risk rather than validation for this PR.
SkillGym remains blocked in this sandbox because the runner environment requires external Codex/Claude runners with network access.