feat(selfupdate): auto-upgrade on every install layout and platform (canonical fallback + boot trampoline)#62

Merged
ysyneu merged 3 commits into main from feat/selfupdate-canonical
Jun 10, 2026

ysyneu commented Jun 10, 2026

Collaborator

Why

A manually installed runner (sudo mv into root-owned /usr/local/bin, run as a regular user) could never self-update: it logged cannot self-update (binary directory not writable) on every advertisement and stayed behind forever. A colleague hit this on macOS, where the runner stayed stuck while 0.0.18 was advertised. Two more platform gaps hid behind that warning:

  • Windows self-update was structurally dead: syscall.Exec is a stub returning EWINDOWS, releases ship .zip but only tar.gz extraction existed, and os.Rename cannot replace a running exe.
  • darwin misjudged the installer layout: os.Executable() returns the unresolved symlink path (/usr/local/bin/...), so the writability probe checked the wrong directory even when the real binary sat in a writable state dir.
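The darwin gap comes down to probing the directory of the link rather than the directory of the file it points to. A minimal Go sketch of the corrected order (resolve first, then probe) is below. The commit messages mention a `dirWritable` probe but do not show it, so the create-a-scratch-file body and the helper name `resolvedExeDir` are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolvedExeDir follows symlinks on exePath and returns the directory that
// actually contains the binary. Hypothetical helper, not taken from the PR.
func resolvedExeDir(exePath string) (string, error) {
	realPath, err := filepath.EvalSymlinks(exePath)
	if err != nil {
		return "", err
	}
	return filepath.Dir(realPath), nil
}

// dirWritable probes writability by creating and removing a scratch file.
// Assumed implementation; the runner's real probe may differ.
func dirWritable(dir string) bool {
	f, err := os.CreateTemp(dir, ".write-probe-*")
	if err != nil {
		return false
	}
	name := f.Name()
	_ = f.Close()
	_ = os.Remove(name)
	return true
}

func main() {
	exe, err := os.Executable() // may be an unresolved symlink path
	if err != nil {
		fmt.Println("cannot locate executable:", err)
		return
	}
	dir, err := resolvedExeDir(exe)
	if err != nil {
		fmt.Println("cannot resolve symlinks:", err)
		return
	}
	fmt.Printf("real binary dir: %s (writable: %v)\n", dir, dirWritable(dir))
}
```

Probing by actually creating a file, rather than inspecting mode bits, also covers cases where permissions come from ownership or ACLs.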

Design

Converge every install layout onto a user-writable canonical location instead of insisting on in-place swap (rustup-shim model):

  1. ResolveTarget: EvalSymlinks the exe. Writable dir → in-place swap, byte-for-byte the old behavior (systemd state-dir layout, root installs, dev builds). Read-only dir → upgrades target <state home>/bin/flashduty-runner (default ~/.flashduty/bin/), seeded from the running binary first so .bak rollback always has a predecessor.
  2. Boot trampoline (run cmd) — a relocated runner that finds a strictly newer canonical binary execs into it. The stale PATH entry becomes a permanent launcher for the current version (reboot/manual restart both land on the newest binary). Version-gated by running canonical version + semver compare: a manually-reinstalled newer PATH binary is never downgraded (matters under --disable-auto-update), dev builds never trampoline, unparseable canonical output never gets exec'd.
  3. Windows: restartSelf = StartProcess+exit, trying CREATE_BREAKAWAY_FROM_JOB first, then plain spawn (NSSM/WinSW Job Objects); even if the job kills the child, the service manager restarts through the old path and the trampoline converges the version. Zip extraction keyed by artifact URL. Swap = rename-aside sequences (a running exe can be renamed, not replaced); rollback renames the running binary to .failed before restoring .bak.
  4. Probation anchors at the running binary, deliberately NOT the canonical path — a PATH binary must never clear a crashed canonical's in-flight marker, which would reset the crash-loop accounting that triggers rollback.
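Step 1 of the list above can be sketched roughly as follows. This is not the PR's code: the `target` type, the lowercase `resolveTarget` and `seed` names, the `.seed` temp suffix, and the injected `writable` callback are all assumptions made so the example stays small and self-contained. Only the canonical file name `flashduty-runner` under `<state home>/bin` comes from the description.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// target says where an upgrade gets written. Illustrative type.
type target struct {
	Path      string // binary path the swap operates on
	Relocated bool   // true when the canonical fallback is in use
}

// resolveTarget picks in-place when the real binary directory is writable,
// otherwise the canonical path, seeding it from the running binary first.
func resolveTarget(exePath, stateHome string, writable func(dir string) bool) (target, error) {
	realPath, err := filepath.EvalSymlinks(exePath)
	if err != nil {
		return target{}, err
	}
	if writable(filepath.Dir(realPath)) {
		return target{Path: realPath}, nil
	}
	canonical := filepath.Join(stateHome, "bin", "flashduty-runner")
	if err := seed(realPath, canonical); err != nil {
		return target{}, err
	}
	return target{Path: canonical, Relocated: true}, nil
}

// seed copies src to dst unless dst already exists, so calling it twice is a
// no-op the second time. It writes to a temp name and renames into place.
func seed(src, dst string) error {
	if _, err := os.Stat(dst); err == nil {
		return nil
	} else if !errors.Is(err, os.ErrNotExist) {
		return err
	}
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return err
	}
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()
	tmp := dst + ".seed"
	out, err := os.OpenFile(tmp, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o755)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		os.Remove(tmp)
		return err
	}
	if err := out.Close(); err != nil {
		os.Remove(tmp)
		return err
	}
	return os.Rename(tmp, dst)
}

// demo builds a throwaway layout and forces the read-only branch.
func demo() (target, error) {
	root, err := os.MkdirTemp("", "resolve-demo-")
	if err != nil {
		return target{}, err
	}
	defer os.RemoveAll(root)
	exe := filepath.Join(root, "usr-local-bin", "flashduty-runner")
	if err := os.MkdirAll(filepath.Dir(exe), 0o755); err != nil {
		return target{}, err
	}
	if err := os.WriteFile(exe, []byte("fake binary"), 0o755); err != nil {
		return target{}, err
	}
	readOnly := func(string) bool { return false } // pretend the dir is read-only
	return resolveTarget(exe, filepath.Join(root, "state"), readOnly)
}

func main() {
	tgt, err := demo()
	fmt.Printf("relocated=%v err=%v\n", tgt.Relocated, err)
}
```

Seeding before the first upgrade means the canonical path always holds a known-good binary, which is what lets a later `.bak` rollback have something to restore.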

Backend needs no change: fc-safari's artifact resolver already serves per-OS/arch assets (zip for windows).
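The version gate from step 2 of the design can be pictured as a pure function over two version strings. The sketch below is an assumption-laden simplification: the names `parseVersion` and `shouldTrampoline` are hypothetical, it accepts only plain `MAJOR.MINOR.PATCH` (optionally prefixed with `v`), and it treats anything else, including pre-release suffixes, as unparseable. The actual runner obtains the canonical version by running the canonical binary's `version` command and may use a full semver comparison.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion accepts "1.2.3" or "v1.2.3" and nothing else.
func parseVersion(s string) ([3]int, bool) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(strings.TrimSpace(s), "v"), ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		// Reject empty fields and anything that is not purely digits.
		if p == "" || strings.Trim(p, "0123456789") != "" {
			return [3]int{}, false
		}
		n, err := strconv.Atoi(p)
		if err != nil {
			return [3]int{}, false
		}
		v[i] = n
	}
	return v, true
}

// shouldTrampoline reports whether the running binary should exec into the
// canonical one: only when both versions parse and canonical is strictly newer.
func shouldTrampoline(running, canonical string) bool {
	r, ok := parseVersion(running)
	if !ok {
		return false // dev builds never trampoline
	}
	c, ok := parseVersion(canonical)
	if !ok {
		return false // unparseable canonical output is never exec'd
	}
	for i := range r {
		if c[i] != r[i] {
			return c[i] > r[i]
		}
	}
	return false // equal versions: stay put
}

func main() {
	fmt.Println(shouldTrampoline("0.0.1", "0.0.2"))  // true
	fmt.Println(shouldTrampoline("0.0.18", "0.0.2")) // false: never downgrade
	fmt.Println(shouldTrampoline("dev", "0.0.2"))    // false
}
```

Comparing numerically field by field matters here: a lexical comparison would rank `0.0.9` above `0.0.10`.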

Notes for rollout

  • Runners older than this release still lack the fallback — existing manual installs need one last by-hand upgrade to cross over; after that they self-update forever.
  • With the canonical fallback active, the file in /usr/local/bin keeps its installed mtime/version; flashduty-runner version and the running service always reflect the current version (README documents this).

Verification

  • Full test suite green on darwin and linux (docker, non-root so chmod-555 read-only fixtures actually bind); new tests: resolve matrix incl. symlink layout, seed idempotence, zip extraction, relocated Apply→probation→commit chain, relocated rollback + failed-version skip, version parse/compare.
  • Cross-compiled all six GOOS/GOARCH release targets; go vet clean on windows; golangci-lint 0 issues; gofumpt clean.
  • Live E2E on macOS against a local backend (exact colleague scenario): v0.0.1 in a chmod-555 dir connected → binary directory not writable; upgrading at the canonical state-home path → download/verify/swap → real exec → v0.0.2 handshake → self-update committed. Restarting the untouched old PATH binary → newer self-updated binary found at canonical path; restarting into it → v0.0.2 connected. Probation/rollback exercised in unit tests.
  • Windows code paths are unit-tested cross-platform where possible (zip, URL keying) and cross-compiled, but not executed on a real Windows host — flagging honestly for reviewer awareness.

🤖 Generated with Claude Code

ysyneu added 3 commits June 10, 2026 11:45
…uto-upgrade on every install layout and platform

A manually-installed runner (root-owned /usr/local/bin, run as a regular
user) could never self-update: the in-place swap needs a writable binary
directory, so the handler logged a warn and stayed behind forever. Windows
never worked at all (syscall.Exec is an EWINDOWS stub; releases ship zip
but only tar.gz extraction existed), and darwin misjudged the installer's
symlink layout because os.Executable() returns the unresolved link path.

Root fix: converge every layout onto a user-writable canonical location
instead of insisting on in-place swap.

- ResolveTarget: EvalSymlinks the exe; writable dir -> in-place (systemd
  state-dir layout, root installs, dev builds: unchanged). Read-only dir
  -> upgrades land at <state home>/bin/flashduty-runner, seeded from the
  running binary first so .bak rollback always has a predecessor.
- Boot trampoline (run cmd): a relocated runner that finds a strictly
  newer canonical binary execs into it, so the stale PATH entry becomes a
  permanent launcher for the current version. Version-gated via
  'canonical version' + semver: a manually-reinstalled newer PATH binary
  is never downgraded (matters under --disable-auto-update) and dev
  builds never trampoline.
- Windows: restartSelf = StartProcess+Exit with CREATE_BREAKAWAY_FROM_JOB
  then plain-spawn fallback (service-manager Job Objects); zip extraction
  keyed by artifact URL; swap/restore via rename-aside sequences since a
  running exe can be renamed but not replaced.
- Probation anchors at the resolved running binary, deliberately not the
  canonical path: a PATH binary must not clear a crashed canonical's
  in-flight marker and reset its crash-loop accounting.

Backend needs no change: fc-safari already resolves per-OS/arch assets.
Existing fleet note: runners older than this release still lack the
fallback, so manual installs need one last by-hand upgrade to cross over.

Verified: full suite on darwin + linux (docker, non-root so read-only
fixtures bind); cross-compiled all six GOOS/GOARCH targets; live E2E on
macOS against a local backend — read-only-dir runner downloaded 0.0.2,
swapped at the canonical path, re-exec'd, committed after handshake, and
a later start of the old PATH binary trampolined straight into 0.0.2.

Go's os.Chmod maps to FILE_ATTRIBUTE_READONLY, which does not
write-protect a directory on Windows — the 0555 fixture silently stays
writable and ResolveTarget correctly reports in-place, failing the
relocated-path expectations. Real Windows read-only dirs are ACL-based;
the dirWritable probe handles them, but a unit test can't cheaply set
one up. Platform-shared mechanics (seed, Apply, rollback, zip) remain
covered on Windows by the fixtures that don't need a locked dir.
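
One way a test could notice that a chmod-0555 fixture does not bind (Windows, or running as root on unix) is to probe it before asserting on the relocated path. The sketch below is an illustration only: `readOnlyBinds` and `probeTempDir` are hypothetical names, and the commit does not say how the affected tests were actually handled.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// readOnlyBinds reports whether chmod 0555 really stops this process from
// creating files in dir. It restores 0755 before returning.
func readOnlyBinds(dir string) (bool, error) {
	if err := os.Chmod(dir, 0o555); err != nil {
		return false, err
	}
	defer os.Chmod(dir, 0o755) // restore so cleanup can remove the dir
	probe := filepath.Join(dir, "probe")
	f, err := os.Create(probe)
	if err == nil {
		// Creation succeeded: the fixture is still writable.
		f.Close()
		os.Remove(probe)
		return false, nil
	}
	if errors.Is(err, os.ErrPermission) {
		return true, nil
	}
	return false, err
}

// probeTempDir runs the probe against a fresh temporary directory.
func probeTempDir() (bool, error) {
	dir, err := os.MkdirTemp("", "ro-fixture-")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	return readOnlyBinds(dir)
}

func main() {
	binds, err := probeTempDir()
	fmt.Printf("read-only fixture binds: %v (err: %v)\n", binds, err)
}
```

A test that depends on a locked directory could call such a probe first and skip itself when the result is false.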
…x platforms

The runner is unix-only in practice: it shells out to bash/ripgrep, ships
no Windows installer or service wrapper, and its README documents only
Linux + macOS. Windows was merely a goreleaser build target producing a
zip that can't actually run. Carrying Windows-specific self-update code
(rename-aside swap, StartProcess re-exec, zip extraction) was building for
a platform we don't ship — so remove it rather than maintain unrunnable
paths.

- goreleaser: drop windows from goos; remove the zip format override.
- CI: drop windows-latest from the test matrix.
- selfupdate: delete swap_windows.go / restart_windows.go; collapse the
  _unix.go files to swap.go / restart.go (restart.go now //go:build unix,
  syscall.Exec being unix-only); drop the zip extractor + isZipURL and
  their archive/zip + net/url imports; runnerBinaryName is now a plain
  const (no .exe variant); drop the x/sys/windows dependency.
- README: note the auto-update download URL is backend-decided — a private
  mirror must also be configured as the backend's install_script_url, or
  pushed upgrades resolve from the public GitHub host.

Self-update on the platforms we ship (Linux + macOS, in-place and the
read-only-dir canonical fallback) is unchanged and stays fully covered.

Verified: go build + full test suite on darwin and linux; all four shipped
GOOS/GOARCH targets cross-compile; golangci-lint 0 issues; gofumpt clean.
ysyneu merged commit 5b8241f into main Jun 10, 2026
8 of 9 checks passed