2 changes: 1 addition & 1 deletion .github/workflows/go.yml
@@ -10,7 +10,7 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
os: [ubuntu-latest, macos-latest]

runs-on: ${{ matrix.os }}

7 changes: 3 additions & 4 deletions .goreleaser.yaml
@@ -15,9 +15,11 @@ builds:
- CGO_ENABLED=0
ldflags:
- -s -w -X main.Version={{.Version}} -X main.GitCommit={{.Commit}} -X main.BuildTime={{.Date}}
# Unix-only: the runner shells out to a Unix userland (bash, ripgrep) and
# ships no Windows installer/service, so a Windows binary can't actually
# run. Building one would only produce an unusable artifact.
goos:
- linux
- windows
- darwin
goarch:
- amd64
@@ -33,9 +35,6 @@ archives:
{{- else if eq .Arch "386" }}i386
{{- else }}{{ .Arch }}{{ end }}
{{- if .Arm }}v{{ .Arm }}{{ end }}
format_overrides:
- goos: windows
formats: zip
# Bundle the matched-os/arch fduty next to the runner binary in the archive
# root. A glob (not a literal path) so an os/arch the bundler skipped just
# contributes nothing rather than failing the release; the runner falls back
6 changes: 4 additions & 2 deletions README.md
@@ -119,7 +119,9 @@ curl -fsSL https://raw.githubusercontent.com/flashcatcloud/flashduty-runner/main

On Linux with systemd the script also creates a `flashduty` service user, writes `/etc/flashduty-runner/env`, installs a hardened unit, and runs `systemctl enable --now`. On macOS and non-systemd Linux it installs the binary only. Run with `--help` for all flags.

Runners installed this way **upgrade themselves**. Flashduty pushes new releases over the runner's existing control channel; the runner verifies the download, atomically swaps its own binary in place, and re-execs — automatically rolling back if the new build can't reconnect within a probation window. This works because the installer keeps the binary in a writable state dir (`/var/lib/flashduty-runner/bin`, symlinked from `/usr/local/bin`) owned by the `flashduty` service user, so self-update is exclusive to this install path.
Runners **upgrade themselves** regardless of how they were installed. Flashduty pushes new releases over the runner's existing control channel; the runner verifies the download, atomically swaps its binary, and restarts into the new version — automatically rolling back if the new build can't reconnect within a probation window. When the runner's own directory is writable (the installer's `/var/lib/flashduty-runner/bin` layout, or any user-writable location), the swap happens in place. When it is not — e.g. a manual install in root-owned `/usr/local/bin` run as a regular user — the upgrade lands at the canonical state-home path (`<runner home>/bin/flashduty-runner`, default `~/.flashduty/bin/`) instead, and on every later start the original binary hands off to the newest canonical version automatically. One consequence of that fallback: the file in `/usr/local/bin` keeps its installed timestamp/version on disk, while `flashduty-runner version` and the running service always reflect the current version.

The auto-update download URL is chosen by the **backend**, not remembered from how the runner was installed (the install-time `MIRROR_URL` is never persisted). A self-hosted or air-gapped deployment that installs via a private mirror must therefore also point the backend's `install_script_url` at that same mirror, or backend-pushed upgrades will resolve from the public GitHub release host.

### Manual Binary Installation

@@ -145,7 +147,7 @@ tar -xzf flashduty-runner_Darwin_x86_64.tar.gz
sudo mv flashduty-runner /usr/local/bin/
```

> **No auto-upgrade on this path.** A root-owned binary in `/usr/local/bin` run as a non-root service user cannot replace itself, so manually-installed runners do not receive backend-pushed upgrades — Flashduty attempts one, logs a failed attempt, and backs off. Use the one-line installer above for hands-off upgrades.
> **Auto-upgrade works here too.** A root-owned binary in `/usr/local/bin` cannot replace itself in place, so the first backend-pushed upgrade is installed to the runner's writable state home (`~/.flashduty/bin/flashduty-runner` by default) and the process restarts into it; from then on, starting `/usr/local/bin/flashduty-runner` transparently hands off to the newest self-updated version. To pin the installed version instead, run with `--disable-auto-update`.

### Docker Installation

6 changes: 4 additions & 2 deletions README_zh.md
@@ -119,7 +119,9 @@ curl -fsSL https://raw.githubusercontent.com/flashcatcloud/flashduty-runner/main

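On Linux with systemd, the script creates a `flashduty` system user, writes `/etc/flashduty-runner/env`, installs a hardened systemd unit, and runs `systemctl enable --now`. On macOS and non-systemd Linux it installs the binary only. Use `--help` to see all flags.

Runners installed this way **upgrade automatically**: Flashduty pushes new releases over the runner's existing control channel; the runner verifies the download, atomically replaces its own binary in place, and re-execs, rolling back automatically if the new version cannot reconnect within the probation window. This relies on the install script keeping the binary in a writable state directory (`/var/lib/flashduty-runner/bin`, symlinked from `/usr/local/bin`) owned by the `flashduty` service user, so automatic upgrades are available only with this install method.
Runners **upgrade automatically** regardless of how they were installed: Flashduty pushes new releases over the runner's existing control channel; the runner verifies the download, atomically replaces the binary, and restarts into the new version, rolling back automatically if the new version cannot reconnect within the probation window. When the runner's own directory is writable (the install script's `/var/lib/flashduty-runner/bin` layout, or any user-writable location), the replacement happens in place; when it is not (e.g. a manual install in root-owned `/usr/local/bin`, run as a regular user), the upgrade lands at the canonical path under the runner's state home (`<runner home>/bin/flashduty-runner`, default `~/.flashduty/bin/`), and on every later start the original binary automatically hands off to the newest canonical version. One visible effect of this fallback: the file in `/usr/local/bin` stays at its installed version, while `flashduty-runner version` and the running service are always the latest version.

The auto-upgrade download URL is decided by the **backend**; it does not remember the mirror used when the runner was installed (the install-time `MIRROR_URL` is not persisted). A self-hosted or intranet deployment that installs via a private mirror must therefore also point the **backend's** `install_script_url` at the same mirror, otherwise backend-pushed upgrades fall back to the public GitHub release source.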

### Manual Binary Installation

@@ -145,7 +147,7 @@ tar -xzf flashduty-runner_Darwin_x86_64.tar.gz
sudo mv flashduty-runner /usr/local/bin/
```

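> **Auto-upgrade is not supported on this path.** A binary in root-owned `/usr/local/bin`, run as a non-root service user, cannot replace itself, so manually installed runners do not receive backend-pushed upgrades: Flashduty attempts one, records one failure, and backs off. For hands-off upgrades, use the one-line install script above.
> **Auto-upgrade is supported on this path too.** A binary in root-owned `/usr/local/bin` cannot replace itself in place, so the first backend-pushed upgrade is installed to the runner's writable state home (default `~/.flashduty/bin/flashduty-runner`) and the runner restarts into the new version; from then on, starting `/usr/local/bin/flashduty-runner` transparently hands off to the newest self-updated version. To pin the installed version, run with `--disable-auto-update`.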

```bash
docker run -d \
35 changes: 32 additions & 3 deletions cmd/main.go
@@ -240,6 +240,31 @@ func runRunner() error {
return fmt.Errorf("failed to pin runner home: %w", err)
}

// Boot trampoline. A runner whose own directory is read-only (manual
// install in root-owned /usr/local/bin run as a regular user) self-updates
// at the canonical state-home path instead of in place — so if a strictly
// newer binary is already there from a past upgrade, hand this process over
// to it. The stale PATH entry thereby acts as a launcher for the current
// version on every later start (reboot, manual restart). Version-gated:
// a manually-reinstalled PATH binary that is already newer than the
// canonical one is never downgraded (matters under --disable-auto-update,
// where nothing would ever re-raise the canonical version), and dev builds
// (unparseable version) never trampoline.
exe, err := os.Executable()
if err != nil {
return fmt.Errorf("cannot resolve executable: %w", err)
}
target := selfupdate.ResolveTarget(exe, environment.RunnerStateHome())
if target.Relocated {
if v := selfupdate.BinaryVersion(context.Background(), target.Path); selfupdate.NewerVersion(v, Version) {
slog.Info("newer self-updated binary found at canonical path; restarting into it",
"current", Version, "canonical_version", v, "canonical", target.Path)
if rerr := selfupdate.RestartInto(target.Path); rerr != nil {
slog.Warn("failed to restart into canonical binary; continuing on current version", "error", rerr)
}
}
}

// Ensure the `fduty` CLI is in the bundled-tools dir AND resolves on the bash
// PATH. Hard-fails startup rather than serving a runner that 127s every fduty
// call; no-op staging when the cloud image / install.sh already placed it.
@@ -261,9 +286,13 @@ func runRunner() error {

// Boot-time self-update probation: if a prior upgrade swapped in this
// binary, count the boot attempt and roll back to the previous one if it
// keeps failing. No-op when there's no marker (normal startup).
exe, _ := os.Executable()
pm := selfupdate.NewProbationMgr(exe, Version, selfupdateMaxAttempts)
// keeps failing. No-op when there's no marker (normal startup). Anchored at
// the resolved self (target.Exe): the marker lives next to the binary that
// was swapped, which is the one now running. A relocated PATH binary must
// NOT anchor at the canonical path — it would misread a crashed canonical
// binary's in-flight probation marker as stale and clear it, resetting the
// crash-loop accounting that eventually triggers rollback there.
pm := selfupdate.NewProbationMgr(target.Exe, Version, selfupdateMaxAttempts)
bootOutcome, _ := pm.CheckOnBoot()

// Create message handler
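
The version gate on the boot trampoline (hand off only to a strictly newer canonical binary, never from or to an unparseable dev build) could look something like the sketch below. It is an assumption-laden illustration: `parseVersion` and `newerVersion` here are hand-rolled with the standard library and accept only plain `vMAJOR.MINOR.PATCH` strings, so pre-release suffixes count as unparseable. The comparison actually used by `selfupdate.NewerVersion` is not shown in this diff.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "vMAJOR.MINOR.PATCH" (the leading "v" is optional) and
// reports ok=false for anything else, such as a "dev" build string.
func parseVersion(s string) (parts [3]int, ok bool) {
	s = strings.TrimPrefix(strings.TrimSpace(s), "v")
	fields := strings.Split(s, ".")
	if len(fields) != 3 {
		return parts, false
	}
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil || n < 0 {
			return parts, false
		}
		parts[i] = n
	}
	return parts, true
}

// newerVersion reports whether candidate is strictly newer than current.
// If either side is unparseable it returns false, so a dev build never hands
// off and an equal or older canonical binary is never chosen.
func newerVersion(candidate, current string) bool {
	c, okCandidate := parseVersion(candidate)
	r, okCurrent := parseVersion(current)
	if !okCandidate || !okCurrent {
		return false
	}
	for i := range c {
		if c[i] != r[i] {
			return c[i] > r[i]
		}
	}
	return false // equal versions are not "newer"
}

func main() {
	fmt.Println(newerVersion("v1.4.0", "v1.3.9")) // true
	fmt.Println(newerVersion("v1.3.9", "v1.3.9")) // false
	fmt.Println(newerVersion("v1.4.0", "dev"))    // false
}
```

Returning false on any parse failure is the conservative choice here: the worst case is that the process keeps running the binary it was started from.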
10 changes: 6 additions & 4 deletions environment/environment.go
@@ -723,17 +723,19 @@ func BundledToolsDir() string {
if d := os.Getenv("FLASHDUTY_RUNNER_BIN_DIR"); d != "" {
return d
}
home := runnerStateHome()
home := RunnerStateHome()
if home == "" {
return ""
}
return filepath.Join(home, "bin")
}

// runnerStateHome resolves the runner's state/home root the same way cmd does
// RunnerStateHome resolves the runner's state/home root the same way cmd does
// (FLASHDUTY_RUNNER_HOME > deprecated FLASHDUTY_RUNNER_WORKSPACE > ~/.flashduty),
// so the bundled-tools dir always lands inside the writable state tree.
func runnerStateHome() string {
// so the bundled-tools dir always lands inside the writable state tree. Also
// the root under which self-update keeps the canonical runner binary
// (<home>/bin) when the running executable's own directory is read-only.
func RunnerStateHome() string {
if d := os.Getenv("FLASHDUTY_RUNNER_HOME"); d != "" {
return d
}
3 changes: 2 additions & 1 deletion go.mod
@@ -10,6 +10,7 @@ require (
github.com/modelcontextprotocol/go-sdk v1.6.1
github.com/spf13/cobra v1.8.1
github.com/stretchr/testify v1.11.1
golang.org/x/mod v0.37.0
mvdan.cc/sh/v3 v3.13.1
)

@@ -26,7 +27,7 @@ require (
github.com/yosida95/uritemplate/v3 v3.0.2 // indirect
golang.org/x/net v0.53.0 // indirect
golang.org/x/oauth2 v0.35.0 // indirect
golang.org/x/sys v0.43.0 // indirect
golang.org/x/sys v0.46.0 // indirect
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)
10 changes: 6 additions & 4 deletions go.sum
@@ -52,14 +52,16 @@ github.com/yosida95/uritemplate/v3 v3.0.2 h1:Ed3Oyj9yrmi9087+NczuL5BwkIc4wvTb5zI
github.com/yosida95/uritemplate/v3 v3.0.2/go.mod h1:ILOh0sOhIJR3+L/8afwt/kE++YT040gmv5BQTMR2HP4=
github.com/yuin/goldmark v1.8.2 h1:kEGpgqJXdgbkhcOgBxkC0X0PmoPG1ZyoZ117rDVp4zE=
github.com/yuin/goldmark v1.8.2/go.mod h1:ip/1k0VRfGynBgxOz0yCqHrbZXhcjxyuS66Brc7iBKg=
golang.org/x/mod v0.37.0 h1:vF1DjpVEshcIqoEaauuHebaLk1O1forxjxBaVn884JQ=
golang.org/x/mod v0.37.0/go.mod h1:m8S8VeM9r4dzDwjrKO0a1sZP3YjeMamRRlD+fmR2Q/0=
golang.org/x/net v0.53.0 h1:d+qAbo5L0orcWAr0a9JweQpjXF19LMXJE8Ey7hwOdUA=
golang.org/x/net v0.53.0/go.mod h1:JvMuJH7rrdiCfbeHoo3fCQU24Lf5JJwT9W3sJFulfgs=
golang.org/x/oauth2 v0.35.0 h1:Mv2mzuHuZuY2+bkyWXIHMfhNdJAdwW3FuWeCPYN5GVQ=
golang.org/x/oauth2 v0.35.0/go.mod h1:lzm5WQJQwKZ3nwavOZ3IS5Aulzxi68dUSgRHujetwEA=
golang.org/x/sys v0.43.0 h1:Rlag2XtaFTxp19wS8MXlJwTvoh8ArU6ezoyFsMyCTNI=
golang.org/x/sys v0.43.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
golang.org/x/tools v0.42.0 h1:uNgphsn75Tdz5Ji2q36v/nsFSfR/9BRFvqhGBaJGd5k=
golang.org/x/tools v0.42.0/go.mod h1:Ma6lCIwGZvHK6XtgbswSoWroEkhugApmsXyrUmBhfr0=
golang.org/x/sys v0.46.0 h1:noSf2Fq6F8DBgS+LysIkx7rIExoNHJsxOAtPp4rthXw=
golang.org/x/sys v0.46.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
golang.org/x/tools v0.45.0 h1:18qN3FAooORvApf5XjCXgsuayZOEtXf6JK18I3+ONa8=
golang.org/x/tools v0.45.0/go.mod h1:LuUGqqaXcXMEFEruIVJVm5mgDD8vww/z/SR1gQ4uE/0=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 h1:YR8cESwS4TdDjEe65xsg0ogRM/Nc3DYOhEAlW+xobZo=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
18 changes: 8 additions & 10 deletions selfupdate/capability.go
@@ -2,18 +2,16 @@ package selfupdate

import (
"os"
"path/filepath"
)

// CanSelfUpdate reports whether the runner can replace its own binary in place.
// The atomic swap copies the current binary to <dir>/.bak and renames the new
// one over the canonical path, so the executable's *directory* must be writable
// by the service user. A manual install that drops the binary in root-owned
// /usr/local/bin and runs as a non-root user cannot — and must not attempt to —
// self-update; the handler checks this before a doomed download ever begins, so
// such runners simply stay on their version instead of failing in a loop.
func CanSelfUpdate(exePath string) bool {
f, err := os.CreateTemp(filepath.Dir(exePath), ".swcheck-*")
// dirWritable reports whether the runner can create and rename files in dir —
// the operations the atomic swap needs (stage the new binary, snapshot .bak,
// rename over the target path). Probed by actually creating a temp file:
// permission bits alone lie on read-only mounts and ACL'd directories.
// ResolveTarget uses this to decide between an in-place swap and the
// canonical state-home fallback.
func dirWritable(dir string) bool {
f, err := os.CreateTemp(dir, ".swcheck-*")
if err != nil {
return false
}
32 changes: 8 additions & 24 deletions selfupdate/capability_test.go
@@ -6,33 +6,17 @@ import (
"testing"
)

// TestCanSelfUpdateWritableDir verifies that a runner whose executable lives in
// a writable directory can self-update, and that the writability probe leaves no
// litter behind.
func TestCanSelfUpdateWritableDir(t *testing.T) {
// TestDirWritable verifies the writability probe and that it leaves no litter.
func TestDirWritable(t *testing.T) {
dir := t.TempDir()
exe := filepath.Join(dir, "flashduty-runner")
if err := os.WriteFile(exe, []byte("x"), 0o755); err != nil { //nolint:gosec // test fixture
t.Fatal(err)
}
if !CanSelfUpdate(exe) {
t.Fatal("expected CanSelfUpdate true for a writable exe dir")
if !dirWritable(dir) {
t.Fatal("expected dirWritable true for a temp dir")
}
entries, _ := os.ReadDir(dir)
for _, e := range entries {
if e.Name() != "flashduty-runner" {
t.Fatalf("writability probe left a stray file: %s", e.Name())
}
if len(entries) != 0 {
t.Fatalf("writability probe left a stray file: %s", entries[0].Name())
}
}

// TestCanSelfUpdateNonWritableDir verifies that a manual install whose binary
// sits in a directory the runner cannot write (modelled here as a missing
// parent) reports it cannot self-update, so the handler skips the upgrade
// locally instead of looping on a swap it can never complete.
func TestCanSelfUpdateNonWritableDir(t *testing.T) {
exe := filepath.Join(t.TempDir(), "missing-subdir", "flashduty-runner")
if CanSelfUpdate(exe) {
t.Fatal("expected CanSelfUpdate false when the exe dir is not writable")
if dirWritable(filepath.Join(dir, "missing-subdir")) {
t.Fatal("expected dirWritable false for a missing dir")
}
}
2 changes: 2 additions & 0 deletions selfupdate/layout.go
@@ -6,6 +6,7 @@ import "path/filepath"
type layout struct {
Current string // canonical executable path
Bak string // previous binary, kept until commit
FailedBin string // windows rename-aside path during rollback (Current + ".failed")
Marker string // upgrade.json
Failed string // upgrade_failed — version we last rolled back from
StagingDir string // download/extract scratch
@@ -18,6 +19,7 @@ func layoutFor(exePath string) layout {
return layout{
Current: exePath,
Bak: exePath + ".bak",
FailedBin: exePath + ".failed",
Marker: filepath.Join(dir, "upgrade.json"),
Failed: filepath.Join(dir, "upgrade_failed"),
StagingDir: staging,
14 changes: 7 additions & 7 deletions selfupdate/probation.go
@@ -3,12 +3,11 @@ package selfupdate
import (
"log/slog"
"os"
"syscall"
)

// ProbationMgr runs the boot-time probation/rollback state machine driven by
// the persisted upgrade marker. reExec is an injectable seam so tests never
// actually exec; nil means real syscall.Exec.
// actually restart; nil means the real platform restartSelf.
type ProbationMgr struct {
ExePath string
CurrentVersion string
@@ -39,8 +38,10 @@ func (p *ProbationMgr) CheckOnBoot() (Outcome, error) {
if m.RolledBack {
// We rolled back to this (old) binary; the failed target was already
// recorded at rollback time so the handler skips re-advertisements.
// Clear the marker and boot normally.
// Clear the marker (and the windows rename-aside leftover) and boot
// normally.
_ = clearMarker(l.Marker)
_ = os.Remove(l.FailedBin)
return Outcome{}, nil
}
if m.TargetVersion != p.CurrentVersion {
@@ -62,7 +63,7 @@ func (p *ProbationMgr) rollback(l layout, m Marker) (Outcome, error) {
}

func (p *ProbationMgr) rollback(l layout, m Marker) (Outcome, error) {
if err := os.Rename(l.Bak, l.Current); err != nil {
if err := restoreBinary(l); err != nil {
return Outcome{}, err // no .bak -> cannot roll back; stay (will keep crash-looping, surfaced via offline)
}
// Remember the version we backed out of so the handler ignores the backend's
@@ -73,9 +74,7 @@ func (p *ProbationMgr) rollback(l layout, m Marker) (Outcome, error) {
_ = writeMarker(l.Marker, m)
reexec := p.reExec
if reexec == nil {
reexec = func() error {
return syscall.Exec(l.Current, os.Args, os.Environ()) //nolint:gosec // G204: l.Current is our own canonical executable path, not user input
}
reexec = func() error { return restartSelf(l.Current) }
}
_ = reexec()
return Outcome{}, nil
@@ -106,5 +105,6 @@ func (p *ProbationMgr) Commit() {
l := layoutFor(p.ExePath)
_ = clearMarker(l.Marker)
_ = os.Remove(l.Bak)
_ = os.Remove(l.FailedBin)
clearFailedVersion(l.Failed)
}