bug(worker): worker pip install -e . poisons operator's system python; needs brief rule + boot guard #144

@hadamrd

Problem

A forge-loop worker (suspected #124) ran pip install -e . from its worktree /tmp/wt-loop-124, creating an editable install in the operator's system Python site-packages:

/home/hexalgo/.local/lib/python3.12/site-packages/forge_loop -> /tmp/wt-loop-124

Concrete fallout:

  • System import forge_loop resolved to the worker's half-finished code, shadowing the uv-managed install at /home/hexalgo/.local/share/uv/tools/forge-loop/.
  • forge-loop --help exposed a phantom brainstorm subcommand that did not exist on trunk, so the operator prematurely closed #124 (feat(brainstormer): forge-loop brainstorm CLI (dry-run + --apply)) on a false signal.
  • After the worktree was reaped, the editable .pth pointed nowhere; import forge_loop returned a NAMESPACE package with no __file__, breaking every subsequent debug session.
  • Manual cleanup required: pip uninstall forge-loop && rm -rf .../site-packages/forge_loop .../site-packages/roles && uv tool install --reinstall --force ..
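
A minimal diagnostic sketch for confirming which forge_loop a given interpreter resolves, and whether it has degraded into a namespace package. `describe_import` is a hypothetical helper name, not part of forge-loop; it relies only on the standard library's `importlib.util.find_spec`, which for a top-level name locates the module without importing it.

```python
import importlib.util
from typing import Any, Dict


def describe_import(name: str) -> Dict[str, Any]:
    """Report where `name` would be imported from, without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False}
    locations = list(spec.submodule_search_locations or [])
    return {
        "found": True,
        # For a regular package this is the path of its __init__.py.
        "origin": spec.origin,
        "locations": locations,
        # A namespace package has search locations but no origin file,
        # which is why the imported module ends up with no __file__.
        "namespace_package": spec.origin is None and bool(locations),
    }


if __name__ == "__main__":
    print(describe_import("forge_loop"))
```

Running this with the system python3 and again with the uv-managed interpreter should make a shadowed or stale install visible: the `origin` or `locations` values would point into the worktree instead of the uv tool directory.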

Root cause: the worker brief permits arbitrary pip install for dependency setup and does not forbid pip install -e . against the worktree, so the worker can mutate the operator's Python environment outside the worktree boundary. No boot-time guard detects an already-poisoned environment.

Acceptance criteria

  • src/forge_loop/briefs/worker.md.tmpl contains an explicit, prominent rule forbidding pip install -e (and pip install .) against the worktree, with the recommended alternative (uv venv + uv pip install -e inside a worktree-local venv) named.
  • On forge-loop run startup, a boot guard detects a poisoned operator environment — i.e. pip show forge-loop (or equivalent metadata inspection) reports an editable Location: pointing into /tmp/wt-loop-* or any path under the configured worktree root — and refuses to start with a clear, copy-pasteable cleanup command in the error message.
  • The critic step flags sev1 when a PR diff touches pyproject.toml / setup.py / setup.cfg AND the worker session log (or transcript) contains a pip install -e invocation.
  • The boot guard's detection logic is pure / unit-testable: it takes a parsed pip show payload (or equivalent) and a worktree-root prefix, and returns a structured result — not coupled to a live pip subprocess inside the unit test.
  • The boot guard's error message names the exact offending path and the exact cleanup commands (uninstall + rm + uv tool install --reinstall --force), so the operator does not have to reconstruct them.
  • All new behavior is covered by tests in the matrix below; existing forge-loop run happy path remains green (no false positives when the uv-managed install is the only forge_loop on disk).
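
The pure detection function described in the criteria above could look roughly like the following. This is a sketch under stated assumptions, not the project's implementation: `GuardResult` and `detect_poison` are hypothetical names, the payload is assumed to be a dict of `pip show` fields keyed by their printed labels, and the `Editable project location` key is assumed to be present only on newer pip versions (the function falls back to `Location` otherwise).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Mapping, Optional

# Default worktree shape named in this issue.
DEFAULT_WORKTREE_GLOB = "/tmp/wt-loop-*"


@dataclass(frozen=True)
class GuardResult:
    poisoned: bool
    offending_path: Optional[str] = None


def _is_under(path: str, root: str) -> bool:
    """Lexical check: `path` equals `root` or lives somewhere below it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def detect_poison(
    pip_show: Optional[Mapping[str, str]],
    worktree_root: str,
) -> GuardResult:
    """Decide whether a parsed `pip show forge-loop` payload points into a worktree.

    `pip_show` is None (or empty) when the package is not installed on the
    system python; that case is treated as clean, not as an error.
    """
    if not pip_show:
        return GuardResult(poisoned=False)
    for key in ("Editable project location", "Location"):
        location = pip_show.get(key)
        if not location:
            continue
        if fnmatchcase(location, DEFAULT_WORKTREE_GLOB) or _is_under(location, worktree_root):
            return GuardResult(poisoned=True, offending_path=location)
    return GuardResult(poisoned=False)
```

Because the function only sees already-parsed data, the caller in the boot path would own the subprocess call and the error message, and the unit tests can pass plain dicts.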

Test matrix

Unit

  • tests/test_boot_poison_guard.py (new): given a mocked pip show forge-loop output whose Location: is /tmp/wt-loop-1/src, the guard returns poisoned=True with the offending path.
  • Same file: given a Location: under ~/.local/share/uv/tools/forge-loop/..., guard returns poisoned=False.
  • Same file: given pip show returning non-zero (package not installed on system python), guard returns poisoned=False (not an error).
  • Critic unit test: synthetic worker log with pip install -e . + PR diff touching pyproject.toml → critic emits a sev1 finding with a recognizable tag (e.g. pip-editable-poison).
  • Critic unit test (negative): worker log with pip install requests only + same diff → no sev1 from this rule.
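
For the two critic cases above, the rule could be sketched as a pure function over the changed file list and the worker log text. All names here (`Finding`, `pip_editable_rule`, `EDITABLE_INSTALL`) are hypothetical; how rules are actually registered in critic.py or _critic_sdk.py is not known from this issue, so only the matching logic is shown.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Iterable, List

PACKAGING_FILES = {"pyproject.toml", "setup.py", "setup.cfg"}

# Matches `pip install ... -e` / `--editable` within a single shell command,
# including `pip3` and `python -m pip` spellings.
EDITABLE_INSTALL = re.compile(
    r"\bpip3?\s+install\b[^\n;&|]*?\s(?:-e|--editable)(?=\s|$)"
)


@dataclass(frozen=True)
class Finding:
    severity: str
    tag: str
    evidence: str


def pip_editable_rule(changed_files: Iterable[str], worker_log: str) -> List[Finding]:
    """Emit a sev1 finding only when BOTH conditions from the criteria hold."""
    touches_packaging = any(
        PurePosixPath(path).name in PACKAGING_FILES for path in changed_files
    )
    if not touches_packaging:
        return []
    match = EDITABLE_INSTALL.search(worker_log)
    if match is None:
        return []
    return [Finding("sev1", "pip-editable-poison", match.group(0).strip())]
```

The negative case falls out of the regex: `pip install requests` contains no `-e` or `--editable` flag, so the rule returns an empty list even when pyproject.toml is in the diff.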

Integration

  • Pre-poison the test environment with a fake editable install whose .pth points into a /tmp/wt-loop-*-shaped path; invoke forge-loop run (or its boot entrypoint) and assert it exits non-zero with the cleanup commands in stderr.
  • Render briefs/worker.md.tmpl and assert the rendered brief contains the prohibition string (so a future template refactor cannot silently drop the rule).

Adversarial / sad path

  • Worker session that tries to bypass the rule by writing pip install -e . indirectly (e.g. via a Makefile target, a shell script, or python -m pip install -e .): critic still flags sev1. At minimum, document this in the test as an explicit known-limitation case if full coverage is out of scope.
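
One partial mitigation for the indirect case, sketched under assumptions: when the indirection is committed in the PR itself (a new Makefile target or shell script), the added lines of the unified diff can be scanned with the same kind of pattern. `editable_installs_in_diff` is a hypothetical helper, and it assumes the critic has the diff available as unified-diff text. It does not catch scripts that already exist outside the diff or commands assembled at runtime, which is the known-limitation case to document in the test.

```python
import re
from typing import List, Tuple

# Same shape of pattern as a log scan would use: pip / pip3 / `python -m pip`
# followed by an -e or --editable flag within one shell command.
EDITABLE_INSTALL = re.compile(
    r"\bpip3?\s+install\b[^\n;&|]*?\s(?:-e|--editable)(?=\s|$)"
)


def editable_installs_in_diff(diff_text: str) -> List[Tuple[str, str]]:
    """Return (file, added line) pairs where a diff adds an editable install."""
    hits: List[Tuple[str, str]] = []
    current_file = "<unknown>"
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # File header of the post-image, e.g. "+++ b/Makefile".
            current_file = line[4:].removeprefix("b/")
            continue
        if line.startswith("+") and EDITABLE_INSTALL.search(line[1:]):
            hits.append((current_file, line[1:].strip()))
    return hits
```

A critic could treat any hit here as equivalent to seeing the command in the worker log, since a committed `make dev` target that runs an editable install is one invocation away from the same poisoning.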

Out of scope

  • Running workers inside a fully isolated container / nsjail. This issue is about cheap guardrails, not full sandboxing.
  • Migrating the project off pip / onto uv everywhere. The fix must work with the current toolchain.
  • Auto-cleaning the operator's environment from the boot guard. The guard reports and refuses; it does not mutate the operator's site-packages.
  • Retroactively re-opening or re-litigating #124 (feat(brainstormer): forge-loop brainstorm CLI (dry-run + --apply)). This ticket addresses the class of bug, not the specific past incident.
  • Adding generic "dangerous pip command" linting beyond the editable-install case.

File pointers

  • src/forge_loop/briefs/worker.md.tmpl — add the prohibition rule, prominently (near the top of the constraints/rules section).
  • src/forge_loop/runner/boot.py — add the poisoning-detection guard, called from the forge-loop run entrypoint before any worker dispatch. (investigate exact entrypoint module if boot.py does not exist.)
  • src/forge_loop/critic.py and/or src/forge_loop/_critic_sdk.py — add the worker-log scan rule. (investigate which file owns rule registration.)
  • tests/test_boot_poison_guard.py (new).
  • tests/test_critic_pip_editable_rule.py (new, or extend existing critic test module if one exists — investigate).
  • tests/test_worker_brief_template.py (new or extend) — asserts the rendered brief contains the prohibition string.

Worker note

AC is wide — it spans brief template, boot guard, critic rule, and 3+ test modules across at least 3 packages. Worker, you are at high risk of running out of turns before pushing. Apply COMMIT DISCIPLINE (wip-commit every ~20 turns / 5 file-edits) aggressively from the start. Land the boot guard + its unit tests as the first commit (highest operator value, smallest surface) so even a partial run delivers something shippable. Run the EXIT CHECKLIST even if the critic rule slice feels incomplete.

Original report

Discovered during the Titan cobay test: a forge-loop worker (probably #124, possibly others) ran pip install -e . from its worktree /tmp/wt-loop-124. This created an editable install in system python's site-packages (/home/hexalgo/.local/lib/python3.12/site-packages/forge_loop -> /tmp/wt-loop-124).

Effects:

  • System python's import forge_loop resolved to the worker's half-finished code, not the uv-managed install at /home/hexalgo/.local/share/uv/tools/forge-loop/.
  • The CTO saw a brainstorm CLI in forge-loop --help that didn't exist on trunk; this false signal led to closing #124 (feat(brainstormer): forge-loop brainstorm CLI (dry-run + --apply)) prematurely.
  • The worker's incomplete code shadowed every subsequent python3 -c "import forge_loop" invocation.
  • When the worker's worktree was reaped, the editable install kept a stale pth pointing nowhere — import forge_loop returned a NAMESPACE package with no __file__.

Cleanup required: pip uninstall forge-loop && rm -rf /home/hexalgo/.local/lib/python3.12/site-packages/forge_loop /home/hexalgo/.local/lib/python3.12/site-packages/roles + uv tool install --reinstall --force ..

Root cause: worker brief allows pip install for dependency setup; no guard against pip install -e . (editable install of the worker's OWN code) which mutates the OPERATOR's python environment outside the worktree boundary.

Original fix proposal: (1) hard rule in briefs/worker.md.tmpl; (2) critic sev1 on pyproject.toml-touching PRs whose worker log shows pip install -e; (3) boot guard on forge-loop run that refuses to start if system python has an editable forge-loop pointing into /tmp/wt-loop-*.

Metadata

Assignees

No one assigned

    Labels

    loop:ready (Loop runner will autonomously attempt this issue)
    po:expanded (PO subagent expanded the issue body)
    priority:p1 (Important, near-term)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
