diff --git a/AGENTS.md b/AGENTS.md index c62ce9d..2db47d4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -583,6 +583,21 @@ Canonical state file for AskUserQuestion. Atomic write via tmp+rename. Schema: ### MCP HTTP transport blocklist `codec_config._HTTP_BLOCKED`: `python_exec`, `terminal`, `process_manager`, `pm2_control`, `ax_control`. These skills are NEVER exposed over HTTP MCP. They remain available locally (voice, chat) and over stdio MCP only. +### `file_write` skill path-blocking (Phase 1 Wave 1, PR-1C — closes D-4) + +The MCP-exposed `file_write` skill (`skills/file_write.py`) refuses writes to security-sensitive paths. This is a defense-in-depth layer paired with PR-1A's load-time AST gate: even if `file_write` reached a skill directory, the file wouldn't execute on load — but PR-1C closes the write at the source. + +**Blocked paths (case-insensitive, realpath-resolved):** +- The entire **`~/.codec/`** tree — covers `skills/`, `plugins/`, `auth/`, `oauth_state.json`, `audit.log`, `config.json`, `memory.db`, `agents//*`, `notifications.json`, `pending_questions.json`, `agent_global_grants.json`, `triggers_killed.json`, plus any future state file added by later phases. +- The repo's built-in `/skills/` directory — protects the hash-pinned trusted manifest from PR-1A. +- The macOS system tree — `/System`, `/Library`, `/usr`, `/bin`, `/sbin`, `/etc`, `/var`, `/dev`, `/Volumes` (realpath-resolved so `/etc → /private/etc` aliases work). + +**Audit emission:** every refused write emits a `file_write_blocked` audit event (`source=codec-skill-file-write`, `outcome=error`, `level=warning`, `extra={target_path, requested_path, reason}`) to `~/.codec/audit.log`. Forensic visibility for any MCP-client attempt at a sensitive path. + +**Path resolution:** the parent directory is realpath-resolved before the blocklist comparison so symlink-into-blocked-root traversal attempts can't slip through. `/tmp` and `$HOME` are also realpath-resolved in the final sanity check (this fixed a pre-existing bug where macOS's `/tmp → /private/tmp` alias was tripping the `/private` block). + +**Contributor note:** users edit `~/.codec/triggers.json` or `~/.codec/config.json` through the dashboard PWA or a direct editor, NOT through `file_write`. Legitimate `file_write` targets are `~/Documents`, `~/Desktop`, `/tmp`, `~/Projects`, `~/codec-workspace`, etc. — the user-facing parts of `$HOME`. + ### Skill creation flow — review-and-approve only (Phase 1 Wave 1, PR-1B — closes D-2 + D-3) Skill creation is exclusively via the review-and-approve flow: diff --git a/docs/audits/PHASE-1-CONSOLIDATED-TRIAGE.md b/docs/audits/PHASE-1-CONSOLIDATED-TRIAGE.md index 8a8025c..835e270 100644 --- a/docs/audits/PHASE-1-CONSOLIDATED-TRIAGE.md +++ b/docs/audits/PHASE-1-CONSOLIDATED-TRIAGE.md @@ -146,7 +146,7 @@ If you do all 8: you have layered defenses against D-1 / D-2 / D-3 / D-4 / D-7 u | **D-1** | Security | Skill registry lazy-load = RCE for anyone who can drop a `.py` file | **W1 — CLOSED ([#42](https://github.com/AVADSA25/codec/pull/42), `48ec5d5`)** | | **D-2** | Security | `/api/forge` fetches arbitrary URL → LLM → writes skill, no review gate | **W1 — CLOSED ([#43](https://github.com/AVADSA25/codec/pull/43), `ff16664`)** | | **D-3** | Security | `/api/save_skill` writes directly to skills/ with only substring check | **W1 — CLOSED ([#43](https://github.com/AVADSA25/codec/pull/43), `ff16664`)** | -| **D-4** | Security | `file_write` skill (MCP-exposed) can write to `~/.codec/skills/` | **W1** | +| **D-4** | Security | `file_write` skill (MCP-exposed) can write to `~/.codec/skills/` | **W1 — CLOSED (PR-1C)** | | **D-5** | Security | `permission_gate` accepts path-traversal via `fnmatch` (no realpath) | **W1** | | C-1 | Reliability | `codec.py` daemon ignores SIGINT/SIGTERM; leaks sox + tkinter on every restart | W4 | | C-2 | Reliability | `~/.codec/pwa_response.json` race conditions + no correlation_id | W4 | diff --git a/docs/audits/PHASE-1-SECURITY.md b/docs/audits/PHASE-1-SECURITY.md index 7976982..60838f2 100644 --- a/docs/audits/PHASE-1-SECURITY.md +++ b/docs/audits/PHASE-1-SECURITY.md @@ -102,6 +102,8 @@ No human review gate. No `/api/skill/review` staging. `is_dangerous_skill_code`' ### D-4 — `file_write` skill (MCP-exposed) can write to `~/.codec/skills/` [CRITICAL] **Location:** `skills/file_write.py` (`SKILL_MCP_EXPOSE = True`) + `codec_config.py:_HTTP_BLOCKED` (file_write NOT blocked on HTTP) **CWE / OWASP:** CWE-22 Path Traversal / OWASP A01 Broken Access Control / Agentic A02 Tool Misuse + +> **Closed by PR-1C** (branch `fix/pr1c-file-write-block-roots`). `skills/file_write.py:_is_safe_target` was refactored to (1) realpath the blocklist at module load so macOS `/etc → /private/etc` aliases are caught regardless of which name the caller uses, (2) block the entire `~/.codec/` tree (covers skills, plugins, oauth_state.json, audit.log, config.json, memory.db, agents/, notifications.json, pending_questions.json, agent_global_grants.json, triggers_killed.json — every security-sensitive file at once), (3) block `/skills/` so the built-in skill directory can't be tampered with. Pre-existing bug also fixed: `/tmp` writes were silently failing because realpath resolves to `/private/tmp` and the old code hard-coded `/private` as a blocked root; now `/tmp` and `~` are both realpath-resolved in the sanity check. Defense in depth pairs with PR-1A's load-time gate. 21 tests in `tests/test_file_write.py` cover blocked-path refusal (incl. symlink resolution) and regression on legitimate paths (`~/Documents`, `~/Desktop`, `/tmp`, `~/Projects`, `~/codec-workspace`). **Description:** `file_write` enforces `realpath` resolution + `_BLOCKED_ROOTS` (`/System`, `/Library`, `/etc`, etc.) + `_BLOCKED_FILENAME_PATTERNS` (`.ssh`, `.env`, `secret`, etc.) + `_BLOCKED_EXTS`. But `~/.codec/skills/.py` is: - Under `$HOME` (passes the home check at line 103). - Does not start with any `_BLOCKED_ROOTS` after realpath. diff --git a/skills/.manifest.json b/skills/.manifest.json index e8fdcaf..2f1af77 100644 --- a/skills/.manifest.json +++ b/skills/.manifest.json @@ -29,7 +29,7 @@ "fact_extract.py": "200a3529dd4c83556efea4ad04240716a3a8aec2fc50a9ad08b3bc4b9968df5c", "file_ops.py": "fc70495aae3cbd1c8f3de821bb77e4c22d6fff9cc73c413d9ad7066ef14d78bb", "file_search.py": "4d8798965a7c4ecde76273f0bff25b1d65fd449f5b845cff1de9cb7e1311dfe4", - "file_write.py": "a62534b6882d0d546da7efb3d1d9d26905863977c0c67f7d87cf94c4dd9f2311", + "file_write.py": "ea00ade127ffd43ac33192f334d7be6b9b210fe75ba875df6d89005dfe782ea9", "google_calendar.py": "4a1d9edd26fc0162a0a62bb85f1538d7380aba14d4f8b50cc75792e37ecee8be", "google_docs.py": "75980457cb9304e970e9dcaaa12e2acca51559344d2f022896260a6a884f8318", "google_drive.py": "29ce91f3c5bbb43f67a18268c318fddab98627c4c879f5d76d2faa519274307c", diff --git a/skills/file_write.py b/skills/file_write.py index f0eb1b7..8315a4e 100644 --- a/skills/file_write.py +++ b/skills/file_write.py @@ -51,11 +51,59 @@ _AUDIT_LOG = os.path.expanduser("~/.codec/file_write.log") # ── Path safety ── -# These directory roots are ALWAYS blocked, regardless of who's calling. -_BLOCKED_ROOTS = [ +# System directory roots that are ALWAYS blocked. These are compared after +# the candidate path has been realpath-resolved, so symlinks can't slip +# through. The blocklist itself is also realpath-resolved at module load, +# which on macOS turns `/etc` into `/private/etc`, `/bin` into `/usr/bin`, +# etc. — so the comparison works regardless of which alias the caller uses. +# `/private` is deliberately NOT in this list: macOS realpaths /tmp into +# /private/tmp, and /tmp is a legitimate write target. The specific +# /private/etc and /private/var subdirs are still blocked via the +# realpath-resolved entries below. +_BLOCKED_SYSTEM_ROOTS = [ "/System", "/Library", "/usr", "/bin", "/sbin", "/etc", - "/var", "/private", "/dev", "/Volumes", + "/var", "/dev", "/Volumes", ] + +# Security-sensitive CODEC directories. Per audit finding D-4, the +# file_write skill must NEVER write into these — they govern skill loading, +# plugin lifecycle hooks, agent permission grants, audit-log integrity, +# OAuth tokens, and the API-key-bearing config file. Compare with realpath +# in case ~/.codec is a symlink to a non-default location. +def _codec_blocked_roots() -> list[str]: + """Compute the CODEC-internal paths file_write must never touch. + Resolved at module load — if ~/.codec moves, restart the dashboard.""" + codec_home = os.path.realpath(os.path.expanduser("~/.codec")) + # The whole ~/.codec tree is off-limits to file_write. CODEC's own + # state (skills, plugins, oauth_state.json, config.json, audit.log, + # memory.db, agents/, notifications.json, ...) all live here. Users + # who legitimately need to edit a config file have other tools. + out = [codec_home] + # Built-in skills directory inside the repo — hash-pinned in + # .manifest.json (PR-1A) but defense in depth. + skills_repo = os.path.realpath(os.path.dirname(os.path.abspath(__file__))) + out.append(skills_repo) + return out + + +# Realpath-resolved blocklist. Built once at module load. +def _build_blocked_roots() -> list[str]: + roots: list[str] = [] + for p in _BLOCKED_SYSTEM_ROOTS: + try: + roots.append(os.path.realpath(p)) + except Exception: + roots.append(p) + roots.extend(_codec_blocked_roots()) + return roots + + +_BLOCKED_ROOTS_REAL = _build_blocked_roots() + +# Public alias kept for any external introspection or test that references +# the old name. Same values as _BLOCKED_ROOTS_REAL. +_BLOCKED_ROOTS = _BLOCKED_ROOTS_REAL + # Any filename (case-insensitive substring) in this list is blocked. _BLOCKED_FILENAME_PATTERNS = [ ".ssh", ".gnupg", ".env", "credentials", "secrets", "secret", @@ -66,11 +114,19 @@ # Block extensions that could be executable shells / trust-sensitive. _BLOCKED_EXTS = [".pem", ".key", ".p12", ".pfx", ".keystore"] +# Realpath-resolved allowable scope. `/tmp` realpaths to `/private/tmp` on +# macOS — the home/tmp sanity check must compare against the resolved form +# or every /tmp write trips the /private/var-style realpath. +_TMP_REAL = os.path.realpath("/tmp") +_HOME_REAL = os.path.realpath(os.path.expanduser("~")) + def _is_safe_target(path: str): """Return (True, "") if safe to write; (False, reason) otherwise. - Resolves symlinks via realpath so a symlink into /etc can't slip through. + Resolves symlinks via realpath so a symlink into a blocked root can't + slip through. Blocked roots include the macOS system tree plus all of + `~/.codec/` and `/skills/` (closes audit finding D-4). """ if not path: return False, "Empty path." @@ -84,26 +140,31 @@ def _is_safe_target(path: str): real_parent = parent real_path = os.path.join(real_parent, os.path.basename(expanded)) - for blocked in _BLOCKED_ROOTS: - if real_path == blocked or real_path.startswith(blocked + os.sep): - return False, f"Blocked system path: {blocked}" - + # Filename + extension checks apply globally, regardless of directory. base_lower = os.path.basename(real_path).lower() for pat in _BLOCKED_FILENAME_PATTERNS: if pat in base_lower: return False, f"Blocked filename pattern: {pat!r}" - for ext in _BLOCKED_EXTS: if base_lower.endswith(ext): return False, f"Blocked extension: {ext}" - # Sanity: must be under $HOME or /tmp (broad but not everything). - home = os.path.realpath(os.path.expanduser("~")) - tmp = "/tmp" - if not (real_path.startswith(home + os.sep) or real_path.startswith(tmp + os.sep)): + # Blocked root check (system tree + ~/.codec/ + /skills/). + for blocked in _BLOCKED_ROOTS_REAL: + if real_path == blocked or real_path.startswith(blocked + os.sep): + return False, f"Blocked path: {blocked}" + + # Final sanity: must be under realpath($HOME) or realpath(/tmp). + under_home = ( + real_path == _HOME_REAL or real_path.startswith(_HOME_REAL + os.sep) + ) + under_tmp = ( + real_path == _TMP_REAL or real_path.startswith(_TMP_REAL + os.sep) + ) + if not (under_home or under_tmp): return False, ( f"Target must live under $HOME or /tmp (got: {real_path}). " - "Adjust file_write._BLOCKED_ROOTS if you need wider scope." + "Adjust file_write._BLOCKED_SYSTEM_ROOTS if you need wider scope." ) return True, "" @@ -228,6 +289,32 @@ def run(task: str, context: str = "") -> str: safe, reason = _is_safe_target(path) if not safe: + # Forensic emit: an MCP client (or local caller) just tried to write + # to a sensitive path. Per audit D-4 closure, refusals are audited + # so the operator can grep for `event=file_write_blocked` in + # ~/.codec/audit.log and see what was attempted. + try: + from codec_audit import log_event + expanded = os.path.expanduser(path) + try: + real_path = os.path.realpath(expanded) + except Exception: + real_path = expanded + log_event( + "file_write_blocked", + source="codec-skill-file-write", + message=f"file_write refused {path}: {reason}", + level="warning", + outcome="error", + extra={ + "target_path": real_path, + "requested_path": path, + "reason": reason, + }, + ) + except Exception: + # Audit failure must not mask the refusal. + pass return f"file_write: refused — {reason}" mode_label = _extract_mode(task) diff --git a/tests/test_file_write.py b/tests/test_file_write.py new file mode 100644 index 0000000..c4ed45c --- /dev/null +++ b/tests/test_file_write.py @@ -0,0 +1,259 @@ +"""Tests for skills/file_write.py — verifies the path-safety gate refuses +writes to security-sensitive directories while still allowing legitimate +user paths. + +Closes D-4 (CRITICAL): the file_write skill is MCP-exposed and NOT in +codec_config._HTTP_BLOCKED, so claude.ai over the 30-day OAuth token can +call it. Before this PR, _BLOCKED_ROOTS only listed /System, /etc, etc. — +so the skill happily wrote to ~/.codec/skills/.py, which (combined +with the load-time gate from PR-1A's defense-in-depth) was a write-path +to disk that an attacker could chain. This PR closes the write path at +the skill itself. + +Reference: docs/audits/PHASE-1-SECURITY.md finding D-4. +""" +from __future__ import annotations + +import os +import sys +from pathlib import Path + +REPO = Path(__file__).resolve().parent.parent +sys.path.insert(0, str(REPO)) +sys.path.insert(0, str(REPO / "skills")) + +import file_write # noqa: E402 + + +HOME = os.path.expanduser("~") +CODEC_DIR = os.path.expanduser("~/.codec") +REPO_SKILLS = str(REPO / "skills") + + +# ── Tests that must REFUSE writes (D-4 closure) ─────────────────────────────── + + +def test_refuses_codec_skills_dir(): + """Writing into ~/.codec/skills/ is the D-1 RCE write-path.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "skills", "backdoor.py") + ) + assert not safe, "Must refuse writes to ~/.codec/skills/ (D-4 / D-1 chain)" + assert reason, "Must surface a reason" + + +def test_refuses_codec_plugins_dir(): + """Writing into ~/.codec/plugins/ — plugins wrap every tool call per + CLAUDE.md §3 Plugin lifecycle hooks. Same RCE risk as skills.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "plugins", "innocent.py") + ) + assert not safe, "Must refuse writes to ~/.codec/plugins/" + assert reason + + +def test_refuses_oauth_state_file(): + """OAuth tokens are 30-day bearer credentials per CLAUDE.md §10 don't-touch.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "oauth_state.json") + ) + assert not safe, "Must refuse writes to ~/.codec/oauth_state.json" + assert reason + + +def test_refuses_audit_log(): + """Audit log integrity is the compliance foundation (CLAUDE.md §6).""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "audit.log") + ) + assert not safe, "Must refuse writes to ~/.codec/audit.log" + + +def test_refuses_codec_config_file(): + """config.json contains API keys, dashboard token, PIN hash.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "config.json") + ) + assert not safe, "Must refuse writes to ~/.codec/config.json" + + +def test_refuses_memory_db(): + """memory.db is the SQLite store for conversations + facts.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "memory.db") + ) + assert not safe, "Must refuse writes to ~/.codec/memory.db" + + +def test_refuses_agents_state_file(): + """~/.codec/agents//state.json governs the Phase 3 agent runtime.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "agents", "abc123", "state.json") + ) + assert not safe, "Must refuse writes inside ~/.codec/agents/" + + +def test_refuses_agent_global_grants(): + """Cross-agent allowlist; tampering elevates permission grants.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "agent_global_grants.json") + ) + assert not safe, "Must refuse writes to ~/.codec/agent_global_grants.json" + + +def test_refuses_pending_questions(): + """pending_questions.json — direct edits race ask_user.ask() (CLAUDE.md §10).""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "pending_questions.json") + ) + assert not safe, "Must refuse writes to ~/.codec/pending_questions.json" + + +def test_refuses_triggers_killed(): + """Per-trigger kill state — tampering re-enables muted triggers.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "triggers_killed.json") + ) + assert not safe + + +def test_refuses_repo_skills_dir(): + """/skills/ holds built-in skills; an attacker writing here + contaminates the hash-pinned trusted manifest from PR-1A.""" + safe, reason = file_write._is_safe_target( + os.path.join(REPO_SKILLS, "backdoor.py") + ) + assert not safe, "Must refuse writes to /skills/ (built-in skill dir)" + assert reason + + +def test_refuses_codec_root_itself(): + """Even direct files at ~/.codec/ are off-limits.""" + safe, reason = file_write._is_safe_target( + os.path.join(CODEC_DIR, "evil.txt") + ) + assert not safe, "Must refuse arbitrary writes anywhere under ~/.codec/" + + +def test_refuses_symlinked_path_into_codec(tmp_path): + """An attacker dropping a symlink ~/Documents/sneaky → ~/.codec/skills + must NOT slip past the gate. realpath resolution + blocked-root check + of the resolved path is the fix.""" + # Create a symlink in tmp_path pointing into ~/.codec/skills/ + target = os.path.join(CODEC_DIR, "skills") + link = tmp_path / "sneaky" + try: + os.symlink(target, str(link)) + except OSError: + # Some filesystems don't support symlinks; skip in that case + import pytest + pytest.skip("filesystem does not support symlinks") + + # Now try to write through the symlink + safe, reason = file_write._is_safe_target( + os.path.join(str(link), "backdoor.py") + ) + assert not safe, ( + "Symlink into ~/.codec/skills/ must be resolved by realpath and refused" + ) + + +# ── Tests that must STILL ACCEPT (no regression on legitimate paths) ────────── + + +def test_accepts_documents_dir(): + """The main legitimate write target for claude.ai's file_write usage.""" + safe, reason = file_write._is_safe_target( + os.path.join(HOME, "Documents", "notes", "plan.md") + ) + assert safe, f"~/Documents must remain writable (D-4 regression): {reason}" + + +def test_accepts_desktop_dir(): + safe, reason = file_write._is_safe_target( + os.path.join(HOME, "Desktop", "scratch.txt") + ) + assert safe, f"~/Desktop must remain writable: {reason}" + + +def test_accepts_tmp_dir(): + safe, reason = file_write._is_safe_target("/tmp/codec-tmp.log") + assert safe, f"/tmp must remain writable: {reason}" + + +def test_accepts_arbitrary_home_subdir(tmp_path, monkeypatch): + """Writes to arbitrary user subdirs under $HOME (e.g. ~/Projects/foo.md) + should still work — that's the entire utility of the skill.""" + # Use the real $HOME so the existing `under_home` guard is exercised. + safe, reason = file_write._is_safe_target( + os.path.join(HOME, "Projects", "test-pr1c-only.md") + ) + assert safe, f"~/Projects must remain writable: {reason}" + + +def test_accepts_codec_workspace_subdir(): + """~/codec-workspace is used by the Vibe doSave() flow — must stay open.""" + safe, reason = file_write._is_safe_target( + os.path.join(HOME, "codec-workspace", "snippet.py") + ) + assert safe, f"~/codec-workspace must remain writable: {reason}" + + +# ── Existing protections still active (regression checks) ───────────────────── + + +def test_refuses_etc_passwd(): + """Pre-existing /etc block must still work.""" + safe, _ = file_write._is_safe_target("/etc/passwd") + assert not safe + + +def test_refuses_ssh_key(): + """Pre-existing .ssh/ filename pattern block must still work.""" + safe, _ = file_write._is_safe_target(os.path.join(HOME, ".ssh", "id_rsa")) + assert not safe + + +def test_refuses_env_file(): + """Pre-existing .env filename pattern block must still work.""" + safe, _ = file_write._is_safe_target(os.path.join(HOME, "project", ".env")) + assert not safe + + +# ── Audit emission on blocked write (D-4 closure §3) ────────────────────────── + + +def test_blocked_write_emits_file_write_blocked_audit_event(monkeypatch): + """When file_write refuses a target, it must emit an audit event so the + operator can grep ~/.codec/audit.log for attempted writes to sensitive + paths.""" + captured = [] + + def fake_log_event(event_type, *args, **kwargs): + captured.append({ + "event_type": event_type, + "args": args, + "kwargs": kwargs, + }) + + monkeypatch.setattr("codec_audit.log_event", fake_log_event) + + result = file_write.run( + task=f"path: {os.path.join(CODEC_DIR, 'skills', 'attempt.py')}\n" + "content: print('payload')" + ) + + assert result.startswith("file_write: refused"), ( + f"Expected refusal message, got: {result!r}" + ) + matching = [c for c in captured if c["event_type"] == "file_write_blocked"] + assert len(matching) == 1, ( + f"Expected exactly one file_write_blocked audit event, " + f"got {len(matching)}: {captured}" + ) + extra = matching[0]["kwargs"].get("extra", {}) + assert "target_path" in extra + assert "reason" in extra + assert "skills" in extra["target_path"], ( + f"target_path should reference the attempted sensitive path: {extra}" + )