Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -583,6 +583,23 @@ Canonical state file for AskUserQuestion. Atomic write via tmp+rename. Schema:
### MCP HTTP transport blocklist
`codec_config._HTTP_BLOCKED`: `python_exec`, `terminal`, `process_manager`, `pm2_control`, `ax_control`. These skills are NEVER exposed over HTTP MCP. They remain available locally (voice, chat) and over stdio MCP only.

### Agent permission gate + path blocklist (Phase 1 Wave 1, PR-1D — closes D-5 + D-14 + D-16)

`permission_gate` in `codec_agent_runner.py` enforces the Step 9 manifest on every Action. As of PR-1D:

1. **`..` segments rejected outright.** `~/Documents/../../etc/passwd` is refused even if the resulting realpath happens to match a different grant (or none). Previously `fnmatch.fnmatch` glob-matched the raw string and missed the traversal.
2. **Realpath both sides.** Action paths and grant roots are resolved via `os.path.realpath` before comparison — symlinks pointing outside a granted root are caught.
3. **Prefix-on-realpath comparison.** `fnmatch` was replaced with `action_real.startswith(grant_real + os.sep)`. Trade-off: a grant like `~/Documents/*.md` now accepts any file under `~/Documents/`, not just `*.md`. Safety > granularity per the audit recommendation.
4. **Audit emission on rejection.** Every refused action emits `permission_gate_blocked` (`source=codec-agent-runner`, `outcome=error`, `level=warning`, `extra={requested_path, resolved_path, reason}`) before raising `PermissionViolation`. The wrapping `_run_agent` continues to emit `agent_blocked_on_permission` — gate-level + agent-level audit are complementary.

`_PATH_BLOCKLIST_SUBSTRINGS` in `codec_agent_plan.py` is the auto-extract blocklist for paths the user types in a plan description. Extended in PR-1D to cover the full security-sensitive `~/.codec/` set:

`/.ssh`, `/.aws`, `/.gnupg`, `/.config/gh`, `/Library/Keychains`, `/Library/Application Support/com.apple`, `/.codec/secrets`, `/.codec/auth`, `/etc/`, `/var/`, `/private/`, `/System/`, `/usr/local/etc`, **`/.codec/skills`, `/.codec/plugins`, `/.codec/oauth_state.json`, `/.codec/audit.log`, `/.codec/agents`, `/.codec/agent_global_grants.json`, `/.codec/config.json`, `/.codec/memory.db`**.

**Segment-aware matching (D-16 closure).** Each entry is matched as a sequence of consecutive `/`-separated path SEGMENTS — not raw substring. So `~/Documents/notes_ssh/foo.md` no longer false-positive-matches `/.ssh` (the segment is `notes_ssh`, not `.ssh`), but `~/.ssh/config` does. Sequence matching also covers multi-segment entries like `/Library/Application Support/com.apple/<x>`.

**Defense in depth.** Even if an attacker convinces the LLM to draft a plan grant for `~/.codec/skills/**`, the blocklist prevents auto-extraction during plan drafting; the realpath gate catches it at runtime; PR-1A's load-time AST gate catches it at skill load; PR-1C's `file_write` block-roots catches it at the file_write call. Four independent layers — each closes the D-1 chain on its own.

### `file_write` skill path-blocking (Phase 1 Wave 1, PR-1C — closes D-4)

The MCP-exposed `file_write` skill (`skills/file_write.py`) refuses writes to security-sensitive paths. This is a defense-in-depth layer paired with PR-1A's load-time AST gate: even if `file_write` reached a skill directory, the file wouldn't execute on load — but PR-1C closes the write at the source.
Expand Down
55 changes: 53 additions & 2 deletions codec_agent_plan.py
Original file line number Diff line number Diff line change
Expand Up @@ -750,15 +750,66 @@ def _new_agent_id() -> str:
r")"
)

# Never auto-grant these — keep approval friction
# Never auto-grant these — keep approval friction.
#
# Closes audit findings D-14 (missing skill/plugin/auth/oauth paths) and
# D-16 (anchorless substring match). Each entry is `/`-separated and matched
# as a sequence of consecutive path SEGMENTS — not raw substring. So
# `~/Documents/notes_ssh/` does NOT match `/.ssh` (the segment is
# `notes_ssh`, not `.ssh`), but `~/.ssh/config` does (segment `.ssh`).
_PATH_BLOCKLIST_SUBSTRINGS = (
# Pre-D-14 entries:
"/.ssh", "/.aws", "/.gnupg", "/.config/gh", "/Library/Keychains",
"/Library/Application Support/com.apple",
"/.codec/secrets", "/.codec/auth",
"/etc/", "/var/", "/private/", "/System/", "/usr/local/etc",
# D-14 closure: every security-sensitive ~/.codec/* path. Auto-grant
# to any of these chains to D-1 RCE (skill drop → restart → exec).
"/.codec/skills",
"/.codec/plugins",
"/.codec/oauth_state.json",
"/.codec/audit.log",
"/.codec/agents",
"/.codec/agent_global_grants.json",
"/.codec/config.json",
"/.codec/memory.db",
)


def _path_segments_match(path: str, blocked_pattern: str) -> bool:
"""Segment-aware match (D-16 closure). Returns True iff
`blocked_pattern`'s `/`-separated segments appear consecutively in
`path`'s segments. `path` is expanduser'd + normalized (collapses
`..` and `.`) before comparison.

Examples:
_path_segments_match("~/.ssh/config", "/.ssh") → True
_path_segments_match("~/Documents/notes_ssh/", "/.ssh") → False
_path_segments_match("~/.codec/skills/x.py", "/.codec/skills") → True
_path_segments_match("/etc/passwd", "/etc/") → True
"""
try:
expanded = os.path.expanduser(path)
normalized = os.path.normpath(expanded)
path_parts = normalized.split(os.sep)
except Exception:
return True # fail-safe: treat unparseable paths as blocked
blocked_parts = [s for s in blocked_pattern.split("/") if s]
if not blocked_parts:
return False
n = len(blocked_parts)
for i in range(len(path_parts) - n + 1):
if path_parts[i:i + n] == blocked_parts:
return True
return False


def _is_path_blocklisted(path: str) -> bool:
"""True iff the path traverses any segment-sequence in
`_PATH_BLOCKLIST_SUBSTRINGS`. D-14 + D-16 closure."""
return any(_path_segments_match(path, b) for b in _PATH_BLOCKLIST_SUBSTRINGS)


def _normalize_path(p: str) -> str:
p = p.rstrip(".,;:!?\"')\n")
if p.endswith("/"):
Expand Down Expand Up @@ -788,7 +839,7 @@ def extract_user_paths(description: str) -> tuple[list[str], list[str]]:
raw = _normalize_path(m.group("path"))
if not raw or raw in seen:
continue
if any(b in raw for b in _PATH_BLOCKLIST_SUBSTRINGS):
if _is_path_blocklisted(raw):
continue
seen.add(raw)

Expand Down
96 changes: 87 additions & 9 deletions codec_agent_runner.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@
"""
from __future__ import annotations

import fnmatch
import json
import logging
import os
Expand Down Expand Up @@ -92,12 +91,95 @@ def __init__(self, reason: str, needed: str, message: str = ""):


# ── Permission gate ───────────────────────────────────────────────────────────

def _emit_gate_blocked(action_path: str, reason: str, agent_id: str = "") -> None:
"""Emit a `permission_gate_blocked` audit event on rejection. Forensic
visibility per audit D-5 closure — operators can grep ~/.codec/audit.log
for blocked-action attempts. Never raises (audit failure must not mask
the underlying refusal)."""
try:
from codec_audit import log_event
try:
real = os.path.realpath(os.path.expanduser(action_path)) if action_path else ""
except Exception:
real = action_path or ""
log_event(
"permission_gate_blocked",
source="codec-agent-runner",
message=f"permission_gate refused {action_path!r}: {reason}",
level="warning",
outcome="error",
extra={
"requested_path": action_path,
"resolved_path": real,
"reason": reason,
"agent_id": agent_id,
},
)
except Exception:
pass


def _path_allowed(action_path: str, grants: Any) -> tuple[bool, str]:
"""Return (allowed, reason) for an action path against a set of grant
patterns (originally fnmatch-style, e.g. `~/Documents/**`).

Closes audit D-5 — three layered checks:
1. Reject `..` segments outright (no path-traversal bypass).
2. Realpath the action so symlinks are resolved.
3. Match against realpath'd grant roots (the substring before the
first glob char). Acceptance = action's realpath is at or under
the grant's realpath root.

Trade-off vs. raw fnmatch: a grant pattern like `~/Documents/*.md`
now accepts any file under realpath(~/Documents/), not just `*.md`.
The audit explicitly recommends this (prefix-on-realpath over
fnmatch) — safety > granularity.
"""
if not action_path:
return False, "empty_path"

# Reject .. anywhere in the path. expanduser is enough here — we don't
# need realpath to detect the segment "..".
if ".." in Path(os.path.expanduser(action_path)).parts:
return False, "path_traversal"

try:
action_real = os.path.realpath(os.path.expanduser(action_path))
except (OSError, RuntimeError, ValueError):
return False, "realpath_failed"

for grant in grants:
if not grant:
continue
grant_expanded = os.path.expanduser(grant)
# Strip glob suffix to find the directory root. Examples:
# "~/Documents/**" → "~/Documents"
# "~/Documents/*.md" → "~/Documents"
# "~/Documents" → "~/Documents"
glob_idx = grant_expanded.find("*")
grant_root = (grant_expanded[:glob_idx] if glob_idx >= 0
else grant_expanded).rstrip(os.sep) or os.sep
try:
grant_real = os.path.realpath(grant_root)
except (OSError, RuntimeError, ValueError):
continue
if action_real == grant_real or action_real.startswith(grant_real + os.sep):
return True, ""

return False, "not_under_grant"


def permission_gate(action: Action, agent_grants: Dict[str, Any],
global_grants: Dict[str, Any]) -> None:
"""The core Step 9 enforcement. Walks the action's resource use,
checks the union of per-agent grants and global allowlist. Raises
PermissionViolation on any gap.

Path checks use `_path_allowed` (realpath + dotdot rejection) — closes
audit finding D-5. Rejections emit a `permission_gate_blocked` audit
event before the exception so the operator gets forensic visibility.

Note: destructive ops fall through to strict_consent_gate (Step 3
§1.7) — even if pre-approved by the user. That's the universal
floor; permission_gate alone is not enough.
Expand All @@ -114,12 +196,9 @@ def permission_gate(action: Action, agent_grants: Dict[str, Any],
if action.touches_path:
write_paths = (set(agent_grants.get("write_paths", [])) |
set(global_grants.get("write_paths", [])))
# fnmatch supports glob patterns the LLM puts in manifest.
# Expand ~ on both sides so "~/Documents/x" matches "~/Documents/**".
action_path_abs = os.path.expanduser(action.path)
ok = any(fnmatch.fnmatch(action_path_abs, os.path.expanduser(p))
for p in write_paths)
ok, reason = _path_allowed(action.path, write_paths)
if not ok:
_emit_gate_blocked(action.path, reason)
raise PermissionViolation("path_not_authorized", action.path)

if action.reads_path and action.read_path:
Expand All @@ -129,10 +208,9 @@ def permission_gate(action: Action, agent_grants: Dict[str, Any],
# must be able to read it back (verify writes, read prior output, etc.).
write_paths_implicit = (set(agent_grants.get("write_paths", [])) |
set(global_grants.get("write_paths", [])))
action_read_abs = os.path.expanduser(action.read_path)
ok = any(fnmatch.fnmatch(action_read_abs, os.path.expanduser(p))
for p in read_paths | write_paths_implicit)
ok, reason = _path_allowed(action.read_path, read_paths | write_paths_implicit)
if not ok:
_emit_gate_blocked(action.read_path, reason)
raise PermissionViolation("read_path_not_authorized", action.read_path)

if action.network_call:
Expand Down
2 changes: 1 addition & 1 deletion docs/audits/PHASE-1-CONSOLIDATED-TRIAGE.md
Original file line number Diff line number Diff line change
Expand Up @@ -147,7 +147,7 @@ If you do all 8: you have layered defenses against D-1 / D-2 / D-3 / D-4 / D-7 u
| **D-2** | Security | `/api/forge` fetches arbitrary URL → LLM → writes skill, no review gate | **W1 — CLOSED ([#43](https://github.com/AVADSA25/codec/pull/43), `ff16664`)** |
| **D-3** | Security | `/api/save_skill` writes directly to skills/ with only substring check | **W1 — CLOSED ([#43](https://github.com/AVADSA25/codec/pull/43), `ff16664`)** |
| **D-4** | Security | `file_write` skill (MCP-exposed) can write to `~/.codec/skills/` | **W1 — CLOSED ([#45](https://github.com/AVADSA25/codec/pull/45), `0065d90`)** |
| **D-5** | Security | `permission_gate` accepts path-traversal via `fnmatch` (no realpath) | **W1** |
| **D-5** | Security | `permission_gate` accepts path-traversal via `fnmatch` (no realpath) | **W1 — CLOSED (PR-1D)** |
| C-1 | Reliability | `codec.py` daemon ignores SIGINT/SIGTERM; leaks sox + tkinter on every restart | W4 |
| C-2 | Reliability | `~/.codec/pwa_response.json` race conditions + no correlation_id | W4 |
| C-3 | Reliability | `notifications.json` has 3 writers with 3 different write semantics | W4 |
Expand Down
6 changes: 6 additions & 0 deletions docs/audits/PHASE-1-SECURITY.md
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,8 @@ The skill is `SKILL_MCP_EXPOSE = True` and NOT in `_HTTP_BLOCKED` (`codec_config
### D-5 — `permission_gate` accepts path-traversal strings via fnmatch [CRITICAL]
**Location:** `codec_agent_runner.py:95-142` (`permission_gate`)
**CWE / OWASP:** CWE-22 Path Traversal / CWE-23 Relative Path Traversal / Agentic A06 Excessive Agency

> **Closed by PR-1D** (branch `fix/pr1d-permission-gate-realpath`). `permission_gate` was refactored: (1) reject `..` segments outright before any matching, (2) `os.path.realpath` both sides of every comparison, (3) `fnmatch` replaced with `action_real.startswith(grant_real + os.sep)`. New helper `_path_allowed` consolidates the logic for both `touches_path` and `reads_path`. Rejections emit `permission_gate_blocked` audit events (source=`codec-agent-runner`, outcome=`error`, level=`warning`, extra={`requested_path`, `resolved_path`, `reason`, `agent_id`}) before raising `PermissionViolation`. 5 new tests in `tests/test_agent_runner.py` cover dotdot rejection (write + read paths), symlink-outside-grant rejection, symlink-within-grant acceptance, and audit emission. All 7 pre-existing permission_gate tests still pass.
**Description:** The Step 9 permission gate checks `action.path` via `fnmatch.fnmatch(os.path.expanduser(action.path), os.path.expanduser(p))`. `expanduser` resolves `~`, but does NOT resolve `..` or symlinks. `fnmatch` is glob-style and does NOT treat `..` specially. So:
- Plan grant: `~/Documents/**`
- Action path: `~/Documents/../../etc/passwd`
Expand Down Expand Up @@ -268,6 +270,8 @@ The skill is `SKILL_MCP_EXPOSE = True` and not in `_HTTP_BLOCKED`, so an attacke
### D-14 — `_PATH_BLOCKLIST_SUBSTRINGS` misses `~/.codec/skills` and `~/.codec/plugins` [MEDIUM]
**Location:** `codec_agent_plan.py:754-759`
**CWE / OWASP:** CWE-20 Improper Input Validation / Agentic A06 Excessive Agency

> **Closed by PR-1D** (branch `fix/pr1d-permission-gate-realpath`). `_PATH_BLOCKLIST_SUBSTRINGS` extended with 8 new entries: `/.codec/skills`, `/.codec/plugins`, `/.codec/oauth_state.json`, `/.codec/audit.log`, `/.codec/agents`, `/.codec/agent_global_grants.json`, `/.codec/config.json`, `/.codec/memory.db`. The LLM-drafted plan auto-extract path now drops any user-typed path landing in these directories — chain to D-1 RCE via auto-granted plan write_paths is closed. 7 parametrized tests in `tests/test_agent_plan.py` cover each new entry.
**Description:** When the user types a path in their project description (e.g. "save outputs to ~/Documents/foo"), `extract_user_paths` auto-grants the path in the manifest. The blocklist excludes `/.ssh`, `/.aws`, `/Library/Keychains`, `/etc/`, `/var/`, etc., AND `/.codec/secrets`, `/.codec/auth`. But it does NOT exclude `/.codec/skills`, `/.codec/plugins`, `/.codec/oauth_state.json`, or the repo's `skills/` directory. An LLM-drafted plan based on a user's vague description that mentions `~/.codec/skills/util/` would auto-grant write to that path → next agent run can drop a skill → D-1.
**Exploit chain:** User says "build me a skill that does X, save it to ~/.codec/skills/x_skill.py". Plan auto-grants `~/.codec/skills/x_skill.py/**`. Agent writes payload. Restart → RCE.
**Impact:** Bypass of the skill review gate via the auto-extract path.
Expand Down Expand Up @@ -301,6 +305,8 @@ Bonus: `auth_pin_hash` uses unsalted SHA-256. A 4-6-digit PIN is rainbow-table'd
### D-16 — `extract_user_paths` blocklist substring match is anchorless [MEDIUM]
**Location:** `codec_agent_plan.py:791`
**CWE / OWASP:** CWE-20 Improper Input Validation

> **Closed by PR-1D** (branch `fix/pr1d-permission-gate-realpath`). Replaced `any(b in raw for b in _PATH_BLOCKLIST_SUBSTRINGS)` with `_is_path_blocklisted(raw)` which calls `_path_segments_match` on each entry. `_path_segments_match` checks whether the blocklist pattern's `/`-separated segments appear as a CONSECUTIVE SUBSEQUENCE of the path's segments (path is `expanduser` + `normpath`-ed first to collapse `..` and `.`). So `~/Documents/notes_ssh/foo.md` no longer matches `/.ssh` as a false positive (the segment is `notes_ssh`, not `.ssh`), but `~/.ssh/config` still does. Three regression-coverage tests in `tests/test_agent_plan.py` (segment-aware false-positive negative, still-blocks-real-ssh, legitimate-user-paths-pass-through).
**Description:** `if any(b in raw for b in _PATH_BLOCKLIST_SUBSTRINGS): continue` — the substring `/.ssh` is checked against the raw path. But variations like `~/Documents/.ssh-backup/` would match (contains `/.ssh`), correctly blocking. However, `~/Documents/notes_ssh/` does NOT contain `/.ssh` (no leading dot+s before), so it passes — which is correct, but it shows the rule is purely substring. More worryingly, `~/Documents/passthrough/` doesn't contain any blocklist substring even though `passthrough` semantically signals "this is sensitive" (just an example — but the more important miss is `/.codec/skills` from D-14).
**Impact:** Sub-finding of D-14. False sense of security from a fragile substring filter.
**Recommended fix:** Convert to a regex-anchored or path-segment-aware check. Use `Path(raw).parts` and check for any forbidden segment.
Expand Down
Loading
Loading