Add a secret-canary regression suite covering every Frame egress path #206

@dgenio

Summary

Introduce a dedicated test module that plants distinctive canary secrets (fake JWTs,
keys, emails, connection strings) in driver outputs and asserts they never appear in
any kernel egress: summary/table/raw Frames, handle expansions, streamed chunks,
trace args/errors, warnings, and adapter-rendered output.

Why this matters

The repo's firewall tests verify individual mechanisms, but the audit found three
independent egress leaks (depth fail-open, expansion bypass, cross-chunk splits —
ISSUES 1–3) precisely because no test asserts the global property "this string
never escapes." A canary suite turns the I-01 guarantee into an executable
invariant and becomes the regression net for every future firewall change.

Current evidence

  • tests/test_firewall_boundary.py, tests/test_redaction.py, tests/test_firewall_stream.py test mechanisms individually; none asserts absence of a planted secret across all paths from one scenario.
  • Demonstrated gaps: firewall/redaction.py depth fail-open; handles.py expand path; per-chunk streaming (firewall/transform.py apply_stream).
  • docs/agent-context/invariants.md documents I-01 as the firewall boundary invariant.
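
The cross-chunk gap listed above can be illustrated with a minimal sketch. It assumes a hypothetical `redact_per_chunk` helper and a made-up canary value; it is not the repo's `apply_stream`, only a stand-in for any redactor that inspects each chunk in isolation.

```python
from typing import Iterable, Iterator

# Made-up canary value, used for illustration only.
CANARY = "sk-canary-4f9a1c7e2b8d"


def redact_per_chunk(chunks: Iterable[str], secret: str) -> Iterator[str]:
    """Hypothetical redactor that looks at one chunk at a time."""
    for chunk in chunks:
        yield chunk.replace(secret, "[REDACTED]")


# Split the canary across two consecutive chunks.
mid = len(CANARY) // 2
chunks = ["token=" + CANARY[:mid], CANARY[mid:] + ";done"]

redacted = list(redact_per_chunk(chunks, CANARY))

# Each chunk looks clean on its own...
per_chunk_clean = all(CANARY not in chunk for chunk in redacted)
# ...but the reassembled stream still contains the full secret.
joined_clean = CANARY not in "".join(redacted)
```

A per-chunk assertion would pass here while the joined output leaks, which is why a canary check on the streaming path probably needs to run against the concatenated stream as well as the individual chunks.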

External context

Canary/taint-style negative assertions are a common way to regression-test
redaction layers.

Proposed implementation

  1. New tests/test_secret_canaries.py: fixture driver returning nested data
    embedding each canary at multiple depths and positions (field names, values,
    inside lists, split across stream chunks).
  2. Helper assert_no_canary(obj) that walks any structure/string and fails on
    any canary fragment.
  3. Exercise: invoke (all response modes) → expand (paged) → explain/trace query →
    stream → adapter render (adapters/openai.py, adapters/anthropic.py).
  4. Mark known-failing paths with xfail(strict=True) referencing ISSUES 1–3 until
    fixed, so the suite lands first and flips to enforcing.
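
Step 2 could look roughly like the sketch below. The canary values, `FRAGMENT_LEN`, and the private helpers are assumptions made for illustration; only the name `assert_no_canary` comes from this issue. Real Frame objects may need a more specific traversal than the generic `vars()` fallback used here.

```python
from typing import Any, Iterator

# Fixed, made-up canaries (illustrative values, not taken from the repo).
CANARIES = (
    "eyJhbGciOiJIUzI1NiJ9.Q0FOQVJZ.c2lnbmF0dXJl",    # fake JWT
    "sk-canary-4f9a1c7e2b8d",                        # fake API key
    "canary.leak@example.invalid",                   # fake email
    "postgres://canary:s3cr3t@db.invalid:5432/app",  # fake connection string
)
FRAGMENT_LEN = 12  # assumed window size; shorter catches more, but risks false positives


def _fragments(canary: str) -> Iterator[str]:
    """Yield every FRAGMENT_LEN-character window so partial leaks are caught."""
    if len(canary) <= FRAGMENT_LEN:
        yield canary
        return
    for i in range(len(canary) - FRAGMENT_LEN + 1):
        yield canary[i:i + FRAGMENT_LEN]


FRAGMENTS = frozenset(f for c in CANARIES for f in _fragments(c))


def _strings(obj: Any) -> Iterator[str]:
    """Yield every string reachable from obj, including dict keys."""
    if obj is None:
        return
    if isinstance(obj, str):
        yield obj
    elif isinstance(obj, bytes):
        yield obj.decode("utf-8", errors="replace")
    elif isinstance(obj, dict):
        for key, value in obj.items():
            yield from _strings(key)
            yield from _strings(value)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            yield from _strings(item)
    elif isinstance(obj, BaseException):
        yield str(obj)
        yield from _strings(obj.args)
    elif hasattr(obj, "__dict__"):
        yield from _strings(vars(obj))  # plain objects: walk their attributes
    else:
        yield str(obj)  # numbers and other scalars


def assert_no_canary(obj: Any) -> None:
    """Fail if any canary fragment appears anywhere inside obj."""
    for text in _strings(obj):
        for fragment in FRAGMENTS:
            assert fragment not in text, (
                f"canary fragment {fragment!r} leaked in {text[:80]!r}"
            )
```

The sliding-window check is one possible reading of "any canary fragment"; the window length trades sensitivity to partial leaks against false positives on ordinary substrings such as `example.invalid`. The walker has no cycle protection, so self-referential structures would need a visited set.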

AI-agent execution notes

  • Inspect first: tests/conftest.py (fixtures), the three firewall test modules, adapters/_base.py render path.
  • Determinism: canaries are fixed strings; no randomness.
  • Edge cases: canary as dict key; canary inside metadata; canary in driver exception text.
  • Do not weaken existing tests; this is additive.
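
The edge cases above could be planted in a fixture payload along these lines. This is a sketch under assumptions: `canary_payload`, `FakeDriverError`, `failing_driver`, and the canary values are hypothetical names chosen for illustration, and the real fixture driver would follow whatever driver interface `tests/conftest.py` exposes.

```python
# Fixed, made-up canaries: deterministic by construction, no randomness.
CANARY_KEY = "sk-canary-4f9a1c7e2b8d"
CANARY_EMAIL = "canary.leak@example.invalid"
CANARY_DSN = "postgres://canary:s3cr3t@db.invalid:5432/app"


def canary_payload() -> dict:
    """Nested driver output with canaries at several depths and positions."""
    return {
        "rows": [
            {"id": 1, "owner": CANARY_EMAIL},         # canary as a plain value
            {"id": 2, "tags": ["ok", [CANARY_KEY]]},  # canary inside nested lists
        ],
        CANARY_KEY: "value stored under a canary key",  # canary as a dict key
        "metadata": {"source": {"dsn": CANARY_DSN}},    # canary inside metadata
    }


class FakeDriverError(RuntimeError):
    """Hypothetical driver failure whose message embeds a canary."""


def failing_driver() -> None:
    """Simulate a driver that leaks a secret through its exception text."""
    raise FakeDriverError(f"connect failed for {CANARY_DSN}")
```

Keeping the payload a pure function of fixed constants means every egress path sees the same planted secrets, so a failure points at the path rather than at the fixture.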

Acceptance criteria

  • The suite covers summary, table, raw (admin), handle expansion, streaming, trace, and adapter outputs from a single scenario.
  • Known leaks are tracked as strict xfails tied to their fix issues.
  • Suite runs in make test with no network.
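
One way to drive every path from a single scenario while tracking known leaks is a parametrized test with strict xfail markers, sketched below. `fake_egress` is a hypothetical stand-in for invoking the kernel, and the choice of which paths are marked is illustrative; the real markers would go on whichever paths ISSUES 1–3 actually affect.

```python
import pytest

CANARY = "sk-canary-4f9a1c7e2b8d"  # made-up canary value


def fake_egress(path: str) -> str:
    """Hypothetical stand-in for a kernel call; leaks on two simulated paths."""
    simulated_leaks = {"expand", "stream"}
    if path in simulated_leaks:
        return f"{path}: {CANARY}"
    return f"{path}: [REDACTED]"


# Known-leaking paths carry a strict xfail so the suite can land before the fixes.
EGRESS_PATHS = [
    pytest.param("summary"),
    pytest.param("table"),
    pytest.param("raw"),
    pytest.param(
        "expand",
        marks=pytest.mark.xfail(strict=True, reason="known leak, see ISSUES 1-3"),
    ),
    pytest.param(
        "stream",
        marks=pytest.mark.xfail(strict=True, reason="known leak, see ISSUES 1-3"),
    ),
    pytest.param("trace"),
    pytest.param("adapter"),
]


@pytest.mark.parametrize("path", EGRESS_PATHS)
def test_no_canary_escapes(path: str) -> None:
    assert CANARY not in fake_egress(path)
```

With `strict=True`, pytest reports an xfail-marked test that unexpectedly passes as a failure, so the marker has to be removed in the same change that fixes the leak.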

Test plan

The suite is the test plan; verify it fails when a redaction line is commented out
locally (mutation sanity check). Run make ci.
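
The mutation sanity check can be pictured with a toy example. It assumes a hypothetical `redact` function and a made-up canary; in the real repo the "mutation" would be a temporarily commented-out redaction line, not a second function.

```python
from typing import Callable

CANARY = "canary.leak@example.invalid"  # made-up canary value


def redact(text: str) -> str:
    """Toy redactor standing in for the firewall."""
    return text.replace(CANARY, "[REDACTED]")


def redact_mutated(text: str) -> str:
    """Same redactor with its redaction step removed (the mutation)."""
    return text


def suite_passes(redactor: Callable[[str], str]) -> bool:
    """Return True when no canary survives the given redactor."""
    output = redactor(f"contact: {CANARY}")
    return CANARY not in output


healthy = suite_passes(redact)          # the intact redactor keeps the suite green
mutated = suite_passes(redact_mutated)  # the mutation must turn the suite red
```

If the suite stayed green under the mutation, it would not be exercising the redaction step at all and could not serve as a regression net.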

Documentation plan

Mention the canary suite in docs/agent-context/review-checklist.md (firewall
changes must keep it green); CHANGELOG Added (tests).

Migration and compatibility notes

Not expected to require migration.

Risks and tradeoffs

Walking every structure adds a negligible amount of test time. Strict xfails
require coordination as fixes land, because each fix must remove its marker in the
same change; that is the point.

Suggested labels

testing, security, reliability
