Context-injection audit: hollow tests + stale-memory caveat #893

@kokevidaurre

Audit (2026-06) of the Squad Context System (src/lib/run-context.ts)

Ran the real loader against a real .agents tree across roles. Mechanics are correct: SYSTEM.md (L0) loads separately; strategy.md is the L1 "Company" layer; injection order is action-first (founder-context → alignment → feedback → goals → state → agent → strategy → briefing → cross-squad); role gating matches ROLE_SECTIONS; budget is enforced with graceful skips.
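A minimal sketch of the mechanics described above: layers are considered in the audited order, filtered by role, and skipped once they would exceed the budget. Everything here is hypothetical and for illustration only: the `Layer` shape, `assembleContext`, `findLayer`, and the rough four-characters-per-token estimate are assumptions, not the code in src/lib/run-context.ts. The `allowed` list stands in for whatever ROLE_SECTIONS resolves to for a role, and continuing past an oversized layer (rather than stopping) is one assumed reading of "graceful skips".

```typescript
// Hypothetical layer shape; the real types in run-context.ts may differ.
interface Layer {
  name: string;
  content: string;
}

// Order as reported in the audit: action-first, reference-last.
const INJECTION_ORDER: string[] = [
  "founder-context", "alignment", "feedback", "goals", "state",
  "agent", "strategy", "briefing", "cross-squad",
];

// Crude estimate (about 4 characters per token); an assumption for this sketch.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function findLayer(layers: Layer[], name: string): Layer | undefined {
  for (const layer of layers) {
    if (layer.name === name) return layer;
  }
  return undefined;
}

interface AssemblyResult {
  included: string[];
  skipped: string[];
  tokensUsed: number;
}

// Walk the fixed order, drop role-gated layers, and skip (not fail on)
// any layer that would push the running total over the budget.
function assembleContext(
  layers: Layer[],
  allowed: string[],
  budget: number,
): AssemblyResult {
  const result: AssemblyResult = { included: [], skipped: [], tokensUsed: 0 };
  for (const name of INJECTION_ORDER) {
    const layer = findLayer(layers, name);
    if (layer === undefined || allowed.indexOf(name) === -1) continue;
    const cost = estimateTokens(layer.content);
    if (result.tokensUsed + cost > budget) {
      result.skipped.push(name); // over budget: record and move on
      continue;
    }
    result.included.push(name);
    result.tokensUsed += cost;
  }
  return result;
}
```

Under this model a late layer such as strategy is the one that gets skipped when earlier layers grow, which is the latent risk noted in the findings.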

Findings

  • Test gap: gatherSquadContext had no real behavioral assertions, only expect(typeof ctx).toBe('string') checks and if (length>0) guards that silently skip. Layer order, role gating, L1=strategy, and budget behavior were unverified.
  • F1: stale memory injected as current. feedback.md was injected under "act on this first" with no staleness caveat; only state.md carried the note added in #721 (feat: memory staleness caveats). Real runs showed 76-day-old feedback presented as current.
  • F2: duplication for leads/coo. daily-briefing.md overlaps heavily with founder-context.md, so the same blocks are injected twice (~2k tokens wasted, attention diluted).
  • F3: founder-context.md reaches all roles, including scanner/worker. Confirm intended scope and PII posture.
  • F4 (latent): strategy.md is injected late (reference-last by design), so it is the first to be trimmed if earlier layers grow. Currently safe (40–52% budget utilization).
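To illustrate the test gap, the sketch below contrasts a hollow check with a behavioral one. Both functions are hypothetical and framework-free; `hollowCheck` mimics the guarded-assertion pattern described above, while `layersInOrder` assumes each layer can be located by some marker string in the assembled context (the markers used in practice would depend on how run-context.ts renders its sections).

```typescript
// Hollow: accepts any string, including "", because the only real
// check sits behind a guard that is skipped when the context is empty.
function hollowCheck(ctx: string): boolean {
  if (typeof ctx !== "string") return false;
  if (ctx.length > 0) {
    return ctx.length > 0; // tautology: always true inside the guard
  }
  return true; // empty context silently "passes"
}

// Behavioral: every marker must be present, and each must appear
// after the previous one, so both presence and order are verified.
function layersInOrder(ctx: string, markers: string[]): boolean {
  let cursor = 0;
  for (const marker of markers) {
    const at = ctx.indexOf(marker, cursor);
    if (at === -1) return false; // missing or out of order
    cursor = at + marker.length;
  }
  return true;
}
```

The hollow check cannot fail on an empty or misordered context; the behavioral one fails in both cases, which is the property the fixture-based tests need.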

This PR (F1 + tests)

  • Shared stalenessNote() helper; applied to feedback and state (DRY).
  • 6 real fixture-based tests: order, L1=strategy, role gating, budget-eviction, stale + fresh feedback.
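For illustration, a shared staleness helper might look like the sketch below. This is an assumption about its shape, not the PR's actual code: the signature of `stalenessNote`, the 14-day default threshold, the wording of the caveat, and the `withStaleness` wrapper are all hypothetical.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days elapsed between the file's last update and "now".
function ageInDays(lastModified: Date, now: Date): number {
  return Math.floor((now.getTime() - lastModified.getTime()) / MS_PER_DAY);
}

// Returns "" for fresh content, otherwise a caveat naming the age.
// thresholdDays defaults to 14 here purely as an illustrative value.
function stalenessNote(
  label: string,
  lastModified: Date,
  now: Date,
  thresholdDays: number = 14,
): string {
  const age = ageInDays(lastModified, now);
  if (age <= thresholdDays) return "";
  return `> Note: ${label} was last updated ${age} days ago and may be outdated. Verify before acting on it.`;
}

// Prepends the caveat so the reader sees it before the stale content.
function withStaleness(
  label: string,
  content: string,
  lastModified: Date,
  now: Date,
): string {
  const note = stalenessNote(label, lastModified, now);
  return note === "" ? content : `${note}\n\n${content}`;
}
```

Passing `now` explicitly instead of reading the clock inside the helper keeps it deterministic, so stale and fresh fixtures can be tested without mocking time.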

F2/F3/F4 remain as follow-ups.
