Skip to content

fix(context): caveat stale feedback + real injection-contract tests#894

Merged
kokevidaurre merged 1 commit into
developfrom
fix/context-injection-hardening
Jun 15, 2026
Merged

fix(context): caveat stale feedback + real injection-contract tests#894
kokevidaurre merged 1 commit into
developfrom
fix/context-injection-hardening

Conversation

@kokevidaurre

Copy link
Copy Markdown
Contributor

Why

A 2026-06 audit of the Squad Context System (src/lib/run-context.ts) found two problems behind a green suite:

  1. Hollow testsgatherSquadContext asserted nothing real (typeof ctx === 'string', or if (length>0) guards that silently skip when context is empty). Layer order, role gating, strategy.md-as-L1, and budget behavior were unverified.
  2. F1 — stale memory as currentfeedback.md was injected under "act on this first" with no staleness caveat (only state.md had the feat: memory staleness caveats #721 note). Real company runs surfaced 76-day-old feedback presented as current guidance.

What

  • Extract a shared stalenessNote() helper; apply to feedback and state (DRY — removes duplicated inline mtime math).
  • Add 6 fixture-based tests asserting real behavior: action-first order, strategy.md as the L1 Company layer (docs: update context-layer docs — strategy.md replaces priorities.md as L1 #878), role gating (scanner omits feedback/briefing, worker adds feedback, lead adds briefing), budget eviction of late layers, and the stale/fresh feedback caveat.

Verified

  • Full suite green (2018 tests), typecheck clean.
  • On the real repo: 76-day-old feedback now renders (Last updated 76 days ago — verify before relying on this).

Refs #893 — closes the test-gap + F1 boxes. F2 (briefing/founder-context duplication), F3 (founder-context scope), F4 (late strategy.md) remain tracked follow-ups.

Audit (2026-06) of the Squad Context System found the gatherSquadContext
tests asserted nothing real (typeof===string / if(length>0) guards that
silently skip), so layer order, role gating, strategy.md-as-L1, and
budget behavior were unverified — and feedback.md was injected under
'act on this first' with NO staleness caveat (only state.md had the #721
note). Real company runs showed 76-day-old feedback presented as current.

- Extract a shared stalenessNote() helper; apply to feedback AND state
  (DRY; removes the duplicated inline mtime math).
- Add 6 fixture-based tests asserting real behavior: action-first order,
  strategy.md as the L1 Company layer, role gating (scanner/worker/lead),
  budget eviction of late layers, and the stale/fresh feedback caveat.

Verified on the real repo: 76-day-old feedback now renders
'(Last updated 76 days ago — verify before relying on this)'.
Full suite green (2018 tests).

Refs #893 (test gap + F1; F2/F3/F4 remain follow-ups)

Co-Authored-By: Claude <noreply@anthropic.com>
@kokevidaurre kokevidaurre enabled auto-merge (squash) June 15, 2026 15:53
@kokevidaurre kokevidaurre merged commit cce7dd0 into develop Jun 15, 2026
13 checks passed
@codecov-commenter

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a helper function stalenessNote to flag stale memory layers (such as feedback and state files) based on their modification time, and integrates it into the squad context gathering process. It also adds a comprehensive suite of unit tests to verify the context injection contract, including layer ordering, role gating, budget dropping, and staleness caveats. The review feedback suggests clamping the calculated days to a minimum of zero to handle potential clock skew, and using Date objects instead of raw epoch seconds in utimesSync for better platform compatibility in tests.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/lib/run-context.ts
try {
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const mtime = statSync(filePath).mtimeMs;
const daysAgo = Math.floor((Date.now() - mtime) / MS_PER_DAY);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential issues with clock skew or future-dated files (which can result in a negative daysAgo value), it is safer to clamp daysAgo to a minimum of 0 using Math.max(0, ...).

Suggested change
const daysAgo = Math.floor((Date.now() - mtime) / MS_PER_DAY);
const daysAgo = Math.max(0, Math.floor((Date.now() - mtime) / MS_PER_DAY));

Comment thread test/run-context.test.ts
Comment on lines +220 to +221
const tSec = (Date.now() - opts.feedbackMtimeDaysAgo * 86_400_000) / 1000;
utimesSync(feedbackFile, tSec, tSec);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a Date object with utimesSync is more idiomatic and robust than manually calculating epoch seconds as a float, which can occasionally lead to platform-specific truncation or type issues.

Suggested change
const tSec = (Date.now() - opts.feedbackMtimeDaysAgo * 86_400_000) / 1000;
utimesSync(feedbackFile, tSec, tSec);
const date = new Date(Date.now() - opts.feedbackMtimeDaysAgo * 86_400_000);
utimesSync(feedbackFile, date, date);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants