fix(context): caveat stale feedback + real injection-contract tests#894
Conversation
Audit (2026-06) of the Squad Context System found the gatherSquadContext tests asserted nothing real (typeof===string / if(length>0) guards that silently skip), so layer order, role gating, strategy.md-as-L1, and budget behavior were unverified — and feedback.md was injected under 'act on this first' with NO staleness caveat (only state.md had the #721 note). Real company runs showed 76-day-old feedback presented as current. - Extract a shared stalenessNote() helper; apply to feedback AND state (DRY; removes the duplicated inline mtime math). - Add 6 fixture-based tests asserting real behavior: action-first order, strategy.md as the L1 Company layer, role gating (scanner/worker/lead), budget eviction of late layers, and the stale/fresh feedback caveat. Verified on the real repo: 76-day-old feedback now renders '(Last updated 76 days ago — verify before relying on this)'. Full suite green (2018 tests). Refs #893 (test gap + F1; F2/F3/F4 remain follow-ups) Co-Authored-By: Claude <noreply@anthropic.com>
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Code Review
This pull request introduces a helper function stalenessNote to flag stale memory layers (such as feedback and state files) based on their modification time, and integrates it into the squad context gathering process. It also adds a comprehensive suite of unit tests to verify the context injection contract, including layer ordering, role gating, budget dropping, and staleness caveats. The review feedback suggests clamping the calculated days to a minimum of zero to handle potential clock skew, and using Date objects instead of raw epoch seconds in utimesSync for better platform compatibility in tests.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| try { | ||
| const MS_PER_DAY = 24 * 60 * 60 * 1000; | ||
| const mtime = statSync(filePath).mtimeMs; | ||
| const daysAgo = Math.floor((Date.now() - mtime) / MS_PER_DAY); |
There was a problem hiding this comment.
To prevent potential issues with clock skew or future-dated files (which can result in a negative daysAgo value), it is safer to clamp daysAgo to a minimum of 0 using Math.max(0, ...).
| const daysAgo = Math.floor((Date.now() - mtime) / MS_PER_DAY); | |
| const daysAgo = Math.max(0, Math.floor((Date.now() - mtime) / MS_PER_DAY)); |
| const tSec = (Date.now() - opts.feedbackMtimeDaysAgo * 86_400_000) / 1000; | ||
| utimesSync(feedbackFile, tSec, tSec); |
There was a problem hiding this comment.
Using a Date object with utimesSync is more idiomatic and robust than manually calculating epoch seconds as a float, which can occasionally lead to platform-specific truncation or type issues.
| const tSec = (Date.now() - opts.feedbackMtimeDaysAgo * 86_400_000) / 1000; | |
| utimesSync(feedbackFile, tSec, tSec); | |
| const date = new Date(Date.now() - opts.feedbackMtimeDaysAgo * 86_400_000); | |
| utimesSync(feedbackFile, date, date); |
Why
A 2026-06 audit of the Squad Context System (
src/lib/run-context.ts) found two problems behind a green suite:gatherSquadContextasserted nothing real (typeof ctx === 'string', orif (length>0)guards that silently skip when context is empty). Layer order, role gating, strategy.md-as-L1, and budget behavior were unverified.feedback.mdwas injected under "act on this first" with no staleness caveat (onlystate.mdhad the feat: memory staleness caveats #721 note). Realcompanyruns surfaced 76-day-old feedback presented as current guidance.What
stalenessNote()helper; apply to feedback and state (DRY — removes duplicated inline mtime math).strategy.mdas the L1 Company layer (docs: update context-layer docs — strategy.md replaces priorities.md as L1 #878), role gating (scanner omits feedback/briefing, worker adds feedback, lead adds briefing), budget eviction of late layers, and the stale/fresh feedback caveat.Verified
(Last updated 76 days ago — verify before relying on this).Refs #893 — closes the test-gap + F1 boxes. F2 (briefing/founder-context duplication), F3 (founder-context scope), F4 (late strategy.md) remain tracked follow-ups.