feat(extraction): EXP-05 H-310 — extend INSTRUCTION_MARKERS for BEAM phrasings #9

Draft
moralespanitz wants to merge 1 commit into experiment/phase2-combined-stack from feature/h310-extend-instruction-markers

@moralespanitz

Summary

A DB diagnostic on the Stage-7 v1 ingestion (2000 facts) showed that only 15 (0.8%) were tagged metadata.fact_role: 'instruction'. The original markers (always X / never Y / from now on) match only strict imperatives. BEAM users phrase instructions softly, so the EXP-05 boost has almost nothing to boost.
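For reference, the diagnostic figure can be reproduced with a simple count. The sketch below is illustrative only: the `StoredFact` shape and the `instructionShare` helper are assumptions, and the sample data is synthetic, shaped to match the 15-of-2000 numbers above rather than pulled from the real database.

```typescript
// Hypothetical shape: only metadata.fact_role is taken from the PR text.
interface StoredFact {
  metadata: { fact_role?: string };
}

// Count facts tagged as instructions and express them as a percentage.
function instructionShare(facts: StoredFact[]): { count: number; percent: number } {
  const count = facts.filter((f) => f.metadata.fact_role === "instruction").length;
  const percent = facts.length === 0 ? 0 : (count * 100) / facts.length;
  return { count, percent };
}

// Synthetic data: 2000 facts, the first 15 tagged as instructions.
const sample: StoredFact[] = [];
for (let i = 0; i < 2000; i++) {
  sample.push(i < 15 ? { metadata: { fact_role: "instruction" } } : { metadata: {} });
}

const share = instructionShare(sample);
console.log(`${share.count} of ${sample.length} facts (${share.percent}%)`);
```

With these synthetic inputs the share is 0.75%, which rounds to the 0.8% quoted in the summary.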

This PR extends the markers to 24 total (12 original preserved + 12 new BEAM soft imperatives) and adds an 11-pattern false-positive filter.
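A minimal sketch of how marker matching with a false-positive filter could work. This is not the PR's implementation: the array names, the `looksLikeInstruction` helper, and the specific regular expressions are assumptions for illustration, and only a handful of the 24 markers and 11 filter patterns are shown.

```typescript
// Hypothetical subset of the original strict markers.
const STRICT_MARKERS: RegExp[] = [
  /\balways\b/i,
  /\bnever\b/i,
  /\bfrom now on\b/i,
  /\bgoing forward\b/i,
];

// Hypothetical subset of the new soft-imperative markers.
const SOFT_MARKERS: RegExp[] = [
  /\bI want\b/i,
  /\bmake sure to\b/i,
  /\bplease\b/i,
  /\bI prefer\b/i,
  /\bI'd like\b/i,
  /\bremember to\b/i,
];

// Hypothetical false-positive filter: phrasings that contain a marker
// but are not instructions (here, a question prefix).
const FALSE_POSITIVES: RegExp[] = [/\bI want to know\b/i];

// Returns true when the text contains a marker and no false-positive pattern.
function looksLikeInstruction(text: string): boolean {
  if (FALSE_POSITIVES.some((p) => p.test(text))) return false;
  return [...STRICT_MARKERS, ...SOFT_MARKERS].some((p) => p.test(text));
}

console.log(looksLikeInstruction("I want responses in bullet points")); // true
console.log(looksLikeInstruction("I want to know how caching works")); // false
```

In this sketch the filter runs first and rejects the whole text, so a sentence that mixes a question prefix with a genuine instruction would be dropped. Whether the real filter works per sentence or per match is not stated in the PR description.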

Stacked on experiment/phase2-combined-stack (PR base) since the EXP-05 commit lives there.

Validation

  • 14 new unit tests (9 positive + 5 FP-prevention), all pass
  • 33 total extraction-enrichment tests pass
  • 8 instruction-boost tests still pass
  • typecheck clean

Risks flagged

  • 'please' is broad
  • 'want' over-catches one-off intents
  • 'prefer' also matches plain preference statements (importance floored to 0.95)

Gated behind the instructionBoostEnabled flag; defaults preserve current behavior.
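The flag gating and the 0.95 importance floor mentioned in the risks could interact roughly as sketched below. Only `instructionBoostEnabled`, `metadata.fact_role`, and the 0.95 value come from this PR description. The `Fact` and `BoostConfig` shapes, the `applyInstructionBoost` function, and the assumption that the flag gates scoring rather than tagging are all hypothetical.

```typescript
// Hypothetical fact shape for illustration.
interface Fact {
  text: string;
  importance: number;
  metadata: { fact_role?: string };
}

// Hypothetical config shape; only the flag name is from the PR.
interface BoostConfig {
  instructionBoostEnabled: boolean;
}

// Floor value quoted in the risks list.
const INSTRUCTION_IMPORTANCE_FLOOR = 0.95;

// Raise importance to the floor for instruction facts, but only when the flag is on.
function applyInstructionBoost(fact: Fact, config: BoostConfig): Fact {
  if (!config.instructionBoostEnabled) return fact; // flag off: behavior unchanged
  if (fact.metadata.fact_role !== "instruction") return fact;
  return { ...fact, importance: Math.max(fact.importance, INSTRUCTION_IMPORTANCE_FLOOR) };
}

const preference: Fact = {
  text: "I prefer tea over coffee",
  importance: 0.4,
  metadata: { fact_role: "instruction" },
};

console.log(applyInstructionBoost(preference, { instructionBoostEnabled: false }).importance); // 0.4
console.log(applyInstructionBoost(preference, { instructionBoostEnabled: true }).importance); // 0.95
```

The example also illustrates the 'prefer' risk: a plain preference that gets tagged as an instruction is lifted from 0.4 to 0.95 once the flag is enabled.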

The full hypothesis audit trail is in atomicmemory-research/memory-research/benchmarks-sprint2/experiments/EXPERIMENT-LEDGER.md (H-310). Stage-7 v5, currently in flight, will measure the IF lift.

…phrasings

DB diagnostic showed only 0.8% of Stage-7 v1 ingest facts (15/2000)
were tagged metadata.fact_role='instruction', because the original
markers list ('always X', 'never Y', 'from now on', 'going forward')
matches strict imperatives but misses BEAM-style soft imperatives:

  - 'I want X'
  - 'make sure to Y'
  - 'please Z'
  - 'I prefer A'
  - 'I'd like B'
  - 'always include C'
  - 'remember to D'
  - ... (full list in the diff)

False-positive prevention:
  - 'I want to know' (question prefix) does not match
  - 'I prefer not to' (negation) handled separately

Targets BEAM IF (currently 0/2 in v1 dryrun). Honcho IF is 0.844, so
closing this gap offers the biggest improvement headroom.

Behind the existing instructionBoostEnabled flag (defaults preserve
current behavior).