fix(memos-local-plugin): salvage partially structured L3 abstraction drafts#1672
Open
Sanjays2402 wants to merge 2 commits into
Open
fix(memos-local-plugin): salvage partially structured L3 abstraction drafts#1672Sanjays2402 wants to merge 2 commits into
Sanjays2402 wants to merge 2 commits into
Conversation
…drafts Closes MemTensor#1668 Before: `world_model_generate` aborted with `LLM_OUTPUT_MALFORMED` whenever the LLM returned a slightly-off draft, e.g. - `title` empty / non-string - `inference` / `constraints` returned as a string or non-array - list entries returned as plain strings or `{ body: ... }` shapes - `domain_tags` returned as a comma-joined string This made the L3 pipeline unusable on smaller / non-strict structured-output providers (e.g. `glm-4-flashx-250414`) even though the payload had usable content downstream. After: the inline `validate` callback is removed and `normaliseDraft` salvages the wire format: - `toEntries` accepts: * arrays (canonical) * a single string (becomes one description-only entry) * a single object (wrapped in an array) * per-entry strings (`["foo"]` -> `[{label:"", description:"foo"}]`) * per-entry objects keyed by `body` / `text` / `content` / `detail` / `summary` / `value` instead of `description`, and `name` / `title` / `heading` / `key` instead of `label` * `evidence_ids` / `evidence` aliases, including comma-joined strings - `normaliseTags` accepts: * arrays of strings (canonical) * comma / semicolon / newline-joined strings ("a, b, c" -> ["a","b","c"]) * arrays of `{label|name|tag|value|key}` objects - `deriveTitle` falls back through: cleaned LLM title -> first non-empty inference / environment / constraints label or description -> first non-heading body line -> joined domain tags A soft floor (`assertDraftMinimallyUsable`) still rejects drafts that are fully empty after normalization, so downstream validators continue to get the final say. When the parser had to coerce the wire format, an `l3.abstract.draft_salvaged` info log captures which keys were non-canonical for observability. Tests: 12/12 in `tests/unit/memory/l3/abstract.test.ts` (added 7 new salvage cases covering each scenario from the issue).
Contributor
There was a problem hiding this comment.
Pull request overview
Improves robustness of the L3 abstraction pipeline in apps/memos-local-plugin by normalizing/salvaging partially-structured LLM JSON drafts (instead of failing early on minor schema deviations), while still rejecting drafts that are truly empty after normalization.
Changes:
- Removed strict inline
completeJsonvalidation and added draft normalization + a minimal “soft floor” usability check. - Added coercion for common malformed shapes (singletons, strings,
{body: ...}-style entries, comma-joined tags) plus title derivation fallbacks. - Expanded unit test coverage to exercise the new salvage behaviors and updated the “missing triple” expectation accordingly.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| apps/memos-local-plugin/core/memory/l3/abstract.ts | Adds normalization/salvage logic (entries/tags/title), soft-floor validation, and an observability log when coercion occurs. |
| apps/memos-local-plugin/tests/unit/memory/l3/abstract.test.ts | Updates existing behavior expectation and adds new unit tests covering salvage/derivation cases. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| function collectEvidenceIds(o: Record<string, unknown>): string[] | undefined { | ||
| const raw = o.evidenceIds ?? o.evidence_ids ?? o.evidence; | ||
| if (Array.isArray(raw)) { | ||
| const ids = (raw as unknown[]).filter((s): s is string => typeof s === "string"); |
| * 1. The cleaned LLM-provided title. | ||
| * 2. The first inference / environment / constraints entry's label or | ||
| * description (whichever is non-empty), trimmed to ~80 chars. | ||
| * 3. The first non-heading markdown line of `body`, trimmed. |
Comment on lines
+551
to
+560
| // Canonical shape: array of strings. Also accept comma/whitespace | ||
| // separated strings (`"docker, alpine, pip"`) and arrays that mix | ||
| // strings and `{label}` / `{name}` / `{tag}` objects, since some | ||
| // providers return that shape under structured-output mode. | ||
| let candidates: unknown[]; | ||
| if (Array.isArray(raw)) { | ||
| candidates = raw as unknown[]; | ||
| } else if (typeof raw === "string") { | ||
| candidates = raw.split(/[,;\n]+/); | ||
| } else { |
Comment on lines
+477
to
+481
| function coerceEntry(r: unknown): L3AbstractionDraftEntry | null { | ||
| if (typeof r === "string") { | ||
| const description = sanitizeDerivedMarkdown(r); | ||
| if (!description) return null; | ||
| return { label: "", description }; |
Address Copilot review on PR MemTensor#1672: - collectEvidenceIds: trim each id and drop empty entries in the array branch so providers that return `["po_1 ", " tr_2"]` no longer break UI evidence-chip classification (`startsWith("po_")` was failing on leading whitespace). Mirrors the existing string-branch behavior. - buildBody: skip the empty bold prefix when an entry has no label, so string-only entries render as `- <description>` instead of `- **** \u2014 <description>`. Now that string entries are explicitly supported via coerceEntry, this rendering is reachable. - deriveTitle JSDoc: describe the actual heading/list-prefix-stripping behavior instead of claiming we skip heading lines. - normaliseTags comment: tighten to "comma/semicolon/newline-separated" to match the regex; note that whitespace inside a tag is preserved so multi-word tags survive. Tests: - New: array evidence ids with leading/trailing whitespace and empty strings get trimmed and dropped. - New: a draft mixing labelled and string-only entries renders the string-only entry as a plain bullet (no empty bold, no em-dash). - All 14 abstract.test.ts tests pass; tsc --noEmit clean.
Author
|
Thanks — addressed all four in commit
Tests: added |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1668
Problem
apps/memos-local-pluginabortedworld_model_generatewhenever the LLM returned a draft that was slightly off-schema, even when the payload was clearly salvageable. On smaller / non-strict structured-output providers (e.g.glm-4-flashx-250414), the L3 pipeline failed hard with errors like:l3.abstraction: 'title' must be a non-empty stringl3.abstraction: 'inference' must be an arrayl3.abstraction: 'constraints' must be an arrayThe wire shapes that triggered this in the wild:
titleempty /null/ non-stringinference/constraintsreturned as a string or single object instead of an array["foo"]) or{ body: "..." }shapes instead of{ label, description }domain_tagsreturned as a comma-joined stringFix
The inline
validatecallback incore/memory/l3/abstract.tswas removed; instead,normaliseDraftnow salvages the common malformed wire shapes and a soft floor check rejects only fully-empty drafts.Entry coercion (
toEntries/coerceEntry)Now accepts:
{ label: "", description: <string> }body/text/content/detail/summary/valueinstead ofdescription, andname/title/heading/keyinstead oflabelevidence_ids/evidencealiases (including comma-joined strings)Tag coercion (
normaliseTags)Now accepts:
"a, b, c"->["a","b","c"]){ label | name | tag | value | key }objectsTitle derivation (
deriveTitle)When the LLM omits a usable title, derive one in this order:
inference->environment->constraintsentry's label or descriptionbodydomain_tagsSoft floor (
assertDraftMinimallyUsable)If, after normalization, the draft has no title, no triple entries, no body, and no domain tags, we still throw
LLM_OUTPUT_MALFORMEDso we don't index garbage. Downstream validators continue to get the final say on whether a salvaged draft is good enough to persist.Observability
When the parser had to coerce the wire format, an
l3.abstract.draft_salvagedinfo log captures which keys were non-canonical, so operators can spot providers that consistently need salvaging.Tests
All 12 tests in
tests/unit/memory/l3/abstract.test.tspass (5 original + 7 new):llm_failedonly when even normalisation can't recover anything{body: ...}list entries to canonical{label, description}domain_tagsstring into an arrayThe pre-existing test for "returns llm_failed when the LLM returns missing triple" was rewritten as "salvages missing triple into an empty-but-titled draft" because that behavior is exactly the policy the issue asks us to flip: a draft with a usable title should not be discarded just because the triple is missing.
tsc -p tsconfig.json --noEmitpasses. The remaining failures in the broader unit suite (reward/*,storage/migrator,memory/l2/gain,memory/l3/cluster,pipeline/memory-core) reproduce on plainmainwithout these changes and are unrelated to this PR.Files Touched
apps/memos-local-plugin/core/memory/l3/abstract.ts(drop strict inline validate, addnormaliseDraftsalvaging, soft floor, observability log)apps/memos-local-plugin/tests/unit/memory/l3/abstract.test.ts(rewrite the missing-triple expectation, add 7 salvage tests)