fix(memos-local-plugin): salvage partially structured L3 abstraction drafts by Sanjays2402 · Pull Request #1672 · MemTensor/MemOS

Sanjays2402 · 2026-05-09T19:20:20Z

Problem

apps/memos-local-plugin aborted world_model_generate whenever the LLM returned a draft that was slightly off-schema, even when the payload was clearly salvageable. On smaller / non-strict structured-output providers (e.g. glm-4-flashx-250414), the L3 pipeline failed hard with errors like:

l3.abstraction: 'title' must be a non-empty string
l3.abstraction: 'inference' must be an array
l3.abstraction: 'constraints' must be an array

The wire shapes that triggered this in the wild:

title empty / null / non-string
inference / constraints returned as a string or single object instead of an array
list entries returned as plain strings (e.g. ["foo"]) or { body: "..." } shapes instead of { label, description }
domain_tags returned as a comma-joined string

Fix

The inline validate callback in core/memory/l3/abstract.ts was removed; instead, normaliseDraft now salvages the common malformed wire shapes and a soft floor check rejects only fully-empty drafts.

Entry coercion (`toEntries` / `coerceEntry`)

Now accepts:

arrays (canonical)
a single string -> one description-only entry
a single object -> wrapped in an array
per-entry strings -> { label: "", description: <string> }
per-entry objects keyed by body / text / content / detail / summary / value instead of description, and name / title / heading / key instead of label
evidence_ids / evidence aliases (including comma-joined strings)
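
The accepted shapes listed above can be sketched roughly as follows. This is an illustrative approximation, not the code in abstract.ts: the `DraftEntry` interface, the `firstString` helper, and the key-priority order are assumptions, the PR's `sanitizeDerivedMarkdown` step is replaced by a plain `trim()`, and evidence-id handling is left out.

```typescript
// Hypothetical sketch of entry coercion; names other than toEntries /
// coerceEntry are invented for illustration.
interface DraftEntry {
  label: string;
  description: string;
}

// Alias keys named in the PR description; the priority order is assumed.
const LABEL_KEYS = ["label", "name", "title", "heading", "key"];
const DESCRIPTION_KEYS = ["description", "body", "text", "content", "detail", "summary", "value"];

// Return the first non-empty string stored under any of the given keys.
function firstString(o: Record<string, unknown>, keys: string[]): string {
  for (const k of keys) {
    const v = o[k];
    if (typeof v === "string" && v.trim() !== "") return v.trim();
  }
  return "";
}

function coerceEntry(raw: unknown): DraftEntry | null {
  // Plain string -> description-only entry.
  if (typeof raw === "string") {
    const description = raw.trim();
    return description ? { label: "", description } : null;
  }
  if (raw === null || typeof raw !== "object" || Array.isArray(raw)) return null;
  const o = raw as Record<string, unknown>;
  const label = firstString(o, LABEL_KEYS);
  const description = firstString(o, DESCRIPTION_KEYS);
  // Drop entries that carry no usable text at all.
  if (!label && !description) return null;
  return { label, description };
}

function toEntries(raw: unknown): DraftEntry[] {
  if (raw === undefined || raw === null) return [];
  // A single string or object is wrapped so it goes through the same path.
  const list: unknown[] = Array.isArray(raw) ? raw : [raw];
  return list.map(coerceEntry).filter((e): e is DraftEntry => e !== null);
}
```

Under these assumptions, `toEntries("foo")`, `toEntries(["foo"])`, and `toEntries({ body: "foo" })` all yield a single entry with an empty label and the description `"foo"`.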

Tag coercion (`normaliseTags`)

Now accepts:

arrays of strings (canonical)
comma / semicolon / newline-joined strings ("a, b, c" -> ["a","b","c"])
arrays of { label | name | tag | value | key } objects
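
A minimal sketch of this tag coercion, assuming the split regex shown in the diff (`/[,;\n]+/`). The de-duplication, the empty result for unsupported inputs, and the `TAG_KEYS` lookup order are assumptions added for illustration and may differ from the real `normaliseTags`.

```typescript
// Hypothetical sketch of tag coercion.
const TAG_KEYS = ["label", "name", "tag", "value", "key"];

function normaliseTags(raw: unknown): string[] {
  let candidates: unknown[];
  if (Array.isArray(raw)) {
    candidates = raw;
  } else if (typeof raw === "string") {
    // Split on commas, semicolons, and newlines only, so whitespace
    // inside a tag ("machine learning") is preserved.
    candidates = raw.split(/[,;\n]+/);
  } else {
    candidates = []; // assumption: anything else yields no tags
  }

  const tags: string[] = [];
  for (const c of candidates) {
    let text = "";
    if (typeof c === "string") {
      text = c;
    } else if (c !== null && typeof c === "object") {
      const o = c as Record<string, unknown>;
      for (const k of TAG_KEYS) {
        const v = o[k];
        if (typeof v === "string" && v.trim() !== "") {
          text = v;
          break;
        }
      }
    }
    const tag = text.trim();
    // Assumption: duplicates are dropped while preserving first-seen order.
    if (tag && !tags.includes(tag)) tags.push(tag);
  }
  return tags;
}
```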

Title derivation (`deriveTitle`)

When the LLM omits a usable title, derive one in this order:

Cleaned LLM-provided title
First non-empty inference -> environment -> constraints entry's label or description
First non-empty line of body, with leading heading / list prefixes (#, -, *, +, 1.) stripped
Joined domain_tags
Empty (soft floor will reject if everything else is also empty)
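
The fallback order can be sketched as below. The `TitleSource` shape, the `clip` helper, the hard 80-character cut, the prefix-stripping regex, and the `", "` tag separator are all assumptions for illustration. For the body step, the sketch follows the behavior described in the author's follow-up comment (heading and list prefixes are stripped rather than the line being skipped).

```typescript
// Hypothetical sketch of the title fallback chain.
interface Entry {
  label: string;
  description: string;
}

interface TitleSource {
  title?: unknown;
  inference: Entry[];
  environment: Entry[];
  constraints: Entry[];
  body: string;
  domainTags: string[];
}

const MAX_TITLE = 80; // the doc comment says "~80 chars"; exact rule assumed

function clip(s: string): string {
  const t = s.trim();
  return t.length > MAX_TITLE ? t.slice(0, MAX_TITLE).trimEnd() : t;
}

function deriveTitle(d: TitleSource): string {
  // 1. Cleaned LLM-provided title.
  if (typeof d.title === "string" && d.title.trim() !== "") return clip(d.title);

  // 2. First non-empty label or description: inference, environment, constraints.
  for (const list of [d.inference, d.environment, d.constraints]) {
    for (const e of list) {
      const text = e.label.trim() || e.description.trim();
      if (text) return clip(text);
    }
  }

  // 3. First non-empty body line, with a leading "#", "-", "*", "+" or "1."
  //    prefix removed.
  for (const line of d.body.split("\n")) {
    const stripped = line.replace(/^\s*(?:#+|[-*+]|\d+\.)\s+/, "").trim();
    if (stripped) return clip(stripped);
  }

  // 4. Joined domain tags; 5. empty string when there are none.
  return clip(d.domainTags.join(", "));
}
```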

Soft floor (`assertDraftMinimallyUsable`)

If, after normalization, the draft has no title, no triple entries, no body, and no domain tags, we still throw LLM_OUTPUT_MALFORMED so we don't index garbage. Downstream validators continue to get the final say on whether a salvaged draft is good enough to persist.
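
As an illustration of how narrow this floor is, here is a minimal sketch. The `NormalisedDraft` shape, the `LlmOutputMalformedError` class, and the error message are hypothetical; only the `LLM_OUTPUT_MALFORMED` code and the four emptiness conditions come from the description above.

```typescript
// Hypothetical sketch of the soft floor check.
class LlmOutputMalformedError extends Error {
  readonly code = "LLM_OUTPUT_MALFORMED";
  constructor(message: string) {
    super(message);
    this.name = "LlmOutputMalformedError";
  }
}

interface NormalisedDraft {
  title: string;
  environment: unknown[];
  inference: unknown[];
  constraints: unknown[];
  body: string;
  domainTags: string[];
}

function assertDraftMinimallyUsable(d: NormalisedDraft): void {
  const hasTriple =
    d.environment.length + d.inference.length + d.constraints.length > 0;
  const hasAnything =
    d.title.trim() !== "" ||
    hasTriple ||
    d.body.trim() !== "" ||
    d.domainTags.length > 0;
  // Reject only when every field is empty; anything else is left to the
  // downstream validators.
  if (!hasAnything) {
    throw new LlmOutputMalformedError(
      "l3.abstraction: draft is empty after normalisation", // message assumed
    );
  }
}
```

A draft with nothing but a single domain tag passes this check, which is why the stricter decision about persistence stays with the downstream validators.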

Observability

When the parser had to coerce the wire format, an l3.abstract.draft_salvaged info log captures which keys were non-canonical, so operators can spot providers that consistently need salvaging.
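
One way such a log could be driven is sketched below. Only the event name `l3.abstract.draft_salvaged` comes from the PR; the `findNonCanonicalKeys` and `reportSalvage` helpers, the `nonCanonicalKeys` payload field, the logger signature, and the rule that a missing key counts as non-canonical are assumptions.

```typescript
// Hypothetical sketch: detect which top-level keys needed coercion.
const LIST_KEYS = ["environment", "inference", "constraints"] as const;

type InfoLogger = (event: string, payload: Record<string, unknown>) => void;

function isCanonicalEntry(e: unknown): boolean {
  if (e === null || typeof e !== "object" || Array.isArray(e)) return false;
  const o = e as Record<string, unknown>;
  return typeof o.label === "string" && typeof o.description === "string";
}

function findNonCanonicalKeys(raw: Record<string, unknown>): string[] {
  const keys: string[] = [];
  const title = raw.title;
  if (typeof title !== "string" || title.trim() === "") keys.push("title");
  for (const k of LIST_KEYS) {
    const v = raw[k];
    // Canonical shape: an array whose entries all have label + description.
    if (!Array.isArray(v) || !v.every(isCanonicalEntry)) keys.push(k);
  }
  const tags = raw.domain_tags;
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) {
    keys.push("domain_tags");
  }
  return keys;
}

// Emit the info log only when at least one key had to be salvaged.
function reportSalvage(raw: Record<string, unknown>, log: InfoLogger): string[] {
  const keys = findNonCanonicalKeys(raw);
  if (keys.length > 0) log("l3.abstract.draft_salvaged", { nonCanonicalKeys: keys });
  return keys;
}
```

Passing the logger in as a parameter keeps the sketch independent of the plugin's real logging API, which is not shown in this PR page.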

Tests

All 12 tests in tests/unit/memory/l3/abstract.test.ts pass (5 original + 7 new):

salvages missing triple into an empty-but-titled draft instead of failing
returns llm_failed only when even normalisation can't recover anything
salvages string list entries into description-only entries
salvages {body: ...} list entries to canonical {label, description}
derives a title from inference when the LLM left it blank
falls back to domain tags when title and triple are unhelpful
splits a comma-joined domain_tags string into an array

The pre-existing test for "returns llm_failed when the LLM returns missing triple" was rewritten as "salvages missing triple into an empty-but-titled draft" because that behavior is exactly the policy the issue asks us to flip: a draft with a usable title should not be discarded just because the triple is missing.

tsc -p tsconfig.json --noEmit passes. The remaining failures in the broader unit suite (reward/*, storage/migrator, memory/l2/gain, memory/l3/cluster, pipeline/memory-core) reproduce on plain main without these changes and are unrelated to this PR.

Files Touched

apps/memos-local-plugin/core/memory/l3/abstract.ts (drop strict inline validate, add normaliseDraft salvaging, soft floor, observability log)
apps/memos-local-plugin/tests/unit/memory/l3/abstract.test.ts (rewrite the missing-triple expectation, add 7 salvage tests)

…drafts

Closes MemTensor#1668

Before: `world_model_generate` aborted with `LLM_OUTPUT_MALFORMED` whenever the LLM returned a slightly-off draft, e.g.

- `title` empty / non-string
- `inference` / `constraints` returned as a string or non-array
- list entries returned as plain strings or `{ body: ... }` shapes
- `domain_tags` returned as a comma-joined string

This made the L3 pipeline unusable on smaller / non-strict structured-output providers (e.g. `glm-4-flashx-250414`) even though the payload had usable content downstream.

After: the inline `validate` callback is removed and `normaliseDraft` salvages the wire format:

- `toEntries` accepts:
  * arrays (canonical)
  * a single string (becomes one description-only entry)
  * a single object (wrapped in an array)
  * per-entry strings (`["foo"]` -> `[{label:"", description:"foo"}]`)
  * per-entry objects keyed by `body` / `text` / `content` / `detail` / `summary` / `value` instead of `description`, and `name` / `title` / `heading` / `key` instead of `label`
  * `evidence_ids` / `evidence` aliases, including comma-joined strings
- `normaliseTags` accepts:
  * arrays of strings (canonical)
  * comma / semicolon / newline-joined strings ("a, b, c" -> ["a","b","c"])
  * arrays of `{label|name|tag|value|key}` objects
- `deriveTitle` falls back through: cleaned LLM title -> first non-empty inference / environment / constraints label or description -> first non-heading body line -> joined domain tags

A soft floor (`assertDraftMinimallyUsable`) still rejects drafts that are fully empty after normalization, so downstream validators continue to get the final say.

When the parser had to coerce the wire format, an `l3.abstract.draft_salvaged` info log captures which keys were non-canonical for observability.

Tests: 12/12 in `tests/unit/memory/l3/abstract.test.ts` (added 7 new salvage cases covering each scenario from the issue).

Copilot

Pull request overview

Improves robustness of the L3 abstraction pipeline in apps/memos-local-plugin by normalizing/salvaging partially-structured LLM JSON drafts (instead of failing early on minor schema deviations), while still rejecting drafts that are truly empty after normalization.

Changes:

Removed strict inline completeJson validation and added draft normalization + a minimal “soft floor” usability check.
Added coercion for common malformed shapes (singletons, strings, {body: ...}-style entries, comma-joined tags) plus title derivation fallbacks.
Expanded unit test coverage to exercise the new salvage behaviors and updated the “missing triple” expectation accordingly.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
apps/memos-local-plugin/core/memory/l3/abstract.ts	Adds normalization/salvage logic (entries/tags/title), soft-floor validation, and an observability log when coercion occurs.
apps/memos-local-plugin/tests/unit/memory/l3/abstract.test.ts	Updates existing behavior expectation and adds new unit tests covering salvage/derivation cases.


+function collectEvidenceIds(o: Record<string, unknown>): string[] | undefined {
+  const raw = o.evidenceIds ?? o.evidence_ids ?? o.evidence;
+  if (Array.isArray(raw)) {
+    const ids = (raw as unknown[]).filter((s): s is string => typeof s === "string");


+ *   1. The cleaned LLM-provided title.
+ *   2. The first inference / environment / constraints entry's label or
+ *      description (whichever is non-empty), trimmed to ~80 chars.
+ *   3. The first non-heading markdown line of `body`, trimmed.


+  // Canonical shape: array of strings. Also accept comma/whitespace
+  // separated strings (`"docker, alpine, pip"`) and arrays that mix
+  // strings and `{label}` / `{name}` / `{tag}` objects, since some
+  // providers return that shape under structured-output mode.
+  let candidates: unknown[];
+  if (Array.isArray(raw)) {
+    candidates = raw as unknown[];
+  } else if (typeof raw === "string") {
+    candidates = raw.split(/[,;\n]+/);
+  } else {


+function coerceEntry(r: unknown): L3AbstractionDraftEntry | null {
+  if (typeof r === "string") {
+    const description = sanitizeDerivedMarkdown(r);
+    if (!description) return null;
+    return { label: "", description };


Address Copilot review on PR MemTensor#1672:

- collectEvidenceIds: trim each id and drop empty entries in the array branch so providers that return `["po_1 ", " tr_2"]` no longer break UI evidence-chip classification (`startsWith("po_")` was failing on leading whitespace). Mirrors the existing string-branch behavior.
- buildBody: skip the empty bold prefix when an entry has no label, so string-only entries render as `- <description>` instead of `- **** \u2014 <description>`. Now that string entries are explicitly supported via coerceEntry, this rendering is reachable.
- deriveTitle JSDoc: describe the actual heading/list-prefix-stripping behavior instead of claiming we skip heading lines.
- normaliseTags comment: tighten to "comma/semicolon/newline-separated" to match the regex; note that whitespace inside a tag is preserved so multi-word tags survive.

Tests:

- New: array evidence ids with leading/trailing whitespace and empty strings get trimmed and dropped.
- New: a draft mixing labelled and string-only entries renders the string-only entry as a plain bullet (no empty bold, no em-dash).
- All 14 abstract.test.ts tests pass; tsc --noEmit clean.

Sanjays2402 · 2026-05-09T20:21:45Z

Thanks — addressed all four in commit e3142de:

abstract.ts:515 (collectEvidenceIds array branch): Now trims each entry and drops empties so leading/trailing whitespace from providers like "po_1 " no longer breaks startsWith("po_") evidence-chip classification. Matches the existing string-branch behavior.
abstract.ts:481 + buildBody (empty bold labels): Extracted a renderEntry helper that emits - <description> when label === "" instead of - **** — <description>. Applied across environment / inference / constraints lists.
abstract.ts:306 (deriveTitle doc): Updated to match the actual behavior: "first non-empty markdown line of body, with leading heading/list prefixes (#, -, *, +, 1.) stripped, trimmed."
abstract.ts:560 (normaliseTags doc): Tightened to "comma/semicolon/newline-separated strings" + a note that intra-tag whitespace is preserved (multi-word tags survive).
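
The first two fixes in this list can be approximated as follows. This is a sketch under assumptions, not the committed code: the comma split in the string branch, the `undefined` result when no ids survive, and the exact labelled-bullet format (inferred from the `- **** \u2014 <description>` output quoted earlier) are guesses, and this `renderEntry` also treats a whitespace-only label as empty.

```typescript
// Hypothetical sketch of the evidence-id and bullet-rendering fixes.
function collectEvidenceIds(o: Record<string, unknown>): string[] | undefined {
  const raw = o.evidenceIds ?? o.evidence_ids ?? o.evidence;
  let parts: string[];
  if (Array.isArray(raw)) {
    parts = (raw as unknown[]).filter((s): s is string => typeof s === "string");
  } else if (typeof raw === "string") {
    parts = raw.split(","); // comma-joined alias form
  } else {
    return undefined;
  }
  // Trim every id and drop empties in both branches, so "po_1 " and
  // " tr_2" keep their prefix at position 0 for startsWith checks.
  const ids = parts.map((s) => s.trim()).filter((s) => s !== "");
  return ids.length > 0 ? ids : undefined; // assumption: empty -> undefined
}

function renderEntry(e: { label: string; description: string }): string {
  const label = e.label.trim();
  // Without a label, emit a plain bullet instead of an empty bold prefix.
  return label
    ? `- **${label}** \u2014 ${e.description}`
    : `- ${e.description}`;
}
```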

Tests: added "trims and drops empty evidence ids from array entries" and "renders string-only entries without an empty bold label in body". 14/14 in abstract.test.ts pass; tsc -p tsconfig.json --noEmit clean.

Copilot AI review requested due to automatic review settings May 9, 2026 19:20

Copilot started reviewing on behalf of Sanjays2402 May 9, 2026 19:20

Copilot AI reviewed May 9, 2026

Sanjays2402 wants to merge 2 commits into MemTensor:main from Sanjays2402:fix/issue-1668
