Skip to content

ADR-221: Implement Sample class hierarchy and ManifestStore with JSON round-trip serialisation#203

Merged
jodavis merged 4 commits into
feature/ADR-191-oop-ml-pipelinefrom
dev/claude/ADR-221-implement-the-sample-class-hierarchy-and
May 30, 2026
Merged

ADR-221: Implement Sample class hierarchy and ManifestStore with JSON round-trip serialisation#203
jodavis merged 4 commits into
feature/ADR-191-oop-ml-pipelinefrom
dev/claude/ADR-221-implement-the-sample-class-hierarchy-and

Conversation

@jodavis-claude
Copy link
Copy Markdown
Collaborator

Work item

ADR-221: Implement the Sample class hierarchy and Manifest[S] / ManifestStore in ml/pipeline/core/ — the shared data model that every subsequent pipeline stage depends on, including JSON round-trip serialisation and the content-hash formula.

Changes

  • ml/pipeline/__init__.py — new empty package marker
  • ml/pipeline/core/__init__.py — new empty package marker
  • ml/pipeline/core/sample.py — all six dataclasses (Sample, SampleWithPath, TextSample, AudioSample, SampleSpectrogram, SampleTokens) with every field from the spec; TextSample.id assigned str(uuid4())
  • ml/pipeline/core/manifest.pyManifest[S] generic with O(1) by_id/by_content_hash index dicts; ManifestStore.read() (type-registry dispatch on sample_type) and write() (version-1 JSON)
  • ml/pyproject.toml — minimal pytest config (testpaths = ["test"], pythonpath = ["."]) enabling plain pytest invocation from ml/
  • ml/test/__init__.py — new empty package marker
  • ml/test/pipeline/__init__.py — new empty package marker
  • ml/test/pipeline/core/__init__.py — new empty package marker
  • ml/test/pipeline/core/test_manifest.py — 26 unit tests covering all four round-trip scenarios, sample_type dispatch, TextSample.seed == 0 serialisation, SampleSpectrogram.parent_id preservation, and AudioSample.applied_values numeric-type fidelity

Design decisions

  • seed has no dataclass default — the spec states "PhraseVariator explicitly sets seed=0 at construction", meaning callers pass it explicitly; a default would silently mask missing arguments in later stages.
  • Mutable dataclasses (frozen=False) — the spec is silent on frozen=; downstream ModifierStage uses Sample instances as dict values (not keys), so mutability is safe and avoids unnecessary copy overhead.
  • pyproject.toml created in ml/ — resolves the known ambiguity that validate-build and validate-tests only cover .NET; enables pytest ml/test/pipeline/core/ without manual PYTHONPATH manipulation.

Generated by Claude Code

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 30, 2026

Test Results

401 tests  ±0   401 ✅ ±0   2m 5s ⏱️ -35s
  5 suites ±0     0 💤 ±0 
  5 files   ±0     0 ❌ ±0 

Results for commit 091b5ca. ± Comparison against base commit 2b4bc34.

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Collaborator Author

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Priority 1 — Correctness

  • ml/pipeline/core/manifest.py line 67: ManifestStore.write() raises IndexError on an empty manifest. sample_type = _SAMPLE_TYPE_KEY[type(samples[0])] throws when manifest.samples is empty. Manifest([]) is a valid object (see test_empty_manifest), so write() must either support it or raise a descriptive ValueError("Cannot write empty manifest") instead of an opaque index error.

  • ml/pipeline/core/manifest.py line 61: ManifestStore.read() ignores the "version" field entirely. The JSON schema explicitly carries "version": 1. Without a version check, a file written by a future incompatible schema is silently deserialized as version 1 and produces corrupt objects with no diagnostic. Add if data.get("version") != 1: raise ValueError(f"Unsupported manifest version: {data.get('version')!r}").

Priority 4 — Test coverage

  • ml/test/pipeline/core/test_manifest.py: No test exercises ManifestStore().write(Manifest([]), path). The existing test_empty_manifest creates Manifest([]) and calls lookup methods but never calls write(), so the IndexError from issue 1 is not caught by the test suite. Add a test under TestManifestLookup (or a dedicated class) that confirms the behaviour of writing an empty manifest.

Style (non-blocking)

  • ml/test/pipeline/core/test_manifest.py line 4: import tempfile is unused — tests use the tmp_path pytest fixture.
  • ml/pipeline/core/manifest.py line 61: _SAMPLE_TYPE_CLASS[data["sample_type"]] raises a bare KeyError on unrecognised types. A ValueError(f"Unknown sample_type: {data['sample_type']!r}") would be more informative.

Generated by Claude Code

Copy link
Copy Markdown
Owner

@jodavis jodavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This review is focused only on the dev-team changes. Make these changes, and overall the design should be that subagents work together, with the script as the orchestrator, but they don't interact directly with GitHub or Jira. The top level agent passively watches the script's progress and makes all the updates to PRs, work items, and issues. If there's subs additional information that the top level agent needs to complete an update, we should find a way to plumb it through. But in general, the status updates front the script and the content in the context file should be enough.

There is one other thing: since the developer and reviewer can no longer communicate through PR threads, they will have to pass a structured list of comment threads back and forth, including all the comments and their responses, who they came from, and whether or not there are resolved. That structure with have to go into the context file as well, so the top level session can see them. Might need to add an ID as well, so the top level agent can map them to GitHub PR comment threads and update the right ones.

Also, there is a bug again that the review comments were added as a normal comment instead of a review. All the rules about PR comments that we added in the reviewer need to be transferred to the top level agent now.

Comment thread .claude/agents/debugger.md Outdated
Comment thread .claude/agents/developer.md Outdated
Comment thread .claude/agents/researcher.md Outdated
Comment thread .claude/agents/reviewer.md Outdated
Comment thread .claude/commands/debugger-investigate.md Outdated
Comment thread .claude/commands/dev-team.md Outdated
Comment thread .claude/commands/dev-team.md Outdated
Comment thread .claude/commands/dev-team.md Outdated
Comment thread .claude/commands/dev-team.md Outdated
Comment thread .claude/commands/prepare-pr-details.md
Copy link
Copy Markdown
Collaborator Author

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No prior review threads.

Two new Priority 1 (correctness/fault tolerance) issues found in ml/pipeline/core/manifest.py:

  1. manifest.py:63 — data["sample_type"] is an unguarded dict subscript. The preceding version check uses data.get("version") defensively, but sample_type access raises KeyError instead of ValueError for a malformed file. Use data.get("sample_type") and raise ValueError explicitly to match the established pattern.

  2. manifest.py:70-81 — write() does not validate that all samples share the same type. Python generics are erased at runtime, so Manifest([TextSample(...), AudioSample(...)]) is constructable. If passed to write(), the method silently produces a JSON file whose sample_type header matches only the first entry; on read(), _deserialise tries to construct the wrong class for the remaining entries and fails with a confusing KeyError far from the original mistake — and the corrupt file is already on disk. Add a guard at the top of write(): if any(type(s) is not type(samples[0]) for s in samples): raise ValueError(f"Manifest contains mixed sample types: {set(type(s).name for s in samples)}").

All exit criteria are otherwise satisfied: all six dataclasses are present with fields matching spec lines 272-309; Manifest[S] with O(1) by_content_hash/by_id; ManifestStore read/write with discriminated dispatch; JSON schema version 1; all four round-trip tests plus dispatch, seed=0, parent_id, and numeric-type tests. The recent fix commits (version validation, ValueError for unknown sample_type, empty-manifest guard) are all correct additions.


Generated by Claude Code

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements the ADR-221 Python ML data model: a Sample dataclass hierarchy (TextSample, AudioSample, SampleSpectrogram, SampleTokens) and a Manifest[S] / ManifestStore with JSON v1 round-trip serialisation, plus pytest configuration to run the new unit tests. The diff also pulls in unrelated changes: a small C# PersistSettings test fix for Linux, a switch of validate-build.sh to a Traversal-based project, and a substantial restructure of the .claude dev-team pipeline (scrum-master pattern, new prepare-pr-details skill, removal of developer-create-pr, structured review/sign-off JSON).

Changes:

  • Add ml/pipeline/core/{sample,manifest}.py with six dataclasses and a ManifestStore that dispatches on sample_type, plus 26 round-trip tests and ml/pyproject.toml.
  • Migrate scripts/validate-build.sh to a Microsoft.Build.Traversal project that filters out WPF/Console hosts on Linux, with a scoped Directory.Build.props that removes inherited package references.
  • Refactor .claude/ dev-team workflow: top-level session handles all GitHub/Jira calls via [DEV-TEAM] milestone markers; developer-create-pr is replaced by prepare-pr-details, and reviewer/sign-off skills now return structured JSON.

Reviewed changes

Copilot reviewed 19 out of 24 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
ml/pipeline/core/sample.py Six dataclasses matching the ADR-221 spec.
ml/pipeline/core/manifest.py Manifest[S] with O(1) indexes and ManifestStore JSON read/write.
ml/pipeline/{,core}/init.py, ml/test/.../init.py New package markers.
ml/pyproject.toml Pytest config (testpaths, pythonpath, not e2e default, marker).
ml/test/pipeline/core/test_manifest.py 26 unit tests for round-trip, dispatch, lookup, empty manifest.
test/.../PersistSettingsTests.cs Skip negative path expectations when %LocalAppData% doesn't expand (Linux).
scripts/validate-build.sh Build the new traversal project instead of the solution.
scripts/validate-build.proj Traversal project including src/test, excluding WPF/Console hosts on Linux.
scripts/Directory.Build.props Strip root package references for traversal projects.
.claude/scripts/dev_team.py Add pr_details/signoff_notes/base_branch context, [DEV-TEAM] milestones, new ReviewStep/SignoffStep flow.
.claude/commands/prepare-pr-details.md New skill — returns PR JSON instead of calling gh.
.claude/commands/developer-create-pr.md Deleted (superseded).
.claude/commands/reviewer-review.md Uses git diff origin/$BASE_BRANCH..HEAD; returns body+comments+status.
.claude/commands/reviewer-sign-off.md Consumes $REVIEW_NOTES; returns sign_off_body+status.
.claude/commands/dev-team.md Scrum-master responsibilities and milestone-to-action table.
.claude/commands/debugger-investigate.md Reads issue from context file; writes debug-comment section.
.claude/commands/researcher-issue.md Adds MCP fallback when gh is unavailable.
.claude/agents/{developer,reviewer}.md Skill renames; reviewer no longer posts to GitHub.
.claude/settings.json Grant write permissions for .claude/logs/dev-team/* and command/agent/script dirs.

Comment thread ml/pipeline/core/manifest.py
Comment thread ml/pipeline/core/manifest.py
@jodavis jodavis marked this pull request as ready for review May 30, 2026 19:34
@jodavis jodavis force-pushed the dev/claude/ADR-221-implement-the-sample-class-hierarchy-and branch from 84ed04b to d79e446 Compare May 30, 2026 20:21
claude added 3 commits May 30, 2026 13:23
…core/

- sample.py: Sample(ABC), SampleWithPath(Sample,ABC), TextSample, AudioSample,
  SampleSpectrogram, SampleTokens dataclasses with all spec fields.
- manifest.py: Manifest[S] generic with O(1) by_id/by_content_hash lookups;
  ManifestStore.read() dispatches on sample_type; write() emits version-1 JSON.
  TextSample JSON omits path/parent_content_hash/applied_values; Spectrogram and
  Tokens JSON omits applied_values and includes parent_id per spec.
- pyproject.toml: minimal pytest config so plain `pytest` from ml/ works.
- Ambiguity resolutions: mutable dataclasses (no frozen=True); seed not a default
  field (callers pass seed=0 explicitly, matching spec's "PhraseVariator sets seed=0").

https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
1. Manifest.__init__: raise ValueError on duplicate sample ids (UUIDs must
   be unique); keep first occurrence for duplicate content_hash (documented
   contract — deterministic stages can produce identical hashes).

2. ManifestStore.read(): use data.get("sample_type") and raise ValueError
   explicitly when the key is absent, matching the defensive pattern used
   for "version".

3. ManifestStore.write(): validate that all samples share the same type
   before writing; a mixed-type Manifest would produce a JSON file whose
   sample_type header mismatches later entries, causing read() to KeyError.

Adds four new tests covering each guard.

https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
@jodavis jodavis force-pushed the dev/claude/ADR-221-implement-the-sample-class-hierarchy-and branch from d79e446 to b73a4c7 Compare May 30, 2026 20:23
Comment thread ml/pipeline/core/sample.py
@jodavis
Copy link
Copy Markdown
Owner

jodavis commented May 30, 2026

@copilot Address all PR review comments

Copy link
Copy Markdown
Contributor

Copilot AI commented May 30, 2026

@copilot Address all PR review comments

Addressed in existing commits b8b7c67 and b73a4c7. These include the manifest read/write validation hardening (version/sample_type/empty/mixed-type guards), improved error behavior, added coverage for empty write + validation edge cases, and test cleanup.

Copilot AI requested a review from jodavis May 30, 2026 21:00
TextSample now derives its id from content_hash via __post_init__,
so callers no longer pass id= explicitly. This means two TextSamples
with the same content always share the same id, enforcing the
deduplication invariant at the type level.

Update ManifestStore._deserialise to omit id= when constructing
TextSample, and update tests to match: duplicate-id and
duplicate-content-hash tests now use AudioSample (where id and
content_hash are independent) to exercise those code paths.

https://claude.ai/code/session_01DLVaMxWuhm75L3YaMdVTFT
@jodavis jodavis merged commit b835493 into feature/ADR-191-oop-ml-pipeline May 30, 2026
2 checks passed
@jodavis jodavis deleted the dev/claude/ADR-221-implement-the-sample-class-hierarchy-and branch May 30, 2026 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants