Cross-partition zero-copy copy/rename + Box-global GC#29

Merged
predatorray merged 7 commits into main from claude/zero-copy-partitioning-lsd079
Jun 20, 2026

@predatorray

Owner

Why

Box partitioning (#19) split each Box into independent per-partition LSM engines with per-partition Syrup GC. That quietly broke two promises for copy/rename when the source and destination keys hash to different partitions:

  • it was no longer zero-copy — the client fell back to a GET+PUT byte copy;
  • a cross-partition rename was copy-then-delete across two owners and not atomic — a crash could leave both keys live forever.

The key realization is that the blob layer is already physically global: a SegmentRef resolves bytes by syrupId alone (SyrupReader opens the ledger directly, no partition scoping). So no bytes ever needed to move — what broke was the bookkeeping (GC liveness) and the cross-owner coordination. This PR fixes exactly those.
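As a rough illustration of why no bytes need to move, the sketch below models a locator as a list of segment references keyed by `syrupId`. The record shapes, field names, and the `zeroCopy` helper are hypothetical and chosen only for this example; the real `SegmentRef` and `CandyLocator` types are not shown in this description.

```java
import java.util.List;

public class ZeroCopySketch {
    // Hypothetical shapes: the real SegmentRef/CandyLocator fields may differ.
    record SegmentRef(long syrupId, long offset, long length) {}
    record CandyLocator(String key, List<SegmentRef> segments) {}

    // Build a destination locator that points at the very same segments as the
    // source. Only metadata is written; no object bytes are read or re-stored.
    static CandyLocator zeroCopy(CandyLocator source, String destKey) {
        return new CandyLocator(destKey, List.copyOf(source.segments()));
    }

    public static void main(String[] args) {
        CandyLocator src = new CandyLocator("photos/a.jpg",
                List.of(new SegmentRef(7L, 0L, 1024L), new SegmentRef(9L, 0L, 512L)));
        CandyLocator dst = zeroCopy(src, "archive/a.jpg");
        // Both keys now resolve to the same Syrup segments.
        System.out.println(dst.segments().equals(src.segments()));
    }
}
```

Because both locators name the same `syrupId` values, the only thing a partition boundary can break is the liveness accounting for those Syrups, which is what the rest of this PR addresses.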

What changed

1. Cross-partition copy/rename is zero-copy again. The client fetches the source's CandyLocator parts from the source owner (new GetCandyLocator / PrepareRenameCandyLocatorResponse) and relays them to the destination owner (new ZeroCopyPut), which writes a destination locator reusing the very same Syrup segments. No object bytes are moved or re-stored.

2. Box-global Syrup GC (what makes item 1 safe). Each partition owner publishes its referenced-Syrup set to boxes/<box>/partitions/<p>/refs; a Syrup is physically reclaimed only when no partition of the Box references it. Within-manifest reference counting is generalized Box-wide. Safe by construction — a partial/crashed rename can only retain a Syrup (a leak, the accepted v1 failure mode), never delete a referenced one.

3. Cross-partition rename is now eventually atomic (roll-forward), not "both keys live forever":

  • the source owner records a durable rename intent in its manifest (ManifestEdit bumped to v3; replayed on handover exactly like in-flight multipart state);
  • the destination owner writes a coordination rendezvous marker (boxes/<box>/renames/<token>) once its zero-copy put is durable;
  • the source owner finalizes — synchronously via the client's CompleteRename, on its maintenance sweep, or after a handover replays the intent — by reading the marker: present ⇒ tombstone the source (LWW-conditioned on the source HLC, so a re-PUT source is never clobbered) + clear the intent/marker; absent past rename.intent.abandon.millis (default 60 s) ⇒ drop the intent (the source stays live).
  • A reader may momentarily observe both keys (so this is eventual, not linearizable, atomicity); the one residual crash window degrades to "both keys live" — the old observable outcome — never data loss. Same-partition rename stays fully atomic.
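The finalize rule in the bullets above can be sketched as a small pure decision function. This is a minimal sketch under stated assumptions, not the code in `CandyboxNode`: the `Action` enum, the `decide` signature, modelling an HLC as a single `long`, and the choice to still clear the intent when the source was re-PUT are all hypothetical.

```java
public class RenameFinalizeSketch {
    // Hypothetical outcome names for one finalize attempt on a rename intent.
    enum Action { TOMBSTONE_SOURCE_AND_CLEAR, CLEAR_ONLY, DROP_INTENT, KEEP_WAITING }

    // markerPresent:    the destination owner wrote boxes/<box>/renames/<token>
    // intentSourceHlc:  source HLC captured when the intent was recorded (assumed a long)
    // currentSourceHlc: HLC of whatever currently lives at the source key
    // intentAgeMillis:  how long ago the intent was recorded
    // abandonMillis:    rename.intent.abandon.millis (default 60 s per this PR)
    static Action decide(boolean markerPresent, long intentSourceHlc, long currentSourceHlc,
                         long intentAgeMillis, long abandonMillis) {
        if (markerPresent) {
            // LWW condition: only tombstone if the source was not re-PUT since the intent.
            boolean sourceUnchanged = currentSourceHlc <= intentSourceHlc;
            return sourceUnchanged ? Action.TOMBSTONE_SOURCE_AND_CLEAR : Action.CLEAR_ONLY;
        }
        // No marker: the destination put is not known to be durable, so the source stays live.
        return intentAgeMillis > abandonMillis ? Action.DROP_INTENT : Action.KEEP_WAITING;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, 100L, 100L, 5_000L, 60_000L));
        System.out.println(decide(true, 100L, 250L, 5_000L, 60_000L));
        System.out.println(decide(false, 100L, 100L, 61_000L, 60_000L));
        System.out.println(decide(false, 100L, 100L, 5_000L, 60_000L));
    }
}
```

Keeping the decision free of I/O means the same rule can be applied from all three triggers named above (the client's CompleteRename, the maintenance sweep, and intent replay after a handover).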

Reviewer note: the architecture has no server-to-server RPC, so completion is owner-local via the ZK rendezvous marker rather than a coordinator reaching across to delete the source. Same roll-forward guarantee, no new call path. Documented in CROSS_PARTITION_ZERO_COPY_PLAN.md.

Out of scope (unchanged): cross-partition UploadPartCopy still byte-copies; read scaling off owners; Syrup defragmentation.

Module-by-module

  • lsm — manifest v3 rename intents (ManifestEdit/State/Serializer); BoxEngine.resolveLocator, zeroCopyPut, deleteCandyConditional (LWW), intent record/list/clear, public referencedSyrups().
  • protocol — GetCandyLocator / PrepareRename / ZeroCopyPut / CompleteRename + CandyLocatorResponse, with Part/SegmentRef/Hlc codecs.
  • coordination — partitionRefsKey and renameMarkerKey.
  • server — per-partition ref publication + Box-global GC gate (GarbageCollector); the four handlers; owner-local rename-intent finalize/abandon sweep (CandyboxNode); rename.intent.abandon.millis config.
  • client — cross-partition copy/rename via the locator relay; rename eventually-atomic via the rendezvous.
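The Box-global GC gate listed under server can be pictured as a set difference over the published per-partition reference sets. The sketch below is illustrative only: the `reclaimable` helper, the use of `long` Syrup ids and `int` partition ids, and in particular the rule that a partition with no published set blocks all reclamation are assumptions made for this example, not confirmed behavior of `GarbageCollector`.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class BoxGlobalGcSketch {
    // candidates:       Syrup ids the local partition no longer references
    // refsByPartition:  partition id -> referenced-Syrup set read from
    //                   boxes/<box>/partitions/<p>/refs
    // partitionCount:   number of partitions in the Box
    static Set<Long> reclaimable(Set<Long> candidates,
                                 Map<Integer, Set<Long>> refsByPartition,
                                 int partitionCount) {
        // Assumption of this sketch: if any partition has not published yet,
        // reclaim nothing. Retaining a Syrup is a leak; deleting a live one is data loss.
        for (int p = 0; p < partitionCount; p++) {
            if (!refsByPartition.containsKey(p)) {
                return Collections.emptySet();
            }
        }
        Set<Long> live = new HashSet<>();
        for (Set<Long> refs : refsByPartition.values()) {
            live.addAll(refs);
        }
        Set<Long> result = new TreeSet<>(candidates);
        result.removeAll(live); // keep only Syrups that no partition references
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Set<Long>> refs = Map.of(0, Set.of(1L, 2L), 1, Set.of(2L, 3L));
        System.out.println(reclaimable(Set.of(1L, 2L, 3L, 4L), refs, 2));
    }
}
```

The gate errs on the side of retention, which matches the stated v1 failure mode: a partial or crashed rename may leak a Syrup but should never cause a referenced one to be deleted.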

Tests

  • New unit tests: LSM engine primitives + intent journal, manifest v3 round-trip, protocol codecs, server Box-global GC + rename finalize + abandon.
  • PartitionedBoxIT (9/9) including a crash-and-resume case (drive prepare+put, skip the finalize, then the maintenance sweep converges to source-gone).
  • Full unit suite across all modules.
  • Real BookKeeper/ZooKeeper CompactionGcCycleIT and ServerClientLifecycleIT.

Docs

CROSS_PARTITION_ZERO_COPY_PLAN.md (the design), plus updates to DESIGN.md (§3/§5/§6/§7/§9/§11/§12), README, OPERATIONS, the Hugo site (architecture/client/operations/reference), the shipped candybox.properties.example, and a "superseded" annotation on BOX_PARTITIONING_PLAN.md decision #5.

🤖 Generated with Claude Code

https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE



claude added 7 commits June 19, 2026 16:12
…CopyPut, conditional delete)

Add the engine-level primitives both features build on:
- ManifestEdit/State/Serializer v3 carry cross-partition RenameIntent records,
  replayed on handover exactly like in-flight multipart upload state.
- BoxEngine.resolveLocator (relay the source parts), zeroCopyPut (reuse a foreign
  partition's segments verbatim), deleteCandyConditional (LWW-safe source delete),
  rename-intent record/list/clear, and a public referencedSyrups() for Box-global GC.

Plan in CROSS_PARTITION_ZERO_COPY_PLAN.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
New wire ops for the client relay: GetCandyLocator/PrepareRename (-> CandyLocatorResponse
carrying Parts + source HLC), ZeroCopyPut (-> HeadCandyResponse), CompleteRename (-> Ok),
with Part/SegmentRef/Hlc codec helpers. New coordination keys: per-partition referenced-Syrup
publication (partitionRefsKey) and the rename rendezvous marker (renameMarkerKey).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
…elay

Server: per-partition referenced-Syrup publication + Box-global GC gate (a Syrup shared
cross-partition is never reclaimed while any sibling references it); ZeroCopyPut/GetCandyLocator/
PrepareRename/CompleteRename handlers; owner-local rename-intent finalize sweep (marker present
=> LWW-conditioned source delete; abandoned past the window => drop). Client: cross-partition
copy/rename now zero-copy via the locator relay, rename eventually-atomic via the rendezvous.
New config rename.intent.abandon.millis (default 60s).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
…te-copy fallback)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
…opy + crash/resume IT

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
Update DESIGN.md (§3/§5/§6/§7/§9/§11/§12), README, OPERATIONS, the Hugo site
(architecture/client/operations/reference), and the shipped candybox.properties.example to
describe the restored cross-partition zero-copy copy/rename, Box-global Syrup GC, the rename
intent journal + rendezvous (eventually-atomic rename), the new wire ops and coordination keys,
and the rename.intent.abandon.millis config.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
…ss-partition zero-copy

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
@github-actions


✅ S3 compatibility gate passed

  • mode: gate
  • result: 192 passed in 88.95s (0:01:28)

@codecov

codecov Bot commented Jun 20, 2026


Codecov Report

❌ Patch coverage is 87.14953% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.82%. Comparing base (d918879) to head (f5ec19f).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...a/me/predatorray/candybox/server/CandyboxNode.java | 75.25% | 16 Missing and 8 partials ⚠️ |
| ...redatorray/candybox/server/NodeRequestHandler.java | 78.18% | 6 Missing and 6 partials ⚠️ |
| ...redatorray/candybox/lsm/manifest/RenameIntent.java | 20.00% | 4 Missing and 4 partials ⚠️ |
| ...me/predatorray/candybox/client/CandyboxClient.java | 85.71% | 2 Missing and 2 partials ⚠️ |
| .../me/predatorray/candybox/lsm/engine/BoxEngine.java | 95.23% | 0 Missing and 3 partials ⚠️ |
| ...rray/candybox/lsm/manifest/ManifestSerializer.java | 94.11% | 0 Missing and 2 partials ⚠️ |
| ...me/predatorray/candybox/protocol/MessageCodec.java | 98.83% | 0 Missing and 1 partial ⚠️ |
| .../predatorray/candybox/server/GarbageCollector.java | 80.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main      #29      +/-   ##
============================================
+ Coverage     83.66%   83.82%   +0.15%     
  Complexity      677      677              
============================================
  Files           161      162       +1     
  Lines          8876     9284     +408     
  Branches       1330     1397      +67     
============================================
+ Hits           7426     7782     +356     
- Misses          970      998      +28     
- Partials        480      504      +24     

@predatorray predatorray merged commit 4bf1f55 into main Jun 20, 2026
9 checks passed
@predatorray predatorray deleted the claude/zero-copy-partitioning-lsd079 branch June 20, 2026 02:58