Cross-partition zero-copy copy/rename + Box-global GC #29
Merged
…CopyPut, conditional delete)
Add the engine-level primitives both features build on:
- ManifestEdit/State/Serializer v3 carry cross-partition RenameIntent records, replayed on handover exactly like in-flight multipart upload state.
- BoxEngine.resolveLocator (relay the source parts), zeroCopyPut (reuse a foreign partition's segments verbatim), deleteCandyConditional (LWW-safe source delete), rename-intent record/list/clear, and a public referencedSyrups() for Box-global GC.
Plan in CROSS_PARTITION_ZERO_COPY_PLAN.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
New wire ops for the client relay: GetCandyLocator/PrepareRename (-> CandyLocatorResponse carrying Parts + source HLC), ZeroCopyPut (-> HeadCandyResponse), CompleteRename (-> Ok), with Part/SegmentRef/Hlc codec helpers. New coordination keys: per-partition referenced-Syrup publication (partitionRefsKey) and the rename rendezvous marker (renameMarkerKey). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
…elay
Server: per-partition referenced-Syrup publication + Box-global GC gate (a Syrup shared cross-partition is never reclaimed while any sibling references it); ZeroCopyPut/GetCandyLocator/PrepareRename/CompleteRename handlers; owner-local rename-intent finalize sweep (marker present => LWW-conditioned source delete; abandoned past the window => drop).
Client: cross-partition copy/rename now zero-copy via the locator relay, rename eventually-atomic via the rendezvous.
New config rename.intent.abandon.millis (default 60s).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
…te-copy fallback) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
…opy + crash/resume IT Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
Update DESIGN.md (§3/§5/§6/§7/§9/§11/§12), README, OPERATIONS, the Hugo site (architecture/client/operations/reference), and the shipped candybox.properties.example to describe the restored cross-partition zero-copy copy/rename, Box-global Syrup GC, the rename intent journal + rendezvous (eventually-atomic rename), the new wire ops and coordination keys, and the rename.intent.abandon.millis config. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
…ss-partition zero-copy Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
✅ S3 compatibility gate passed
Codecov Report
❌ Patch coverage is …

@@             Coverage Diff              @@
##               main      #29      +/-   ##
============================================
+ Coverage     83.66%   83.82%   +0.15%
  Complexity      677      677
============================================
  Files           161      162       +1
  Lines          8876     9284     +408
  Branches       1330     1397      +67
============================================
+ Hits           7426     7782     +356
- Misses          970      998      +28
- Partials        480      504      +24
Why

Box partitioning (#19) split each Box into independent per-partition LSM engines with per-partition Syrup GC. That quietly broke two promises for `copy`/`rename` when the source and destination keys hash to different partitions:

- `copy` fell back to a `GET`+`PUT` byte copy;
- `rename` was copy-then-delete across two owners and not atomic: a crash could leave both keys live forever.

The key realization is that the blob layer is already physically global: a `SegmentRef` resolves bytes by `syrupId` alone (`SyrupReader` opens the ledger directly, no partition scoping). So no bytes ever needed to move; what broke was the bookkeeping (GC liveness) and the cross-owner coordination. This PR fixes exactly those.

What changed
1. Cross-partition copy/rename is zero-copy again. The client fetches the source's `CandyLocator` parts from the source owner (new `GetCandyLocator`/`PrepareRename` → `CandyLocatorResponse`) and relays them to the destination owner (new `ZeroCopyPut`), which writes a destination locator reusing the very same Syrup segments. No object bytes are moved or re-stored.

2. Box-global Syrup GC (what makes item 1 safe). Each partition owner publishes its referenced-Syrup set to `boxes/<box>/partitions/<p>/refs`; a Syrup is physically reclaimed only when no partition of the Box references it. Within-manifest reference counting is generalized Box-wide. Safe by construction: a partial/crashed rename can only retain a Syrup (a leak, the accepted v1 failure mode), never delete a referenced one.

3. Cross-partition rename is now eventually atomic (roll-forward), not "both keys live forever":
   - the source owner records a durable rename intent (`ManifestEdit` bumped to v3; replayed on handover exactly like in-flight multipart state);
   - the destination owner writes a rendezvous marker (`boxes/<box>/renames/<token>`) once its zero-copy put is durable;
   - the source owner finalizes (on `CompleteRename`, on its maintenance sweep, or after a handover replays the intent) by reading the marker: present ⇒ tombstone the source (LWW-conditioned on the source HLC, so a re-`PUT` source is never clobbered) + clear the intent/marker; absent past `rename.intent.abandon.millis` (default 60 s) ⇒ drop the intent (the source stays live).

Out of scope (unchanged): cross-partition `UploadPartCopy` still byte-copies; read scaling off owners; Syrup defragmentation.

Module-by-module
- Engine: rename intents in the v3 manifest (`ManifestEdit/State/Serializer`); `BoxEngine.resolveLocator`, `zeroCopyPut`, `deleteCandyConditional` (LWW), intent record/list/clear, public `referencedSyrups()`.
- Wire: `GetCandyLocator`/`PrepareRename`/`ZeroCopyPut`/`CompleteRename` + `CandyLocatorResponse`, with Part/SegmentRef/Hlc codecs.
- Coordination keys: `partitionRefsKey` and `renameMarkerKey`.
- Server: per-partition referenced-Syrup publication and the Box-global GC gate (`GarbageCollector`); the four handlers; owner-local rename-intent finalize/abandon sweep (`CandyboxNode`); `rename.intent.abandon.millis` config.

Tests
- `PartitionedBoxIT` (9/9), including a crash-and-resume case (drive prepare+put, skip the finalize, then the maintenance sweep converges to source-gone).
- `CompactionGcCycleIT` and `ServerClientLifecycleIT`.

Docs
`CROSS_PARTITION_ZERO_COPY_PLAN.md` (the design), plus updates to DESIGN.md (§3/§5/§6/§7/§9/§11/§12), README, OPERATIONS, the Hugo site (architecture/client/operations/reference), the shipped `candybox.properties.example`, and a "superseded" annotation on `BOX_PARTITIONING_PLAN.md` decision #5.

🤖 Generated with Claude Code
https://claude.ai/code/session_01MKLpqCjqn5Pt8dEiXuSTwE
Generated by Claude Code
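
Illustrative sketch (not code from this PR): the two decision rules described under "What changed", the Box-global GC gate and the rename finalize step, reduced to pure functions. Every name below (`ZeroCopySketch`, `reclaimable`, `finalizeRename`, `Outcome`) is hypothetical. HLCs are modeled as plain `long` values, and a partition that has not yet published its referenced-Syrup set is not modeled; both are simplifying assumptions, and the real `GarbageCollector` and `CandyboxNode` sweep may differ.

```java
import java.util.List;
import java.util.Set;

/** Illustrative only: every name here is hypothetical, not the Candybox API. */
public final class ZeroCopySketch {

    /** What the source owner does with one recorded rename intent. */
    public enum Outcome { TOMBSTONE_SOURCE, CLEAR_ONLY, DROP_INTENT, KEEP_WAITING }

    /**
     * Box-global GC gate: a Syrup is reclaimable only if no partition's
     * published referenced-Syrup set contains it.
     */
    public static boolean reclaimable(String syrupId, List<Set<String>> publishedRefs) {
        for (Set<String> refs : publishedRefs) {
            if (refs.contains(syrupId)) {
                return false; // still referenced by some partition: retain
            }
        }
        return true;
    }

    /**
     * Finalize decision for a rename intent on the source owner.
     * Assumption: "LWW-conditioned" means the tombstone applies only when the
     * source has not been rewritten after the HLC recorded with the intent.
     */
    public static Outcome finalizeRename(boolean markerPresent,
                                         long recordedSourceHlc,
                                         long currentSourceHlc,
                                         long intentAgeMillis,
                                         long abandonMillis) {
        if (markerPresent) {
            // The destination put is durable, so roll forward. A re-PUT source
            // (newer HLC) is left alone; the intent and marker are cleared either way.
            return currentSourceHlc <= recordedSourceHlc
                    ? Outcome.TOMBSTONE_SOURCE
                    : Outcome.CLEAR_ONLY;
        }
        // No marker: give up only once the abandon window has elapsed.
        return intentAgeMillis > abandonMillis ? Outcome.DROP_INTENT : Outcome.KEEP_WAITING;
    }

    public static void main(String[] args) {
        List<Set<String>> refs = List.of(Set.of("syrup-a"), Set.of("syrup-a", "syrup-b"));
        System.out.println(reclaimable("syrup-a", refs));                     // false
        System.out.println(reclaimable("syrup-c", refs));                     // true
        System.out.println(finalizeRename(true, 10L, 10L, 5_000L, 60_000L));  // TOMBSTONE_SOURCE
        System.out.println(finalizeRename(true, 10L, 11L, 5_000L, 60_000L));  // CLEAR_ONLY
        System.out.println(finalizeRename(false, 10L, 10L, 61_000L, 60_000L)); // DROP_INTENT
    }
}
```

Both rules fail toward retention: a stale published set or a missing marker keeps the Syrup or the source key alive rather than deleting it, which matches the "leak, never delete a referenced one" failure mode the description accepts.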