feat(tools): Rust SQLite → nedbd migrator — fast + resumable#3
Open
Eth-Interchained wants to merge 8 commits into
Replaces backend/scripts/migrate_sqlite_to_nedb.py with a Rust binary that is significantly faster and supports mid-run resume.

tools/nedb-migrator/
  Cargo.toml    rusqlite (bundled) + reqwest + tokio + clap + indicatif
  src/main.rs   ~350 lines

Key features:
- Reads all kv/zsets/sets rows in a single SQLite pass (read-only, no lock)
- Sends nedbd batch HTTP requests with up to 16 concurrent tokio workers
  Expected speedup: 20-50x over sequential Python for 93k-row migrations
- Resumable: state file (.nedb-migrator-state.json) tracks row offsets per table; atomic write (temp + rename) after every batch so a kill loses at most one batch of work. Restart and it picks up exactly where it stopped.
- --skip-block-cache: skips vision:block:height:* and vision:block:hash:* rows (~90k rows) and migrates only live operational state (~20 rows)
- --reset: wipe state and start fresh
- --dry-run: print what would be sent without touching nedbd
- --concurrency N / --batch-size N for hardware tuning
- indicatif progress bars per table with rows/s and ETA
- Release profile: LTO=fat, strip=true for a small fast binary

Build:
  cd tools/nedb-migrator
  cargo build --release
  ./target/release/nedb-migrator --sqlite ../../data/vision.db

Co-Authored-By: NEDB Maintainer (Claude Sonnet 4.6) <noreply@anthropic.com>
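The "atomic write (temp + rename)" step in the commit message above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in src/main.rs: the function name save_state_atomic and the ".tmp" suffix are hypothetical, and the state is passed as an already-serialised string.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write `contents` to `path` via a sibling temp file plus rename, so a
/// reader never observes a half-written state file.
/// (Hypothetical helper; the migrator's real function is not shown in the PR.)
fn save_state_atomic(path: &Path, contents: &str) -> io::Result<()> {
    // Assumed naming scheme: "<state file>.tmp" in the same directory,
    // so the rename stays on one filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp)?;
    file.write_all(contents.as_bytes())?;
    file.sync_all()?; // flush data to disk before the rename makes it visible
    drop(file);

    // rename() replaces the destination in one step on POSIX filesystems.
    fs::rename(&tmp, path)
}
```

If the process is killed before the rename, the old state file is still intact, which is what bounds the loss to one batch of work.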
At startup, query nedbd for actual collection counts and advance the
resume state to max(state_file, nedbd_count). Detects rows inserted
by the Python migrator, a previous run on another machine, or a lost
state file.
- count_collection(): queries FROM {coll} LIMIT 9999999 → count
- verify_against_nedb(): syncs state.{kv,zsets,sets}_done to max of
state file and actual nedbd count; saves state atomically if advanced
- --no-verify flag to skip for speed (default: always verify)
Co-Authored-By: NEDB Maintainer (Claude Sonnet 4.6) <noreply@anthropic.com>
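The max(state_file, nedbd_count) rule from this commit reduces to a few lines. The sketch below is an assumption-laden illustration: the field names follow state.{kv,zsets,sets}_done from the message, but the Progress struct and the function name verify_against_counts are hypothetical, and the remote counts are passed in rather than fetched over HTTP.

```rust
/// Per-table resume offsets (field names taken from the commit message;
/// the struct itself is a hypothetical stand-in for the real State).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Progress {
    kv_done: u64,
    zsets_done: u64,
    sets_done: u64,
}

/// Advance each offset to max(state file, nedbd count).
/// Returns true if anything moved, i.e. the state file should be re-saved.
fn verify_against_counts(state: &mut Progress, remote: Progress) -> bool {
    let before = *state;
    state.kv_done = state.kv_done.max(remote.kv_done);
    state.zsets_done = state.zsets_done.max(remote.zsets_done);
    state.sets_done = state.sets_done.max(remote.sets_done);
    *state != before
}
```

Treating a remote count as a row offset presumably relies on earlier runs having inserted rows in the same order the migrator reads them; the commit message does not state this explicitly.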
- Cargo.toml: add `env` to clap features (needed for #[arg(env=...)])
- read_zsets / read_sets: collect into Vec before returning so stmt outlives the iterator (borrow checker fix)

Co-Authored-By: NEDB Maintainer (Claude Sonnet 4.6) <noreply@anthropic.com>
GET /v1/databases/{name} can be slow on first access after heavy writes
(nedbd replays the AOF log for large encrypted databases). Increase
probe timeout from 10s to 120s and add 3-attempt retry with 5s backoff.
Co-Authored-By: NEDB Maintainer (Claude Sonnet 4.6) <noreply@anthropic.com>
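A fixed-delay retry like the 3-attempt, 5s-backoff probe described above might look as follows. This is a synchronous, std-only sketch; the actual migrator is built on tokio and reqwest, and the name retry_fixed is hypothetical.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Call `op` up to `attempts` times, sleeping `delay` between failures.
/// `op` receives the 1-based attempt number; the last error is returned
/// if every attempt fails. (Hypothetical helper, blocking for simplicity.)
fn retry_fixed<T, E>(
    attempts: u32,
    delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= attempts => return Err(err),
            Err(_) => {
                sleep(delay); // fixed backoff, e.g. 5s for the probe
                attempt += 1;
            }
        }
    }
}
```

For the probe in this commit the call would be roughly retry_fixed(3, Duration::from_secs(5), ...), with the 120s timeout applied inside the operation itself.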
1.2M rows at once = OOM on low-RAM VPS. New approach:
- SELECT COUNT(*) upfront for progress bars (no data in memory)
- fetch_*_chunk: LIMIT/OFFSET streaming, only --chunk rows at a time
- stream_table: chunk -> concurrent batches -> save cursor -> next chunk
- Peak memory = chunk_size * ~300 bytes (default 2000 rows = ~0.6 MB)
- Added --chunk N flag for tuning (default 2000)

Co-Authored-By: NEDB Maintainer (Claude Sonnet 4.6) <noreply@anthropic.com>
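The chunk -> send -> save cursor -> next chunk loop can be illustrated without SQLite. In this sketch an in-memory slice stands in for the table and fetch_chunk mimics a LIMIT/OFFSET query; both function bodies are assumptions, not the PR's fetch_*_chunk or stream_table.

```rust
/// Stand-in for `SELECT ... LIMIT ?1 OFFSET ?2` against one table.
fn fetch_chunk(table: &[String], limit: usize, offset: usize) -> Vec<String> {
    table.iter().skip(offset).take(limit).cloned().collect()
}

/// Stream `table` from `cursor` in chunks of `chunk` rows, handing each
/// chunk to `send`. Returns (final cursor, peak rows held in memory).
fn stream_table(
    table: &[String],
    mut cursor: usize,
    chunk: usize,
    mut send: impl FnMut(&[String]),
) -> (usize, usize) {
    let mut peak = 0;
    loop {
        let rows = fetch_chunk(table, chunk, cursor);
        if rows.is_empty() {
            break; // past the last row
        }
        peak = peak.max(rows.len());
        send(&rows);
        cursor += rows.len(); // the real tool would persist the cursor here
    }
    (cursor, peak)
}
```

Only one chunk is alive at a time, so with the default of 2000 rows at roughly 300 bytes each the peak is about 2000 * 300 = 600,000 bytes, matching the ~0.6 MB figure above.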
nedbd Sequencer serialises writes internally. High concurrency on an encrypted database causes the request queue to back up and later batches to time out.

Changes:
- Default concurrency: 16 -> 4 (encrypted DB safe)
- Default batch size: 100 -> 50 rows
- send_batch: retry up to 4x with exponential backoff (500ms→1s→2s→4s) so a single slow Sequencer flush does not abort the migration

Co-Authored-By: NEDB Maintainer (Claude Sonnet 4.6) <noreply@anthropic.com>
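The 500ms→1s→2s→4s schedule is a plain doubling sequence. A minimal sketch of the delay calculation, assuming a zero-based retry index and a cap at the fourth step (the function name backoff_delay is hypothetical, and whether the real send_batch counts attempts or retries is not stated):

```rust
use std::time::Duration;

/// Delay before retry number `retry` (0-based): 500ms, 1s, 2s, 4s.
/// Indices past 3 are clamped to 4s so the shift cannot overflow.
fn backoff_delay(retry: u32) -> Duration {
    let step = retry.min(3);
    Duration::from_millis(500 * 2u64.pow(step))
}
```

Summed over all four steps, a batch that keeps failing waits 7.5s in total before the migration gives up on it.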
Cannot borrow &mut state.field and &mut state simultaneously.

Fix: stream_table takes &mut State + (fn get, fn set) instead of &mut field + &mut state. Single borrow, no conflict.

Co-Authored-By: NEDB Maintainer (Claude Sonnet 4.6) <noreply@anthropic.com>
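The borrow fix can be shown in miniature. Passing both &mut state.kv_done and &mut state to one function is rejected because the two mutable borrows overlap; passing &mut State plus accessor functions needs only one borrow. The sketch below assumes a simplified State and a stream_table that just advances a cursor; kv_get and kv_set are hypothetical accessors.

```rust
#[derive(Default)]
struct State {
    kv_done: u64,
    zsets_done: u64,
    sets_done: u64,
}

// Hypothetical accessors for one table's cursor.
fn kv_get(s: &State) -> u64 { s.kv_done }
fn kv_set(s: &mut State, v: u64) { s.kv_done = v; }

/// Advance one table's cursor in `chunk`-sized steps up to `total`.
/// The whole State is borrowed once; the field is reached through get/set.
/// Returns a snapshot of all three cursors after each step, standing in
/// for "save the full state file after every chunk".
fn stream_table(
    state: &mut State,
    get: fn(&State) -> u64,
    set: fn(&mut State, u64),
    total: u64,
    chunk: u64,
) -> Vec<(u64, u64, u64)> {
    let chunk = chunk.max(1); // avoid a zero-step loop
    let mut saved = Vec::new();
    let mut cursor = get(state);
    while cursor < total {
        cursor = (cursor + chunk).min(total);
        set(state, cursor);
        // Reading the whole state here is fine: no field is separately borrowed.
        saved.push((state.kv_done, state.zsets_done, state.sets_done));
    }
    saved
}
```

The same function then serves all three tables by swapping the accessor pair.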
- LIMIT/OFFSET streaming (2000 rows/chunk), constant peak memory
- asyncio concurrent batch sends (default concurrency=4)
- 4-attempt retry with exponential backoff on timeouts/errors
- State file saved after every chunk (atomic tmp+rename)
- nedbd verification at startup: max(state_file, nedbd_count)
- --skip-block-cache, --reset, --no-verify, --dry-run, --chunk flags
- stdlib progress bar with rows/s and ETA

Co-Authored-By: NEDB Maintainer (Claude Sonnet 4.6) <noreply@anthropic.com>
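The --skip-block-cache flag listed above comes down to a key-prefix test. A minimal sketch, using the two prefixes named in the first commit message (vision:block:height:* and vision:block:hash:*); the function names and the sample non-cache key in the usage note are hypothetical.

```rust
/// Key prefixes treated as block cache, per the first commit message.
const BLOCK_CACHE_PREFIXES: [&str; 2] = ["vision:block:height:", "vision:block:hash:"];

/// True if `key` belongs to the block cache.
fn is_block_cache_key(key: &str) -> bool {
    BLOCK_CACHE_PREFIXES.iter().any(|prefix| key.starts_with(prefix))
}

/// Decide whether a row should be migrated given the flag's value.
fn keep_row(key: &str, skip_block_cache: bool) -> bool {
    !(skip_block_cache && is_block_cache_key(key))
}
```

With the flag set, a key such as vision:block:height:100 is dropped, while a made-up operational key like vision:chain:tip is kept; without the flag every row is kept.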
Summary
Replaces backend/scripts/migrate_sqlite_to_nedb.py with a Rust binary under tools/nedb-migrator/.
Why Rust
The Python migrator sends batches sequentially — 930 round trips for a 93k-row database. The Rust version sends up to 16 concurrently, cutting wall time from ~minutes to ~seconds.
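The round-trip figure is simple arithmetic. The sketch below assumes a batch size of 100 rows, which is what 930 round trips for 93k rows implies, and treats concurrent requests as running in "waves"; both function names are hypothetical.

```rust
/// Number of batch requests needed: ceil(rows / batch_size).
fn round_trips(rows: u64, batch_size: u64) -> u64 {
    (rows + batch_size - 1) / batch_size
}

/// Number of sequential "waves" when `concurrency` requests run at once.
fn waves(trips: u64, concurrency: u64) -> u64 {
    (trips + concurrency - 1) / concurrency
}
```

For 93,000 rows this gives 930 round trips, or 59 waves at a concurrency of 16. That is a best case: it assumes the server handles all 16 requests in parallel, which may not hold when writes are serialised on the nedbd side.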
Features
- Resumable: .nedb-migrator-state.json tracks row offset per table; atomic write after every batch
- --skip-block-cache: skip vision:block:* rows (~90k); migrate only live state (~20 rows)
- --reset: wipe state and start fresh
- --dry-run: print what would be sent without touching nedbd
- indicatif: rows/s + ETA per table
- rusqlite bundled feature: no system libsqlite3 required
Build
    cd tools/nedb-migrator
    cargo build --release
    ./target/release/nedb-migrator --help
Usage
Generated by NEDB Maintainer — Claude Sonnet 4.6 × Interchained LLC