Full codebase review — Stage 1 preprocessing + 4DGS pipeline + interactive viewer by adityasingh2400 · Pull Request #2 · adityasingh2400/FreezeFrame

adityasingh2400 · 2026-03-28T02:41:34Z

Summary

Full codebase review of all work so far:

Stage 1 (Arshia): Audio sync + frame extraction pipeline (stage1/preprocess.py). Takes 4 raw phone videos, detects clap sync point via librosa onset detection, extracts time-aligned frames as 1-indexed JPGs in cam01/-cam04/ folders.
Stage 3 pipeline (run_training.py): End-to-end script that validates data, trains 4D Gaussian Splatting, exports per-frame .ply files, and generates the viewer manifest. Supports --fast and --export-only modes.
Interactive viewer (viewer/): Vite + Three.js + Spark.js web app with orbit/zoom controls, timeline scrubbing, playback speed control, camera presets with smooth animation, keyboard shortcuts, and Director Mode for automated cinematic camera paths.
4DGS training configs (4DGaussians/arguments/multipleview/replay.py and replay_fast.py): Tuned for 4 cameras, 80 frames, 720x1280 vertical video.
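The time alignment described for Stage 1 can be sketched as follows. This is a minimal illustration, not the code in stage1/preprocess.py: it assumes the clap time for each video has already been detected (the source uses librosa onset detection for that step), and the function name, the frame rate, and the returned structure are hypothetical.

```python
def aligned_frame_plan(clap_times_s, fps, num_frames):
    """Map each camera folder to (output_index, source_frame) pairs.

    clap_times_s: clap onset time in seconds for each video (assumed given).
    fps: frame rate shared by all videos (assumption).
    Output indices are 1-based, matching the 1-indexed frames in cam01/..cam04/.
    """
    plan = {}
    for cam_index, clap_s in enumerate(clap_times_s, start=1):
        # Source frame closest to the clap in this camera's video.
        start = round(clap_s * fps)
        plan[f"cam{cam_index:02d}"] = [
            (i + 1, start + i) for i in range(num_frames)
        ]
    return plan


plan = aligned_frame_plan([1.0, 1.5], fps=30, num_frames=3)
```

Each camera starts extracting at its own clap frame, so output frame 1 in every folder shows the same instant.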

Scaffolds the full 4-stage pipeline so all 4 team members can clone and build in parallel against well-defined interface contracts.

What's included:
- config.yaml: centralized paths shared by all stages
- scripts/utils.py: shared config loader (DRY across stages)
- Stage stubs with function signatures + docstrings:
  - stage1_sync.py (Arshia): audio sync + frame extraction
  - stage2_colmap.py (Divij): COLMAP pose recovery + LLFF export
  - stage3_4dgs.py (Aditya): 4DGS training with auto format detection
  - stage4_viewer.py (Mia): viewer HTTP server
  - stage5/6: post-MVP stubs for gap repair + temporal polish
- validate_contracts.py: checks Contracts A, B, C with colored output
- download_demo_scene.py: pre-baked .splat for viewer dev from minute 0
- server/gemini_proxy.py: WebSocket proxy with all 7 Gemini Live tools
- viewer/: HTML + JS with orbit/zoom/time controls + Gemini Live client
- Makefile: make sync, colmap, train, view, proxy, demo, validate
- .gitignore, .env.example, requirements.txt

Dual-path contracts (COLMAP binary + LLFF poses_bounds.npy) so 4DGS has two input format options. Stage 3 auto-detects which is available.

Vite-based web viewer that loads Gaussian Splat .ply/.splat files with orbit/zoom controls, time scrubbing, playback speed control, camera presets, keyboard shortcuts, and Director Mode for automated cinematic camera paths. Currently loads a demo splat; will switch to multi-frame mode when 4DGS training output is available via manifest.json.

Single-command script that validates data, trains 4DGS, exports per-frame .ply files, and generates the viewer manifest. Supports --fast mode for quick iteration and --export-only to re-export from existing checkpoints. Also updates temporal grid resolution to 40 (matching 80 actual frames).

- Implement stage2_colmap.py: feature extraction, exhaustive matching, sparse mapper, LLFF export, dense reconstruction (cloud only)
- Add --sparse-only and --strategy flags for local vs cloud runs
- Add cloud_setup.sh for one-command cloud box provisioning
- Update requirements.txt with pycolmap, config.yaml with cloud host

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

pycolmap.match_exhaustive aborts with a fatal error when writing matches for large image sets (320 images). Shell out to colmap exhaustive_matcher binary instead, with pycolmap as fallback if binary not on PATH. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
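A minimal sketch of the "shell out to the binary, fall back if it is not on PATH" pattern described above. The function names and the database path argument are hypothetical; the pycolmap fallback itself is left to the caller, since this PR page does not show how the project wires it up.

```python
import shutil
import subprocess


def build_matcher_command(database_path, colmap_bin="colmap"):
    """Return the argv list for COLMAP's exhaustive matcher."""
    return [colmap_bin, "exhaustive_matcher", "--database_path", str(database_path)]


def run_matcher(database_path, colmap_bin="colmap"):
    """Run the matcher via the binary; return False if it is not on PATH.

    A False result tells the caller to use its fallback path instead.
    """
    resolved = shutil.which(colmap_bin)
    if resolved is None:
        return False
    subprocess.run(build_matcher_command(database_path, resolved), check=True)
    return True
```

Returning a boolean rather than raising keeps the fallback decision in one place in the calling code.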

pycolmap crashes in nohup environments where PATH is stripped and shutil.which() returns None, forcing fallback to pycolmap which aborts on large datasets. Now calls colmap binary directly for extraction, matching, and mapping. pycolmap kept only for Reconstruction loading and LLFF conversion (read-only, no crash risk). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Set QT_QPA_PLATFORM=offscreen so colmap binary works on headless servers
- Support .jpg frames in addition to .png (cloud box has .jpg files)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
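Both changes can be illustrated with a short sketch. The helper names are hypothetical and this is not the project's code; it only shows one way to pass the Qt setting to a child process and to accept both frame extensions.

```python
import os


def headless_env(base_env=None):
    """Copy an environment and force Qt's offscreen platform plugin."""
    env = dict(os.environ if base_env is None else base_env)
    env["QT_QPA_PLATFORM"] = "offscreen"
    return env


def list_frames(filenames):
    """Keep .jpg and .png frames, sorted by name (case-insensitive match)."""
    return sorted(n for n in filenames if n.lower().endswith((".jpg", ".png")))
```

The environment returned by `headless_env()` would be passed as `env=` to `subprocess.run` when launching the colmap binary.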

GPU SIFT extraction/matching requires OpenGL which isn't available on the RunPod headless instance. Force CPU mode. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Matching uses CUDA directly — no OpenGL needed on headless server. CPU matching would take hours; A100 GPU takes ~5 minutes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

apt COLMAP has no CUDA, so 'all' (320 imgs) takes hours on CPU. n_per_cam=5 gives 20 images, ~190 pairs, CPU matching in ~5 min. Enough frames to get cross-camera overlap without exhaustive cost. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
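The pair counts behind this trade-off follow from the fact that exhaustive matching compares every image with every other image. A small sketch (the function name is hypothetical):

```python
from math import comb


def exhaustive_pair_count(num_cameras, frames_per_camera):
    """Number of image pairs an exhaustive matcher has to compare."""
    total_images = num_cameras * frames_per_camera
    return comb(total_images, 2)


subset_pairs = exhaustive_pair_count(4, 5)   # 20 images
full_pairs = exhaustive_pair_count(4, 80)    # 320 images
```

With 4 cameras, 5 frames per camera gives 190 pairs, while all 320 images give 51,040 pairs, roughly 270 times more matching work.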

apt COLMAP binary always initializes OpenGL for the matcher regardless of use_gpu flag. pycolmap with device=cpu bypasses OpenGL entirely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Wide-angle multi-camera sports setups have limited cross-camera overlap. Relax init_min_num_inliers, init_min_tri_angle, and abs_pose thresholds so the mapper can seed a reconstruction from sparse cross-cam matches. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- image.cam_from_world() replaces image.rotation_matrix() + image.tvec
- cam.focal_length is a direct attribute in pycolmap 4.x
- pts_world dtype explicit float64

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Critical fixes:
- Output to sparse_/ (not sparse/0/) matching 4DGS loader
- Generate points3D_multipleview.ply (triggers MultipleView detection)
- Generate poses_bounds_multipleview.npy (correct filename)
- Flatten images as imageN.jpg for 4DGS name extraction compatibility
- Add point cloud downsampling to <40k points via open3d
- Add sparse-to-PLY fallback when dense reconstruction is skipped
- GPU SIFT with high-quality settings matching 4DGS multipleviewprogress.sh
- Near depth clamped to 0.01 minimum to prevent rendering artifacts
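Two of the fixes above, the near-depth clamp and the point budget, can be sketched in plain Python. Note the swap: the source downsamples with open3d, whereas this stand-in uses simple random sampling only to show the "fewer than 40k points" constraint. Function names and the seed are hypothetical.

```python
import random


def clamp_near_depth(near, minimum=0.01):
    """Never let the near bound drop below the minimum depth."""
    return max(near, minimum)


def downsample_points(points, max_points=40_000, seed=0):
    """Return strictly fewer than max_points points.

    Random sampling is a stand-in for the open3d downsampling in the source.
    """
    points = list(points)
    if len(points) < max_points:
        return points
    rng = random.Random(seed)
    return rng.sample(points, max_points - 1)
```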

Detect CUDA support from colmap -h output ("without CUDA" string). Skip patch_match_stereo with clear message instead of hard-failing. 4DGS can train from sparse point cloud (points3D.bin) alone — dense is optional extra initialization density. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
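A minimal sketch of the detection described above, assuming the help text contains the string "without CUDA" on builds that lack CUDA, as the commit message states. The function names and the skip message are hypothetical.

```python
import subprocess


def read_colmap_help(colmap_bin="colmap"):
    """Return the combined stdout/stderr of `colmap -h`."""
    result = subprocess.run(
        [colmap_bin, "-h"], capture_output=True, text=True, check=False
    )
    return result.stdout + result.stderr


def colmap_has_cuda(help_text):
    """True unless the build announces itself as compiled without CUDA."""
    return "without CUDA" not in help_text


def dense_step_plan(help_text):
    """Decide whether to run patch_match_stereo or skip it with a message."""
    if colmap_has_cuda(help_text):
        return ("run", "patch_match_stereo")
    return ("skip", "COLMAP built without CUDA; skipping dense reconstruction")
```

Skipping instead of failing matches the note above that dense reconstruction is optional for 4DGS training.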

Untrack scene/sparse/0/*.bin, poses_bounds.npy, metadata.json so Aditya can pull COLMAP outputs directly from the repo. Also ignore .jpg frames (previously only .png was excluded). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

# Conflicts: # scripts/stage2_colmap.py

- restructure_for_4dgs.py: rewrites COLMAP images.bin/cameras.bin for 4DGS MultipleView loader (imageN.jpg naming, sequential camera IDs), converts points3D.bin → PLY, handles 3-camera setup (cam01 unregistered)
- runpod_setup.sh: full pipeline from git clone to exported per-frame PLYs
- configs/: A100-optimized training configs (batch=4, fast=5k/quality=10k iters)
- HANDOFF.md: current project state for agent continuity

Fast config reduced to 3k-iter smoke test (data validation only). Quality config back to full 14k iterations with batch=2 to leave VRAM for denser Gaussian populations during densification.

- Added loading and error screens with improved accessibility features.
- Introduced scene name display in the header.
- Enhanced director mode button functionality and styling.
- Updated loading progress display and error handling.
- Refined CSS styles for better visual consistency and usability.
- Adjusted frame counter formatting for improved readability.
- Added event listeners for playback controls and director mode toggling.

…sion.cuda

… animation, removing unnecessary frequency control checks for camera movement.

…ap filling

Replaces GPU-trained 4D reconstruction with instant bullet-time for a single user-selected moment. Gemini analyzes synced multi-camera video to find key moments via natural language, then Nano Banana Pro generates synthetic views between cameras using recursive edge-inward filling with up to 14 reference images. Viewer gets drag-to-rotate image strip mode with real/AI source badges.

- bullet_time/ package: schemas, moment_detector, gap_filler, pipeline CLI
- viewer: ImageStripPlayer with drag-to-rotate, bullet-time UI mode
- server: Gemini Live proxy with find_moment/build_strip/show_strip tools
- Removes scanline texture overlay, adds real vs AI-generated frame badges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The model was generating identical views because the prompt described abstract geometry ("25% along the arc") instead of visual effects. New prompt specifies exact degree rotation per step, clockwise subject rotation, leftward background shift, and frozen pose. Also upgrades to gemini-3-pro-image-preview for highest quality output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replace the cold cyberpunk palette (cyan/orange on blue-black) with ReRoute's warm premium aesthetic: rose/amber accents on maroon-tinted dark backgrounds, cream text, DM Sans + Outfit + JetBrains Mono fonts, generous border radii. Viewer: warm dark backgrounds, rose (#D44060) for spatial, amber (#D4956A) for temporal, hover lifts, warm glows, custom scrollbar, gradient buttons. About page: full ReRoute light theme — cream (#FAF6F1) background, maroon (#7A1B2D) accents, white cards with warm shadows and hover effects. DESIGN.md updated to document the new direction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adopt ReRoute warm aesthetic for viewer + marketing page

- Landing page: FREEZEFRAME wordmark + logo + upload zone, nothing else
- Fake processing screen: 4 sequential animated steps
- Viewer: fullscreen strip, minimal wordmark, 5-bar listening indicator
- Theme: cream background, solid maroon accents, no transparency
- Sizes scaled up throughout

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…imation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Upload → 5 thumbnail circles appear staggered around central agent circle
- Agent circle holds the listening bars, pulses on listening/speaking states
- Dashed orbit ring connects thumbnails visually
- Thumbnails float independently with subtle animation
- On click/voice trigger: thumbnails merge into center with scale+fade
- Agent circle absorbs with a brief pulse
- triggerMerge() exposed for Phase 3 voice integration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- server/gemini_proxy.py: WebSocket proxy bridging browser ↔ Gemini Live with sportscaster personality, 3 voice tools (describe/explain/navigate), moment catalog loaded on startup, audio/transcript relay
- viewer/src/gemini_live.js: mic capture via AudioWorklet, PCM streaming, 24kHz audio playback queue, navigate → boomerang animation, overlay text
- viewer/public/pcm-processor.js: Float32→Int16 AudioWorklet processor
- viewer/index.html: add #viewer-overlay-text div
- viewer/src/styles.css: overlay text styles (output/input/navigate/error)
- viewer/src/main.js: import and call connectVoice() after viewer init

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
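The Float32→Int16 step can be illustrated outside the browser. The sketch below is a Python stand-in, not the contents of pcm-processor.js: it shows one common way to turn float samples in [-1.0, 1.0] into little-endian 16-bit PCM, and the function name and asymmetric scaling are assumptions.

```python
import struct


def float32_to_int16_pcm(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian Int16 PCM bytes."""
    out = bytearray()
    for sample in samples:
        # Clamp first so out-of-range input cannot overflow the Int16 range.
        clamped = max(-1.0, min(1.0, sample))
        # Int16 is asymmetric: -32768..32767.
        scale = 0x7FFF if clamped >= 0 else 0x8000
        out += struct.pack("<h", int(clamped * scale))
    return bytes(out)


pcm = float32_to_int16_pcm([0.0, 1.0, -1.0, 2.0])
```

Each input sample becomes two bytes, so the output is twice as long as the number of samples.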

- Geometry-first pipeline with fg/bg separation and ghost prevention
- Precomputed 4 key moments: Keanu dodge, Kobe fadeaway, roundhouse kick, water throw
- 5-camera support with 28-degree gaps
- Extreme black/white clamping for smoother boomerang playback
- Concurrent Nano Banana polish calls
- Depth Anything V2 on MPS for fast local depth estimation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Keanu: 101/90/100/100/100, Fadeaway: 437/420/429/444/448, Kick: 733/718/736/739/744, Water: 1132/1123/1137/1139/1142 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add readME

- Rewrote viewer UI with new styles, controls, splat player, and image-strip player
- Updated Gemini Live voice integration and proxy server
- Added bullet-time pipeline catalog and VGGT pipeline scripts
- Reorganized docs into docs/ directory
- Updated .gitignore to exclude large generated directories
- Cleaned up deprecated stage1 preprocessing and empty gitkeep dirs

The logo, wordmark, and upload zone were sitting slightly too low on the viewport. Adds margin-top: -60px to #landing-inner to shift the centered content group upward, improving the visual weight distribution on the landing screen.

vercel · 2026-05-26T16:08:46Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
freezeframe	Ready	Preview, Comment	May 26, 2026 4:10pm

* Clean up
* fix(vercel): add vercel.json to set rootDirectory=viewer
* fix(vercel): remove invalid rootDirectory, scope commands to viewer/
* fix(viewer): remove broken raw_videos symlink

Vite's prepareOutDir followed the public/raw_videos symlink (pointing to gitignored ../../raw_videos training data) and failed with ENOENT on Vercel. The symlink is unreferenced in viewer code; safe to drop.

coderabbitai · 2026-05-26T16:08:59Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

adityasingh2400 and others added 30 commits March 27, 2026 18:06

stage 1

a5808dc

Stage 1: 1-indexed cam folders and JPG frames for 4DGS loader

36bcc63

Merge remote-tracking branch 'origin/arshia'

f1b6366

Use sys.executable instead of hardcoded python for subprocess calls

1d2bb36

Fix: headless Qt display + jpg frame support

8b7af44

Fix: force CPU SIFT on headless server (no OpenGL)

19f8f44

Fix: enable GPU for SIFT matching (CUDA, not OpenGL)

891a3d8

Fix: use pycolmap CPU for matching to avoid apt COLMAP OpenGL crash

c77d45d

Fix: update pycolmap 4.x API for LLFF conversion

686ecda

Merge branch 'divij-fixes'

03fe040

Add COLMAP sparse outputs and LLFF poses_bounds (45 cameras)

3bbb63c

Merge remote-tracking branch 'origin/divij'

e9e9452

# Conflicts: # scripts/stage2_colmap.py

Prioritize quality: restore 14k iters, batch=2 for VRAM headroom

20834b8

Merge branch 'main' of https://github.com/adityasingh2400/Replay

bcc9caf

Fix: nvcc not in PATH on PyTorch 2.8 template, fall back to torch.ver…

920620e

…sion.cuda

Fix PEP 668: add --break-system-packages for RunPod Ubuntu 24.04

f93868d

Refactor rendering logic in main.js to always render the scene during…

53935ea

… animation, removing unnecessary frequency control checks for camera movement.

mia373 and others added 25 commits March 27, 2026 23:07

Merge branch 'main' of https://github.com/adityasingh2400/Replay

e129275

Add scene/images frames from arshia branch (4 cameras, 90 frames each)

719818f

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Merge pull request #3 from adityasingh2400/ui/reroute-aesthetic

2944d74

Adopt ReRoute warm aesthetic for viewer + marketing page

UI tweaks: dotted upload border, Upload Videos, bigger FREEZEFRAME title

519e98d

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Allow multiple file selection on upload input

44d276f

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

UI: bigger FREEZEFRAME title (44px), larger logo (400px), floating an…

f3ffbdd

…imation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Post processing refinements

4aad294

liveapi

07ff95f

Merge branch 'main' of https://github.com/adityasingh2400/Replay

2331704

Reprocess all 4 moments with manual per-camera frame alignment

65e5a73

Fragmented analysis fixed

9140635

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Sync animation

a973550

Merge branch 'main' of https://github.com/adityasingh2400/Replay

4da9f2c

Add README

794113b

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge pull request #5 from adityasingh2400/add-readme

e9682e7

Add readME

Merge branch 'add-readme'

afe34a8

vercel Bot deployed to Production May 26, 2026 16:10 View deployment

railway-app Bot temporarily deployed to clever-insight / production June 3, 2026 02:08 Inactive

adityasingh2400 wants to merge 63 commits into greptile-base from main

4 participants
