cryptotn-gpu

GPU-accelerated MPS/MPO tensor network engine for cryptochrome radical pair spin dynamics.

Target: bond dimension χ ≥ 2500 on RTX 5060 Ti 16 GB (sm_120, Blackwell), breaking the χ ≤ 1500 CPU bottleneck of Hino et al. (arXiv:2509.22104, 2025).

Status: Phase A (CPU/quimb) complete and validated. Phase B (cuTensorNet GPU engine) functional; χ-convergence benchmarks running.

why this matters

The avian magnetic compass is thought to operate via the radical pair mechanism, involving flavin adenine dinucleotide (FAD) and tryptophan radicals in cryptochrome 4a. Simulating the spin dynamics of 30–60 coupled nuclear spins is expensive: for the 62-spin system the Hilbert space has dimension 2⁶² ≈ 4.6 × 10¹⁸, which is tractable only with matrix product state (MPS) methods.

Hino et al. (2025) established the state of the art: χ = 1500 on CPU, ~6 hours per trajectory for a 62-spin system. That wall limits the accuracy of compass sensitivity models. Higher χ → better truncation fidelity → more reliable predictions for field-dependent singlet yield Φ_S(B) — the observable that links quantum spin physics to actual bird navigation.
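
To make the link between χ and truncation fidelity concrete, here is a minimal sketch that truncates a single bipartition by SVD and reports the discarded weight. It uses only NumPy on a synthetic matrix whose geometrically decaying singular spectrum is an assumption chosen for illustration; `truncate_bond` is a hypothetical name and not part of cryptotn.

```python
import numpy as np

def truncate_bond(theta, chi):
    """Keep at most `chi` singular values; return factors and discarded weight."""
    u, s, vh = np.linalg.svd(theta, full_matrices=False)
    total = float(np.sum(s**2))
    discarded = float(np.sum(s[chi:] ** 2)) / total
    return u[:, :chi], s[:chi], vh[:chi, :], discarded

rng = np.random.default_rng(0)
n = 32
# Hypothetical spectrum: singular values decay geometrically (illustration only).
spectrum = 2.0 ** -np.arange(n)
q_left, _ = np.linalg.qr(rng.normal(size=(n, n)))
q_right, _ = np.linalg.qr(rng.normal(size=(n, n)))
theta = q_left @ np.diag(spectrum) @ q_right.T

# Discarded weight for increasing bond dimension.
weights = {chi: truncate_bond(theta, chi)[3] for chi in (4, 8, 16, 32)}
for chi, w in weights.items():
    print(f"chi = {chi:2d}  discarded weight = {w:.3e}")
```

The discarded weight shrinks as χ grows and vanishes once χ reaches the full rank. In a real simulation the spectrum is set by the entanglement in the spin system, so how quickly it decays is exactly what a χ-convergence sweep measures.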

This project moves the ceiling by running the TDVP sweep on GPU with cuTensorNet, targeting χ = 2500 for N ≤ 40 spins and χ = 1024 for the full 62-spin ErCry4a system.

architecture

Phase A (CPU prototype)              Phase B (GPU hot path)
────────────────────────────         ─────────────────────────────────────
quimb + scipy                        cuTensorNet (cuQuantum Python)
ExactSolver  — dense expm, N ≤ 20   CupyKrylovSolver — GPU Arnoldi Krylov
MpsSolver    — TDVP, χ ≤ 500        CuTDVPSolver     — MPO-MPS TDVP, χ ≤ 2500
~minutes (N ≤ 20)                    ~minutes (χ = 1024, N = 62, RTX 5060 Ti)
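
As a rough illustration of the Arnoldi Krylov idea named in the table, the sketch below approximates exp(tA)·v in a small Krylov subspace and compares it with a dense matrix exponential. It is a CPU stand-in written with NumPy and SciPy, not the code in CupyKrylovSolver; the function name `krylov_expm_action`, the subspace size `m`, and the random test matrix are all assumptions.

```python
import numpy as np
from scipy.linalg import expm

def krylov_expm_action(A, v, t, m=30):
    """Approximate exp(t*A) @ v using an m-dimensional Arnoldi Krylov subspace."""
    n = v.shape[0]
    m = min(m, n)
    beta = np.linalg.norm(v)
    Q = np.zeros((n, m + 1), dtype=complex)   # orthonormal Krylov basis
    Hm = np.zeros((m + 1, m), dtype=complex)  # upper Hessenberg projection of A
    Q[:, 0] = v / beta
    for j in range(m):
        w = A @ Q[:, j]
        for i in range(j + 1):                # modified Gram-Schmidt
            Hm[i, j] = np.vdot(Q[:, i], w)
            w = w - Hm[i, j] * Q[:, i]
        h = np.linalg.norm(w)
        Hm[j + 1, j] = h
        if h < 1e-12:                         # invariant subspace reached
            m = j + 1
            break
        Q[:, j + 1] = w / h
    small = expm(t * Hm[:m, :m])              # dense expm only on the m x m block
    return beta * (Q[:, :m] @ small[:, 0])

# Demo on a random Hermitian generator (hypothetical test data).
rng = np.random.default_rng(0)
M = rng.normal(size=(40, 40))
H = (M + M.T) / 2
H = H / np.linalg.norm(H, 2)
v0 = rng.normal(size=40) + 0j
approx = krylov_expm_action(-1j * H, v0, 2.0)
exact = expm(-2j * H) @ v0
print("max abs deviation:", np.max(np.abs(approx - exact)))
```

The only dense exponential is taken on the small projected block, which is why Krylov methods scale to generators that are far too large for a direct expm.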

cryptotn/
├── hamiltonian.py    spin Hamiltonians (hyperfine, Zeeman, exchange, Haberkorn)
├── radical_pair.py   system configs: FAD-W, ErCry4a, Tetrad-Trp (AtCry1)
├── tdvp.py           integrators: ExactSolver, MpsSolver
├── observables.py    singlet/triplet yields, compass sensitivity ΔΦ_S
└── cuda/
    └── engine.py     CupyKrylovSolver + CuTDVPSolver (Phase B)

Key implementation note: CuTDVPSolver uses a left-to-right-only 2-site sweep. The finite-state-machine (FSM) MPO accumulates Liouvillian contributions from left to right, so adding a reverse (right-to-left) pass causes N-fold double-counting in the trace. This is documented in cryptotn/cuda/engine.py §7.
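
The sweep direction and bond-splitting step can be sketched in plain NumPy. This is only a skeleton under stated assumptions: the local update is an identity placeholder (the real solver would apply the time-evolution step there), no MPO or environments are involved, and the names `sweep_left_to_right` and `to_vector` are hypothetical, not taken from cryptotn.

```python
import numpy as np

def to_vector(mps):
    """Contract an open-boundary MPS (tensors of shape (Dl, d, Dr)) to a state vector."""
    psi = mps[0]
    for A in mps[1:]:
        psi = np.tensordot(psi, A, axes=([-1], [0]))
    return psi.reshape(-1)

def sweep_left_to_right(mps, chi, local_update=lambda theta: theta):
    """One left-to-right two-site sweep: merge, update, SVD-split, truncate to chi."""
    mps = [A.copy() for A in mps]
    for i in range(len(mps) - 1):
        Dl, d1, _ = mps[i].shape
        _, d2, Dr = mps[i + 1].shape
        # Merge sites i and i+1 into a two-site block of shape (Dl, d1, d2, Dr).
        theta = np.tensordot(mps[i], mps[i + 1], axes=([2], [0]))
        theta = local_update(theta)  # placeholder for the local evolution step
        u, s, vh = np.linalg.svd(theta.reshape(Dl * d1, d2 * Dr), full_matrices=False)
        k = min(chi, len(s))
        mps[i] = u[:, :k].reshape(Dl, d1, k)                     # left-isometric
        mps[i + 1] = (s[:k, None] * vh[:k]).reshape(k, d2, Dr)   # carries the weights
    return mps

# Demo: random 6-site MPS with bond dimension 4 (hypothetical test data).
rng = np.random.default_rng(1)
dims = [1, 4, 4, 4, 4, 4, 1]
mps = [rng.normal(size=(dims[i], 2, dims[i + 1])) for i in range(6)]
swept = sweep_left_to_right(mps, chi=16)
print("bond dimensions after sweep:", [A.shape[2] for A in swept[:-1]])
```

With χ large enough the sweep leaves the state unchanged and only moves the orthogonality centre to the last site; with a smaller χ each split discards the smallest singular values of that bond.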

benchmark results (Phase A)

Three mandatory benchmarks against published literature:

#	system	reference	status
1	FMO 7-site complex, 77 K / 300 K	Dunnett et al., J. Chem. Phys. 163, 104109 (2025)	passing — P₁(t) RMSE < 1e-4
2	ErCry4a, 3–4 nuclei (exact)	Hino et al., arXiv:2509.22104 (2025)	Φ_S validated; χ-convergence sweep running
3	Tetrad-Trp AtCry1	Babcock et al., JPCB 128, 4035 (2024)	scheduled

ErCry4a χ-convergence (Phase A, CPU, n_nuc=3):

χ	Φ_S	abs error vs exact	wall time
2	0.128665	0.0	190 s
4	0.128665	0.0	709 s
8	0.128654	1.1e-5	1158 s
16	0.128665	0.0	2274 s

Phase B GPU timing logs available in benchmarks/results/gpu_timing.jsonl.

quickstart

# Phase A — CPU, no GPU required
git clone https://github.com/[your-username]/cryptotn-gpu
cd cryptotn-gpu
pip install -e ".[dev]"
pytest tests/ -v

# smoke-test all benchmarks (~2 min)
python benchmarks/run_all.py --fast

# χ-convergence sweep (paper Figure §4.2)
python benchmarks/bench_chi.py --n-nuc 10

# ErCry4a field sweep
python benchmarks/bench_ercry4a.py --n-nuc 10 --b-fields 0.0 0.05 0.1 0.5

# FMO 7-site (77 K and 300 K)
python benchmarks/bench_fmo.py

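WSL2 note: if the clone lives on a Windows drive (for example under /mnt/c), copy it onto the Linux filesystem, e.g. /opt/, because file access across the Windows/Linux boundary is much slower: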

sudo cp -r cryptotn-gpu/ /opt/ && sudo chown -R $USER /opt/cryptotn-gpu

Phase B (GPU) setup

Requires CUDA 12.8+ and an NVIDIA GPU with sm_86 or newer (sm_120 for Blackwell):

pip install ".[gpu]"
# check that both cupy and cuquantum import, then print the device name (cupy returns it as bytes)
python -c "import cupy; import cuquantum; print('GPU ready:', cupy.cuda.runtime.getDeviceProperties(0)['name'].decode())"
python verify_gpu.py

VRAM budget at χ = 2500:

N = 40 spins: MPS tensors ~4 GB + environments ~28 GB if all are kept resident (~32 GB total) → fits on 16 GB only when environments are selectively recomputed instead of stored
N = 62 spins: practical limit χ ≈ 1024 on 16 GB
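
The budget can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes a uniform bond dimension at every site (an upper bound), physical dimension 2, and complex64 storage (8 bytes per element); under those assumptions the MPS figure of ~4 GB is reproduced. The MPO bond dimension of 14 is purely hypothetical, picked because it reproduces the ~28 GB environment figure; the actual value is not stated in this README.

```python
def mps_bytes(n_sites, chi, phys_dim=2, itemsize=8):
    """Upper bound: every site tensor has shape (chi, phys_dim, chi)."""
    return n_sites * chi * phys_dim * chi * itemsize

def env_bytes(n_sites, chi, mpo_bond, itemsize=8):
    """One environment tensor of shape (chi, mpo_bond, chi) per site."""
    return n_sites * chi * mpo_bond * chi * itemsize

GB = 1e9
HYPOTHETICAL_MPO_BOND = 14  # assumption, chosen to match the ~28 GB figure

for n_sites, chi in [(40, 2500), (62, 1024)]:
    mps_gb = mps_bytes(n_sites, chi) / GB
    env_gb = env_bytes(n_sites, chi, HYPOTHETICAL_MPO_BOND) / GB
    print(f"N = {n_sites}, chi = {chi}: MPS {mps_gb:.1f} GB, environments {env_gb:.1f} GB")
```

Both terms scale as χ², so the estimate mainly shows how quickly memory grows with bond dimension and why the environments, not the MPS itself, dominate the budget.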

key parameters

parameter	symbol	default	notes
bond dimension	χ	64 (A) / 2500 (B)	primary accuracy lever
singlet rate	k_S	0.263 μs⁻¹	ErCry4a (Hino 2025)
earth field	B	0.05 mT	Helsinki latitude
integration time	t_max	10 μs	≈ 2.6 / k_S at the default k_S
time steps	n_steps	1000	dt = t_max / n_steps (10 ns at defaults)
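
To show how these parameters enter a singlet-yield calculation, here is a toy dense calculation for two electrons and a single spin-1/2 nucleus. It is a sketch under several assumptions that do not come from cryptotn: equal singlet and triplet recombination rates (so decay reduces to a factor exp(-kt)), an isotropic hyperfine coupling with a hypothetical strength `a_mT`, and a yield integrated only up to t_max. All function names are illustrative.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2, dtype=complex)

def embed(op, site, n=3):
    """Place a single-spin operator on `site` of an n-spin register."""
    out = np.eye(1, dtype=complex)
    for i in range(n):
        out = np.kron(out, op if i == site else I2)
    return out

def dot(i, j):
    """Scalar product S_i . S_j of two spins in the 3-spin register."""
    return sum(embed(s, i) @ embed(s, j) for s in (sx, sy, sz))

def singlet_yield(b_mT=0.05, a_mT=0.1, k=0.263, t_max=10.0, n_steps=1000):
    gamma = 176.0859  # electron gyromagnetic ratio in rad us^-1 mT^-1
    # Sites 0 and 1 are the electrons, site 2 is the nucleus coupled to electron 0.
    H = gamma * b_mT * (embed(sz, 0) + embed(sz, 1)) + gamma * a_mT * dot(0, 2)
    P_S = np.eye(8) / 4 - dot(0, 1)          # singlet projector on the electron pair
    rho0 = P_S / np.trace(P_S).real          # singlet-born pair, unpolarised nucleus
    E, V = np.linalg.eigh(H)
    rho_e = V.conj().T @ rho0 @ V
    P_e = V.conj().T @ P_S @ V
    t = np.linspace(0.0, t_max, n_steps + 1)
    dE = E[:, None] - E[None, :]
    w = P_e.T * rho_e                        # w_ij = P_ji * rho_ij
    # Singlet probability p_S(t) = sum_ij w_ij exp(-i (E_i - E_j) t)
    p_s = (np.exp(-1j * t[:, None, None] * dE[None]) * w[None]).sum(axis=(1, 2)).real
    f = k * np.exp(-k * t) * p_s
    dt = t_max / n_steps
    return dt * (f[0] / 2 + f[1:-1].sum() + f[-1] / 2)  # trapezoid rule

print("Phi_S (toy model):", singlet_yield())
```

Without hyperfine coupling (`a_mT=0`) the pair stays in the singlet state, so the result reduces to 1 - exp(-k t_max); that limit is a convenient correctness check and also shows how much of the decay a given t_max captures.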

roadmap

Phase A: CPU reference solvers (ExactSolver, MpsSolver)
Phase A: ErCry4a, FMO benchmarks validated
Phase B: CupyKrylovSolver (GPU sparse Krylov, RMSE < 1e-4 vs exact)
Phase B: CuTDVPSolver skeleton + LR-only 2-site sweep
Phase B: full χ = 2500 sweep on RTX 5060 Ti, N = 40
Tetrad-Trp benchmark (Babcock 2024)
arXiv preprint (physics.bio-ph + quant-ph)

references

Hino et al., arXiv:2509.22104 (2025) — ErCry4a TDVP at χ = 1500
Dunnett et al., J. Chem. Phys. 163, 104109 (2025) — TENSO FMO benchmark
Babcock et al., JPCB 128, 4035 (2024) — Tetrad-Trp superradiance in AtCry1
Maeda et al., Nature 453, 387 (2008) — FAD-W radical pair spin selectivity
Haberkorn, Mol. Phys. 32, 1491 (1976) — recombination kinetics master equation

license

MIT. Fork of KenHino/radicalpair-tensornetwork.
